ShenZhen Hao Qi Core Technology Introduction: The Compute-in-Memory AI Chip
Smart living and smart cities mean we are gradually entering the era of the Internet of Things, in which massive volumes of data will pour in. Application terminals and edge devices in particular must process ever more data, placing ever higher demands on processor stability and power consumption. As a result, the shortcomings of traditional computing systems and architectures are becoming increasingly prominent, making the integration of computing, storage, and AI a major direction of development.
At present, whether in a PC or a supercomputer, the processor and the memory are separate chips. This is the computing architecture established by von Neumann more than 50 years ago. As technology has advanced, the bottleneck created by this separation of storage and computation has become increasingly apparent.
The conventional chip-design approach is to add large numbers of parallel computing units. In the traditional computing architecture, however, memory has always been a limited, scarce resource: as the number of computing units grows, the memory bandwidth and capacity available to each unit shrink. With the advent of the artificial-intelligence era, this contradiction has become acute. In many AI inference workloads, more than 90% of computing resources are consumed moving data rather than operating on it; off-chip bandwidth and on-chip buffer space limit overall efficiency. Hence a growing consensus in industry and academia that compute-in-memory is the future trend, since it directly addresses the "memory wall" problem.
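A rough back-of-the-envelope sketch can make the memory wall concrete. The layer size below is an illustrative assumption, not a figure from the text: for a fully connected layer at batch size 1 on a von Neumann machine, every weight must be fetched from memory yet is used for only a single multiply-accumulate, so the workload is bound by data movement, not arithmetic.

```python
# Back-of-the-envelope estimate of compute vs. data movement for one
# fully connected layer at batch size 1 (hypothetical 1024x1024 layer,
# 32-bit weights). Each weight is read once and used for exactly one
# multiply-accumulate (MAC).
def fc_layer_traffic(in_features, out_features, bytes_per_weight=4):
    macs = in_features * out_features                        # total MACs
    weight_bytes = in_features * out_features * bytes_per_weight  # bytes fetched
    return macs, weight_bytes

macs, weight_bytes = fc_layer_traffic(1024, 1024)
intensity = macs / weight_bytes  # MACs performed per byte moved
print(macs, weight_bytes, intensity)  # 1048576 4194304 0.25
```

At 0.25 MACs per byte moved, the chip spends far more energy and time on memory traffic than on arithmetic, which is the bottleneck compute-in-memory aims to remove.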
A compute-in-memory AI chip built on a NOR Flash array exploits the analog characteristics of NOR Flash to perform full-precision matrix convolution operations (multiply-accumulate) directly inside the storage cells. This avoids the bottleneck of shuttling data back and forth between the ALU and memory, greatly reducing power consumption and improving computing efficiency. Each Flash cell both stores a neural-network weight parameter and performs the multiplication and addition associated with that weight, so computation and storage are fused into a single cell: for example, an array of 1 million Flash cells can hold 1 million weight parameters and perform 1 million multiply-accumulate operations in parallel. Compared with a deep-learning chip built on the traditional von Neumann architecture, this is far more efficient and cheaper, because the DRAM, SRAM, and on-chip parallel computing units are eliminated, simplifying the system design.
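The in-array multiply-accumulate described above can be sketched numerically. This is an idealized model under assumptions not stated in the text (perfect linear cells, no noise or quantization): each cell's programmed conductance represents one weight, input activations are applied as word-line voltages, and each bit-line current is the analog sum of the weight-input products, so the entire matrix-vector product emerges in one parallel step.

```python
import numpy as np

# Idealized simulation of a NOR Flash crossbar doing matrix-vector
# multiply-accumulate in place. Rows model bit lines, columns model
# word lines; Kirchhoff's current law sums the per-cell currents
# (conductance * voltage) along each bit line.
rng = np.random.default_rng(0)
weights = rng.uniform(0.0, 1.0, size=(4, 8))  # 4x8 array of cell conductances
inputs = rng.uniform(0.0, 1.0, size=8)        # input activations as voltages

# One parallel "read" of the array computes the whole product:
bitline_currents = weights @ inputs

# Equivalent cell-by-cell view: every cell contributes its own
# weight * input product to its bit line's summed current.
explicit = np.array(
    [sum(weights[r, c] * inputs[c] for c in range(8)) for r in range(4)]
)
assert np.allclose(bitline_currents, explicit)
print(bitline_currents)
```

In this model the 32 cells store 32 weights and contribute 32 multiplications simultaneously, which is the same scaling argument the text makes for a million-cell array; a real device would add ADCs/DACs at the array boundary and contend with device nonidealities.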
At present, the main applications of this NOR-Flash-based AI chip are those sensitive to cost and computing efficiency (especially power consumption), such as low-power edge devices and low-cost speech recognition. As artificial intelligence and the Internet of Things develop, it can expand into many more scenarios.
ShenZhen Hao Qi Core Technology believes that, for compute-in-memory AI chips to mature, what matters beyond the storage and computing technology itself is keeping industry interface standards up to date, especially for applications built on new memory technologies. The surrounding ecosystem must also be continuously improved for the whole industry to advance.