In-store Computing Technical Notes

1. In-store computing has a wide range of application scenarios across cloud, edge and end

Depending on their device characteristics and computing methods, in-store computing products can provide AI capabilities such as inference and training for cloud, edge and end-side applications, improving computing efficiency and reducing system power consumption and equipment costs.


1.1 End-side application scenarios


According to IDC, the number of IoT devices worldwide will exceed 40 billion in 2025, generating close to 80 ZB of data [10]. In many scenarios such as smart city, smart home and autonomous driving, more than half of the data needs to be processed locally by the terminal, with a single-device computing power requirement of about 0.1 to 64 TOPS. In addition, the various types of terminal devices have high requirements for runtime, power consumption, portability and so on. For example, smart glasses and headphones need to guarantee a full-load standby time of more than 16 hours, and mobile phones have a limited maximum operating power consumption. The future development of end-side devices will focus more on demand characteristics such as latency, power consumption, cost and privacy, as shown in Figure 1-1.



1.2 Edge-side application scenarios


With the rapid rise of edge computing applications such as cloud gaming and telematics, massive amounts of data will be processed at the edge side, and the traffic model will gradually expand from the cloud side to the edge side. The single-device computing power demand in edge computing scenarios is around 64 to 256 TOPS, with strict latency requirements: smart ports, for example, require an end-to-end latency of 3 to 100 ms. In addition, because edge-side devices are usually deployed close to where data is produced or used, they have high heat dissipation requirements. Overall, the future development of edge-side devices will focus more on demand characteristics such as latency, power consumption, cost and versatility, as shown in Figure 1-2.


Compared with traditional solutions, in-store computing has unique advantages in areas such as deep learning and can provide a computing efficiency ratio tens of times higher than that of conventional devices. Through architectural innovation, in-store computing chips can also deliver comprehensive chip-level and board-level performance, and are expected to be widely applied in edge-side inference scenarios, serving a wide range of edge AI workloads.



1.3 Cloud-side application scenarios


Unstructured data, mainly images, voice and video, is growing at a high rate. According to IDC forecasts, the demand for intelligent computing power will increase 500 times by 2030, and intelligent computing centres with AI computing power as the core will become the mainstream of computing power infrastructure, with large-scale intensive construction of AI chips bringing high power consumption challenges. Smart computing centres call for new AI chips to meet the characteristics of large computing power, high bandwidth and low power consumption on the cloud side, as shown in Figure 1-3.



As a key next-generation AI chip technology for smart computing centres, in-store computing is evolving towards high computing power, versatility and high computing accuracy, and is expected to provide green and energy-efficient large-scale AI computing power for smart computing centres.


2. Five technical challenges in in-store computing


Integrated storage and computing in the broad sense is gradually moving from academic research to commercial application. Near-storage computing and in-storage processing face high manufacturing and packaging technology thresholds in the product implementation phase. In-store computing is less mature and needs to be strengthened in many aspects, from device development and manufacturing, circuit design and chip architecture to the EDA tool chain and the software and algorithm ecosystem, which requires closer collaboration among all segments of the industry chain.


2.1 Low maturity of new devices and difficulty in upgrading manufacturing processes


The maturity of new devices is an important issue for in-store computing. Traditional devices and new devices are the two main routes to implementing in-store computing. While traditional devices such as NOR Flash and SRAM are relatively mature, newer devices such as RRAM, PCM and MRAM differ in device consistency, write/erase endurance, power consumption and reliability, which affects the computational accuracy, lifetime and energy consumption of in-store computing products.


Existing manufacturing lines cannot switch seamlessly to new devices, and the current process level still has room to improve. At the chip manufacturing stage, manufacturers have to modify existing production lines, for example through continuous mask optimisation and equipment tuning. In addition, process scaling for new devices does not fully follow the experience gained from existing transistor processes, and it is difficult to balance high reliability and precision when making new device processes compatible with advanced process nodes.


2.2 Circuit design affects the chip's computing efficiency


Circuit design is a central determinant of the energy efficiency benefits of in-store computing chips, and the overall technology is not yet mature. The circuit design has two main parts: the storage-computing core (Macro) and the peripheral circuits. The design of the storage-computing cells and their interconnections varies from core to core; cutting-edge research results differ widely in efficiency and have not yet settled into established practice. The peripheral circuits help the chip reach its full computing capability, handling input and output interfaces and accumulating the results computed by the cores; they must be adapted to the cores while keeping energy and area overhead low. In addition, analogue in-store computing relies on modules such as advanced ADCs, DACs and TIAs, which pose further challenges in area and power consumption.
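The split between the core and the peripheral circuits described above can be illustrated with a small behavioural sketch. It is not taken from any real chip: the function names, the 4-bit DAC and 6-bit ADC resolutions, and the assumption that inputs and weights are normalised to the range 0 to 1 are all hypothetical choices made for illustration.

```python
def quantize(value, bits, full_scale):
    """Uniformly quantise a value in [0, full_scale] to 2**bits levels."""
    levels = (1 << bits) - 1
    clipped = min(max(value, 0.0), full_scale)
    code = round(clipped / full_scale * levels)
    return code * full_scale / levels


def macro_mac(inputs, weights, dac_bits=4, adc_bits=6):
    """Behavioural model of one analogue macro (hypothetical resolutions).

    weights[i][j] is the cell at row i, column j, assumed to lie in [0, 1].
    Inputs pass through a DAC, each column sums its products (the analogue
    multiply-accumulate), and an ADC digitises every column result.
    """
    driven = [quantize(x, dac_bits, 1.0) for x in inputs]
    rows, cols = len(weights), len(weights[0])
    full_scale = float(rows)  # largest possible column sum under the assumptions
    outputs = []
    for j in range(cols):
        column_sum = sum(driven[i] * weights[i][j] for i in range(rows))
        outputs.append(quantize(column_sum, adc_bits, full_scale))
    return outputs


def accumulate(partial_results):
    """Peripheral logic: add the partial sums of several macros column by column."""
    return [sum(column) for column in zip(*partial_results)]
```

In this sketch the ADC step size is `full_scale / (2**adc_bits - 1)`, so a lower-resolution ADC saves area and power at the cost of a larger rounding error on every column, which is the trade-off the paragraph points to.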


2.3 Poor scenario versatility and scalability of chip architectures


The few commercial in-store computing chips available today offer small computing power and mainly target specific end-side domains. No mature chip architecture with large computing power exists, so in-store computing products cannot yet be effectively extended to cloud-side scenarios. On the one hand, current in-store computing chips support a limited variety of operators, which makes it difficult to meet the rich computing needs of many neural network algorithms, so they lack scenario versatility. On the other hand, mature multi-core collaboration mechanisms and unified on-chip and inter-chip interconnection protocols and standards are lacking, making chips with large computing power difficult to realise.



2.4 The EDA tool chain is not yet robust


The design of in-store computing chips differs significantly from that of conventional chips, and the current EDA tools that aid design and simulation verification are not yet mature. This shows in the following three respects.


Lack of standard cell libraries and fast assembly tools. Different memory devices use different memory cell structures, and existing EDA tools do not provide a comprehensive standard cell library for chip designers, who have to draw cells by hand. In addition, turning current in-store computing chips into products is inefficient, and automated tools for rapidly assembling large-scale in-store computing arrays are lacking.


Lack of tools for functional and performance simulation. There are currently no tools to optimise the efficiency of simulation for in-store computing scenarios, which requires a lot of time for functional and performance verification of in-store computing, and is even more difficult when implementing large scale in-store arrays.


Lack of modelling and error assessment tools. Inaccurate modelling and error assessment lead to discrepancies between actual and expected computation results. Modelling the circuit noise of a device, for example, helps developers evaluate their solutions and make timely adjustments during the design phase. Current in-store computing research lacks tools for modelling the circuit noise associated with ADCs, DACs and TIAs, which makes it hard to evaluate chip design solutions and to ensure chip usability.
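As a rough idea of what such an error assessment does, the sketch below estimates the RMS error of a single analogue dot product by Monte Carlo simulation. Everything in it is an assumption made for illustration: the additive Gaussian noise model, the ideal uniform ADC, the full-scale choice and the function names do not come from any existing EDA tool.

```python
import math
import random


def adc(value, bits, full_scale):
    """Ideal uniform ADC over [0, full_scale] (an assumed, simplified model)."""
    levels = (1 << bits) - 1
    clipped = min(max(value, 0.0), full_scale)
    return round(clipped / full_scale * levels) * full_scale / levels


def rms_error(weights, inputs, noise_sigma, adc_bits, trials=1000, seed=0):
    """Monte Carlo estimate of the RMS error of one noisy column read-out.

    Each trial adds Gaussian noise (standard deviation noise_sigma) to the
    ideal dot product and then digitises it; the error is measured against
    the noise-free, unquantised result.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate repeatable
    ideal = sum(w * x for w, x in zip(weights, inputs))
    full_scale = float(len(weights))  # assumes weights and inputs lie in [0, 1]
    squared = 0.0
    for _ in range(trials):
        noisy = ideal + rng.gauss(0.0, noise_sigma)
        squared += (adc(noisy, adc_bits, full_scale) - ideal) ** 2
    return math.sqrt(squared / trials)
```

Sweeping `noise_sigma` and `adc_bits` in a model like this is one way a designer could check, before tape-out, whether a given noise budget and converter resolution keep the computation error acceptable.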



2.5 Inadequate software and algorithm ecosystem


Lack of a common development environment and compiler support. To exploit the computing power of an in-store computing chip effectively, the compiler needs to map the operators of the neural network model onto the underlying in-store computing units.
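One basic piece of that mapping is splitting a layer's weight matrix into tiles that each fit one in-store computing array. The sketch below shows the idea only; the function name and the 256 x 256 array size are hypothetical, and a real compiler would also have to schedule the tiles and handle data movement.

```python
def map_to_arrays(rows, cols, array_rows=256, array_cols=256):
    """Split a rows x cols weight matrix into tiles that each fit one array.

    Returns (row_start, row_end, col_start, col_end) tuples with exclusive
    end indices. The array size is an illustrative assumption.
    """
    tiles = []
    for r in range(0, rows, array_rows):
        for c in range(0, cols, array_cols):
            tiles.append((r, min(r + array_rows, rows),
                          c, min(c + array_cols, cols)))
    return tiles
```

Tiles that share the same column range hold partial sums of the same outputs, so their results have to be added together outside the arrays.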


Matching neural network algorithms to the hardware is challenging. The industry has several mainstream quantization schemes for neural network models, which vary with model characteristics. In addition, in-store computing suits highly parallel processing, but some neural network applications break matrix multiply-accumulate computation into fragments, and this mismatch between the algorithm and the computational characteristics of the chip can lead to low hardware utilisation.
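The utilisation problem can be made concrete with a little arithmetic. The sketch below assumes a hypothetical 256 x 256 array and a simple model in which every array touched by a layer is fully allocated; both the function name and the array size are illustrative.

```python
import math


def array_utilisation(rows, cols, array_rows=256, array_cols=256):
    """Fraction of allocated cells that actually hold weights.

    Assumes a rows x cols weight matrix is tiled onto whole arrays of a
    fixed, hypothetical size, so partly filled arrays waste cells.
    """
    arrays = math.ceil(rows / array_rows) * math.ceil(cols / array_cols)
    return rows * cols / (arrays * array_rows * array_cols)
```

Under these assumptions a 512 x 512 layer fills four arrays completely, while a 300 x 10 layer occupies two arrays but uses only 3,000 of their 131,072 cells, a utilisation of a little over 2%.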


3. Five development proposals for in-store computing


Based on the development requirements of its computing power network business, China Mobile puts forward five proposals for the development of in-store computing and will work with the industry to accelerate the industrialisation process.


3.1 Recommendation 1: Synergise advanced packaging technologies to achieve a combination of different solutions


Each type of memory device has its own advantages for in-store computing. In-store computing can also be combined with near-storage computing and in-storage processing solutions: advanced packaging techniques such as 2.5D/3D/Chiplet allow in-store computing chips built on different processes and devices to be highly integrated so that their advantages complement one another, balancing cost, energy efficiency, performance, accuracy and versatility, as shown in Figure 3-1. In this process, new devices such as RRAM, PCM and MRAM need to mature and become compatible with advanced processes in order to fully exploit advantages such as low energy consumption and high density.



Figure 3-1 Example of a future advanced chip (left, centre and right panels)


 


3.2 Recommendation 2: Optimise circuit and chip architecture design, to ensure energy efficiency benefits and evolutionary capability


Circuit design and chip architecture are critical to the energy efficiency and versatility of in-store computing chips. On the one hand, circuit design capability for the storage-computing arrays and peripheral modules should be strengthened to ensure high overall parallelism and low power consumption. On the other hand, a general-purpose in-store computing chip architecture capable of sustainable evolution should be built to support larger-scale computing power requirements and more algorithms and application scenarios.




3.3 Recommendation 3: Accelerate the incubation of EDA tools to shorten the chip development cycle


The industrialisation of in-store computing requires extensive support from EDA vendors and other upstream companies in the industry chain. To ensure mass production of chips, chip designers, EDA vendors and manufacturers need to work together to create EDA tools covering cell simulation, reliability design, low-power design, computing module design and other aspects, providing strong support for the design and simulation verification of in-store computing chips. In addition, integrated storage and computing offers an opportunity to promote the development of the domestic EDA industry.




3.4 Recommendation 4: Build a development ecosystem and programming framework to accelerate large-scale application


To promote the large-scale application of in-store computing, corresponding development environments and compilation platforms have become a necessity. The industry needs to work together to promote open-source and standards ecosystems, build programming frameworks for in-store computing, improve automated tools for algorithm development, simulation and compilation, and build an algorithm design and development ecosystem suited to in-store computing and the characteristics of time-based computing.




3.5 Recommendation 5: Close collaboration between industry, academia and research to promote end-side to cloud-side evolution


As the application scope of in-store computing gradually evolves from the end side to the cloud side, close collaboration between industry, academia and research is needed to establish an end-to-end technology stack. In-store computing is suitable for audio, video, autonomous driving, decision analysis and many other application scenarios. Current commercial NOR Flash and SRAM in-store computing chips mainly target end-side voice and video scenarios with small to medium computing power requirements. In the future, general-purpose chips with large computing power could further provide efficient computing services for cloud and edge scenarios such as communication, natural language understanding and autonomous driving. Therefore, industry, academia and research need to work closely together to build an industry-chain cooperation platform that connects the full chain of device and chip development, tool chain construction, software ecosystem construction, and industry solution testing and application.


4. Industry development initiatives


In response to the challenges and problems facing integrated storage and computing in the narrow sense, China Mobile, as a leader and practitioner of the new computing power network development concept, hopes to cooperate fully with partners on three aspects: technology, industry and ecosystem. The aim is to open up every link of the integrated storage and computing industry chain, promote ecosystem development, accelerate the industrialisation process and truly unlock the enormous potential of the technology in terms of performance and cost, helping the country achieve original technological innovation and leadership in computing.


Jointly explore the core technologies of storage and computing integration. We will work together on key technologies in new materials, chip architectures, compilers and other areas, and jointly explore the application scenarios of storage and computing integration to support the new development path of the country's new computing infrastructure and help the development strategies of a strong network, digital China and a smart society to be implemented.


Jointly accelerate the maturity of the integrated storage and computing industry. We will work together to address common problems in the integrated storage and computing industry chain, promote effective linkage between the upstream and downstream of the industry chain and across production, supply and marketing, enhance the resilience of the industry chain, strengthen the depth and breadth with which new technologies penetrate the industry, explore experimental demonstrations of integrated storage and computing, and jointly promote innovation and the healthy development of the industry chain.


Jointly promote a prosperous ecosystem for integrated storage and computing. Through standard setting, open-source promotion, industrial cooperation and other means, we will accelerate the maturation of integrated storage and computing technologies, speed up the migration of upper-layer applications, and jointly build a prosperous industrial ecosystem from the underlying chips to upper-layer applications.

https://en.witmem.com/news/industry_news1/in_store_computing.html
