Renesas Processing-In-Memory Technology Delivers 8.8 TOPS/W AI Performance
June 13, 2019
A ternary SRAM structure, SRAM circuits that read memory at low power, and technologies to mitigate manufacturing process variation errors enable industry-leading TOPS/W.
Renesas has developed an AI accelerator that achieved 8.8 TOPS/W when processing CNN algorithms on a test chip. The accelerator is based on a processing-in-memory (PIM) architecture that performs multiply-accumulate (MAC) operations in a memory circuit as data is being read from the circuit. It will be part of the company’s family of embedded AI (e-AI) edge compute solutions.
The PIM accelerator integrates:
- A ternary (-1, 0, 1) SRAM structure – Combined with a digital calculation block that minimizes calculation errors, the ternary structure lets the accelerator switch calculation bit widths according to the required accuracy (for example, between 1.5-bit (ternary) and 4-bit operation). This lets users balance accuracy against power consumption.
- A specialized SRAM circuit that reads memory at low power – A 1-bit comparator and a replica cell allow the SRAM current to be controlled, resulting in a high-precision memory readout circuit. The circuit also stops operating altogether when it encounters a neural network node (neuron) that is not activated.
- Technology that prevents calculation errors due to manufacturing process variations – Process variations can cause errors in the values of SRAM bit line currents. To offset them, the chip contains multiple SRAM calculation circuit blocks, and node calculations are selectively allocated to the blocks with the least process variation. Renesas believes this reduces calculation errors to the point that they are essentially negligible.
Together, the three technologies minimize memory access time as well as the power consumption of MAC operations. The ternary, as opposed to binary, SRAM structure also enables high accuracy in large-scale CNN workloads.
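To make the ternary MAC idea concrete, the following is a minimal software sketch, not a description of the Renesas circuit. It assumes weights restricted to -1, 0, and 1 and treats an activation of 0 as a neuron that is not activated, so that term is skipped entirely. The function name `ternary_mac` and the skip counter are hypothetical and exist only for illustration. (Three states carry log2(3), or about 1.58, bits of information, which is why ternary is commonly labeled "1.5 bit.")

```python
def ternary_mac(weights, activations):
    """Accumulate sum(w * a) for ternary weights in {-1, 0, 1}.

    Illustrative assumption: an activation of 0 stands for a neuron that
    is not activated, so its term is skipped and counted.
    Returns (accumulated_sum, number_of_skipped_terms).
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have equal length")

    acc = 0
    skipped = 0
    for w, a in zip(weights, activations):
        if w not in (-1, 0, 1):
            raise ValueError(f"non-ternary weight: {w}")
        if a == 0:
            # Inactive neuron: no work is done for this term.
            skipped += 1
            continue
        # With ternary weights, "multiply" reduces to add, subtract, or nothing.
        if w == 1:
            acc += a
        elif w == -1:
            acc -= a
    return acc, skipped


result, skipped = ternary_mac([1, -1, 0, 1], [3, 2, 5, 0])
print(result, skipped)
```

Because every weight is -1, 0, or 1, no multiplier is needed in this sketch: each term is an addition, a subtraction, or a no-op, which hints at why such a structure can be cheap to evaluate.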
According to Renesas, the resulting 8.8 TOPS/W is the industry’s highest power efficiency at an accuracy ratio of more than 99 percent. The company reportedly demonstrated this performance at the 2019 Symposia on VLSI Technology and Circuits, where the test chip was connected to a small battery and various input peripherals.
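For a sense of scale, a TOPS/W figure can be converted into energy per operation. One watt is one joule per second, so 1 TOPS/W corresponds to 10^12 operations per joule. The short sketch below performs that conversion; the function name is hypothetical, and it assumes nothing about how Renesas counts an "operation" (the article does not say, and conventions differ, for example on whether one MAC counts as one or two operations).

```python
def energy_per_op_picojoules(tops_per_watt):
    """Convert an efficiency figure in TOPS/W to energy per operation in pJ.

    1 TOPS/W = 1e12 operations per joule, and 1 J = 1e12 pJ,
    so the result is numerically 1 / tops_per_watt.
    """
    if tops_per_watt <= 0:
        raise ValueError("tops_per_watt must be positive")
    ops_per_joule = tops_per_watt * 1e12
    picojoules_per_joule = 1e12
    return picojoules_per_joule / ops_per_joule


print(f"{energy_per_op_picojoules(8.8):.3f} pJ per operation")
```

At 8.8 TOPS/W this works out to roughly 0.11 pJ per operation.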
Renesas sees the accelerator's performance per watt as an enabler of incremental learning directly on endpoints. For more information on the company's e-AI technology, visit www.renesas.com/us/en/solutions/key-technology/e-ai.html.