IBM researchers have taken another step toward making in-memory computing based on phase-change memory (PCM) devices a reality. Papers in Nature Communications and Frontiers in Neuroscience this month present IBM work using a mixed-signal architecture with PCM devices to run deep neural networks at high accuracy. IBM demonstrated a novel approach to coping with the natural variations present in PCM (memristive) devices: training models to accommodate noise, which improves accuracy.
The broad goal of these “neuromorphic” processing approaches is to ‘mimic’ the low-power processing techniques used by biological systems and eliminate some of the data movement required by traditional compute architectures. IBM researcher Manuel LeGallo, an author on both papers, described the work in a blog post last week. “While there has been significant progress in the development of hardware-accelerator architectures for inference, many of the existing set-ups physically split the memory and processing units. This means that DNN models are typically stored in off-chip memory, and that computational tasks require a constant shuffling of data between the memory and computing units – a process that slows down computation and limits the maximum achievable energy efficiency,” wrote LeGallo.
“Our research, featured in Nature Communications, exploits in-memory computing methods using resistance-based (memristive) storage devices as a promising non-von Neumann approach for developing hardware that can efficiently support DNN inference models. Specifically, we propose an architecture based on phase-change memory (PCM) that, like the human brain, has no separate compartments to store and compute data, and therefore consumes significantly less energy.”
A major challenge in using PCM devices is achieving and maintaining computational accuracy. PCM technology is analog in nature, and computational precision is limited due to device variability as well as read and write conductance noise. IBM was seeking a way to train the neural networks so that transferring the digitally trained weights to the analog resistive memory devices would not result in significant loss of accuracy.
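To make the accuracy problem concrete, here is a minimal sketch of how weight noise perturbs a matrix-vector product, the core operation of a neural network layer. The Gaussian noise model, the `rel_noise` parameter, and the function name `noisy_matvec` are assumptions made for illustration; they are not taken from the IBM papers and do not model real PCM device behavior.

```python
import random

def noisy_matvec(weights, x, rel_noise, rng):
    """Multiply weights by x after perturbing each weight with Gaussian noise.

    rel_noise is the noise standard deviation as a fraction of the largest
    absolute weight (an illustrative noise model, not a PCM device model).
    """
    w_max = max(abs(w) for row in weights for w in row)
    sigma = rel_noise * w_max
    return [
        sum((w + rng.gauss(0.0, sigma)) * xi for w, xi in zip(row, x))
        for row in weights
    ]

rng = random.Random(0)
weights = [[0.5, -0.25, 0.1], [0.0, 0.3, -0.6]]
x = [1.0, 2.0, 3.0]

exact = noisy_matvec(weights, x, 0.0, rng)   # zero noise: ideal digital result
noisy = noisy_matvec(weights, x, 0.05, rng)  # 5% noise: analog-like result
errors = [abs(a - b) for a, b in zip(exact, noisy)]
print("exact:", exact)
print("noisy:", noisy)
print("abs error:", errors)
```

Each output of the noisy product drifts away from the exact value, and in a deep network such errors compound from layer to layer, which is why weights trained digitally can lose accuracy once transferred to analog devices.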
Wrote LeGallo, “Our approach was to explore injecting noise to the synaptic weights during the training of DNNs in software as a generic method to improve the network resilience against analog in-memory computing hardware non-idealities. Our assumption was that injecting noise comparable to the device noise during the training of DNNs would improve the robustness of the models.” It turned out they were correct. “Training ResNet-type networks this way resulted in no considerable accuracy loss when transferring weights to PCM devices. We achieved an accuracy of 93.7% on the CIFAR-10 dataset and a top-1 accuracy on the ImageNet benchmark of 71.6% after mapping the trained weights to analog PCM synapses. And after programming the trained weights of ResNet-32 on 723,444 PCM devices of a prototype chip, the accuracy computed from the measured hardware weights stayed above 92.6% over a period of 1 day. To the best of our knowledge, this is the highest accuracy experimentally reported to date on the CIFAR-10 dataset by any analog resistive memory hardware.”
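The mechanics of injecting noise during training can be sketched in a few lines. The example below is a toy under stated assumptions, not IBM's training procedure: it uses a two-weight linear model rather than a ResNet, Gaussian noise with a hypothetical standard deviation `sigma`, and the assumption that gradients computed with the noisy weights are applied to the clean weights. The function name and toy data are invented for illustration.

```python
import random

def train_with_weight_noise(data, n_features, sigma, lr=0.05, epochs=200, seed=0):
    """SGD on a linear model with Gaussian noise added to the weights in
    every forward pass. The error is computed with the noisy weights, but
    the update is applied to the clean weights, which are the ones that
    would later be programmed onto devices."""
    rng = random.Random(seed)
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in data:
            # Inject fresh noise for this forward pass.
            w_noisy = [wi + rng.gauss(0.0, sigma) for wi in w]
            err = sum(wn * xi for wn, xi in zip(w_noisy, x)) - y
            # Gradient of 0.5 * err**2 with respect to the weights is err * x.
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

# Toy data generated from y = 2*x0 - 1*x1 (a hypothetical target).
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
        ([1.0, 1.0], 1.0), ([0.5, -0.5], 1.5)]
w = train_with_weight_noise(data, n_features=2, sigma=0.05)
print("learned weights:", w)
```

A linear model like this only shows where the noise enters the training loop; it does not reproduce the robustness gains LeGallo describes, which concern deep networks whose learned weights settle into regions that tolerate perturbation.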
IBM has long conducted productive research into using PCM technology for in-memory computing. As explained in the Nature Communications paper (Accurate deep neural network inference using computational phase-change memory), PCM is a memristive technology that records data in a nanometric volume of phase-change material sandwiched between two electrodes. The phase-change material is in the low-resistive crystalline phase in an as-fabricated device. Applying a current pulse of sufficient amplitude (typically referred to as the RESET pulse) creates an amorphous region around the narrow bottom electrode via a melt-quench process. The device will be in a low-conductance state if the high-resistive amorphous region blocks the current path between the two electrodes. The size of the amorphous region can be modulated in an almost completely analog manner by applying suitable electrical pulses.
Here is the formal description of the test system, taken from the Nature Communications paper:
“The experimental platform is built around a prototype PCM chip that comprises 3 million PCM devices. The PCM array is organized as a matrix of word lines (WL) and bit lines (BL). In addition to the PCM devices, the prototype chip integrates the circuitry for device addressing, and for write and read operations. The PCM chip is interfaced to a hardware platform comprising a field programmable gate array (FPGA) board and an analog-front-end (AFE) board. The AFE board contains the digital-to-analog converters, and provides the power supplies as well as the voltage and current reference sources to the PCM chip. The FPGA board implements the data acquisition and the digital logic to interface with the PCM device under test and with all the electronics of the AFE board.
“The PCM devices are integrated into the chip in 90-nm CMOS technology using the key-hole process described in ref. 52. The phase-change material is doped Ge2Sb2Te5. The bottom electrode has a radius of ~20 nm and a height of ~50 nm. The phase-change material is ~100 nm thick and extends to the top electrode, whose radius is ~100 nm. All experiments performed in this work are done on an array containing 1 million devices accessed via transistors, which is organized as a matrix of 512 WL and 2048 BL.”
Training the models to accommodate noise is an important achievement. LeGallo noted, “In an era transitioning more and more towards AI-based technologies, including internet-of-things battery-powered devices and autonomous vehicles, such technologies would highly benefit from fast, low-powered, and reliably accurate DNN inference engines. The strategies developed in our studies show great potential towards realizing accurate AI hardware-accelerator architectures to support DNN training and inferencing in an energy-efficient manner.”
“Fellow researchers affiliated with King’s College London, the Swiss Federal Institute of Technology in Zurich (ETH Zürich), and École Polytechnique Fédérale de Lausanne (EPFL) also contributed to this work. Our research is part of the IBM AI Hardware Center, which was launched one year ago. The center focuses on enabling next-generation chips and systems that support the tremendous processing power and unprecedented speed that AI requires to realize its full potential,” wrote LeGallo.
Much work remains to be done but the recent work is significant and will advance […]