DOI: https://doi.org/10.29363/nanoge.neumatdecas.2023.056
Publication date: 9th January 2023
Machine Learning applications, especially Deep Neural Networks (DNNs), have seen ubiquitous use in computer vision, speech recognition, and robotics. However, the growing complexity of DNN models has necessitated efficient hardware implementations. The key compute primitive of DNNs is the matrix-vector multiplication, which leads to significant data movement between memory and processing units in today's von Neumann systems. A promising alternative is to co-locate memory and processing elements, which can be further extended to performing computations inside the memory itself. We believe in-memory computing is a compelling candidate for future DNN accelerators since it mitigates the memory-wall bottleneck. In this talk, I will discuss various in-memory computing primitives in both CMOS and emerging non-volatile memory (NVM) technologies. Subsequently, I will describe how such primitives can be incorporated into stand-alone machine learning accelerator architectures. Finally, I will focus on the challenges associated with designing such in-memory computing accelerators, and explore future opportunities.
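To make the matrix-vector-multiplication primitive concrete, the sketch below simulates how a resistive crossbar array performs the operation in place: weights are stored as cell conductances, the input vector is applied as row voltages, and the output emerges as column currents via Kirchhoff's current law. This is a minimal illustrative model, not a description of any specific design from the talk; the conductance range and the single-cell weight mapping are assumptions (real designs often use differential cell pairs for signed weights).

```python
import numpy as np

# Assumed conductance range of a programmable cell (e.g., ReRAM),
# in siemens. Values are illustrative, not from the talk.
G_MIN, G_MAX = 1e-6, 1e-4

def weights_to_conductances(W):
    """Affinely map weights onto [G_MIN, G_MAX].

    Simplification: a single cell per weight. Practical crossbars
    often encode signed weights with a differential pair of cells.
    """
    w_min, w_max = W.min(), W.max()
    return G_MIN + (W - w_min) * (G_MAX - G_MIN) / (w_max - w_min)

def crossbar_mvm(G, v):
    """Analog in-memory matrix-vector multiply.

    Inputs are applied as row voltages; each column current is the
    sum of (conductance * voltage) down that column, i.e.
    i_j = sum_i G[i, j] * v[i]. The weight matrix never leaves the
    array, which is exactly the data movement that von Neumann
    systems incur on every multiply.
    """
    return G.T @ v

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))   # a small weight matrix
v = rng.standard_normal(4)        # an input activation vector

G = weights_to_conductances(W)
print("column currents:", crossbar_mvm(G, v))
```

In this toy model all rows are driven in parallel, so the full matrix-vector product completes in a single analog read step, independent of the matrix height; the costs that make such designs challenging (DAC/ADC overhead, device variation, limited precision) are among the design issues the talk goes on to address.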