
Xizi Chen

Supervisor of Master's Candidates
Name (Simplified Chinese):Xizi Chen
Name (English):Xizi Chen
Name (Pinyin):chenxizi
Administrative Position:Full-time Teacher
Academic Titles:Full-time Teacher
Status:Employed
Education Level:Doctoral (PhD)
Degree:Doctoral degree
Business Address:Room 413, Block B, No. 1 Comprehensive Building, Huazhong Agricultural University
E-Mail:
Alma Mater:The Hong Kong University of Science and Technology
Teacher College:College of Informatics
School/Department:Huazhong Agricultural University
Discipline:Computer Architecture; Computer Applications Technology

Paper Publications
SubMac: Exploiting the Subword-Based Computation in RRAM-Based CNN Accelerator for Energy Saving and Speedup
Release time:2021-09-08

Affiliation of Author(s):The Hong Kong University of Science and Technology (HKUST)

Journal:Integration, the VLSI Journal (CCF-C)

Funded by:This work is partially supported by Hong Kong Research Grant Council (RGC) under Grant 619813.

Key Words:Convolutional Neural Network (CNN), Resistive RAM, data encoding, dynamic quantization, computation saving

Abstract:Although CMOS-based CNN accelerators have achieved impressive progress, the memory wall and high power density remain major barriers to substantial improvements in energy efficiency and throughput. As an attractive alternative, Resistive RAM-based accelerators have recently delivered significant breakthroughs by leveraging in-situ computation. However, challenges remain, including high computation complexity and the large energy overhead of the analog/digital interfacing circuits. In this work, we exploit subword-based computation in a Resistive RAM-based accelerator to achieve energy saving and speedup. First, an encoding method is proposed for the weights and activations to reduce the energy consumption of the in-situ computation and the resolution requirement of the ADC. The ADC resolution is then further optimized based on the distribution of the subword computation results. Furthermore, a dynamic quantization scheme is proposed that skips 67%–87% of the subword computations and outperforms conventional quantization schemes. We fully investigate the influence of the encoding scheme and the layer-wise quantization range scaling on the performance of dynamic quantization. Finally, we demonstrate the effectiveness of the proposed algorithms under different hardware configurations and network complexities. A dedicated architecture, SubMac, is proposed to implement these schemes. Experimental results show that the energy efficiency and throughput are improved by 2.8–5.7 and 2.5–7.9 times, respectively, compared with state-of-the-art Resistive RAM-based accelerators.
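
The abstract's central idea, decomposing each multiply-accumulate into sub-word (bit-sliced) partial products that can be evaluated or skipped individually, can be illustrated with a small numerical sketch. The snippet below is a minimal software model assuming 8-bit unsigned weights and activations split into 2-bit slices; the names (bit_slices, subword_dot), the relative-tolerance skipping rule, and the most-significant-first processing order are illustrative assumptions only, not the encoding, ADC optimization, or dynamic quantization scheme actually used in SubMac.

import numpy as np

def bit_slices(x, bits=8, width=2):
    # Split unsigned integers into little-endian sub-word slices of `width` bits.
    mask = (1 << width) - 1
    return [(x >> (i * width)) & mask for i in range(bits // width)]

def subword_dot(weights, activations, bits=8, width=2, rel_tol=0.0):
    # Dot product expressed as a sum of shifted sub-word partial products.
    # Each (weight slice, activation slice) pair stands in for one analog
    # crossbar read followed by a low-resolution ADC conversion. Pairs are
    # visited from most to least significant, and a pair is skipped when even
    # its largest possible contribution falls below rel_tol of the running
    # sum. This is a crude stand-in for dynamic quantization, not the
    # paper's actual algorithm.
    w_sl = bit_slices(np.asarray(weights, dtype=np.int64), bits, width)
    a_sl = bit_slices(np.asarray(activations, dtype=np.int64), bits, width)
    n = len(weights)
    max_partial = ((1 << width) - 1) ** 2 * n   # upper bound on any partial sum
    pairs = sorted(((i + j, i, j) for i in range(len(w_sl)) for j in range(len(a_sl))),
                   reverse=True)
    total, skipped = 0, 0
    for slice_shift, i, j in pairs:
        shift = slice_shift * width
        if total > 0 and (max_partial << shift) < rel_tol * total:
            skipped += 1                        # subword computation skipped entirely
            continue
        total += int(np.dot(w_sl[i], a_sl[j])) << shift   # one subword computation
    return total, skipped

# With rel_tol = 0 the decomposition is exact and matches the full dot product.
w = np.random.randint(0, 256, size=64)
a = np.random.randint(0, 256, size=64)
exact, _ = subword_dot(w, a, rel_tol=0.0)
assert exact == int(np.dot(w.astype(np.int64), a.astype(np.int64)))
approx, n_skipped = subword_dot(w, a, rel_tol=1e-3)

With rel_tol = 0 the bit-sliced decomposition reproduces the full-precision result exactly; raising rel_tol trades accuracy for skipped slice pairs, loosely mirroring the computation-skipping idea (67%–87% of subword computations) described in the abstract.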

Note:CCF-C

Co-author:Jingbo Jiang, Jingyang Zhu, Chi-Ying Tsui

First Author:Xizi Chen

Translation or Not:no

Date of Publication:2019-01-01

Included Journals:SCI