
Xizi Chen

Supervisor of Master's Candidates
Name (Simplified Chinese):Xizi Chen
Name (English):Xizi Chen
Name (Pinyin):chenxizi
Administrative Position:Full-time Teacher
Academic Titles:Full-time Teacher
Status:Employed
Education Level:Doctoral (PhD)
Degree:Doctoral degree
Business Address:Room 413, Block B, No. 1 Comprehensive Building, Huazhong Agricultural University
E-Mail:
Alma Mater:The Hong Kong University of Science and Technology
Teacher College:College of Informatics
School/Department:Huazhong Agricultural University
Discipline:Computer Architecture; Computer Applications Technology

Paper Publications
SubMac: Exploiting the Subword-Based Computation in RRAM-Based CNN Accelerator for Energy Saving and Speedup
Release time:2021-09-08

Affiliation of Author(s):The Hong Kong University of Science and Technology (HKUST)

Journal:Integration, the VLSI Journal (CCF-C)

Funded by:This work is partially supported by Hong Kong Research Grant Council (RGC) under Grant 619813.

Key Words:Convolutional Neural Network (CNN), Resistive RAM, data encoding, dynamic quantization, computation saving

Abstract:Although CMOS-based CNN accelerators have achieved impressive progress, the memory wall and high power density remain major barriers to substantial improvements in energy efficiency and throughput. As an attractive alternative, Resistive RAM-based accelerators have recently delivered significant breakthroughs by leveraging in-situ computation. However, challenges remain, including high computation complexity and the large energy overhead of the analog/digital interfacing circuits. In this work, we exploit subword-based computation in a Resistive RAM-based accelerator to achieve energy saving and speedup. First, an encoding method is proposed for the weights and activations to reduce the energy consumption of the in-situ computation and the resolution requirement of the ADC. The ADC resolution is then further optimized based on the distribution of the subword computation results. Furthermore, a dynamic quantization scheme is proposed that skips 67%–87% of the subword computations and outperforms conventional quantization schemes. We fully investigate the influence of the encoding scheme and the layer-wise quantization range scaling on the performance of dynamic quantization. Finally, we demonstrate the effectiveness of the proposed algorithms under different hardware configurations and network complexities. A dedicated architecture, SubMac, is proposed to implement these schemes. Experimental results show that the energy efficiency and throughput are improved by 2.8–5.7 and 2.5–7.9 times, respectively, compared with state-of-the-art Resistive RAM-based accelerators.
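
The abstract's central idea, decomposing each multiply-accumulate into sub-word (bit-sliced) partial products that can be evaluated or skipped individually, can be illustrated with a small numerical sketch. The snippet below is a minimal software model assuming 8-bit unsigned weights and activations split into 2-bit slices; the names (bit_slices, subword_dot), the relative-tolerance skipping rule, and the most-significant-first processing order are illustrative assumptions only, not the encoding, ADC optimization, or dynamic quantization scheme actually used in SubMac.

import numpy as np

def bit_slices(x, bits=8, width=2):
    # Split unsigned integers into little-endian sub-word slices of `width` bits.
    mask = (1 << width) - 1
    return [(x >> (i * width)) & mask for i in range(bits // width)]

def subword_dot(weights, activations, bits=8, width=2, rel_tol=0.0):
    # Dot product expressed as a sum of shifted sub-word partial products.
    # Each (weight slice, activation slice) pair stands in for one analog
    # crossbar read followed by a low-resolution ADC conversion. Pairs are
    # visited from most to least significant, and a pair is skipped when even
    # its largest possible contribution falls below rel_tol of the running
    # sum. This is a crude stand-in for dynamic quantization, not the
    # paper's actual algorithm.
    w_sl = bit_slices(np.asarray(weights, dtype=np.int64), bits, width)
    a_sl = bit_slices(np.asarray(activations, dtype=np.int64), bits, width)
    n = len(weights)
    max_partial = ((1 << width) - 1) ** 2 * n   # upper bound on any partial sum
    pairs = sorted(((i + j, i, j) for i in range(len(w_sl)) for j in range(len(a_sl))),
                   reverse=True)
    total, skipped = 0, 0
    for slice_shift, i, j in pairs:
        shift = slice_shift * width
        if total > 0 and (max_partial << shift) < rel_tol * total:
            skipped += 1                        # subword computation skipped entirely
            continue
        total += int(np.dot(w_sl[i], a_sl[j])) << shift   # one subword computation
    return total, skipped

# With rel_tol = 0 the decomposition is exact and matches the full dot product.
w = np.random.randint(0, 256, size=64)
a = np.random.randint(0, 256, size=64)
exact, _ = subword_dot(w, a, rel_tol=0.0)
assert exact == int(np.dot(w.astype(np.int64), a.astype(np.int64)))
approx, n_skipped = subword_dot(w, a, rel_tol=1e-3)

With rel_tol = 0 the bit-sliced decomposition reproduces the full-precision result exactly; raising rel_tol trades accuracy for skipped slice pairs, loosely mirroring the computation-skipping idea (67%–87% of subword computations) described in the abstract.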

Note:CCF-C

Co-author:Jingbo Jiang, Jingyang Zhu, Chi-Ying Tsui

First Author:Xizi Chen

Translation or Not:no

Date of Publication:2019-01-01

Included Journals:SCI