
陈夕子

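Master's Supervisor
Name: 陈夕子
English name: Xizi Chen
Pinyin name: chenxizi
Position: Full-time faculty
Main appointment: Full-time faculty
Employment status: Active
Education: Doctorate
Degree: Doctoral degree
Office: Room 413, Block B, First Comprehensive Building, Huazhong Agricultural University
Email:
Alma mater: The Hong Kong University of Science and Technology
College: College of Informatics
Unit: College of Informatics
Disciplines: Computer Architecture; Computer Application Technology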
Other contact information

Mailing/office address:

Publications
Tight Compression: Compressing CNN Through Fine-Grained Pruning and Weight Permutation for Efficient Implementation
Posted: 2021-09-08

DOI: 10.1109/TCAD.2022.3178047

Affiliation: HZAU & HKUST

Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD, CCF Class A)

Keywords: Deep learning, pruning, weight sparsity, neural network compression

Abstract: The unstructured sparsity that results from pruning makes it difficult to implement deep learning models efficiently on existing regular architectures such as systolic arrays. Coarse-grained structured pruning, on the other hand, suits regular architectures but tends to lose more accuracy than unstructured pruning when the pruned models are of the same size. In this work, we propose a model compression method based on a novel weight permutation scheme to fully exploit fine-grained weight sparsity in the hardware design. Through permutation, the optimal arrangement of the weight matrix is obtained, and the sparse weight matrix is further compressed into a small, dense format to make full use of the hardware resources. Two pruning granularities are explored. In addition to unstructured weight pruning, we also propose a more fine-grained subword-level pruning to further improve the compression performance. Compared with state-of-the-art works, the matrix compression rate is improved from 5.88× to 14.13×. As a result, throughput and energy efficiency are improved by 2.75× and 1.86×, respectively.
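
The idea of compressing a sparse weight matrix into a smaller, denser one can be illustrated with a toy sketch. The code below is not the permutation scheme from the paper; it is a simplified greedy packing, assumed here purely for illustration, that merges rows whose non-zero columns do not collide. The names `pack_rows` and `unpack_rows` are hypothetical.

```python
def pack_rows(matrix):
    """Greedily merge rows whose non-zero column positions do not overlap.

    Illustrative assumption only, not the paper's algorithm.
    Returns a list of packed rows; each packed row is a dict mapping
    column -> (source_row, value).
    """
    packed = []
    for r, row in enumerate(matrix):
        support = {c for c, v in enumerate(row) if v != 0}
        if not support:
            continue  # a fully pruned row needs no storage
        for group in packed:
            # Iterating a dict yields its keys, i.e. the occupied columns.
            if support.isdisjoint(group):
                group.update({c: (r, row[c]) for c in support})
                break
        else:
            # No existing packed row had room, so start a new one.
            packed.append({c: (r, row[c]) for c in support})
    return packed


def unpack_rows(packed, n_rows, n_cols):
    """Rebuild the original sparse matrix from its packed form."""
    dense = [[0] * n_cols for _ in range(n_rows)]
    for group in packed:
        for c, (r, v) in group.items():
            dense[r][c] = v
    return dense


# A 4x4 pruned weight matrix with four surviving weights.
weights = [
    [0, 3, 0, 0],
    [5, 0, 0, 0],
    [0, 0, 0, 7],
    [0, 2, 0, 0],
]
packed = pack_rows(weights)
compression_rate = len(weights) / len(packed)
```

In this example rows 0, 1, and 2 share one packed row, while row 3 collides with row 0 in column 1 and needs a second packed row, so four rows shrink to two. Reordering the rows before packing can change how many collisions occur, which is a rough intuition for why the arrangement of the weight matrix matters.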

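Note: China Computer Federation (CCF) Class A

Co-authors: Jingyang Zhu, Jingbo Jiang, Chi-Ying Tsui

First author: Xizi Chen

Paper type: Journal article

Translated work:

Publication date: 2023-02-01

Indexed in: SCI

Journal link: https://ieeexplore.ieee.org/document/9781609/metrics#metrics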