Conference Papers

[SC'24] (To appear) RecFlex: Enabling Feature Heterogeneity-Aware Optimization for Deep Recommendation Models with Flexible Schedules. Zaifeng Pan, Zhen Zheng, Feng Zhang, Bing Xie, Ruofan Wu, Shaden Smith, Chuanjie Liu, Olatunji Ruwase, Xiaoyong Du, Yufei Ding.

[USENIX ATC'24] (To appear) OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Model. Zheng Wang, Yuke Wang, Boyuan Feng, Guyue Huang, Dheevatsa Mudigere, Bharath Muthiah, Ang Li, Yufei Ding.

[ISCA'24] (To appear) Soter: Analytical Tensor-Architecture Modeling and Automatic Tensor Program Tuning for Spatial Accelerators. Fuyu Wang, Minghua Shen, Yufei Ding, Nong Xiao.

[ASPLOS'24][Distinguished Artifact Award] OnePerc: A Randomness-aware Compiler for Photonic Quantum Computing. Hezi Zhang, Jixuan Ruan, Hassan Shapourian, Ramana Rao Kompella, Yufei Ding. [paper]

[ASPLOS'24][Distinguished Artifact Award] EVT: Accelerating Deep Learning Training with Epilogue Visitor Tree. Zhaodong Chen, Andrew Kerr, Richard Cai, Jack Kosaian, Haicheng Wu, Yufei Ding, Yuan Xie. [paper]

[ASPLOS'24] MECH: Multi-Entry Communication Highway for Superconducting Quantum Chiplets. Hezi Zhang, Keyi Yin, Anbang Wu, Hassan Shapourian, Alireza Shabani, Yufei Ding. [paper]

[ASPLOS'24] RAP: Resource-aware Automated GPU Sharing for Multi-GPU Recommendation Model Training and Input Preprocessing. Zheng Wang, Yuke Wang, Jiaqi Deng, Da Zheng, Ang Li, Yufei Ding. [paper]

[ASPLOS'24] ZENO: A Type-based Optimization Framework for Zero Knowledge Neural Network Inference. Boyuan Feng, Zheng Wang, Yuke Wang, Shu Yang, Yufei Ding. [paper]

[MICRO'23] QuComm: Optimizing Collective Communication for Distributed Quantum Computing. Anbang Wu, Yufei Ding, Ang Li. [paper]

[MICRO'23] RM-STC: Row-Merge Dataflow Inspired GPU Sparse Tensor Core for Energy-Efficient Sparse Acceleration. Guyue Huang, Zhengyang Wang, Po-An Tsai, Chen Zhang, Yufei Ding, Yuan Xie. [paper]

[USENIX ATC'23] TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs. Yuke Wang, Boyuan Feng, Zheng Wang, Guyue Huang, Yufei Ding. [paper]

[OSDI'23] MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms. Yuke Wang, Boyuan Feng, Zheng Wang, Tong Geng, Ang Li, Kevin Barker, Yufei Ding. [paper]

[ISCA'23] OneQ: A Compilation Framework for Photonic One-Way Quantum Computation. Hezi Zhang, Anbang Wu, Yuke Wang, Gushu Li, Hassan Shapourian, Alireza Shabani, Yufei Ding. [paper]

[ISCA'23] Q-BEEP: Quantum Bayesian Error Mitigation Employing Poisson Modeling over the Hamming Spectrum. Samuel Stein, Nathan Wiebe, Yufei Ding, James Ang, Ang Li. [paper]

[ISCA'23] ECSSD: Hardware/Data Layout Co-Designed In-Storage-Computing Architecture for Extreme Classification. Siqi Li, Fengbin Tu, Liu Liu, Jilan Lin, Zheng Wang, Yangwook Kang, Yufei Ding, Yuan Xie. [paper]

[MLSys'23] ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs. Guyue Huang, Yang Bai, Liu Liu, Yuke Wang, Bei Yu, Yufei Ding, Yuan Xie. [paper]

[VLDB'23] SPG: Structure-Private Graph Database via SqueezePIR. Ling Liang, Jilan Lin, Zheng Qu, Ishtiyaque Ahmad, Liu Liu, Fengbin Tu, Trinabh Gupta, Yufei Ding, Yuan Xie. [paper]

[PPoPP'23] Dynamic N:M Fine-grained Structured Sparse Attention Mechanism. Zhaodong Chen, Zheng Qu, Yuying Quan, Liu Liu, Yufei Ding, Yuan Xie. [paper]

[MICRO'22] AutoComm: A Framework for Enabling Efficient Communication in Distributed Quantum Program. Anbang Wu, Hezi Zhang, Gushu Li, Alireza Shabani, Yuan Xie, and Yufei Ding. [paper]

[SC'22] EL-Rec: Efficient Large-scale Recommendation Model Training via Tensor-train Embedding. Zheng Wang, Yuke Wang, Boyuan Feng, Dheevatsa Mudigere, Bharath Muthiah, Yufei Ding. [paper]

[USENIX ATC'22] Faith: An Efficient Framework for Transformer Verification on GPUs. Boyuan Feng, Tianqi Tang, Yuke Wang, Zhaodong Chen, Zheng Wang, Shu Yang, Yuan Xie, Yufei Ding. [paper]

[SC'22] LightSeq2: Accelerated Training for Transformer-Based Models on GPUs. Xiaohui Wang, Yang Wei, Ying Xiong, Guyue Huang, Xian Qian, Yufei Ding, Mingxuan Wang, Lei Li. [paper]

[ISCA'22] A Synthesis Framework for Stitching Surface Code with Superconducting Quantum Devices. Anbang Wu, Gushu Li, Hezi Zhang, Gian Giacomo Guerreschi, Yufei Ding, and Yuan Xie. [paper]

[ISCA'22][Best Paper Nominee] EQC: Ensembled Quantum Computing for Variational Quantum Algorithms. Samuel Stein, Nathan Wiebe, Yufei Ding, Peng Bo, Karol Kowalski, Nathan Baker, James Ang, and Ang Li. [paper]

[ISCA'22] INSPIRE: In-Storage Private Information Retrieval via Protocol and Architecture Co-design. Jilan Lin, Ling Liang, Zheng Qu, Ishtiyaque Ahmad, Liu Liu, Fengbin Tu, Trinabh Gupta, Yufei Ding, Yuan Xie. [paper]

[DAC'22] Shfl-BW: Accelerating Deep Neural Network Inference with Tensor-Core Aware Weight Pruning. Guyue Huang, Haoran Li, Minghai Qin, Fei Sun, Yufei Ding and Yuan Xie. [paper] [code]

[DAC'22][Best Paper Nominee] Heuristic Adaptability to Input Dynamics for SpMM on GPUs. Guohao Dai, Guyue Huang, Shang Yang, Zhongming Yu, Hengrui Zhang, Yufei Ding, Yuan Xie, Huazhong Yang, Yu Wang. [paper] [code]

[MLSys'22] Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective. Hengrui Zhang, Zhongming Yu, Guohao Dai, Guyue Huang, Yufei Ding, Yuan Xie, Yu Wang. [paper] [code]

[PPoPP'22] QGTC: Accelerating Quantized GNN via GPU Tensor Core. Yuke Wang*, Boyuan Feng*, Yufei Ding. (* co-primary authors) [paper]

[ASPLOS'22] Paulihedral: A Generalized Block-Wise Compiler Optimization Framework For Quantum Simulation Kernels. Gushu Li, Anbang Wu, Yunong Shi, Ali Javadi-Abhari, Yufei Ding, Yuan Xie. [paper]

[ASPLOS'22] DOTA: Detect and Omit Weak Attentions for Scalable Transformer Acceleration. Zheng Qu, Liu Liu, Fengbin Tu, Zhaodong Chen, Yufei Ding, Yuan Xie. [paper]

[ISSCC'22] A 28nm 15.59μJ/Token Full-Digital Bitline-Transpose CIM-based Sparse Transformer Accelerator with Pipeline/Parallel Reconfigurable Modes. Fengbin Tu, Zihan Wu, Yiqi Wang, Ling Liang, Liu Liu, Yufei Ding, Leibo Liu, Shaojun Wei, Yuan Xie, Shouyi Yin. [paper]

[ISSCC'22] A 28nm 29.2TFLOPS/W BF16 and 36.5TOPS/W INT8 Reconfigurable Digital CIM Processor with Unified FP/INT Pipeline and Bitwise in-Memory Booth Multiplication for Cloud Deep Learning Acceleration. Fengbin Tu, Yiqi Wang, Zihan Wu, Ling Liang, Yufei Ding, Bongjin Kim, Leibo Liu, Shaojun Wei, Yuan Xie, Shouyi Yin. [paper]

[NeurIPS'22] Biologically Inspired Dynamic Thresholds for Spiking Neural Networks. Jianchuan Ding, Bo Dong, Felix Heide, Yufei Ding, Yunduo Zhou, Baocai Yin, Xin Yang. [paper]

[CIKM'21][Spotlight] An Efficient Quantitative Approach for Optimizing Convolutional Neural Networks. Yuke Wang, Boyuan Feng, Xueqiao Peng, Yufei Ding. [paper]

[NanoCom'21](Invited) On the Co-Design of Quantum Software and Hardware. Gushu Li, Anbang Wu, Yunong Shi, Ali Javadi-Abhari, Yufei Ding, Yuan Xie. [paper]

[SC'21] APNN-TC: Accelerating Arbitrary-Precision Neural Networks on Tensor Cores. Boyuan Feng*, Yuke Wang*, Tong Geng, Ang Li, Yufei Ding (* co-primary authors). [paper]

[SC'21] Efficient Tensor Core-based GPU Kernels for Structured Sparsity under Reduced Precision. Zhaodong Chen*, Zheng Qu*, Liu Liu, Yufei Ding, Yuan Xie (* co-primary authors). [paper]

[USENIX ATC'21] Palleon: A Runtime System for Efficient Video Processing toward Dynamic Class Skew. Boyuan Feng, Yuke Wang, Gushu Li, Yuan Xie, Yufei Ding. [paper]

[CCGrid'21] TiAcc: Triangle-inequality based Hardware Accelerator for K-means on FPGAs. Yuke Wang, Boyuan Feng, Gushu Li, Georgios Tzimpragos, Lei Deng, Yuan Xie, Yufei Ding. [paper]

[OSDI'21] GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs. Yuke Wang, Boyuan Feng, Gushu Li, Shuangchen Li, Lei Deng, Yuan Xie, Yufei Ding. [paper]

[IPDPS'21] DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolution. Yuke Wang, Boyuan Feng, Yufei Ding. [paper]

[PPoPP'21] EGEMM-TC: Accelerating Scientific Computing on Tensor Cores with Extended Precision. Boyuan Feng, Yuke Wang, Guoyang Chen, Weifeng Zhang, Yuan Xie, Yufei Ding. [paper]

[AAAI'21] UAG: Uncertainty-aware Attention Graph Neural Network for Defending Adversarial Attacks. Boyuan Feng, Yuke Wang, Yufei Ding. [paper]

[ICASSP'21] SAGA: Sparse Adversarial Attack on EEG-based Brain Computer Interface. Boyuan Feng, Yuke Wang, Yufei Ding. [paper]

[MICRO'21] ENMC: Extreme Near-Memory Classification via Approximate Screening. Liu Liu*, Jilan Lin*, Zheng Qu, Yufei Ding, Yuan Xie (* co-primary authors). [paper]

[MICRO'21] Improving Streaming Graph Processing Performance Using Input Knowledge. Abanti Basak, Zheng Qu, Jilan Lin, Alaa R. Alameldeen, Zeshan Chishti, Yufei Ding, Yuan Xie. [paper]

[ICCAD'21] Overcoming the Memory Hierarchy Inefficiencies in Graph Processing Applications. Jilan Lin, Shuangchen Li, Yufei Ding, Yuan Xie. [paper]

[AAAI'20] Weighted-Sampling Audio Adversarial Example Attack. Xiaolei Liu, Kun Wan, Yufei Ding, Xiaosong Zhang, Qingxin Zhu. [paper]

[ICTAI'20] SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization. Boyuan Feng*, Yuke Wang*, Xu Li, Shu Yang, Xueqiao Peng, Yufei Ding (* co-primary authors). [paper]

[NPC'20] A Close Look at Multi-Tenant Parallel CNN Inference for Autonomous Driving. Yitong Huang, Yu Zhang, Boyuan Feng, Xing Guo, Yanyong Zhang, Yufei Ding. [paper]

[OOPSLA'20][Distinguished Paper Award] Projection-based Runtime Assertions for Testing and Debugging Quantum Programs. Gushu Li, Li Zhou, Nengkun Yu, Yufei Ding, Mingsheng Ying, Yuan Xie. [paper]

[MICRO'20] DUET: Boosting Deep Neural Network Efficiency on Dual-Module Architecture. Liu Liu, Zheng Qu, Lei Deng, Fengbin Tu, Shuangchen Li, Xing Hu, Zhenyu Gu, Yufei Ding, Yuan Xie. [paper]

[ICML'20] Boosting Deep Neural Network Efficiency with Dual-Module Inference. Liu Liu, Lei Deng, Zhaodong Chen, Yuke Wang, Shuangchen Li, Jingwei Zhang, Yihua Yang, Zhenyu Gu, Xing Hu, Yufei Ding, Yuan Xie. [paper]

[DAC'20] Eliminating Redundant Computation in Noisy Quantum Computing Simulation. Gushu Li, Yufei Ding, Yuan Xie. [paper]

[ISCA'20] iPIM: Programmable In-Memory Image Processing Accelerator Using Near-Bank Architecture. Peng Gu, Xinfeng Xie, Yufei Ding, Guoyang Chen, Weifeng Zhang, Dimin Niu, Yuan Xie. [paper]

[ASPLOS'20] Towards Efficient Superconducting Quantum Processor Architecture Design. Gushu Li, Yufei Ding, Yuan Xie. [paper]

[ASPLOS'20] DeepSniffer: a DNN Model Extraction Framework based on Learning Architectural Hints. Xing Hu, Ling Liang, Lei Deng, Shuangchen Li, Xinfeng Xie, Yu Ji, Yufei Ding, Chang Liu, Timothy Sherwood, Yuan Xie. [paper]

[ICTAI'19] Reconciling Feature-Reuse and Overfitting in DenseNet with Specialized Dropout. Kun Wan, Shu Yang, Boyuan Feng, Lingwei Xie, Yufei Ding. [paper]

[FCCM'19](Poster) KPynq: A Work-Efficient Triangle-Inequality based K-means on FPGA. Yuke Wang, Zhaorui Zeng, Boyuan Feng, Lei Deng, Yufei Ding. [paper]

[ASPLOS'19] Tackling the Qubit Mapping Problem for NISQ-Era Quantum Devices. Gushu Li, Yufei Ding, Yuan Xie. [paper]

[ICLR'19] Dynamic Sparse Graph for Efficient Deep Learning. Liu Liu, Lei Deng, Xing Hu, Maohua Zhu, Guoqi Li, Yufei Ding, Yuan Xie. [paper]

[NSF'18] Inter-Disciplinary Research Challenges in Computer Systems for the 2020s. Albert Cohen, Xipeng Shen, Josep Torrellas, James Tuck, Yuanyuan Zhou, et al. Technical Report, National Science Foundation, USA, December, 2018. [paper]

[SysML'18] TOP: A Compiler-Based Framework for Optimizing Machine Learning Algorithms through Generalized Triangle Inequality. Yufei Ding, Lin Ning, Hui Guan, Xipeng Shen, Madanlal Musuvathi, Todd Mytkowicz. [paper]

[ICDE'18] Reuse-Centric K-Means Configuration. Hui Guan, Yufei Ding, Xipeng Shen, Hamid Krim. [paper]

[OOPSLA'17] GLORE: Generalized Loop Redundancy Elimination upon LER-Notation. Yufei Ding, Xipeng Shen. [paper]

[PLDI'17] Generalizations of the Theory and Deployment of Triangular Inequality for Compiler-Based Strength Reduction. Yufei Ding, Lin Ning, Hui Guan, Xipeng Shen. [paper]

[ICDE'17] Sweet KNN: An Efficient KNN on GPU through Reconciliation of Redundancy and Regularity. Guoyang Chen, Yufei Ding, Xipeng Shen. [paper]

[PLDI'15] Autotuning Algorithmic Choice for Input Sensitivity. Yufei Ding, Jason Ansel, Kalyan Veeramachaneni, Xipeng Shen, Una-May O'Reilly, Saman Amarasinghe. [paper]

[ICML'15] Yinyang K-Means: A Drop-In Replacement of the Classic K-Means with Consistent Speedup. Yufei Ding, Yue Zhao, Xipeng Shen, Madan Musuvathi, Todd Mytkowicz. [paper]

[VLDB'15] TOP: A Framework for Enabling Algorithmic Optimizations for Distance-Related Problems. Yufei Ding, Xipeng Shen, Madan Musuvathi, Todd Mytkowicz. [paper]

[ASPLOS'14] Finding the Limit: Examining the Potential and Complexity of Compilation Scheduling for JIT-Based Runtime System. Yufei Ding, Mingzhou Zhou, Zhijia Zhao, Sarah Eisenstat, Xipeng Shen. [paper]

[OOPSLA'14] Call Sequence Prediction through Probabilistic Calling Automata. Zhijia Zhao, Bo Wu, Mingzhou Zhou, Yufei Ding, Jianhua Sun, Xipeng Shen, Youfeng Wu. [paper]

[CGO'13] ProfMig: A Framework for Flexible Migration of Program Profiles Across Software Versions. Mingzhou Zhou, Bo Wu, Yufei Ding, and Xipeng Shen. [paper]

Journals

[TACO'23] MPU: Memory-Centric SIMT Processor via In-DRAM Near-Bank Computing. Xinfeng Xie, Peng Gu, Yufei Ding, Dimin Niu, Hongzhong Zheng, Yuan Xie. [paper]

[TC'22] A Systematic View of Model Leakage Risks in Deep Neural Network Systems. Xing Hu, Ling Liang, Xiaobing Chen, Lei Deng, Yu Ji, Yufei Ding, Zidong Du, Qi Guo, Timothy Sherwood, Yuan Xie. [paper]

[TC'22] Dynamic Sparse Attention for Scalable Transformer Acceleration. Liu Liu, Zheng Qu, Zhaodong Chen, Fengbin Tu, Yufei Ding, Yuan Xie. [paper]

[TCAD'22] SDP: Co-Designing Algorithm, Dataflow, and Architecture for in-SRAM Sparse NN Acceleration. Fengbin Tu, Yiqi Wang, Ling Liang, Yufei Ding, Leibo Liu, Shaojun Wei, Shouyi Yin, Yuan Xie. [paper]

[JSSC'23] TranCIM: Full-Digital Bitline-Transpose CIM-based Sparse Transformer Accelerator with Pipeline/Parallel Reconfigurable Modes. Fengbin Tu, Zihan Wu, Yiqi Wang, Ling Liang, Liu Liu, Yufei Ding, Leibo Liu, Shaojun Wei, Yuan Xie, Shouyi Yin. [paper]

[TCAD'21] STPAcc: Structural TI-based Pruning for Accelerating Distance-related Algorithms on CPU-FPGA Platforms. Yuke Wang, Boyuan Feng, Gushu Li, Lei Deng, Yuan Xie, Yufei Ding. [paper]

[TNNLS'21] Exploring Adversarial Attack in Spiking Neural Networks with Spike-Compatible Gradient. Ling Liang, Xing Hu, Lei Deng, Yujie Wu, Guoqi Li, Yufei Ding, Peng Li, Yuan Xie. [paper]

[Information Systems'21] Reuse-centric K-means Configuration. Lijun Zhang, Hui Guan, Yufei Ding, Xipeng Shen, Hamid Krim. [paper]

[TCAD'21] Rubik: A Hierarchical Architecture for Efficient Graph Neural Network Training. Xiaobing Chen, Yuke Wang, Xinfeng Xie, Xing Hu, Abanti Basak, Ling Liang, Mingyu Yan, Lei Deng, Yufei Ding, Zidong Du, Yuan Xie. [paper]

[BIOINF'20] Domain-Adversarial Multi-Task Framework for Novel Therapeutic Property Prediction of Compounds. Lingwei Xie, Song He, Zhongnan Zhang, Kunhui Lin, Xiaochen Bo, Shu Yang, Boyuan Feng, Kun Wan, Kang Yang, Jie Yang, Yufei Ding. [paper]

[JAMT'20] Chatter detection in high-speed milling processes based on ON-LSTM and PBT. Fei Shi, Hongrui Cao, Yuke Wang, Boyuan Feng, Yufei Ding. [paper]

[TNNLS'20] Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization. Lei Deng, Yujie Wu, Yifan Hu, Ling Liang, Guoqi Li, Xing Hu, Yufei Ding, Peng Li, Yuan Xie. [paper]

[TNNLS'20] Effective and Efficient Batch Normalization Using Few Uncorrelated Data for Statistics Estimation. Zhaodong Chen, Lei Deng, Guoqi Li, Jiawei Sun, Ling Liang, Xing Hu, Yufei Ding, Yuan Xie. [paper]

[JSSC'20] Tianjic: A Unified and Scalable Chip Bridging Neuroscience and Deep Learning. Lei Deng, Guanrui Wang, Guoqi Li, Shuangchen Li, Ling Liang, Maohua Zhu, Yujie Wu, Jing Pei, Zhenzhi Wu, Xing Hu, Yufei Ding, Wei He, Yuan Xie, Luping Shi. [paper]

[Neural Networks'19] Rethinking the Performance Comparison between SNNs and ANNs. Lei Deng, Yujie Wu, Xing Hu, Ling Liang, Yufei Ding, Guoqi Li, Guangshe Zhao, Peng Li, Yuan Xie. [paper]

[TVLSI'19] DASM: DAta-Streaming Based Computing in Non-Volatile Memory Architecture for Embedded System. Liang Chang, Xin Ma, Zhaohao Wang, Youguang Zhang, Yufei Ding, Weisheng Zhao, Yuan Xie. [paper]

Patents

[US Patent] Machine learning through parallelized stochastic gradient descent. Madanlal S. Musuvathi, Todd D. Mytkowicz, Yufei Ding.

[US Patent] Determining a course of action based on aggregated data. Madanlal S. Musuvathi, Todd D. Mytkowicz, Saeed Maleki, Yufei Ding.

[US Patent] Implementing network security measures in response to a detected cyber attack. Madanlal S. Musuvathi, Todd D. Mytkowicz, Saeed Maleki, Yufei Ding.

[US Patent] Determining a likelihood of a resource experiencing a problem based on telemetry data. Madanlal S. Musuvathi, Todd D. Mytkowicz, Saeed Maleki, Yufei Ding.

[US Patent] Determining a likelihood of a user interaction with a content element. Madanlal S. Musuvathi, Todd D. Mytkowicz, Saeed Maleki, Yufei Ding.