Publications

Google Scholar

2025

  1. ICS
    BitWeaver: Read-Time Truncation near Memory
    Garrett Gagnon, Srikanth Malla, Yangwook Kang, and 1 more author
    In Proceedings of the ACM International Conference on Supercomputing, 2025
  2. DATE
    NORA: Noise-Optimized Rescaling of LLMs on Analog Compute-in-Memory Accelerators
    Yayue Hou, Hsinyu Tsai, Kaoutar El Maghraoui, and 3 more authors
    In Proceedings of the Design, Automation and Test in Europe Conference, 2025

2024

  1. NeurIPS-W
    MAPLE: Memory-Aware Predict and Load for Efficient LLM Inference
    Zhenyu Liu, Zhemin Zhang, Zirui Zhang, and 4 more authors
    In Workshop on Machine Learning and Compression, NeurIPS, 2024
  2. CAL
    SmartQuant: CXL-Based AI Model Store in Support of Runtime Configurable Weight Quantization
    Rui Xie, Asad Ul Haq, Linsen Ma, and 5 more authors
    IEEE Computer Architecture Letters, 2024
  3. Preprint
    Enhance DNN Adversarial Robustness and Efficiency via Injecting Noise to Non-Essential Neurons
    Zhenyu Liu, Garrett Gagnon, Swagath Venkataramani, and 1 more author
    arXiv preprint arXiv:2402.04325, 2024

2023

  1. ISCA
    ECSSD: Hardware/Data Layout Co-Designed In-Storage-Computing Architecture for Extreme Classification
    Siqi Li, Fengbin Tu, Liu Liu, and 5 more authors
    In Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
  2. MLSys
    ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs
    Guyue Huang, Yang Bai, Liu Liu, and 4 more authors
    Proceedings of Machine Learning and Systems, 2023
  3. PPoPP
    Dynamic N:M fine-grained structured sparse attention mechanism
    Zhaodong Chen, Zheng Qu, Yuying Quan, and 3 more authors
    In Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

2022

  1. JSSC
    TranCIM: Full-Digital Bitline-Transpose CIM-based Sparse Transformer Accelerator With Pipeline/Parallel Reconfigurable Modes
    Fengbin Tu, Zihan Wu, Yiqi Wang, and 7 more authors
    IEEE Journal of Solid-State Circuits, 2022
  2. TC
    Dynamic sparse attention for scalable transformer acceleration
    Liu Liu, Zheng Qu, Zhaodong Chen, and 3 more authors
    IEEE Transactions on Computers, 2022
  3. ISCA
    INSPIRE: in-storage private information retrieval via protocol and architecture co-design
    Jilan Lin, Ling Liang, Zheng Qu, and 6 more authors
    In Proceedings of the 49th Annual International Symposium on Computer Architecture, 2022
  4. ASPLOS
    A one-for-all and O(V log(V))-cost solution for parallel merge style operations on sorted key-value arrays
    Bangyan Wang, Lei Deng, Fei Sun, and 4 more authors
    In Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2022
  5. ASPLOS
    DOTA: detect and omit weak attentions for scalable transformer acceleration
    Zheng Qu, Liu Liu, Fengbin Tu, and 3 more authors
    In Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2022
  6. ISSCC
    A 28nm 15.59 µJ/token full-digital bitline-transpose CIM-based sparse transformer accelerator with pipeline/parallel reconfigurable modes
    Fengbin Tu, Zihan Wu, Yiqi Wang, and 7 more authors
    In 2022 IEEE International Solid-State Circuits Conference (ISSCC), 2022

2021

  1. Computer
    π-RT: A runtime framework to enable energy-efficient real-time robotic vision applications on heterogeneous architectures
    Liu Liu, Jie Tang, Shaoshan Liu, and 3 more authors
    Computer, 2021
  2. SC
    Efficient tensor core-based GPU kernels for structured sparsity under reduced precision
    Zhaodong Chen, Zheng Qu, Liu Liu, and 2 more authors
    In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2021
  3. MICRO
    ENMC: Extreme near-memory classification via approximate screening
    Liu Liu, Jilan Lin, Zheng Qu, and 2 more authors
    In 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

2020

  1. MICRO
    DUET: Boosting deep neural network efficiency on dual-module architecture
    Liu Liu, Zheng Qu, Lei Deng, and 6 more authors
    In 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
  2. DAC
    Computation on sparse neural networks and its implications for future hardware
    Fei Sun, Minghai Qin, Tianyun Zhang, and 3 more authors
    In 2020 57th ACM/IEEE Design Automation Conference (DAC), 2020
  3. ICML
    Boosting deep neural network efficiency with dual-module inference
    Liu Liu, Lei Deng, Zhaodong Chen, and 7 more authors
    In International Conference on Machine Learning, 2020

2019

  1. ICLR
    Dynamic Sparse Graph for Efficient Deep Learning
    Liu Liu, Lei Deng, Xing Hu, and 4 more authors
    In International Conference on Learning Representations, 2019

2018

  1. TCAD
    SemiMap: A semi-folded convolution mapping for speed-overhead balance on crossbars
    Lei Deng, Ling Liang, Guanrui Wang, and 7 more authors
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018
  2. Fast object tracking on a many-core neural network chip
    Lei Deng, Zhe Zou, Xin Ma, and 7 more authors
    Frontiers in Neuroscience, 2018
  3. TNNLS
    L1-norm batch normalization for efficient training of deep neural networks
    Shuang Wu, Guoqi Li, Lei Deng, and 4 more authors
    IEEE Transactions on Neural Networks and Learning Systems, 2018

2017

  1. ASP-DAC
    Building energy-efficient multi-level cell STT-RAM caches with data compression
    Liu Liu, Ping Chi, Shuangchen Li, and 2 more authors
    In 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC), 2017

2016

  1. ICCAD
    NVSim-CAM: a circuit-level simulator for emerging nonvolatile memory based content-addressable memory
    Shuangchen Li, Liu Liu, Peng Gu, and 2 more authors
    In 2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2016
  2. GLSVLSI
    Leveraging 3D technologies for hardware security: Opportunities and challenges
    Peng Gu, Shuangchen Li, Dylan Stow, and 4 more authors
    In Proceedings of the 26th Great Lakes Symposium on VLSI, 2016