A Scalable Multi-TeraOPS Deep Learning Processor Core for AI Training and Inference B Fleischer, S Shukla, M Ziegler, J Silberman, J Oh, V Srinivasan, J Choi, ... 2018 IEEE Symposium on VLSI Circuits, 35-36, 2018 | 154 | 2018 |
LUT-GEMM: Quantized matrix multiplication based on LUTs for efficient inference in large-scale generative language models G Park, M Kim, S Lee, J Kim, B Kwon, SJ Kwon, B Kim, Y Lee, D Lee The Twelfth International Conference on Learning Representations, 2023 | 108 | 2023 |
Memory-efficient fine-tuning of compressed large language models via sub-4-bit integer quantization J Kim, JH Lee, S Kim, J Park, KM Yoo, SJ Kwon, D Lee Advances in Neural Information Processing Systems 36, 2024 | 75 | 2024 |
High-performance low-energy STT MRAM based on balanced write scheme D Lee, SK Gupta, K Roy Proceedings of the 2012 ACM/IEEE International Symposium on Low Power …, 2012 | 74 | 2012 |
DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation S Hong, S Moon, J Kim, S Lee, M Kim, D Lee, JY Kim 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), 616-630, 2022 | 57 | 2022 |
Structured Compression by Weight Encryption for Unstructured Pruning and Quantization SJ Kwon, D Lee, B Kim, P Kapoor, B Park, GY Wei Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020 | 48 | 2020 |
Soft-error-resilient FPGAs using built-in 2-D Hamming product code SP Park, D Lee, K Roy IEEE Transactions on Very Large Scale Integration (VLSI) Systems 20 (2), 248-256, 2011 | 48 | 2011 |
Maximum Likelihood Training of Implicit Nonlinear Diffusion Model D Kim, B Na, SJ Kwon, D Lee, W Kang, I Moon Advances in Neural Information Processing Systems 35, 32270-32284, 2022 | 45 | 2022 |
A review of on-device fully neural end-to-end automatic speech recognition algorithms C Kim, D Gowda, D Lee, J Kim, A Kumar, S Kim, A Garg, C Han 2020 54th Asilomar Conference on Signals, Systems, and Computers, 277-283, 2020 | 39 | 2020 |
BiQGEMM: matrix multiplication with lookup table for binary-coding-based quantized DNNs Y Jeon, B Park, SJ Kwon, B Kim, J Yun, D Lee SC20: International Conference for High Performance Computing, Networking …, 2020 | 35 | 2020 |
A scalable multi-TeraOPS core for AI training and inference S Shukla, B Fleischer, M Ziegler, J Silberman, J Oh, V Srinivasan, J Choi, ... IEEE Solid-State Circuits Letters 1 (12), 217-220, 2019 | 34 | 2019 |
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models SJ Kwon, J Kim, J Bae, KM Yoo, JH Kim, B Park, B Kim, JW Ha, N Sung, ... arXiv preprint arXiv:2210.03858, 2022 | 33 | 2022 |
R-MRAM: A ROM-Embedded STT MRAM Cache D Lee, X Fong, K Roy IEEE Electron Device Letters 34 (10), 1256-1258, 2013 | 33 | 2013 |
Extremely Low Bit Transformer Quantization for On-Device Neural Machine Translation I Chung, B Kim, Y Choi, SJ Kwon, Y Jeon, B Park, S Kim, D Lee arXiv preprint arXiv:2009.07453, 2020 | 30 | 2020 |
FlexRound: Learnable rounding based on element-wise division for post-training quantization JH Lee, J Kim, SJ Kwon, D Lee International Conference on Machine Learning, 18913-18939, 2023 | 26 | 2023 |
Area efficient ROM-embedded SRAM cache D Lee, K Roy IEEE Transactions on Very Large Scale Integration (VLSI) Systems 21 (9 …, 2013 | 26 | 2013 |
Viterbi-based efficient test data compression D Lee, K Roy IEEE Transactions on Computer-Aided Design of Integrated Circuits and …, 2012 | 25 | 2012 |
Energy-delay optimization of the STT MRAM write operation under process variations D Lee, K Roy IEEE Transactions on Nanotechnology 13 (4), 714-723, 2014 | 24 | 2014 |
DeepTwist: Learning Model Compression via Occasional Weight Distortion D Lee, P Kapoor, B Kim arXiv preprint arXiv:1810.12823, 2018 | 23 | 2018 |
Viterbi-based pruning for sparse matrix with fixed and high index compression ratio D Lee, D Ahn, T Kim, PI Chuang, JJ Kim International Conference on Learning Representations, 2018 | 23 | 2018 |