Cong Guo
Title
Cited by
Year
Accelerating sparse DNN models without hardware-support via tile-wise sparsity
C Guo, BY Hsueh, J Leng, Y Qiu, Y Guan, Z Wang, X Jia, X Li, M Guo, ...
Proceedings of the International Conference for High Performance Computing …, 2020
Cited by 92, 2020
Dual-side sparse tensor core
Y Wang, C Zhang, Z Xie, C Guo, Y Liu, J Leng
2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture …, 2021
Cited by 75, 2021
SQuant: On-the-fly data-free quantization via diagonal Hessian approximation
C Guo, Y Qiu, J Leng, X Gao, C Zhang, Y Liu, F Yang, Y Zhu, M Guo
arXiv preprint arXiv:2202.07471, 2022
Cited by 65, 2022
Adversarial defense through network profiling based path extraction
Y Qiu, J Leng, C Guo, Q Chen, C Li, M Guo, Y Zhu
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019
Cited by 62, 2019
Olive: Accelerating large language models via hardware-friendly outlier-victim pair quantization
C Guo, J Tang, W Hu, J Leng, C Zhang, F Yang, Y Liu, M Guo, Y Zhu
Proceedings of the 50th Annual International Symposium on Computer …, 2023
Cited by 58, 2023
Characterizing and demystifying the implicit convolution algorithm on commercial matrix-multiplication accelerators
Y Zhou, M Yang, C Guo, J Leng, Y Liang, Q Chen, M Guo, Y Zhu
2021 IEEE International Symposium on Workload Characterization (IISWC), 214-225, 2021
Cited by 37, 2021
Ant: Exploiting adaptive numerical data type for low-bit deep neural network quantization
C Guo, C Zhang, J Leng, Z Liu, F Yang, Y Liu, M Guo, Y Zhu
2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO …, 2022
Cited by 35, 2022
Balancing efficiency and flexibility for DNN acceleration via temporal GPU-systolic array integration
C Guo, Y Zhou, J Leng, Y Zhu, Z Du, Q Chen, C Li, B Yao, M Guo
2020 57th ACM/IEEE Design Automation Conference (DAC), 1-6, 2020
Cited by 28, 2020
Efficient Adaptive Activation Rounding for Post-Training Quantization
Z Li, C Guo, Z Zhu, Y Zhou, Y Qiu, X Gao, J Leng, M Guo
arXiv preprint arXiv:2208.11945, 2022
Cited by 8, 2022
Nesting forward automatic differentiation for memory-efficient deep neural network training
C Guo, Y Qiu, J Leng, C Zhang, Y Cao, Q Zhang, Y Liu, F Yang, M Guo
2022 IEEE 40th International Conference on Computer Design (ICCD), 738-745, 2022
Cited by 7, 2022
GMLake: Efficient and Transparent GPU Memory Defragmentation for Large-scale DNN Training with Virtual Memory Stitching
C Guo, R Zhang, J Xu, J Leng, Z Liu, Z Huang, M Guo, H Wu, S Zhao, ...
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by 6, 2024
JUNO: Optimizing High-Dimensional Approximate Nearest Neighbour Search with Sparsity-Aware Algorithm and Ray-Tracing Core Mapping
Z Liu, W Ni, J Leng, Y Feng, C Guo, Q Chen, C Li, M Guo, Y Zhu
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by 5, 2024
Accelerating sparse DNNs based on tiled GEMM
C Guo, F Xue, J Leng, Y Qiu, Y Guan, W Cui, Q Chen, M Guo
IEEE Transactions on Computers, 2024
Cited by 4, 2024
vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving
J Xu, R Zhang, C Guo, W Hu, Z Liu, F Wu, Y Feng, S Sun, C Shao, Y Guo, ...
arXiv preprint arXiv:2407.15309, 2024
Cited by 1, 2024
Towards reliable AI applications via algorithm-based fault tolerance on NVDLA
MT Sanic, C Guo, J Leng, M Guo, W Ma
2022 18th International Conference on Mobility, Sensing and Networking (MSN …, 2022
Cited by 1, 2022
DSTC: Dual-Side Sparsity Tensor Core for DNNs Acceleration on Modern GPU Architectures
C Zhang, Y Wang, Z Xie, C Guo, Y Liu, J Leng, G Sun, Z Ji, R Wang, Y Xie, ...
IEEE Transactions on Computers, 2024
2024
A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models
C Guo, F Cheng, Z Du, J Kiessling, J Ku, S Li, Z Li, M Ma, T Molom-Ochir, ...
arXiv preprint arXiv:2410.07265, 2024
2024
AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs
Y Zhou, Y Song, J Leng, Z Liu, W Cui, Z Zhang, C Guo, Q Chen, L Li, ...
Proceedings of the 20th ACM International Conference on Computing Frontiers …, 2023
2023