Ankit Gupta

Cited by

	All	Since 2019
Citations	1350	1087
h-index	15	13
i10-index	17	16

Citations per year
Year	Citations
2012	5
2013	14
2014	42
2015	42
2016	44
2017	58
2018	48
2019	22
2020	67
2021	120
2022	216
2023	383
2024	279

Public access

3 articles available, 0 articles not available (based on funding mandates)

Co-authors

Jonathan Berant - Associate Professor, Tel-Aviv University, Visiting Faculty Researcher, Google DeepMind - Verified email at cs.tau.ac.il
Mor Geva - Tel Aviv University, Google Research - Verified email at tauex.tau.ac.il
Pritish Kamath - Google Research - Verified email at google.com
Ramprasad Saptharishi - Tata Institute of Fundamental Research - Verified email at tifr.res.in
Daniel Deutch - Tel Aviv University - Verified email at post.tau.ac.il
Yoav Goldberg - Professor, Bar Ilan University. Research Director, AI2-Israel - Verified email at cs.biu.ac.il
Matt Gardner - Scaled Cognition - Verified email at scaledcognition.com
Youming Qiao - University of Technology Sydney - Verified email at uts.edu.au

Ankit Gupta

IBM Research

Verified email at ibm.com

Machine Learning, Natural Language Processing


Title	Cited by	Year
Injecting numerical reasoning skills into language models. M Geva, A Gupta, J Berant. Proceedings of the 58th Annual Meeting of the Association for Computational …, 2020	199	2020
Arithmetic circuits: A chasm at depth 3. A Gupta, P Kamath, N Kayal, R Saptharishi. SIAM Journal on Computing 45 (3), 1064-1079, 2016	182*	2016
Break It Down: A Question Understanding Benchmark. T Wolfson, M Geva, A Gupta, M Gardner, Y Goldberg, D Deutch, J Berant. Transactions of the Association for Computational Linguistics 8, 183-198, 2020	171	2020
Approaching the chasm at depth four. A Gupta, P Kamath, N Kayal, R Saptharishi. Journal of the ACM (JACM) 61 (6), 1-16, 2014	136	2014
On the parameterization and initialization of diagonal state space models. A Gu, A Gupta, K Goel, C Ré. Advances in Neural Information Processing Systems 35, 35971-35983, 2022	124	2022
Diagonal state spaces are as effective as structured state spaces. A Gupta, A Gu, J Berant. Advances in Neural Information Processing Systems 35, 22982-22994, 2022	114	2022
Long range language modeling via gated state spaces. H Mehta, A Gupta, A Cutkosky, B Neyshabur. The Eleventh International Conference on Learning Representations, 2023	96	2023
SCROLLS: Standardized comparison over long language sequences. U Shaham, E Segal, M Ivgi, A Efrat, O Yoran, A Haviv, A Gupta, W Xiong, ... Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022	76	2022
Analyzing transformers in embedding space. G Dar, M Geva, A Gupta, J Berant. Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023	57	2023
GMAT: Global memory augmentation for transformers. A Gupta, J Berant. arXiv preprint arXiv:2006.03274, 2020	45	2020
Reconstruction of depth-4 multilinear circuits with top fan-in 2. A Gupta, N Kayal, S Lokam. Proceedings of the forty-fourth annual ACM symposium on Theory of computing …, 2012	29	2012
Algebraic geometric techniques for depth-4 PIT & Sylvester-Gallai conjectures for varieties. A Gupta. Electronic Colloquium on Computational Complexity (ECCC) 21 (130), 1, 2014	26	2014
Memory-efficient Transformers via Top-k Attention. A Gupta, G Dar, S Goodman, D Ciprut, J Berant. Proceedings of the Second Workshop on Simple and Efficient Natural Language …, 2021	21	2021
Random arithmetic formulas can be reconstructed efficiently. A Gupta, N Kayal, Y Qiao. Computational Complexity 23, 207-303, 2014	21	2014
Efficient reconstruction of random multilinear formulas. A Gupta, N Kayal, S Lokam. 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, 778-787, 2011	18	2011
Diagonal state space augmented transformers for speech recognition. G Saon, A Gupta, X Cui. ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023	15	2023
Simplifying and understanding state space models with diagonal linear RNNs. A Gupta, H Mehta, J Berant. arXiv preprint arXiv:2212.00768, 2022	13	2022
Value-aware Approximate Attention. A Gupta, J Berant. Proceedings of the 2021 Conference on Empirical Methods in Natural Language …, 2021	4	2021
Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors. I Amos, J Berant, A Gupta. arXiv preprint arXiv:2310.02980, 2023	2	2023
Exploring the limits of decoder-only models trained on public speech recognition corpora. A Gupta, G Saon, B Kingsbury. arXiv preprint arXiv:2402.00235, 2024	1	2024
