Samy Jelassi
Verified email at fas.harvard.edu - Homepage
Title
Cited by
Year
Vision transformers provably learn spatial structure
S Jelassi, M Sander, Y Li
Advances in Neural Information Processing Systems 35, 37822-37836, 2022
Cited by 104, 2022
A momentumized, adaptive, dual averaged gradient method
A Defazio, S Jelassi
Journal of Machine Learning Research 23 (144), 1-34, 2022
Cited by 96*, 2022
Repeat after me: Transformers are better than state space models at copying
S Jelassi, D Brandfonbrener, SM Kakade, E Malach
arXiv preprint arXiv:2402.01032, 2024
Cited by 77, 2024
Global convergence of neuron birth-death dynamics
G Rotskoff, S Jelassi, J Bruna, E Vanden-Eijnden
arXiv preprint arXiv:1902.01843, 2019
Cited by 67*, 2019
A permutation-equivariant neural network architecture for auction design
J Rahme, S Jelassi, J Bruna, SM Weinberg
Proceedings of the AAAI Conference on Artificial Intelligence 35 (6), 5664-5672, 2021
Cited by 66, 2021
A mean-field analysis of two-player zero-sum games
C Domingo-Enrich, S Jelassi, A Mensch, G Rotskoff, J Bruna
Advances in Neural Information Processing Systems 33, 20215-20226, 2020
Cited by 61, 2020
Auction learning as a two-player game
J Rahme, S Jelassi, SM Weinberg
arXiv preprint arXiv:2006.05684, 2020
Cited by 55, 2020
Towards understanding how momentum improves generalization in deep learning
S Jelassi, Y Li
International Conference on Machine Learning, 9965-10040, 2022
Cited by 51, 2022
Length generalization in arithmetic transformers
S Jelassi, S d'Ascoli, C Domingo-Enrich, Y Wu, Y Li, F Charton
arXiv preprint arXiv:2306.15400, 2023
Cited by 37, 2023
Smoothed analysis of the low-rank approach for smooth semidefinite programs
T Pumir, S Jelassi, N Boumal
Advances in Neural Information Processing Systems 31, 2018
Cited by 29, 2018
Towards closing the gap between the theory and practice of SVRG
O Sebbouh, N Gazagnadou, S Jelassi, F Bach, R Gower
Advances in Neural Information Processing Systems 32, 2019
Cited by 23, 2019
Dissecting adaptive methods in GANs
S Jelassi, D Dobre, A Mensch, Y Li, G Gidel
arXiv preprint arXiv:2210.04319, 2022
Cited by 19*, 2022
Depth separation beyond radial functions
L Venturi, S Jelassi, T Ozuch, J Bruna
Journal of Machine Learning Research 23 (122), 1-56, 2022
Cited by 18, 2022
Extra-gradient with player sampling for faster convergence in n-player games
S Jelassi, C Domingo-Enrich, D Scieur, A Mensch, J Bruna
International Conference on Machine Learning, 4736-4745, 2020
Cited by 16*, 2020
Universal length generalization with Turing programs
K Hou, D Brandfonbrener, S Kakade, S Jelassi, E Malach
arXiv preprint arXiv:2407.03310, 2024
Cited by 5, 2024
Depth Dependence of μP Learning Rates in ReLU MLPs
S Jelassi, B Hanin, Z Ji, SJ Reddi, S Bhojanapalli, S Kumar
arXiv preprint arXiv:2305.07810, 2023
Cited by 5, 2023
LoRA soups: Merging LoRAs for practical skill composition tasks
A Prabhakar, Y Li, K Narasimhan, S Kakade, E Malach, S Jelassi
arXiv preprint arXiv:2410.13025, 2024
Cited by 4, 2024
Mixture of parrots: Experts improve memorization more than reasoning
S Jelassi, C Mohri, D Brandfonbrener, A Gu, N Vyas, N Anand, ...
arXiv preprint arXiv:2410.19034, 2024
Cited by 3, 2024
Q-probe: A lightweight approach to reward maximization for language models
K Li, S Jelassi, H Zhang, S Kakade, M Wattenberg, D Brandfonbrener
arXiv preprint arXiv:2402.14688, 2024
Cited by 3, 2024
The Role of Sparsity for Length Generalization in Transformers
N Golowich, S Jelassi, D Brandfonbrener, SM Kakade, E Malach
arXiv preprint arXiv:2502.16792, 2025
2025