Zhaohan Daniel Guo

Cited by

	Total	Since 2019
Citations	8683	8599
h-index	18	18
i10-index	21	21

Citations per year

Year	Citations
2018	46
2019	73
2020	242
2021	1138
2022	2267
2023	2951
2024	1913

Public access

3 articles available
0 articles not available

Based on funding mandates

Co-authors

Emma Brunskill, Associate Professor of Computer Science, Stanford University (verified email at cs.stanford.edu)
Philip Thomas, University of Massachusetts Amherst (verified email at cs.umass.edu)
Shayan Doroudi, Assistant Professor at the University of California, Irvine (verified email at uci.edu)
Yao Liu, Amazon (verified email at stanford.edu)

Zhaohan Daniel Guo

DeepMind

Verified email at google.com

Reinforcement learning


Title	Authors	Venue	Cited by	Year
Bootstrap your own latent-a new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...	Advances in neural information processing systems 33, 21271-21284, 2020	6108	2020
Agent57: Outperforming the atari human benchmark	AP Badia, B Piot, S Kapturowski, P Sprechmann, A Vitvitskyi, ZD Guo, ...	International conference on machine learning, 507-517, 2020	644	2020
koray kavukcuoglu, Remi Munos, and Michal Valko. Bootstrap your own latent-a new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...	Advances in neural information processing systems 33, 21271-21284, 2020	455	2020
Never give up: Learning directed exploration strategies	AP Badia, P Sprechmann, A Vitvitskyi, D Guo, B Piot, S Kapturowski, ...	arXiv preprint arXiv:2002.06038, 2020	329	2020
Joint semantic utterance classification and slot filling with recursive neural networks	D Guo, G Tur, W Yih, G Zweig	2014 IEEE Spoken Language Technology Workshop (SLT), 554-559, 2014	246	2014
A general theoretical paradigm to understand learning from human preferences	MG Azar, ZD Guo, B Piot, R Munos, M Rowland, M Valko, D Calandriello	International Conference on Artificial Intelligence and Statistics, 4447-4455, 2024	159	2024
Bootstrap latent-predictive representations for multitask reinforcement learning	ZD Guo, BA Pires, B Piot, JB Grill, F Altché, R Munos, MG Azar	International Conference on Machine Learning, 3875-3886, 2020	145	2020
Neural predictive belief representations	ZD Guo, MG Azar, B Piot, BA Pires, R Munos	arXiv preprint arXiv:1811.06407, 2018	89	2018
A pac rl algorithm for episodic pomdps	ZD Guo, S Doroudi, E Brunskill	Artificial Intelligence and Statistics, 510-518, 2016	65	2016
Byol-explore: Exploration by bootstrapped prediction	Z Guo, S Thakoor, M Pîslar, B Avila Pires, F Altché, C Tallec, A Saade, ...	Advances in neural information processing systems 35, 31855-31870, 2022	58	2022
Using options and covariance testing for long horizon off-policy policy evaluation	Z Guo, PS Thomas, E Brunskill	Advances in Neural Information Processing Systems 30, 2017	48	2017
Nash learning from human feedback	R Munos, M Valko, D Calandriello, MG Azar, M Rowland, ZD Guo, Y Tang, ...	arXiv preprint arXiv:2312.00886, 2023	46	2023
Bootstrap your own latent: A new approach to self-supervised learning. arXiv	JB Grill, F Strub, F Altché, C Tallec, PH Richemond, E Buchatskaya, ...	arXiv preprint arXiv:2006.07733, 2020	41	2020
Geometric entropic exploration	ZD Guo, MG Azar, A Saade, S Thakoor, B Piot, BA Pires, M Valko, ...	arXiv preprint arXiv:2101.02055, 2021	40	2021
Concurrent pac rl	Z Guo, E Brunskill	Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015	30	2015
Understanding self-predictive learning for reinforcement learning	Y Tang, ZD Guo, PH Richemond, BA Pires, Y Chandak, R Munos, ...	International Conference on Machine Learning, 33632-33656, 2023	27	2023
Generalized preference optimization: A unified approach to offline alignment	Y Tang, ZD Guo, Z Zheng, D Calandriello, R Munos, M Rowland, ...	arXiv preprint arXiv:2402.05749, 2024	23	2024
Pac continuous state online multitask reinforcement learning with identification	Y Liu, Z Guo, E Brunskill	Proceedings of the 2016 International Conference on Autonomous Agents …, 2016	21	2016
Understanding the performance gap between online and offline alignment algorithms	Y Tang, DZ Guo, Z Zheng, D Calandriello, Y Cao, E Tarassov, R Munos, ...	arXiv preprint arXiv:2405.08448, 2024	13	2024
Directed exploration for reinforcement learning	ZD Guo, E Brunskill	arXiv preprint arXiv:1906.07805, 2019	12	2019

Articles 1–20
