Zongzhang Zhang
Title
Cited by
Year
A survey on deep reinforcement learning
Q Liu, JW Zhai, ZZ Zhang, S Zhong, Q Zhou, P Zhang, J Xu
Chinese Journal of Computers 41 (1), 1-27, 2018
Cited by: 204; Year: 2018
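深度强化学习综述 (A survey on deep reinforcement learning)
刘全, 翟建伟, 章宗长, 钟珊, 周倩, 章鹏, 徐进 (Q Liu, J Zhai, Z Zhang, S Zhong, Q Zhou, P Zhang, J Xu)
计算机学报 (Chinese Journal of Computers) 41 (1), 1-27, 2018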
Cited by: 120; Year: 2018
Weighted double Q-learning
Z Zhang, Z Pan, MJ Kochenderfer
IJCAI-2017, 3455-3461, 2017
Cited by: 116; Year: 2017
A deep Bayesian policy reuse approach against non-stationary agents
Y Zheng, Z Meng, J Hao, Z Zhang, T Yang, C Fan
NeurIPS-2018, 954-964, 2018
Cited by: 93; Year: 2018
Hierarchical deep multiagent reinforcement learning with temporal abstraction
H Tang, J Hao, T Lv, Y Chen, Z Zhang, H Jia, C Ren, Y Zheng, Z Meng, ...
arXiv preprint arXiv:1809.09332, 2018
Cited by: 82; Year: 2018
Multi-Agent Incentive Communication via Decentralized Teammate Modeling
L Yuan, J Wang, F Zhang, C Wang, Z Zhang, Y Yu, C Zhang
AAAI-2022, 9466-9474, 2022
Cited by: 61; Year: 2022
Weighted double deep multiagent reinforcement learning in stochastic cooperative environments
Y Zheng, Z Meng, J Hao, Z Zhang
PRICAI-2018, 421-429, 2018
Cited by: 49; Year: 2018
A survey on deep reinforcement learning
L Quan, Z Jianwei, Z Zongchang, Z Shan, Z Qian
Chinese Journal of Computers 41 (01), 1-27, 2018
Cited by: 49; Year: 2018
Efficient deep reinforcement learning via adaptive policy transfer
T Yang, J Hao, Z Meng, Z Zhang, Y Hu, Y Chen, C Fan, W Wang, W Liu, ...
IJCAI-2020, 3094-3100, 2020
Cited by: 38; Year: 2020
Triple-GAIL: A multi-modal imitation learning framework with generative adversarial Nets
C Fei, B Wang, Y Zhuang, Z Zhang, J Hao, H Zhang, X Ji, W Liu
IJCAI-2020, 2929-2935, 2020
Cited by: 37; Year: 2020
Deep Q-learning with prioritized sampling
J Zhai, Q Liu, Z Zhang, S Zhong, H Zhu, P Zhang, C Sun
ICONIP-2016, 13-22, 2016
Cited by: 34; Year: 2016
Adapt to Environment Sudden Changes by Learning a Context Sensitive Policy
FM Luo, S Jiang, Y Yu, Z Zhang, YF Zhang
AAAI-2022, 7637-7646, 2022
Cited by: 32; Year: 2022
Multi-agent Dynamic Algorithm Configuration
K Xue, J Xu, L Yuan, M Li, C Qian, Z Zhang, Y Yu
NeurIPS-2022, 20147-20161, 2022
Cited by: 30; Year: 2022
Thompson sampling based Monte-Carlo planning in POMDPs
A Bai, F Wu, Z Zhang, X Chen
ICAPS-2014, 28-36, 2014
Cited by: 28; Year: 2014
Language Model Self-improvement by Reinforcement Learning Contemplation
JC Pang, P Wang, K Li, XH Chen, J Xu, Z Zhang, Y Yu
ICLR-2024, 2024
Cited by: 26*; Year: 2024
Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data
F Zhang, C Jia, YC Li, L Yuan, Y Yu, Z Zhang
ICLR-2023, 2023
Cited by: 26; Year: 2023
Covering number as a complexity measure for POMDP planning and learning
Z Zhang, M Littman, X Chen
AAAI-2012, 1853-1859, 2012
Cited by: 25; Year: 2012
Efficient Multi-agent Communication via Self-supervised Information Aggregation
C Guan, F Chen, L Yuan, C Wang, H Yin, Z Zhang, Y Yu
NeurIPS-2022, 1020-1033, 2022
Cited by: 24; Year: 2022
Covering number for efficient heuristic-based POMDP planning
Z Zhang, D Hsu, WS Lee
ICML-2014, 28-36, 2014
Cited by: 24; Year: 2014
Policy Regularization with Dataset Constraint for Offline Reinforcement Learning
Y Ran, YC Li, F Zhang, Z Zhang, Y Yu
ICML-2023, 28701-28717, 2023
Cited by: 20; Year: 2023