Jiasen Lu
Senior Research Scientist, Allen Institute of Artificial Intelligence
Verified email at allenai.org
Title
Cited by
Year
VQA: Visual question answering
A Agrawal*, J Lu*, S Antol*, M Mitchell, CL Zitnick, D Parikh, D Batra
International Journal of Computer Vision 123 (1), 4-31, 2017
Cited by 5687*, 2017
VQA: Visual question answering
S Antol, A Agrawal, J Lu, M Mitchell, D Batra, C Lawrence Zitnick, ...
Proceedings of the IEEE International Conference on Computer Vision, 2425-2433, 2015
Cited by 5680, 2015
ViLBERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks
J Lu, D Batra, D Parikh, S Lee
Advances in neural information processing systems, 2019
Cited by 3305, 2019
Hierarchical question-image co-attention for visual question answering
J Lu, J Yang, D Batra, D Parikh
Advances in neural information processing systems 29, 2016
Cited by 1899, 2016
Knowing when to look: Adaptive attention via a visual sentinel for image captioning
J Lu*, C Xiong*, D Parikh, R Socher
Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2017
Cited by 1719, 2017
Graph R-CNN for Scene Graph Generation
J Yang*, J Lu*, S Lee, D Batra, D Parikh
arXiv preprint arXiv:1808.00191, 2018
Cited by 908, 2018
Neural Baby Talk
J Lu*, J Yang*, D Batra, D Parikh
In Proceedings of the IEEE conference on computer vision and pattern …, 2018
Cited by 531, 2018
12-in-1: Multi-Task Vision and Language Representation Learning
J Lu*, V Goswami*, M Rohrbach, D Parikh, S Lee
Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2019
Cited by 500, 2019
ParlAI: A dialog research software platform
AH Miller, W Feng, A Fisch, J Lu, D Batra, A Bordes, D Parikh, J Weston
arXiv preprint arXiv:1705.06476, 2017
Cited by 424, 2017
Self-monitoring navigation agent via auxiliary progress estimation
CY Ma, J Lu, Z Wu, G AlRegib, Z Kira, R Socher, C Xiong
arXiv preprint arXiv:1901.03035, 2019
Cited by 269, 2019
Unified-IO: A unified model for vision, language, and multi-modal tasks
J Lu, C Clark, R Zellers, R Mottaghi, A Kembhavi
arXiv preprint arXiv:2206.08916, 2022
Cited by 259, 2022
MERLOT Reserve: Neural script knowledge through vision and language and sound
R Zellers, J Lu, X Lu, Y Yu, Y Zhao, M Salehi, A Kusupati, J Hessel, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by 193, 2022
Best of both worlds: Transferring knowledge from discriminative learning to a generative visual dialog model
J Lu, A Kannan, J Yang, D Parikh, D Batra
Advances in Neural Information Processing Systems 30, 2017
Cited by 142, 2017
Sentinel gate for modulating auxiliary information in a long short-term memory (LSTM) neural network
LU Jiasen, C Xiong, R Socher
US Patent 10,565,306, 2020
Cited by 136, 2020
A Faster PyTorch Implementation of Faster R-CNN
J Yang*, J Lu*, D Batra, D Parikh
https://github.com/jwyang/faster-rcnn.pytorch, 2018
Cited by 107, 2018
X-LXMERT: Paint, caption and answer questions with multi-modal transformers
J Cho, J Lu, D Schwenk, H Hajishirzi, A Kembhavi
arXiv preprint arXiv:2009.11278, 2020
Cited by 101, 2020
Multi-modal answer validation for knowledge-based VQA
J Wu, J Lu, A Sabharwal, R Mottaghi
Proceedings of the AAAI conference on artificial intelligence 36 (3), 2712-2721, 2022
Cited by 99, 2022
Spatially aware multimodal transformers for TextVQA
Y Kant, D Batra, P Anderson, A Schwing, D Parikh, J Lu, H Agrawal
Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020
Cited by 88, 2020
Deeper LSTM and normalized CNN visual question answering model
J Lu, X Lin, D Batra, D Parikh
GitHub repository 6, 2015
Cited by 80, 2015
Human action segmentation with hierarchical supervoxel consistency
J Lu, R Xu, JJ Corso
Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2015
Cited by 71, 2015