Jianfeng Wang
Verified email at microsoft.com
Title
Cited by
Year
Florence: A new foundation model for computer vision
L Yuan, D Chen, YL Chen, N Codella, X Dai, J Gao, H Hu, X Huang, B Li, ...
arXiv preprint arXiv:2111.11432, 2021
Cited by: 635; Year: 2021
Multimedia cloud computing
W Zhu, C Luo, J Wang, S Li
IEEE Signal Processing Magazine 28 (3), 59-69, 2011
Cited by: 590; Year: 2011
End-to-end semi-supervised object detection with soft teacher
M Xu, Z Zhang, H Hu, J Wang, L Wang, F Wei, X Bai, Z Liu
Proceedings of the IEEE/CVF international conference on computer vision …, 2021
Cited by: 396; Year: 2021
GIT: A generative image-to-text transformer for vision and language
J Wang, Z Yang, X Hu, L Li, K Lin, Z Gan, Z Liu, C Liu, L Wang
arXiv preprint arXiv:2205.14100, 2022
Cited by: 321; Year: 2022
An empirical study of training end-to-end vision-and-language transformers
ZY Dou, Y Xu, Z Gan, J Wang, S Wang, L Wang, C Zhu, P Zhang, L Yuan, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 276; Year: 2022
An empirical study of GPT-3 for few-shot knowledge-based VQA
Z Yang, Z Gan, J Wang, X Hu, Y Lu, Z Liu, L Wang
Proceedings of the AAAI Conference on Artificial Intelligence 36 (3), 3081-3089, 2022
Cited by: 269; Year: 2022
Scaling up vision-language pre-training for image captioning
X Hu, Z Gan, J Wang, Z Yang, Z Liu, Y Lu, L Wang
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022
Cited by: 204; Year: 2022
The dawn of LMMs: Preliminary explorations with GPT-4V(ision)
Z Yang, L Li, K Lin, J Wang, CC Lin, Z Liu, L Wang
arXiv preprint arXiv:2309.17421, 2023
Cited by: 198; Year: 2023
MM-REACT: Prompting ChatGPT for multimodal reasoning and action
Z Yang, L Li, J Wang, K Lin, E Azarnasab, F Ahmed, Z Liu, C Liu, M Zeng, ...
arXiv preprint arXiv:2303.11381, 2023
Cited by: 186; Year: 2023
SEED: Self-supervised distillation for visual representation
Z Fang, J Wang, L Wang, L Zhang, Y Yang, Z Liu
arXiv preprint arXiv:2101.04731, 2021
Cited by: 161; Year: 2021
Prompting GPT-3 to be reliable
C Si, Z Gan, Z Yang, S Wang, J Wang, J Boyd-Graber, L Wang
arXiv preprint arXiv:2210.09150, 2022
Cited by: 145; Year: 2022
TAP: Text-aware pre-training for Text-VQA and Text-Caption
Z Yang, Y Lu, J Wang, X Yin, D Florencio, L Wang, C Zhang, L Zhang, ...
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021
Cited by: 145; Year: 2021
Generalized decoding for pixel, image, and language
X Zou, ZY Dou, J Yang, Z Gan, L Li, C Li, X Dai, H Behl, J Wang, L Yuan, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 126; Year: 2023
Order preserving hashing for approximate nearest neighbor search
J Wang, J Wang, N Yu, S Li
Proceedings of the 21st ACM international conference on Multimedia, 133-142, 2013
Cited by: 124; Year: 2013
Optimized Cartesian k-means
J Wang, J Wang, J Song, XS Xu, HT Shen, S Li
IEEE Transactions on Knowledge and Data Engineering 27 (1), 180-192, 2014
Cited by: 118; Year: 2014
MM-Vet: Evaluating large multimodal models for integrated capabilities
W Yu, Z Yang, L Li, J Wang, K Lin, Z Liu, X Wang, L Wang
arXiv preprint arXiv:2308.02490, 2023
Cited by: 105; Year: 2023
Facial age estimation with age difference
Z Hu, Y Wen, J Wang, M Wang, R Hong, S Yan
IEEE Transactions on Image Processing 26 (7), 3087-3097, 2016
Cited by: 105; Year: 2016
Anchor box optimization for object detection
Y Zhong, J Wang, J Peng, L Zhang
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2020
Cited by: 102; Year: 2020
Hierarchically structured reinforcement learning for topically coherent visual story generation
Q Huang, Z Gan, A Celikyilmaz, D Wu, J Wang, X He
Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 8465-8472, 2019
Cited by: 98; Year: 2019
Aligning large multi-modal model with robust instruction tuning
F Liu, K Lin, L Li, J Wang, Y Yacoob, L Wang
arXiv preprint arXiv:2306.14565, 2023
Cited by: 89; Year: 2023