Teng Wang
Tencent
Verified email at connect.hku.hk
Title
Cited by
Year
End-to-end dense video captioning with parallel decoding
T Wang, R Zhang, Z Lu, F Zheng, R Cheng, P Luo
ICCV 2021, 6847-6857, 2021
Cited by: 200; Year: 2021
Event-centric hierarchical representation for dense video captioning
T Wang, H Zheng, M Yu, Q Tian, H Hu
IEEE Transactions on Circuits and Systems for Video Technology 31 (5), 1890-1900, 2020
Cited by: 85; Year: 2020
Caption anything: Interactive image description with diverse multimodal controls
T Wang*, J Zhang*, J Fei*, Y Ge, H Zheng, Y Tang, Z Li, M Gao, S Zhao, ...
arXiv preprint arXiv:2305.02677, 2023
Cited by: 78; Year: 2023
Set-level guidance attack: Boosting adversarial transferability of vision-language pre-training models
D Lu, Z Wang, T Wang, W Guan, H Gao, F Zheng
Proceedings of the IEEE/CVF International Conference on Computer Vision, 102-111, 2023
Cited by: 44; Year: 2023
Video understanding with large language models: A survey
Y Tang, J Bi, S Xu, L Song, S Liang, T Wang, D Zhang, J An, J Lin, R Zhu, ...
arXiv preprint arXiv:2312.17432, 2023
Cited by: 43; Year: 2023
VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix
T Wang, W Jiang, Z Lu, F Zheng, R Cheng, C Yin, P Luo
ICML 2022, 2022
Cited by: 40; Year: 2022
π-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
C Wu, T Wang, Y Ge, Z Lu, R Zhou, Y Shan, P Luo
International Conference on Machine Learning, 37713-37727, 2023
Cited by: 30; Year: 2023
Transferable decoding with visual entities for zero-shot image captioning
J Fei, T Wang, J Zhang, Z He, C Wang, F Zheng
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 28; Year: 2023
Knowledge-aware prompt tuning for generalizable vision-language models
B Kan, T Wang, W Lu, X Zhen, W Guan, F Zheng
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 24; Year: 2023
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
T Geng, T Wang, J Duan, R Cong, F Zheng
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Cited by: 21; Year: 2023
Dense-captioning events in videos: SYSU submission to ActivityNet Challenge 2020
T Wang, H Zheng, M Yu
CVPR Workshops, 2020
Cited by: 12; Year: 2020
Accelerating Vision-Language Pretraining with Free Language Modeling
T Wang, Y Ge, F Zheng, R Cheng, Y Shan, X Qie, P Luo
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Cited by: 11; Year: 2023
Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos
T Wang*, J Zhang*, F Zheng, W Jiang, R Cheng, P Luo
arXiv preprint arXiv:2303.06378, 2023
Cited by: 9; Year: 2023
LLMVA-GEBC: Large language model with video adapter for generic event boundary captioning
Y Tang, J Zhang, X Wang, T Wang, F Zheng
arXiv preprint arXiv:2306.10354, 2023
Cited by: 7; Year: 2023
Multi-modal segment assemblage network for ad video editing with importance-coherence reward
Y Tang, S Xu, T Wang, Q Lin, Q Lu, F Zheng
Proceedings of the Asian Conference on Computer Vision, 3519-3535, 2022
Cited by: 7; Year: 2022
UniAV: Unified Audio-Visual Perception for Multi-Task Video Localization
T Geng, T Wang, Y Zhang, J Duan, W Guan, F Zheng
arXiv preprint arXiv:2404.03179, 2024
Cited by: 4; Year: 2024
Semantic-aware pretraining for dense video captioning
T Wang, Z Liu, F Zheng, Z Lu, R Cheng, P Luo
arXiv preprint arXiv:2204.07449, 2022
Cited by: 4; Year: 2022
Show, Tell and Rephrase: Diverse Video Captioning via Two-Stage Progressive Training
Z Liu, T Wang, J Zhang, F Zheng, W Jiang, K Lu
IEEE Transactions on Multimedia 25, 7894-7905, 2022
Cited by: 3; Year: 2022
PTVD: A Large-Scale Plot-Oriented Multimodal Dataset Based on Television Dramas
C Li, X Peng, T Wang, Y Ge, M Liu, X Xu, Y Wang, Y Shan
arXiv preprint arXiv:2306.14644, 2023
Cited by: 2; Year: 2023
Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer
X Li, T Wang, J Zhao, S Mao, J Wang, F Zheng, X Peng, X Li
Proceedings of the 32nd ACM International Conference on Multimedia, 9340-9349, 2024
Year: 2024