Xitong YANG

Cited by

	All	Since 2019
Citations	1388	1323
h-index	18	18
i10-index	22	21

Citations per year: 2016: 4, 2017: 16, 2018: 38, 2019: 103, 2020: 166, 2021: 232, 2022: 294, 2023: 377, 2024: 151

Public access

7 articles available
0 articles not available

Based on funding mandates

Co-authors

Larry Davis, Professor of Computer Science, University of Maryland (verified email at cs.umd.edu)
Jiebo Luo, Albert Arendt Hopeman Professor of Engineering, University of Rochester (verified email at cs.rochester.edu)
Xiaodong Yang, NVIDIA Research (verified email at nvidia.com)
Zheng Xu, Google Research (verified email at google.com)
Sriganesh Madhvanath, Director, Applied Research, eBay Inc (verified email at acm.org)
Edgar A. Bernal, Chief Data Scientist (verified email at flxai.com)
Ming-Yu Liu, Vice President of Research at NVIDIA (verified email at nvidia.com)
Raja Bala, PARC (verified email at parc.com)
Ahmed Taha, WhiteRabbit.AI, previously at University of Maryland (verified email at cs.umd.edu)
Yuncheng Li, Google (verified email at google.com)
Yi-Ting Chen, Assistant Professor, National Yang Ming Chiao Tung University (verified email at cs.nctu.edu.tw)
Ji Liu, Meta (verified email at meta.com)

Xitong YANG

Research Scientist at FAIR, Meta

Verified email at fb.com

Computer Vision, Video Understanding, Deep Learning


Title	Cited by	Year
Towards Perceptual Image Dehazing by Physics-based Disentanglement and Adversarial Training. X Yang, Z Xu, J Luo. AAAI Conference on Artificial Intelligence (AAAI), 2018	239	2018
Cross-X learning for fine-grained visual categorization. W Luo, X Yang, X Mo, Y Lu, LS Davis, J Li, J Yang, SN Lim. Proceedings of the IEEE/CVF international conference on computer vision …, 2019	226	2019
STEP: Spatio-Temporal Progressive Learning for Video Action Detection. X Yang, X Yang, MY Liu, F Xiao, L Davis, J Kautz. Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2019	167	2019
Deep multimodal representation learning from temporal data. X Yang, P Ramesh, R Chitta, S Madhvanath, EA Bernal, J Luo. Proceedings of the IEEE conference on computer vision and pattern …, 2017	123	2017
ASM-Loc: Action-aware segment modeling for weakly-supervised temporal action localization. B He, X Yang, L Kang, Z Cheng, X Zhou, A Shrivastava. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022	78	2022
Tracking Illicit Drug Dealing and Abuse on Instagram Using Multimodal Analysis. X Yang, J Luo. ACM Transactions on Intelligent Systems and Technology (TIST) 8 (4), 2017	72	2017
Deep temporal multimodal fusion for medical procedure monitoring using wearable sensors. EA Bernal, X Yang, Q Li, J Kumar, S Madhvanath, P Ramesh, R Bala. IEEE Transactions on Multimedia 20 (1), 107-118, 2017	71	2017
Efficient video transformers with spatial-temporal token selection. J Wang, X Yang, H Li, L Liu, Z Wu, YG Jiang. European Conference on Computer Vision, 69-86, 2022	44	2022
Understanding the variational lower bound. X Yang. variational lower bound, ELBO, hard attention 22, 1-4, 2017	36	2017
Temporal fusion of multimodal data from multiple data acquisition systems to automatically recognize and classify an action. X Yang, EA Bernal, S Madhvanath, R Bala, PS Ramesh, Q Li, J Kumar. US Patent 9,805,255, 2017	35	2017
Semi-supervised vision transformers. Z Weng, X Yang, A Li, Z Wu, YG Jiang. European conference on computer vision, 605-620, 2022	34	2022
Pinterest board recommendation for Twitter users. X Yang, Y Li, J Luo. Proceedings of the 23rd ACM international conference on Multimedia, 963-966, 2015	34	2015
GTA: Global temporal attention for video action understanding. B He, X Yang, Z Wu, H Chen, SN Lim, A Shrivastava. arXiv preprint arXiv:2012.08510, 2020	29	2020
Strong Baseline for Single Image Dehazing with Deep Features and Instance Normalization. Z Xu, X Yang, X Li, X Sun, P Harbin. BMVC 2 (3), 5, 2018	24	2018
Iterative spatio-temporal action detection in video. X Yang, X Yang, X Fanyi, MY Liu, J Kautz. US Patent 11,017,556, 2021	19	2021
The effectiveness of instance normalization: a strong baseline for single image dehazing. Z Xu, X Yang, X Li, X Sun. arXiv preprint arXiv:1805.03305, 2018	19	2018
Ego-Exo4D: Understanding skilled human activity from first- and third-person perspectives. K Grauman, A Westbury, L Torresani, K Kitani, J Malik, T Afouras, ... arXiv preprint arXiv:2311.18259, 2023	18	2023
Beyond short clips: End-to-end video-level learning with collaborative memories. X Yang, H Fan, L Torresani, LS Davis, H Wang. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021	18	2021
Open-VCLIP: Transforming CLIP to an open-vocabulary video model via interpolated weight optimization. Z Weng, X Yang, A Li, Z Wu, YG Jiang. International Conference on Machine Learning, 36978-36989, 2023	16	2023
Towards scalable neural representation for diverse videos. B He, X Yang, H Wang, Z Wu, H Chen, S Huang, Y Ren, SN Lim, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023	13	2023
