Zuxuan Wu
School of Computer Science
Fudan University, Shanghai, China
Email: zxwu AT fudan dot edu dot cn
I am an Associate Professor in the School of Computer Science at Fudan University, and a member of the Fudan Vision and Learning Laboratory. I received my Ph.D. in Computer Science from the University of Maryland in 2020, advised by Prof. Larry Davis. My research interests are in computer vision and deep learning. My current research focuses on large-scale video understanding, computationally efficient frameworks, and robust deep neural networks.
I'm currently looking for students with strong coding skills who are excited to design algorithms for visual understanding. If you are interested in working with me, please send me an email.
Publications [Google Scholar]
- Multi-Prompt Alignment for Multi-Source Unsupervised Domain Adaptation. [pdf]
- Advances in Neural Information Processing Systems (NeurIPS), New Orleans, USA, Dec., 2023.
- Haoran Chen, Xintong Han, Zuxuan Wu, Yu-Gang Jiang
- Learning from Rich Semantics and Coarse Locations for Long-tailed Object Detection. [pdf][code]
- Advances in Neural Information Processing Systems (NeurIPS), New Orleans, USA, Dec., 2023.
- Lingchen Meng, Xiyang Dai, Jianwei Yang, Dongdong Chen, Yinpeng Chen, Mengchen Liu, Yi-Ling Chen, Zuxuan Wu, Lu Yuan, Yu-Gang Jiang
- Implicit Temporal Modeling with Learnable Alignment for Video Recognition. [pdf][code]
- International Conference on Computer Vision (ICCV), Paris, France, Oct., 2023 (Oral)
- Shuyuan Tu, Qi Dai, Zuxuan Wu, Zhi-Qi Cheng, Han Hu, Yu-Gang Jiang
- Open-VCLIP: Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization. [pdf]
- International Conference on Machine Learning (ICML), Hawaii, USA, July, 2023
- Zejia Weng, Xitong Yang, Ang Li, Zuxuan Wu, Yu-Gang Jiang
- ResFormer: Scaling ViTs with Multi-Resolution Training. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, June, 2023
- Rui Tian, Zuxuan Wu, Qi Dai, Han Hu, Yu Qiao, Yu-Gang Jiang
- SVFormer: Semi-Supervised Video Transformer for Action Recognition. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, June, 2023
- Zhen Xing, Qi Dai, Han Hu, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang
- Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, June, 2023
- Lingchen Meng, Xiyang Dai, Yinpeng Chen, Pengchuan Zhang, Dongdong Chen, Mengchen Liu, Jianfeng Wang, Zuxuan Wu, Lu Yuan, Yu-Gang Jiang
- Look Before You Match: Instance Understanding Matters in Video Object Segmentation. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, June, 2023
- Junke Wang, Dongdong Chen, Zuxuan Wu, Chong Luo, Chuanxin Tang, Xiyang Dai, Yucheng Zhao, Yujia Xie, Lu Yuan, Yu-Gang Jiang
- Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, June, 2023
- Rui Wang, Dongdong Chen, Zuxuan Wu, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Lu Yuan, Yu-Gang Jiang
- Prototypical Residual Networks for Anomaly Detection and Localization. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, June, 2023
- Hui Zhang, Zuxuan Wu, Zheng Wang, Zhineng Chen, Yu-Gang Jiang
- Enhancing the Self-Universality for Transferable Targeted Attacks. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, June, 2023
- Zhipeng Wei, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang
- Vision Transformers are Good Mask Auto-Labelers. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, June, 2023
- Shiyi Lan, Xitong Yang, Zhiding Yu, Zuxuan Wu, Jose M. Alvarez, Anima Anandkumar
- Towards Scalable Neural Representation for Diverse Videos. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, June, 2023
- Bo He, Xitong Yang, Hanyu Wang, Zuxuan Wu, Hao Chen, Shuaiyi Huang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava
- Resolving Task Confusion in Dynamic Expansion Architectures for Class Incremental Learning. [pdf]
- The AAAI Conference on Artificial Intelligence (AAAI), Washington DC, USA, Feb., 2023
- Bingchen Huang, Zhineng Chen, Peng Zhou, Jiayin Chen, Zuxuan Wu
- OmniVL: One Foundation Model for Image-Language and Video-Language Tasks. [pdf]
- Advances in Neural Information Processing Systems (NeurIPS), New Orleans, USA, Dec., 2022.
- Junke Wang, Dongdong Chen, Zuxuan Wu, Chong Luo, Luowei Zhou, Yucheng Zhao, Yujia Xie, Ce Liu, Yu-Gang Jiang, Lu Yuan
- Semi-Supervised Vision Transformers. [pdf][code]
- European Conference on Computer Vision (ECCV), Tel Aviv, October, 2022.
- Zejia Weng, Xitong Yang, Ang Li, Zuxuan Wu, Yu-Gang Jiang
- Efficient Video Transformers with Spatial-Temporal Token Selection. [pdf][code]
- European Conference on Computer Vision (ECCV), Tel Aviv, October, 2022.
- Junke Wang, Xitong Yang, Hengduo Li, Li Liu, Zuxuan Wu, Yu-Gang Jiang
- Semi-Supervised Single-View 3D Reconstruction via Prototype Shape Priors. [pdf][code]
- European Conference on Computer Vision (ECCV), Tel Aviv, October, 2022.
- Zhen Xing, Hengduo Li, Zuxuan Wu, Yu-Gang Jiang
- BEVT: BERT Pretraining of Video Transformers. [pdf][code]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, USA, June, 2022
- Rui Wang, Dongdong Chen, Zuxuan Wu, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Yu-Gang Jiang, Luowei Zhou, Lu Yuan
- Cross-Modal Transferable Adversarial Attacks from Images to Videos. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, USA, June, 2022
- Zhipeng Wei, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang
- AdaViT: Adaptive Vision Transformers for Efficient Image Recognition. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, USA, June, 2022
- Lingchen Meng, Hengduo Li, Bor-Chun Chen, Shiyi Lan, Zuxuan Wu, Yu-Gang Jiang, Ser-Nam Lim
- ObjectFormer for Image Manipulation Detection and Localization. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, USA, June, 2022
- Junke Wang, Zuxuan Wu, Jingjing Chen, Xintong Han, Abhinav Shrivastava, Ser-Nam Lim, Yu-Gang Jiang
- FLAG: Adversarial Data Augmentation for Graph Neural Networks. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, USA, June, 2022
- Kezhi Kong, Guohao Li, Mucong Ding, Zuxuan Wu, Chen Zhu, Bernard Ghanem, Gavin Taylor, Tom Goldstein
- Boosting the Transferability of Video Adversarial Examples via Temporal Translation. [pdf]
- The AAAI Conference on Artificial Intelligence (AAAI), Virtual, Feb., 2022
- Zhipeng Wei, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang
- Attacking Video Recognition Models with Bullet-Screen Comments. [pdf]
- The AAAI Conference on Artificial Intelligence (AAAI), Virtual, Feb., 2022
- Kai Chen, Zhipeng Wei, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang
- Towards Transferable Adversarial Attacks on Vision Transformers. [pdf]
- The AAAI Conference on Artificial Intelligence (AAAI), Virtual, Feb., 2022
- Zhipeng Wei, Jingjing Chen, Micah Goldblum, Zuxuan Wu, Tom Goldstein, Yu-Gang Jiang
- Rethinking Pseudo Labels for Semi-Supervised Object Detection. [pdf]
- The AAAI Conference on Artificial Intelligence (AAAI), Virtual, Feb., 2022
- Hengduo Li, Zuxuan Wu, Abhinav Shrivastava, Larry Davis
- Encoding Robustness to Image Style via Adversarial Feature Perturbations. [pdf]
- Advances in Neural Information Processing Systems (NeurIPS), Virtual, Dec., 2021.
- Manli Shu, Zuxuan Wu, Micah Goldblum, Tom Goldstein
- Deep Video Inpainting Detection. [pdf]
- British Machine Vision Conference (BMVC), Virtual, Oct., 2021
- Peng Zhou, Ning Yu, Zuxuan Wu, Larry Davis, Abhinav Shrivastava, Ser-Nam Lim
- GTA: Global Temporal Attention for Video Action Understanding. [pdf]
- British Machine Vision Conference (BMVC), Virtual, Oct., 2021
- Bo He, Xitong Yang, Zuxuan Wu, Hao Chen, Ser-Nam Lim, Abhinav Shrivastava
- VideoLT: Large-scale Long-tailed Video Recognition. [pdf]
- International Conference on Computer Vision (ICCV), Virtual, Oct., 2021
- Xing Zhang, Zuxuan Wu, Zejia Weng, Huazhu Fu, Jingjing Chen, Yu-Gang Jiang, Larry Davis
- Exploring Visual Engagement Signals for Representation Learning. [pdf]
- International Conference on Computer Vision (ICCV), Virtual, Oct., 2021
- Menglin Jia, Zuxuan Wu, Austin Reiter, Claire Cardie, Serge Belongie, Ser-Nam Lim
- 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, June, 2021
- Hengduo Li, Zuxuan Wu, Abhinav Shrivastava, Larry S. Davis
- Intentonomy: a Dataset and Study towards Human Intent Understanding. [pdf][code]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, June, 2021 (Oral)
- Menglin Jia, Zuxuan Wu, Austin Reiter, Claire Cardie, Serge Belongie, Ser-Nam Lim
- Efficient Object Embedding for Manipulated Image Retrieval. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, June, 2021
- Bor-Chun Chen, Zuxuan Wu, Larry S. Davis, Ser-Nam Lim
- Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors. [pdf][code]
- European Conference on Computer Vision (ECCV), Virtual, August, 2020.
- Zuxuan Wu, Ser-Nam Lim, Larry S. Davis, Tom Goldstein
- Learning from Noisy Anchors for One-stage Object Detection. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, June, 2020
- Hengduo Li, Zuxuan Wu, Chen Zhu, Caiming Xiong, Richard Socher, Larry S. Davis
- LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition. [pdf][code]
- Advances in Neural Information Processing Systems (NeurIPS), Vancouver, Canada, Dec., 2019.
- Zuxuan Wu, Caiming Xiong, Yu-Gang Jiang, Larry S. Davis
- FiNet: Compatible and Diverse Fashion Image Inpainting. [pdf]
- International Conference on Computer Vision (ICCV), Seoul, Korea, Oct., 2019. (Oral)
- Xintong Han, Zuxuan Wu, Weilin Huang, Matthew R. Scott, Larry S. Davis
- ACE: Adapting to Changing Environments for Semantic Segmentation. [pdf]
- International Conference on Computer Vision (ICCV), Seoul, Korea, Oct., 2019.
- Zuxuan Wu, Xin Wang, Joseph E. Gonzalez, Tom Goldstein, Larry S. Davis
- AdaFrame: Adaptive Frame Selection for Fast Video Recognition. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, June, 2019
- Zuxuan Wu, Caiming Xiong, Chih-Yao Ma, Richard Socher, Larry S. Davis
- The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, June, 2019.
- Chih-Yao Ma, Zuxuan Wu, Ghassan AlRegib, Caiming Xiong, Zsolt Kira
- Visual Content Recognition by Exploiting Semantic Feature Map with Attention and Multi-task Learning. [pdf]
- ACM Trans. Multimedia Comput. Commun. Appl. (ACM TOMM), vol. 15, issue 1, pp. 6:1-6:22, 2019.
- Rui-Wei Zhao, Qi Zhang, Zuxuan Wu, Jianguo Li, Yu-Gang Jiang
- Self-Monitoring Navigation Agent via Auxiliary Progress Estimation. [pdf]
- International Conference on Learning Representations (ICLR), New Orleans, USA, May, 2019.
- Chih-Yao Ma, Jiasen Lu, Zuxuan Wu, Ghassan AlRegib, Zsolt Kira, Richard Socher, Caiming Xiong
- DCAN: Dual Channel-wise Alignment Networks for Unsupervised Scene Adaptation. [pdf][code]
- European Conference on Computer Vision (ECCV), Munich, Germany, September, 2018.
- Zuxuan Wu, Xintong Han, Yen-Liang Lin, Mustafa Gökhan Uzunbas, Tom Goldstein, Ser-Nam Lim, Larry S. Davis
- BlockDrop: Dynamic Inference Paths in Residual Networks. [pdf][code]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, USA, June, 2018. (Spotlight)
- Zuxuan Wu*, Tushar Nagarajan*, Abhishek Kumar, Steven Rennie, Larry S. Davis, Kristen Grauman, Rogerio Feris (* denotes equal contribution)
- VITON: An Image-based Virtual Try-on Network. [pdf][code]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, USA, June, 2018. (Spotlight)
- Xintong Han, Zuxuan Wu, Zhe Wu, Ruichi Yu, Larry S. Davis
- Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks. [pdf]
- IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 40, Issue 2, pp. 352-364, 2018.
- Yu-Gang Jiang, Zuxuan Wu, Jun Wang, Xiangyang Xue, Shih-Fu Chang
- Fudan-Columbia Video Dataset (FCVID), one of the largest public Web video datasets with manual annotations.
- Deep Learning for Video Classification and Video Captioning. [pdf]
- In Frontiers of Multimedia Research, Shih-Fu Chang (Ed.), ACM Morgan & Claypool, New York, NY, USA, pp. 3-29, 2018
- Zuxuan Wu, Ting Yao, Yanwei Fu, Yu-Gang Jiang
- A survey of 100+ recent papers on video classification and captioning with deep learning.
- Weakly-Supervised Spatial Context Networks. [pdf]
- arXiv preprint arXiv:1704.02998
- Zuxuan Wu, Larry S. Davis, Leonid Sigal
- Automatic Spatially-aware Fashion Concept Discovery. [pdf]
- International Conference on Computer Vision (ICCV), Venice, Italy, Oct., 2017.
- Xintong Han, Zuxuan Wu, Phoenix Huang, Xiao Zhang, Menglong Zhu, Yuan Li, Yang Zhao, Larry S. Davis
- Learning Fashion Compatibility with Bidirectional LSTMs. [pdf]
- ACM Multimedia (ACM MM), Mountain View, USA, Oct., 2017.
- Xintong Han, Zuxuan Wu, Yu-Gang Jiang, Larry S. Davis
- Learning Semantic Feature Map for Visual Content Recognition. [pdf]
- ACM Multimedia (ACM MM), Mountain View, USA, Oct., 2017.
- Rui-Wei Zhao, Zuxuan Wu, Jianguo Li, Yu-Gang Jiang
- Harnessing Object and Scene Semantics for Large-Scale Video Understanding. [pdf]
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, June, 2016. (Spotlight)
- Zuxuan Wu, Yanwei Fu, Yu-Gang Jiang, Leonid Sigal
- Featured in Tech2, ACM TechNews.
- Multi-Stream Multi-Class Fusion of Deep Networks for Video Classification. [pdf]
- ACM Multimedia (ACM MM), Amsterdam, the Netherlands, Oct., 2016. (Oral Paper)
- Zuxuan Wu, Yu-Gang Jiang, Xi Wang, Hao Ye, Xiangyang Xue
- Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification. [pdf]
- ACM Multimedia (ACM MM), Brisbane, Australia, Oct., 2015. (Oral Paper)
- Zuxuan Wu, Xi Wang, Yu-Gang Jiang, Hao Ye, Xiangyang Xue
- Obtains 91.3% accuracy on the UCF-101 dataset.
- Evaluating Two-Stream CNN for Video Classification. [pdf][motion CNN model]
- ACM International Conference on Multimedia Retrieval (ICMR), Shanghai, China, June, 2015
- Hao Ye, Zuxuan Wu, Rui-Wei Zhao, Xi Wang, Yu-Gang Jiang, Xiangyang Xue
- Exploring Inter-feature and Inter-class Relationships with Deep Neural Networks for Video Classification. [pdf]
- ACM Multimedia (ACM MM), Orlando, USA, Nov., 2014. (Oral Paper)
- Zuxuan Wu, Yu-Gang Jiang, Jun Wang, Jian Pu, Xiangyang Xue
Academic Service
Journal Reviewer | TPAMI, JMLR, TIP, TMM, TCSVT, TNNLS
Area Chair | NeurIPS'23, CVPR'23-24
Senior Program Committee | AAAI'23-24, IJCAI'23
Program Committee | CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML