2024

📜 Baochen Xiong, Xiaoshan Yang, Yaguang Song, Yaowei Wang, Changsheng Xu: Modality-Collaborative Test-Time Adaptation for Action Recognition, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

📜 Yifan Xu, Xiaoshan Yang, Yaguang Song, Changsheng Xu: Libra: Building Decoupled Vision System on Large Language Models, International Conference on Machine Learning (ICML), 2024

📑 Xiaoshan Yang, Baochen Xiong, Yi Huang, Changsheng Xu: Cross-Modal Federated Human Activity Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

📜 Yifan Xu, Mengdan Zhang, Chaoyou Fu, Peixian Chen, Xiaoshan Yang, Ke Li, Changsheng Xu: Multi-modal queried object detection in the wild, Advances in Neural Information Processing Systems (NeurIPS), 2024

📑 Fan Qi, Huaiwen Zhang, Xiaoshan Yang, Changsheng Xu: A Versatile Multimodal Learning Framework For Zero-shot Emotion Recognition, IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2024

2023

📜 Baochen Xiong, Xiaoshan Yang, Yaguang Song, Yaowei Wang, Changsheng Xu: Client-Adaptive Cross-Model Reconstruction Network for Modality-Incomplete Multimodal Federated Learning, ACM Multimedia (MM), 2023

📜 Yiming Li, Xiaoshan Yang, Changsheng Xu: Iterative Learning with Extra and Inner Knowledge for Long-tail Dynamic Scene Graph Generation, ACM Multimedia (MM), 2023

📜 Qinghao Ye, Haiyang Xu, Ming Yan, Chenlin Zhao, Junyang Wang, Xiaoshan Yang, Ji Zhang, Fei Huang, Jitao Sang, Changsheng Xu: mPLUG-Octopus: The Versatile Assistant Empowered by A Modularized End-to-End Multimodal LLM, ACM Multimedia (MM), 2023

📑 Linhui Xiao, Xiaoshan Yang, Fang Peng, Ming Yan, Yaowei Wang, Changsheng Xu: CLIP-VG: Self-paced Curriculum Adapting of CLIP for Visual Grounding, IEEE Transactions on Multimedia (TMM), 2023

📑 Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu: SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification, IEEE Transactions on Multimedia (TMM), 2023

📑 Hao Liu, Xiaoshan Yang, Changsheng Xu: Counterfactual scenario-relevant knowledge-enriched multi-modal emotion reasoning, ACM Transactions on Multimedia Computing, Communications and Applications 19 … 1 2023

📑 Yaguang Song, Xiaoshan Yang, Yaowei Wang, Changsheng Xu: Recovering generalization via pre-training-like knowledge distillation for out-of-distribution visual question answering, IEEE Transactions on Multimedia (TMM), 2023

📑 Yaguang Song, Xiaoshan Yang, Changsheng Xu: Self-supervised calorie-aware heterogeneous graph networks for food recommendation, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2023

📜 Yuyang Wanyan, Xiaoshan Yang, Chaofan Chen, Changsheng Xu: Active Exploration of Multimodal Complementarity for Few-Shot Action Recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023

📑 Chaofan Chen, Xiaoshan Yang, Jinpeng Zhang, Bo Dong, Changsheng Xu: Category Knowledge-guided Parameter Calibration for Few-shot Object Detection. IEEE Transactions on Image Processing (TIP), 2023

2022

📑 Xuan Ma, Xiaoshan Yang, Changsheng Xu: Multi-Source Knowledge Reasoning Graph Network for Multi-modal Commonsense Inference. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2022

📑 Yuyang Wanyan, Xiaoshan Yang, Xuan Ma, Changsheng Xu: Dual Scene Graph Convolutional Network for Motivation Prediction. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2022

📜 Yi Huang, Xiaoshan Yang, Ji Zhang, Changsheng Xu: Relative Alignment Network for Source-Free Multimodal Video Domain Adaptation. ACM Multimedia (MM), 2022

📜 Chaofan Chen, Xiaoshan Yang, Ming Yan, Changsheng Xu: Attribute-guided Dynamic Routing Graph Network for Transductive Few-shot Learning. ACM Multimedia (MM), 2022

📜 Fan Qi, Zixin Zhang, Xiaoshan Yang, Huaiwen Zhang, Changsheng Xu: Feeling Without Sharing: A Federated Video Emotion Recognition Framework Via Privacy-Agnostic Hybrid Aggregation. ACM Multimedia (MM), 2022

📜 Yiming Li, Xiaoshan Yang, Changsheng Xu: Dynamic Scene Graph Generation via Anticipatory Pre-training. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022

📜 Jiabo Ye, Junfeng Tian, Ming Yan, Xiaoshan Yang, Xuwu Wang, Ji Zhang, Liang He, Xin Lin: Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022

📜 Xiaoshan Yang, Baochen Xiong, Yi Huang, Changsheng Xu: Cross-Modal Federated Human Activity Recognition via Modality-Agnostic and Modality-Specific Representation Learning. AAAI Conference on Artificial Intelligence (AAAI), 2022, Oral

📑 Xinhong Ma, Xiaoshan Yang, Junyu Gao, Changsheng Xu: The Model May Fit You: User-Generalized Cross-Modal Retrieval. IEEE Transactions on Multimedia (TMM), 2022

📑 Yiming Li, Xiaoshan Yang, Xuhui Huang, Zhe Ma, Changsheng Xu: Zero-shot Predicate Prediction for Scene Graph Parsing. IEEE Transactions on Multimedia (TMM), 2022

📑 Yaguang Song, Xiaoshan Yang, Changsheng Xu: Self-supervised Calorie-aware Heterogeneous Graph Networks for Food Recommendation, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2022

2021

📜 Yi Huang, Xiaoshan Yang, Changsheng Xu: Multimodal Global Relation Knowledge Distillation for Egocentric Action Anticipation, ACM Multimedia (MM), 2021, Oral

📜 Fan Qi, Xiaoshan Yang, Changsheng Xu: Zero-shot Video Emotion Recognition via Multimodal Protagonist-aware Transformer Network, ACM Multimedia (MM), 2021, Oral

📑 Yaguang Song, Junyu Gao, Xiaoshan Yang, Changsheng Xu: Learning Hierarchical Video Graph Networks for One-Stop Video Delivery, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2021

📑 Yi Huang, Xiaoshan Yang, Junyu Gao, Changsheng Xu: Holographic Feature Learning of Egocentric-Exocentric Videos for Multi-Domain Action Recognition, IEEE Transactions on Multimedia (TMM), 2021

📜 Chaofan Chen, Xiaoshan Yang, Changsheng Xu, Xuhui Huang, Zhe Ma: ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-shot Learning, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

📑 Xuan Ma, Xiaoshan Yang, Junyu Gao, Changsheng Xu: Health Status Prediction with Local-Global Heterogeneous Behavior Graph, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2021

2020

📑 Yi Huang, Xiaoshan Yang, Junyu Gao, Jitao Sang, Changsheng Xu: Knowledge-driven Egocentric Multimodal Activity Recognition, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2020

📑 Fan Qi, Xiaoshan Yang, Changsheng Xu: Emotion Knowledge Driven Video Highlight Detection, IEEE Transactions on Multimedia (TMM), 2020

📑 Junyu Gao, Xiaoshan Yang, Yingying Zhang, Changsheng Xu: Unsupervised Video Summarization via Relation-aware Assignment Learning, IEEE Transactions on Multimedia (TMM), 2020

📑 Wei Wang, Junyu Gao, Xiaoshan Yang, Changsheng Xu: Learning Coarse-to-Fine Graph Neural Networks for Video-Text Retrieval, IEEE Transactions on Multimedia (TMM), 2020

📑 Shan Zhang, Xiaoshan Yang, Yanxia Liu, Changsheng Xu: Asymmetric multi-stage CNNs for small-scale pedestrian detection, Neurocomputing, 2020

📑 Fan Qi, Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Discriminative multimodal embedding for event classification, Neurocomputing, 2020

📜 Xuecheng Ning, Xiaoshan Yang, Changsheng Xu: Multi-hop Interactive Cross-Modal Retrieval, International Conference on Multimedia Modeling (MMM), 2020

📜 Yiming Li, Xiaoshan Yang, Changsheng Xu: Structured Neural Motifs: Scene Graph Parsing via Enhanced Context, International Conference on Multimedia Modeling (MMM), 2020

📜 Yingying Zhang, Junyu Gao, Xiaoshan Yang, Chang Liu, Yan Li, Changsheng Xu: Find Objects and Focus on Highlights: Mining Object Semantics for Video Highlight Detection via Graph Neural Networks, Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020

2019

📜 Tingting Xie, Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu, Ioannis Patras: Exploring feature representation and training strategies in temporal action localization. IEEE International Conference on Image Processing (ICIP), 2019

📜 Yi Huang, Xiaoshan Yang, Changsheng Xu: Time-Guided High-Order Attention Model of Longitudinal Heterogeneous Healthcare Data. Pacific Rim International Conference on Artificial Intelligence (PRICAI), 2019

📑 Xiaoshan Yang, Changsheng Xu: Image Captioning by Asking Questions. ACM Transactions on Multimedia Computing, Communications and Applications (TOMM), 2019

📜 Weiming Zhang, Yi Huang, Wanting Yu, Xiaoshan Yang, Wei Wang, Jitao Sang: Multimodal attribute and feature embedding for activity recognition, Proceedings of the ACM Multimedia Asia, MM Asia, 2019

2018

📑 Junyu Gao, Tianzhu Zhang, Xiaoshan Yang, Changsheng Xu: P2T: Part-to-Target Tracking via Deep Regression Learning. IEEE Trans. Image Processing (TIP) 27(6): 3074-3086 (2018)

📑 Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Deep-Structured Event Modeling for User-Generated Photos. IEEE Trans. Multimedia (TMM) 20(8): 2100-2113 (2018)

📑 Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Text2Video: An End-to-end Learning Framework for Expressing Text With Videos. IEEE Trans. Multimedia (TMM) 20(9): 2360-2370 (2018)

📑 Yifan Jiao, Zhetao Li, Shucheng Huang, Xiaoshan Yang, Bin Liu, Tianzhu Zhang: Three-Dimensional Attention-Based Deep Ranking Model for Video Highlight Detection. IEEE Trans. Multimedia (TMM) 20(10): 2693-2705 (2018)

📜 Yikun Sheng, Xiaoshan Yang, Changsheng Xu: A Standalone Demo for Quiz Game “Describe and Guess”. MIPR 2018: 206-207

📜 Yikun Sheng, Xiaoshan Yang, Xueliang Liu, Changsheng Xu: Attribute-Assisted Domain Transfer from Image to Sketch. MIPR 2018: 287-292

📜 Fan Qi, Xiaoshan Yang, Changsheng Xu: A Unified Framework for Multimodal Domain Adaptation. ACM Multimedia (MM), 2018: 429-437, Oral

Early

📑 Junyu Gao, Tianzhu Zhang, Xiaoshan Yang, Changsheng Xu: Deep Relative Tracking. IEEE Trans. Image Processing (TIP) 26(4): 1845-1858 (2017)

📜 Yifan Jiao, Xiaoshan Yang, Tianzhu Zhang, Shucheng Huang, Changsheng Xu: Video Highlight Detection via Deep Ranking Modeling. PSIVT 2017: 28-39

📑 Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu, Shuicheng Yan, M. Shamim Hossain, Ahmed Ghoneim: Deep Relative Attributes. IEEE Trans. Multimedia (TMM) 18(9): 1832-1842 (2016). Code

📑 Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Semantic Feature Mining for Video Event Understanding. ACM Transactions on Multimedia Computing, Communications and Applications (TOMM) 12(4): 55:1-55:22 (2016)

📜 Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Abnormal Event Discovery in User Generated Photos. ACM Multimedia (MM), 2016

📑 Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: A new discriminative coding method for image classification. Multimedia Systems (MSJ) 21(2): 133-145 (2015)

📑 Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Cross-Domain Feature Learning in Multimedia. IEEE Trans. Multimedia (TMM) 17(1): 64-78 (2015). Code

📑 Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu, M. Shamim Hossain: Automatic Visual Concept Learning for Social Event Understanding. IEEE Trans. Multimedia (TMM) 17(3): 346-358 (2015)

📑 Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu, Ming-Hsuan Yang: Boosted Multifeature Learning for Cross-Domain Transfer. ACM Transactions on Multimedia Computing, Communications and Applications (TOMM) 11(3): 35:1-35:18 (2015)

📑 Jianbing Shen, Xiaoshan Yang, Xuelong Li, Yunde Jia: Intrinsic Image Decomposition Using Optimization and User Scribbles. IEEE Trans. Cybernetics (TCYB) 43(2): 425-436 (2013)

📜 Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Locality discriminative coding for image classification. ICIMCS 2013: 52-55

📜 Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu, Min Xu: Graph-Guided Fusion Penalty Based Sparse Coding for Image Classification. PCM 2013: 475-484

📜 Jianbing Shen, Xiaoshan Yang, Yunde Jia, Xuelong Li: Intrinsic images using optimization. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2011: 3481-3487. Code

