π Baochen Xiong, Xiaoshan Yang, Yaguang Song, Yaowei Wang, Changsheng Xu: Modality-Collaborative Test-Time Adaptation for Action Recognition, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
π Yifan Xu, Xiaoshan Yang, Yaguang Song, Changsheng Xu: Libra: Building Decoupled Vision System on Large Language Models, International Conference on Machine Learning (ICML), 2024
π Xiaoshan Yang, Baochen Xiong, Yi Huang, Changsheng Xu: Cross-Modal Federated Human Activity Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
π Yifan Xu, Mengdan Zhang, Chaoyou Fu, Peixian Chen, Xiaoshan Yang, Ke Li, Changsheng Xu: Multi-modal queried object detection in the wild, Advances in Neural Information Processing Systems (NeurIPS), 2024
π Fan Qi, Huaiwen Zhang, Xiaoshan Yang, Changsheng Xu: A Versatile Multimodal Learning Framework For Zero-shot Emotion Recognition, IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2024
π Baochen Xiong, Xiaoshan Yang, Yaguang Song, Yaowei Wang, Changsheng Xu: Client-Adaptive Cross-Model Reconstruction Network for Modality-Incomplete Multimodal Federated Learning, ACM Multimedia (MM), 2023
π Yiming Li, Xiaoshan Yang, Changsheng Xu: Iterative Learning with Extra and Inner Knowledge for Long-tail Dynamic Scene Graph Generation, ACM Multimedia (MM), 2023
π Qinghao Ye, Haiyang Xu, Ming Yan, Chenlin Zhao, Junyang Wang, Xiaoshan Yang, Ji Zhang, Fei Huang, Jitao Sang, Changsheng Xu: mPLUG-Octopus: The Versatile Assistant Empowered by A Modularized End-to-End Multimodal LLM, ACM Multimedia (MM), 2023
π Linhui Xiao, Xiaoshan Yang, Fang Peng, Ming Yan, Yaowei Wang, Changsheng Xu: CLIP-VG: Self-paced Curriculum Adapting of CLIP for Visual Grounding, IEEE Transactions on Multimedia (TMM), 2023
π Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu: SgVA-CLIP: Semantic-guided visual adapting of vision-language models for few-shot image classification, IEEE Transactions on Multimedia (TMM), 2023
π Hao Liu, Xiaoshan Yang, Changsheng Xu: Counterfactual scenario-relevant knowledge-enriched multi-modal emotion reasoning, ACM Transactions on Multimedia Computing, Communications and Applications (TOMM) 19 … 1, 2023
π Yaguang Song, Xiaoshan Yang, Yaowei Wang, Changsheng Xu: Recovering generalization via pre-training-like knowledge distillation for out-of-distribution visual question answering, IEEE Transactions on Multimedia (TMM), 2023
π Yaguang Song, Xiaoshan Yang, Changsheng Xu: Self-supervised calorie-aware heterogeneous graph networks for food recommendation, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2023
π Yuyang Wanyan, Xiaoshan Yang, Chaofan Chen, Changsheng Xu: Active Exploration of Multimodal Complementarity for Few-Shot Action Recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
π Chaofan Chen, Xiaoshan Yang, Jinpeng Zhang, Bo Dong, Changsheng Xu: Category Knowledge-guided Parameter Calibration for Few-shot Object Detection. IEEE Transactions on Image Processing (TIP), 2023
π Xuan Ma, Xiaoshan Yang, Changsheng Xu: Multi-Source Knowledge Reasoning Graph Network for Multi-modal Commonsense Inference. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2022
π Yuyang Wanyan, Xiaoshan Yang, Xuan Ma, Changsheng Xu: Dual Scene Graph Convolutional Network for Motivation Prediction. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2022
π Yi Huang, Xiaoshan Yang, Ji Zhang, Changsheng Xu: Relative Alignment Network for Source-Free Multimodal Video Domain Adaptation. ACM Multimedia (MM), 2022
π Chaofan Chen, Xiaoshan Yang, Ming Yan, Changsheng Xu: Attribute-guided Dynamic Routing Graph Network for Transductive Few-shot Learning. ACM Multimedia (MM), 2022
π Fan Qi, Zixin Zhang, Xiaoshan Yang, Huaiwen Zhang, Changsheng Xu: Feeling Without Sharing: A Federated Video Emotion Recognition Framework Via Privacy-Agnostic Hybrid Aggregation. ACM Multimedia (MM), 2022
π Yiming Li, Xiaoshan Yang, Changsheng Xu: Dynamic Scene Graph Generation via Anticipatory Pre-training. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
π Jiabo Ye, Junfeng Tian, Ming Yan, Xiaoshan Yang, Xuwu Wang, Ji Zhang, Liang He, Xin Lin: Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
π Xiaoshan Yang, Baochen Xiong, Yi Huang, Changsheng Xu: Cross-Modal Federated Human Activity Recognition via Modality-Agnostic and Modality-Specific Representation Learning. AAAI Conference on Artificial Intelligence (AAAI), 2022, Oral
π Xinhong Ma, Xiaoshan Yang, Junyu Gao, Changsheng Xu: The Model May Fit You: User-Generalized Cross-Modal Retrieval. IEEE Transactions on Multimedia (TMM), 2022
π Yiming Li, Xiaoshan Yang, Xuhui Huang, Zhe Ma, and Changsheng Xu: Zero-shot Predicate Prediction for Scene Graph Parsing. IEEE Transactions on Multimedia (TMM), 2022
π Yaguang Song, Xiaoshan Yang, Changsheng Xu: Self-supervised Calorie-aware Heterogeneous Graph Networks for Food Recommendation, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2022
π Yi Huang, Xiaoshan Yang, Changsheng Xu: Multimodal Global Relation Knowledge Distillation for Egocentric Action Anticipation, ACM Multimedia (MM), 2021, Oral
π Fan Qi, Xiaoshan Yang, Changsheng Xu: Zero-shot Video Emotion Recognition via Multimodal Protagonist-aware Transformer Network, ACM Multimedia (MM), 2021, Oral
π Yaguang Song, Junyu Gao, Xiaoshan Yang, Changsheng Xu: Learning Hierarchical Video Graph Networks for One-Stop Video Delivery, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2021
π Yi Huang, Xiaoshan Yang, Junyu Gao, Changsheng Xu: Holographic Feature Learning of Egocentric-Exocentric Videos for Multi-Domain Action Recognition, IEEE Transactions on Multimedia (TMM), 2021
π Chaofan Chen, Xiaoshan Yang, Changsheng Xu, Xuhui Huang, Zhe Ma: ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-shot Learning, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
π Xuan Ma, Xiaoshan Yang, Junyu Gao, Changsheng Xu: Health Status Prediction with Local-Global Heterogeneous Behavior Graph, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2021
π Yi Huang, Xiaoshan Yang, Junyu Gao, Jitao Sang, Changsheng Xu: Knowledge-driven Egocentric Multimodal Activity Recognition, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2020
π Fan Qi, Xiaoshan Yang, Changsheng Xu: Emotion Knowledge Driven Video Highlight Detection, IEEE Transactions on Multimedia (TMM), 2020
π Junyu Gao, Xiaoshan Yang, Yingying Zhang, Changsheng Xu: Unsupervised Video Summarization via Relation-aware Assignment Learning, IEEE Transactions on Multimedia (TMM), 2020
π Wei Wang, Junyu Gao, Xiaoshan Yang, Changsheng Xu: Learning Coarse-to-Fine Graph Neural Networks for Video-Text Retrieval, IEEE Transactions on Multimedia (TMM), 2020
π Shan Zhang, Xiaoshan Yang, Yanxia Liu, Changsheng Xu: Asymmetric multi-stage CNNs for small-scale pedestrian detection, Neurocomputing, 2020
π Fan Qi, Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Discriminative multimodal embedding for event classification, Neurocomputing, 2020
π Xuecheng Ning, Xiaoshan Yang, Changsheng Xu: Multi-hop Interactive Cross-Modal Retrieval, International Conference on Multimedia Modeling (MMM), 2020
π Yiming Li, Xiaoshan Yang, Changsheng Xu: Structured Neural Motifs: Scene Graph Parsing via Enhanced Context, International Conference on Multimedia Modeling (MMM), 2020
π Yingying Zhang, Junyu Gao, Xiaoshan Yang, Chang Liu, Yan Li, Changsheng Xu: Find Objects and Focus on Highlights: Mining Object Semantics for Video Highlight Detection via Graph Neural Networks, Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020
π Tingting Xie, Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu, Ioannis Patras: Exploring feature representation and training strategies in temporal action localization. IEEE International Conference on Image Processing (ICIP), 2019
π Yi Huang, Xiaoshan Yang, Changsheng Xu: Time-Guided High-Order Attention Model of Longitudinal Heterogeneous Healthcare Data. Pacific Rim International Conference on Artificial Intelligence (PRICAI), 2019
π Xiaoshan Yang, Changsheng Xu: Image Captioning by Asking Questions. ACM Transactions on Multimedia Computing, Communications and Applications (TOMM), 2019
π Weiming Zhang, Yi Huang, Wanting Yu, Xiaoshan Yang, Wei Wang, Jitao Sang: Multimodal attribute and feature embedding for activity recognition, ACM Multimedia Asia (MM Asia), 2019
π Junyu Gao, Tianzhu Zhang, Xiaoshan Yang, Changsheng Xu: P2T: Part-to-Target Tracking via Deep Regression Learning. IEEE Trans. Image Processing (TIP) 27(6): 3074-3086 (2018)
π Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Deep-Structured Event Modeling for User-Generated Photos. IEEE Trans. Multimedia (TMM) 20(8): 2100-2113 (2018)
π Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Text2Video: An End-to-end Learning Framework for Expressing Text With Videos. IEEE Trans. Multimedia (TMM) 20(9): 2360-2370 (2018)
π Yifan Jiao, Zhetao Li, Shucheng Huang, Xiaoshan Yang, Bin Liu, Tianzhu Zhang: Three-Dimensional Attention-Based Deep Ranking Model for Video Highlight Detection. IEEE Trans. Multimedia (TMM) 20(10): 2693-2705 (2018)
π Yikun Sheng, Xiaoshan Yang, Changsheng Xu: A Standalone Demo for Quiz Game “Describe and Guess”. MIPR 2018: 206-207
π Yikun Sheng, Xiaoshan Yang, Xueliang Liu, Changsheng Xu: Attribute-Assisted Domain Transfer from Image to Sketch. MIPR 2018: 287-292
π Fan Qi, Xiaoshan Yang, Changsheng Xu: A Unified Framework for Multimodal Domain Adaptation. ACM Multimedia (MM), 2018: 429-437, Oral
π Junyu Gao, Tianzhu Zhang, Xiaoshan Yang, Changsheng Xu: Deep Relative Tracking. IEEE Trans. Image Processing (TIP) 26(4): 1845-1858 (2017)
π Yifan Jiao, Xiaoshan Yang, Tianzhu Zhang, Shucheng Huang, Changsheng Xu: Video Highlight Detection via Deep Ranking Modeling. PSIVT 2017: 28-39
π Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu, Shuicheng Yan, M. Shamim Hossain, Ahmed Ghoneim: Deep Relative Attributes. IEEE Trans. Multimedia (TMM) 18(9): 1832-1842 (2016). Code
π Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Semantic Feature Mining for Video Event Understanding. ACM Transactions on Multimedia Computing, Communications and Applications (TOMM) 12(4): 55:1-55:22 (2016)
π Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Abnormal Event Discovery in User Generated Photos. ACM Multimedia (MM), 2016
π Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: A new discriminative coding method for image classification. Multimedia Systems (MSJ) 21(2): 133-145 (2015)
π Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Cross-Domain Feature Learning in Multimedia. IEEE Trans. Multimedia (TMM) 17(1): 64-78 (2015). Code
π Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu, M. Shamim Hossain: Automatic Visual Concept Learning for Social Event Understanding. IEEE Trans. Multimedia (TMM) 17(3): 346-358 (2015)
π Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu, Ming-Hsuan Yang: Boosted Multifeature Learning for Cross-Domain Transfer. ACM Transactions on Multimedia Computing, Communications and Applications (TOMM) 11(3): 35:1-35:18 (2015)
π Jianbing Shen, Xiaoshan Yang, Xuelong Li, Yunde Jia: Intrinsic Image Decomposition Using Optimization and User Scribbles. IEEE Trans. Cybernetics (TCYB) 43(2): 425-436 (2013)
π Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu: Locality discriminative coding for image classification. ICIMCS 2013: 52-55
π Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu, Min Xu: Graph-Guided Fusion Penalty Based Sparse Coding for Image Classification. PCM 2013: 475-484
π Jianbing Shen, Xiaoshan Yang, Yunde Jia, Xuelong Li: Intrinsic images using optimization. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2011: 3481-3487. Code