Bootstrap Your Own Prior: Towards Distribution-Agnostic Novel Class Discovery
The IEEE/CVF Computer Vision and Pattern Recognition Conference. CVPR 2023
Semantic Scene Completion with Cleaner Self
The IEEE/CVF Computer Vision and Pattern Recognition Conference. CVPR 2023
Unbiased Multiple Instance Learning for Weakly Supervised Video Anomaly Detection
The IEEE/CVF Computer Vision and Pattern Recognition Conference. CVPR 2023
Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection
The Eleventh International Conference on Learning Representations. ICLR 2023
Debiased Fine-Tuning for Vision-language Models by Prompt Regularization
The AAAI Conference on Artificial Intelligence. AAAI 2023
Respecting Transfer Gap in Knowledge Distillation
Conference on Neural Information Processing Systems. NeurIPS 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
International Journal of Computer Vision
[arxiv]
Identifying Hard Noise in Long-Tailed Sample Distribution [oral]
European Conference on Computer Vision. ECCV 2022
Invariant Feature Learning for Generalized Long-Tailed Classification
European Conference on Computer Vision. ECCV 2022
Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization
European Conference on Computer Vision. ECCV 2022
Equivariance and Invariance Inductive Bias for Learning from Insufficient Data
European Conference on Computer Vision. ECCV 2022
Certified Robustness Against Natural Language Attacks by Causal Intervention
International Conference on Machine Learning. ICML 2022
Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2022
KQA Pro: A Dataset with Explicit Compositional Programs for Complex Question Answering over Knowledge Base
Main Conference of ACL 2022
Learning to Imagine: Integrating Counterfactual Thinking in Neural Discrete Reasoning
Main Conference of ACL 2022
On Non-Random Missing Labels in Semi-Supervised Learning
International Conference on Learning Representations. ICLR 2022.
Cross-Domain Empirical Risk Minimization for Unbiased Long-tailed Classification [oral]
The AAAI Conference on Artificial Intelligence. AAAI 2022.
Deconfounded Visual Grounding [oral]
The AAAI Conference on Artificial Intelligence. AAAI 2022.
[arxiv]
Deconfounded Image Captioning: A Causal Retrospect
IEEE Transactions on Pattern Analysis and Machine Intelligence. TPAMI.
[link]
How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness?
Conference on Neural Information Processing Systems. NeurIPS 2021. Virtual. December 2021
[arxiv]
Introspective Distillation for Robust Question Answering
Conference on Neural Information Processing Systems. NeurIPS 2021. Virtual. December 2021
Self-Supervised Learning Disentangled Group Representation as Feature [spotlight]
Conference on Neural Information Processing Systems. NeurIPS 2021. Virtual. December 2021
TransferNet: An Effective and Transparent Framework for Multi-hop Question Answering over Relation Graph
Conference on Empirical Methods in Natural Language Processing. EMNLP 2021. Virtual. November 2021
Transporting Causal Mechanisms for Unsupervised Domain Adaptation [oral]
IEEE International Conference on Computer Vision. ICCV 2021. Virtual. October 2021
Causal Attention for Unbiased Visual Recognition
IEEE International Conference on Computer Vision. ICCV 2021. Virtual. October 2021
Self-Regulation for Semantic Segmentation
IEEE International Conference on Computer Vision. ICCV 2021. Virtual. October 2021
Auto-Parsing Network for Image Captioning and Visual Question Answering
IEEE International Conference on Computer Vision. ICCV 2021. Virtual. October 2021
[arxiv]
Adversarial Visual Robustness by Causal Intervention
arXiv preprint 2021
[arxiv]
Are Missing Links Predictable? An Inferential Benchmark for Knowledge Graph Completion
Association for Computational Linguistics and International Joint Conference on Natural Language Processing. ACL-IJCNLP 2021
Clicks can be Cheating: Counterfactual Recommendation for Mitigating Clickbait Issue
Special Interest Group on Information Retrieval. ACM SIGIR 2021
[arxiv]
Empowering Language Understanding with Counterfactual Reasoning
Findings of ACL 2021
Cross-GCN: Enhancing Graph Convolutional Network with k-Order Feature Interactions
IEEE Transactions on Knowledge and Data Engineering. IEEE TKDE 2021
[arxiv]
Distilling Causal Effect of Data in Class-Incremental Learning
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2021
Counterfactual VQA: A Cause-Effect Look at Language Bias
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2021
Counterfactual Zero-Shot and Open-Set Visual Recognition
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2021
Causal Attention for Vision-Language Tasks
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2021
The Blessings of Unlabeled Background in Untrimmed Videos
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2021
Align R-CNN: A Pairwise Head Network for Visual Relationship Detection
IEEE Transactions on Multimedia. TMM
Auto-encoding and Distilling Scene Graphs for Image Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence. TPAMI
[link]
Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding
The AAAI Conference on Artificial Intelligence. AAAI 2021
Causal Intervention for Weakly-Supervised Semantic Segmentation [oral]
34th Conference on Neural Information Processing Systems, NeurIPS 2020, Vancouver, Canada.
Interventional Few-Shot Learning
34th Conference on Neural Information Processing Systems, NeurIPS 2020, Vancouver, Canada.
Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect
34th Conference on Neural Information Processing Systems, NeurIPS 2020, Vancouver, Canada.
Hierarchical Scene Graph Encoder-Decoder for Image Paragraph Captioning
ACM Multimedia 2020.
Feature Pyramid Transformer
European Conference on Computer Vision. ECCV 2020
Self-Adaptive Neural Module Transformer for Visual Question Answering
IEEE Transactions on Multimedia. TMM 2020
Iterative Context-Aware Graph Inference for Visual Dialog [oral]
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2020. Seattle, USA. June 2020
Unbiased Scene Graph Generation from Biased Training [oral]
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2020. Seattle, USA. June 2020
Visual Commonsense R-CNN
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2020. Seattle, USA. June 2020
Two Causal Principles for Improving Visual Dialog
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2020. Seattle, USA. June 2020
Learning to Segment the Tail
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2020. Seattle, USA. June 2020
Counterfactual Samples Synthesizing for Robust Visual Question Answering
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2020. Seattle, USA. June 2020
More Grounded Image Captioning by Distilling Image-Text Matching Model
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2020. Seattle, USA. June 2020
Learning Filter Pruning Criteria for Deep Convolutional Neural Networks Acceleration
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2020. Seattle, USA. June 2020
Multi-Level Policy and Reward-Based Deep Reinforcement Learning Framework for Image Captioning
IEEE Transactions on Multimedia. TMM 2020
[Link]
General Partial Label Learning via Dual Bipartite Graph Autoencoder
The AAAI Conference on Artificial Intelligence. AAAI 2020
Learning to Assemble Neural Module Tree Networks for Visual Grounding [oral]
IEEE International Conference on Computer Vision. ICCV 2019. Seoul, Korea, November 2019
Counterfactual Critic Multi-Agent Training for Scene Graph Generation [oral]
IEEE International Conference on Computer Vision. ICCV 2019. Seoul, Korea, November 2019
Learning to Collocate Neural Modules for Image Captioning
IEEE International Conference on Computer Vision. ICCV 2019. Seoul, Korea, November 2019
Making History Matter: History-Advantage Sequence Training for Visual Dialog
IEEE International Conference on Computer Vision. ICCV 2019. Seoul, Korea, November 2019
[arxiv preprint] [2nd place in 1st VisualDialog Challenge]
Variational Context: Exploiting Visual and Textual Context for Grounding Referring Expressions
IEEE Transactions on Pattern Analysis and Machine Intelligence. TPAMI 2019
Single-shot Semantic Image Inpainting with Densely Connected Generative Networks
ACM International Conference on Multimedia. MM 2019. Nice, France, October 2019
[Link]
Learning Using Privileged Information for Food Recognition
ACM International Conference on Multimedia. MM 2019. Nice, France, October 2019
[Link]
Question-Aware Tube-Switch Network for Video Question Answering
ACM International Conference on Multimedia. MM 2019. Nice, France, October 2019
[Link]
Fast Discrete Collaborative Multi-modal Hashing for Large-scale Multimedia Retrieval
IEEE Transactions on Knowledge and Data Engineering. TKDE 2019
[Link]
Learning to Compose and Reason with Language Tree Structures for Visual Grounding
IEEE Transactions on Pattern Analysis and Machine Intelligence. TPAMI 2019
Context-Aware Visual Policy Network for Fine-Grained Image Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence. TPAMI 2019
Explainable and Explicit Visual Reasoning over Scene Graphs
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2019. Long Beach, USA. June 2019
Recursive Visual Attention in Visual Dialog [oral]
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2019. Long Beach, USA. June 2019
Auto-Encoding Scene Graphs for Image Captioning [oral]
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2019. Long Beach, USA. June 2019
Learning to Compose Dynamic Tree Structures for Visual Contexts [oral]
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2019. Long Beach, USA. June 2019
DeepChannel: Salience Estimation by Contrastive Learning for Extractive Document Summarization
The Thirty-Third AAAI Conference on Artificial Intelligence. AAAI 2019
[arxiv preprint] [codes]
Learning to Embed Sentences Using Attentive Recursive Trees
The Thirty-Third AAAI Conference on Artificial Intelligence. AAAI 2019
[arxiv preprint] [codes]
Low-shot Learning via Covariance-Preserving Adversarial Augmentation Network
Thirty-second Conference on Neural Information Processing Systems. NIPS 2018
More is Better: Precise and Detailed Image Captioning using Online Positive Recall and Missing Concepts Mining
IEEE Transactions on Image Processing. TIP 2018
Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features
15th European Conference on Computer Vision. ECCV 2018. Munich, Germany. Sep 2018
[arxiv preprint] [merged in vtranse]
Context-Aware Visual Policy Network for Sequence-Level Image Captioning [oral]
ACM International Conference on Multimedia. MM 2018. Seoul, Korea, October 2018
[arxiv preprint] [codes]
Discrete Factorization Machines for Fast Feature-based Recommendation
The 27th International Joint Conference on Artificial Intelligence. IJCAI 2018. Stockholm, Sweden, July, 2018
Multi-Level Policy and Reward Reinforcement Learning for Image Captioning
The 27th International Joint Conference on Artificial Intelligence. IJCAI 2018. Stockholm, Sweden, July, 2018
Self-Supervised Video Hashing with Hierarchical Binary Auto-encoder
IEEE Transactions on Image Processing. TIP 2018
Attributed Social Network Embedding
IEEE Transactions on Knowledge and Data Engineering. TKDE 2018
Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Network
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2018. Salt Lake City, USA. June 2018
[arxiv preprint] [codes]
Grounding Referring Expressions in Images by Variational Context
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2018. Salt Lake City, USA. June 2018
[arxiv preprint] [codes]
Learning to Guide Decoding for Image Captioning
The Thirty-Second AAAI Conference on Artificial Intelligence. AAAI 2018. New Orleans, USA, Feb 2018
PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN
International Conference on Computer Vision. ICCV 2017. Venice, Italy, October 2017
Improving Event Extraction via Cross-Modal Integration
ACM International Conference on Multimedia. MM 2017. Mountain View, CA USA, October 2017
Enhancing Micro-video Understanding by Harnessing External Sounds [oral]
ACM International Conference on Multimedia. MM 2017. Mountain View, CA USA, October 2017
Video Visual Relation Detection
ACM International Conference on Multimedia. MM 2017. Mountain View, CA USA, October 2017
Video Question Answering via Gradually Refined Attention over Appearance and Motion
ACM International Conference on Multimedia. MM 2017. Mountain View, CA USA, October 2017
Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks
The 26th International Joint Conference on Artificial Intelligence. IJCAI 2017. Melbourne, Australia, August 2017
Video Captioning with Attention-based LSTM and Semantic Consistency
IEEE Transactions on Multimedia. TMM 2017
VideoWhisper: Towards Discriminative Unsupervised Video Feature Learning with Attention Based Recurrent Neural Networks
IEEE Transactions on Multimedia. TMM 2017
Attentive Collaborative Filtering: Multimedia Recommendation with Feature- and Item-level Attention [oral]
International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR 2017. Tokyo, Japan. August 2017
Visual Translation Embedding Network for Visual Relation Detection
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2017. Hawaii, USA. July 2017
[arxiv preprint] [codes & data]
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2017. Hawaii, USA. July 2017
[arxiv preprint] [codes]
Matryoshka Peek: Towards Learning Fine-grained, Robust, Discriminative Features for Product Search
IEEE Transactions on Multimedia. TMM 2017
Neural Collaborative Filtering [oral]
26th International World Wide Web Conference. WWW 2017. Perth, Australia, April 2017
I Know What You Want to Express: Sentence Element Inference by Incorporating External Knowledge Base
IEEE Transactions on Knowledge and Data Engineering. TKDE 2016
Micro Tells Macro: Predicting the Popularity of Micro-Videos via a Transductive Model [oral]
ACM International Conference on Multimedia. MM 2016. Amsterdam, The Netherlands, October 2016
Play and Rewind: Optimizing Binary Representations of Videos by Self-Supervised Temporal Hashing [oral]
ACM International Conference on Multimedia. MM 2016. Amsterdam, The Netherlands, October 2016
Learning from Collective Intelligence: Feature Learning Using Social Images and Tags
ACM Transactions on Multimedia Computing, Communications and Applications. TOMM (formerly known as TOMCCAP) 2016
Event Classification in Microblog via Social Tracking
ACM Transactions on Intelligent Systems and Technology. TIST. 2016
Discrete Collaborative Filtering [oral] Best Paper Honorable Mention
International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR 2016. Pisa, Italy. July 2016
Fast Matrix Factorization for Online Recommendation with Implicit Feedback [oral]
International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR 2016. Pisa, Italy. July 2016
Online Collaborative Learning for Open-Vocabulary Visual Classifiers
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2016. Las Vegas, USA. Jun 2016
Discrete Image Hashing Using Large Weakly Annotated Photo Collections
The Thirtieth AAAI Conference on Artificial Intelligence. AAAI 2016. Phoenix, Arizona, USA. Feb 2016.
Learning Image and User Features for Recommendation in Social Networks
IEEE International Conference on Computer Vision. ICCV 2015. Santiago, Chile. Nov 2015.
Learning Features from Large-Scale, Noisy and Social Image-Tag Collection
ACM International Conference on Multimedia. MM 2015. Brisbane, Australia. Oct 2015.
Visual Coding in a Semantic Hierarchy [oral]
ACM International Conference on Multimedia. MM 2015. Brisbane, Australia. Oct 2015.
Deep Fusion of Multiple Semantic Cues for Complex Event Recognition
IEEE Transactions on Image Processing. TIP 2015
Deep Aging Face Verification with Large Gaps
IEEE Transactions on Multimedia. TMM 2015
Multimedia Summarization for Social Events in Microblog Stream
IEEE Transactions on Multimedia. TMM 2015
Enhancing Video Event Recognition using Automatically Constructed Semantic-Visual Knowledge Base
IEEE Transactions on Multimedia. TMM 2015
Start from Scratch: Towards Automatically Identifying, Modeling, and Naming Visual Attributes [oral]
ACM International Conference on Multimedia. MM 2014. Orlando, USA. Nov 2014.
[pdf] [slides] [demo (ID:deep, PWD: deep123456)] [models] [codes]
Perception-Guided Multimodal Feature Fusion for Photo Aesthetics Assessment [oral]
ACM International Conference on Multimedia. MM 2014. Orlando, USA. Nov 2014.
One of a Kind: User Profiling by Social Curation [oral]
ACM International Conference on Multimedia. MM 2014. Orlando, USA. Nov 2014.
[pdf][slides]
Image Tagging with Social Assistance [oral]
ACM International Conference on Multimedia Retrieval. ICMR 2014. Glasgow, Scotland, Apr 2014.
Robust (Semi-) Nonnegative Graph Embedding
IEEE Transactions on Image Processing. TIP 2014
Attribute-augmented Semantic Hierarchy: Towards a Unified Framework for Content-based Image Retrieval
ACM Transactions on Multimedia Computing, Communications and Applications. TOMM (formerly known as TOMCCAP) 2014
Attribute-augmented Semantic Hierarchy [oral]
ACM International Conference on Multimedia. MM 2013. Barcelona, Spain. Oct 2013.
[pdf] [slides] Best Student Paper
Attribute Feedback [oral]
ACM International Conference on Multimedia. MM 2012. Nara, Japan. Oct 2012.
[pdf] [slides] [demo] Best Demo Runner-up
Robust Non-negative Graph Embedding: Towards Noisy Data, Unreliable Graphs, and Noisy Labels
IEEE International Conference on Computer Vision and Pattern Recognition. CVPR 2012. Rhode Island, USA. June 2012.