Xiaojun Chang

42 Papers
130 Total Citations

Papers (42)

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-Form Layout-to-Image Generation

AAAI 2024
26
citations

MLP Can Be A Good Transformer Learner

CVPR 2024
20
citations

Dense Audio-Visual Event Localization Under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration

AAAI 2025
18
citations

OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation

CVPR 2025
18
citations

SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS

ICLR 2024
16
citations

RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation

CVPR 2025
12
citations

Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation

AAAI 2024 (arXiv)
11
citations

HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation

AAAI 2025
8
citations

Towards Efficient General Feature Prediction in Masked Skeleton Modeling

ICCV 2025 (arXiv)
1
citation

Overcoming Multi-Model Forgetting in One-Shot NAS With Diversity Maximization

CVPR 2020
0
citations

Vision-Language Navigation With Self-Supervised Auxiliary Reasoning Tasks

CVPR 2020 (arXiv)
0
citations

Vision-Dialog Navigation by Exploring Cross-Modal Memory

CVPR 2020 (arXiv)
0
citations

Dynamic Slimmable Network

CVPR 2021 (arXiv)
0
citations

SOON: Scenario Oriented Object Navigation With Graph-Based Exploration

CVPR 2021 (arXiv)
0
citations

Cross-Modal Clinical Graph Transformer for Ophthalmic Report Generation

CVPR 2022
0
citations

BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule

CVPR 2022 (arXiv)
0
citations

Knowledge Distillation via the Target-Aware Transformer

CVPR 2022 (arXiv)
0
citations

Beyond Fixation: Dynamic Window Visual Transformer

CVPR 2022 (arXiv)
0
citations

Dual-AI: Dual-Path Actor Interaction Learning for Group Activity Recognition

CVPR 2022
0
citations

Automated Progressive Learning for Efficient Training of Vision Transformers

CVPR 2022 (arXiv)
0
citations

Self-Supervised Global-Local Structure Modeling for Point Cloud Domain Adaptation With Reliable Voted Pseudo Labels

CVPR 2022
0
citations

Dynamic Graph Enhanced Contrastive Learning for Chest X-Ray Report Generation

CVPR 2023
0
citations

Complex Event Detection by Identifying Reliable Shots From Untrimmed Videos

ICCV 2017
0
citations

BossNAS: Exploring Hybrid CNN-Transformers With Block-Wisely Self-Supervised Neural Architecture Search

ICCV 2021 (arXiv)
0
citations

Exploring Inter-Channel Correlation for Diversity-Preserved Knowledge Distillation

ICCV 2021
0
citations

FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration

ICCV 2023 (arXiv)
0
citations

HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation

ICCV 2023
0
citations

Mining Inter-Video Proposal Relations for Video Object Detection

ECCV 2020
0
citations

An Efficient Spatio-Temporal Pyramid Transformer for Action Detection

ECCV 2022
0
citations

Vision-Language Navigation With Random Environmental Mixup

ICCV 2021 (arXiv)
0
citations

Towards Open-Vocabulary Audio-Visual Event Localization

CVPR 2025
0
citations

ProAgent: Building Proactive Cooperative Agents with Large Language Models

AAAI 2024
0
citations

Video Recognition in Portrait Mode

CVPR 2024
0
citations

They Are Not Equally Reliable: Semantic Event Search Using Differentiated Concept Classifiers

CVPR 2016
0
citations

Reinforcement Cutting-Agent Learning for Video Object Segmentation

CVPR 2018
0
citations

ZSTAD: Zero-Shot Temporal Activity Detection

CVPR 2020 (arXiv)
0
citations

Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation

CVPR 2020
0
citations

Unity Style Transfer for Person Re-Identification

CVPR 2020 (arXiv)
0
citations

Differentiable Neural Architecture Search in Equivalent Space with Exploration Enhancement

NeurIPS 2020
0
citations

Hierarchical Neural Architecture Search for Deep Stereo Matching

NeurIPS 2020
0
citations

Mask Propagation for Efficient Video Semantic Segmentation

NeurIPS 2023
0
citations

Complex Event Detection using Semantic Saliency and Nearly-Isotonic SVM

ICML 2015
0
citations