Hao Chen

112
Papers
380
Total Citations
1
Affiliation

Affiliations

CMU

Papers (112)

VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis

CVPR 2024
70
citations

ImageFolder: Autoregressive Image Generation with Folded Tokens

ICLR 2025
63
citations

What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?

ICLR 2025 (arXiv)
56
citations

SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer

CVPR 2025
32
citations

360+x: A Panoptic Multi-modal Scene Understanding Dataset

CVPR 2024
24
citations

OSV: One Step is Enough for High-Quality Image to Video Generation

CVPR 2025
22
citations

Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

NeurIPS 2025
12
citations

FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior

ECCV 2024
12
citations

SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories

CVPR 2025
11
citations

WeatherGFM: Learning a Weather Generalist Foundation Model via In-context Learning

ICLR 2025
9
citations

Chain of Attack: On the Robustness of Vision-Language Models Against Transfer-Based Adversarial Attacks

CVPR 2025
9
citations

DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation

AAAI 2025
9
citations

TG-LLaVA: Text Guided LLaVA via Learnable Latent Embeddings

AAAI 2025
8
citations

FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification

CVPR 2025
7
citations

Fast Encoding and Decoding for Implicit Video Representation

ECCV 2024
7
citations

PEACE: Empowering Geologic Map Holistic Understanding with MLLMs

CVPR 2025 (arXiv)
6
citations

SwitchLingua: The First Large-Scale Multilingual and Multi-Ethnic Code-Switching Dataset

NeurIPS 2025
5
citations

Distilled Prompt Learning for Incomplete Multimodal Survival Prediction

CVPR 2025
4
citations

Improving Multimodal Learning Balance and Sufficiency through Data Remixing

ICML 2025
4
citations

SDP-CROWN: Efficient Bound Propagation for Neural Network Verification with Tightness of Semidefinite Programming

ICML 2025
3
citations

Point Cloud Upsampling Using Conditional Diffusion Module with Adaptive Noise Suppression

CVPR 2025
2
citations

VA-MoE: Variables-Adaptive Mixture of Experts for Incremental Weather Forecasting

ICCV 2025
2
citations

Rethinking the Bias of Foundation Model under Long-tailed Distribution

ICML 2025
1
citation

Evaluating Program Semantics Reasoning with Type Inference in System $F$

NeurIPS 2025
1
citation

Revisiting Open-Set Panoptic Segmentation

AAAI 2024
1
citation

A General Framework for Learning from Weak Supervision

ICML 2024
0
citations

Completing Visual Objects via Bridging Generation and Segmentation

ICML 2024
0
citations

Floating Anchor Diffusion Model for Multi-motif Scaffolding

ICML 2024
0
citations

Post-hoc Part-Prototype Networks

ICML 2024
0
citations

CompeteAI: Understanding the Competition Dynamics of Large Language Model-based Agents

ICML 2024
0
citations

Generative Active Learning for Long-tailed Instance Segmentation

ICML 2024
0
citations

Towards a Self-contained Data-driven Global Weather Forecasting Framework

ICML 2024
0
citations

DCAN: Deep Contour-Aware Networks for Accurate Gland Segmentation

CVPR 2016
0
citations

Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection

CVPR 2018
0
citations

Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells

CVPR 2019
0
citations

Memory-Efficient Hierarchical Neural Architecture Search for Image Denoising

CVPR 2020 (arXiv)
0
citations

BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation

CVPR 2020 (arXiv)
0
citations

ABCNet: Real-Time Scene Text Spotting With Adaptive Bezier-Curve Network

CVPR 2020 (arXiv)
0
citations

NAS-FCOS: Fast Neural Architecture Search for Object Detection

CVPR 2020
0
citations

Joint Generative and Contrastive Learning for Unsupervised Person Re-Identification

CVPR 2021 (arXiv)
0
citations

The Lottery Ticket Hypothesis for Object Recognition

CVPR 2021 (arXiv)
0
citations

Generic Perceptual Loss for Modeling Structured Output Dependencies

CVPR 2021 (arXiv)
0
citations

BoxInst: High-Performance Instance Segmentation With Box Annotations

CVPR 2021 (arXiv)
0
citations

Uni6D: A Unified CNN Framework Without Projection Breakdown for 6D Pose Estimation

CVPR 2022 (arXiv)
0
citations

What To Look at and Where: Semantic and Spatial Refined Transformer for Detecting Human-Object Interactions

CVPR 2022 (arXiv)
0
citations

A Voxel Graph CNN for Object Classification With Event Cameras

CVPR 2022 (arXiv)
0
citations

TubeR: Tubelet Transformer for Video Action Detection

CVPR 2022 (arXiv)
0
citations

Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models

CVPR 2022 (arXiv)
0
citations

Towards Scalable Neural Representation for Diverse Videos

CVPR 2023 (arXiv)
0
citations

Learning Conditional Attributes for Compositional Zero-Shot Learning

CVPR 2023 (arXiv)
0
citations

DoNet: Deep De-Overlapping Network for Cytology Instance Segmentation

CVPR 2023 (arXiv)
0
citations

Sparsely Annotated Semantic Segmentation With Adaptive Gaussian Mixtures

CVPR 2023
0
citations

Image Quality-Aware Diagnosis via Meta-Knowledge Co-Embedding

CVPR 2023 (arXiv)
0
citations

Boosting Transductive Few-Shot Fine-Tuning With Margin-Based Uncertainty Weighting and Probability Regularization

CVPR 2023
0
citations

HNeRV: A Hybrid Neural Representation for Videos

CVPR 2023 (arXiv)
0
citations

Learning To Fuse Monocular and Multi-View Cues for Multi-Frame Depth Estimation in Dynamic Scenes

CVPR 2023 (arXiv)
0
citations

Square Localization for Efficient and Accurate Object Detection

ICCV 2015
0
citations

Explaining Neural Networks Semantically and Quantitatively

ICCV 2019
0
citations

EchoTraffic: Enhancing Traffic Anomaly Understanding with Audio-Visual Insights

CVPR 2025
0
citations

Detecting 11K Classes: Large Scale Object Detection Without Fine-Grained Bounding Boxes

ICCV 2019
0
citations

Selective Feature Compression for Efficient Activity Recognition Inference

ICCV 2021 (arXiv)
0
citations

VidTr: Video Transformer Without Convolutions

ICCV 2021 (arXiv)
0
citations

ICE: Inter-Instance Contrastive Encoding for Unsupervised Person Re-Identification

ICCV 2021 (arXiv)
0
citations

Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image

ICCV 2023 (arXiv)
0
citations

Cross-Modal Translation and Alignment for Survival Analysis

ICCV 2023 (arXiv)
0
citations

CTVIS: Consistent Training for Online Video Instance Segmentation

ICCV 2023 (arXiv)
0
citations

MHCN: A Hyperbolic Neural Network Model for Multi-view Hierarchical Clustering

ICCV 2023
0
citations

FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models

ICCV 2023 (arXiv)
0
citations

Traj-MAE: Masked Autoencoders for Trajectory Prediction

ICCV 2023
0
citations

Multimodal Optimal Transport-based Co-Attention Transformer with Global Structure Consistency for Survival Prediction

ICCV 2023 (arXiv)
0
citations

SegPrompt: Boosting Open-World Segmentation via Category-Level Prompt Learning

ICCV 2023 (arXiv)
0
citations

Multi-view Self-supervised Disentanglement for General Image Denoising

ICCV 2023 (arXiv)
0
citations

Conditional Convolutions for Instance Segmentation

ECCV 2020
0
citations

3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning

ECCV 2020
0
citations

Yet Another Intermediate-Level Attack

ECCV 2020
0
citations

Unitail: Detecting, Reading, and Matching in Retail Scene

ECCV 2022
0
citations

Automatic Check-Out via Prototype-Based Classifier Learning from Single-Product Exemplars

ECCV 2022
0
citations

FCOS: Fully Convolutional One-Stage Object Detection

ICCV 2019
0
citations

Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation

CVPR 2025
0
citations

Monocular and Generalizable Gaussian Talking Head Animation

CVPR 2025
0
citations

Satellite Observations Guided Diffusion Model for Accurate Meteorological States at Arbitrary Resolution

CVPR 2025
0
citations

POMATO: Marrying Pointmap Matching with Temporal Motions for Dynamic 3D Reconstruction

ICCV 2025
0
citations

Scaling Tumor Segmentation: Best Lessons from Real and Synthetic Data

ICCV 2025
0
citations

FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization

ICCV 2025
0
citations

UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI

ICCV 2025
0
citations

Separation for Better Integration: Disentangling Edge and Motion in Event-based Deblurring

ICCV 2025
0
citations

Conditional Visual Autoregressive Modeling for Pathological Image Restoration

ICCV 2025
0
citations

SurfaceSplat: Connecting Surface Reconstruction and Gaussian Splatting

ICCV 2025
0
citations

Unified Open-World Segmentation with Multi-Modal Prompts

ICCV 2025
0
citations

Learning Concept Prerequisite Relation via Global Knowledge Relation Optimization

AAAI 2025
0
citations

Know Where You Are From: Event-Based Segmentation via Spatio-Temporal Propagation

AAAI 2025
0
citations

MM-Tracker: Motion Mamba for UAV-platform Multiple Object Tracking

AAAI 2025
0
citations

ESEG: Event-Based Segmentation Boosted by Explicit Edge-Semantic Guidance

AAAI 2025
0
citations

Time Series Supplier Allocation via Deep Black-Litterman Model

AAAI 2025
0
citations

Towards Loss-Resilient Image Coding for Unstable Satellite Networks

AAAI 2025
0
citations

PromptMRG: Diagnosis-Driven Prompts for Medical Report Generation

AAAI 2024
0
citations

Retrieval-Augmented Primitive Representations for Compositional Zero-Shot Learning

AAAI 2024
0
citations

A Dynamic GCN with Cross-Representation Distillation for Event-Based Learning

AAAI 2024
0
citations

MICA: Towards Explainable Skin Lesion Diagnosis via Multi

AAAI 2024
0
citations

DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data

CVPR 2024
0
citations

Video Frame Interpolation via Direct Synthesis with the Event-based Reference

CVPR 2024
0
citations

FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition

CVPR 2024
0
citations

Backpropagating Linearly Improves Transferability of Adversarial Examples

NeurIPS 2020
0
citations

Stochastic Gradient Descent in Correlated Settings: A Study on Gaussian Processes

NeurIPS 2020
0
citations

Practical No-box Adversarial Attacks against DNNs

NeurIPS 2020
0
citations

Long Short-Term Transformer for Online Action Detection

NeurIPS 2021
0
citations

NeRV: Neural Representations for Videos

NeurIPS 2021
0
citations

USB: A Unified Semi-supervised Learning Benchmark for Classification

NeurIPS 2022
0
citations

An In-depth Study of Stochastic Backpropagation

NeurIPS 2022
0
citations

Improving Adversarial Transferability via Intermediate-level Perturbation Decay

NeurIPS 2023
0
citations

Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly

NeurIPS 2023
0
citations

DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models

NeurIPS 2023
0
citations