Chen Zhao

44 Papers
247 Total Citations

Papers (44)

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

CVPR 2025
70
citations

End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames

CVPR 2024
51
citations

Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking

AAAI 2025
38
citations

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

ICCV 2025
22
citations

HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields

CVPR 2024
19
citations

TexOct: Generating Textures of 3D Models with Octree-based Diffusion

CVPR 2024
12
citations

Towards Automated Movie Trailer Generation

CVPR 2024
10
citations

Splatter-360: Generalizable 360 Gaussian Splatting for Wide-baseline Panoramic Images

CVPR 2025
10
citations

UltraHR-100K: Enhancing UHR Image Synthesis with A Large-Scale High-Quality Dataset

NeurIPS 2025
7
citations

SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning

CVPR 2025
3
citations

Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis

ICCV 2025
2
citations

BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation

ICCV 2025
1
citations

Auto-Regressively Generating Multi-View Consistent Images

ICCV 2025
1
citations

SolverLLM: Leveraging Test-Time Scaling for Optimization Problem via LLM-Guided Search

NeurIPS 2025
1
citations

TexGarment: Consistent Garment UV Texture Generation via Efficient 3D Structure-Guided Diffusion Transformer

CVPR 2025
0
citations

Metric-Agnostic Continual Learning for Sustainable Group Fairness

AAAI 2025
0
citations

Wavelet-based Fourier Information Interaction with Frequency Diffusion Adjustment for Underwater Image Restoration

CVPR 2024
0
citations

OSMamba: Omnidirectional Spectral Mamba with Dual-Domain Prior Generator for Exposure Correction

CVPR 2025
0
citations

Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning

CVPR 2024
0
citations

BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding

CVPR 2025
0
citations

DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses

CVPR 2024
0
citations

From Zero to Detail: Deconstructing Ultra-High-Definition Image Restoration from Progressive Spectral Perspective

CVPR 2025
0
citations

TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting

CVPR 2025
0
citations

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

CVPR 2024
0
citations

Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations

ICML 2024
0
citations

NM-Net: Mining Reliable Neighbors for Robust Feature Correspondences

CVPR 2019
0
citations

G-TAD: Sub-Graph Localization for Temporal Action Detection

CVPR 2020
0
citations

Attention Convolutional Binary Neural Tree for Fine-Grained Visual Categorization

CVPR 2020
0
citations

MAD: A Scalable Dataset for Language Grounding in Videos From Movie Audio Descriptions

CVPR 2022
0
citations

Ego4D: Around the World in 3,000 Hours of Egocentric Video

CVPR 2022
0
citations

Large-Capacity and Flexible Video Steganography via Invertible Neural Network

CVPR 2023
0
citations

Re2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization

CVPR 2023
0
citations

Open Set Action Recognition via Multi-Label Evidential Learning

CVPR 2023
0
citations

Video Self-Stitching Graph Network for Temporal Action Localization

ICCV 2021
0
citations

Progressive Correspondence Pruning by Consensus Learning

ICCV 2021
0
citations

EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries

ICCV 2023
0
citations

A Unified Continual Learning Framework with General Parameter-Efficient Tuning

ICCV 2023
0
citations

FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model

ICCV 2023
0
citations

Learning Semantic Neural Tree for Human Parsing

ECCV 2020
0
citations

Sparse-to-Dense Depth Completion Revisited: Sampling Strategy and Graph Construction

ECCV 2020
0
citations

Fusing Local Similarities for Retrieval-Based 3D Orientation Estimation of Unseen Objects

ECCV 2022
0
citations

Unsupervised Learning of 3D Semantic Keypoints with Mutual Reconstruction

ECCV 2022
0
citations

R-DFCIL: Relation-Guided Representation Learning for Data-Free Class Incremental Learning

ECCV 2022
0
citations

End-to-End Active Speaker Detection

ECCV 2022
0
citations