Zeyuan Chen

17 Papers
534 Total Citations

Papers (17)

BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions

AAAI 2024, arXiv
190
citations

HIVE: Harnessing Human Feedback for Instructional Visual Editing

CVPR 2024
164
citations

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization

ICLR 2024
104
citations

Dolfin: Diffusion Layout Transformers without Autoencoder

ECCV 2024
25
citations

Bayesian Diffusion Models for 3D Shape Reconstruction

CVPR 2024
23
citations

X-Dyna: Expressive Dynamic Human Image Animation

CVPR 2025
13
citations

X-Dancer: Expressive Music to Human Dance Video Generation

ICCV 2025
9
citations

YOLO-Count: Differentiable Object Counting for Text-to-Image Generation

ICCV 2025
6
citations

Structured Policy Optimization: Enhance Large Vision-Language Model via Self-referenced Dialogue

ICCV 2025
0
citations

PSD: Principled Synthetic-to-Real Dehazing Guided by Physical Priors

CVPR 2021
0
citations

VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution

CVPR 2022
0
citations

GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation

ICCV 2023
0
citations

Uni-3D: A Universal Model for Panoptic 3D Scene Reconstruction

ICCV 2023
0
citations

Burn after Reading: Online Adaptation for Cross-Domain Streaming Data

ECCV 2022
0
citations

CADGrasp: Learning Contact and Collision Aware General Dexterous Grasping in Cluttered Scenes

NeurIPS 2025
0
citations

DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion

ICCV 2025
0
citations

CASA: Category-agnostic Skeletal Animal Reconstruction

NeurIPS 2022
0
citations