Zhongang Qi
12 Papers
1,547 Total Citations
Papers (12)
T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion Models
AAAI 2024, arXiv
1,423
citations
Taming Rectified Flow for Inversion and Editing
ICML 2025
110
citations
EA-VTR: Event-Aware Video-Text Retrieval
ECCV 2024, arXiv
7
citations
UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
NeurIPS 2025, arXiv
4
citations
Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion
CVPR 2025, arXiv
3
citations
Mamba-3VL: Taming State Space Model for 3D Vision Language Learning
ICCV 2025
0
citations
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities
AAAI 2025
0
citations
VisionMath: Vision-Form Mathematical Problem-Solving
ICCV 2025
0
citations
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
CVPR 2024
0
citations
How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval?
CVPR 2024
0
citations
DOGR: Towards Versatile Visual Document Grounding and Referring
ICCV 2025
0
citations
Less is More: Empowering GUI Agent with Context-Aware Simplification
ICCV 2025
0
citations