Kaipeng Zhang
Papers: 20
Total Citations: 541
Papers (20)
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
ICLR 2024
320
citations
GUIOdyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices
ICCV 2025
96
citations
Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation
ICML 2025
72
citations
OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation
CVPR 2025
18
citations
Neighboring Autoregressive Modeling for Efficient Visual Generation
ICCV 2025
16
citations
REPA Works Until It Doesn’t: Early-Stopped, Holistic Alignment Supercharges Diffusion Training
NeurIPS 2025
8
citations
DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model
CVPR 2024
7
citations
Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification
AAAI 2024
4
citations
OneLLM: One Framework to Align All Modalities with Language
CVPR 2024
0
citations
ZipVL: Accelerating Vision-Language Models through Dynamic Token Sparsity
ICCV 2025
0
citations
LiT: Delving into a Simple Linear Diffusion Transformer for Image Generation
ICCV 2025
0
citations
ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges
ICCV 2025
0
citations
Position: Towards Implicit Prompt For Text-To-Image Models
ICML 2024
0
citations
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
ICML 2024
0
citations
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
ICML 2024
0
citations
Detecting Faces Using Inside Cascaded Contextual CNN
ICCV 2017
0
citations
DiffRate: Differentiable Compression Rate for Efficient Vision Transformers
ICCV 2023
0
citations
TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary Multi-Label Classification of CLIP without Training
AAAI 2024
0
citations
Neural Routing by Memory
NeurIPS 2021
0
citations
Foundation Model is Efficient Multimodal Multitask Model Selector
NeurIPS 2023
0
citations