Kai Chen
37
Papers
586
Total Citations
Papers (37)
A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting
ECCV 2024
152
citations
OMG-Seg: Is One Model Good Enough For All Segmentation?
CVPR 2024
106
citations
RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation
CVPR 2024
53
citations
MagicDrive-V2: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control
ICCV 2025
44
citations
Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis
ICLR 2024
44
citations
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
CVPR 2025
44
citations
DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception
CVPR 2024
39
citations
Towards Language-Driven Video Inpainting via Multimodal Large Language Models
CVPR 2024
30
citations
UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement
CVPR 2024
21
citations
Implicit Concept Removal of Diffusion Models
ECCV 2024, arXiv
18
citations
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLMs
ICCV 2025, arXiv
12
citations
DuMo: Dual Encoder Modulation Network for Precise Concept Erasure
AAAI 2025
7
citations
Rethinking Verification for LLM Code Generation: From Generation to Testing
NeurIPS 2025
7
citations
RepeatLeakage: Leak Prompts from Repeating as Large Language Model Is a Good Repeater
AAAI 2025
2
citations
Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models
NeurIPS 2025, arXiv
2
citations
MotionShot: Adaptive Motion Transfer across Arbitrary Objects for Text-to-Video Generation
ICCV 2025, arXiv
2
citations
PatchScaler: An Efficient Patch-Independent Diffusion Model for Image Super-Resolution
ICCV 2025
1
citations
Contact Map Transfer with Conditional Diffusion Model for Generalizable Dexterous Grasp Generation
NeurIPS 2025
1
citations
SocialMOIF: Multi-Order Intention Fusion for Pedestrian Trajectory Prediction
CVPR 2025
1
citations
Parallel Beam Search Algorithms for Domain-Independent Dynamic Programming
AAAI 2024
0
citations
Everything2Motion: Synchronizing Diverse Inputs via a Unified Framework for Human Motion Synthesis
AAAI 2024
0
citations
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
CVPR 2024
0
citations
PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models
CVPR 2024
0
citations
Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text
CVPR 2024
0
citations
Information Density Principle for MLLM Benchmarks
ICCV 2025
0
citations
From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models
CVPR 2024
0
citations
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language
CVPR 2025
0
citations
TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models
CVPR 2025
0
citations
Hybrid Reciprocal Transformer with Triplet Feature Alignment for Scene Graph Generation
CVPR 2025
0
citations
Differentiable Model Scaling using Differentiable Topk
ICML 2024
0
citations
Mixing Expert Knowledge: Bring Human Thoughts Back To the Game of Go
NeurIPS 2025
0
citations
Can AI Assistants Know What They Don't Know?
ICML 2024
0
citations
DocVision: a Seamless, Cross-Device Immersive Active Reading Framework for Digital Academic Literature
ISMAR 2025
0
citations
Social Recommendation via Graph-Level Counterfactual Augmentation
AAAI 2025
0
citations
Semantic-guided Masked Mutual Learning for Multi-modal Brain Tumor Segmentation with Arbitrary Missing Modalities
AAAI 2025
0
citations
LLM-DR: A Novel LLM-Aided Diffusion Model for Rule Generation on Temporal Knowledge Graphs
AAAI 2025
0
citations
Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning
AAAI 2025
0
citations