Kai Chen

37
Papers
586
Total Citations

Papers (37)

A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting

ECCV 2024
152
citations

OMG-Seg: Is One Model Good Enough For All Segmentation?

CVPR 2024
106
citations

RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation

CVPR 2024
53
citations

MagicDrive-V2: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control

ICCV 2025
44
citations

Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis

ICLR 2024
44
citations

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

CVPR 2025
44
citations

DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception

CVPR 2024
39
citations

Towards Language-Driven Video Inpainting via Multimodal Large Language Models

CVPR 2024
30
citations

UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement

CVPR 2024
21
citations

Implicit Concept Removal of Diffusion Models

ECCV 2024, arXiv
18
citations

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLMs

ICCV 2025, arXiv
12
citations

DuMo: Dual Encoder Modulation Network for Precise Concept Erasure

AAAI 2025
7
citations

Rethinking Verification for LLM Code Generation: From Generation to Testing

NeurIPS 2025
7
citations

RepeatLeakage: Leak Prompts from Repeating as Large Language Model Is a Good Repeater

AAAI 2025
2
citations

Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models

NeurIPS 2025, arXiv
2
citations

MotionShot: Adaptive Motion Transfer across Arbitrary Objects for Text-to-Video Generation

ICCV 2025, arXiv
2
citations

PatchScaler: An Efficient Patch-Independent Diffusion Model for Image Super-Resolution

ICCV 2025
1
citation

Contact Map Transfer with Conditional Diffusion Model for Generalizable Dexterous Grasp Generation

NeurIPS 2025
1
citation

SocialMOIF: Multi-Order Intention Fusion for Pedestrian Trajectory Prediction

CVPR 2025
1
citation

Parallel Beam Search Algorithms for Domain-Independent Dynamic Programming

AAAI 2024
0
citations

Everything2Motion: Synchronizing Diverse Inputs via a Unified Framework for Human Motion Synthesis

AAAI 2024
0
citations

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

CVPR 2024
0
citations

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models

CVPR 2024
0
citations

Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text

CVPR 2024
0
citations

Information Density Principle for MLLM Benchmarks

ICCV 2025
0
citations

From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models

CVPR 2024
0
citations

Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language

CVPR 2025
0
citations

TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models

CVPR 2025
0
citations

Hybrid Reciprocal Transformer with Triplet Feature Alignment for Scene Graph Generation

CVPR 2025
0
citations

Differentiable Model Scaling using Differentiable Topk

ICML 2024
0
citations

Mixing Expert Knowledge: Bring Human Thoughts Back To the Game of Go

NeurIPS 2025
0
citations

Can AI Assistants Know What They Don't Know?

ICML 2024
0
citations

DocVision: a Seamless, Cross-Device Immersive Active Reading Framework for Digital Academic Literature

ISMAR 2025
0
citations

Social Recommendation via Graph-Level Counterfactual Augmentation

AAAI 2025
0
citations

Semantic-guided Masked Mutual Learning for Multi-modal Brain Tumor Segmentation with Arbitrary Missing Modalities

AAAI 2025
0
citations

LLM-DR: A Novel LLM-Aided Diffusion Model for Rule Generation on Temporal Knowledge Graphs

AAAI 2025
0
citations

Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning

AAAI 2025
0
citations