Kai Chen
75 Papers
584 Total Citations
Papers (75)
A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting
ECCV 2024
152
citations
OMG-Seg: Is One Model Good Enough For All Segmentation?
CVPR 2024
106
citations
RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation
CVPR 2024
53
citations
MagicDrive-V2: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control
ICCV 2025
44
citations
Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis
ICLR 2024
44
citations
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
CVPR 2025
44
citations
DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception
CVPR 2024
39
citations
Towards Language-Driven Video Inpainting via Multimodal Large Language Models
CVPR 2024
30
citations
UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement
CVPR 2024
21
citations
Implicit Concept Removal of Diffusion Models
ECCV 2024, arXiv
18
citations
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLMs
ICCV 2025, arXiv
12
citations
Rethinking Verification for LLM Code Generation: From Generation to Testing
NeurIPS 2025
7
citations
DuMo: Dual Encoder Modulation Network for Precise Concept Erasure
AAAI 2025
7
citations
RepeatLeakage: Leak Prompts from Repeating as Large Language Model Is a Good Repeater
AAAI 2025
2
citations
Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models
NeurIPS 2025
2
citations
PatchScaler: An Efficient Patch-Independent Diffusion Model for Image Super-Resolution
ICCV 2025
1
citations
Contact Map Transfer with Conditional Diffusion Model for Generalizable Dexterous Grasp Generation
NeurIPS 2025
1
citations
SocialMOIF: Multi-Order Intention Fusion for Pedestrian Trajectory Prediction
CVPR 2025
1
citations
Differentiable Model Scaling using Differentiable Topk
ICML 2024
0
citations
Can AI Assistants Know What They Don't Know?
ICML 2024
0
citations
Discover and Learn New Objects From Documentaries
CVPR 2017, arXiv
0
citations
Optimizing Video Object Detection via a Scale-Time Lattice
CVPR 2018, arXiv
0
citations
Libra R-CNN: Towards Balanced Learning for Object Detection
CVPR 2019
0
citations
Region Proposal by Guided Anchoring
CVPR 2019
0
citations
Hybrid Task Cascade for Instance Segmentation
CVPR 2019
0
citations
Prime Sample Attention in Object Detection
CVPR 2020, arXiv
0
citations
Positional Encoding As Spatial Inductive Bias in GANs
CVPR 2021, arXiv
0
citations
Seesaw Loss for Long-Tailed Instance Segmentation
CVPR 2021, arXiv
0
citations
Learning To Identify Correct 2D-2D Line Correspondences on Sphere
CVPR 2021
0
citations
TransRank: Self-Supervised Video Representation Learning via Ranking-Based Transformation Recognition
CVPR 2022, arXiv
0
citations
OCSampler: Compressing Videos to One Clip With Single-Step Sampling
CVPR 2022, arXiv
0
citations
Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation
CVPR 2022
0
citations
Revisiting Skeleton-Based Action Recognition
CVPR 2022, arXiv
0
citations
GCFSR: A Generative and Controllable Face Super Resolution Method Without Facial and GAN Priors
CVPR 2022, arXiv
0
citations
Group R-CNN for Weakly Semi-Supervised Object Detection With Points
CVPR 2022
0
citations
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation
CVPR 2022, arXiv
0
citations
Mixed Autoencoder for Self-Supervised Visual Representation Learning
CVPR 2023, arXiv
0
citations
RIFormer: Keep Your Vision Backbone Effective but Removing Token Mixer
CVPR 2023
0
citations
Dense Distinct Query for End-to-End Object Detection
CVPR 2023, arXiv
0
citations
CARAFE: Content-Aware ReAssembly of FEatures
ICCV 2019
0
citations
SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation
ICCV 2021
0
citations
MultiSiam: Self-Supervised Multi-Instance Siamese Representation Learning for Autonomous Driving
ICCV 2021, arXiv
0
citations
Learning Icosahedral Spherical Probability Map Based on Bingham Mixture Model for Vanishing Point Estimation
ICCV 2021
0
citations
Learning Shape Primitives via Implicit Convexity Regularization
ICCV 2023
0
citations
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
ICCV 2023, arXiv
0
citations
Improving Pixel-based MIM by Reducing Wasted Modeling Capability
ICCV 2023, arXiv
0
citations
UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework
ICCV 2023, arXiv
0
citations
Deep Fusion Transformer Network with Weighted Vector-Wise Keypoints Voting for Robust 6D Object Pose Estimation
ICCV 2023, arXiv
0
citations
Side-Aware Boundary Localization for More Precise Object Detection
ECCV 2020
0
citations
Dense Siamese Network for Dense Unsupervised Learning
ECCV 2022
0
citations
CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving
ECCV 2022
0
citations
Sim-to-Real 6D Object Pose Estimation via Iterative Self-Training for Robotic Bin Picking
ECCV 2022
0
citations
Consistent-Teacher: Towards Reducing Inconsistent Pseudo-Targets in Semi-Supervised Object Detection
CVPR 2023
0
citations
Hybrid Reciprocal Transformer with Triplet Feature Alignment for Scene Graph Generation
CVPR 2025
0
citations
TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models
CVPR 2025
0
citations
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language
CVPR 2025
0
citations
Information Density Principle for MLLM Benchmarks
ICCV 2025
0
citations
MotionShot: Adaptive Motion Transfer across Arbitrary Objects for Text-to-Video Generation
ICCV 2025
0
citations
Mixing Expert Knowledge: Bring Human Thoughts Back To the Game of Go
NeurIPS 2025
0
citations
DocVision: a Seamless, Cross-Device Immersive Active Reading Framework for Digital Academic Literature
ISMAR 2025
0
citations
Social Recommendation via Graph-Level Counterfactual Augmentation
AAAI 2025
0
citations
Semantic-guided Masked Mutual Learning for Multi-modal Brain Tumor Segmentation with Arbitrary Missing Modalities
AAAI 2025
0
citations
LLM-DR: A Novel LLM-Aided Diffusion Model for Rule Generation on Temporal Knowledge Graphs
AAAI 2025
0
citations
Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning
AAAI 2025
0
citations
Parallel Beam Search Algorithms for Domain-Independent Dynamic Programming
AAAI 2024
0
citations
Everything2Motion: Synchronizing Diverse Inputs via a Unified Framework for Human Motion Synthesis
AAAI 2024
0
citations
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
CVPR 2024
0
citations
PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models
CVPR 2024
0
citations
Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text
CVPR 2024
0
citations
From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models
CVPR 2024
0
citations
K-Net: Towards Unified Image Segmentation
NeurIPS 2021
0
citations
Few-Shot Object Detection via Association and DIscrimination
NeurIPS 2021
0
citations
Deliberated Domain Bridging for Domain Adaptive Semantic Segmentation
NeurIPS 2022
0
citations
Segment Any Point Cloud Sequences by Distilling Vision Foundation Models
NeurIPS 2023
0
citations
GlyphControl: Glyph Conditional Control for Visual Text Generation
NeurIPS 2023
0
citations