Xi Chen

39
Papers
598
Total Citations
1
Affiliations

Affiliations

Google Research

Papers (39)

On Scaling Up a Multilingual Vision and Language Model

CVPR 2024
254
citations

OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation

ECCV 2024, arXiv
82
citations

UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics

CVPR 2025
70
citations

PolyVoice: Language Models for Speech to Speech Translation

ICLR 2024
29
citations

Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation

CVPR 2024
23
citations

EnvGS: Modeling View-Dependent Appearance with Environment Gaussian

CVPR 2025
16
citations

ViLLa: Video Reasoning Segmentation with Large Language Model

ICCV 2025
16
citations

GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models

ICLR 2025
15
citations

MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation

CVPR 2025
13
citations

Dual-Window Multiscale Transformer for Hyperspectral Snapshot Compressive Imaging

AAAI 2024
13
citations

NoT: Federated Unlearning via Weight Negation

CVPR 2025, arXiv
11
citations

Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding

ICCV 2025
11
citations

ObjectMover: Generative Object Movement with Video Prior

CVPR 2025
10
citations

Online Video Understanding: OVBench and VideoChat-Online

CVPR 2025, arXiv
9
citations

Asynchronous Federated Clustering with Unknown Number of Clusters

AAAI 2025
8
citations

Exploiting Symmetric Temporally Sparse BPTT for Efficient RNN Training

AAAI 2024, arXiv
4
citations

Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations

NeurIPS 2025
4
citations

ROSE: Remove Objects with Side Effects in Videos

NeurIPS 2025
4
citations

Less or More From Teacher: Exploiting Trilateral Geometry For Knowledge Distillation

ICLR 2024
3
citations

PlayerOne: Egocentric World Simulator

NeurIPS 2025
3
citations

Understanding the Training Speedup from Sampling with Approximate Losses

ICML 2024
0
citations

MangaNinja: Line Art Colorization with Precise Reference Following

CVPR 2025
0
citations

EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation

CVPR 2025
0
citations

UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping

CVPR 2025
0
citations

DiffDoctor: Diagnosing Image Diffusion Models Before Treating

ICCV 2025
0
citations

GenieBlue: Integrating both Linguistic and Multimodal Capabilities for Large Language Models on Mobile Devices

ICCV 2025
0
citations

Zero-shot Denoising via Neural Compression: Theoretical and algorithmic framework

NeurIPS 2025
0
citations

TC-LLaVA: Rethinking the Transfer of LLaVA from Image to Video Understanding with Temporal Considerations

AAAI 2025
0
citations

VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

AAAI 2025
0
citations

HFF-Tracker: A Hierarchical Fine-grained Fusion Tracker for Referring Multi-Object Tracking

AAAI 2025
0
citations

Reverse Region-to-Entity Annotation for Pixel-Level Visual Entity Linking

AAAI 2025
0
citations

Disentangled Modeling of Preferences and Social Influence for Group Recommendation

AAAI 2025
0
citations

The Distributional Reward Critic Framework for Reinforcement Learning Under Perturbed Rewards

AAAI 2025
0
citations

Decoupling Metacognition from Cognition: A Framework for Quantifying Metacognitive Ability in LLMs

AAAI 2025
0
citations

Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space

AAAI 2024
0
citations

AnyDoor: Zero-shot Object-level Image Customization

CVPR 2024
0
citations

PredToken: Predicting Unknown Tokens and Beyond with Coarse-to-Fine Iterative Decoding

CVPR 2024
0
citations

Bagged Deep Image Prior for Recovering Images in the Presence of Speckle Noise

ICML 2024
0
citations

Rethinking Generative Large Language Model Evaluation for Semantic Comprehension

ICML 2024
0
citations