Xi Chen
39
Papers
598
Total Citations
1
Affiliations
Google Research
Papers (39)
On Scaling Up a Multilingual Vision and Language Model
CVPR 2024
254
citations
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
ECCV 2024, arXiv
82
citations
UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics
CVPR 2025
70
citations
PolyVoice: Language Models for Speech to Speech Translation
ICLR 2024
29
citations
Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation
CVPR 2024
23
citations
EnvGS: Modeling View-Dependent Appearance with Environment Gaussian
CVPR 2025
16
citations
ViLLa: Video Reasoning Segmentation with Large Language Model
ICCV 2025
16
citations
GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models
ICLR 2025
15
citations
MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation
CVPR 2025
13
citations
Dual-Window Multiscale Transformer for Hyperspectral Snapshot Compressive Imaging
AAAI 2024
13
citations
NoT: Federated Unlearning via Weight Negation
CVPR 2025, arXiv
11
citations
Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding
ICCV 2025
11
citations
ObjectMover: Generative Object Movement with Video Prior
CVPR 2025
10
citations
Online Video Understanding: OVBench and VideoChat-Online
CVPR 2025, arXiv
9
citations
Asynchronous Federated Clustering with Unknown Number of Clusters
AAAI 2025
8
citations
Exploiting Symmetric Temporally Sparse BPTT for Efficient RNN Training
AAAI 2024, arXiv
4
citations
Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations
NeurIPS 2025
4
citations
ROSE: Remove Objects with Side Effects in Videos
NeurIPS 2025
4
citations
Less or More From Teacher: Exploiting Trilateral Geometry For Knowledge Distillation
ICLR 2024
3
citations
PlayerOne: Egocentric World Simulator
NeurIPS 2025
3
citations
Understanding the Training Speedup from Sampling with Approximate Losses
ICML 2024
0
citations
MangaNinja: Line Art Colorization with Precise Reference Following
CVPR 2025
0
citations
EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation
CVPR 2025
0
citations
UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping
CVPR 2025
0
citations
DiffDoctor: Diagnosing Image Diffusion Models Before Treating
ICCV 2025
0
citations
GenieBlue: Integrating both Linguistic and Multimodal Capabilities for Large Language Models on Mobile Devices
ICCV 2025
0
citations
Zero-shot Denoising via Neural Compression: Theoretical and algorithmic framework
NeurIPS 2025
0
citations
TC-LLaVA: Rethinking the Transfer of LLava from Image to Video Understanding with Temporal Considerations
AAAI 2025
0
citations
VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
AAAI 2025
0
citations
HFF-Tracker: A Hierarchical Fine-grained Fusion Tracker for Referring Multi-Object Tracking
AAAI 2025
0
citations
Reverse Region-to-Entity Annotation for Pixel-Level Visual Entity Linking
AAAI 2025
0
citations
Disentangled Modeling of Preferences and Social Influence for Group Recommendation
AAAI 2025
0
citations
The Distributional Reward Critic Framework for Reinforcement Learning Under Perturbed Rewards
AAAI 2025
0
citations
Decoupling Metacognition from Cognition: A Framework for Quantifying Metacognitive Ability in LLMs
AAAI 2025
0
citations
Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space
AAAI 2024
0
citations
AnyDoor: Zero-shot Object-level Image Customization
CVPR 2024
0
citations
PredToken: Predicting Unknown Tokens and Beyond with Coarse-to-Fine Iterative Decoding
CVPR 2024
0
citations
Bagged Deep Image Prior for Recovering Images in the Presence of Speckle Noise
ICML 2024
0
citations
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
ICML 2024
0
citations