Xi Chen

Papers: 71

Total Citations: 16,944

Affiliations: 1

Affiliations

Google Research

Papers (71)

Improved Techniques for Training GANs

NeurIPS 2016, arXiv

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets

NeurIPS 2016, arXiv

Improved Variational Inference with Inverse Autoregressive Flow

NeurIPS 2016, arXiv

On Scaling Up a Multilingual Vision and Language Model

OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation

VIME: Variational Information Maximizing Exploration

NeurIPS 2016, arXiv

UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics

PolyVoice: Language Models for Speech to Speech Translation

Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation

EnvGS: Modeling View-Dependent Appearance with Environment Gaussian

ViLLa: Video Reasoning Segmentation with Large Language Model

GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models

On the Recursive Teaching Dimension of VC Classes

Dual-Window Multiscale Transformer for Hyperspectral Snapshot Compressive Imaging

MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation

Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding

NoT: Federated Unlearning via Weight Negation

ObjectMover: Generative Object Movement with Video Prior

Online Video Understanding: OVBench and VideoChat-Online

Asynchronous Federated Clustering with Unknown Number of Clusters

ROSE: Remove Objects with Side Effects in Videos

Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations

Exploiting Symmetric Temporally Sparse BPTT for Efficient RNN Training

Less or More From Teacher: Exploiting Trilateral Geometry For Knowledge Distillation

PlayerOne: Egocentric World Simulator

The Distributional Reward Critic Framework for Reinforcement Learning Under Perturbed Rewards

Decoupling Metacognition from Cognition: A Framework for Quantifying Metacognitive Ability in LLMs

Disentangled Modeling of Preferences and Social Influence for Group Recommendation

Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space

Reverse Region-to-Entity Annotation for Pixel-Level Visual Entity Linking

AnyDoor: Zero-shot Object-level Image Customization

HFF-Tracker: A Hierarchical Fine-grained Fusion Tracker for Referring Multi-Object Tracking

VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

PredToken: Predicting Unknown Tokens and Beyond with Coarse-to-Fine Iterative Decoding

TC-LLaVA: Rethinking the Transfer of LLava from Image to Video Understanding with Temporal Considerations

Zero-shot Denoising via Neural Compression: Theoretical and algorithmic framework

Bagged Deep Image Prior for Recovering Images in the Presence of Speckle Noise

Rethinking Generative Large Language Model Evaluation for Semantic Comprehension

Understanding the Training Speedup from Sampling with Approximate Losses

Resolution Adaptive Networks for Efficient Inference

State-Aware Tracker for Real-Time Video Object Segmentation

FocalClick: Towards Practical Interactive Image Segmentation

Dynamically Instance-Guided Adaptation: A Backward-Free Approach for Test-Time Domain Adaptive Semantic Segmentation

Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization

Detecting Everything in the Open World: Towards Universal Object Detection

Conditional Diffusion for Interactive Segmentation

Open-vocabulary Panoptic Segmentation with Embedding Modulation

Understanding Hessian Alignment for Domain Generalization

PreSTU: Pre-Training for Scene-Text Understanding

Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction

PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks

GenieBlue: Integrating both Linguistic and Multimodal Capabilities for Large Language Models on Mobile Devices

DiffDoctor: Diagnosing Image Diffusion Models Before Treating

UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping

EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation

MangaNinja: Line Art Colorization with Precise Reference Following

Near-Optimal Policies for Dynamic Multinomial Logit Assortment Selection Models

Online EXP3 Learning in Adversarial Bandits with Delayed Feedback

Information Theoretic Counterfactual Learning from Missing-Not-At-Random Feedback

Fixed-Support Wasserstein Barycenters: Computational Hardness and Fast Algorithm

Hedging in games: Faster convergence of external and swap regrets

Generalized DataWeighting via Class-Level Gradient Manipulation

Bridging the Gap Between Practice and PAC-Bayes Theory in Few-Shot Meta-Learning

LAPO: Latent-Variable Advantage-Weighted Policy Optimization for Offline Reinforcement Learning

TMT-VIS: Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation

Uni3DETR: Unified 3D Detection Transformer

Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing

Benchmarking Deep Reinforcement Learning for Continuous Control

Adaptive Multiple-Arm Identification

Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design

Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules