Xiangyang Ji

37 Papers

141 Total Citations

Papers (37)

ParCo: Part-Coordinating Text-to-Motion Synthesis

Open-Vocabulary Functional 3D Scene Graphs for Real-World Indoor Spaces

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning

Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation

EventGPT: Event Stream Understanding with Multimodal Large Language Models

Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments

GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation

PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution

KP-RED: Exploiting Semantic Keypoints for Joint 3D Shape Retrieval and Deformation

PlugMark: A Plug-in Zero-Watermarking Framework for Diffusion Models

FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation

Joint Asymmetric Loss for Learning with Noisy Labels

Towards Understanding How Knowledge Evolves in Large Vision-Language Models

Active Event-based Stereo Vision

ConformalSAM: Unlocking the Potential of Foundational Segmentation Models in Semi-Supervised Semantic Segmentation with Conformal Prediction

Know2Vec: A Black-Box Proxy for Neural Network Retrieval

Learning Scale-Aware Spatio-temporal Implicit Representation for Event-based Motion Deblurring

The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks

LLM-Empowered State Representation for Reinforcement Learning

Data-free Neural Representation Compression with Riemannian Neural Dynamics

DPFlow: Adaptive Optical Flow Estimation with a Dual-Pyramid Framework

Kepler Codebook

UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image

Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition

Enhanced Event-based Dense Stereo via Cross-Sensor Knowledge Distillation

DyGS-SLAM: Real-Time Accurate Localization and Gaussian Reconstruction for Dynamic Scenes

Street Gaussians without 3D Object Tracker

SHIFT: Smoothing Hallucinations by Information Flow Tuning for Multimodal Large Language Models

Can We Achieve Efficient Diffusion Without Self-Attention? Distilling Self-Attention into Convolutions

Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios

Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning

Delving into Cascaded Instability: A Lipschitz Continuity View on Image Restoration and Object Detection Synergy

NeurIPS 2025

Parallel Vertex Diffusion for Unified Visual Grounding

ShapeMatcher: Self-Supervised Joint Shape Canonicalization, Segmentation, Retrieval and Deformation

MOHO: Learning Single-view Hand-held Object Reconstruction with Multi-view Occlusion-Aware Supervision

FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation

SynFog: A Photo-realistic Synthetic Fog Dataset based on End-to-end Imaging Simulation for Advancing Real-World Defogging in Autonomous Driving