Sun

48 Papers
924 Total Citations

Papers (48)

Advancing LLM Reasoning Generalists with Preference Trees

ICLR 2025 · arXiv
179 citations

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

NeurIPS 2025 · arXiv
130 citations

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

ICLR 2025 · arXiv
121 citations

Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens

ICLR 2025 · arXiv
112 citations

Physics-Informed Diffusion Models

ICLR 2025 · arXiv
52 citations

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

ECCV 2024 · arXiv
40 citations

Vamos: Versatile Action Models for Video Understanding

ECCV 2024 · arXiv
36 citations

Preserving Diversity in Supervised Fine-Tuning of Large Language Models

ICLR 2025 · arXiv
33 citations

Multi-Agent Collaboration via Evolving Orchestration

NeurIPS 2025 · arXiv
25 citations

Prioritized Semantic Learning for Zero-shot Instance Navigation

ECCV 2024 · arXiv
22 citations

EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models

ECCV 2024 · arXiv
20 citations

Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs

ICLR 2025
20 citations

VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model

NeurIPS 2025
17 citations

NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation

ECCV 2024 · arXiv
14 citations

Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning

NeurIPS 2025 · arXiv
13 citations

How new data permeates LLM knowledge and how to dilute it

ICLR 2025 · arXiv
8 citations

Genesis: Multimodal Driving Scene Generation with Spatio-Temporal and Cross-Modal Consistency

NeurIPS 2025 · arXiv
8 citations

HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location

NeurIPS 2025 · arXiv
7 citations

Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts

ECCV 2024 · arXiv
6 citations

Long-range Turbulence Mitigation: A Large-scale Dataset and A Coarse-to-fine Framework

ECCV 2024 · arXiv
6 citations

COME: Adding Scene-Centric Forecasting Control to Occupancy World Model

NeurIPS 2025 · arXiv
5 citations

Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation

ECCV 2024 · arXiv
5 citations

IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning

ICLR 2025 · arXiv
5 citations

Transformer brain encoders explain human high-level visual responses

NeurIPS 2025 · arXiv
4 citations

Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs

NeurIPS 2025 · arXiv
4 citations

ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation

NeurIPS 2025 · arXiv
4 citations

The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training

NeurIPS 2025 · arXiv
4 citations

Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models

NeurIPS 2025 · arXiv
4 citations

Lagrangian Hashing for Compressed Neural Field Representations

ECCV 2024 · arXiv
3 citations

Avoiding exp(R) scaling in RLHF through Preference-based Exploration

NeurIPS 2025
3 citations

GRAVER: Generative Graph Vocabularies for Robust Graph Foundation Models Fine-tuning

NeurIPS 2025 · arXiv
3 citations

Learning Where to Look: Self-supervised Viewpoint Selection for Active Localization using Geometrical Information

ECCV 2024 · arXiv
2 citations

CPSample: Classifier Protected Sampling for Guarding Training Data During Diffusion

ICLR 2025 · arXiv
2 citations

Teaching Language Models to Reason with Tools

NeurIPS 2025 · arXiv
2 citations

PINP: Physics-Informed Neural Predictor with latent estimation of fluid flows

ICLR 2025 · arXiv
2 citations

Multimodal Label Relevance Ranking via Reinforcement Learning

ECCV 2024 · arXiv
1 citation

EA3D: Online Open-World 3D Object Extraction from Streaming Videos

NeurIPS 2025 · arXiv
1 citation

CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models

ICLR 2025 · arXiv
1 citation

PCA++: How Uniformity Induces Robustness to Background Noise in Contrastive Learning

NeurIPS 2025 · arXiv
0 citations

Toward a Unified Geometry Understanding: Riemannian Diffusion Framework for Graph Generation and Prediction

NeurIPS 2025 · arXiv
0 citations

FRBNet: Revisiting Low-Light Vision through Frequency-Domain Radial Basis Network

NeurIPS 2025 · arXiv
0 citations

UnCLe: Towards Scalable Dynamic Causal Discovery in Non-linear Temporal Systems

NeurIPS 2025 · arXiv
0 citations

TRiCo: Triadic Game-Theoretic Co-Training for Robust Semi-Supervised Learning

NeurIPS 2025 · arXiv
0 citations

Enhancing Training Data Attribution with Representational Optimization

NeurIPS 2025 · arXiv
0 citations

Conservative classifiers do consistently well with improving agents: characterizing statistical and online learning

NeurIPS 2025 · arXiv
0 citations

Conditional Representation Learning for Customized Tasks

NeurIPS 2025 · arXiv
0 citations

MeCeFO: Enhancing LLM Training Robustness via Fault-Tolerant Optimization

NeurIPS 2025 · arXiv
0 citations

ProteinConformers: Benchmark Dataset for Simulating Protein Conformational Landscape Diversity and Plausibility

NeurIPS 2025
0 citations