Sun

41 Papers
802 Total Citations

Papers (41)

Advancing LLM Reasoning Generalists with Preference Trees

ICLR 2025 | arXiv
179
citations

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

NeurIPS 2025 | arXiv
130
citations

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

ICLR 2025 | arXiv
121
citations

Physics-Informed Diffusion Models

ICLR 2025 | arXiv
52
citations

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

ECCV 2024 | arXiv
40
citations

Vamos: Versatile Action Models for Video Understanding

ECCV 2024 | arXiv
36
citations

Preserving Diversity in Supervised Fine-Tuning of Large Language Models

ICLR 2025 | arXiv
33
citations

Multi-Agent Collaboration via Evolving Orchestration

NeurIPS 2025 | arXiv
25
citations

Prioritized Semantic Learning for Zero-shot Instance Navigation

ECCV 2024 | arXiv
22
citations

Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs

ICLR 2025
20
citations

EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models

ECCV 2024 | arXiv
20
citations

VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model

NeurIPS 2025
17
citations

NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation

ECCV 2024 | arXiv
14
citations

Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning

NeurIPS 2025 | arXiv
13
citations

Genesis: Multimodal Driving Scene Generation with Spatio-Temporal and Cross-Modal Consistency

NeurIPS 2025 | arXiv
8
citations

How new data permeates LLM knowledge and how to dilute it

ICLR 2025 | arXiv
8
citations

HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location

NeurIPS 2025 | arXiv
7
citations

Long-range Turbulence Mitigation: A Large-scale Dataset and A Coarse-to-fine Framework

ECCV 2024 | arXiv
6
citations

Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts

ECCV 2024 | arXiv
6
citations

IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning

ICLR 2025 | arXiv
5
citations

Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation

ECCV 2024 | arXiv
5
citations

The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training

NeurIPS 2025 | arXiv
4
citations

Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models

NeurIPS 2025 | arXiv
4
citations

Transformer brain encoders explain human high-level visual responses

NeurIPS 2025 | arXiv
4
citations

ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation

NeurIPS 2025 | arXiv
4
citations

Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs

NeurIPS 2025 | arXiv
4
citations

Avoiding exp(R) scaling in RLHF through Preference-based Exploration

NeurIPS 2025
3
citations

Lagrangian Hashing for Compressed Neural Field Representations

ECCV 2024 | arXiv
3
citations

Teaching Language Models to Reason with Tools

NeurIPS 2025 | arXiv
2
citations

Learning Where to Look: Self-supervised Viewpoint Selection for Active Localization using Geometrical Information

ECCV 2024 | arXiv
2
citations

PINP: Physics-Informed Neural Predictor with latent estimation of fluid flows

ICLR 2025 | arXiv
2
citations

CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models

ICLR 2025 | arXiv
1
citation

EA3D: Online Open-World 3D Object Extraction from Streaming Videos

NeurIPS 2025 | arXiv
1
citation

Multimodal Label Relevance Ranking via Reinforcement Learning

ECCV 2024 | arXiv
1
citation

FRBNet: Revisiting Low-Light Vision through Frequency-Domain Radial Basis Network

NeurIPS 2025 | arXiv
0
citations

UnCLe: Towards Scalable Dynamic Causal Discovery in Non-linear Temporal Systems

NeurIPS 2025 | arXiv
0
citations

Conservative classifiers do consistently well with improving agents: characterizing statistical and online learning

NeurIPS 2025 | arXiv
0
citations

Conditional Representation Learning for Customized Tasks

NeurIPS 2025 | arXiv
0
citations

Enhancing Training Data Attribution with Representational Optimization

NeurIPS 2025 | arXiv
0
citations

PCA++: How Uniformity Induces Robustness to Background Noise in Contrastive Learning

NeurIPS 2025 | arXiv
0
citations

Toward a Unified Geometry Understanding: Riemannian Diffusion Framework for Graph Generation and Prediction

NeurIPS 2025 | arXiv
0
citations