Sun
48 Papers
924 Total Citations
Papers (48)
Advancing LLM Reasoning Generalists with Preference Trees
ICLR 2025 · arXiv
179 citations
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
NeurIPS 2025 · arXiv
130 citations
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
ICLR 2025 · arXiv
121 citations
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
ICLR 2025 · arXiv
112 citations
Physics-Informed Diffusion Models
ICLR 2025 · arXiv
52 citations
TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
ECCV 2024 · arXiv
40 citations
Vamos: Versatile Action Models for Video Understanding
ECCV 2024 · arXiv
36 citations
Preserving Diversity in Supervised Fine-Tuning of Large Language Models
ICLR 2025 · arXiv
33 citations
Multi-Agent Collaboration via Evolving Orchestration
NeurIPS 2025 · arXiv
25 citations
Prioritized Semantic Learning for Zero-shot Instance Navigation
ECCV 2024 · arXiv
22 citations
EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models
ECCV 2024 · arXiv
20 citations
Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs
ICLR 2025
20 citations
VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model
NeurIPS 2025
17 citations
NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation
ECCV 2024 · arXiv
14 citations
Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning
NeurIPS 2025 · arXiv
13 citations
How new data permeates LLM knowledge and how to dilute it
ICLR 2025 · arXiv
8 citations
Genesis: Multimodal Driving Scene Generation with Spatio-Temporal and Cross-Modal Consistency
NeurIPS 2025 · arXiv
8 citations
HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location
NeurIPS 2025 · arXiv
7 citations
Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts
ECCV 2024 · arXiv
6 citations
Long-range Turbulence Mitigation: A Large-scale Dataset and A Coarse-to-fine Framework
ECCV 2024 · arXiv
6 citations
COME: Adding Scene-Centric Forecasting Control to Occupancy World Model
NeurIPS 2025 · arXiv
5 citations
Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation
ECCV 2024 · arXiv
5 citations
IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning
ICLR 2025 · arXiv
5 citations
Transformer brain encoders explain human high-level visual responses
NeurIPS 2025 · arXiv
4 citations
Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs
NeurIPS 2025 · arXiv
4 citations
ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation
NeurIPS 2025 · arXiv
4 citations
The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training
NeurIPS 2025 · arXiv
4 citations
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models
NeurIPS 2025 · arXiv
4 citations
Lagrangian Hashing for Compressed Neural Field Representations
ECCV 2024 · arXiv
3 citations
Avoiding exp(R) scaling in RLHF through Preference-based Exploration
NeurIPS 2025
3 citations
GRAVER: Generative Graph Vocabularies for Robust Graph Foundation Models Fine-tuning
NeurIPS 2025 · arXiv
3 citations
Learning Where to Look: Self-supervised Viewpoint Selection for Active Localization using Geometrical Information
ECCV 2024 · arXiv
2 citations
CPSample: Classifier Protected Sampling for Guarding Training Data During Diffusion
ICLR 2025 · arXiv
2 citations
Teaching Language Models to Reason with Tools
NeurIPS 2025 · arXiv
2 citations
PINP: Physics-Informed Neural Predictor with latent estimation of fluid flows
ICLR 2025 · arXiv
2 citations
Multimodal Label Relevance Ranking via Reinforcement Learning
ECCV 2024 · arXiv
1 citation
EA3D: Online Open-World 3D Object Extraction from Streaming Videos
NeurIPS 2025 · arXiv
1 citation
CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models
ICLR 2025 · arXiv
1 citation
PCA++: How Uniformity Induces Robustness to Background Noise in Contrastive Learning
NeurIPS 2025 · arXiv
0 citations
Toward a Unified Geometry Understanding: Riemannian Diffusion Framework for Graph Generation and Prediction
NeurIPS 2025 · arXiv
0 citations
FRBNet: Revisiting Low-Light Vision through Frequency-Domain Radial Basis Network
NeurIPS 2025 · arXiv
0 citations
UnCLe: Towards Scalable Dynamic Causal Discovery in Non-linear Temporal Systems
NeurIPS 2025 · arXiv
0 citations
TRiCo: Triadic Game-Theoretic Co-Training for Robust Semi-Supervised Learning
NeurIPS 2025 · arXiv
0 citations
Enhancing Training Data Attribution with Representational Optimization
NeurIPS 2025 · arXiv
0 citations
Conservative classifiers do consistently well with improving agents: characterizing statistical and online learning
NeurIPS 2025 · arXiv
0 citations
Conditional Representation Learning for Customized Tasks
NeurIPS 2025 · arXiv
0 citations
MeCeFO: Enhancing LLM Training Robustness via Fault-Tolerant Optimization
NeurIPS 2025 · arXiv
0 citations
ProteinConformers: Benchmark Dataset for Simulating Protein Conformational Landscape Diversity and Plausibility
NeurIPS 2025
0 citations