wang
99 Papers
1,998 Total Citations
Papers (99)
Video-R1: Reinforcing Video Reasoning in MLLMs
NeurIPS 2025 | arXiv
236
citations
Advancing LLM Reasoning Generalists with Preference Trees
ICLR 2025 | arXiv
179
citations
SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM
ECCV 2024 | arXiv
131
citations
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
ICLR 2025 | arXiv
125
citations
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
NeurIPS 2025 | arXiv
118
citations
Tamper-Resistant Safeguards for Open-Weight LLMs
ICLR 2025 | arXiv
108
citations
Autoregressive Video Generation without Vector Quantization
ICLR 2025 | arXiv
101
citations
TLControl: Trajectory and Language Control for Human Motion Synthesis
ECCV 2024 | arXiv
77
citations
BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting
ECCV 2024 | arXiv
74
citations
DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?
ICLR 2025 | arXiv
62
citations
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
ICLR 2025 | arXiv
53
citations
WritingBench: A Comprehensive Benchmark for Generative Writing
NeurIPS 2025 | arXiv
41
citations
Dynamic Diffusion Transformer
ICLR 2025 | arXiv
34
citations
AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation
ECCV 2024 | arXiv
33
citations
EventBind: Learning a Unified Representation to Bind Them All for Event-based Open-world Understanding
ECCV 2024 | arXiv
28
citations
Language-Driven Physics-Based Scene Synthesis and Editing via Feature Splatting
ECCV 2024
28
citations
Theoretical Benefit and Limitation of Diffusion Language Model
NeurIPS 2025 | arXiv
27
citations
Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think
NeurIPS 2025 | arXiv
27
citations
Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation
ICLR 2025 | arXiv
24
citations
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
ICLR 2025 | arXiv
24
citations
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors
NeurIPS 2025 | arXiv
24
citations
EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks
ECCV 2024 | arXiv
24
citations
SWE-bench Goes Live!
NeurIPS 2025 | arXiv
22
citations
Temporal Reasoning Transfer from Text to Video
ICLR 2025 | arXiv
20
citations
Influence-Guided Diffusion for Dataset Distillation
ICLR 2025
19
citations
Do as We Do, Not as You Think: the Conformity of Large Language Models
ICLR 2025 | arXiv
18
citations
CLIBD: Bridging Vision and Genomics for Biodiversity Monitoring at Scale
ICLR 2025 | arXiv
18
citations
VeriThinker: Learning to Verify Makes Reasoning Model Efficient
NeurIPS 2025 | arXiv
16
citations
Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model
ICLR 2025 | arXiv
14
citations
AMD: Automatic Multi-step Distillation of Large-scale Vision Models
ECCV 2024 | arXiv
14
citations
UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface
NeurIPS 2025 | arXiv
14
citations
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
ICLR 2025 | arXiv
13
citations
DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation
ECCV 2024 | arXiv
13
citations
VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception
NeurIPS 2025 | arXiv
13
citations
UFM: A Simple Path towards Unified Dense Correspondence with Flow
NeurIPS 2025 | arXiv
13
citations
Unlocking Efficient, Scalable, and Continual Knowledge Editing with Basis-Level Representation Fine-Tuning
ICLR 2025 | arXiv
12
citations
This Time is Different: An Observability Perspective on Time Series Foundation Models
NeurIPS 2025 | arXiv
11
citations
Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model
ICLR 2025 | arXiv
10
citations
BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems
NeurIPS 2025 | arXiv
9
citations
On Reasoning Strength Planning in Large Reasoning Models
NeurIPS 2025 | arXiv
9
citations
JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent
NeurIPS 2025 | arXiv
9
citations
Implicit In-context Learning
ICLR 2025 | arXiv
8
citations
FSD-BEV: Foreground Self-Distillation for Multi-view 3D Object Detection
ECCV 2024 | arXiv
8
citations
EgoBlind: Towards Egocentric Visual Assistance for the Blind
NeurIPS 2025 | arXiv
8
citations
DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation
ECCV 2024 | arXiv
8
citations
Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface
ICLR 2025 | arXiv
8
citations
UniCoTT: A Unified Framework for Structural Chain-of-Thought Distillation
ICLR 2025
7
citations
LLMCO4MR: LLMs-aided Neural Combinatorial Optimization for Ancient Manuscript Restoration from Fragments with Case Studies on Dunhuang
ECCV 2024
6
citations
OSDA Agent: Leveraging Large Language Models for De Novo Design of Organic Structure Directing Agents
ICLR 2025
6
citations
ELICIT: LLM Augmentation Via External In-context Capability
ICLR 2025 | arXiv
6
citations
STAR: Stability-Inducing Weight Perturbation for Continual Learning
ICLR 2025 | arXiv
5
citations
Flow Matching-Based Autonomous Driving Planning with Advanced Interactive Behavior Modeling
NeurIPS 2025 | arXiv
5
citations
Pushing the Limits of All-Atom Geometric Graph Neural Networks: Pre-Training, Scaling, and Zero-Shot Transfer
ICLR 2025 | arXiv
5
citations
WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception
NeurIPS 2025 | arXiv
5
citations
MetaBox-v2: A Unified Benchmark Platform for Meta-Black-Box Optimization
NeurIPS 2025 | arXiv
5
citations
Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
NeurIPS 2025 | arXiv
5
citations
Advantage-Guided Distillation for Preference Alignment in Small Language Models
ICLR 2025 | arXiv
4
citations
Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization
NeurIPS 2025 | arXiv
4
citations
AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees
NeurIPS 2025 | arXiv
4
citations
Rethinking Neural Combinatorial Optimization for Vehicle Routing Problems with Different Constraint Tightness Degrees
NeurIPS 2025 | arXiv
4
citations
Audio-Sync Video Generation with Multi-Stream Temporal Control
NeurIPS 2025 | arXiv
4
citations
Who You Are Matters: Bridging Interests and Social Roles via LLM-Enhanced Logic Recommendation
NeurIPS 2025
3
citations
Attention! Your Vision Language Model Could Be Maliciously Manipulated
NeurIPS 2025 | arXiv
3
citations
Improving Generalization of Neural Combinatorial Optimization for Vehicle Routing Problems via Test-Time Projection Learning
NeurIPS 2025 | arXiv
3
citations
HG-Adapter: Improving Pre-Trained Heterogeneous Graph Neural Networks with Dual Adapters
ICLR 2025 | arXiv
3
citations
Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding
NeurIPS 2025 | arXiv
3
citations
Image Editing As Programs with Diffusion Models
NeurIPS 2025 | arXiv
2
citations
SAS: Simulated Attention Score
NeurIPS 2025 | arXiv
2
citations
ArchCAD-400K: A Large-Scale CAD drawings Dataset and New Baseline for Panoptic Symbol Spotting
NeurIPS 2025 | arXiv
2
citations
Teaching Language Models to Reason with Tools
NeurIPS 2025 | arXiv
2
citations
MPCache: MPC-Friendly KV Cache Eviction for Efficient Private LLM Inference
NeurIPS 2025 | arXiv
2
citations
Bridging Symmetry and Robustness: On the Role of Equivariance in Enhancing Adversarial Robustness
NeurIPS 2025 | arXiv
2
citations
Rotated Orthographic Projection for Self-Supervised 3D Human Pose Estimation
ECCV 2024
2
citations
Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
ICLR 2025 | arXiv
2
citations
Multi-Task Domain Adaptation for Language Grounding with 3D Objects
ECCV 2024 | arXiv
2
citations
MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs
NeurIPS 2025 | arXiv
2
citations
Holistic Large-Scale Scene Reconstruction via Mixed Gaussian Splatting
NeurIPS 2025 | arXiv
1
citations
Cue3D: Quantifying the Role of Image Cues in Single-Image 3D Generation
NeurIPS 2025 | arXiv
1
citations
OverLayBench: A Benchmark for Layout-to-Image Generation with Dense Overlaps
NeurIPS 2025 | arXiv
1
citations
Hierarchical Optimization via LLM-Guided Objective Evolution for Mobility-on-Demand Systems
NeurIPS 2025 | arXiv
0
citations
Emergent Orientation Maps —— Mechanisms, Coding Efficiency and Robustness
ICLR 2025
0
citations
Learning Partial Graph Matching via Optimal Partial Transport
ICLR 2025 | arXiv
0
citations
Optimal Nuisance Function Tuning for Estimating a Doubly Robust Functional under Proportional Asymptotics
NeurIPS 2025 | arXiv
0
citations
EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval
NeurIPS 2025 | arXiv
0
citations
NeuroPath: Neurobiology-Inspired Path Tracking and Reflection for Semantically Coherent Retrieval
NeurIPS 2025 | arXiv
0
citations
BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering
ECCV 2024 | arXiv
0
citations
Semantic Temporal Abstraction via Vision-Language Model Guidance for Efficient Reinforcement Learning
ICLR 2025
0
citations
MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization
NeurIPS 2025 | arXiv
0
citations
PlanU: Large Language Model Reasoning through Planning under Uncertainty
NeurIPS 2025 | arXiv
0
citations
Don’t Forget the Enjoin: FocalLoRA for Instruction Hierarchical Alignment in Large Language Models
NeurIPS 2025
0
citations
The Mirage of Performance Gains: Why Contrastive Decoding Fails to Mitigate Object Hallucinations in MLLMs?
NeurIPS 2025 | arXiv
0
citations
Nearly-Linear Time Private Hypothesis Selection with the Optimal Approximation Factor
NeurIPS 2025 | arXiv
0
citations
Diffusing to the Top: Boost Graph Neural Networks with Minimal Hyperparameter Tuning
ICLR 2025 | arXiv
0
citations
RankSEG-RMA: An Efficient Segmentation Algorithm via Reciprocal Moment Approximation
NeurIPS 2025 | arXiv
0
citations
Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
ICLR 2025 | arXiv
0
citations
Chains of Diffusion Models
ECCV 2024
0
citations
Wonder Wins Ways: Curiosity-Driven Exploration through Multi-Agent Contextual Calibration
NeurIPS 2025 | arXiv
0
citations
Off-policy Reinforcement Learning with Model-based Exploration Augmentation
NeurIPS 2025 | arXiv
0
citations
Block-Diagonal LoRA for Eliminating Communication Overhead in Tensor Parallel LoRA Serving
NeurIPS 2025 | arXiv
0
citations