wang

78
Papers
1,864
Total Citations

Papers (78)

Video-R1: Reinforcing Video Reasoning in MLLMs

NeurIPS 2025 · arXiv
236
citations

Advancing LLM Reasoning Generalists with Preference Trees

ICLR 2025 · arXiv
179
citations

SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM

ECCV 2024 · arXiv
131
citations

WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling

ICLR 2025 · arXiv
125
citations

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

NeurIPS 2025 · arXiv
118
citations

Tamper-Resistant Safeguards for Open-Weight LLMs

ICLR 2025 · arXiv
108
citations

Autoregressive Video Generation without Vector Quantization

ICLR 2025 · arXiv
101
citations

TLControl: Trajectory and Language Control for Human Motion Synthesis

ECCV 2024 · arXiv
77
citations

BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting

ECCV 2024 · arXiv
74
citations

DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?

ICLR 2025 · arXiv
62
citations

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

ICLR 2025 · arXiv
53
citations

WritingBench: A Comprehensive Benchmark for Generative Writing

NeurIPS 2025 · arXiv
41
citations

Dynamic Diffusion Transformer

ICLR 2025 · arXiv
34
citations

AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation

ECCV 2024 · arXiv
33
citations

Language-Driven Physics-Based Scene Synthesis and Editing via Feature Splatting

ECCV 2024
28
citations

Theoretical Benefit and Limitation of Diffusion Language Model

NeurIPS 2025 · arXiv
27
citations

Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think

NeurIPS 2025 · arXiv
27
citations

SPA: 3D Spatial-Awareness Enables Effective Embodied Representation

ICLR 2025 · arXiv
24
citations

EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks

ECCV 2024 · arXiv
24
citations

Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors

NeurIPS 2025 · arXiv
24
citations

SWE-bench Goes Live!

NeurIPS 2025 · arXiv
22
citations

Temporal Reasoning Transfer from Text to Video

ICLR 2025 · arXiv
20
citations

Influence-Guided Diffusion for Dataset Distillation

ICLR 2025
19
citations

CLIBD: Bridging Vision and Genomics for Biodiversity Monitoring at Scale

ICLR 2025 · arXiv
18
citations

Do as We Do, Not as You Think: the Conformity of Large Language Models

ICLR 2025 · arXiv
18
citations

VeriThinker: Learning to Verify Makes Reasoning Model Efficient

NeurIPS 2025 · arXiv
16
citations

AMD: Automatic Multi-step Distillation of Large-scale Vision Models

ECCV 2024 · arXiv
14
citations

UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface

NeurIPS 2025 · arXiv
14
citations

VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception

NeurIPS 2025 · arXiv
13
citations

UFM: A Simple Path towards Unified Dense Correspondence with Flow

NeurIPS 2025 · arXiv
13
citations

DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation

ECCV 2024 · arXiv
13
citations

CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models

ICLR 2025 · arXiv
13
citations

Unlocking Efficient, Scalable, and Continual Knowledge Editing with Basis-Level Representation Fine-Tuning

ICLR 2025 · arXiv
12
citations

This Time is Different: An Observability Perspective on Time Series Foundation Models

NeurIPS 2025 · arXiv
11
citations

Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model

ICLR 2025 · arXiv
10
citations

BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems

NeurIPS 2025 · arXiv
9
citations

FSD-BEV: Foreground Self-Distillation for Multi-view 3D Object Detection

ECCV 2024 · arXiv
8
citations

DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation

ECCV 2024 · arXiv
8
citations

Implicit In-context Learning

ICLR 2025 · arXiv
8
citations

Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface

ICLR 2025 · arXiv
8
citations

EgoBlind: Towards Egocentric Visual Assistance for the Blind

NeurIPS 2025 · arXiv
8
citations

LLMCO4MR: LLMs-aided Neural Combinatorial Optimization for Ancient Manuscript Restoration from Fragments with Case Studies on Dunhuang

ECCV 2024
6
citations

Flow Matching-Based Autonomous Driving Planning with Advanced Interactive Behavior Modeling

NeurIPS 2025 · arXiv
5
citations

STAR: Stability-Inducing Weight Perturbation for Continual Learning

ICLR 2025 · arXiv
5
citations

MetaBox-v2: A Unified Benchmark Platform for Meta-Black-Box Optimization

NeurIPS 2025 · arXiv
5
citations

Audio-Sync Video Generation with Multi-Stream Temporal Control

NeurIPS 2025 · arXiv
4
citations

AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees

NeurIPS 2025 · arXiv
4
citations

Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization

NeurIPS 2025 · arXiv
4
citations

Rethinking Neural Combinatorial Optimization for Vehicle Routing Problems with Different Constraint Tightness Degrees

NeurIPS 2025 · arXiv
4
citations

Attention! Your Vision Language Model Could Be Maliciously Manipulated

NeurIPS 2025 · arXiv
3
citations

Improving Generalization of Neural Combinatorial Optimization for Vehicle Routing Problems via Test-Time Projection Learning

NeurIPS 2025 · arXiv
3
citations

MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

NeurIPS 2025 · arXiv
2
citations

Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs

ICLR 2025 · arXiv
2
citations

Image Editing As Programs with Diffusion Models

NeurIPS 2025 · arXiv
2
citations

ArchCAD-400K: A Large-Scale CAD drawings Dataset and New Baseline for Panoptic Symbol Spotting

NeurIPS 2025 · arXiv
2
citations

Bridging Symmetry and Robustness: On the Role of Equivariance in Enhancing Adversarial Robustness

NeurIPS 2025 · arXiv
2
citations

Teaching Language Models to Reason with Tools

NeurIPS 2025 · arXiv
2
citations

MPCache: MPC-Friendly KV Cache Eviction for Efficient Private LLM Inference

NeurIPS 2025 · arXiv
2
citations

SAS: Simulated Attention Score

NeurIPS 2025 · arXiv
2
citations

Rotated Orthographic Projection for Self-Supervised 3D Human Pose Estimation

ECCV 2024
2
citations

Holistic Large-Scale Scene Reconstruction via Mixed Gaussian Splatting

NeurIPS 2025 · arXiv
1
citation

Cue3D: Quantifying the Role of Image Cues in Single-Image 3D Generation

NeurIPS 2025 · arXiv
1
citation

RankSEG-RMA: An Efficient Segmentation Algorithm via Reciprocal Moment Approximation

NeurIPS 2025 · arXiv
0
citations

Chains of Diffusion Models

ECCV 2024
0
citations

Wonder Wins Ways: Curiosity-Driven Exploration through Multi-Agent Contextual Calibration

NeurIPS 2025 · arXiv
0
citations

BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering

ECCV 2024 · arXiv
0
citations

Block-Diagonal LoRA for Eliminating Communication Overhead in Tensor Parallel LoRA Serving

NeurIPS 2025 · arXiv
0
citations

Don’t Forget the Enjoin: FocalLoRA for Instruction Hierarchical Alignment in Large Language Models

NeurIPS 2025
0
citations

PlanU: Large Language Model Reasoning through Planning under Uncertainty

NeurIPS 2025 · arXiv
0
citations

The Mirage of Performance Gains: Why Contrastive Decoding Fails to Mitigate Object Hallucinations in MLLMs?

NeurIPS 2025 · arXiv
0
citations

NeuroPath: Neurobiology-Inspired Path Tracking and Reflection for Semantically Coherent Retrieval

NeurIPS 2025 · arXiv
0
citations

Optimal Nuisance Function Tuning for Estimating a Doubly Robust Functional under Proportional Asymptotics

NeurIPS 2025 · arXiv
0
citations

Emergent Orientation Maps —— Mechanisms, Coding Efficiency and Robustness

ICLR 2025
0
citations

Learning Partial Graph Matching via Optimal Partial Transport

ICLR 2025 · arXiv
0
citations

Semantic Temporal Abstraction via Vision-Language Model Guidance for Efficient Reinforcement Learning

ICLR 2025
0
citations

Diffusing to the Top: Boost Graph Neural Networks with Minimal Hyperparameter Tuning

ICLR 2025 · arXiv
0
citations

Nearly-Linear Time Private Hypothesis Selection with the Optimal Approximation Factor

NeurIPS 2025 · arXiv
0
citations

Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset

ICLR 2025 · arXiv
0
citations