Zhu

28
Papers
1,058
Total Citations

Papers (28)

MobileNetV4: Universal Models for the Mobile Ecosystem

ECCV 2024 · arXiv
407
citations

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

ICLR 2025 · arXiv
134
citations

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

NeurIPS 2025 · arXiv
118
citations

The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning

NeurIPS 2025 · arXiv
74
citations

ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation

ICLR 2025 · arXiv
65
citations

Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting

ICLR 2025 · arXiv
38
citations

Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning

NeurIPS 2025 · arXiv
36
citations

Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering

ICLR 2025 · arXiv
33
citations

VLA-Cache: Efficient Vision-Language-Action Manipulation via Adaptive Token Caching

NeurIPS 2025 · arXiv
27
citations

$\text{D}_{2}\text{O}$: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models

ICLR 2025
22
citations

SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training

ICLR 2025 · arXiv
15
citations

EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head

ECCV 2024 · arXiv
14
citations

OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models

ECCV 2024 · arXiv
13
citations

Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging

NeurIPS 2025 · arXiv
13
citations

NetMoE: Accelerating MoE Training through Dynamic Sample Placement

ICLR 2025
11
citations

KDProR: A Knowledge-Decoupling Probabilistic Framework for Video-Text Retrieval

ECCV 2024
10
citations

UniCoTT: A Unified Framework for Structural Chain-of-Thought Distillation

ICLR 2025
7
citations

WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation

ECCV 2024 · arXiv
5
citations

RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video

NeurIPS 2025 · arXiv
5
citations

DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation

NeurIPS 2025 · arXiv
3
citations

Rotated Orthographic Projection for Self-Supervised 3D Human Pose Estimation

ECCV 2024
2
citations

SEBRA: Debiasing through Self-Guided Bias Ranking

ICLR 2025 · arXiv
2
citations

Scaling Instruction-tuned LLMs to Million-token Contexts via Hierarchical Synthetic Data Generation

ICLR 2025 · arXiv
2
citations

Blackbox Model Provenance via Palimpsestic Membership Inference

NeurIPS 2025 · arXiv
1
citation

VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models

NeurIPS 2025 · arXiv
1
citation

AANet: Virtual Screening under Structural Uncertainty via Alignment and Aggregation

NeurIPS 2025 · arXiv
0
citations

World Models Should Prioritize the Unification of Physical and Social Dynamics

NeurIPS 2025 · arXiv
0
citations

Collaborating Vision, Depth, and Thermal Signals for Multi-Modal Tracking: Dataset and Algorithm

NeurIPS 2025 · arXiv
0
citations