Zhao
43 papers, 1,097 total citations
Papers (43)
An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
ECCV 2024 (arXiv), 343 citations
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
NeurIPS 2025 (arXiv), 118 citations
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
ECCV 2024 (arXiv), 82 citations
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models
ICLR 2025 (arXiv), 61 citations
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
ICLR 2025 (arXiv), 53 citations
Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation
ECCV 2024 (arXiv), 44 citations
Dynamic Diffusion Transformer
ICLR 2025 (arXiv), 34 citations
Informed Correctors for Discrete Diffusion Models
NeurIPS 2025 (arXiv), 31 citations
Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs
ICLR 2025 (arXiv), 30 citations
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Videos Generation
NeurIPS 2025 (arXiv), 25 citations
FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection
ECCV 2024 (arXiv), 23 citations
Region-Adaptive Transform with Segmentation Prior for Image Compression
ECCV 2024 (arXiv), 21 citations
CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction
ECCV 2024 (arXiv), 19 citations
Commit0: Library Generation from Scratch
ICLR 2025 (arXiv), 18 citations
InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping
ECCV 2024 (arXiv), 18 citations
OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation
ECCV 2024 (arXiv), 18 citations
OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents
NeurIPS 2025 (arXiv), 18 citations
Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
ICLR 2025 (arXiv), 16 citations
FastVID: Dynamic Density Pruning for Fast Video Large Language Models
NeurIPS 2025 (arXiv), 16 citations
CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer
ICLR 2025 (arXiv), 13 citations
Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting
NeurIPS 2025 (arXiv), 11 citations
CoSIGN: Few-Step Guidance of ConSIstency Model to Solve General INverse Problems
ECCV 2024 (arXiv), 10 citations
CLEVER: A Curated Benchmark for Formally Verified Code Generation
NeurIPS 2025 (arXiv), 10 citations
PiTe: Pixel-Temporal Alignment for Large Video-Language Model
ECCV 2024 (arXiv), 9 citations
SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement
ICLR 2025 (arXiv), 9 citations
LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS
NeurIPS 2025 (arXiv), 7 citations
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
ICLR 2025 (arXiv), 7 citations
T2V-OptJail: Discrete Prompt Optimization for Text-to-Video Jailbreak Attacks
NeurIPS 2025 (arXiv), 6 citations
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
ICLR 2025 (arXiv), 5 citations
The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition
NeurIPS 2025 (arXiv), 5 citations
Test-time Model Adaptation for Image Reconstruction Using Self-supervised Adaptive Layers
ECCV 2024, 4 citations
Towards foundational LiDAR world models with efficient latent flow matching
NeurIPS 2025 (arXiv), 4 citations
TrajAgent: An LLM-Agent Framework for Trajectory Modeling via Large-and-Small Model Collaboration
NeurIPS 2025 (arXiv), 3 citations
TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine
NeurIPS 2025 (arXiv), 2 citations
PreFM: Online Audio-Visual Event Parsing via Predictive Future Modeling
NeurIPS 2025 (arXiv), 1 citation
Capability Localization: Capabilities Can be Localized rather than Individual Knowledge
ICLR 2025 (arXiv), 1 citation
PolyhedronNet: Representation Learning for Polyhedra with Surface-attributed Graph
ICLR 2025 (arXiv), 1 citation
Learning the Plasticity: Plasticity-Driven Learning Framework in Spiking Neural Networks
NeurIPS 2025 (arXiv), 1 citation
Idling Neurons, Appropriately Lenient Workload During Fine-tuning Leads to Better Generalization
ECCV 2024, 0 citations
Simulating Society Requires Simulating Thought
NeurIPS 2025 (arXiv), 0 citations
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
NeurIPS 2025 (arXiv), 0 citations
Towards Physics-informed Spatial Intelligence with Human Priors: An Autonomous Driving Pilot Study
NeurIPS 2025 (arXiv), 0 citations
Rainbow Delay Compensation: A Multi-Agent Reinforcement Learning Framework for Mitigating Observation Delays
NeurIPS 2025, 0 citations