
74
Papers
1,457
Total Citations

Papers (74)

OpenHands: An Open Platform for AI Software Developers as Generalist Agents

ICLR 2025 | arXiv
351
citations

SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal

ICLR 2025 | arXiv
141
citations

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

NeurIPS 2025 | arXiv
118
citations

CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL

ICLR 2025 | arXiv
116
citations

LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models

ECCV 2024 | arXiv
114
citations

Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models

NeurIPS 2025 | arXiv
56
citations

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

NeurIPS 2025 | arXiv
52
citations

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

ECCV 2024 | arXiv
40
citations

SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration

ICLR 2025 | arXiv
39
citations

Hierarchical Gaussian Mixture Normalizing Flow Modeling for Unified Anomaly Detection

ECCV 2024 | arXiv
30
citations

STAMP: Scalable Task- And Model-agnostic Collaborative Perception

ICLR 2025 | arXiv
29
citations

EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks

ECCV 2024 | arXiv
24
citations

SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters

ICLR 2025 | arXiv
23
citations

Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Datasets

ICLR 2025 | arXiv
23
citations

Navigation-Guided Sparse Scene Representation for End-to-End Autonomous Driving

ICLR 2025 | arXiv
23
citations

SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking

ICLR 2025 | arXiv
21
citations

How Do Large Language Models Understand Graph Patterns? A Benchmark for Graph Pattern Comprehension

ICLR 2025 | arXiv
20
citations

VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model

NeurIPS 2025
17
citations

Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation

CVPR 2024
17
citations

Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding

NeurIPS 2025 | arXiv
17
citations

Quantized Spike-driven Transformer

ICLR 2025 | arXiv
14
citations

CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer

ICLR 2025 | arXiv
13
citations

C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition

ECCV 2024 | arXiv
12
citations

TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation

ICLR 2025 | arXiv
11
citations

JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent

NeurIPS 2025 | arXiv
9
citations

Test-time Adaptation for Cross-modal Retrieval with Query Shift

ICLR 2025 | arXiv
9
citations

BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems

NeurIPS 2025 | arXiv
9
citations

Causally Motivated Sycophancy Mitigation for Large Language Models

ICLR 2025
8
citations

IPDreamer: Appearance-Controllable 3D Object Generation with Complex Image Prompts

ICLR 2025 | arXiv
7
citations

SemReg: Semantics Constrained Point Cloud Registration

ECCV 2024
7
citations

MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models

NeurIPS 2025 | arXiv
7
citations

LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration

NeurIPS 2025 | arXiv
6
citations

LLMCO4MR: LLMs-aided Neural Combinatorial Optimization for Ancient Manuscript Restoration from Fragments with Case Studies on Dunhuang

ECCV 2024
6
citations

CMD: A Cross Mechanism Domain Adaptation Dataset for 3D Object Detection

ECCV 2024
6
citations

Integrative Decoding: Improving Factuality via Implicit Self-consistency

ICLR 2025 | arXiv
6
citations

RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video

NeurIPS 2025 | arXiv
5
citations

The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition

NeurIPS 2025 | arXiv
5
citations

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

NeurIPS 2025 | arXiv
4
citations

Characterizing the Expressivity of Fixed-Precision Transformer Language Models

NeurIPS 2025 | arXiv
4
citations

Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models

NeurIPS 2025 | arXiv
4
citations

DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation

NeurIPS 2025 | arXiv
3
citations

Solving the inverse problem of microscopy deconvolution with a residual Beylkin-Coifman-Rokhlin neural network

ECCV 2024 | arXiv
3
citations

DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding

NeurIPS 2025 | arXiv
3
citations

Rotated Orthographic Projection for Self-Supervised 3D Human Pose Estimation

ECCV 2024
2
citations

MPCache: MPC-Friendly KV Cache Eviction for Efficient Private LLM Inference

NeurIPS 2025 | arXiv
2
citations

EcoFace: Audio-Visual Emotional Co-Disentanglement Speech-Driven 3D Talking Face Generation

ICLR 2025
2
citations

Dendritic Resonate-and-Fire Neuron for Effective and Efficient Long Sequence Modeling

NeurIPS 2025 | arXiv
2
citations

An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning

ICLR 2025 | arXiv
2
citations

MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

NeurIPS 2025 | arXiv
2
citations

Outlier Synthesis via Hamiltonian Monte Carlo for Out-of-Distribution Detection

ICLR 2025 | arXiv
2
citations

Matrix Product Sketching via Coordinated Sampling

ICLR 2025 | arXiv
2
citations

RoFt-Mol: Benchmarking Robust Fine-tuning with Molecular Graph Foundation Models

NeurIPS 2025 | arXiv
1
citation

LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding

NeurIPS 2025 | arXiv
1
citation

Mitigating Hallucination in VideoLLMs via Temporal-Aware Activation Engineering

NeurIPS 2025 | arXiv
1
citation

Learning Cocoercive Conservative Denoisers via Helmholtz Decomposition for Poisson Imaging Inverse Problems

NeurIPS 2025
1
citation

UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression

NeurIPS 2025 | arXiv
1
citation

TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels

NeurIPS 2025 | arXiv
1
citation

VADTree: Explainable Training-Free Video Anomaly Detection via Hierarchical Granularity-Aware Tree

NeurIPS 2025 | arXiv
1
citation

Release the Powers of Prompt Tuning: Cross-Modality Prompt Transfer

ICLR 2025
1
citation

Adaptive Stochastic Coefficients for Accelerating Diffusion Sampling

NeurIPS 2025 | arXiv
1
citation

ProDyG: Progressive Dynamic Scene Reconstruction via Gaussian Splatting from Monocular Videos

NeurIPS 2025 | arXiv
0
citations

EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval

NeurIPS 2025 | arXiv
0
citations

Revealing Multimodal Causality with Large Language Models

NeurIPS 2025 | arXiv
0
citations

DSAS: A Universal Plug-and-Play Framework for Attention Optimization in Multi-Document Question Answering

NeurIPS 2025 | arXiv
0
citations

Order-Level Attention Similarity Across Language Models: A Latent Commonality

NeurIPS 2025 | arXiv
0
citations

Don’t Forget the Enjoin: FocalLoRA for Instruction Hierarchical Alignment in Large Language Models

NeurIPS 2025
0
citations

Adaptive Data-Borrowing for Improving Treatment Effect Estimation using External Controls

NeurIPS 2025 | arXiv
0
citations

WearVQA: A Visual Question Answering Benchmark for Wearables in Egocentric Authentic Real-world scenarios

NeurIPS 2025 | arXiv
0
citations

Rebalancing Contrastive Alignment with Bottlenecked Semantic Increments in Text-Video Retrieval

NeurIPS 2025 | arXiv
0
citations

Functional Matching of Logic Subgraphs: Beyond Structural Isomorphism

NeurIPS 2025 | arXiv
0
citations

Feature Averaging: An Implicit Bias of Gradient Descent Leading to Non-Robustness in Neural Networks

ICLR 2025 | arXiv
0
citations

Sketching for Convex and Nonconvex Regularized Least Squares with Sharp Guarantees

ICLR 2025 | arXiv
0
citations

NeuroPath: Neurobiology-Inspired Path Tracking and Reflection for Semantically Coherent Retrieval

NeurIPS 2025 | arXiv
0
citations

Is Noise Conditioning Necessary? A Unified Theory of Unconditional Graph Diffusion Models

NeurIPS 2025 | arXiv
0
citations