43
Papers
3,949
Total Citations
10
h-index

Papers (43)

DETRs Beat YOLOs on Real-time Object Detection

CVPR 2024
2,424
citations

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark

CVPR 2024
864
citations

CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians

ECCV 2024 (arXiv)
180
citations

Improved Techniques for Optimization-Based Jailbreaking on Large Language Models

ICLR 2025 (arXiv)
74
citations

IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination

ECCV 2024 (arXiv)
54
citations

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

ECCV 2024 (arXiv)
51
citations

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models

ECCV 2024
49
citations

RRM: Robust Reward Model Training Mitigates Reward Hacking

ICLR 2025 (arXiv)
44
citations

On the Role of Attention Heads in Large Language Model Safety

ICLR 2025 (arXiv)
40
citations

SUTrack: Towards Simple and Unified Single Object Tracking

AAAI 2025
37
citations

Exploring Enhanced Contextual Information for Video-Level Object Tracking

AAAI 2025
27
citations

Training-free Video Temporal Grounding using Large-scale Pre-trained Models

ECCV 2024 (arXiv)
20
citations

Temporal Reasoning Transfer from Text to Video

ICLR 2025 (arXiv)
20
citations

Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning

AAAI 2024 (arXiv)
14
citations

CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer

ICLR 2025 (arXiv)
13
citations

PiTe: Pixel-Temporal Alignment for Large Video-Language Model

ECCV 2024 (arXiv)
9
citations

ZoLA: Zero-Shot Creative Long Animation Generation with Short Video Model

ECCV 2024
9
citations

RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video

NeurIPS 2025 (arXiv)
5
citations

UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning

NeurIPS 2025 (arXiv)
4
citations

Can LLMs Outshine Conventional Recommenders? A Comparative Evaluation

NeurIPS 2025 (arXiv)
4
citations

DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding

NeurIPS 2025 (arXiv)
3
citations

TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels

NeurIPS 2025 (arXiv)
1
citation

CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective

ICCV 2025
1
citation

Stability and Sharper Risk Bounds with Convergence Rate $\tilde{O}(1/n^2)$

NeurIPS 2025 (arXiv)
1
citation

Intrinsic Benefits of Categorical Distributional Loss: Uncertainty-aware Regularized Exploration in Reinforcement Learning

NeurIPS 2025
1
citation

Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation

NeurIPS 2025 (arXiv)
0
citations

Sampled Estimators For Softmax Must Be Biased

NeurIPS 2025
0
citations

Semantic Surgery: Zero-Shot Concept Erasure in Diffusion Models

NeurIPS 2025 (arXiv)
0
citations

Effective Cloud Removal for Remote Sensing Images by an Improved Mean-Reverting Denoising Model with Elucidated Design Space

CVPR 2025
0
citations

VisualLens: Personalization through Task-Agnostic Visual History

NeurIPS 2025 (arXiv)
0
citations

Controllable Protein Sequence Generation with LLM Preference Optimization

AAAI 2025
0
citations

DepthVanish: Optimizing Adversarial Interval Structures for Stereo-Depth-Invisible Patches

NeurIPS 2025 (arXiv)
0
citations

MiCo: Multi-image Contrast for Reinforcement Visual Reasoning

NeurIPS 2025 (arXiv)
0
citations

VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis

AAAI 2025
0
citations

Chains of Diffusion Models

ECCV 2024
0
citations

Benchmarking Large Language Models on Controllable Generation under Diversified Instructions

AAAI 2024
0
citations

Gradual Residuals Alignment: A Dual-Stream Framework for GAN Inversion and Image Attribute Editing

AAAI 2024
0
citations

CAMixerSR: Only Details Need More "Attention"

CVPR 2024
0
citations

Tactile-Augmented Radiance Fields

CVPR 2024
0
citations

CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation

ICCV 2025
0
citations

Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways

CVPR 2025
0
citations

Tuning-free Estimation and Inference of Cumulative Distribution Function under Local Differential Privacy

ICML 2024
0
citations

Multi-Source Conformal Inference Under Distribution Shift

ICML 2024
0
citations