Most Cited 2025 "quantum noise characterization" Papers

22,274 papers found • Page 15 of 112

#2801

STCOcc: Sparse Spatial-Temporal Cascade Renovation for 3D Occupancy and Scene Flow Prediction

Zhimin Liao, Ping Wei, Shuaijia Chen et al.

CVPR 2025arXiv:2504.19749
10
citations
#2802

Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment

Johannes Schusterbauer, Ming Gui, Frank Fundel et al.

CVPR 2025arXiv:2506.02221
10
citations
#2803

Breaking the Data Barrier -- Building GUI Agents Through Task Generalization

Junlei Zhang, Zichen Ding, Chang Ma et al.

COLM 2025paperarXiv:2504.10127
10
citations
#2804

Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control

Hannah Cyberey, David Evans

COLM 2025paper
10
citations
#2805

More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness

Aaron J. Li, Satyapriya Krishna, Hima Lakkaraju

ICLR 2025arXiv:2404.18870
10
citations
#2806

Advancing Language Multi-Agent Learning with Credit Re-Assignment for Interactive Environment Generalization

Zhitao He, Zijun Liu, Peng Li et al.

COLM 2025paperarXiv:2502.14496
10
citations
#2807

Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise

Brayan Monroy, Jorge Bacca, Julián Tachella

CVPR 2025arXiv:2412.04648
10
citations
#2808

First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training

Lai Wei, Yuting Li, Chen Wang et al.

NEURIPS 2025arXiv:2505.22453
10
citations
#2809

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Nikhil Kandpal, Brian Lester, Colin Raffel et al.

NEURIPS 2025arXiv:2506.05209
10
citations
#2810

Aligning Human Motion Generation with Human Perceptions

Haoru Wang, Wentao Zhu, Luyi Miao et al.

ICLR 2025arXiv:2407.02272
10
citations
#2811

Towards Multiple Character Image Animation Through Enhancing Implicit Decoupling

Jingyun Xue, WANG HongFa, Qi Tian et al.

ICLR 2025arXiv:2406.03035
10
citations
#2812

EqNIO: Subequivariant Neural Inertial Odometry

Royina Karegoudra Jayanth, Yinshuang Xu, Ziyun Wang et al.

ICLR 2025arXiv:2408.06321
10
citations
#2813

ResCLIP: Residual Attention for Training-free Dense Vision-language Inference

Jinhong Deng, Yuhang Yang, Wen Li et al.

CVPR 2025arXiv:2411.15851
10
citations
#2814

A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals

Grace Liu, Michael Tang, Benjamin Eysenbach

ICLR 2025arXiv:2408.05804
10
citations
#2815

Deep MMD Gradient Flow without adversarial training

Alexandre Galashov, Valentin De Bortoli, Arthur Gretton

ICLR 2025arXiv:2405.06780
10
citations
#2816

Fluid Language Model Benchmarking

Valentin Hofmann, David Heineman, Ian Magnusson et al.

COLM 2025paperarXiv:2509.11106
10
citations
#2817

MagicColor: Multi-instance Sketch Colorization

Yinhan Zhang, Yue Ma, Bingyuan Wang et al.

ICCV 2025
10
citations
#2818

Gaussian Eigen Models for Human Heads

Wojciech Zielonka, Timo Bolkart, Thabo Beeler et al.

CVPR 2025arXiv:2407.04545
10
citations
#2819

Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance

Sachin Goyal, Christina Baek, Zico Kolter et al.

ICLR 2025
10
citations
#2820

Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program

Minghe Gao, Xuqi Liu, Zhongqi Yue et al.

ICCV 2025arXiv:2504.06606
10
citations
#2821

Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses

David Glukhov, Ziwen Han, I Shumailov et al.

ICLR 2025arXiv:2407.02551
10
citations
#2822

Rethinking Visual Counterfactual Explanations Through Region Constraint

Bartlomiej Sobieski, Jakub Grzywaczewski, Bartłomiej Sadlej et al.

ICLR 2025arXiv:2410.12591
10
citations
#2823

Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning

Yichi Zhang, Zhuo Chen, Lingbing Guo et al.

ICLR 2025arXiv:2405.16869
10
citations
#2824

The 3D-PC: a benchmark for visual perspective taking in humans and machines

Drew Linsley, Peisen Zhou, Alekh Ashok et al.

ICLR 2025arXiv:2406.04138
10
citations
#2825

Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models

Huajie Tan, Yuheng Ji, Xiaoshuai Hao et al.

NEURIPS 2025arXiv:2503.20752
10
citations
#2826

EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation

Zihao Zhang, Haoran Chen, Haoyu Zhao et al.

CVPR 2025arXiv:2503.15831
10
citations
#2827

Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering

Federico Cocchi, Nicholas Moratelli, Marcella Cornia et al.

CVPR 2025arXiv:2411.16863
10
citations
#2828

MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL

Claas Voelcker, Marcel Hussing, Eric Eaton et al.

ICLR 2025arXiv:2410.08896
10
citations
#2829

Causal Inference over Visual-Semantic-Aligned Graph for Image Classification

Lei Meng, Xiangxian Li, Xiaoshuo Yan et al.

AAAI 2025paper
10
citations
#2830

Temporal Separation with Entropy Regularization for Knowledge Distillation in Spiking Neural Networks

Kairong Yu, Chengting Yu, Tianqing Zhang et al.

CVPR 2025arXiv:2503.03144
10
citations
#2831

DocVLM: Make Your VLM an Efficient Reader

Mor Shpigel Nacson, Aviad Aberdam, Roy Ganz et al.

CVPR 2025arXiv:2412.08746
10
citations
#2832

RGBAvatar: Reduced Gaussian Blendshapes for Online Modeling of Head Avatars

Linzhou Li, Yumeng Li, Yanlin Weng et al.

CVPR 2025highlightarXiv:2503.12886
10
citations
#2833

V2X-R: Cooperative LiDAR-4D Radar Fusion with Denoising Diffusion for 3D Object Detection

Xun Huang, Jinlong Wang, Qiming Xia et al.

CVPR 2025arXiv:2411.08402
10
citations
#2834

Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting

Suraj Anand, Michael Lepori, Jack Merullo et al.

ICLR 2025arXiv:2406.00053
10
citations
#2835

Label-Free Backdoor Attacks in Vertical Federated Learning

Wei Shen, Wenke Huang, Guancheng Wan et al.

AAAI 2025paper
10
citations
#2836

Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs

Yikang Zhou, Tao Zhang, Shilin Xu et al.

ICCV 2025arXiv:2501.04670
10
citations
#2837

PCDreamer: Point Cloud Completion Through Multi-view Diffusion Priors

Guangshun Wei, Yuan Feng, Long Ma et al.

CVPR 2025arXiv:2411.19036
10
citations
#2838

Harnessing Massive Satellite Imagery with Efficient Masked Image Modeling

Fengxiang Wang, Hongzhen Wang, Di Wang et al.

ICCV 2025arXiv:2406.11933
10
citations
#2839

Pareto Set Learning for Multi-Objective Reinforcement Learning

Erlong Liu, Yu-Chang Wu, Xiaobin Huang et al.

AAAI 2025paperarXiv:2501.06773
10
citations
#2840

Noise Stability Optimization for Finding Flat Minima: A Hessian-based Regularization Approach

Haotian Ju, Hongyang Zhang, Dongyue Li

ICLR 2025arXiv:2306.08553
10
citations
#2841

WildSAT: Learning Satellite Image Representations from Wildlife Observations

Rangel Daroya, Elijah Cole, Oisin Mac Aodha et al.

ICCV 2025arXiv:2412.14428
10
citations
#2842

VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction

Zijian He, Yuwei Ning, Yipeng Qin et al.

CVPR 2025arXiv:2503.12165
10
citations
#2843

SuperDec: 3D Scene Decomposition with Superquadrics Primitives

Elisabetta Fedele, Boyang Sun, Francis Engelmann et al.

ICCV 2025arXiv:2504.00992
10
citations
#2844

LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models

Haiwen Huang, Anpei Chen, Volodymyr Havrylov et al.

ICCV 2025arXiv:2504.14032
10
citations
#2845

$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training

Jin Zhou, Kaiwen Wang, Jonathan Chang et al.

NEURIPS 2025arXiv:2502.20548
10
citations
#2846

Generative Monoculture in Large Language Models

Fan Wu, Emily Black, Varun Chandrasekaran

ICLR 2025arXiv:2407.02209
10
citations
#2847

DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization

Gang Li, Ming Lin, Tomer Galanti et al.

NEURIPS 2025arXiv:2505.12366
10
citations
#2848

ParZC: Parametric Zero-Cost Proxies for Efficient NAS

Peijie Dong, Lujun Li, Zhenheng Tang et al.

AAAI 2025paperarXiv:2402.02105
10
citations
#2849

Video Summarization with Large Language Models

Min Jung Lee, Dayoung Gong, Minsu Cho

CVPR 2025arXiv:2504.11199
10
citations
#2850

Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model

Zhaochong An, Guolei Sun, Yun Liu et al.

CVPR 2025arXiv:2503.16282
10
citations
#2851

DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution

Zheng Chen, Zichen Zou, Kewei Zhang et al.

NEURIPS 2025arXiv:2505.16239
10
citations
#2852

Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution

Du Chen, Liyi Chen, Zhengqiang ZHANG et al.

ICCV 2025arXiv:2501.06838
10
citations
#2853

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11)

Tianyi Zhang, Mohsen Hariri, Shaochen (Henry) Zhong et al.

NEURIPS 2025arXiv:2504.11651
10
citations
#2854

Knowledge Graph Completion with Relation-Aware Anchor Enhancement

Duanyang Yuan, Sihang Zhou, Xiaoshu Chen et al.

AAAI 2025paperarXiv:2504.06129
10
citations
#2855

ReCap: Better Gaussian Relighting with Cross-Environment Captures

Jingzhi Li, Zongwei Wu, Eduard Zamfir et al.

CVPR 2025arXiv:2412.07534
10
citations
#2856

Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems

Jie Chen

ICLR 2025arXiv:2406.00809
10
citations
#2857

Revisiting Multimodal Fusion for 3D Anomaly Detection from an Architectural Perspective

Kaifang Long, Guoyang Xie, Lianbo Ma et al.

AAAI 2025paperarXiv:2412.17297
10
citations
#2858

MindLLM: A Subject-Agnostic and Versatile Model for fMRI-to-text Decoding

Weikang Qiu, Zheng Huang, Haoyu Hu et al.

ICML 2025arXiv:2502.15786
10
citations
#2859

Open-World Amodal Appearance Completion

Jiayang Ao, Yanbei Jiang, Qiuhong Ke et al.

CVPR 2025arXiv:2411.13019
10
citations
#2860

OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data

Yiren Song, Cheng Liu, Mike Zheng Shou

NEURIPS 2025arXiv:2505.18445
10
citations
#2861

Anyprefer: An Agentic Framework for Preference Data Synthesis

Yiyang Zhou, Zhaoyang Wang, Tianle Wang et al.

ICLR 2025arXiv:2504.19276
10
citations
#2862

Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling

Minhyuk Seo, Hyunseo Koh, Jonghyun Choi

ICLR 2025arXiv:2410.15143
10
citations
#2863

TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Haiyang Wang, Yue Fan, Muhammad Ferjad Naeem et al.

ICLR 2025arXiv:2410.23168
10
citations
#2864

MINERVA: Evaluating Complex Video Reasoning

Arsha Nagrani, Sachit Menon, Ahmet Iscen et al.

ICCV 2025arXiv:2505.00681
10
citations
#2865

Efficiently Parameterized Neural Metriplectic Systems

Anthony Gruber, Kookjin Lee, Haksoo Lim et al.

ICLR 2025arXiv:2405.16305
10
citations
#2866

DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers

Xuanlei Zhao, Shenggan Cheng, Chang Chen et al.

ICML 2025arXiv:2403.10266
10
citations
#2867

The Double-Ellipsoid Geometry of CLIP

Meir Yossef Levi, Guy Gilboa

ICML 2025arXiv:2411.14517
10
citations
#2868

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Tsung-Han (Patrick) Wu, Heekyung Lee, Jiaxin Ge et al.

NEURIPS 2025arXiv:2504.13169
10
citations
#2869

HIIF: Hierarchical Encoding based Implicit Image Function for Continuous Super-resolution

Yuxuan Jiang, Ho Man Kwan, Jasmine Peng et al.

CVPR 2025arXiv:2412.03748
10
citations
#2870

A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks

Thomas Schmied, Thomas Adler, Vihang Patil et al.

ICML 2025arXiv:2410.22391
10
citations
#2871

MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards

Sheng Wang, Liheng Chen, Pengan CHEN et al.

ICLR 2025arXiv:2410.00938
10
citations
#2872

DOTA: Distributional Test-time Adaptation of Vision-Language Models

Zongbo Han, Jialong Yang, Guangyu Wang et al.

NEURIPS 2025arXiv:2409.19375
10
citations
#2873

Probing the Latent Hierarchical Structure of Data via Diffusion Models

Antonio Sclocchi, Alessandro Favero, Noam Levi et al.

ICLR 2025arXiv:2410.13770
10
citations
#2874

SlerpFace: Face Template Protection via Spherical Linear Interpolation

Zhizhou Zhong, Yuxi Mi, Yuge Huang et al.

AAAI 2025paperarXiv:2407.03043
10
citations
#2875

SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input

Zhen Lv, Yangqi Long, Congzhentao Huang et al.

CVPR 2025arXiv:2411.11934
10
citations
#2876

DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation

Jing He, Haodong Li, huyongzhe et al.

ICLR 2025arXiv:2410.02067
10
citations
#2877

Training-Free Guidance Beyond Differentiability: Scalable Path Steering with Tree Search in Diffusion and Flow Models

Yingqing Guo, Yukang Yang, Hui Yuan et al.

NEURIPS 2025arXiv:2502.11420
10
citations
#2878

CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering

Tianyu Huai, Jie Zhou, Xingjiao Wu et al.

CVPR 2025highlightarXiv:2503.00413
10
citations
#2879

GRPose: Learning Graph Relations for Human Image Generation with Pose Priors

Xiangchen Yin, Donglin Di, Lei Fan et al.

AAAI 2025paperarXiv:2408.16540
10
citations
#2880

Light-T2M: A Lightweight and Fast Model for Text-to-motion Generation

Ling-An Zeng, Guohong Huang, Gaojie Wu et al.

AAAI 2025paperarXiv:2412.11193
10
citations
#2881

X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing

Xinyan Chen, Jianfei Yang

ICLR 2025arXiv:2410.10167
10
citations
#2882

Trajectory Mamba: Efficient Attention-Mamba Forecasting Model Based on Selective SSM

Yizhou Huang, Yihua Cheng, Kezhi Wang

CVPR 2025arXiv:2503.10898
10
citations
#2883

Measuring memorization in RLHF for code completion

Jamie Hayes, I Shumailov, Billy Porter et al.

ICLR 2025arXiv:2406.11715
10
citations
#2884

Visual Test-time Scaling for GUI Agent Grounding

Tiange Luo, Lajanugen Logeswaran, Justin Johnson et al.

ICCV 2025highlightarXiv:2505.00684
10
citations
#2885

PhyMPGN: Physics-encoded Message Passing Graph Network for spatiotemporal PDE systems

Bocheng Zeng, Qi Wang, Mengtao Yan et al.

ICLR 2025oralarXiv:2410.01337
10
citations
#2886

Hybrid Global-Local Representation with Augmented Spatial Guidance for Zero-Shot Referring Image Segmentation

Ting Liu, Siyuan Li

CVPR 2025arXiv:2504.00356
10
citations
#2887

Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning

Qitao Tan, Jun Liu, Zheng Zhan et al.

NEURIPS 2025arXiv:2502.03304
10
citations
#2888

HOIGPT: Learning Long-Sequence Hand-Object Interaction with Language Models

Mingzhen Huang, Fu-Jen Chu, Bugra Tekin et al.

CVPR 2025arXiv:2503.19157
10
citations
#2889

Layered Image Vectorization via Semantic Simplification

Zhenyu Wang, Jianxi Huang, Zhida Sun et al.

CVPR 2025arXiv:2406.05404
10
citations
#2890

Radiology Report Generation via Multi-objective Preference Optimization

Ting Xiao, Lei Shi, Peng Liu et al.

AAAI 2025paperarXiv:2412.08901
10
citations
#2891

GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration

Yuchen Sun, Shanhui Zhao, Tao Yu et al.

CVPR 2025arXiv:2503.17709
10
citations
#2892

Task-Agnostic Guided Feature Expansion for Class-Incremental Learning

Bowen Zheng, Da-Wei Zhou, Han-Jia Ye et al.

CVPR 2025arXiv:2503.00823
10
citations
#2893

VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding

Yujie Liang, Xiaobin Hu, Boyuan Jiang et al.

CVPR 2025arXiv:2408.12340
10
citations
#2894

Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM

Yatai Ji, Jiacheng Zhang, Jie Wu et al.

ICCV 2025arXiv:2412.15156
10
citations
#2895

CCIN: Compositional Conflict Identification and Neutralization for Composed Image Retrieval

Likai Tian, Jian Zhao, Zechao Hu et al.

CVPR 2025highlight
10
citations
#2896

Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think

Zhenyi Lu, Xiaoye Qu, Zhenyi Lu et al.

CVPR 2025highlightarXiv:2503.00948
10
citations
#2897

Rewind-to-Delete: Certified Machine Unlearning for Nonconvex Functions

Siqiao Mu, Diego Klabjan

NEURIPS 2025arXiv:2409.09778
10
citations
#2898

Text-to-CAD Generation Through Infusing Visual Feedback in Large Language Models

Ruiyu Wang, Yu Yuan, Shizhao Sun et al.

ICML 2025arXiv:2501.19054
10
citations
#2899

Node-Time Conditional Prompt Learning in Dynamic Graphs

Xingtong Yu, Zhenghao Liu, Xinming Zhang et al.

ICLR 2025oralarXiv:2405.13937
10
citations
#2900

Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model

Dongki Kim, Wonbin Lee, Sung Ju Hwang

NEURIPS 2025arXiv:2502.13449
10
citations
#2901

Spectral Image Tokenizer

Carlos Esteves, Mohammed Suhail, Ameesh Makadia

ICCV 2025arXiv:2412.09607
10
citations
#2902

VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation

Wenhao Wang, Yi Yang

NEURIPS 2025arXiv:2503.01739
10
citations
#2903

Reliable and Efficient Amortized Model-based Evaluation

Sang Truong, Yuheng Tu, Percy Liang et al.

ICML 2025arXiv:2503.13335
10
citations
#2904

RomanTex: Decoupling 3D-aware Rotary Positional Embedded Multi-Attention Network for Texture Synthesis

Yifei Feng, Mx Yang, Shuhui Yang et al.

ICCV 2025arXiv:2503.19011
10
citations
#2905

Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances

Yi Yu, Botao Ren, Peiyuan Zhang et al.

CVPR 2025arXiv:2502.04268
10
citations
#2906

DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents

Hao Li, Xiaogeng Liu, CHIU Chun et al.

NEURIPS 2025arXiv:2506.12104
10
citations
#2907

Attention as a Hypernetwork

Simon Schug, Seijin Kobayashi, Yassir Akram et al.

ICLR 2025arXiv:2406.05816
10
citations
#2908

Beyond Graphs: Can Large Language Models Comprehend Hypergraphs?

Yifan Feng, Chengwu Yang, Xingliang Hou et al.

ICLR 2025arXiv:2410.10083
10
citations
#2909

Visual Generation Without Guidance

Huayu Chen, Kai Jiang, Kaiwen Zheng et al.

ICML 2025arXiv:2501.15420
10
citations
#2910

CLEVER: A Curated Benchmark for Formally Verified Code Generation

Amitayush Thakur, Jasper Lee, George Tsoukalas et al.

NEURIPS 2025arXiv:2505.13938
10
citations
#2911

Tuning the Frequencies: Robust Training for Sinusoidal Neural Networks

Tiago Novello, Diana Aldana Moreno, André Araujo et al.

CVPR 2025highlightarXiv:2407.21121
10
citations
#2912

The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization

Jae-Won Chung, Jeff J. Ma, Ruofan Wu et al.

NEURIPS 2025spotlightarXiv:2505.06371
10
citations
#2913

ADBA: Approximation Decision Boundary Approach for Black-Box Adversarial Attacks

Feiyang Wang, Xingquan Zuo, Hai Huang et al.

AAAI 2025paper
10
citations
#2914

Needle Threading: Can LLMs Follow Threads Through Near-Million-Scale Haystacks?

Jonathan Roberts, Kai Han, Samuel Albanie

ICLR 2025arXiv:2411.05000
10
citations
#2915

AlphaPO: Reward Shape Matters for LLM Alignment

Aman Gupta, Shao Tang, Qingquan Song et al.

ICML 2025arXiv:2501.03884
10
citations
#2916

Taylor Series-Inspired Local Structure Fitting Network for Few-shot Point Cloud Semantic Segmentation

Changshuo Wang, Shuting He, Xiang Fang et al.

AAAI 2025paperarXiv:2504.02454
10
citations
#2917

Unifying 2D and 3D Vision-Language Understanding

Ayush Jain, Alexander Swerdlow, Yuzhou Wang et al.

ICML 2025arXiv:2503.10745
10
citations
#2918

A Closer Look at TabPFN v2: Understanding Its Strengths and Extending Its Capabilities

Han-Jia Ye, Si-Yang Liu, Wei-Lun (Harry) Chao

NEURIPS 2025arXiv:2502.17361
10
citations
#2919

Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy

Yangsibo Huang, Daogao Liu, Lynn Chua et al.

ICLR 2025arXiv:2410.09591
10
citations
#2920

QERA: an Analytical Framework for Quantization Error Reconstruction

Cheng Zhang, Jeffrey T. H. Wong, Can Xiao et al.

ICLR 2025arXiv:2410.06040
10
citations
#2921

Transformer Copilot: Learning from The Mistake Log in LLM Fine-tuning

Jiaru Zou, Yikun Ban, Zihao Li et al.

NEURIPS 2025spotlightarXiv:2505.16270
10
citations
#2922

CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models

Hao He, Ceyuan Yang, Shanchuan Lin et al.

ICCV 2025arXiv:2503.10592
10
citations
#2923

DreamRelation: Bridging Customization and Relation Generation

Qingyu Shi, Lu Qi, Jianzong Wu et al.

CVPR 2025arXiv:2410.23280
10
citations
#2924

Periodic Materials Generation using Text-Guided Joint Diffusion Model

Kishalay Das, Subhojyoti Khastagir, Pawan Goyal et al.

ICLR 2025arXiv:2503.00522
10
citations
#2925

FormalAlign: Automated Alignment Evaluation for Autoformalization

Jianqiao Lu, Yingjia Wan, Yinya Huang et al.

ICLR 2025arXiv:2410.10135
10
citations
#2926

PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation

Qihan Huang, Weilong Dai, Jinlong Liu et al.

CVPR 2025arXiv:2412.03177
10
citations
#2927

Scaling Laws for Optimal Data Mixtures

Mustafa Shukor, Louis Bethune, Dan Busbridge et al.

NEURIPS 2025arXiv:2507.09404
10
citations
#2928

Boosting Latent Diffusion with Perceptual Objectives

Tariq Berrada, Pietro Astolfi, Melissa Hall et al.

ICLR 2025arXiv:2411.04873
10
citations
#2929

AdaRankGrad: Adaptive Gradient Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning

Yehonathan Refael, Jonathan Svirsky, Boris Shustin et al.

ICLR 2025arXiv:2410.17881
10
citations
#2930

UCF-Crime-DVS: A Novel Event-Based Dataset for Video Anomaly Detection with Spiking Neural Networks

Yuanbin Qian, Shuhan Ye, Chong Wang et al.

AAAI 2025paperarXiv:2503.12905
10
citations
#2931

Attention layers provably solve single-location regression

Pierre Marion, Raphaël Berthier, Gérard Biau et al.

ICLR 2025arXiv:2410.01537
10
citations
#2932

Efficient Attention-Sharing Information Distillation Transformer for Lightweight Single Image Super-Resolution

Karam Park, Jae Woong Soh, Nam Ik Cho

AAAI 2025paperarXiv:2501.15774
10
citations
#2933

Nautilus: Locality-aware Autoencoder for Scalable Mesh Generation

Yuxuan Wang, Xuanyu Yi, Haohan Weng et al.

ICCV 2025arXiv:2501.14317
10
citations
#2934

X-Dancer: Expressive Music to Human Dance Video Generation

Zeyuan Chen, Hongyi Xu, Guoxian Song et al.

ICCV 2025highlightarXiv:2502.17414
10
citations
#2935

Topological Blindspots: Understanding and Extending Topological Deep Learning Through the Lens of Expressivity

Yam Eitan, Yoav Gelberg, Guy Bar-Shalom et al.

ICLR 2025arXiv:2408.05486
10
citations
#2936

DM-Adapter: Domain-Aware Mixture-of-Adapters for Text-Based Person Retrieval

Yating Liu, Zimo Liu, Xiangyuan Lan et al.

AAAI 2025paperarXiv:2503.04144
10
citations
#2937

RayFlow: Instance-Aware Diffusion Acceleration via Adaptive Flow Trajectories

Huiyang Shao, Xin Xia, Yuhong Yang et al.

CVPR 2025arXiv:2503.07699
10
citations
#2938

VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis

Yumeng Li, William H Beluch, Margret Keuper et al.

ICLR 2025oralarXiv:2403.13501
10
citations
#2939

Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds

Zhiyong Wang, Dongruo Zhou, John C.S. Lui et al.

ICLR 2025arXiv:2408.08994
10
citations
#2940

Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation

Minghan Chen, Guikun Chen, Wenguan Wang et al.

ICLR 2025arXiv:2409.10262
10
citations
#2941

Open-World Reinforcement Learning over Long Short-Term Imagination

Jiajian Li, Qi Wang, Yunbo Wang et al.

ICLR 2025arXiv:2410.03618
10
citations
#2942

PLeaS - Merging Models with Permutations and Least Squares

Anshul Nasery, Jonathan Hayase, Pang Wei Koh et al.

CVPR 2025arXiv:2407.02447
10
citations
#2943

Efficient Gaussian Splatting for Monocular Dynamic Scene Rendering via Sparse Time-Variant Attribute Modeling

Hanyang Kong, Xingyi Yang, Xinchao Wang

AAAI 2025paperarXiv:2502.20378
10
citations
#2944

Fast training and sampling of Restricted Boltzmann Machines

Nicolas BEREUX, Aurélien Decelle, Cyril Furtlehner et al.

ICLR 2025arXiv:2405.15376
10
citations
#2945

Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model

Keda TAO, Jinjin Gu, Yulun Zhang et al.

ICLR 2025arXiv:2410.04161
10
citations
#2946

On the Crucial Role of Initialization for Matrix Factorization

Bingcong Li, Liang Zhang, Aryan Mokhtari et al.

ICLR 2025arXiv:2410.18965
10
citations
#2947

NeSyCoCo: A Neuro-Symbolic Concept Composer for Compositional Generalization

Danial Kamali, Elham J. Barezi, Parisa Kordjamshidi

AAAI 2025paperarXiv:2412.15588
10
citations
#2948

Can Transformers Reason Logically? A Study in SAT Solving

Leyan Pan, Vijay Ganesh, Jacob Abernethy et al.

ICML 2025arXiv:2410.07432
10
citations
#2949

UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence

Jie Feng, Shengyuan Wang, Tianhui Liu et al.

ICCV 2025arXiv:2506.23219
10
citations
#2950

Towards Federated RLHF with Aggregated Client Preference for LLMs

Feijie Wu, Xiaoze Liu, Haoyu Wang et al.

ICLR 2025arXiv:2407.03038
10
citations
#2951

NTPP: Generative Speech Language Modeling for Dual-Channel Spoken Dialogue via Next-Token-Pair Prediction

Qichao Wang, Ziqiao Meng, Wenqian Cui et al.

ICML 2025arXiv:2506.00975
10
citations
#2952

ROADWork: A Dataset and Benchmark for Learning to Recognize, Observe, Analyze and Drive Through Work Zones

Anurag Ghosh, Shen Zheng, Robert Tamburo et al.

ICCV 2025arXiv:2406.07661
10
citations
#2953

RoMo: Robust Motion Segmentation Improves Structure from Motion

Lily Goli, Sara Sabour, Mark Matthews et al.

ICCV 2025arXiv:2411.18650
10
citations
#2954

Constrain Alignment with Sparse Autoencoders

Qingyu Yin, Chak Tou Leong, Hongbo Zhang et al.

ICML 2025arXiv:2411.07618
10
citations
#2955

Adaptive Self-improvement LLM Agentic System for ML Library Development

Genghan Zhang, Weixin Liang, Olivia Hsu et al.

ICML 2025arXiv:2502.02534
10
citations
#2956

Scaling Inference-Efficient Language Models

Song Bian, Minghao Yan, Shivaram Venkataraman

ICML 2025arXiv:2501.18107
10
citations
#2957

AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation

Zijie Wu, Chaohui Yu, Fan Wang et al.

ICCV 2025arXiv:2506.09982
10
citations
#2958

AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs

Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta et al.

ICCV 2025arXiv:2501.02135
10
citations
#2959

FedMIA: An Effective Membership Inference Attack Exploiting "All for One" Principle in Federated Learning

Gongxi Zhu, Donghao Li, Hanlin Gu et al.

CVPR 2025
10
citations
#2960

Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution

Qihao Liu, Xi Yin, Alan L. Yuille et al.

CVPR 2025highlightarXiv:2412.15213
10
citations
#2961

Plastic Learning with Deep Fourier Features

Alex Lewandowski, Dale Schuurmans, Marlos C. Machado

ICLR 2025arXiv:2410.20634
10
citations
#2962

MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection

Bokai Lin, Zihao Zeng, Zipeng Xiao et al.

ICLR 2025arXiv:2410.14731
10
citations
#2963

DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation

Zhiqiang Shen, Ammar Sherif, Zeyuan Yin et al.

CVPR 2025arXiv:2411.19946
10
citations
#2964

CADDreamer: CAD Object Generation from Single-view Images

Yuan Li, Cheng Lin, Yuan Liu et al.

CVPR 2025highlightarXiv:2502.20732
10
citations
#2965

Momentum-SAM: Sharpness Aware Minimization without Computational Overhead

Marlon Becker, Frederick Altrock, Benjamin Risse

NEURIPS 2025arXiv:2401.12033
10
citations
#2966

Fragment and Geometry Aware Tokenization of Molecules for Structure-Based Drug Design Using Language Models

Cong Fu, Xiner Li, Blake Olson et al.

ICLR 2025arXiv:2408.09730
10
citations
#2967

ObjectMover: Generative Object Movement with Video Prior

Xin Yu, Tianyu Wang, Soo Ye Kim et al.

CVPR 2025arXiv:2503.08037
10
citations
#2968

STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding

Zichen Liu, Kunlun Xu, Bing Su et al.

CVPR 2025arXiv:2503.15973
10
citations
#2969

TACO: Taming Diffusion for in-the-wild Video Amodal Completion

Ruijie Lu, Yixin Chen, Yu Liu et al.

ICCV 2025arXiv:2503.12049
10
citations
#2970

FilterTS: Comprehensive Frequency Filtering for Multivariate Time Series Forecasting

Yulong Wang, Yushuo Liu, Xiaoyi Duan et al.

AAAI 2025paperarXiv:2505.04158
10
citations
#2971

ThermalGaussian: Thermal 3D Gaussian Splatting

Rongfeng Lu, Hangyu Chen, Zunjie Zhu et al.

ICLR 2025arXiv:2409.07200
10
citations
#2972

Gemstones: A Model Suite for Multi-Faceted Scaling Laws

Sean McLeish, John Kirchenbauer, David Miller et al.

NEURIPS 2025arXiv:2502.06857
10
citations
#2973

What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis

Weronika Ormaniec, Felix Dangel, Sidak Pal Singh

ICLR 2025arXiv:2410.10986
10
citations
#2974

Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model Using 3D Whole-Body CT Scans

Heng Guo, Jianfeng Zhang, Jiaxing Huang et al.

AAAI 2025paperarXiv:2403.15063
10
citations
#2975

RANKCLIP: Ranking-Consistent Language-Image Pretraining

Yiming Zhang, Zhuokai Zhao, Zhaorun Chen et al.

ICCV 2025arXiv:2404.09387
10
citations
#2976

Intrinsic User-Centric Interpretability through Global Mixture of Experts

Vinitra Swamy, Syrielle Montariol, Julian Blackwell et al.

ICLR 2025arXiv:2402.02933
10
citations
#2977

Reconstructing People, Places, and Cameras

Lea Müller, Hongsuk Choi, Anthony Zhang et al.

CVPR 2025highlightarXiv:2412.17806
10
citations
#2978

EgoExoBench: A Benchmark for First- and Third-person View Video Understanding in MLLMs

Yuping He, Yifei Huang, Guo Chen et al.

NEURIPS 2025oralarXiv:2507.18342
10
citations
#2979

GraphAvatar: Compact Head Avatars with GNN-Generated 3D Gaussians

Xiaobao Wei, Peng Chen, Ming Lu et al.

AAAI 2025paperarXiv:2412.13983
10
citations
#2980

Enhancing Multilingual LLM Pretraining with Model-Based Data Selection

Bettina Messmer, Vinko Sabolčec, Martin Jaggi

NEURIPS 2025arXiv:2502.10361
10
citations
#2981

MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents

Lukas Aichberger, Alasdair Paren, Guohao Li et al.

NEURIPS 2025arXiv:2503.10809
10
citations
#2982

Hierarchical Mixture of Experts: Generalizable Learning for High-Level Synthesis

Weikai Li, Ding Wang, Zijian Ding et al.

AAAI 2025paperarXiv:2410.19225
10
citations
#2983

A Recipe for Generating 3D Worlds from a Single Image

Katja Schwarz, Denis Rozumny, Samuel Rota Bulò et al.

ICCV 2025arXiv:2503.16611
10
citations
#2984

BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation

Yuyang Peng, Shishi Xiao, Keming Wu et al.

CVPR 2025arXiv:2503.20672
10
citations
#2985

Bayesian Concept Bottleneck Models with LLM Priors

Jean Feng, Avni Kothari, Lucas Zier et al.

NEURIPS 2025arXiv:2410.15555
10
citations
#2986

Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference

Qining Zhang, Lei Ying

ICLR 2025arXiv:2409.17401
10
citations
#2987

DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding

Wenhui Liao, Jiapeng Wang, Hongliang Li et al.

CVPR 2025arXiv:2408.15045
10
citations
#2988

DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers

Hanling Zhang, Rundong Su, Zhihang Yuan et al.

ICCV 2025arXiv:2503.22796
10
citations
#2989

Atlas Gaussians Diffusion for 3D Generation

Haitao Yang, Yuan Dong, Hanwen Jiang et al.

ICLR 2025arXiv:2408.13055
10
citations
#2990

Splatter-360: Generalizable 360 Gaussian Splatting for Wide-baseline Panoramic Images

Zheng Chen, Chenming Wu, Zhelun Shen et al.

CVPR 2025
10
citations
#2991

InsightEdit: Towards Better Instruction Following for Image Editing

Yingjing Xu, Jie Kong, Jiazhi Wang et al.

CVPR 2025arXiv:2411.17323
10
citations
#2992

Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy

You Li, Fan Ma, Yi Yang

CVPR 2025arXiv:2411.16752
10
citations
#2993

CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models

Qinsi Wang, Hancheng Ye, Ming-Yu Chung et al.

ICML 2025arXiv:2505.19235
10
citations
#2994

CholecTrack20: A Multi-Perspective Tracking Dataset for Surgical Tools

Chinedu Innocent Nwoye, Kareem Elgohary, Anvita A. Srinivas et al.

CVPR 2025arXiv:2312.07352
9
citations
#2995

LoRA Subtraction for Drift-Resistant Space in Exemplar-Free Continual Learning

Xuan Liu, Xiaobin Chang

CVPR 2025arXiv:2503.18985
9
citations
#2996

GenDeg: Diffusion-based Degradation Synthesis for Generalizable All-In-One Image Restoration

Sudarshan Rajagopalan, Nithin Gopalakrishnan Nair, Jay Paranjape et al.

CVPR 2025arXiv:2411.17687
9
citations
#2997

RSafe: Incentivizing proactive reasoning to build robust and adaptive LLM safeguards

Jingnan Zheng, Xiangtian Ji, Yijun Lu et al.

NEURIPS 2025arXiv:2506.07736
9
citations
#2998

Energy-based Backdoor Defense Against Federated Graph Learning

Guancheng Wan, Zitong Shi, Wenke Huang et al.

ICLR 2025
9
citations
#2999

SceneTAP: Scene-Coherent Typographic Adversarial Planner against Vision-Language Models in Real-World Environments

Yue Cao, Yun Xing, Jie Zhang et al.

CVPR 2025arXiv:2412.00114
9
citations
#3000

FisherTune: Fisher-Guided Robust Tuning of Vision Foundation Models for Domain Generalized Segmentation

Dong Zhao, Jinlong Li, Shuang Wang et al.

CVPR 2025arXiv:2503.17940
9
citations