Most Cited 2025 "perplexity" Papers

22,274 papers found • Page 41 of 112

#8001

SDFit: 3D Object Pose and Shape by Fitting a Morphable SDF to a Single Image

Dimitrije Antić, Georgios Paschalidis, Shashank Tripathi et al.

ICCV 2025 • arXiv:2409.16178
5 citations
#8002

Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis

Kaiyang Ji, Ye Shi, Zichen Jin et al.

ICCV 2025 • highlight • arXiv:2508.02106
5 citations
#8003

EgoM2P: Egocentric Multimodal Multitask Pretraining

Gen Li, Yutong Chen, Yiqian Wu et al.

ICCV 2025 • arXiv:2506.07886
5 citations
#8004

Temporal Unlearnable Examples: Preventing Personal Video Data from Unauthorized Exploitation by Object Tracking

Qiangqiang Wu, Yi Yu, Chenqi Kong et al.

ICCV 2025 • arXiv:2507.07483
5 citations
#8005

G-DexGrasp: Generalizable Dexterous Grasping Synthesis Via Part-Aware Prior Retrieval and Prior-Assisted Generation

Juntao Jian, Xiuping Liu, Zixuan Chen et al.

ICCV 2025 • arXiv:2503.19457
5 citations
#8006

DH-FaceVid-1K: A Large-Scale High-Quality Dataset for Face Video Generation

Donglin Di, He Feng, Wenzhang Sun et al.

ICCV 2025 • arXiv:2410.07151
5 citations
#8007

Adaptive Hyper-Graph Convolution Network for Skeleton-based Human Action Recognition with Virtual Connections

Youwei Zhou, Tianyang Xu, Cong Wu et al.

ICCV 2025 • arXiv:2411.14796
5 citations
#8008

FaceLift: Learning Generalizable Single Image 3D Face Reconstruction from Synthetic Heads

Weijie Lyu, Yi Zhou, Ming-Hsuan Yang et al.

ICCV 2025 • arXiv:2412.17812
5 citations
#8009

Precise Action-to-Video Generation Through Visual Action Prompts

Yuang Wang, Chao Wen, Haoyu Guo et al.

ICCV 2025 • arXiv:2508.13104
5 citations
#8010

Latent-Reframe: Enabling Camera Control for Video Diffusion Models without Training

Zhenghong Zhou, Jie An, Jiebo Luo

ICCV 2025 • arXiv:2412.06029
5 citations
#8011

Frequency-Guided Posterior Sampling for Diffusion-Based Image Restoration

Darshan Thaker, Abhishek Goyal, Rene Vidal

ICCV 2025 • arXiv:2411.15295
5 citations
#8012

A Quality-Guided Mixture of Score-Fusion Experts Framework for Human Recognition

Jie Zhu, Yiyang Su, Minchul Kim et al.

ICCV 2025 • arXiv:2508.00053
5 citations
#8013

Motion Synthesis with Sparse and Flexible Keyjoint Control

Inwoo Hwang, Jinseok Bae, Donggeun Lim et al.

ICCV 2025 • arXiv:2503.15557
5 citations
#8014

IDFace: Face Template Protection for Efficient and Secure Identification

Sunpill Kim, Seunghun Paik, Chanwoo Hwang et al.

ICCV 2025 • arXiv:2507.12050
5 citations
#8015

MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh

Shuangkang Fang, I-Chao Shen, Yufeng Wang et al.

ICCV 2025 • highlight • arXiv:2508.01242
5 citations
#8016

2HandedAfforder: Learning Precise Actionable Bimanual Affordances from Human Videos

Marvin Heidinger, Snehal Jauhri, Vignesh Prasad et al.

ICCV 2025 • arXiv:2503.09320
5 citations
#8017

Edicho: Consistent Image Editing in the Wild

Qingyan Bai, Hao Ouyang, Yinghao Xu et al.

ICCV 2025 • arXiv:2412.21079
5 citations
#8018

DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization

Wenchuan Wang, Mengqi Huang, Yijing Tu et al.

ICCV 2025 • arXiv:2505.02192
5 citations
#8019

Versatile Transition Generation with Image-to-Video Diffusion

Zuhao Yang, Jiahui Zhang, Yingchen Yu et al.

ICCV 2025 • arXiv:2508.01698
5 citations
#8020

AnyI2V: Animating Any Conditional Image with Motion Control

Ziye Li, Xincheng Shuai, Hao Luo et al.

ICCV 2025 • arXiv:2507.02857
5 citations
#8021

The Silent Assistant: NoiseQuery as Implicit Guidance for Goal-Driven Image Generation

Ruoyu Wang, Huayang Huang, Ye Zhu et al.

ICCV 2025 • highlight • arXiv:2412.05101
5 citations
#8022

Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy

Junhao Wei, Yu Zhe, Jun Sakuma

ICCV 2025 • arXiv:2503.07661
5 citations
#8023

Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers

Divyansh Srivastava, Xiang Zhang, He Wen et al.

ICCV 2025 • arXiv:2505.04718
5 citations
#8024

Contrastive Test-Time Composition of Multiple LoRA Models for Image Generation

Tuna Meral, Enis Simsar, Federico Tombari et al.

ICCV 2025 • highlight • arXiv:2403.19776
5 citations
#8025

Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation

Yujie Zhang, Bingyang Cui, Qi Yang et al.

ICCV 2025 • arXiv:2412.11170
5 citations
#8026

Dense2MoE: Restructuring Diffusion Transformer to MoE for Efficient Text-to-Image Generation

Youwei Zheng, Yuxi Ren, Xin Xia et al.

ICCV 2025 • arXiv:2510.09094
5 citations
#8027

Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing

Joonghyuk Shin, Alchan Hwang, Yujin Kim et al.

ICCV 2025 • arXiv:2508.07519
5 citations
#8028

EDiT: Efficient Diffusion Transformers with Linear Compressed Attention

Philipp Becker, Abhinav Mehrotra, Ruchika Chavhan et al.

ICCV 2025 • arXiv:2503.16726
5 citations
#8029

Towards Video Thinking Test: A Holistic Benchmark for Advanced Video Reasoning and Understanding

Yuanhan Zhang, Yunice Chew, Yuhao Dong et al.

ICCV 2025 • arXiv:2507.15028
5 citations
#8030

MultiverSeg: Scalable Interactive Segmentation of Biomedical Imaging Datasets with In-Context Guidance

Hallee Wong, Jose Javier Gonzalez Ortiz, John Guttag et al.

ICCV 2025 • arXiv:2412.15058
5 citations
#8031

DisTime: Distribution-based Time Representation for Video Large Language Models

Yingsen Zeng, Zepeng Huang, Yujie Zhong et al.

ICCV 2025 • arXiv:2505.24329
5 citations
#8032

Dynamic Dictionary Learning for Remote Sensing Image Segmentation

Xuechao Zou, Yue Li, Shun Zhang et al.

ICCV 2025 • arXiv:2503.06683
5 citations
#8033

Streaming VideoLLMs for Real-Time Procedural Video Understanding

Dibyadip Chatterjee, Edoardo Remelli, Yale Song et al.

ICCV 2025 • arXiv:2504.13915
5 citations
#8034

SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs

Jiahui Wang, Zuyan Liu, Yongming Rao et al.

ICCV 2025 • arXiv:2506.05344
5 citations
#8035

From Trial to Triumph: Advancing Long Video Understanding via Visual Context Sample Scaling and Self-reward Alignment

Yucheng Suo, Fan Ma, Linchao Zhu et al.

ICCV 2025 • arXiv:2503.20472
5 citations
#8036

VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization

Sihan Yang, Runsen Xu, Chenhang Cui et al.

ICCV 2025 • arXiv:2508.05211
5 citations
#8037

Scaling Tumor Segmentation: Best Lessons from Real and Synthetic Data

Qi Chen, Xinze Zhou, Chen Liu et al.

ICCV 2025 • arXiv:2510.14831
5 citations
#8038

SimMLM: A Simple Framework for Multi-modal Learning with Missing Modality

Sijie Li, Chen Chen, Jungong Han

ICCV 2025 • arXiv:2507.19264
5 citations
#8039

Street Gaussians without 3D Object Tracker

Ruida Zhang, Chengxi Li, Chenyangguang Zhang et al.

ICCV 2025 • arXiv:2412.05548
5 citations
#8040

StochasticSplats: Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting

Shakiba Kheradmand, Delio Vicini, George Kopanas et al.

ICCV 2025 • arXiv:2503.24366
5 citations
#8041

Curve-Aware Gaussian Splatting for 3D Parametric Curve Reconstruction

Zhirui Gao, Renjiao Yi, YaQiao Dai et al.

ICCV 2025 • arXiv:2506.21401
5 citations
#8042

AAA-Gaussians: Anti-Aliased and Artifact-Free 3D Gaussian Rendering

Michael Steiner, Thomas Köhler, Lukas Radl et al.

ICCV 2025 • highlight • arXiv:2504.12811
5 citations
#8043

Sat2City: 3D City Generation from A Single Satellite Image with Cascaded Latent Diffusion

Tongyan Hua, Lutao Jiang, Ying-Cong Chen et al.

ICCV 2025 • arXiv:2507.04403
5 citations
#8044

Leveraging BEV Paradigm for Ground-to-Aerial Image Synthesis

Junyan Ye, Jun He, Weijia Li et al.

ICCV 2025 • arXiv:2408.01812
5 citations
#8045

Unraveling the Effects of Synthetic Data on End-to-End Autonomous Driving

Junhao Ge, Zuhong Liu, Longteng Fan et al.

ICCV 2025 • arXiv:2503.18108
5 citations
#8046

MikuDance: Animating Character Art with Mixed Motion Dynamics

Jiaxu Zhang, Xianfang Zeng, Xin Chen et al.

ICCV 2025 • arXiv:2411.08656
5 citations
#8047

ObjectRelator: Enabling Cross-View Object Relation Understanding Across Ego-Centric and Exo-Centric Perspectives

Yuqian Fu, Runze Wang, Bin Ren et al.

ICCV 2025 • highlight • arXiv:2411.19083
5 citations
#8048

CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation

Zhuoyan Luo, Yinghao Wu, Tianheng Cheng et al.

ICCV 2025 • arXiv:2405.15658
5 citations
#8049

Not All Frame Features Are Equal: Video-to-4D Generation via Decoupling Dynamic-Static Features

Liying Yang, Chen Liu, Zhenwei Zhu et al.

ICCV 2025 • highlight • arXiv:2502.08377
5 citations
#8050

Synergistic Prompting for Robust Visual Recognition with Missing Modalities

Zhihui Zhang, Luanyuan Dai, Qika Lin et al.

ICCV 2025 • arXiv:2507.07802
5 citations
#8051

PerLDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Model

Jinhua Zhang, Hualian Sheng, Sijia Cai et al.

ICCV 2025 • arXiv:2407.06109
5 citations
#8052

HERO: Human Reaction Generation from Videos

Chengjun Yu, Wei Zhai, Yuhang Yang et al.

ICCV 2025 • arXiv:2503.08270
5 citations
#8053

Improving Multimodal Learning via Imbalanced Learning

Shicai Wei, Chunbo Luo, Yang Luo

ICCV 2025 • arXiv:2507.10203
5 citations
#8054

InterGSEdit: Interactive 3D Gaussian Splatting Editing with 3D Geometry-Consistent Attention Prior

Minghao Wen, Shengjie Wu, Kangkan Wang et al.

ICCV 2025 • arXiv:2507.04961
5 citations
#8055

CaO2: Rectifying Inconsistencies in Diffusion-Based Dataset Distillation

Haoxuan Wang, Zhenghao Zhao, Junyi Wu et al.

ICCV 2025
5 citations
#8056

CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning

Duo Wu, Jinghe Wang, Yuan Meng et al.

ICCV 2025 • arXiv:2411.16313
5 citations
#8057

Learning Normal Flow Directly From Events

Dehao Yuan, Levi Burner, Jiayi Wu et al.

ICCV 2025 • arXiv:2412.11284
5 citations
#8058

Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning

Qi Wang, Zhipeng Zhang, Baao Xie et al.

ICCV 2025 • arXiv:2503.08751
5 citations
#8059

DiffVSR: Revealing an Effective Recipe for Taming Robust Video Super-Resolution Against Complex Degradations

Xiaohui Li, Yihao Liu, Shuo Cao et al.

ICCV 2025 • arXiv:2501.10110
5 citations
#8060

SpatialCrafter: Unleashing the Imagination of Video Diffusion Models for Scene Reconstruction from Limited Observations

Songchun Zhang, Huiyao Xu, Sitong Guo et al.

ICCV 2025 • arXiv:2505.11992
5 citations
#8061

TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition

Xingsong Ye, Yongkun Du, Yunbo Tao et al.

ICCV 2025 • arXiv:2412.01137
5 citations
#8062

MOSCATO: Predicting Multiple Object State Change Through Actions

Parnian Zameni, Yuhan Shen, Ehsan Elhamifar

ICCV 2025
5 citations
#8063

AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction

Junhao Cheng, Yuying Ge, Yixiao Ge et al.

ICCV 2025 • arXiv:2504.01014
5 citations
#8064

Exploring the Visual Feature Space for Multimodal Neural Decoding

Weihao Xia, Cengiz Oztireli

ICCV 2025 • arXiv:2505.15755
5 citations
#8065

RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping

Dongming Wu, Yanping Fu, Saike Huang et al.

ICCV 2025 • arXiv:2507.23734
5 citations
#8066

Scendi Score: Prompt‑Aware Diversity Evaluation via Schur Complement of CLIP Embeddings

Azim Ospanov, Mohammad Jalali, Farzan Farnia

ICCV 2025 • highlight • arXiv:2412.18645
5 citations
#8067

VisNumBench: Evaluating Number Sense of Multimodal Large Language Models

Tengjin Weng, Jingyi Wang, Wenhao Jiang et al.

ICCV 2025 • arXiv:2503.14939
5 citations
#8068

NeurOp-Diff: Continuous Remote Sensing Image Super-Resolution via Neural Operator Diffusion

Zihao Xu, Yuzhi Tang, Bowen Xu et al.

ICCV 2025
5 citations
#8069

Φ-GAN: Physics-Inspired GAN for Generating SAR Images Under Limited Data

Xidan Zhang, Yihan Zhuang, Qian Guo et al.

ICCV 2025
5 citations
#8070

GaSLight: Gaussian Splats for Spatially-Varying Lighting in HDR

Christophe Bolduc, Yannick Hold-Geoffroy, Jean-Francois Lalonde

ICCV 2025 • arXiv:2504.10809
5 citations
#8071

GUAVA: Generalizable Upper Body 3D Gaussian Avatar

Dongbin Zhang, Yunfei Liu, Lijian Lin et al.

ICCV 2025 • arXiv:2505.03351
5 citations
#8072

DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer

Yecheng Wu, Han Cai, Junyu Chen et al.

ICCV 2025 • arXiv:2507.04947
5 citations
#8073

X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation

Jian Ma, Qirong Peng, Xu Guo et al.

ICCV 2025 • arXiv:2503.06134
5 citations
#8074

WaveMamba: Wavelet-Driven Mamba Fusion for RGB-Infrared Object Detection

Haodong Zhu, Wenhao Dong, Linlin Yang et al.

ICCV 2025 • arXiv:2507.18173
5 citations
#8075

SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts

Gengze Zhou, Yicong Hong, Zun Wang et al.

ICCV 2025 • arXiv:2412.05552
5 citations
#8076

DC-ControlNet: Decoupling Inter- and Intra-Element Conditions in Image Generation with Diffusion Models

Hongji Yang, Wencheng Han, Yucheng Zhou et al.

ICCV 2025 • arXiv:2502.14779
5 citations
#8077

Adding Additional Control to One-Step Diffusion with Joint Distribution Matching

Yihong Luo, Tianyang Hu, Yifan Song et al.

ICCV 2025 • arXiv:2503.06652
5 citations
#8078

Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration

Junyuan Deng, Wei Yin, Xiaoyang Guo et al.

ICCV 2025 • arXiv:2411.17240
5 citations
#8079

SignRep: Enhancing Self-Supervised Sign Representations

Ryan Wong, Necati Cihan Camgoz, Richard Bowden

ICCV 2025 • arXiv:2503.08529
5 citations
#8080

Frequency-Aware Autoregressive Modeling for Efficient High-Resolution Image Synthesis

Zhuokun Chen, Jugang Fan, Zhuowei Yu et al.

ICCV 2025 • arXiv:2507.20454
5 citations
#8081

LongAnimation: Long Animation Generation with Dynamic Global-Local Memory

Nan Chen, Mengqi Huang, Yihao Meng et al.

ICCV 2025 • arXiv:2507.01945
5 citations
#8082

SEGS-SLAM: Structure-enhanced 3D Gaussian Splatting SLAM with Appearance Embedding

Tianci Wen, Zhiang Liu, Yongchun Fang

ICCV 2025 • arXiv:2501.05242
5 citations
#8083

SteerX: Creating Any Camera-Free 3D and 4D Scenes with Geometric Steering

Byeongjun Park, Hyojun Go, Hyelin Nam et al.

ICCV 2025 • arXiv:2503.12024
5 citations
#8084

GaussianUpdate: Continual 3D Gaussian Splatting Update for Changing Environments

Lin Zeng, Boming Zhao, Jiarui Hu et al.

ICCV 2025 • arXiv:2508.08867
5 citations
#8085

Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion

Shengyuan Zhang, An Zhao, Ling Yang et al.

ICCV 2025 • arXiv:2412.03515
5 citations
#8086

MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network

Jianfei Jiang, Qiankun Liu, Haochen Yu et al.

ICCV 2025 • arXiv:2507.11333
5 citations
#8087

PhysRig: Differentiable Physics-Based Skinning and Rigging Framework for Realistic Articulated Object Modeling

Hao Zhang, Haolan Xu, Chun Feng et al.

ICCV 2025 • arXiv:2506.20936
5 citations
#8088

Latent Diffusion Models with Masked AutoEncoders

Junho Lee, Jeongwoo Shin, Hyungwook Choi et al.

ICCV 2025 • arXiv:2507.09984
5 citations
#8089

TikZero: Zero-Shot Text-Guided Graphics Program Synthesis

Jonas Belouadi, Eddy Ilg, Margret Keuper et al.

ICCV 2025 • highlight • arXiv:2503.11509
5 citations
#8090

Capturing Individual Human Preferences with Reward Features

Andre Barreto, Vincent Dumoulin, Yiran Mao et al.

NEURIPS 2025 • arXiv:2503.17338
5 citations
#8091

Understanding Prompt Tuning and In-Context Learning via Meta-Learning

Tim Genewein, Kevin Li, Jordi Grau-Moya et al.

NEURIPS 2025 • spotlight • arXiv:2505.17010
5 citations
#8092

Tight Lower Bounds and Improved Convergence in Performative Prediction

Pedram Khorsandi, Rushil Gupta, Mehrnaz Mofakhami et al.

NEURIPS 2025 • arXiv:2412.03671
5 citations
#8093

MIRAGE: A Benchmark for Multimodal Information-Seeking and Reasoning in Agricultural Expert-Guided Conversations

Vardhan Dongre, Chi Gui, Shubham Garg et al.

NEURIPS 2025 • arXiv:2506.20100
5 citations
#8094

MindGYM: What Matters in Question Synthesis for Thinking-Centric Fine-Tuning?

Zhe Xu, Daoyuan Chen, Zhenqing Ling et al.

NEURIPS 2025 • arXiv:2503.09499
5 citations
#8095

EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding

Ege Özsoy, Arda Mamur, Felix Tristram et al.

NEURIPS 2025 • arXiv:2505.24287
5 citations
#8096

BEDLAM2.0: Synthetic humans and cameras in motion

Joachim Tesch, Giorgio Becherini, Prerana Achar et al.

NEURIPS 2025 • oral • arXiv:2511.14394
5 citations
#8097

Privacy Reasoning in Ambiguous Contexts

Ren Yi, Octavian Suciu, Adrian Gascon et al.

NEURIPS 2025 • arXiv:2506.12241
5 citations
#8098

Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding

Hanyin Wang, Zhenbang Wu, Gururaj Kolar et al.

NEURIPS 2025 • spotlight • arXiv:2505.21908
5 citations
#8099

Mitigating Forgetting in LLM Fine-Tuning via Low-Perplexity Token Learning

Chao-Chung Wu, Zhi Rui Tam, Chieh-Yen Lin et al.

NEURIPS 2025 • arXiv:2501.14315
5 citations
#8100

FlowDAS: A Stochastic Interpolant-based Framework for Data Assimilation

Siyi Chen, Yixuan Jia, Qing Qu et al.

NEURIPS 2025 • arXiv:2501.16642
5 citations
#8101

Backward Conformal Prediction

Etienne Gauthier, Francis Bach, Michael Jordan

NEURIPS 2025 • arXiv:2505.13732
5 citations
#8102

FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency

Yifei Su, Ning Liu, Dong Chen et al.

NEURIPS 2025 • oral • arXiv:2506.08822
5 citations
#8103

Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms

Baran Hashemi, Kurt Pasque, Chris Teska et al.

NEURIPS 2025 • arXiv:2505.17190
5 citations
#8104

GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains

Chun Wang, Xiaojun Ye, Xiaoran Pan et al.

NEURIPS 2025 • arXiv:2505.18700
5 citations
#8105

In-Context Learning Strategies Emerge Rationally

Daniel Wurgaft, Ekdeep S Lubana, Core Francisco Park et al.

NEURIPS 2025 • arXiv:2506.17859
5 citations
#8106

Logic.py: Bridging the Gap between LLMs and Constraint Solvers

Pascal Kesseli, Peter O'Hearn, Ricardo Cabral

NEURIPS 2025 • arXiv:2502.15776
5 citations
#8107

BeliefMapNav: 3D Voxel-Based Belief Map for Zero-Shot Object Navigation

Zibo Zhou, Yue Hu, Lingkai Zhang et al.

NEURIPS 2025 • arXiv:2506.06487
5 citations
#8108

PanoWan: Lifting Diffusion Video Generation Models to 360$^\circ$ with Latitude/Longitude-aware Mechanisms

Yifei Xia, Shuchen Weng, Siqi Yang et al.

NEURIPS 2025
5 citations
#8109

MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging

Zihuan Qiu, Yi Xu, Chiyuan He et al.

NEURIPS 2025 • arXiv:2505.11883
5 citations
#8110

Treatment Effect Estimation for Optimal Decision-Making

Dennis Frauen, Valentyn Melnychuk, Jonas Schweisthal et al.

NEURIPS 2025 • arXiv:2505.13092
5 citations
#8111

ChartSketcher: Reasoning with Multimodal Feedback and Reflection for Chart Understanding

Muye Huang, Lingling Zhang, Jie Ma et al.

NEURIPS 2025 • arXiv:2505.19076
5 citations
#8112

Exploring Diffusion Transformer Designs via Grafting

Keshigeyan Chandrasegaran, Michael Poli, Dan Fu et al.

NEURIPS 2025 • oral • arXiv:2506.05340
5 citations
#8113

MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks

Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe et al.

NEURIPS 2025 • oral • arXiv:2506.07016
5 citations
#8114

NeurIPT: Foundation Model for Neural Interfaces

Zitao Fang, Chenxuan Li, Hongting Zhou et al.

NEURIPS 2025 • oral • arXiv:2510.16548
5 citations
#8115

Learning Diffusion Models with Flexible Representation Guidance

Chenyu Wang, Cai Zhou, Sharut Gupta et al.

NEURIPS 2025 • arXiv:2507.08980
5 citations
#8116

Efficient Data Selection at Scale via Influence Distillation

Mahdi Nikdan, Vincent Cohen-Addad, Dan Alistarh et al.

NEURIPS 2025 • arXiv:2505.19051
5 citations
#8117

Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling

Mónika Farsang, Radu Grosu

NEURIPS 2025 • arXiv:2505.21717
5 citations
#8118

Entropic Time Schedulers for Generative Diffusion Models

Dejan Stancevic, Florian Handke, Luca Ambrogioni

NEURIPS 2025 • arXiv:2504.13612
5 citations
#8119

Modeling Microenvironment Trajectories on Spatial Transcriptomics with NicheFlow

Kristiyan Sakalyan, Alessandro Palma, Filippo Guerranti et al.

NEURIPS 2025 • oral • arXiv:2511.00977
5 citations
#8120

Better Language Model Inversion by Compactly Representing Next-Token Distributions

Murtaza Nazir, Matthew Finlayson, John Morris et al.

NEURIPS 2025 • arXiv:2506.17090
5 citations
#8121

Distribution-Aligned Decoding for Efficient LLM Task Adaptation

Senkang Hu, Xudong Han, Jinqi Jiang et al.

NEURIPS 2025 • arXiv:2509.15888
5 citations
#8122

System Prompt Optimization with Meta-Learning

Yumin Choi, Jinheon Baek, Sung Ju Hwang

NEURIPS 2025 • arXiv:2505.09666
5 citations
#8123

InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation

Jinlai Liu, Jian Han, Bin Yan et al.

NEURIPS 2025 • oral
5 citations
#8124

Understanding and Rectifying Safety Perception Distortion in VLMs

Xiaohan Zou, Jian Kang, George Kesidis et al.

NEURIPS 2025 • arXiv:2502.13095
5 citations
#8125

Multilevel neural simulation-based inference

Yuga Hikida, Ayush Bharti, Niall Jeffrey et al.

NEURIPS 2025 • arXiv:2506.06087
5 citations
#8126

One Subgoal at a Time: Zero-Shot Generalization to Arbitrary Linear Temporal Logic Requirements in Multi-Task Reinforcement Learning

Zijian Guo, İlker Işık, H M Sabbir Ahmad et al.

NEURIPS 2025 • oral • arXiv:2508.01561
5 citations
#8127

Neural Thermodynamics: Entropic Forces in Deep and Universal Representation Learning

Liu Ziyin, Yizhou Xu, Isaac Chuang

NEURIPS 2025 • arXiv:2505.12387
5 citations
#8128

Improving Generalization of Neural Combinatorial Optimization for Vehicle Routing Problems via Test-Time Projection Learning

Yuanyao Chen, Rongsheng Chen, Fu Luo et al.

NEURIPS 2025 • arXiv:2506.02392
5 citations
#8129

Fine-grained List-wise Alignment for Generative Medication Recommendation

Chenxiao Fan, Chongming Gao, Wentao Shi et al.

NEURIPS 2025 • spotlight • arXiv:2505.20218
5 citations
#8130

Improved Balanced Classification with Theoretically Grounded Loss Functions

Corinna Cortes, Mehryar Mohri, Yutao Zhong

NEURIPS 2025 • arXiv:2512.23947
5 citations
#8131

Spiral: Semantic-Aware Progressive LiDAR Scene Generation and Understanding

Dekai Zhu, Yixuan Hu, Youquan Liu et al.

NEURIPS 2025 • arXiv:2505.22643
5 citations
#8132

The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition

Shuai Yuan, Xingshuo Han, Hongwei Li et al.

NEURIPS 2025 • arXiv:2409.12394
5 citations
#8133

ExGra-Med: Extended Context Graph Alignment for Medical Vision-Language Models

Duy M. H. Nguyen, Nghiem Diep, Trung Nguyen et al.

NEURIPS 2025 • arXiv:2410.02615
5 citations
#8134

Transformer brain encoders explain human high-level visual responses

Hossein Adeli, Sun Minni, Nikolaus Kriegeskorte

NEURIPS 2025 • spotlight • arXiv:2505.17329
5 citations
#8135

When Are Concepts Erased From Diffusion Models?

Kevin Lu, Nicky Kriplani, Rohit Gandikota et al.

NEURIPS 2025 • arXiv:2505.17013
5 citations
#8136

R$^2$ec: Towards Large Recommender Models with Reasoning

Runyang You, Yongqi Li, Xinyu Lin et al.

NEURIPS 2025 • arXiv:2505.16994
5 citations
#8137

Convergence of Clipped SGD on Convex $(L_0,L_1)$-Smooth Functions

Ofir Gaash, Kfir Y. Levy, Yair Carmon

NEURIPS 2025 • arXiv:2502.16492
5 citations
#8138

RadarQA: Multi-modal Quality Analysis of Weather Radar Forecasts

Xuming He, Zhiyuan You, Junchao Gong et al.

NEURIPS 2025 • arXiv:2508.12291
5 citations
#8139

The Rich and the Simple: On the Implicit Bias of Adam and SGD

Bhavya Vasudeva, Jung Lee, Vatsal Sharan et al.

NEURIPS 2025 • arXiv:2505.24022
5 citations
#8140

macOSWorld: A Multilingual Interactive Benchmark for GUI Agents

Pei Yang, Hai Ci, Mike Zheng Shou

NEURIPS 2025 • arXiv:2506.04135
5 citations
#8141

CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models

Xiao An, Jiaxing Sun, Zihan Gui et al.

NEURIPS 2025 • arXiv:2411.18145
5 citations
#8142

Joint Relational Database Generation via Graph-Conditional Diffusion Models

Mohamed Amine Ketata, David Lüdke, Leo Schwinn et al.

NEURIPS 2025 • arXiv:2505.16527
5 citations
#8143

Generating Computational Cognitive Models using Large Language Models

Milena Rmus, Akshay Kumar Jagadish, Marvin Mathony et al.

NEURIPS 2025 • oral • arXiv:2502.00879
5 citations
#8144

Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers

Yixiao Huang, Hanlin Zhu, Tianyu Guo et al.

NEURIPS 2025 • arXiv:2506.10887
5 citations
#8145

Failure Prediction at Runtime for Generative Robot Policies

Ralf Römer, Adrian Kobras, Luca Worbis et al.

NEURIPS 2025 • arXiv:2510.09459
5 citations
#8146

Predicting Empirical AI Research Outcomes with Language Models

Jiaxin Wen, Chenglei Si, Yueh-Han Chen et al.

NEURIPS 2025 • arXiv:2506.00794
5 citations
#8147

Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization

Yu Huang, Zixin Wen, Aarti Singh et al.

NEURIPS 2025 • arXiv:2511.07378
5 citations
#8148

Towards Understanding the Mechanisms of Classifier-Free Guidance

Xiang Li, Rongrong Wang, Qing Qu

NEURIPS 2025 • spotlight • arXiv:2505.19210
5 citations
#8149

Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis

Yunwei Ren, Jason Lee

NEURIPS 2025 • arXiv:2410.09678
5 citations
#8150

DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?

Tianhong Zhou, Xu Yin, Yingtao Zhu et al.

NEURIPS 2025 • arXiv:2505.24173
5 citations
#8151

Refusal Direction is Universal Across Safety-Aligned Languages

Xinpeng Wang, Mingyang Wang, Yihong Liu et al.

NEURIPS 2025 • arXiv:2505.17306
5 citations
#8152

Systematic Reward Gap Optimization for Mitigating VLM Hallucinations

Lehan He, Zeren Chen, Zhelun Shi et al.

NEURIPS 2025 • arXiv:2411.17265
5 citations
#8153

Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology

Wenhao Tang, Rong Qin, Heng Fang et al.

NEURIPS 2025 • arXiv:2506.02408
5 citations
#8154

Language Models Can Predict Their Own Behavior

Dhananjay Ashok, Jonathan May

NEURIPS 2025 • arXiv:2502.13329
5 citations
#8155

AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play

Ran Xu, Yuchen Zhuang, Zihan Dong et al.

NEURIPS 2025 • spotlight • arXiv:2509.24193
5 citations
#8156

Object-centric 3D Motion Field for Robot Learning from Human Videos

Zhao-Heng Yin, Sherry Yang, Pieter Abbeel

NEURIPS 2025 • spotlight • arXiv:2506.04227
5 citations
#8157

Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL

Joey Hong, Anca Dragan, Sergey Levine

NEURIPS 2025 • arXiv:2505.18098
5 citations
#8158

LuxDiT: Lighting Estimation with Video Diffusion Transformer

Ruofan Liang, Kai He, Zan Gojcic et al.

NEURIPS 2025 • arXiv:2509.03680
5 citations
#8159

Bisecle: Binding and Separation in Continual Learning for Video Language Understanding

Yue Tan, Xiaoqian Hu, Hao Xue et al.

NEURIPS 2025 • arXiv:2507.00469
5 citations
#8160

Online Learning of Neural Networks

Amit Daniely, Idan Mehalel, Elchanan Mossel

NEURIPS 2025 • arXiv:2505.09167
5 citations
#8161

VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification

Patrick Yubeaton, Andre Nakkab, Weihua Xiao et al.

NEURIPS 2025 • arXiv:2505.20302
5 citations
#8162

FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequencies

Dongyue Lu, Lingdong Kong, Gim Hee Lee et al.

NEURIPS 2025 • oral • arXiv:2412.06708
5 citations
#8163

Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction

Yifei Wang, Weimin Bai, Colin Zhang et al.

NEURIPS 2025 • arXiv:2505.20755
5 citations
#8164

Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent

Tong Yang, Yu Huang, Yingbin Liang et al.

NEURIPS 2025 • arXiv:2508.08222
5 citations
#8165

WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception

Zhiheng Liu, Xueqing Deng, Shoufa Chen et al.

NEURIPS 2025 • oral • arXiv:2508.15720
5 citations
#8166

Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs

Kejia Zhang, Keda Tao, Jiasheng Tang et al.

NEURIPS 2025 • arXiv:2501.19164
5 citations
#8167

The Structural Complexity of Matrix-Vector Multiplication

Emile Anand, Jan van den Brand, Rose McCarty

NEURIPS 2025 • arXiv:2502.21240
5 citations
#8168

Neurosymbolic Diffusion Models

Emile van Krieken, Pasquale Minervini, Edoardo Maria Ponti et al.

NEURIPS 2025 • arXiv:2505.13138
5 citations
#8169

Head Pursuit: Probing Attention Specialization in Multimodal Transformers

Lorenzo Basile, Valentino Maiorca, Diego Doimo et al.

NEURIPS 2025 • spotlight • arXiv:2510.21518
5 citations
#8170

Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs

Yifan Shen, Yuanzhe Liu, Jingyuan Zhu et al.

NEURIPS 2025 • arXiv:2506.21656
5 citations
#8171

Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation

Wenbo Zhang, Tianrun Hu, Hanbo Zhang et al.

NEURIPS 2025 • oral • arXiv:2506.09990
5 citations
#8172

VideoVLA: Video Generators Can Be Generalizable Robot Manipulators

Yichao Shen, Fangyun Wei, Zhiying Du et al.

NEURIPS 2025 • arXiv:2512.06963
5 citations
#8173

Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective

Yang Zhang, Xinran Li, Jianing Ye et al.

NEURIPS 2025 • arXiv:2505.20922
5 citations
#8174

AutoData: A Multi-Agent System for Open Web Data Collection

Tianyi Ma, Yiyue Qian, Zheyuan Zhang et al.

NEURIPS 2025 • arXiv:2505.15859
5 citations
#8175

E2Former: An Efficient and Equivariant Transformer with Linear-Scaling Tensor Products

Yunyang Li, Lin Huang, Zhihao Ding et al.

NEURIPS 2025 • spotlight • arXiv:2501.19216
5 citations
#8176

Do different prompting methods yield a common task representation in language models?

Guy Davidson, Todd Gureckis, Brenden Lake et al.

NEURIPS 2025 • arXiv:2505.12075
5 citations
#8177

Alligat0R: Pre-Training through Covisibility Segmentation for Relative Camera Pose Regression

Thibaut Loiseau, Guillaume Bourmaud, Vincent Lepetit

NEURIPS 2025 • spotlight
5 citations
#8178

PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning

Yizhen Zhang, Yang Ding, Shuoshuo Zhang et al.

NEURIPS 2025 • arXiv:2506.14907
5 citations
#8179

Let Me Think! A Long Chain of Thought Can Be Worth Exponentially Many Short Ones

Parsa Mirtaheri, Ezra Edelman, Samy Jelassi et al.

NEURIPS 2025 • arXiv:2505.21825
5 citations
#8180

Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation

Edward Fish, Richard Bowden

NEURIPS 2025 • oral • arXiv:2506.00129
5 citations
#8181

Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks

Debargha Ganguly, Vikash Singh, Sreehari Sankar et al.

NEURIPS 2025 • arXiv:2505.20047
5 citations
#8182

SAVVY: Spatial Awareness via Audio-Visual LLMs through Seeing and Hearing

Mingfei Chen, Zijun Cui, Xiulong Liu et al.

NEURIPS 2025 • oral • arXiv:2506.05414
5 citations
#8183

Flexible MOF Generation with Torsion-Aware Flow Matching

Nayoung Kim, Seongsu Kim, Sungsoo Ahn

NEURIPS 2025 • arXiv:2505.17914
5 citations
#8184

MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants

Hritik Bansal, Daniel Israel, Siyan Zhao et al.

NEURIPS 2025 • arXiv:2412.12661
5 citations
#8185

FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design

Asal Mehradfar, Xuzhe Zhao, Yilun Huang et al.

NEURIPS 2025 • arXiv:2505.21923
5 citations
#8186

Functional Scaling Laws in Kernel Regression: Loss Dynamics and Learning Rate Schedules

Binghui Li, Fengling Chen, Zixun Huang et al.

NEURIPS 2025 • spotlight • arXiv:2509.19189
5 citations
#8187

On the Edge of Memorization in Diffusion Models

Sam Buchanan, Druv Pai, Yi Ma et al.

NEURIPS 2025 • arXiv:2508.17689
5 citations
#8188

CellVerse: Do Large Language Models Really Understand Cell Biology?

Fan Zhang, Tianyu Liu, Zhihong Zhu et al.

NEURIPS 2025 • arXiv:2505.07865
5 citations
#8189

Orthogonal Survival Learners for Estimating Heterogeneous Treatment Effects from Time-to-Event Data

Dennis Frauen, Maresa Schröder, Konstantin Hess et al.

NEURIPS 2025 • arXiv:2505.13072
5 citations
#8190

Watermarking Autoregressive Image Generation

Nikola Jovanović, Ismail Labiad, Tomas Soucek et al.

NEURIPS 2025 • arXiv:2506.16349
5 citations
#8191

Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation

Erfan Baghaei Potraghloo, Seyedarmin Azizi, Souvik Kundu et al.

NEURIPS 2025 • arXiv:2509.02510
5 citations
#8192

Efficient Quadratic Corrections for Frank-Wolfe Algorithms

Jannis Halbey, Seta Rakotomandimby, Mathieu Besançon et al.

NEURIPS 2025 • arXiv:2506.02635
5 citations
#8193

Toward Engineering AGI: Benchmarking the Engineering Design Capabilities of LLMs

Xingang Guo, Yaxin Li, XiangYi Kong et al.

NEURIPS 2025 • arXiv:2509.16204
5 citations
#8194

Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations

Chaofan Gan, Yuanpeng Tu, Xi Chen et al.

NEURIPS 2025 • arXiv:2505.18584
5 citations
#8195

L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models

Xiaohao Liu, Xiaobo Xia, Weixiang Zhao et al.

NEURIPS 2025 • arXiv:2505.17505
5 citations
#8196

Physics-Driven Spatiotemporal Modeling for AI-Generated Video Detection

Shuhai Zhang, ZiHao Lian, Jiahao Yang et al.

NEURIPS 2025 • oral • arXiv:2510.08073
5 citations
#8197

MaterialRefGS: Reflective Gaussian Splatting with Multi-view Consistent Material Inference

Wenyuan Zhang, Jimin Tang, Weiqi Zhang et al.

NEURIPS 2025 • arXiv:2510.11387
5 citations
#8198

Surprise3D: A Dataset for Spatial Understanding and Reasoning in Complex 3D Scenes

Jiaxin Huang, Ziwen Li, Hanlue Zhang et al.

NEURIPS 2025 • arXiv:2507.07781
5 citations
#8199

RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video

ShuHang Xun, Sicheng Tao, Jungang Li et al.

NEURIPS 2025 • arXiv:2505.02064
5 citations
#8200

DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding

Yue Jiang, Jichu Li, Yang Liu et al.

NEURIPS 2025 • oral • arXiv:2505.18411
5 citations