Most Cited 2025 "embedding space deduplication" Papers

22,274 papers found • Page 41 of 112

#8001

SDFit: 3D Object Pose and Shape by Fitting a Morphable SDF to a Single Image

Dimitrije Antić, Georgios Paschalidis, Shashank Tripathi et al.

ICCV 2025 • arXiv:2409.16178

#8002

Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis

Kaiyang Ji, Ye Shi, Zichen Jin et al.

ICCV 2025 • highlight • arXiv:2508.02106

#8003

EgoM2P: Egocentric Multimodal Multitask Pretraining

Gen Li, Yutong Chen, Yiqian Wu et al.

ICCV 2025 • arXiv:2506.07886

#8004

Temporal Unlearnable Examples: Preventing Personal Video Data from Unauthorized Exploitation by Object Tracking

Qiangqiang Wu, Yi Yu, Chenqi Kong et al.

ICCV 2025 • arXiv:2507.07483

#8005

G-DexGrasp: Generalizable Dexterous Grasping Synthesis Via Part-Aware Prior Retrieval and Prior-Assisted Generation

Juntao Jian, Xiuping Liu, Zixuan Chen et al.

ICCV 2025 • arXiv:2503.19457

#8006

DH-FaceVid-1K: A Large-Scale High-Quality Dataset for Face Video Generation

Donglin Di, He Feng, Wenzhang Sun et al.

ICCV 2025 • arXiv:2410.07151

#8007

Adaptive Hyper-Graph Convolution Network for Skeleton-based Human Action Recognition with Virtual Connections

Youwei Zhou, Tianyang Xu, Cong Wu et al.

ICCV 2025 • arXiv:2411.14796

#8008

FaceLift: Learning Generalizable Single Image 3D Face Reconstruction from Synthetic Heads

Weijie Lyu, Yi Zhou, Ming-Hsuan Yang et al.

ICCV 2025 • arXiv:2412.17812

#8009

Precise Action-to-Video Generation Through Visual Action Prompts

Yuang Wang, Chao Wen, Haoyu Guo et al.

ICCV 2025 • arXiv:2508.13104

#8010

Latent-Reframe: Enabling Camera Control for Video Diffusion Models without Training

Zhenghong Zhou, Jie An, Jiebo Luo

ICCV 2025 • arXiv:2412.06029

#8011

Frequency-Guided Posterior Sampling for Diffusion-Based Image Restoration

Darshan Thaker, Abhishek Goyal, Rene Vidal

ICCV 2025 • arXiv:2411.15295

#8012

A Quality-Guided Mixture of Score-Fusion Experts Framework for Human Recognition

Jie Zhu, Yiyang Su, Minchul Kim et al.

ICCV 2025 • arXiv:2508.00053

#8013

Motion Synthesis with Sparse and Flexible Keyjoint Control

Inwoo Hwang, Jinseok Bae, Donggeun Lim et al.

ICCV 2025 • arXiv:2503.15557

#8014

IDFace: Face Template Protection for Efficient and Secure Identification

Sunpill Kim, Seunghun Paik, Chanwoo Hwang et al.

ICCV 2025 • arXiv:2507.12050

#8015

MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh

Shuangkang Fang, I-Chao Shen, Yufeng Wang et al.

ICCV 2025 • highlight • arXiv:2508.01242

#8016

2HandedAfforder: Learning Precise Actionable Bimanual Affordances from Human Videos

Marvin Heidinger, Snehal Jauhri, Vignesh Prasad et al.

ICCV 2025 • arXiv:2503.09320

#8017

Edicho: Consistent Image Editing in the Wild

Qingyan Bai, Hao Ouyang, Yinghao Xu et al.

ICCV 2025 • arXiv:2412.21079

#8018

DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization

Wenchuan Wang, Mengqi Huang, Yijing Tu et al.

ICCV 2025 • arXiv:2505.02192

#8019

Versatile Transition Generation with Image-to-Video Diffusion

Zuhao Yang, Jiahui Zhang, Yingchen Yu et al.

ICCV 2025 • arXiv:2508.01698

#8020

AnyI2V: Animating Any Conditional Image with Motion Control

Ziye Li, Xincheng Shuai, Hao Luo et al.

ICCV 2025 • arXiv:2507.02857

#8021

The Silent Assistant: NoiseQuery as Implicit Guidance for Goal-Driven Image Generation

Ruoyu Wang, Huayang Huang, Ye Zhu et al.

ICCV 2025 • highlight • arXiv:2412.05101

#8022

Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy

Junhao Wei, Yu Zhe, Jun Sakuma

ICCV 2025 • arXiv:2503.07661

#8023

Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers

Divyansh Srivastava, Xiang Zhang, He Wen et al.

ICCV 2025 • arXiv:2505.04718

#8024

Contrastive Test-Time Composition of Multiple LoRA Models for Image Generation

Tuna Meral, Enis Simsar, Federico Tombari et al.

ICCV 2025 • highlight • arXiv:2403.19776

#8025

Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation

Yujie Zhang, Bingyang Cui, Qi Yang et al.

ICCV 2025 • arXiv:2412.11170

#8026

Dense2MoE: Restructuring Diffusion Transformer to MoE for Efficient Text-to-Image Generation

Youwei Zheng, Yuxi Ren, Xin Xia et al.

ICCV 2025 • arXiv:2510.09094

#8027

Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing

Joonghyuk Shin, Alchan Hwang, Yujin Kim et al.

ICCV 2025 • arXiv:2508.07519

#8028

EDiT: Efficient Diffusion Transformers with Linear Compressed Attention

Philipp Becker, Abhinav Mehrotra, Ruchika Chavhan et al.

ICCV 2025 • arXiv:2503.16726

#8029

Towards Video Thinking Test: A Holistic Benchmark for Advanced Video Reasoning and Understanding

Yuanhan Zhang, Yunice Chew, Yuhao Dong et al.

ICCV 2025 • arXiv:2507.15028

#8030

MultiverSeg: Scalable Interactive Segmentation of Biomedical Imaging Datasets with In-Context Guidance

Hallee Wong, Jose Javier Gonzalez Ortiz, John Guttag et al.

ICCV 2025 • arXiv:2412.15058

#8031

DisTime: Distribution-based Time Representation for Video Large Language Models

Yingsen Zeng, Zepeng Huang, Yujie Zhong et al.

ICCV 2025 • arXiv:2505.24329

#8032

Dynamic Dictionary Learning for Remote Sensing Image Segmentation

Xuechao Zou, Yue Li, Shun Zhang et al.

ICCV 2025 • arXiv:2503.06683

#8033

Streaming VideoLLMs for Real-Time Procedural Video Understanding

Dibyadip Chatterjee, Edoardo Remelli, Yale Song et al.

ICCV 2025 • arXiv:2504.13915

#8034

SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs

Jiahui Wang, Zuyan Liu, Yongming Rao et al.

ICCV 2025 • arXiv:2506.05344

#8035

From Trial to Triumph: Advancing Long Video Understanding via Visual Context Sample Scaling and Self-reward Alignment

Yucheng Suo, Fan Ma, Linchao Zhu et al.

ICCV 2025 • arXiv:2503.20472

#8036

VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization

Sihan Yang, Runsen Xu, Chenhang Cui et al.

ICCV 2025 • arXiv:2508.05211

#8037

Scaling Tumor Segmentation: Best Lessons from Real and Synthetic Data

Qi Chen, Xinze Zhou, Chen Liu et al.

ICCV 2025 • arXiv:2510.14831

#8038

SimMLM: A Simple Framework for Multi-modal Learning with Missing Modality

Sijie Li, Chen Chen, Jungong Han

ICCV 2025 • arXiv:2507.19264

#8039

Street Gaussians without 3D Object Tracker

Ruida Zhang, Chengxi Li, Chenyangguang Zhang et al.

ICCV 2025 • arXiv:2412.05548

#8040

StochasticSplats: Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting

Shakiba Kheradmand, Delio Vicini, George Kopanas et al.

ICCV 2025 • arXiv:2503.24366

#8041

Curve-Aware Gaussian Splatting for 3D Parametric Curve Reconstruction

Zhirui Gao, Renjiao Yi, YaQiao Dai et al.

ICCV 2025 • arXiv:2506.21401

#8042

AAA-Gaussians: Anti-Aliased and Artifact-Free 3D Gaussian Rendering

Michael Steiner, Thomas Köhler, Lukas Radl et al.

ICCV 2025 • highlight • arXiv:2504.12811

#8043

Sat2City: 3D City Generation from A Single Satellite Image with Cascaded Latent Diffusion

Tongyan Hua, Lutao Jiang, Ying-Cong Chen et al.

ICCV 2025 • arXiv:2507.04403

#8044

Leveraging BEV Paradigm for Ground-to-Aerial Image Synthesis

Junyan Ye, Jun He, Weijia Li et al.

ICCV 2025 • arXiv:2408.01812

#8045

Unraveling the Effects of Synthetic Data on End-to-End Autonomous Driving

Junhao Ge, Zuhong Liu, Longteng Fan et al.

ICCV 2025 • arXiv:2503.18108

#8046

MikuDance: Animating Character Art with Mixed Motion Dynamics

Jiaxu Zhang, Xianfang Zeng, Xin Chen et al.

ICCV 2025 • arXiv:2411.08656

#8047

ObjectRelator: Enabling Cross-View Object Relation Understanding Across Ego-Centric and Exo-Centric Perspectives

Yuqian Fu, Runze Wang, Bin Ren et al.

ICCV 2025 • highlight • arXiv:2411.19083

#8048

CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation

Zhuoyan Luo, Yinghao Wu, Tianheng Cheng et al.

ICCV 2025 • arXiv:2405.15658

#8049

Not All Frame Features Are Equal: Video-to-4D Generation via Decoupling Dynamic-Static Features

Liying Yang, Chen Liu, Zhenwei Zhu et al.

ICCV 2025 • highlight • arXiv:2502.08377

#8050

Synergistic Prompting for Robust Visual Recognition with Missing Modalities

Zhihui Zhang, Luanyuan Dai, Qika Lin et al.

ICCV 2025 • arXiv:2507.07802

#8051

PerLDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Model

Jinhua Zhang, Hualian Sheng, Sijia Cai et al.

ICCV 2025 • arXiv:2407.06109

#8052

HERO: Human Reaction Generation from Videos

Chengjun Yu, Wei Zhai, Yuhang Yang et al.

ICCV 2025 • arXiv:2503.08270

#8053

Improving Multimodal Learning via Imbalanced Learning

Shicai Wei, Chunbo Luo, Yang Luo

ICCV 2025 • arXiv:2507.10203

#8054

InterGSEdit: Interactive 3D Gaussian Splatting Editing with 3D Geometry-Consistent Attention Prior

Minghao Wen, Shengjie Wu, Kangkan Wang et al.

ICCV 2025 • arXiv:2507.04961

#8055

CaO2: Rectifying Inconsistencies in Diffusion-Based Dataset Distillation

Haoxuan Wang, Zhenghao Zhao, Junyi Wu et al.

ICCV 2025

#8056

CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning

Duo Wu, Jinghe Wang, Yuan Meng et al.

ICCV 2025 • arXiv:2411.16313

#8057

Learning Normal Flow Directly From Events

Dehao Yuan, Levi Burner, Jiayi Wu et al.

ICCV 2025 • arXiv:2412.11284

#8058

Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning

Qi Wang, Zhipeng Zhang, Baao Xie et al.

ICCV 2025 • arXiv:2503.08751

#8059

DiffVSR: Revealing an Effective Recipe for Taming Robust Video Super-Resolution Against Complex Degradations

Xiaohui Li, Yihao Liu, Shuo Cao et al.

ICCV 2025 • arXiv:2501.10110

#8060

SpatialCrafter: Unleashing the Imagination of Video Diffusion Models for Scene Reconstruction from Limited Observations

Songchun Zhang, Huiyao Xu, Sitong Guo et al.

ICCV 2025 • arXiv:2505.11992

#8061

TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition

Xingsong Ye, Yongkun Du, Yunbo Tao et al.

ICCV 2025 • arXiv:2412.01137

#8062

MOSCATO: Predicting Multiple Object State Change Through Actions

Parnian Zameni, Yuhan Shen, Ehsan Elhamifar

ICCV 2025

#8063

AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction

Junhao Cheng, Yuying Ge, Yixiao Ge et al.

ICCV 2025 • arXiv:2504.01014

#8064

Exploring the Visual Feature Space for Multimodal Neural Decoding

Weihao Xia, Cengiz Oztireli

ICCV 2025 • arXiv:2505.15755

#8065

RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping

Dongming Wu, Yanping Fu, Saike Huang et al.

ICCV 2025 • arXiv:2507.23734

#8066

Scendi Score: Prompt-Aware Diversity Evaluation via Schur Complement of CLIP Embeddings

Azim Ospanov, Mohammad Jalali, Farzan Farnia

ICCV 2025 • highlight • arXiv:2412.18645

#8067

VisNumBench: Evaluating Number Sense of Multimodal Large Language Models

Tengjin Weng, Jingyi Wang, Wenhao Jiang et al.

ICCV 2025 • arXiv:2503.14939

#8068

NeurOp-Diff: Continuous Remote Sensing Image Super-Resolution via Neural Operator Diffusion

Zihao Xu, Yuzhi Tang, Bowen Xu et al.

ICCV 2025

#8069

Φ-GAN: Physics-Inspired GAN for Generating SAR Images Under Limited Data

Xidan Zhang, Yihan Zhuang, Qian Guo et al.

ICCV 2025

#8070

GaSLight: Gaussian Splats for Spatially-Varying Lighting in HDR

Christophe Bolduc, Yannick Hold-Geoffroy, Jean-Francois Lalonde

ICCV 2025 • arXiv:2504.10809

#8071

GUAVA: Generalizable Upper Body 3D Gaussian Avatar

Dongbin Zhang, Yunfei Liu, Lijian Lin et al.

ICCV 2025 • arXiv:2505.03351

#8072

DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer

Yecheng Wu, Han Cai, Junyu Chen et al.

ICCV 2025 • arXiv:2507.04947

#8073

X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation

Jian Ma, Qirong Peng, Xu Guo et al.

ICCV 2025 • arXiv:2503.06134

#8074

WaveMamba: Wavelet-Driven Mamba Fusion for RGB-Infrared Object Detection

Haodong Zhu, Wenhao Dong, Linlin Yang et al.

ICCV 2025 • arXiv:2507.18173

#8075

SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts

Gengze Zhou, Yicong Hong, Zun Wang et al.

ICCV 2025 • arXiv:2412.05552

#8076

DC-ControlNet: Decoupling Inter- and Intra-Element Conditions in Image Generation with Diffusion Models

Hongji Yang, Wencheng Han, Yucheng Zhou et al.

ICCV 2025 • arXiv:2502.14779

#8077

Adding Additional Control to One-Step Diffusion with Joint Distribution Matching

Yihong Luo, Tianyang Hu, Yifan Song et al.

ICCV 2025 • arXiv:2503.06652

#8078

Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration

Junyuan Deng, Wei Yin, Xiaoyang Guo et al.

ICCV 2025 • arXiv:2411.17240

#8079

SignRep: Enhancing Self-Supervised Sign Representations

Ryan Wong, Necati Cihan Camgoz, Richard Bowden

ICCV 2025 • arXiv:2503.08529

#8080

Frequency-Aware Autoregressive Modeling for Efficient High-Resolution Image Synthesis

Zhuokun Chen, Jugang Fan, Zhuowei Yu et al.

ICCV 2025 • arXiv:2507.20454

#8081

LongAnimation: Long Animation Generation with Dynamic Global-Local Memory

Nan Chen, Mengqi Huang, Yihao Meng et al.

ICCV 2025 • arXiv:2507.01945

#8082

SEGS-SLAM: Structure-enhanced 3D Gaussian Splatting SLAM with Appearance Embedding

Tianci Wen, Zhiang Liu, Yongchun Fang

ICCV 2025 • arXiv:2501.05242

#8083

SteerX: Creating Any Camera-Free 3D and 4D Scenes with Geometric Steering

Byeongjun Park, Hyojun Go, Hyelin Nam et al.

ICCV 2025 • arXiv:2503.12024

#8084

GaussianUpdate: Continual 3D Gaussian Splatting Update for Changing Environments

Lin Zeng, Boming Zhao, Jiarui Hu et al.

ICCV 2025 • arXiv:2508.08867

#8085

Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion

Shengyuan Zhang, An Zhao, Ling Yang et al.

ICCV 2025 • arXiv:2412.03515

#8086

MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network

Jianfei Jiang, Qiankun Liu, Haochen Yu et al.

ICCV 2025 • arXiv:2507.11333

#8087

PhysRig: Differentiable Physics-Based Skinning and Rigging Framework for Realistic Articulated Object Modeling

Hao Zhang, Haolan Xu, Chun Feng et al.

ICCV 2025 • arXiv:2506.20936

#8088

Latent Diffusion Models with Masked AutoEncoders

Junho Lee, Jeongwoo Shin, Hyungwook Choi et al.

ICCV 2025 • arXiv:2507.09984

#8089

TikZero: Zero-Shot Text-Guided Graphics Program Synthesis

Jonas Belouadi, Eddy Ilg, Margret Keuper et al.

ICCV 2025 • highlight • arXiv:2503.11509

#8090

Capturing Individual Human Preferences with Reward Features

Andre Barreto, Vincent Dumoulin, Yiran Mao et al.

NEURIPS 2025 • arXiv:2503.17338

#8091

Understanding Prompt Tuning and In-Context Learning via Meta-Learning

Tim Genewein, Kevin Li, Jordi Grau-Moya et al.

NEURIPS 2025 • spotlight • arXiv:2505.17010

#8092

Tight Lower Bounds and Improved Convergence in Performative Prediction

Pedram Khorsandi, Rushil Gupta, Mehrnaz Mofakhami et al.

NEURIPS 2025 • arXiv:2412.03671

#8093

MIRAGE: A Benchmark for Multimodal Information-Seeking and Reasoning in Agricultural Expert-Guided Conversations

Vardhan Dongre, Chi Gui, Shubham Garg et al.

NEURIPS 2025 • arXiv:2506.20100

#8094

MindGYM: What Matters in Question Synthesis for Thinking-Centric Fine-Tuning?

Zhe Xu, Daoyuan Chen, Zhenqing Ling et al.

NEURIPS 2025 • arXiv:2503.09499

#8095

EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding

Ege Özsoy, Arda Mamur, Felix Tristram et al.

NEURIPS 2025 • arXiv:2505.24287

#8096

BEDLAM2.0: Synthetic humans and cameras in motion

Joachim Tesch, Giorgio Becherini, Prerana Achar et al.

NEURIPS 2025 • oral • arXiv:2511.14394

#8097

Privacy Reasoning in Ambiguous Contexts

Ren Yi, Octavian Suciu, Adrian Gascon et al.

NEURIPS 2025 • arXiv:2506.12241

#8098

Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding

Hanyin Wang, Zhenbang Wu, Gururaj Kolar et al.

NEURIPS 2025 • spotlight • arXiv:2505.21908

#8099

Mitigating Forgetting in LLM Fine-Tuning via Low-Perplexity Token Learning

Chao-Chung Wu, Zhi Rui Tam, Chieh-Yen Lin et al.

NEURIPS 2025 • arXiv:2501.14315

#8100

FlowDAS: A Stochastic Interpolant-based Framework for Data Assimilation

Siyi Chen, Yixuan Jia, Qing Qu et al.

NEURIPS 2025 • arXiv:2501.16642

#8101

Backward Conformal Prediction

Etienne Gauthier, Francis Bach, Michael Jordan

NEURIPS 2025 • arXiv:2505.13732

#8102

FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency

Yifei Su, Ning Liu, Dong Chen et al.

NEURIPS 2025 • oral • arXiv:2506.08822

#8103

Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms

Baran Hashemi, Kurt Pasque, Chris Teska et al.

NEURIPS 2025 • arXiv:2505.17190

#8104

GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains

Chun Wang, Xiaojun Ye, Xiaoran Pan et al.

NEURIPS 2025 • arXiv:2505.18700

#8105

In-Context Learning Strategies Emerge Rationally

Daniel Wurgaft, Ekdeep S Lubana, Core Francisco Park et al.

NEURIPS 2025 • arXiv:2506.17859

#8106

Logic.py: Bridging the Gap between LLMs and Constraint Solvers

Pascal Kesseli, Peter O'Hearn, Ricardo Cabral

NEURIPS 2025 • arXiv:2502.15776

#8107

BeliefMapNav: 3D Voxel-Based Belief Map for Zero-Shot Object Navigation

Zibo Zhou, Yue Hu, Lingkai Zhang et al.

NEURIPS 2025 • arXiv:2506.06487

#8108

PanoWan: Lifting Diffusion Video Generation Models to 360$^\circ$ with Latitude/Longitude-aware Mechanisms

Yifei Xia, Shuchen Weng, Siqi Yang et al.

NEURIPS 2025

#8109

MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging

Zihuan Qiu, Yi Xu, Chiyuan He et al.

NEURIPS 2025 • arXiv:2505.11883

#8110

Treatment Effect Estimation for Optimal Decision-Making

Dennis Frauen, Valentyn Melnychuk, Jonas Schweisthal et al.

NEURIPS 2025 • arXiv:2505.13092

#8111

ChartSketcher: Reasoning with Multimodal Feedback and Reflection for Chart Understanding

Muye Huang, Lingling Zhang, Jie Ma et al.

NEURIPS 2025 • arXiv:2505.19076

#8112

Exploring Diffusion Transformer Designs via Grafting

Keshigeyan Chandrasegaran, Michael Poli, Dan Fu et al.

NEURIPS 2025 • oral • arXiv:2506.05340

#8113

MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks

Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe et al.

NEURIPS 2025 • oral • arXiv:2506.07016

#8114

NeurIPT: Foundation Model for Neural Interfaces

Zitao Fang, Chenxuan Li, Hongting Zhou et al.

NEURIPS 2025 • oral • arXiv:2510.16548

#8115

Learning Diffusion Models with Flexible Representation Guidance

Chenyu Wang, Cai Zhou, Sharut Gupta et al.

NEURIPS 2025 • arXiv:2507.08980

#8116

Efficient Data Selection at Scale via Influence Distillation

Mahdi Nikdan, Vincent Cohen-Addad, Dan Alistarh et al.

NEURIPS 2025 • arXiv:2505.19051

#8117

Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling

Mónika Farsang, Radu Grosu

NEURIPS 2025 • arXiv:2505.21717

#8118

Entropic Time Schedulers for Generative Diffusion Models

Dejan Stancevic, Florian Handke, Luca Ambrogioni

NEURIPS 2025 • arXiv:2504.13612

#8119

Modeling Microenvironment Trajectories on Spatial Transcriptomics with NicheFlow

Kristiyan Sakalyan, Alessandro Palma, Filippo Guerranti et al.

NEURIPS 2025 • oral • arXiv:2511.00977

#8120

Better Language Model Inversion by Compactly Representing Next-Token Distributions

Murtaza Nazir, Matthew Finlayson, John Morris et al.

NEURIPS 2025 • arXiv:2506.17090

#8121

Distribution-Aligned Decoding for Efficient LLM Task Adaptation

Senkang Hu, Xudong Han, Jinqi Jiang et al.

NEURIPS 2025 • arXiv:2509.15888

#8122

System Prompt Optimization with Meta-Learning

Yumin Choi, Jinheon Baek, Sung Ju Hwang

NEURIPS 2025 • arXiv:2505.09666

#8123

InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation

Jinlai Liu, Jian Han, Bin Yan et al.

NEURIPS 2025 • oral

#8124

Understanding and Rectifying Safety Perception Distortion in VLMs

Xiaohan Zou, Jian Kang, George Kesidis et al.

NEURIPS 2025 • arXiv:2502.13095

#8125

Multilevel neural simulation-based inference

Yuga Hikida, Ayush Bharti, Niall Jeffrey et al.

NEURIPS 2025 • arXiv:2506.06087

#8126

One Subgoal at a Time: Zero-Shot Generalization to Arbitrary Linear Temporal Logic Requirements in Multi-Task Reinforcement Learning

Zijian Guo, İlker Işık, H M Sabbir Ahmad et al.

NEURIPS 2025 • oral • arXiv:2508.01561

#8127

Neural Thermodynamics: Entropic Forces in Deep and Universal Representation Learning

Liu Ziyin, Yizhou Xu, Isaac Chuang

NEURIPS 2025 • arXiv:2505.12387

#8128

Improving Generalization of Neural Combinatorial Optimization for Vehicle Routing Problems via Test-Time Projection Learning

Yuanyao Chen, Rongsheng Chen, Fu Luo et al.

NEURIPS 2025 • arXiv:2506.02392

#8129

Fine-grained List-wise Alignment for Generative Medication Recommendation

Chenxiao Fan, Chongming Gao, Wentao Shi et al.

NEURIPS 2025 • spotlight • arXiv:2505.20218

#8130

Improved Balanced Classification with Theoretically Grounded Loss Functions

Corinna Cortes, Mehryar Mohri, Yutao Zhong

NEURIPS 2025 • arXiv:2512.23947

#8131

Spiral: Semantic-Aware Progressive LiDAR Scene Generation and Understanding

Dekai Zhu, Yixuan Hu, Youquan Liu et al.

NEURIPS 2025 • arXiv:2505.22643

#8132

The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition

Shuai Yuan, Xingshuo Han, Hongwei Li et al.

NEURIPS 2025 • arXiv:2409.12394

#8133

ExGra-Med: Extended Context Graph Alignment for Medical Vision-Language Models

Duy M. H. Nguyen, Nghiem Diep, Trung Nguyen et al.

NEURIPS 2025 • arXiv:2410.02615

#8134

Transformer brain encoders explain human high-level visual responses

Hossein Adeli, Sun Minni, Nikolaus Kriegeskorte

NEURIPS 2025 • spotlight • arXiv:2505.17329

#8135

When Are Concepts Erased From Diffusion Models?

Kevin Lu, Nicky Kriplani, Rohit Gandikota et al.

NEURIPS 2025 • arXiv:2505.17013

#8136

R$^2$ec: Towards Large Recommender Models with Reasoning

Runyang You, Yongqi Li, Xinyu Lin et al.

NEURIPS 2025 • arXiv:2505.16994

#8137

Convergence of Clipped SGD on Convex $(L_0,L_1)$-Smooth Functions

Ofir Gaash, Kfir Y. Levy, Yair Carmon

NEURIPS 2025 • arXiv:2502.16492

#8138

RadarQA: Multi-modal Quality Analysis of Weather Radar Forecasts

Xuming He, Zhiyuan You, Junchao Gong et al.

NEURIPS 2025 • arXiv:2508.12291

#8139

The Rich and the Simple: On the Implicit Bias of Adam and SGD

Bhavya Vasudeva, Jung Lee, Vatsal Sharan et al.

NEURIPS 2025 • arXiv:2505.24022

#8140

macOSWorld: A Multilingual Interactive Benchmark for GUI Agents

Pei Yang, Hai Ci, Mike Zheng Shou

NEURIPS 2025 • arXiv:2506.04135

#8141

CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models

Xiao An, Jiaxing Sun, Zihan Gui et al.

NEURIPS 2025 • arXiv:2411.18145

#8142

Joint Relational Database Generation via Graph-Conditional Diffusion Models

Mohamed Amine Ketata, David Lüdke, Leo Schwinn et al.

NEURIPS 2025 • arXiv:2505.16527

#8143

Generating Computational Cognitive models using Large Language Models

Milena Rmus, Akshay Kumar Jagadish, Marvin Mathony et al.

NEURIPS 2025 • oral • arXiv:2502.00879

#8144

Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers

Yixiao Huang, Hanlin Zhu, Tianyu Guo et al.

NEURIPS 2025 • arXiv:2506.10887

#8145

Failure Prediction at Runtime for Generative Robot Policies

Ralf Römer, Adrian Kobras, Luca Worbis et al.

NEURIPS 2025 • arXiv:2510.09459

#8146

Predicting Empirical AI Research Outcomes with Language Models

Jiaxin Wen, Chenglei Si, Yueh-Han Chen et al.

NEURIPS 2025 • arXiv:2506.00794

#8147

Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization

Yu Huang, Zixin Wen, Aarti Singh et al.

NEURIPS 2025 • arXiv:2511.07378

#8148

Towards Understanding the Mechanisms of Classifier-Free Guidance

Xiang Li, Rongrong Wang, Qing Qu

NEURIPS 2025 • spotlight • arXiv:2505.19210

#8149

Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis

Yunwei Ren, Jason Lee

NEURIPS 2025 • arXiv:2410.09678

#8150

DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?

Tianhong Zhou, Xu Yin, Yingtao Zhu et al.

NEURIPS 2025 • arXiv:2505.24173

#8151

Refusal Direction is Universal Across Safety-Aligned Languages

Xinpeng Wang, Mingyang Wang, Yihong Liu et al.

NEURIPS 2025 • arXiv:2505.17306

#8152

Systematic Reward Gap Optimization for Mitigating VLM Hallucinations

Lehan He, Zeren Chen, Zhelun Shi et al.

NEURIPS 2025 • arXiv:2411.17265

#8153

Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology

Wenhao Tang, Rong Qin, Heng Fang et al.

NEURIPS 2025 • arXiv:2506.02408

#8154

Language Models Can Predict Their Own Behavior

Dhananjay Ashok, Jonathan May

NEURIPS 2025 • arXiv:2502.13329

#8155

AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play

Ran Xu, Yuchen Zhuang, Zihan Dong et al.

NEURIPS 2025 • spotlight • arXiv:2509.24193

#8156

Object-centric 3D Motion Field for Robot Learning from Human Videos

Zhao-Heng Yin, Sherry Yang, Pieter Abbeel

NEURIPS 2025 • spotlight • arXiv:2506.04227

#8157

Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL

Joey Hong, Anca Dragan, Sergey Levine

NEURIPS 2025 • arXiv:2505.18098

#8158

LuxDiT: Lighting Estimation with Video Diffusion Transformer

Ruofan Liang, Kai He, Zan Gojcic et al.

NEURIPS 2025 • arXiv:2509.03680

#8159

Bisecle: Binding and Separation in Continual Learning for Video Language Understanding

Yue Tan, Xiaoqian Hu, Hao Xue et al.

NEURIPS 2025 • arXiv:2507.00469

#8160

Online Learning of Neural Networks

Amit Daniely, Idan Mehalel, Elchanan Mossel

NEURIPS 2025 • arXiv:2505.09167

#8161

VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification

Patrick Yubeaton, Andre Nakkab, Weihua Xiao et al.

NEURIPS 2025 • arXiv:2505.20302

#8162

FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequencies

Dongyue Lu, Lingdong Kong, Gim Hee Lee et al.

NEURIPS 2025 • oral • arXiv:2412.06708

#8163

Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction

Yifei Wang, Weimin Bai, Colin Zhang et al.

NEURIPS 2025 • arXiv:2505.20755

#8164

Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent

Tong Yang, Yu Huang, Yingbin Liang et al.

NEURIPS 2025 • arXiv:2508.08222

#8165

WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception

Zhiheng Liu, Xueqing Deng, Shoufa Chen et al.

NEURIPS 2025 • oral • arXiv:2508.15720

#8166

Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs

Kejia Zhang, Keda Tao, Jiasheng Tang et al.

NEURIPS 2025 • arXiv:2501.19164

#8167

The Structural Complexity of Matrix-Vector Multiplication

Emile Anand, Jan van den Brand, Rose McCarty

NEURIPS 2025 • arXiv:2502.21240

#8168

Neurosymbolic Diffusion Models

Emile van Krieken, Pasquale Minervini, Edoardo Maria Ponti et al.

NEURIPS 2025 • arXiv:2505.13138

#8169

Head Pursuit: Probing Attention Specialization in Multimodal Transformers

Lorenzo Basile, Valentino Maiorca, Diego Doimo et al.

NEURIPS 2025 • spotlight • arXiv:2510.21518

#8170

Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs

Yifan Shen, Yuanzhe Liu, Jingyuan Zhu et al.

NEURIPS 2025 • arXiv:2506.21656

#8171

Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation

Wenbo Zhang, Tianrun Hu, Hanbo Zhang et al.

NEURIPS 2025 • oral • arXiv:2506.09990

#8172

VideoVLA: Video Generators Can Be Generalizable Robot Manipulators

Yichao Shen, Fangyun Wei, Zhiying Du et al.

NEURIPS 2025 • arXiv:2512.06963

#8173

Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective

Yang Zhang, Xinran Li, Jianing Ye et al.

NEURIPS 2025 • arXiv:2505.20922

#8174

AutoData: A Multi-Agent System for Open Web Data Collection

Tianyi Ma, Yiyue Qian, Zheyuan Zhang et al.

NEURIPS 2025 • arXiv:2505.15859

#8175

E2Former: An Efficient and Equivariant Transformer with Linear-Scaling Tensor Products

Yunyang Li, Lin Huang, Zhihao Ding et al.

NEURIPS 2025 • spotlight • arXiv:2501.19216

#8176

Do different prompting methods yield a common task representation in language models?

Guy Davidson, Todd Gureckis, Brenden Lake et al.

NEURIPS 2025 • arXiv:2505.12075

#8177

Alligat0R: Pre-Training through Covisibility Segmentation for Relative Camera Pose Regression

Thibaut Loiseau, Guillaume Bourmaud, Vincent Lepetit

NEURIPS 2025 • spotlight

#8178

PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning

Yizhen Zhang, Yang Ding, Shuoshuo Zhang et al.

NEURIPS 2025 • arXiv:2506.14907

#8179

Let Me Think! A Long Chain of Thought Can Be Worth Exponentially Many Short Ones

Parsa Mirtaheri, Ezra Edelman, Samy Jelassi et al.

NEURIPS 2025 • arXiv:2505.21825

#8180

Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation

Edward Fish, Richard Bowden

NEURIPS 2025 • oral • arXiv:2506.00129

#8181

Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks

Debargha Ganguly, Vikash Singh, Sreehari Sankar et al.

NEURIPS 2025 • arXiv:2505.20047

#8182

SAVVY: Spatial Awareness via Audio-Visual LLMs through Seeing and Hearing

Mingfei Chen, Zijun Cui, Xiulong Liu et al.

NEURIPS 2025 • oral • arXiv:2506.05414

#8183

Flexible MOF Generation with Torsion-Aware Flow Matching

Nayoung Kim, Seongsu Kim, Sungsoo Ahn

NEURIPS 2025 • arXiv:2505.17914

#8184

MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants

Hritik Bansal, Daniel Israel, Siyan Zhao et al.

NEURIPS 2025 • arXiv:2412.12661

#8185

FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design

Asal Mehradfar, Xuzhe Zhao, Yilun Huang et al.

NEURIPS 2025 • arXiv:2505.21923

#8186

Functional Scaling Laws in Kernel Regression: Loss Dynamics and Learning Rate Schedules

Binghui Li, Fengling Chen, Zixun Huang et al.

NEURIPS 2025 • spotlight • arXiv:2509.19189

#8187

On the Edge of Memorization in Diffusion Models

Sam Buchanan, Druv Pai, Yi Ma et al.

NEURIPS 2025 • arXiv:2508.17689

#8188

CellVerse: Do Large Language Models Really Understand Cell Biology?

Fan Zhang, Tianyu Liu, Zhihong Zhu et al.

NEURIPS 2025 • arXiv:2505.07865

#8189

Orthogonal Survival Learners for Estimating Heterogeneous Treatment Effects from Time-to-Event Data

Dennis Frauen, Maresa Schröder, Konstantin Hess et al.

NEURIPS 2025 • arXiv:2505.13072

#8190

Watermarking Autoregressive Image Generation

Nikola Jovanović, Ismail Labiad, Tomas Soucek et al.

NEURIPS 2025 • arXiv:2506.16349

#8191

Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation

Erfan Baghaei Potraghloo, Seyedarmin Azizi, Souvik Kundu et al.

NEURIPS 2025 • arXiv:2509.02510

#8192

Efficient Quadratic Corrections for Frank-Wolfe Algorithms

Jannis Halbey, Seta Rakotomandimby, Mathieu Besançon et al.

NEURIPS 2025 • arXiv:2506.02635

#8193

Toward Engineering AGI: Benchmarking the Engineering Design Capabilities of LLMs

Xingang Guo, Yaxin Li, XiangYi Kong et al.

NEURIPS 2025 • arXiv:2509.16204

#8194

Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations

Chaofan Gan, Yuanpeng Tu, Xi Chen et al.

NEURIPS 2025 • arXiv:2505.18584

#8195

L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models

Xiaohao Liu, Xiaobo Xia, Weixiang Zhao et al.

NEURIPS 2025 • arXiv:2505.17505

#8196

Physics-Driven Spatiotemporal Modeling for AI-Generated Video Detection

Shuhai Zhang, ZiHao Lian, Jiahao Yang et al.

NEURIPS 2025 • oral • arXiv:2510.08073

#8197

MaterialRefGS: Reflective Gaussian Splatting with Multi-view Consistent Material Inference

Wenyuan Zhang, Jimin Tang, Weiqi Zhang et al.

NEURIPS 2025 • arXiv:2510.11387

#8198

Surprise3D: A Dataset for Spatial Understanding and Reasoning in Complex 3D Scenes

Jiaxin Huang, Ziwen Li, Hanlue Zhang et al.

NEURIPS 2025 • arXiv:2507.07781

#8199

RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video

ShuHang Xun, Sicheng Tao, Jungang Li et al.

NEURIPS 2025 • arXiv:2505.02064

#8200

DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding

Yue Jiang, Jichu Li, Yang Liu et al.

NEURIPS 2025 • oral • arXiv:2505.18411
