Most Cited 2025 Poster Papers

22,274 papers found • Page 35 of 112

#6801

Decouple Distortion from Perception: Region Adaptive Diffusion for Extreme-low Bitrate Perception Image Compression

Jinchang Xu, Shaokang Wang, Jintao Chen et al.

CVPR 2025
3
citations
#6802

CReFT-CAD: Boosting Orthographic Projection Reasoning for CAD via Reinforcement Fine-Tuning

Ke Niu, Zhuofan Chen, Haiyang Yu et al.

NEURIPS 2025arXiv:2506.00568
3
citations
#6803

CrossAD: Time Series Anomaly Detection with Cross-scale Associations and Cross-window Modeling

Beibu Li, Qichao Shentu, Yang Shu et al.

NEURIPS 2025arXiv:2510.12489
3
citations
#6804

Hallucinatory Image Tokens: A Training-free EAZY Approach to Detecting and Mitigating Object Hallucinations in LVLMs

Liwei Che, Qingze T Liu, Jing Jia et al.

ICCV 2025arXiv:2503.07772
3
citations
#6805

SC-Captioner: Improving Image Captioning with Self-Correction by Reinforcement Learning

Lin Zhang, Xianfang Zeng, Kangcong Li et al.

ICCV 2025arXiv:2508.06125
3
citations
#6806

SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding

Zhao Jin, Rong-Cheng Tu, Jingyi Liao et al.

NEURIPS 2025arXiv:2506.21924
3
citations
#6807

CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models

Quang-Binh Nguyen, Minh Luu, Quang Nguyen et al.

ICCV 2025arXiv:2507.13984
3
citations
#6808

Sampling from multi-modal distributions with polynomial query complexity in fixed dimension via reverse diffusion

Adrien Vacher, Omar Chehab, Anna Korba

NEURIPS 2025arXiv:2501.00565
3
citations
#6809

FACE: Faithful Automatic Concept Extraction

Dipkamal Bhusal, Michael Clifford, Sara Rampazzi et al.

NEURIPS 2025arXiv:2510.11675
3
citations
#6810

Entropic Time Schedulers for Generative Diffusion Models

Dejan Stancevic, Florian Handke, Luca Ambrogioni

NEURIPS 2025arXiv:2504.13612
3
citations
#6811

Sufficient Invariant Learning for Distribution Shift

Taero Kim, Subeen Park, Sungjun Lim et al.

CVPR 2025arXiv:2210.13533
3
citations
#6812

Differentiation Through Black-Box Quadratic Programming Solvers

Connor Magoon, Fengyu Yang, Noam Aigerman et al.

NEURIPS 2025arXiv:2410.06324
3
citations
#6813

Integrating Visual Interpretation and Linguistic Reasoning for Geometric Problem Solving

Zixian Guo, Ming Liu, Qilong Wang et al.

ICCV 2025
3
citations
#6814

$\boldsymbol{\lambda}$-Orthogonality Regularization for Compatible Representation Learning

Simone Ricci, Niccolò Biondi, Federico Pernici et al.

NEURIPS 2025
3
citations
#6815

Next Semantic Scale Prediction via Hierarchical Diffusion Language Models

Cai Zhou, Chenyu Wang, Dinghuai Zhang et al.

NEURIPS 2025arXiv:2510.08632
3
citations
#6816

Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better

Zihang Lai, Andrea Vedaldi

CVPR 2025highlightarXiv:2503.19904
3
citations
#6817

Styl3R: Instant 3D Stylized Reconstruction for Arbitrary Scenes and Styles

Peng Wang, Xiang Liu, Peidong Liu

NEURIPS 2025arXiv:2505.21060
3
citations
#6818

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Ruihang Chu, Yefei He, Zhekai Chen et al.

NEURIPS 2025oralarXiv:2512.08765
3
citations
#6819

Joint Diffusion Models in Continual Learning

Paweł Skierś, Kamil Deja

ICCV 2025arXiv:2411.08224
3
citations
#6820

GT-Loc: Unifying When and Where in Images through a Joint Embedding Space

David G. Shatwell, Ishan Rajendrakumar Dave, Swetha Sirnam et al.

ICCV 2025arXiv:2507.10473
3
citations
#6821

MV-CoLight: Efficient Object Compositing with Consistent Lighting and Shadow Generation

Kerui Ren, Jiayang Bai, Linning Xu et al.

NEURIPS 2025arXiv:2505.21483
3
citations
#6822

GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning

Shutong Ding, Ke Hu, Shan Zhong et al.

NEURIPS 2025arXiv:2505.18763
3
citations
#6823

VITRIX-UniViTAR: Unified Vision Transformer with Native Resolution

Limeng Qiao, Yiyang Gan, Bairui Wang et al.

NEURIPS 2025oral
3
citations
#6824

ATA: Adaptive Transformation Agent for Text-Guided Subject-Position Variable Background Inpainting

Yizhe Tang, Zhimin Sun, Yuzhen Du et al.

CVPR 2025
3
citations
#6825

GG-SSMs: Graph-Generating State Space Models

Nikola Zubic, Davide Scaramuzza

CVPR 2025
3
citations
#6826

Dynamic View Synthesis as an Inverse Problem

Hidir Yesiltepe, Pinar Yanardag

NEURIPS 2025arXiv:2506.08004
3
citations
#6827

PMA: Towards Parameter-Efficient Point Cloud Understanding via Point Mamba Adapter

Yaohua Zha, Yanzi Wang, Hang Guo et al.

CVPR 2025arXiv:2505.20941
3
citations
#6828

Convergent Functions, Divergent Forms

Hyeonseong Jeon, Ainaz Eftekhar, Aaron Walsman et al.

NEURIPS 2025arXiv:2505.21665
3
citations
#6829

Learning to Better Search with Language Models via Guided Reinforced Self-Training

Seungyong Moon, Bumsoo Park, Hyun Oh Song

NEURIPS 2025arXiv:2410.02992
3
citations
#6830

HumanMM: Global Human Motion Recovery from Multi-shot Videos

Yuhong Zhang, Guanlin Wu, Ling-Hao Chen et al.

CVPR 2025arXiv:2503.07597
3
citations
#6831

Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation

Jiaer Xia, Bingkui Tong, Yuhang Zang et al.

ICCV 2025highlightarXiv:2507.02859
3
citations
#6832

Reading Recognition in the Wild

Charig Yang, Samiul Alam, Shakhrul Iman Siam et al.

NEURIPS 2025arXiv:2505.24848
3
citations
#6833

Heavy Labels Out! Dataset Distillation with Label Space Lightening

Ruonan Yu, Songhua Liu, Zigeng Chen et al.

ICCV 2025arXiv:2408.08201
3
citations
#6834

VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions

Marko Mihajlovic, Siwei Zhang, Gen Li et al.

ICCV 2025highlightarXiv:2506.23236
3
citations
#6835

Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis

Boming Miao, Chunxiao Li, Xiaoxiao Wang et al.

CVPR 2025arXiv:2411.16503
3
citations
#6836

Conditional Balance: Improving Multi-Conditioning Trade-Offs in Image Generation

Nadav Z. Cohen, Oron Nir, Ariel Shamir

CVPR 2025arXiv:2412.19853
3
citations
#6837

EA-KD: Entropy-based Adaptive Knowledge Distillation

Chi-Ping Su, Ching-Hsun Tseng, Bin Pu et al.

ICCV 2025arXiv:2311.13621
3
citations
#6838

Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation

Riccardo Corvi, Davide Cozzolino, Ekta Prashnani et al.

NEURIPS 2025arXiv:2506.16802
3
citations
#6839

I Am Big, You Are Little; I Am Right, You Are Wrong

David A Kelly, Akchunya Chanchal, Nathan Blake

ICCV 2025arXiv:2507.23509
3
citations
#6840

TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation

Zonglin Lyu, Chen Chen

ICCV 2025arXiv:2507.04984
3
citations
#6841

Generate, Refine, and Encode: Leveraging Synthesized Novel Samples for On-the-Fly Fine-Grained Category Discovery

Xiao Liu, Nan Pu, Haiyang Zheng et al.

ICCV 2025arXiv:2507.04051
3
citations
#6842

Deep learning for continuous-time stochastic control with jumps

Patrick Cheridito, Jean-Loup Dupret, Donatien Hainaut

NEURIPS 2025arXiv:2505.15602
3
citations
#6843

Surprise3D: A Dataset for Spatial Understanding and Reasoning in Complex 3D Scenes

Jiaxin Huang, Ziwen Li, Hanlue Zhang et al.

NEURIPS 2025arXiv:2507.07781
3
citations
#6844

BRACE: A Benchmark for Robust Audio Caption Quality Evaluation

Tianyu Guo, Hongyu Chen, Hao Liang et al.

NEURIPS 2025arXiv:2512.10403
3
citations
#6845

HybridMQA: Exploring Geometry-Texture Interactions for Colored Mesh Quality Assessment

Armin Shafiee Sarvestani, Sheyang Tang, Zhou Wang

CVPR 2025arXiv:2412.01986
3
citations
#6846

SpecEdge: Scalable Edge-Assisted Serving Framework for Interactive LLMs

Jinwoo Park, Seunggeun Cho, Dongsu Han

NEURIPS 2025spotlightarXiv:2505.17052
3
citations
#6847

Self-Refining Language Model Anonymizers via Adversarial Distillation

Kyuyoung Kim, Hyunjun Jeon, Jinwoo Shin

NEURIPS 2025arXiv:2506.01420
3
citations
#6848

ReDi: Rectified Discrete Flow

Jaehoon Yoo, Wonjung Kim, Seunghoon Hong

NEURIPS 2025arXiv:2507.15897
3
citations
#6849

Attention! Your Vision Language Model Could Be Maliciously Manipulated

Xiaosen Wang, Shaokang Wang, Zhijin Ge et al.

NEURIPS 2025arXiv:2505.19911
3
citations
#6850

AdaDetectGPT: Adaptive Detection of LLM-Generated Text with Statistical Guarantees

Hongyi Zhou, Jin Zhu, Pingfan Su et al.

NEURIPS 2025arXiv:2510.01268
3
citations
#6851

DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding

Yue Jiang, Jichu Li, Yang Liu et al.

NEURIPS 2025oralarXiv:2505.18411
3
citations
#6852

GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models

Jonathan Roberts, Kai Han, Samuel Albanie

ICCV 2025arXiv:2408.11817
3
citations
#6853

Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees

Sourav Ganguly, Kishan Panaganti, Arnob Ghosh et al.

NEURIPS 2025arXiv:2505.19238
3
citations
#6854

SGFormer: Satellite-Ground Fusion for 3D Semantic Scene Completion

Xiyue Guo, Jiarui Hu, Junjie Hu et al.

CVPR 2025arXiv:2503.16825
3
citations
#6855

Visual Modality Prompt for Adapting Vision-Language Object Detectors

Heitor Rapela Medeiros, Atif Belal, Srikanth Muralidharan et al.

ICCV 2025arXiv:2412.00622
3
citations
#6856

Is `Right' Right? Enhancing Object Orientation Understanding in Multimodal Large Language Models through Egocentric Instruction Tuning

JiHyeok Jung, EunTae Kim, SeoYeon Kim et al.

CVPR 2025arXiv:2411.16761
3
citations
#6857

Efficient Unsupervised Shortcut Learning Detection and Mitigation in Transformers

Lukas Kuhn, sari sadiya, Jörg Schlötterer et al.

ICCV 2025arXiv:2501.00942
3
citations
#6858

FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency

Yifei Su, Ning Liu, Dong Chen et al.

NEURIPS 2025oralarXiv:2506.08822
3
citations
#6859

From Sequence to Structure: Uncovering Substructure Reasoning in Transformers

Xinnan Dai, Kai Yang, Jay Revolinsky et al.

NEURIPS 2025arXiv:2507.10435
3
citations
#6860

GeoComplete: Geometry-Aware Diffusion for Reference-Driven Image Completion

Beibei Lin, Tingting Chen, Robby Tan

NEURIPS 2025arXiv:2510.03110
3
citations
#6861

HCRMP: An LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving

Zhiwen Chen, Hanming Deng, Zhuoren Li et al.

NEURIPS 2025arXiv:2505.15793
3
citations
#6862

Multi-focal Conditioned Latent Diffusion for Person Image Synthesis

Jiaqi Liu, Jichao Zhang, Paolo Rota et al.

CVPR 2025arXiv:2503.15686
3
citations
#6863

TriDi: Trilateral Diffusion of 3D Humans, Objects, and Interactions

Ilya A. Petrov, Riccardo Marin, Julian Chibane et al.

ICCV 2025arXiv:2412.06334
3
citations
#6864

Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding

Hanyin Wang, Zhenbang Wu, Gururaj Kolar et al.

NEURIPS 2025spotlightarXiv:2505.21908
3
citations
#6865

A Stable Whitening Optimizer for Efficient Neural Network Training

Kevin Frans, Sergey Levine, Pieter Abbeel

NEURIPS 2025arXiv:2506.07254
3
citations
#6866

HumanSAM: Classifying Human-centric Forgery Videos in Human Spatial, Appearance, and Motion Anomaly

Chang Liu, Yunfan Ye, Fan Zhang et al.

ICCV 2025arXiv:2507.19924
3
citations
#6867

EvoLM: In Search of Lost Language Model Training Dynamics

Zhenting Qi, Fan Nie, Alexandre Alahi et al.

NEURIPS 2025oralarXiv:2506.16029
3
citations
#6868

Who You Are Matters: Bridging Interests and Social Roles via LLM-Enhanced Logic Recommendation

Qing Yu, Xiaobei Wang, Shuchang Liu et al.

NEURIPS 2025oral
3
citations
#6869

Fairshare Data Pricing via Data Valuation for Large Language Models

Luyang Zhang, Cathy Jiao, Beibei Li et al.

NEURIPS 2025arXiv:2502.00198
3
citations
#6870

Scale Efficient Training for Large Datasets

Qing Zhou, Junyu Gao, Qi Wang

CVPR 2025arXiv:2503.13385
3
citations
#6871

Adaptive Distraction: Probing LLM Contextual Robustness with Automated Tree Search

Yanbo Wang, Zixiang Xu, Yue Huang et al.

NEURIPS 2025arXiv:2502.01609
3
citations
#6872

SceneMI: Motion In-betweening for Modeling Human-Scene Interaction

Inwoo Hwang, Bing Zhou, Young Min Kim et al.

ICCV 2025highlightarXiv:2503.16289
3
citations
#6873

CAPability: A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness

Zhihang Liu, Chen-Wei Xie, Bin Wen et al.

NEURIPS 2025arXiv:2502.14914
3
citations
#6874

Zero-Shot Trajectory Planning for Signal Temporal Logic Tasks

Ruijia Liu, Ancheng Hou, Xiao Yu et al.

NEURIPS 2025oralarXiv:2501.13457
3
citations
#6875

Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization

Junying Wang, Jingyuan Liu, Xin Sun et al.

CVPR 2025arXiv:2504.03011
3
citations
#6876

MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search

Zonglin Yang, Wanhao Liu, Ben Gao et al.

NEURIPS 2025arXiv:2505.19209
3
citations
#6877

Trans-EnV: A Framework for Evaluating the Linguistic Robustness of LLMs Against English Varieties

Jiyoung Lee, Seungho Kim, Jieun Han et al.

NEURIPS 2025arXiv:2505.20875
3
citations
#6878

Dataset Distillation via Vision-Language Category Prototype

YAWEN ZOU, Guang Li, Duo Su et al.

ICCV 2025highlightarXiv:2506.23580
3
citations
#6879

Parallelizing MCMC Across the Sequence Length

David Zoltowski, Skyler Wu, Xavier Gonzalez et al.

NEURIPS 2025arXiv:2508.18413
3
citations
#6880

RoboTron-Nav: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction

Yufeng Zhong, Chengjian Feng, Feng yan et al.

ICCV 2025arXiv:2503.18525
3
citations
#6881

Zero-shot RGB-D Point Cloud Registration with Pre-trained Large Vision Model

Haobo Jiang, Jin Xie, Jian Yang et al.

CVPR 2025
3
citations
#6882

FREE-Merging: Fourier Transform for Efficient Model Merging

Shenghe Zheng, Hongzhi Wang

ICCV 2025arXiv:2411.16815
3
citations
#6883

UniPhy: Learning a Unified Constitutive Model for Inverse Physics Simulation

Himangi Mittal, Peiye Zhuang, Hsin-Ying Lee et al.

CVPR 2025arXiv:2505.16971
3
citations
#6884

ATLAS: Decoupling Skeletal and Shape Parameters for Expressive Parametric Human Modeling

Jinhyung Park, Javier Romero, Shunsuke Saito et al.

ICCV 2025arXiv:2508.15767
3
citations
#6885

Identifiability of Deep Polynomial Neural Networks

Konstantin Usevich, Ricardo Borsoi, Clara Dérand et al.

NEURIPS 2025oralarXiv:2506.17093
3
citations
#6886

On the Generalization of Representation Uncertainty in Earth Observation

Spyros Kondylatos, Nikolaos Ioannis Bountos, Dimitrios Michail et al.

ICCV 2025arXiv:2503.07082
3
citations
#6887

Predict-Optimize-Distill: A Self-Improving Cycle for 4D Object Understanding

Mingxuan Wu, Huang Huang, Justin Kerr et al.

ICCV 2025arXiv:2504.17441
3
citations
#6888

ImViD: Immersive Volumetric Videos for Enhanced VR Engagement

Zhengxian Yang, Shi Pan, Shengqi Wang et al.

CVPR 2025highlightarXiv:2503.14359
3
citations
#6889

Jigsaw++: Imagining Complete Shape Priors for Object Reassembly

Jiaxin Lu, Gang Hua, Qixing Huang

ICCV 2025arXiv:2410.11816
3
citations
#6890

SMMILE: An expert-driven benchmark for multimodal medical in-context learning

Melanie Rieff, Maya Varma, Ossian Rabow et al.

NEURIPS 2025arXiv:2506.21355
3
citations
#6891

Towards Unbiased and Robust Spatio-Temporal Scene Graph Generation and Anticipation

Rohith Peddi, Saurabh ., Ayush Abhay Shrivastava et al.

CVPR 2025highlightarXiv:2411.13059
3
citations
#6892

ARGUS: Hallucination and Omission Evaluation in Video-LLMs

Ruchit Rawal, Reza Shirkavand, Heng Huang et al.

ICCV 2025arXiv:2506.07371
3
citations
#6893

In the Eye of MLLM: Benchmarking Egocentric Video Intent Understanding with Gaze-Guided Prompting

Taiying Peng, Jiacheng Hua, Miao Liu et al.

NEURIPS 2025oralarXiv:2509.07447
3
citations
#6894

NADER: Neural Architecture Design via Multi-Agent Collaboration

Zekang Yang, Wang ZENG, Sheng Jin et al.

CVPR 2025arXiv:2412.19206
3
citations
#6895

CSI-Bench: A Large-Scale In-the-Wild Dataset for Multi-task WiFi Sensing

Guozhen Zhu, Yuqian Hu, Weihang Gao et al.

NEURIPS 2025arXiv:2505.21866
3
citations
#6896

BoltzNCE: Learning likelihoods for Boltzmann Generation with Stochastic Interpolants and Noise Contrastive Estimation

Rishal Aggarwal, Jacky Chen, Nicholas Boffi et al.

NEURIPS 2025arXiv:2507.00846
3
citations
#6897

A Flag Decomposition for Hierarchical Datasets

Nathan Mankovich, Ignacio Santamaria, Gustau Camps-Valls et al.

CVPR 2025arXiv:2502.07782
3
citations
#6898

From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Calibration

Mingyang Song, Xiaoye Qu, Jiawei Zhou et al.

CVPR 2025arXiv:2503.12821
3
citations
#6899

On Fairness of Unified Multimodal Large Language Model for Image Generation

Ming Liu, Hao Chen, Jindong Wang et al.

NEURIPS 2025arXiv:2502.03429
3
citations
#6900

DeCafNet: Delegate and Conquer for Efficient Temporal Grounding in Long Videos

Zijia Lu, ASM Iftekhar, Gaurav Mittal et al.

CVPR 2025arXiv:2505.16376
3
citations
#6901

STEP: A Unified Spiking Transformer Evaluation Platform for Fair and Reproducible Benchmarking

Sicheng Shen, Dongcheng Zhao, Linghao Feng et al.

NEURIPS 2025oralarXiv:2505.11151
3
citations
#6902

3D-MOOD: Lifting 2D to 3D for Monocular Open-Set Object Detection

Yung-Hsu Yang, Luigi Piccinelli, Mattia Segu et al.

ICCV 2025arXiv:2507.23567
3
citations
#6903

SUM Parts: Benchmarking Part-Level Semantic Segmentation of Urban Meshes

Weixiao Gao, Liangliang Nan, Hugo Ledoux

CVPR 2025arXiv:2503.15300
3
citations
#6904

Olympus: A Universal Task Router for Computer Vision Tasks

Yuanze Lin, Yunsheng Li, Dongdong Chen et al.

CVPR 2025highlightarXiv:2412.09612
3
citations
#6905

SViMo: Synchronized Diffusion for Video and Motion Generation in Hand-object Interaction Scenarios

Lingwei Dang, Ruizhi Shao, Hongwen Zhang et al.

NEURIPS 2025spotlightarXiv:2506.02444
3
citations
#6906

FlySearch: Exploring how vision-language models explore

Adam Pardyl, Dominik Matuszek, Mateusz Przebieracz et al.

NEURIPS 2025arXiv:2506.02896
3
citations
#6907

VA-MoE: Variables-Adaptive Mixture of Experts for Incremental Weather Forecasting

Hao Chen, Tao Han, Song Guo et al.

ICCV 2025arXiv:2412.02503
3
citations
#6908

THUNDER: Tile-level Histopathology image UNDERstanding benchmark

Pierre Marza, Leo Fillioux, Sofiène Boutaj et al.

NEURIPS 2025spotlightarXiv:2507.07860
3
citations
#6909

Believing is Seeing: Unobserved Object Detection using Generative Models

Subhransu S. Bhattacharjee, Dylan Campbell, Rahul Shome

CVPR 2025arXiv:2410.05869
3
citations
#6910

Glocal Information Bottleneck for Time Series Imputation

Jie Yang, Kexin Zhang, Guibin Zhang et al.

NEURIPS 2025oralarXiv:2510.04910
3
citations
#6911

OODD: Test-time Out-of-Distribution Detection with Dynamic Dictionary

Yifeng Yang, Lin Zhu, Zewen Sun et al.

CVPR 2025arXiv:2503.10468
3
citations
#6912

Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM

Zinuo Li, Xian Zhang, Yongxin Guo et al.

NEURIPS 2025oralarXiv:2505.18110
3
citations
#6913

GO-N3RDet: Geometry Optimized NeRF-enhanced 3D Object Detector

Zechuan Li, Hongshan Yu, Yihao Ding et al.

CVPR 2025arXiv:2503.15211
3
citations
#6914

DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution

Yuzhong Zhao, Feng Liu, Yue Liu et al.

CVPR 2025arXiv:2405.16071
3
citations
#6915

EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition

Christoph Schuhmann, Robert Kaczmarczyk, Gollam Rabby et al.

NEURIPS 2025arXiv:2505.20033
3
citations
#6916

Stable Part Diffusion 4D: Multi-View RGB and Kinematic Parts Video Generation

Hao Zhang, Chun-Han Yao, Simon Donné et al.

NEURIPS 2025oralarXiv:2509.10687
3
citations
#6917

Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring

Yuyan Chen, Nico Lang, B. Schmidt et al.

NEURIPS 2025spotlightarXiv:2503.01691
3
citations
#6918

Unveiling the Invisible: Reasoning Complex Occlusions Amodally with AURA

Zhixuan Li, Hyunse Yoon, Sanghoon Lee et al.

ICCV 2025arXiv:2503.10225
3
citations
#6919

4D Visual Pre-training for Robot Learning

Chengkai Hou, Yanjie Ze, Yankai Fu et al.

ICCV 2025arXiv:2508.17230
3
citations
#6920

Escaping the SpuriVerse: Can Large Vision-Language Models Generalize Beyond Seen Spurious Correlations?

Yiwei Yang, Chung Peng Lee, Shangbin Feng et al.

NEURIPS 2025arXiv:2506.18322
3
citations
#6921

Compositional Caching for Training-free Open-vocabulary Attribute Detection

Marco Garosi, Alessandro Conti, Gaowen Liu et al.

CVPR 2025highlightarXiv:2503.19145
3
citations
#6922

Extremely Simple Multimodal Outlier Synthesis for Out-of-Distribution Detection and Segmentation

Moru Liu, Hao Dong, Jessica Kelly et al.

NEURIPS 2025arXiv:2505.16985
3
citations
#6923

Asymptotic Theory of Geometric and Adaptive $k$-Means Clustering

Adam Quinn Jaffe

NEURIPS 2025arXiv:2202.13423
3
citations
#6924

TaxaDiffusion: Progressively Trained Diffusion Model for Fine-Grained Species Generation

Amin Karimi Monsefi, Mridul Khurana, Rajiv Ramnath et al.

ICCV 2025arXiv:2506.01923
3
citations
#6925

ADIEE: Automatic Dataset Creation and Scorer for Instruction-Guided Image Editing Evaluation

Sherry Chen, Yi Wei, Luowei Zhou et al.

ICCV 2025arXiv:2507.07317
3
citations
#6926

SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning

Fida Mohammad Thoker, Letian Jiang, Chen Zhao et al.

CVPR 2025arXiv:2504.00527
3
citations
#6927

Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning

Wang Yang, Zirui Liu, Hongye Jin et al.

NEURIPS 2025arXiv:2505.17315
3
citations
#6928

ResQ: A Novel Framework to Implement Residual Neural Networks on Analog Rydberg Atom Quantum Computers

Nicholas DiBrita, Jason Han, Tirthak Patel

ICCV 2025arXiv:2506.21537
3
citations
#6929

FairGen: Enhancing Fairness in Text-to-Image Diffusion Models via Self-Discovering Latent Directions

Yilei Jiang, Wei-Hong Li, Yiyuan Zhang et al.

ICCV 2025arXiv:2412.18810
3
citations
#6930

Return of ChebNet: Understanding and Improving an Overlooked GNN on Long Range Tasks

Ali Hariri, Alvaro Arroyo, Alessio Gravina et al.

NEURIPS 2025spotlightarXiv:2506.07624
3
citations
#6931

Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study

Zhengyu Hu, Jianxun Lian, Zheyuan Xiao et al.

NEURIPS 2025arXiv:2506.13464
3
citations
#6932

Whose View of Safety? A Deep DIVE Dataset for Pluralistic Alignment of Text-to-Image Models

Charvi Rastogi, Tian Huey Teh, Pushkar Mishra et al.

NEURIPS 2025spotlightarXiv:2507.13383
3
citations
#6933

SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models

Jaerin Lee, Daniel Jung, Kanggeon Lee et al.

CVPR 2025arXiv:2403.09055
3
citations
#6934

MoPFormer: Motion-Primitive Transformer for Wearable-Sensor Activity Recognition

Hao Zhang, Zhan Zhuang, Xuehao Wang et al.

NEURIPS 2025oralarXiv:2505.20744
3
citations
#6935

GEOPARD: Geometric Pretraining for Articulation Prediction in 3D Shapes

Pradyumn Goyal, Dmitrii Petrov, Sheldon Andrews et al.

ICCV 2025arXiv:2504.02747
3
citations
#6936

Memory-Enhanced Neural Solvers for Routing Problems

Felix Chalumeau, Refiloe Shabe, Noah De Nicola et al.

NEURIPS 2025spotlightarXiv:2406.16424
3
citations
#6937

Details Matter for Indoor Open-vocabulary 3D Instance Segmentation

Sanghun Jung, Jingjing Zheng, Ke Zhang et al.

ICCV 2025arXiv:2507.23134
3
citations
#6938

Self-supervised Learning of Hybrid Part-aware 3D Representations of 2D Gaussians and Superquadrics

Zhirui Gao, Renjiao Yi, Yuhang Huang et al.

ICCV 2025arXiv:2408.10789
3
citations
#6939

From Laboratory to Real World: A New Benchmark Towards Privacy-Preserved Visible-Infrared Person Re-Identification

Yan Jiang, Hao Yu, Xu Cheng et al.

CVPR 2025
3
citations
#6940

Gradient Multi-Normalization for Efficient LLM Training

Meyer Scetbon, Chao Ma, Wenbo Gong et al.

NEURIPS 2025
3
citations
#6941

One Sample is Enough to Make Conformal Prediction Robust

Soroush H. Zargarbashi, Mohammad Sadegh Akhondzadeh, Aleksandar Bojchevski

NEURIPS 2025arXiv:2506.16553
3
citations
#6942

End-to-End Multi-Modal Diffusion Mamba

Chunhao Lu, Qiang Lu, Meichen Dong et al.

ICCV 2025arXiv:2510.13253
3
citations
#6943

VIGFace: Virtual Identity Generation for Privacy-Free Face Recognition Dataset

Minsoo Kim, Min-Cheol Sagong, Gi Pyo Nam et al.

ICCV 2025
3
citations
#6944

RapVerse: Coherent Vocals and Whole-Body Motion Generation from Text

Jiaben Chen, Xin Yan, Yihang Chen et al.

ICCV 2025arXiv:2405.20336
3
citations
#6945

Gradient-Variation Online Adaptivity for Accelerated Optimization with Hölder Smoothness

Yuheng Zhao, Yu-Hu Yan, Kfir Y. Levy et al.

NEURIPS 2025spotlightarXiv:2511.02276
3
citations
#6946

How Different from the Past? Spatio-Temporal Time Series Forecasting with Self-Supervised Deviation Learning

Haotian Gao, Zheng Dong, Jiawei Yong et al.

NEURIPS 2025oralarXiv:2510.04908
3
citations
#6947

Compressed and Smooth Latent Space for Text Diffusion Modeling

Viacheslav Meshchaninov, Egor Chimbulatov, Alexander Shabalin et al.

NEURIPS 2025arXiv:2506.21170
3
citations
#6948

AdaLRS: Loss-Guided Adaptive Learning Rate Search for Efficient Foundation Model Pretraining

Hongyuan Dong, Dingkang Yang, Xiao Liang et al.

NEURIPS 2025arXiv:2506.13274
3
citations
#6949

Unleashing High-Quality Image Generation in Diffusion Sampling Using Second-Order Levenberg-Marquardt-Langevin

Fangyikang Wang, Hubery Yin, Lei Qian et al.

ICCV 2025arXiv:2505.24222
3
citations
#6950

PoseTraj: Pose-Aware Trajectory Control in Video Diffusion

longbin ji, Lei Zhong, Pengfei Wei et al.

CVPR 2025arXiv:2503.16068
3
citations
#6951

A machine learning approach that beats Rubik's cubes

Alexander Chervov, Kirill Khoruzhii, Nikita Bukhal et al.

NEURIPS 2025spotlight
3
citations
#6952

CuMPerLay: Learning Cubical Multiparameter Persistence Vectorizations

Caner Korkmaz, Brighton Nuwagira, Baris Coskunuzer et al.

ICCV 2025arXiv:2510.12795
3
citations
#6953

DERD-Net: Learning Depth from Event-based Ray Densities

Diego de Oliveira Hitzges, Suman Ghosh, Guillermo Gallego

NEURIPS 2025spotlightarXiv:2504.15863
3
citations
#6954

SA-Occ: Satellite-Assisted 3D Occupancy Prediction in Real World

Chen Chen, Zhirui Wang, Taowei Sheng et al.

ICCV 2025arXiv:2503.16399
3
citations
#6955

RoboTron-Sim: Improving Real-World Driving via Simulated Hard-Case

Baihui Xiao, Chengjian Feng, Zhijian Huang et al.

ICCV 2025arXiv:2508.04642
3
citations
#6956

COALA: Numerically Stable and Efficient Framework for Context-Aware Low-Rank Approximation

Uliana Parkina, Maxim Rakhuba

NEURIPS 2025arXiv:2507.07580
3
citations
#6957

PriOr-Flow: Enhancing Primitive Panoramic Optical Flow with Orthogonal View

Longliang Liu, Miaojie Feng, Junda Cheng et al.

ICCV 2025highlightarXiv:2506.23897
3
citations
#6958

Hierarchical-aware Orthogonal Disentanglement Framework for Fine-grained Skeleton-based Action Recognition

Haochen Chang, Pengfei Ren, Haoyang Zhang et al.

ICCV 2025
3
citations
#6959

Disentangled Clothed Avatar Generation with Layered Representation

Weitian Zhang, Yichao Yan, Sijing Wu et al.

ICCV 2025highlightarXiv:2501.04631
3
citations
#6960

Anti-Aliased 2D Gaussian Splatting

Mae Younes, Adnane Boukhayma

NEURIPS 2025arXiv:2506.11252
3
citations
#6961

SAUCE: Selective Concept Unlearning in Vision-Language Models with Sparse Autoencoders

Jiahui Geng, Qing Li

ICCV 2025arXiv:2503.14530
3
citations
#6962

Rethinking Layered Graphic Design Generation with a Top-Down Approach

Jingye Chen, Zhaowen Wang, Nanxuan Zhao et al.

ICCV 2025arXiv:2507.05601
3
citations
#6963

Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens

Qihang Fan, Huaibo Huang, Mingrui Chen et al.

ICCV 2025arXiv:2405.13337
3
citations
#6964

GaRe: Relightable 3D Gaussian Splatting for Outdoor Scenes from Unconstrained Photo Collections

Haiyang Bai, Jiaqi Zhu, Songru Jiang et al.

ICCV 2025arXiv:2507.20512
3
citations
#6965

PropVG: End-to-End Proposal-Driven Visual Grounding with Multi-Granularity Discrimination

Ming Dai, Wenxuan Cheng, Jiedong Zhuang et al.

ICCV 2025arXiv:2509.04833
3
citations
#6966

MP-HSIR: A Multi-Prompt Framework for Universal Hyperspectral Image Restoration

Zhehui Wu, Yong Chen, Naoto Yokoya et al.

ICCV 2025arXiv:2503.09131
3
citations
#6967

Faster and Better 3D Splatting via Group Training

Chengbo Wang, Guozheng Ma, Yizhen Lao et al.

ICCV 2025arXiv:2412.07608
3
citations
#6968

Inpaint4Drag: Repurposing Inpainting Models for Drag-Based Image Editing via Bidirectional Warping

Jingyi Lu, Kai Han

ICCV 2025arXiv:2509.04582
3
citations
#6969

RTMap: Real-Time Recursive Mapping with Change Detection and Localization

Yuheng Du, Sheng Yang, Lingxuan Wang et al.

ICCV 2025arXiv:2507.00980
3
citations
#6970

Joint Self-Supervised Video Alignment and Action Segmentation

Ali Shah Ali, Syed Ahmed Mahmood, Mubin Saeed et al.

ICCV 2025arXiv:2503.16832
3
citations
#6971

You Think, You ACT: The New Task of Arbitrary Text to Motion Generation

Runqi Wang, Caoyuan Ma, Guopeng Li et al.

ICCV 2025arXiv:2404.14745
3
citations
#6972

Constraint-Aware Feature Learning for Parametric Point Cloud

Xi Cheng, Ruiqi Lei, Di Huang et al.

ICCV 2025arXiv:2411.07747
3
citations
#6973

NeRF Is a Valuable Assistant for 3D Gaussian Splatting

Shuangkang Fang, I-Chao Shen, Takeo Igarashi et al.

ICCV 2025arXiv:2507.23374
3
citations
#6974

TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition

Xingsong Ye, Yongkun Du, Yunbo Tao et al.

ICCV 2025arXiv:2412.01137
3
citations
#6975

Monocular Semantic Scene Completion via Masked Recurrent Networks

Xuzhi Wang, Xinran Wu, Song Wang et al.

ICCV 2025arXiv:2507.17661
3
citations
#6976

DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion

Maksim Siniukov, Di Chang, Minh Tran et al.

ICCV 2025arXiv:2504.04010
3
citations
#6977

Foresight in Motion: Reinforcing Trajectory Prediction with Reward Heuristics

Muleilan Pei, Shaoshuai Shi, Xuesong Chen et al.

ICCV 2025arXiv:2507.12083
3
citations
#6978

Charm: The Missing Piece in ViT Fine-Tuning for Image Aesthetic Assessment

Fatemeh Behrad, Tinne Tuytelaars, Johan Wagemans

CVPR 2025arXiv:2504.02522
3
citations
#6979

Video Individual Counting for Moving Drones

Yaowu Fan, Jia Wan, Tao Han et al.

ICCV 2025highlightarXiv:2503.10701
3
citations
#6980

Open-ended Hierarchical Streaming Video Understanding with Vision Language Models

Hyolim Kang, Yunsu Park, Youngbeom Yoo et al.

ICCV 2025arXiv:2509.12145
3
citations
#6981

A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision

Chensheng Peng, Ido Sobol, Masayoshi Tomizuka et al.

ICCV 2025arXiv:2412.00623
3
citations
#6982

How To Make Your Cell Tracker Say "I dunno!"

Richard D Paul, Johannes Seiffarth, David Rügamer et al.

ICCV 2025
3
citations
#6983

GS-Occ3D: Scaling Vision-only Occupancy Reconstruction with Gaussian Splatting

Baijun Ye, Minghui Qin, Saining Zhang et al.

ICCV 2025arXiv:2507.19451
3
citations
#6984

Sparse Fine-Tuning of Transformers for Generative Tasks

Wei Chen, Jingxi Yu, Zichen Miao et al.

ICCV 2025arXiv:2507.10855
3
citations
#6985

From Imitation to Innovation: The Emergence of AI's Unique Artistic Styles and the Challenge of Copyright Protection

Zexi Jia, Chuanwei Huang, Hongyan Fei et al.

ICCV 2025arXiv:2507.04769
3
citations
#6986

FROSS: Faster-Than-Real-Time Online 3D Semantic Scene Graph Generation from RGB-D Images

Hao-Yu Hou, Chun-Yi Lee, Motoharu Sonogashira et al.

ICCV 2025arXiv:2507.19993
3
citations
#6987

Constrained Diffusers for Safe Planning and Control

Jichen Zhang, Liqun Zhao, Antonis Papachristodoulou et al.

NEURIPS 2025arXiv:2506.12544
3
citations
#6988

LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion

Fangfu Liu, Hao Li, Jiawei Chi et al.

ICCV 2025arXiv:2507.02813
3
citations
#6989

Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis

Chen Zhao, Xuan Wang, Tong Zhang et al.

ICCV 2025arXiv:2411.00144
3
citations
#6990

HairCUP: Hair Compositional Universal Prior for 3D Gaussian Avatars

Byungjun Kim, Shunsuke Saito, Giljoo Nam et al.

ICCV 2025arXiv:2507.19481
3
citations
#6991

GenM3: Generative Pretrained Multi-path Motion Model for Text Conditional Human Motion Generation

Junyu Shi, Lijiang LIU, Yong Sun et al.

ICCV 2025
3
citations
#6992

Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation

Xiuyu Yang, Shuhan Tan, Philipp Kraehenbuehl

ICCV 2025arXiv:2506.17213
3
citations
#6993

Kestrel: 3D Multimodal LLM for Part-Aware Grounded Description

Mahmoud Ahmed, Junjie Fei, Jian Ding et al.

ICCV 2025arXiv:2405.18937
3
citations
#6994

StyleMotif: Multi-Modal Motion Stylization using Style-Content Cross Fusion

Ziyu Guo, Young-Yoon Lee, Joseph Liu et al.

ICCV 2025arXiv:2503.21775
3
citations
#6995

Resilient Sensor Fusion Under Adverse Sensor Failures via Multi-Modal Expert Fusion

Konyul Park, Yecheol Kim, Daehun Kim et al.

CVPR 2025arXiv:2503.19776
3
citations
#6996

What You Have is What You Track: Adaptive and Robust Multimodal Tracking

Yuedong Tan, Jiawei Shao, Eduard Zamfir et al.

ICCV 2025arXiv:2507.05899
3
citations
#6997

INTER: Mitigating Hallucination in Large Vision-Language Models by Interaction Guidance Sampling

Xin Dong, Shichao Dong, Jin Wang et al.

ICCV 2025arXiv:2507.05056
3
citations
#6998

4D Gaussian Splatting SLAM

Yanyan Li, Youxu Fang, Zunjie Zhu et al.

ICCV 2025arXiv:2503.16710
3
citations
#6999

PBCAT: Patch-Based Composite Adversarial Training against Physically Realizable Attacks on Object Detection

Xiao Li, Yiming Zhu, Yifan Huang et al.

ICCV 2025arXiv:2506.23581
3
citations
#7000

AdaHuman: Animatable Detailed 3D Human Generation with Compositional Multiview Diffusion

Yangyi Huang, Ye Yuan, Xueting Li et al.

ICCV 2025arXiv:2505.24877
3
citations