Most Cited 2025 "ventral stream selectivity" Papers

22,274 papers found • Page 55 of 112

#10801

Dataset Distillation via Vision-Language Category Prototype

Yawen Zou, Guang Li, Duo Su et al.

ICCV 2025 • highlight • arXiv:2506.23580 • 3 citations
#10802

Predictive Uncertainty Quantification for Bird's Eye View Segmentation: A Benchmark and Novel Loss Function

Linlin Yu, Bowen Yang, Tianhao Wang et al.

ICLR 2025 • arXiv:2405.20986 • 3 citations
#10803

Zero-Shot Blind-spot Image Denoising via Implicit Neural Sampling

Yuhui Quan, Tianxiang Zheng, Zhiyuan Ma et al.

CVPR 2025 • 3 citations
#10804

Diffusion-based Realistic Listening Head Generation via Hybrid Motion Modeling

Yinuo Wang, Yanbo Fan, Xuan Wang et al.

CVPR 2025 • highlight • 3 citations
#10805

Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers

Quentin Guimard, Moreno D'Incà, Massimiliano Mancini et al.

CVPR 2025 • arXiv:2504.20902 • 3 citations
#10806

Accurate and Scalable Graph Neural Networks via Message Invariance

Zhihao Shi, Jie Wang, Zhiwei Zhuang et al.

ICLR 2025 • arXiv:2502.19693 • 3 citations
#10807

Search and Detect: Training-Free Long Tail Object Detection via Web-Image Retrieval

Mankeerat Sidhu, Hetarth Chopra, Ansel Blume et al.

CVPR 2025 • arXiv:2409.18733 • 3 citations
#10808

PaCA: Partial Connection Adaptation for Efficient Fine-Tuning

Sunghyeon Woo, Sol Namkung, SunWoo Lee et al.

ICLR 2025 • arXiv:2503.01905 • 3 citations
#10809

Fine-Grained Image-Text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation

Jiho Choi, Seonho Lee, Minhyun Lee et al.

CVPR 2025 • arXiv:2501.09688 • 3 citations
#10810

Stereo Any Video: Temporally Consistent Stereo Matching

Junpeng Jing, Weixun Luo, Ye Mao et al.

ICCV 2025 • highlight • arXiv:2503.05549 • 3 citations
#10811

Sparse Fine-Tuning of Transformers for Generative Tasks

Wei Chen, Jingxi Yu, Zichen Miao et al.

ICCV 2025 • arXiv:2507.10855 • 3 citations
#10812

Decoupling Training-Free Guided Diffusion by ADMM

Youyuan Zhang, Zehua Liu, Zenan Li et al.

CVPR 2025 • arXiv:2411.12773 • 3 citations
#10813

Learning from Synchronization: Self-Supervised Uncalibrated Multi-View Person Association in Challenging Scenes

Keqi Chen, Vinkle Srivastav, Didier Mutter et al.

CVPR 2025 • arXiv:2503.13739 • 3 citations
#10814

RC-AutoCalib: An End-to-End Radar-Camera Automatic Calibration Network

Van-Tin Luu, Yong-Lin Cai, Vu-Hoang Tran et al.

CVPR 2025 • arXiv:2505.22427 • 3 citations
#10815

Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics

Runzhe Wu, Ayush Sekhari, Akshay Krishnamurthy et al.

ICLR 2025 • arXiv:2406.11810 • 3 citations
#10816

Decoder Gradient Shield: Provable and High-Fidelity Prevention of Gradient-Based Box-Free Watermark Removal

Haonan An, Guang Hua, Zhengru Fang et al.

CVPR 2025 • arXiv:2502.20924 • 3 citations
#10817

Robust and Efficient 3D Gaussian Splatting for Urban Scene Reconstruction

Zhensheng Yuan, Haozhi Huang, Zhen Xiong et al.

ICCV 2025 • arXiv:2507.23006 • 3 citations
#10818

Efficient Imitation under Misspecification

Nicolas Espinosa Dice, Sanjiban Choudhury, Wen Sun et al.

ICLR 2025 • arXiv:2503.13162 • 3 citations
#10819

Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints

Guanjie Chen, Xinyu Zhao, Yucheng Zhou et al.

ICCV 2025 • arXiv:2411.17616 • 3 citations
#10820

Watermarking One for All: A Robust Watermarking Scheme Against Partial Image Theft

Gaozhi Liu, Silu Cao, Zhenxing Qian et al.

CVPR 2025 • 3 citations
#10821

Heavy Labels Out! Dataset Distillation with Label Space Lightening

Ruonan Yu, Songhua Liu, Zigeng Chen et al.

ICCV 2025 • arXiv:2408.08201 • 3 citations
#10822

Taming Flow Matching with Unbalanced Optimal Transport into Fast Pansharpening

Zihan Cao, Yu Zhong, Liang-Jian Deng

ICCV 2025 • arXiv:2503.14975 • 3 citations
#10823

Revisiting Pool-based Prompt Learning for Few-shot Class-incremental Learning

Yongwei Jiang, Yixiong Zou, Yuhua Li et al.

ICCV 2025 • arXiv:2507.09183 • 3 citations
#10824

Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting

Jiaxin Huang, Sheng Miao, Bangbang Yang et al.

ICCV 2025 • arXiv:2504.11092 • 3 citations
#10825

ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains

Yein Park, Chanwoong Yoon, Jungwoo Park et al.

ICLR 2025 • oral • arXiv:2410.09870 • 3 citations
#10826

ZeroKey: Point-Level Reasoning and Zero-Shot 3D Keypoint Detection from Large Language Models

Bingchen Gong, Diego Gomez, Abdullah Hamdi et al.

ICCV 2025 • arXiv:2412.06292 • 3 citations
#10827

CuMPerLay: Learning Cubical Multiparameter Persistence Vectorizations

Caner Korkmaz, Brighton Nuwagira, Baris Coskunuzer et al.

ICCV 2025 • arXiv:2510.12795 • 3 citations
#10828

IM-Zero: Instance-level Motion Controllable Video Generation in a Zero-shot Manner

Yuyang Huang, Yabo Chen, Li Ding et al.

CVPR 2025 • 3 citations
#10829

EigenGS Representation: From Eigenspace to Gaussian Image Space

Lo-Wei Tai, Ching-En Li et al.

CVPR 2025 • arXiv:2503.07446 • 3 citations
#10830

SA-Occ: Satellite-Assisted 3D Occupancy Prediction in Real World

Chen Chen, Zhirui Wang, Taowei Sheng et al.

ICCV 2025 • arXiv:2503.16399 • 3 citations
#10831

An Effective Theory of Bias Amplification

Arjun Subramonian, Samuel Bell, Levent Sagun et al.

ICLR 2025 • arXiv:2410.17263 • 3 citations
#10832

CL-MFAP: A Contrastive Learning-Based Multimodal Foundation Model for Molecular Property Prediction and Antibiotic Screening

Gen Zhou, Sugitha Janarthanan, Yutong Lu et al.

ICLR 2025 • arXiv:2502.11001 • 3 citations
#10833

MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Masked Image Modeling Representations

Benedikt Alkin, Lukas Miklautz, Sepp Hochreiter et al.

ICLR 2025 • 3 citations
#10834

FairGen: Enhancing Fairness in Text-to-Image Diffusion Models via Self-Discovering Latent Directions

Yilei Jiang, Wei-Hong Li, Yiyuan Zhang et al.

ICCV 2025 • arXiv:2412.18810 • 3 citations
#10835

MaRI: Material Retrieval Integration across Domains

Jianhui Wang, Zhifei Yang, Yangfan He et al.

CVPR 2025 • arXiv:2503.08111 • 3 citations
#10836

Exploiting Diffusion Prior for Task-driven Image Restoration

Jaeha Kim, Junghun Oh, Kyoung Mu Lee

ICCV 2025 • arXiv:2507.22459 • 3 citations
#10837

ADIEE: Automatic Dataset Creation and Scorer for Instruction-Guided Image Editing Evaluation

Sherry Chen, Yi Wei, Luowei Zhou et al.

ICCV 2025 • arXiv:2507.07317 • 3 citations
#10838

SEAL: Semantic Aware Image Watermarking

Kasra Arabi, R. Teal Witter, Chinmay Hegde et al.

ICCV 2025 • arXiv:2503.12172 • 3 citations
#10839

Discretized Gaussian Representation for Tomographic Reconstruction

Shaokai Wu, Yuxiang Lu, Yapan Guo et al.

ICCV 2025 • arXiv:2411.04844 • 3 citations
#10840

Infilling Score: A Pretraining Data Detection Algorithm for Large Language Models

Negin Raoof, Litu Rout, Giannis Daras et al.

ICLR 2025 • 3 citations
#10841

A3: Few-shot Prompt Learning of Unlearnable Examples with Cross-Modal Adversarial Feature Alignment

Xuan Wang, Xitong Gao, Dongping Liao et al.

CVPR 2025 • 3 citations
#10842

FADE: Frequency-Aware Diffusion Model Factorization for Video Editing

Yixuan Zhu, Haolin Wang, Shilin Ma et al.

CVPR 2025 • arXiv:2506.05934 • 3 citations
#10843

Finsler Multi-Dimensional Scaling: Manifold Learning for Asymmetric Dimensionality Reduction and Embedding

Thomas Dagès, Simon Weber, Ya-Wei Eileen Lin et al.

CVPR 2025 • arXiv:2503.18010 • 3 citations
#10844

GSBA$^K$: $top$-$K$ Geometric Score-based Black-box Attack

Md Farhamdur Reza, Richeng Jin, Tianfu Wu et al.

ICLR 2025 • arXiv:2503.12827 • 3 citations
#10845

SFDM: Robust Decomposition of Geometry and Reflectance for Realistic Face Rendering from Sparse-view Images

Daisheng Jin, Jiangbei Hu, Baixin Xu et al.

CVPR 2025 • arXiv:2312.06085 • 3 citations
#10846

MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation

Fu Rong, Meng Lan, Qian Zhang et al.

ICCV 2025 • arXiv:2501.13667 • 3 citations
#10847

SA-LUT: Spatial Adaptive 4D Look-Up Table for Photorealistic Style Transfer

Zerui Gong, Zhonghua Wu, Qingyi Tao et al.

ICCV 2025 • arXiv:2506.13465 • 3 citations
#10848

VAFlow: Video-to-Audio Generation with Cross-Modality Flow Matching

Xihua Wang, Xin Cheng, Yuyue Wang et al.

ICCV 2025 • 3 citations
#10849

How Can Objects Help Video-Language Understanding?

Zitian Tang, Shijie Wang, Junho Cho et al.

ICCV 2025 • arXiv:2504.07454 • 3 citations
#10850

StickMotion: Generating 3D Human Motions by Drawing a Stickman

Tao Wang, Zhihua Wu, Qiaozhi He et al.

CVPR 2025 • arXiv:2503.04829 • 3 citations
#10851

GaRe: Relightable 3D Gaussian Splatting for Outdoor Scenes from Unconstrained Photo Collections

Haiyang Bai, Jiaqi Zhu, Songru Jiang et al.

ICCV 2025 • arXiv:2507.20512 • 3 citations
#10852

LoCA: Location-Aware Cosine Adaptation for Parameter-Efficient Fine-Tuning

Zhekai Du, Yinjie Min, Jingjing Li et al.

ICLR 2025 • arXiv:2502.06820 • 3 citations
#10853

NeRF Is a Valuable Assistant for 3D Gaussian Splatting

Shuangkang Fang, I-Chao Shen, Takeo Igarashi et al.

ICCV 2025 • arXiv:2507.23374 • 3 citations
#10854

PVChat: Personalized Video Chat with One-Shot Learning

Yufei Shi, Weilong Yan, Gang Xu et al.

ICCV 2025 • arXiv:2503.17069 • 3 citations
#10855

Conditional Testing based on Localized Conformal $p$-values

Xiaoyang Wu, Lin Lu, Zhaojun Wang et al.

ICLR 2025 • arXiv:2409.16829 • 3 citations
#10856

From Panels to Prose: Generating Literary Narratives from Comics

Ragav Sachdeva, Andrew Zisserman

ICCV 2025 • arXiv:2503.23344 • 3 citations
#10857

GGTalker: Talking Head Synthesis with Generalizable Gaussian Priors and Identity-Specific Adaptation

Wentao Hu, Shunkai Li, Ziqiao Peng et al.

ICCV 2025 • highlight • arXiv:2506.21513 • 3 citations
#10858

Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge

Yaqi Zhao, Yuanyang Yin, Lin Li et al.

CVPR 2025 • arXiv:2411.16824 • 3 citations
#10859

MVGBench: a Comprehensive Benchmark for Multi-view Generation Models

Xianghui Xie, Jan Lenssen, Gerard Pons-Moll

ICCV 2025 • 3 citations
#10860

On the Generalization of Handwritten Text Recognition Models

Carlos Garrido-Munoz, Jorge Calvo-Zaragoza

CVPR 2025 • arXiv:2411.17332 • 3 citations
#10861

Visual Prompting for One-shot Controllable Video Editing without Inversion

Zhengbo Zhang, Yuxi Zhou, Duo Peng et al.

CVPR 2025 • arXiv:2504.14335 • 3 citations
#10862

NL-Eye: Abductive NLI For Images

Mor Ventura, Michael Toker, Nitay Calderon et al.

ICLR 2025 • arXiv:2410.02613 • 3 citations
#10863

Robustness Inspired Graph Backdoor Defense

Zhiwei Zhang, Minhua Lin, Junjie Xu et al.

ICLR 2025 • arXiv:2406.09836 • 3 citations
#10864

DMesh++: An Efficient Differentiable Mesh for Complex Shapes

Sanghyun Son, Matheus Gadelha, Yang Zhou et al.

ICCV 2025 • arXiv:2412.16776 • 3 citations
#10865

Expected Return Symmetries

Darius Muglich, Johannes Forkel, Elise van der Pol et al.

ICLR 2025 • arXiv:2502.01711 • 3 citations
#10866

From Head to Tail: Efficient Black-box Model Inversion Attack via Long-tailed Learning

Ziang Li, Hongguang Zhang, Juan Wang et al.

CVPR 2025 • arXiv:2503.16266 • 3 citations
#10867

LogoSP: Local-global Grouping of Superpoints for Unsupervised Semantic Segmentation of 3D Point Clouds

Zihui Zhang, Weisheng Dai, Hongtao Wen et al.

CVPR 2025 • arXiv:2506.07857 • 3 citations
#10868

Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax

Ivan Butakov, Alexander Semenenko, Alexander Tolmachev et al.

ICLR 2025 • arXiv:2410.06993 • 3 citations
#10869

UA-Pose: Uncertainty-Aware 6D Object Pose Estimation and Online Object Completion with Partial References

Ming-Feng Li, Xin Yang, Fu-En Wang et al.

CVPR 2025 • arXiv:2506.07996 • 3 citations
#10870

Amodal Depth Anything: Amodal Depth Estimation in the Wild

Zhenyu Li, Mykola Lavreniuk, Jian Shi et al.

ICCV 2025 • arXiv:2412.02336 • 3 citations
#10871

CoLMDriver: LLM-based Negotiation Benefits Cooperative Autonomous Driving

Changxing Liu, Genjia Liu, Zijun Wang et al.

ICCV 2025 • arXiv:2503.08683 • 3 citations
#10872

DFM: Differentiable Feature Matching for Anomaly Detection

Wu Sheng, Yimi Wang, Xudong Liu et al.

CVPR 2025 • 3 citations
#10873

Free-viewpoint Human Animation with Pose-correlated Reference Selection

Fa-Ting Hong, Zhan Xu, Haiyang Liu et al.

CVPR 2025 • highlight • arXiv:2412.17290 • 3 citations
#10874

GS-Occ3D: Scaling Vision-only Occupancy Reconstruction with Gaussian Splatting

Baijun Ye, Minghui Qin, Saining Zhang et al.

ICCV 2025 • arXiv:2507.19451 • 3 citations
#10875

Resonance: Learning to Predict Social-Aware Pedestrian Trajectories as Co-Vibrations

Conghao Wong, Ziqian Zou, Beihao Xia

ICCV 2025 • arXiv:2412.02447 • 3 citations
#10876

Action Detail Matters: Refining Video Recognition with Local Action Queries

Mengmeng Wang, Zeyi Huang, Xiangjie Kong et al.

CVPR 2025 • 3 citations
#10877

A Policy-Gradient Approach to Solving Imperfect-Information Games with Best-Iterate Convergence

Mingyang Liu, Gabriele Farina, Asuman Ozdaglar

ICLR 2025 • arXiv:2408.00751 • 3 citations
#10878

FFR: Frequency Feature Rectification for Weakly Supervised Semantic Segmentation

Ziqian Yang, Xinqiao Zhao, Xiaolei Wang et al.

CVPR 2025 • 3 citations
#10879

Simpler Diffusion: 1.5 FID on ImageNet512 with Pixel-space Diffusion

Emiel Hoogeboom, Thomas Mensink, Jonathan Heek et al.

CVPR 2025 • 3 citations
#10880

Multi-Scale Neighborhood Occupancy Masked Autoencoder for Self-Supervised Learning in LiDAR Point Clouds

Mohamed Abdelsamad, Michael Ulrich, Claudius Glaeser et al.

CVPR 2025 • arXiv:2502.20316 • 3 citations
#10881

Signs as Tokens: A Retrieval-Enhanced Multilingual Sign Language Generator

Ronglai Zuo, Rolandos Alexandros Potamias, Evangelos Ververas et al.

ICCV 2025 • arXiv:2411.17799 • 3 citations
#10882

CAP-Net: A Unified Network for 6D Pose and Size Estimation of Categorical Articulated Parts from a Single RGB-D Image

Jingshun Huang, Haitao Lin, Tianyu Wang et al.

CVPR 2025 • highlight • arXiv:2504.11230 • 3 citations
#10883

Diffusion Transformers for Tabular Data Time Series Generation

Fabrizio Garuti, Enver Sangineto, Simone Luetto et al.

ICLR 2025 • arXiv:2504.07566 • 3 citations
#10884

4D-Fly: Fast 4D Reconstruction from a Single Monocular Video

Diankun Wu, Fangfu Liu, Yi-Hsin Hung et al.

CVPR 2025 • 3 citations
#10885

ThunderKittens: Simple, Fast, and $\textit{Adorable}$ Kernels

Benjamin Spector, Simran Arora, Aaryan Singhal et al.

ICLR 2025 • 3 citations
#10886

Large (Vision) Language Models are Unsupervised In-Context Learners

Artyom Gadetsky, Andrei Atanov, Yulun Jiang et al.

ICLR 2025 • arXiv:2504.02349 • 3 citations
#10887

Boosting Adversarial Transferability via Residual Perturbation Attack

Jinjia Peng, Zeze Tao, Huibing Wang et al.

ICCV 2025 • arXiv:2508.05689 • 3 citations
#10888

Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces

Saket Tiwari, Omer Gottesman, George D Konidaris

ICLR 2025 • arXiv:2507.20853 • 3 citations
#10889

HalLoc: Token-level Localization of Hallucinations for Vision Language Models

Eunkyu Park, Minyeong Kim, Gunhee Kim

CVPR 2025 • arXiv:2506.10286 • 3 citations
#10890

Fast Uncovering of Protein Sequence Diversity from Structure

Luca Alessandro Silva, Barthelemy Meynard-Piganeau, Carlo Lucibello et al.

ICLR 2025 • arXiv:2406.11975 • 3 citations
#10891

Plug-and-Play Versatile Compressed Video Enhancement

Huimin Zeng, Jiacheng Li, Zhiwei Xiong

CVPR 2025 • arXiv:2504.15380 • 3 citations
#10892

Unleashing the Potential of Consistency Learning for Detecting and Grounding Multi-Modal Media Manipulation

Yiheng Li, Yang Yang, Zichang Tan et al.

CVPR 2025 • arXiv:2506.05890 • 3 citations
#10893

HyTIP: Hybrid Temporal Information Propagation for Masked Conditional Residual Video Coding

Yi-Hsin Chen, Yi-Chen Yao, Kuan-Wei Ho et al.

ICCV 2025 • arXiv:2508.02072 • 3 citations
#10894

Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels

Yongshuo Zong, Qin Zhang, Dongsheng An et al.

CVPR 2025 • arXiv:2505.13788 • 3 citations
#10895

EmbodiedSplat: Personalized Real-to-Sim-to-Real Navigation with Gaussian Splats from a Mobile Device

Gunjan Chhablani, Xiaomeng Ye, Muhammad Zubair Irshad et al.

ICCV 2025 • arXiv:2509.17430 • 3 citations
#10896

PHATNet: A Physics-guided Haze Transfer Network for Domain-adaptive Real-world Image Dehazing

Fu-Jen Tsai, Yan-Tsung Peng, Yen-Yu Lin et al.

ICCV 2025 • arXiv:2507.14826 • 3 citations
#10897

UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?

Yuanxin Liu, Rui Zhu, Shuhuai Ren et al.

NeurIPS 2025 • arXiv:2503.09949 • 3 citations
#10898

Breaking the Encoder Barrier for Seamless Video-Language Understanding

Handong Li, Yiyuan Zhang, Longteng Guo et al.

ICCV 2025 • arXiv:2503.18422 • 3 citations
#10899

GroomLight: Hybrid Inverse Rendering for Relightable Human Hair Appearance Modeling

Yang Zheng, Menglei Chai, Delio Vicini et al.

CVPR 2025 • arXiv:2503.10597 • 3 citations
#10900

Counterfactual Realizability

Arvind Raghavan, Elias Bareinboim

ICLR 2025 • arXiv:2503.11870 • 3 citations
#10901

LOD-GS: Achieving Levels of Detail using Scalable Gaussian Soup

Jianxiong Shen, Yue Qian, Xiaohang Zhan

CVPR 2025 • 3 citations
#10902

Flow4Agent: Long-form Video Understanding via Motion Prior from Optical Flow

Ruyang Liu, Shangkun Sun, Haoran Tang et al.

ICCV 2025 • arXiv:2510.05836 • 3 citations
#10903

MomentSeeker: A Task-Oriented Benchmark For Long-Video Moment Retrieval

Huaying Yuan, Jian Ni, Zheng Liu et al.

NeurIPS 2025 • arXiv:2502.12558 • 3 citations
#10904

Preventing Shortcuts in Adapter Training via Providing the Shortcuts

Anujraaj Goyal, Guocheng Qian, Huseyin Coskun et al.

NeurIPS 2025 • arXiv:2510.20887 • 3 citations
#10905

FastLongSpeech: Enhancing Large Speech-Language Models for Efficient Long-Speech Processing

Shoutao Guo, Shaolei Zhang, Qingkai Fang et al.

NeurIPS 2025 • arXiv:2507.14815 • 3 citations
#10906

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.

CVPR 2025 • arXiv:2501.12381 • 3 citations
#10907

TAViS: Text-bridged Audio-Visual Segmentation with Foundation Models

Ziyang Luo, Nian Liu, Xuguang Yang et al.

ICCV 2025 • arXiv:2506.11436 • 3 citations
#10908

Mitigating Hallucination in VideoLLMs via Temporal-Aware Activation Engineering

Jianfeng Cai, Jiale Hong, Zongmeng Zhang et al.

NeurIPS 2025 • oral • arXiv:2505.12826 • 3 citations
#10909

Align Your Rhythm: Generating Highly Aligned Dance Poses with Gating-Enhanced Rhythm-Aware Feature Representation

Congyi Fan, Jian Guan, Xuanjia Zhao et al.

ICCV 2025 • arXiv:2503.17340 • 3 citations
#10910

Improving Bilinear RNN with Closed-loop Control

Jiaxi Hu, Yongqi Pan, Jusen Du et al.

NeurIPS 2025 • spotlight • arXiv:2506.02475 • 3 citations
#10911

Universal Visuo-Tactile Video Understanding for Embodied Interaction

Yifan Xie, Mingyang Li, Shoujie Li et al.

NeurIPS 2025 • arXiv:2505.22566 • 3 citations
#10912

Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access

Xiang Hu, Jiaqi Leng, Jun Zhao et al.

NeurIPS 2025 • arXiv:2504.16795 • 3 citations
#10913

4D Gaussian Splatting SLAM

Yanyan Li, Youxu Fang, Zunjie Zhu et al.

ICCV 2025 • arXiv:2503.16710 • 3 citations
#10914

TensorRL-QAS: Reinforcement learning with tensor networks for improved quantum architecture search

Akash Kundu, Stefano Mangini

NeurIPS 2025 • arXiv:2505.09371 • 3 citations
#10915

Task Vector Quantization for Memory-Efficient Model Merging

Youngeun Kim, Seunghwan Lee, Aecheon Jung et al.

ICCV 2025 • arXiv:2503.06921 • 3 citations
#10916

Enhancing Diversity for Data-free Quantization

Kai Zhao, Zhihao Zhuang, Miao Zhang et al.

CVPR 2025 • 3 citations
#10917

PSA-SSL: Pose and Size-aware Self-Supervised Learning on LiDAR Point Clouds

Barza Nisar, Steven L. Waslander

CVPR 2025 • arXiv:2503.13914 • 3 citations
#10918

Weakly Supervised Visible-Infrared Person Re-Identification via Heterogeneous Expert Collaborative Consistency Learning

Yafei Zhang, Lingqi Kong, Huafeng Li et al.

ICCV 2025 • arXiv:2507.12942 • 3 citations
#10919

Semantic Causality-Aware Vision-Based 3D Occupancy Prediction

Dubing Chen, Huan Zheng, Yucheng Zhou et al.

ICCV 2025 • arXiv:2509.08388 • 3 citations
#10920

Implicit Bias Injection Attacks against Text-to-Image Diffusion Models

Huayang Huang, Xiangye Jin, Jiaxu Miao et al.

CVPR 2025 • arXiv:2504.01819 • 3 citations
#10921

Fair Generation without Unfair Distortions: Debiasing Text-to-Image Generation with Entanglement-Free Attention

Jeonghoon Park, Juyoung Lee, Chaeyeon Chung et al.

ICCV 2025 • arXiv:2506.13298 • 3 citations
#10922

X2-Gaussian: 4D Radiative Gaussian Splatting for Continuous-time Tomographic Reconstruction

Weihao Yu, Yuanhao Cai, Ruyi Zha et al.

ICCV 2025 • 3 citations
#10923

SAM-REF: Introducing Image-Prompt Synergy during Interaction for Detail Enhancement in the Segment Anything Model

Chongkai Yu, Ting Liu, Li Anqi et al.

CVPR 2025 • arXiv:2408.11535 • 3 citations
#10924

RePIC: Reinforced Post-Training for Personalizing Multi-Modal Language Models

Yeongtak Oh, Dohyun Chung, Juhyeon Shin et al.

NeurIPS 2025 • arXiv:2506.18369 • 3 citations
#10925

DAGSM: Disentangled Avatar Generation with GS-enhanced Mesh

Jingyu Zhuang, Di Kang, Linchao Bao et al.

CVPR 2025 • arXiv:2411.15205 • 3 citations
#10926

Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability

Yarden Bakish, Itamar Zimerman, Hila Chefer et al.

NeurIPS 2025 • arXiv:2506.02138 • 3 citations
#10927

Towards Principled Unsupervised Multi-Agent Reinforcement Learning

Riccardo Zamboni, Mirco Mutti, Marcello Restelli

NeurIPS 2025 • arXiv:2502.08365 • 3 citations
#10928

Beyond Modality Collapse: Representation Blending for Multimodal Dataset Distillation

Xin Zhang, Ziruo Zhang, Jiawei Du et al.

NeurIPS 2025 • arXiv:2505.14705 • 3 citations
#10929

Entropy Rectifying Guidance for Diffusion and Flow Models

Tariq Berrada Ifriqi, Adriana Romero-Soriano, Michal Drozdzal et al.

NeurIPS 2025 • arXiv:2504.13987 • 3 citations
#10930

FiRe: Fixed-points of Restoration Priors for Solving Inverse Problems

Matthieu Terris, Ulugbek Kamilov, Thomas Moreau

CVPR 2025 • arXiv:2411.18970 • 3 citations
#10931

DeltaPhi: Physical States Residual Learning for Neural Operators in Data-Limited PDE Solving

Xihang Yue, Yi Yang, Linchao Zhu

NeurIPS 2025 • arXiv:2406.09795 • 3 citations
#10932

EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining

Boshen Xu, Yuting Mei, Liu Xinbi et al.

NeurIPS 2025 • arXiv:2503.15470 • 3 citations
#10933

Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation

Yiftach Edelstein, Or Patashnik, Dana Cohen-Bar et al.

CVPR 2025 • arXiv:2412.02631 • 3 citations
#10934

SceneDecorator: Towards Scene-Oriented Story Generation with Scene Planning and Scene Consistency

Quanjian Song, Donghao Zhou, Jingyu Lin et al.

NeurIPS 2025 • arXiv:2510.22994 • 3 citations
#10935

OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis

Junting Chen, Haotian Liang, Lingxiao Du et al.

NeurIPS 2025 • arXiv:2506.04217 • 3 citations
#10936

SCAN: Bootstrapping Contrastive Pre-training for Data Efficiency

Yangyang Guo, Mohan Kankanhalli

ICCV 2025 • arXiv:2411.09126 • 3 citations
#10937

Continuous Concepts Removal in Text-to-image Diffusion Models

Tingxu Han, Weisong Sun, Yanrong Hu et al.

NeurIPS 2025 • arXiv:2412.00580 • 3 citations
#10938

Electromyography-Informed Facial Expression Reconstruction for Physiological-Based Synthesis and Analysis

Tim Büchner, Christoph Anders, Orlando Guntinas-Lichius et al.

CVPR 2025 • highlight • arXiv:2503.09556 • 3 citations
#10939

One Prompt Fits All: Universal Graph Adaptation for Pretrained Models

Yongqi Huang, Jitao Zhao, Dongxiao He et al.

NeurIPS 2025 • arXiv:2509.22416 • 3 citations
#10940

Scene-agnostic Pose Regression for Visual Localization

Junwei Zheng, Ruiping Liu, Yufan Chen et al.

CVPR 2025 • arXiv:2503.19543 • 3 citations
#10941

Towards Source-Free Machine Unlearning

Sk Miraj Ahmed, Umit Basaran, Dripta S. Raychaudhuri et al.

CVPR 2025 • arXiv:2508.15127 • 3 citations
#10942

Can MLLMs Absorb Math Reasoning Abilities from LLMs as Free Lunch?

Yijie Hu, Zihao Zhou, Kaizhu Huang et al.

NeurIPS 2025 • arXiv:2510.14387 • 3 citations
#10943

FedSPA: Generalizable Federated Graph Learning under Homophily Heterogeneity

Zihan Tan, Guancheng Wan, Wenke Huang et al.

CVPR 2025 • 3 citations
#10944

Large-scale Pre-training for Grounded Video Caption Generation

Evangelos Kazakos, Cordelia Schmid, Josef Sivic

ICCV 2025 • arXiv:2503.10781 • 3 citations
#10945

Bridging Sign and Spoken Languages: Pseudo Gloss Generation for Sign Language Translation

Jianyuan Guo, Peike Li, Trevor Cohn

NeurIPS 2025 • oral • arXiv:2505.15438 • 3 citations
#10946

SPACE: Noise Contrastive Estimation Stabilizes Self-Play Fine-Tuning for Large Language Models

Yibo Wang, Guangda Huzhang, Qingguo Chen et al.

NeurIPS 2025 • arXiv:2512.07175 • 3 citations
#10947

Training-Free Bayesianization for Low-Rank Adapters of Large Language Models

Haizhou Shi, Yibin Wang, Ligong Han et al.

NeurIPS 2025 • arXiv:2412.05723 • 3 citations
#10948

Player-Centric Multimodal Prompt Generation for Large Language Model Based Identity-Aware Basketball Video Captioning

Zeyu Xi, Haoying Sun, Yaofei Wu et al.

ICCV 2025 • arXiv:2507.20163 • 3 citations
#10949

SOMBRL: Scalable and Optimistic Model-Based RL

Bhavya, Lenart Treven, Carmelo Sferrazza et al.

NeurIPS 2025 • arXiv:2511.20066 • 3 citations
#10950

Learning Differential Pyramid Representation for Tone Mapping

Qirui Yang, Yinbo Li, Yihao Liu et al.

NeurIPS 2025 • arXiv:2412.01463 • 3 citations
#10951

FADRM: Fast and Accurate Data Residual Matching for Dataset Distillation

Jiacheng Cui, Xinyue Bi, Yaxin Luo et al.

NeurIPS 2025 • arXiv:2506.24125 • 3 citations
#10952

Minimum Width for Deep, Narrow MLP: A Diffeomorphism Approach

Geonho Hwang

NeurIPS 2025 • arXiv:2308.15873 • 3 citations
#10953

Valid Selection among Conformal Sets

Mahmoud Hegazy, Liviu Aolaritei, Michael Jordan et al.

NeurIPS 2025 • arXiv:2506.20173 • 3 citations
#10954

GRIFFIN: Effective Token Alignment for Faster Speculative Decoding

Shijing Hu, Jingyang Li, Xingyu Xie et al.

NeurIPS 2025 • arXiv:2502.11018 • 3 citations
#10955

Shapley-Coop: Credit Assignment for Emergent Cooperation in Self-Interested LLM Agents

Yun Hua, Haosheng Chen, Shiqin Wang et al.

NeurIPS 2025 • arXiv:2506.07388 • 3 citations
#10956

Efficient Video Super-Resolution for Real-time Rendering with Decoupled G-buffer Guidance

Mingjun Zheng, Long Sun, Jiangxin Dong et al.

CVPR 2025 • 3 citations
#10957

Text-to-Decision Agent: Offline Meta-Reinforcement Learning from Natural Language Supervision

Shilin Zhang, Zican Hu, Wenhao Wu et al.

NeurIPS 2025 • arXiv:2504.15046 • 3 citations
#10958

SOGS: Second-Order Anchor for Advanced 3D Gaussian Splatting

Jiahui Zhang, Fangneng Zhan, Ling Shao et al.

CVPR 2025 • arXiv:2503.07476 • 3 citations
#10959

Towards Minimizing Feature Drift in Model Merging: Layer-wise Task Vector Fusion for Adaptive Knowledge Integration

Wenju Sun, Qingyong Li, Wen Wang et al.

NeurIPS 2025 • arXiv:2505.23859 • 3 citations
#10960

Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking

Zihan Su, Xuerui Qiu, Hongbin Xu et al.

NeurIPS 2025 • oral • arXiv:2505.12667 • 3 citations
#10961

LLaFEA: Frame-Event Complementary Fusion for Fine-Grained Spatiotemporal Understanding in LMMs

Hanyu Zhou, Gim Hee Lee

ICCV 2025 • arXiv:2503.06934 • 3 citations
#10962

FreeInv: Free Lunch for Improving DDIM Inversion

Yuxiang Bao, Huijie Liu, Xun Gao et al.

NeurIPS 2025 • arXiv:2503.23035 • 3 citations
#10963

Improve Representation for Imbalanced Regression through Geometric Constraints

Zijian Dong, Yilei Wu, Chongyao Chen et al.

CVPR 2025 • arXiv:2503.00876 • 3 citations
#10964

SHAP zero Explains Biological Sequence Models with Near-zero Marginal Cost for Future Queries

Darin Tsui, Aryan Musharaf, Yigit Efe Erginbas et al.

NeurIPS 2025 • arXiv:2410.19236 • 3 citations
#10965

DiMPLe - Disentangled Multi-Modal Prompt Learning: Enhancing Out-Of-Distribution Alignment with Invariant and Spurious Feature Separation

Umaima Rahman, Mohammad Yaqub, Dwarikanath Mahapatra

ICCV 2025 • arXiv:2506.21237 • 3 citations
#10966

URDF-Anything: Constructing Articulated Objects with 3D Multimodal Language Model

Zhe Li, Xiang Bai, Jieyu Zhang et al.

NeurIPS 2025 • spotlight • arXiv:2511.00940 • 3 citations
#10967

SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations

Krispin Wandel, Hesheng Wang

CVPR 2025 • arXiv:2503.22462 • 3 citations
#10968

TimeEmb: A Lightweight Static-Dynamic Disentanglement Framework for Time Series Forecasting

Mingyuan Xia, Chunxu Zhang, Zijian Zhang et al.

NeurIPS 2025 • oral • arXiv:2510.00461 • 3 citations
#10969

Dark-ISP: Enhancing RAW Image Processing for Low-Light Object Detection

Jiasheng Guo, Xin Gao, Yuxiang Yan et al.

ICCV 2025 • arXiv:2509.09183 • 3 citations
#10970

M2SFormer: Multi-Spectral and Multi-Scale Attention with Edge-Aware Difficulty Guidance for Image Forgery Localization

Ju-Hyeon Nam, Dong-Hyun Moon, Sang-Chul Lee

ICCV 2025 • highlight • arXiv:2506.20922 • 3 citations
#10971

Distilling Spatially-Heterogeneous Distortion Perception for Blind Image Quality Assessment

Xudong Li, Wenjie Nie, Yan Zhang et al.

CVPR 2025 • 3 citations
#10972

OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography

Li Caoshuo, Zengmao Ding, Xiaobin Hu et al.

ICCV 2025 • arXiv:2506.21101 • 3 citations
#10973

COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation

Siqi Zhang, Yanyuan Qiao, Qunbo Wang et al.

ICCV 2025 • arXiv:2503.24065 • 3 citations
#10974

D2SP: Dynamic Dual-Stage Purification Framework for Dual Noise Mitigation in Vision-based Affective Recognition.

Haoran Wang, Xinji Mai, Zeng Tao et al.

CVPR 2025 • arXiv:2406.16473 • 3 citations
#10975

Rethinking Circuit Completeness in Language Models: AND, OR, and ADDER Gates

Hang Chen, Jiaying Zhu, Xinyu Yang et al.

NeurIPS 2025 • arXiv:2505.10039 • 3 citations
#10976

Making Classic GNNs Strong Baselines Across Varying Homophily: A Smoothness–Generalization Perspective

Ming Gu, Zhuonan Zheng, Sheng Zhou et al.

NeurIPS 2025 • arXiv:2412.09805 • 3 citations
#10977

Semantic and Visual Crop-Guided Diffusion Models for Heterogeneous Tissue Synthesis in Histopathology

Saghir Alfasly, Wataru Uegami, Md Enamul Hoq et al.

NeurIPS 2025 • arXiv:2509.17847 • 3 citations
#10978

VideoAds for Fast-Paced Video Understanding

Zheyuan Zhang, Wanying Dou, Linkai Peng et al.

ICCV 2025 • arXiv:2504.09282 • 3 citations
#10979

Articulated Kinematics Distillation from Video Diffusion Models

Xuan Li, Qianli Ma, Tsung-Yi Lin et al.

CVPR 2025 • arXiv:2504.01204 • 3 citations
#10980

Prototypes are Balanced Units for Efficient and Effective Partially Relevant Video Retrieval

WonJun Moon, Cheol-Ho Cho, Woojin Jun et al.

ICCV 2025 • arXiv:2504.13035 • 3 citations
#10981

TokensGen: Harnessing Condensed Tokens for Long Video Generation

Wenqi Ouyang, Zeqi Xiao, Danni Yang et al.

ICCV 2025 • arXiv:2507.15728 • 3 citations
#10982

Face Forgery Video Detection via Temporal Forgery Cue Unraveling

Zonghui Guo, YingJie Liu, Jie Zhang et al.

CVPR 2025 • 3 citations
#10983

Learning to Specialize: Joint Gating-Expert Training for Adaptive MoEs in Decentralized Settings

Yehya Farhat, Hamza ElMokhtar Shili, Fangshuo Liao et al.

NeurIPS 2025 • arXiv:2306.08586 • 3 citations
#10984

Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion

Songsong Yu, Yuxin Chen, Zhongang Qi et al.

CVPR 2025 • arXiv:2503.22262 • 3 citations
#10985

Adaptive Defense against Harmful Fine-Tuning for Large Language Models via Bayesian Data Scheduler

Zixuan Hu, Li Shen, Zhenyi Wang et al.

NeurIPS 2025 • spotlight • arXiv:2510.27172 • 3 citations
#10986

Multipole Attention for Efficient Long Context Reasoning

Coleman Hooper, Sebastian Zhao, Luca Manolache et al.

NeurIPS 2025 • arXiv:2506.13059 • 3 citations
#10987

Attention Sinks: A 'Catch, Tag, Release' Mechanism for Embeddings

Stephen Zhang, Mustafa Khan, Vardan Papyan

NeurIPS 2025 • arXiv:2502.00919 • 3 citations
#10988

Compositional Caching for Training-free Open-vocabulary Attribute Detection

Marco Garosi, Alessandro Conti, Gaowen Liu et al.

CVPR 2025 • highlight • arXiv:2503.19145 • 3 citations
#10989

TemCoCo: Temporally Consistent Multi-modal Video Fusion with Visual-Semantic Collaboration

Gong Meiqi, Hao Zhang, Xunpeng Yi et al.

ICCV 2025 • arXiv:2508.17817 • 3 citations
#10990

HuMoCon: Concept Discovery for Human Motion Understanding

Qihang Fang, Chengcheng Tang, Bugra Tekin et al.

CVPR 2025 • arXiv:2505.20920 • 3 citations
#10991

Revisiting Mode Connectivity in Neural Networks with Bezier Surface

Jie Ren, Pin-Yu Chen, Ren Wang

ICLR 2025 • 3 citations
#10992

AR-RAG: Autoregressive Retrieval Augmentation for Image Generation

Jingyuan Qi, Zhiyang Xu, Qifan Wang et al.

NeurIPS 2025 • arXiv:2506.06962 • 3 citations
#10993

Can We Infer Confidential Properties of Training Data from LLMs?

Pengrun Huang, Chhavi Yadav, Kamalika Chaudhuri et al.

NeurIPS 2025 • spotlight • arXiv:2506.10364 • 3 citations
#10994

Beyond the Surface: Enhancing LLM-as-a-Judge Alignment with Human via Internal Representations

Peng Lai, Jianjie Zheng, Sijie Cheng et al.

NeurIPS 2025 • arXiv:2508.03550 • 3 citations
#10995

ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation

Yunhong Min, Daehyeon Choi, Kyeongmin Yeo et al.

NeurIPS 2025 • arXiv:2503.22194 • 3 citations
#10996

High Temporal Consistency through Semantic Similarity Propagation in Semi-Supervised Video Semantic Segmentation for Autonomous Flight

Cédric Vincent, Taehyoung Kim, Henri Meeß

CVPR 2025 • arXiv:2503.15676 • 3 citations
#10997

GeoVideo: Introducing Geometric Regularization into Video Generation Model

Yunpeng Bai, Shaoheng Fang, Chaohui Yu et al.

NeurIPS 2025 • oral • arXiv:2512.03453 • 3 citations
#10998

LabelAny3D: Label Any Object 3D in the Wild

Jin Yao, Radowan Mahmud Redoy, Sebastian Elbaum et al.

NeurIPS 2025 • arXiv:2601.01676 • 3 citations
#10999

Generative Photomontage

Sean J. Liu, Nupur Kumari, Ariel Shamir et al.

CVPR 2025 • arXiv:2408.07116 • 3 citations
#11000

DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution

Yuzhong Zhao, Feng Liu, Yue Liu et al.

CVPR 2025 • arXiv:2405.16071 • 3 citations