Most Cited 2025 "frequency domain features" Papers

22,274 papers found • Page 29 of 112

#5601

Transformers Learn Low Sensitivity Functions: Investigations and Implications

Bhavya Vasudeva, Deqing Fu, Tianyi Zhou et al.

ICLR 2025arXiv:2403.06925
8
citations
#5602

ZoomLDM: Latent Diffusion Model for Multi-scale Image Generation

Srikar Yellapragada, Alexandros Graikos, Kostas Triaridis et al.

CVPR 2025arXiv:2411.16969
8
citations
#5603

Accurate Differential Operators for Hybrid Neural Fields

Aditya Chetan, Guandao Yang, Zichen Wang et al.

CVPR 2025arXiv:2312.05984
8
citations
#5604

PhysX-3D: Physical-Grounded 3D Asset Generation

Ziang Cao, Zhaoxi Chen, Liang Pan et al.

NEURIPS 2025spotlightarXiv:2507.12465
8
citations
#5605

Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration

Zilong Huang, Jun He, Junyan Ye et al.

CVPR 2025arXiv:2504.00387
8
citations
#5606

DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion

Qingcheng Zhao, Xiang Zhang, Haiyang Xu et al.

ICCV 2025arXiv:2507.22825
8
citations
#5607

Spectral Informed Mamba for Robust Point Cloud Processing

Ali Bahri, Moslem Yazdanpanah, Mehrdad Noori et al.

CVPR 2025arXiv:2503.04953
8
citations
#5608

On the Sample Complexity Bounds of Bilevel Reinforcement Learning

Mudit Gaur, Utsav Singh, Amrit Singh Bedi et al.

NEURIPS 2025arXiv:2503.17644
8
citations
#5609

MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation

Jinnan Chen, Lingting Zhu, Zeyu Hu et al.

CVPR 2025highlightarXiv:2503.20519
8
citations
#5610

Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence

Yining Hong, Rui Sun, Bingxuan Li et al.

NEURIPS 2025spotlightarXiv:2506.15677
8
citations
#5611

NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping

Tianyi Wang, Shuaicheng Niu, Harry Cheng et al.

ICCV 2025arXiv:2503.18678
8
citations
#5612

Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation

François Rozet, Ruben Ohana, Michael McCabe et al.

NEURIPS 2025arXiv:2507.02608
8
citations
#5613

ReDit: Reward Dithering for Improved LLM Policy Optimization

Chenxing Wei, Jiarui Yu, Ying He et al.

NEURIPS 2025arXiv:2506.18631
8
citations
#5614

Monocular and Generalizable Gaussian Talking Head Animation

Shengjie Gong, Haojie Li, Jiapeng Tang et al.

CVPR 2025arXiv:2504.00665
8
citations
#5615

Contextual AD Narration with Interleaved Multimodal Sequence

Hanlin Wang, Zhan Tong, Kecheng Zheng et al.

CVPR 2025arXiv:2403.12922
8
citations
#5616

Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach

Yunuo Chen, Junli Cao, Vidit Goel et al.

NEURIPS 2025arXiv:2502.03639
8
citations
#5617

Instruction-based Image Manipulation by Watching How Things Move

Mingdeng Cao, Xuaner Zhang, Yinqiang Zheng et al.

CVPR 2025highlightarXiv:2412.12087
8
citations
#5618

Towards Generalizable Scene Change Detection

Jae-Woo Kim, Ue-Hwan Kim

CVPR 2025arXiv:2409.06214
8
citations
#5619

Oasis: One Image is All You Need for Multimodal Instruction Data Synthesis

Letian Zhang, Quan Cui, Bingchen Zhao et al.

ICCV 2025arXiv:2503.08741
8
citations
#5620

Stable Port-Hamiltonian Neural Networks

Fabian J. Roth, Dominik K. Klein, Maximilian Kannapinn et al.

NEURIPS 2025arXiv:2502.02480
8
citations
#5621

Thinker: Learning to Think Fast and Slow

Stephen Chung, Wenyu Du, Jie Fu

NEURIPS 2025arXiv:2505.21097
8
citations
#5622

p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay

Jun Zhang, Desen Meng, Zhengming Zhang et al.

ICCV 2025arXiv:2412.04449
8
citations
#5623

Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment

Ziteng Cui, Xuangeng Chu, Tatsuya Harada

CVPR 2025arXiv:2504.01503
8
citations
#5624

SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications

Jinyang Li, Xiaolong Li, Ge Qu et al.

NEURIPS 2025arXiv:2506.18951
8
citations
#5625

Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation

Akshay Krishnan, Xinchen Yan, Vincent Casser et al.

ICCV 2025arXiv:2501.13087
8
citations
#5626

The Art of Deception: Color Visual Illusions and Diffusion Models

Alexandra Gomez-Villa, Kai Wang, C. Alejandro Parraga et al.

CVPR 2025arXiv:2412.10122
8
citations
#5627

Causally Reliable Concept Bottleneck Models

Giovanni De Felice, Arianna Casanova Flores, Francesco De Santis et al.

NEURIPS 2025arXiv:2503.04363
8
citations
#5628

FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding

Chongjun Tu, Lin Zhang, Pengtao Chen et al.

NEURIPS 2025oralarXiv:2503.14935
8
citations
#5629

Understanding Adam Requires Better Rotation Dependent Assumptions

Tianyue Zhang, Lucas Maes, Alan Milligan et al.

NEURIPS 2025arXiv:2410.19964
8
citations
#5630

Unified Multimodal Understanding via Byte-Pair Visual Encoding

Wanpeng Zhang, Yicheng Feng, Hao Luo et al.

ICCV 2025highlightarXiv:2506.23639
8
citations
#5631

PERSE: Personalized 3D Generative Avatars from A Single Portrait

Hyunsoo Cha, Inhee Lee, Hanbyul Joo

CVPR 2025arXiv:2412.21206
8
citations
#5632

GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training

Tong Wei, Yijun Yang, Junliang Xing et al.

ICCV 2025arXiv:2503.08525
8
citations
#5633

Root Cause Analysis of Outliers with Missing Structural Knowledge

William Roy Orchard, Nastaran Okati, Sergio Garrido Mejia et al.

NEURIPS 2025arXiv:2406.05014
8
citations
#5634

Why LVLMs Are More Prone to Hallucinations in Longer Responses: The Role of Context

Ge Zheng, Jiaye Qian, Jiajin Tang et al.

ICCV 2025arXiv:2510.20229
8
citations
#5635

LMM-Det: Make Large Multimodal Models Excel in Object Detection

Jincheng Li, Chunyu Xie, Ji Ao et al.

ICCV 2025arXiv:2507.18300
8
citations
#5636

Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion

Alan Amin, Nate Gruver, Andrew Wilson

NEURIPS 2025arXiv:2506.08316
8
citations
#5637

FALCON: Fairness Learning via Contrastive Attention Approach to Continual Semantic Scene Understanding

Thanh-Dat Truong, Utsav Prabhu, Bhiksha Raj et al.

CVPR 2025arXiv:2311.15965
8
citations
#5638

Effective SAM Combination for Open-Vocabulary Semantic Segmentation

Minhyeok Lee, Suhwan Cho, Jungho Lee et al.

CVPR 2025arXiv:2411.14723
8
citations
#5639

CorrCLIP: Reconstructing Patch Correlations in CLIP for Open-Vocabulary Semantic Segmentation

Dengke Zhang, Fagui Liu, Quan Tang

ICCV 2025arXiv:2411.10086
8
citations
#5640

ModSkill: Physical Character Skill Modularization

Yiming Huang, Zhiyang Dou, Lingjie Liu

ICCV 2025arXiv:2502.14140
8
citations
#5641

Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation

Aishik Konwer, Zhijian Yang, Erhan Bas et al.

CVPR 2025arXiv:2503.04639
8
citations
#5642

Discrete Neural Flow Samplers with Locally Equivariant Transformer

Zijing Ou, Ruixiang Zhang, Yingzhen Li

NEURIPS 2025arXiv:2505.17741
8
citations
#5643

AMO Sampler: Enhancing Text Rendering with Overshooting

Xixi Hu, Keyang Xu, Bo Liu et al.

CVPR 2025arXiv:2411.19415
8
citations
#5644

MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models

Yifan Liu, Keyu Fan, Weihao Yu et al.

CVPR 2025arXiv:2505.15185
8
citations
#5645

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing

Hanhui Wang, Yihua Zhang, Ruizheng Bai et al.

CVPR 2025arXiv:2411.16832
8
citations
#5646

LLM Strategic Reasoning: Agentic Study through Behavioral Game Theory

Jingru Jia, Zehua Yuan, Junhao Pan et al.

NEURIPS 2025oralarXiv:2502.20432
8
citations
#5647

CPathAgent: An Agent-based Foundation Model for Interpretable High-Resolution Pathology Image Analysis Mimicking Pathologists' Diagnostic Logic

Yuxuan Sun, Yixuan Si, Chenglu Zhu et al.

NEURIPS 2025arXiv:2505.20510
8
citations
#5648

Among Us: A Sandbox for Measuring and Detecting Agentic Deception

Satvik Golechha, Adrià Garriga-Alonso

NEURIPS 2025spotlightarXiv:2504.04072
8
citations
#5649

RoboPEPP: Vision-Based Robot Pose and Joint Angle Estimation through Embedding Predictive Pre-Training

Raktim Gautam Goswami, Prashanth Krishnamurthy, Yann LeCun et al.

CVPR 2025highlightarXiv:2411.17662
8
citations
#5650

Can DPO Learn Diverse Human Values? A Theoretical Scaling Law

Shawn Im, Sharon Li

NEURIPS 2025arXiv:2408.03459
8
citations
#5651

ChatReID: Open-ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models

Ke Niu, Haiyang Yu, Mengyang Zhao et al.

ICCV 2025arXiv:2502.19958
8
citations
#5652

A multiscale analysis of mean-field transformers in the moderate interaction regime

Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi

NEURIPS 2025oralarXiv:2509.25040
8
citations
#5653

MUNBa: Machine Unlearning via Nash Bargaining

Jing Wu, Mehrtash Harandi

ICCV 2025arXiv:2411.15537
8
citations
#5654

Interpretable Image Classification via Non-parametric Part Prototype Learning

Zhijie Zhu, Lei Fan, Maurice Pagnucco et al.

CVPR 2025arXiv:2503.10247
8
citations
#5655

Noise Modeling in One Hour: Minimizing Preparation Efforts for Self-supervised Low-Light RAW Image Denoising

Feiran Li, Haiyang Jiang, Daisuke Iso

CVPR 2025arXiv:2505.00045
8
citations
#5656

FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs

Xiaoqin Wang, Xusen Ma, Xianxu Hou et al.

CVPR 2025arXiv:2503.21457
8
citations
#5657

DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion

Qitao Zhao, Amy Lin, Jeff Tan et al.

CVPR 2025arXiv:2505.05473
8
citations
#5658

ProbPose: A Probabilistic Approach to 2D Human Pose Estimation

Miroslav Purkrábek, Jiri Matas

CVPR 2025arXiv:2412.02254
8
citations
#5659

What Makes Math Problems Hard for Reinforcement Learning: A Case Study

Ali Shehper, Anibal Medina-Mardones, Lucas Fagan et al.

NEURIPS 2025arXiv:2408.15332
8
citations
#5660

DnLUT: Ultra-Efficient Color Image Denoising via Channel-Aware Lookup Tables

Sidi Yang, Binxiao Huang, Yulun Zhang et al.

CVPR 2025arXiv:2503.15931
8
citations
#5661

T2I-Copilot: A Training-Free Multi-Agent Text-to-Image System for Enhanced Prompt Interpretation and Interactive Generation

Chieh-Yun Chen, Min Shi, Gong Zhang et al.

ICCV 2025arXiv:2507.20536
8
citations
#5662

ZIM: Zero-Shot Image Matting for Anything

Beomyoung Kim, Chanyong Shin, Joonhyun Jeong et al.

ICCV 2025highlightarXiv:2411.00626
8
citations
#5663

GraphLand: Evaluating Graph Machine Learning Models on Diverse Industrial Data

Gleb Bazhenov, Oleg Platonov, Liudmila Prokhorenkova

NEURIPS 2025oralarXiv:2409.14500
8
citations
#5664

LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS

Wanhua Li, Yujie Zhao, Minghan Qin et al.

NEURIPS 2025arXiv:2507.07136
8
citations
#5665

HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

Jiazi Bu, Pengyang Ling, Yujie Zhou et al.

NEURIPS 2025arXiv:2504.06232
8
citations
#5666

MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI

Huanjin Yao, Jiaxing Huang, Yawen Qiu et al.

ICCV 2025arXiv:2506.23563
8
citations
#5667

GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation

Ning Gao, Yilun Chen, Shuai Yang et al.

CVPR 2025arXiv:2506.10966
8
citations
#5668

DeNVeR: Deformable Neural Vessel Representations for Unsupervised Video Vessel Segmentation

Chun-Hung Wu, Shih-Hong Chen, Chih Yao Hu et al.

CVPR 2025arXiv:2406.01591
8
citations
#5669

Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways

Yi Liu, Hao Zhou, Benlei Cui et al.

CVPR 2025highlightarXiv:2503.07026
8
citations
#5670

Exploring Temporally-Aware Features for Point Tracking

Inès Hyeonsu Kim, Seokju Cho, Gabriel Huang et al.

CVPR 2025arXiv:2501.12218
8
citations
#5671

Generating Multimodal Driving Scenes via Next-Scene Prediction

Yanhao Wu, Haoyang Zhang, Tianwei Lin et al.

CVPR 2025arXiv:2503.14945
8
citations
#5672

COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation

Xueqing Deng, Linjie Yang, Qihang Yu et al.

NEURIPS 2025arXiv:2502.02589
8
citations
#5673

Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images

Kazi Sajeed Mehrab, M. Maruf, Arka Daw et al.

CVPR 2025arXiv:2407.08027
8
citations
#5674

RoME: Domain-Robust Mixture-of-Experts for MILP Solution Prediction across Domains

Tianle Pu, Zijie Geng, Haoyang Liu et al.

NEURIPS 2025arXiv:2511.02331
8
citations
#5675

Cross-modal Causal Relation Alignment for Video Question Grounding

Weixing Chen, Yang Liu, Binglin Chen et al.

CVPR 2025highlightarXiv:2503.07635
8
citations
#5676

SyncDiff: Synchronized Motion Diffusion for Multi-Body Human-Object Interaction Synthesis

Wenkun He, Yun Liu, Ruitao Liu et al.

ICCV 2025arXiv:2412.20104
8
citations
#5677

Bringing RNNs Back to Efficient Open-Ended Video Understanding

Weili Xu, Enxin Song, Wenhao Chai et al.

ICCV 2025arXiv:2507.02591
8
citations
#5678

Overcoming Challenges of Long-Horizon Prediction in Driving World Models

Arian Mousakhan, Sudhanshu Mittal, Silvio Galesso et al.

NEURIPS 2025arXiv:2507.13162
8
citations
#5679

AFL: A Single-Round Analytic Approach for Federated Learning with Pre-trained Models

Run He, Kai Tong, Di Fang et al.

CVPR 2025arXiv:2405.16240
8
citations
#5680

Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition

Chengxiang Huang, Yake Wei, Zequn Yang et al.

CVPR 2025arXiv:2503.18595
8
citations
#5681

From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring

Yang Li, Qiang Sheng, Yehan Yang et al.

NEURIPS 2025arXiv:2506.09996
8
citations
#5682

nvBench 2.0: Resolving Ambiguity in Text-to-Visualization through Stepwise Reasoning

Tianqi Luo, Chuhan Huang, Leixian Shen et al.

NEURIPS 2025arXiv:2503.12880
8
citations
#5683

ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer

Jiayi Gao, Zijin Yin, Changcheng Hua et al.

CVPR 2025arXiv:2504.02451
8
citations
#5684

LangBridge: Interpreting Image as a Combination of Language Embeddings

Jiaqi Liao, Yuwei Niu, Fanqing Meng et al.

ICCV 2025arXiv:2503.19404
8
citations
#5685

HAMoBE: Hierarchical and Adaptive Mixture of Biometric Experts for Video-based Person ReID

Yiyang Su, Yunping Shi, Feng Liu et al.

ICCV 2025arXiv:2508.05038
8
citations
#5686

CrossOver: 3D Scene Cross-Modal Alignment

Sayan Deb Sarkar, Ondrej Miksik, Marc Pollefeys et al.

CVPR 2025highlightarXiv:2502.15011
8
citations
#5687

ProKeR: A Kernel Perspective on Few-Shot Adaptation of Large Vision-Language Models

Yassir Bendou, Amine Ouasfi, Vincent Gripon et al.

CVPR 2025arXiv:2501.11175
8
citations
#5688

Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning

Saemi Moon, Minjong Lee, Sangdon Park et al.

ICCV 2025arXiv:2410.05664
8
citations
#5689

GS-2DGS: Geometrically Supervised 2DGS for Reflective Object Reconstruction

Jinguang Tong, Xuesong Li, Fahira Afzal Maken et al.

CVPR 2025arXiv:2506.13110
8
citations
#5690

PurpCode: Reasoning for Safer Code Generation

Jiawei Liu, Nirav Diwan, Zhe Wang et al.

NEURIPS 2025arXiv:2507.19060
8
citations
#5691

Deeply Supervised Flow-Based Generative Models

Inkyu Shin, Chenglin Yang, Liang-Chieh Chen

ICCV 2025arXiv:2503.14494
8
citations
#5692

3D-GSW: 3D Gaussian Splatting for Robust Watermarking

Youngdong Jang, Hyunje Park, Feng Yang et al.

CVPR 2025arXiv:2409.13222
8
citations
#5693

Learning normalized image densities via dual score matching

Florentin Guth, Zahra Kadkhodaie, Eero Simoncelli

NEURIPS 2025arXiv:2506.05310
8
citations
#5694

Do Automatic Factuality Metrics Measure Factuality? A Critical Evaluation

Sanjana Ramprasad, Byron Wallace

NEURIPS 2025arXiv:2411.16638
8
citations
#5695

Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation

Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.

CVPR 2025arXiv:2405.18840
8
citations
#5696

DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery

Jiadong Tang, Yu Gao, Dianyi Yang et al.

CVPR 2025highlightarXiv:2503.16964
8
citations
#5697

GaussianFusion: Gaussian-Based Multi-Sensor Fusion for End-to-End Autonomous Driving

Shuai Liu, Quanmin Liang, Zefeng Li et al.

NEURIPS 2025spotlightarXiv:2506.00034
8
citations
#5698

ROD-MLLM: Towards More Reliable Object Detection in Multimodal Large Language Models

Heng Yin, Yuqiang Ren, Ke Yan et al.

CVPR 2025
8
citations
#5699

The Change You Want To Detect: Semantic Change Detection In Earth Observation With Hybrid Data Generation

Yanis Benidir, Nicolas Gonthier, Clement Mallet

CVPR 2025
8
citations
#5700

Focusing on Tracks for Online Multi-Object Tracking

Kyujin Shim, Kangwook Ko, YuJin Yang et al.

CVPR 2025
8
citations
#5701

BANet: Bilateral Aggregation Network for Mobile Stereo Matching

Gangwei Xu, Jiaxin Liu, Xianqi Wang et al.

ICCV 2025arXiv:2503.03259
8
citations
#5702

SIGMAN: Scaling 3D Human Gaussian Generation with Millions of Assets

Yuhang Yang, Fengqi Liu, Yixing Lu et al.

ICCV 2025
8
citations
#5703

MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models

Hengzhi Li, Megan Tjandrasuwita, Yi R. (May) Fung et al.

NEURIPS 2025arXiv:2502.16671
8
citations
#5704

Extrapolated Urban View Synthesis Benchmark

Xiangyu Han, Zhen Jia, Boyi Li et al.

ICCV 2025arXiv:2412.05256
8
citations
#5705

Solving Inverse Problems with FLAIR

Julius Erbach, Dominik Narnhofer, Andreas Dombos et al.

NEURIPS 2025arXiv:2506.02680
8
citations
#5706

Split Gibbs Discrete Diffusion Posterior Sampling

Wenda Chu, Zihui Wu, Yifan Chen et al.

NEURIPS 2025arXiv:2503.01161
8
citations
#5707

3D Student Splatting and Scooping

Jialin Zhu, Jiangbei Yue, Feixiang He et al.

CVPR 2025arXiv:2503.10148
8
citations
#5708

MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly

Zhaowei Wang, Wenhao Yu, Xiyu Ren et al.

NEURIPS 2025spotlightarXiv:2505.10610
8
citations
#5709

True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics

Christoph Jürgen Hemmer, Daniel Durstewitz

NEURIPS 2025oralarXiv:2505.13192
8
citations
#5710

Decouple and Track: Benchmarking and Improving Video Diffusion Transformers For Motion Transfer

Qingyu Shi, Jianzong Wu, Jinbin Bai et al.

ICCV 2025arXiv:2503.17350
8
citations
#5711

RAD: Region-Aware Diffusion Models for Image Inpainting

Sora Kim, Sungho Suh, Minsik Lee

CVPR 2025arXiv:2412.09191
8
citations
#5712

Boost Your Human Image Generation Model via Direct Preference Optimization

Sanghyeon Na, Yonggyu Kim, Hyunjoon Lee

CVPR 2025highlightarXiv:2405.20216
8
citations
#5713

Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion

Zexin He, Tengfei Wang, Xin Huang et al.

CVPR 2025arXiv:2412.09593
8
citations
#5714

Exploring CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation

Zhiwei Yang, Yucong Meng, Kexue Fu et al.

CVPR 2025arXiv:2503.20826
8
citations
#5715

CRISP: Object Pose and Shape Estimation with Test-Time Adaptation

Jingnan Shi, Rajat Talak, Harry Zhang et al.

CVPR 2025highlightarXiv:2412.01052
8
citations
#5716

DynaGuide: Steering Diffusion Polices with Active Dynamic Guidance

Maximilian Du, Shuran Song

NEURIPS 2025arXiv:2506.13922
8
citations
#5717

Towards Stable and Storage-efficient Dataset Distillation: Matching Convexified Trajectory

Wenliang Zhong, Haoyu Tang, Qinghai Zheng et al.

CVPR 2025arXiv:2406.19827
8
citations
#5718

g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks

Zihan Wang, Gim Hee Lee

CVPR 2025arXiv:2411.17030
8
citations
#5719

Edge-SD-SR: Low Latency and Parameter Efficient On-device Super-Resolution with Stable Diffusion via Bidirectional Conditioning

Isma Hadji, Mehdi Noroozi, Victor Escorcia et al.

CVPR 2025arXiv:2412.06978
8
citations
#5720

SimpleStrat: Diversifying Language Model Generation with Stratification

Justin Wong, Yury Orlovskiy, Alexander Shypula et al.

NEURIPS 2025arXiv:2410.09038
8
citations
#5721

MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios

Yang Shi, Huanqian Wang, Xie et al.

NEURIPS 2025oralarXiv:2505.21333
8
citations
#5722

MLZero: A Multi-Agent System for End-to-end Machine Learning Automation

Haoyang Fang, Boran Han, Nick Erickson et al.

NEURIPS 2025arXiv:2505.13941
8
citations
#5723

Training-Free Text-Guided Image Editing with Visual Autoregressive Model

Yufei Wang, Lanqing Guo, Zhihao Li et al.

ICCV 2025arXiv:2503.23897
8
citations
#5724

UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection

Xin Jin, Haisheng Su, Kai Liu et al.

CVPR 2025arXiv:2503.12009
8
citations
#5725

Sparse2DGS: Geometry-Prioritized Gaussian Splatting for Surface Reconstruction from Sparse Views

Jiang Wu, Rui Li, Yu Zhu et al.

CVPR 2025arXiv:2504.20378
8
citations
#5726

DyCON: Dynamic Uncertainty-aware Consistency and Contrastive Learning for Semi-supervised Medical Image Segmentation

Maregu Assefa, Muzammal Naseer, Iyyakutti Iyappan Ganapathi et al.

CVPR 2025arXiv:2504.04566
8
citations
#5727

FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation

Ariel Shaulov, Itay Hazan, Lior Wolf et al.

NEURIPS 2025oralarXiv:2506.01144
8
citations
#5728

DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry

Jing Li, Yihang Fu, Falai Chen

CVPR 2025arXiv:2503.13110
8
citations
#5729

IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering

Hengyu Liu, Chenxin Li, Zhengxin Li et al.

NEURIPS 2025arXiv:2506.23329
8
citations
#5730

DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability

Xirui Hu, Jiahao Wang, Hao Chen et al.

ICCV 2025arXiv:2503.06505
8
citations
#5731

Analyzing Finetuning Representation Shift for Multimodal LLMs Steering

Pegah Khayatan, Mustafa Shukor, Jayneel Parekh et al.

ICCV 2025arXiv:2501.03012
8
citations
#5732

VideoMAR: Autoregressive Video Generation with Continuous Tokens

Hu Yu, Biao Gong, Hangjie Yuan et al.

NEURIPS 2025oral
8
citations
#5733

A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search

Arnav Kumar Jain, Vibhakar Mohta, Subin Kim et al.

NEURIPS 2025oralarXiv:2506.05294
8
citations
#5734

Composite Flow Matching for Reinforcement Learning with Shifted-Dynamics Data

Lingkai Kong, Haichuan Wang, Tonghan Wang et al.

NEURIPS 2025spotlightarXiv:2505.23062
8
citations
#5735

Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning

Pengxiang Li, Zhi Gao, Bofei Zhang et al.

NEURIPS 2025arXiv:2504.21561
8
citations
#5736

LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living

Dominick Reilly, Rajatsubhra Chakraborty, Arkaprava Sinha et al.

CVPR 2025arXiv:2406.09390
8
citations
#5737

Finite-Sample Analysis of Policy Evaluation for Robust Average Reward Reinforcement Learning

Yang Xu, Washim Mondal, Vaneet Aggarwal

NEURIPS 2025arXiv:2502.16816
8
citations
#5738

DAViD: Modeling Dynamic Affordance of 3D Objects Using Pre-trained Video Diffusion Models

Hyeonwoo Kim, Sangwon Baik, Hanbyul Joo

ICCV 2025arXiv:2501.08333
8
citations
#5739

HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Models

Yiwen Chen, Hieu Nguyen, Vikram Voleti et al.

ICCV 2025highlightarXiv:2406.20077
8
citations
#5740

CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning

Jiangpeng He, Zhihao Duan, Fengqing Zhu

CVPR 2025arXiv:2505.24816
8
citations
#5741

SeqGrowGraph: Learning Lane Topology as a Chain of Graph Expansions

Mengwei Xie, Shuang Zeng, Xinyuan Chang et al.

ICCV 2025arXiv:2507.04822
8
citations
#5742

Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

Hyungjoo Chae, Seonghwan Kim, Junhee Cho et al.

NEURIPS 2025spotlightarXiv:2505.15277
8
citations
#5743

Multi-modal Knowledge Distillation-based Human Trajectory Forecasting

Jaewoo Jeong, Seohee Lee, Daehee Park et al.

CVPR 2025arXiv:2503.22201
8
citations
#5744

GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks

Haoqiang Kang, Enna Sachdeva, Piyush Gupta et al.

CVPR 2025arXiv:2503.06514
8
citations
#5745

Learned Image Compression with Hierarchical Progressive Context Modeling

Yuqi Li, Haotian Zhang, Li Li et al.

ICCV 2025arXiv:2507.19125
8
citations
#5746

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Tianhao Qi, Jianlong Yuan, Wanquan Feng et al.

CVPR 2025
8
citations
#5747

MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation

Zhenwen Liang, Linfeng Song, Yang Li et al.

NEURIPS 2025arXiv:2505.10962
8
citations
#5748

AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios

Ziming Huang, Xurui Li, Haotian Liu et al.

CVPR 2025arXiv:2410.14379
8
citations
#5749

Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding

Jinlong Li, Cristiano Saltori, Fabio Poiesi et al.

CVPR 2025arXiv:2503.16707
8
citations
#5750

Visual Jenga: Discovering Object Dependencies via Counterfactual Inpainting

Anand Bhattad, Konpat Preechakul, Alexei Efros

NEURIPS 2025arXiv:2503.21770
8
citations
#5751

Generative Zero-Shot Composed Image Retrieval

Lan Wang, Wei Ao, Vishnu Naresh Boddeti et al.

CVPR 2025
8
citations
#5752

GC4NC: A Benchmark Framework for Graph Condensation on Node Classification with New Insights

Shengbo Gong, Juntong Ni, Noveen Sachdeva et al.

NEURIPS 2025arXiv:2406.16715
8
citations
#5753

Always Skip Attention

Yiping Ji, Hemanth Saratchandran, Peyman Moghadam et al.

ICCV 2025arXiv:2505.01996
8
citations
#5754

Extrapolation by Association: Length Generalization Transfer In Transformers

Ziyang Cai, Nayoung Lee, Avi Schwarzschild et al.

NEURIPS 2025spotlightarXiv:2506.09251
8
citations
#5755

LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty

Christoforos N. Spartalis, Theodoros Semertzidis, Efstratios Gavves et al.

CVPR 2025arXiv:2503.18314
8
citations
#5756

Token Activation Map to Visually Explain Multimodal LLMs

Yi Li, Hualiang Wang, Xinpeng Ding et al.

ICCV 2025arXiv:2506.23270
8
citations
#5757

Linear Attention Modeling for Learned Image Compression

Donghui Feng, Zhengxue Cheng, Shen Wang et al.

CVPR 2025arXiv:2502.05741
8
citations
#5758

Multi-Granularity Class Prototype Topology Distillation for Class-Incremental Source-Free Unsupervised Domain Adaptation

Peihua Deng, Jiehua Zhang, Xichun Sheng et al.

CVPR 2025arXiv:2411.16064
8
citations
#5759

OpenGU: A Comprehensive Benchmark for Graph Unlearning

Bowen Fan, Yuming Ai, Xunkai Li et al.

NEURIPS 2025arXiv:2501.02728
8
citations
#5760

H3D-DGS: Exploring Heterogeneous 3D Motion Representation for Deformable 3D Gaussian Splatting

Bing He, Yunuo Chen, Guo Lu et al.

NEURIPS 2025arXiv:2408.13036
8
citations
#5761

PRE-Mamba: A 4D State Space Model for Ultra-High-Frequent Event Camera Deraining

Ciyu Ruan, Ruishan Guo, Zihang Gong et al.

ICCV 2025arXiv:2505.05307
8
citations
#5762

RS-vHeat: Heat Conduction Guided Efficient Remote Sensing Foundation Model

Huiyang Hu, Peijin Wang, Hanbo Bi et al.

ICCV 2025arXiv:2411.17984
8
citations
#5763

DA-VPT: Semantic-Guided Visual Prompt Tuning for Vision Transformers

Li Ren, Chen Chen, Liqiang Wang et al.

CVPR 2025arXiv:2505.23694
8
citations
#5764

Token Embeddings Violate the Manifold Hypothesis

Michael Robinson, Sourya Dey, Tony Chiang

NEURIPS 2025arXiv:2504.01002
8
citations
#5765

Training-Free Safe Denoisers for Safe Use of Diffusion Models

Mingyu Kim, Dongjun Kim, Amman Yusuf et al.

NEURIPS 2025arXiv:2502.08011
8
citations
#5766

DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization

Zihan Ding, Chi Jin, Difan Liu et al.

ICCV 2025arXiv:2412.15689
8
citations
#5767

DiC: Rethinking Conv3x3 Designs in Diffusion Models

Yuchuan Tian, Jing Han, Chengcheng Wang et al.

CVPR 2025arXiv:2501.00603
8
citations
#5768

PEACE: Empowering Geologic Map Holistic Understanding with MLLMs

Yangyu Huang, Tianyi Gao, Haoran Xu et al.

CVPR 2025arXiv:2501.06184
8
citations
#5769

InfiniDreamer: Arbitrarily Long Human Motion Generation via Segment Score Distillation

Wenjie Zhuo, Fan Ma, Hehe Fan

ICCV 2025arXiv:2411.18303
8
citations
#5770

GaussRender: Learning 3D Occupancy with Gaussian Rendering

Loick Chambon, Eloi Zablocki, Alexandre Boulch et al.

ICCV 2025arXiv:2502.05040
8
citations
#5771

TimE: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios

Shaohang Wei, Wei Li, Feifan Song et al.

NEURIPS 2025oralarXiv:2505.12891
8
citations
#5772

Interactive Medical Image Analysis with Concept-based Similarity Reasoning

Ta Duc Huy, Sen Kim Tran, Phan Nguyen et al.

CVPR 2025arXiv:2503.06873
8
citations
#5773

DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing

Chenxi Xie, Minghan Li, Shuai Li et al.

NEURIPS 2025spotlightarXiv:2506.01430
8
citations
#5774

Optical-Flow Guided Prompt Optimization for Coherent Video Generation

Hyelin Nam, Jaemin Kim, Dohun Lee et al.

CVPR 2025arXiv:2411.15540
8
citations
#5775

CoTMR: Chain-of-Thought Multi-Scale Reasoning for Training-Free Zero-Shot Composed Image Retrieval

Zelong Sun, Dong Jing, Zhiwu Lu

ICCV 2025arXiv:2502.20826
8
citations
#5776

Scalable Fingerprinting of Large Language Models

Anshul Nasery, Jonathan Hayase, Creston Brooks et al.

NEURIPS 2025spotlightarXiv:2502.07760
8
citations
#5777

Erasing More Than Intended? How Concept Erasure Degrades the Generation of Non-Target Concepts

Ibtihel Amara, Ahmed Imtiaz Humayun, Ivana Kajic et al.

ICCV 2025arXiv:2501.09833
8
citations
#5778

POp-GS: Next Best View in 3D-Gaussian Splatting with P-Optimality

Joey Wilson, Marcelino M. de Almeida, Sachit Mahajan et al.

CVPR 2025arXiv:2503.07819
8
citations
#5779

Kinetics: Rethinking Test-Time Scaling Law

Ranajoy Sadhukhan, Zhuoming Chen, Haizhong Zheng et al.

NEURIPS 2025arXiv:2506.05333
8
citations
#5780

Joint Out-of-Distribution Filtering and Data Discovery Active Learning

Sebastian Schmidt, Leonard Schenk, Leo Schwinn et al.

CVPR 2025arXiv:2503.02491
8
citations
#5781

Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels

Maximilian Beck, Korbinian Pöppel, Phillip Lippe et al.

NEURIPS 2025arXiv:2503.14376
8
citations
#5782

Non-equilibrium Annealed Adjoint Sampler

Jaemoo Choi, Yongxin Chen, Molei Tao et al.

NEURIPS 2025arXiv:2506.18165
8
citations
#5783

CacheQuant: Comprehensively Accelerated Diffusion Models

Xuewen Liu, Zhikai Li, Qingyi Gu

CVPR 2025arXiv:2503.01323
8
citations
#5784

MEGA: Masked Generative Autoencoder for Human Mesh Recovery

Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda et al.

CVPR 2025arXiv:2405.18839
8
citations
#5785

PhysSplat: Efficient Physics Simulation for 3D Scenes via MLLM-Guided Gaussian Splatting

Haoyu Zhao, Hao Wang, Xingyue Zhao et al.

ICCV 2025
8
citations
#5786

Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects

Weimin Qiu, Jieke Wang, Meng Tang

CVPR 2025arXiv:2411.18936
8
citations
#5787

One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution

Yujing Sun, Lingchen Sun, Shuaizheng Liu et al.

NEURIPS 2025oralarXiv:2506.15591
8
citations
#5788

Adversarial Paraphrasing: A Universal Attack for Humanizing AI-Generated Text

Yize Cheng, Vinu Sankar Sadasivan, Mehrdad Saberi et al.

NEURIPS 2025arXiv:2506.07001
8
citations
#5789

ARM: Appearance Reconstruction Model for Relightable 3D Generation

Xiang Feng, Chang Yu, Zoubin Bi et al.

CVPR 2025highlightarXiv:2411.10825
8
citations
#5790

SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance

Peishan Cong, Ziyi Wang, Yuexin Ma et al.

CVPR 2025arXiv:2503.01291
8
citations
#5791

Compression-Aware One-Step Diffusion Model for JPEG Artifact Removal

Jinpei Guo, Zheng Chen, Wenbo Li et al.

ICCV 2025arXiv:2502.09873
8
citations
#5792

DISC: Dynamic Decomposition Improves LLM Inference Scaling

Jonathan Li, Wei Cheng, Benjamin Riviere et al.

NEURIPS 2025arXiv:2502.16706
8
citations
#5793

AV-Flow: Transforming Text to Audio-Visual Human-like Interactions

Aggelina Chatziagapi, Louis-Philippe Morency, Hongyu Gong et al.

ICCV 2025arXiv:2502.13133
8
citations
#5794

4KAgent: Agentic Any Image to 4K Super-Resolution

Yushen Zuo, Qi Zheng, Mingyang Wu et al.

NEURIPS 2025arXiv:2507.07105
8
citations
#5795

Tartan IMU: A Light Foundation Model for Inertial Positioning in Robotics

Shibo Zhao, Sifan Zhou, Raphael Blanchard et al.

CVPR 2025
8
citations
#5796

StateSpaceDiffuser: Bringing Long Context to Diffusion World Models

Nedko Savov, Naser Kazemi, Deheng Zhang et al.

NEURIPS 2025oralarXiv:2505.22246
8
citations
#5797

Foundations of Top-$k$ Decoding for Language Models

Georgy Noarov, Soham Mallick, Tao Wang et al.

NEURIPS 2025arXiv:2505.19371
8
citations
#5798

Hybrid Latent Reasoning via Reinforcement Learning

Zhenrui Yue, Bowen Jin, Huimin Zeng et al.

NEURIPS 2025arXiv:2505.18454
8
citations
#5799

Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

Bolin Lai, Felix Juefei-Xu, Miao Liu et al.

CVPR 2025highlightarXiv:2412.01027
8
citations
#5800

Enhancing Creative Generation on Stable Diffusion-based Models

Jiyeon Han, Dahee Kwon, Gayoung Lee et al.

CVPR 2025arXiv:2503.23538
8
citations