Most Cited 2025 "hardware robotic control" Papers

22,274 papers found • Page 65 of 112

#12801

Visual Consensus Prompting for Co-Salient Object Detection

Jie Wang, Nana Yu, Zihao Zhang et al.

CVPR 2025 • arXiv:2504.14254
2
citations
#12802

Flexible Group Count Enables Hassle-Free Structured Pruning

Jiamu Zhang, Shaochen Zhong, Andrew Ye et al.

CVPR 2025
2
citations
#12803

Object-level Correlation for Few-Shot Segmentation

Chunlin Wen, Yu Zhang, Jie Fan et al.

ICCV 2025 • arXiv:2509.07917
2
citations
#12804

GLEAM: Enhanced Transferable Adversarial Attacks for Vision-Language Pre-training Models via Global-Local Transformations

Yunqi Liu, Xiaohui Cui, Ouyang Xue

ICCV 2025
2
citations
#12805

EasyCraft: A Robust and Efficient Framework for Automatic Avatar Crafting

Suzhen Wang, Weijie Chen, Wei Zhang et al.

CVPR 2025 • arXiv:2503.01158
2
citations
#12806

RefPose: Leveraging Reference Geometric Correspondences for Accurate 6D Pose Estimation of Unseen Objects

Jaeguk Kim, Jaewoo Park, Keuntek Lee et al.

CVPR 2025 • arXiv:2505.10841
2
citations
#12807

Neural Collapse in Cumulative Link Models for Ordinal Regression: An Analysis with Unconstrained Feature Model

Chuang Ma, Tomoyuki Obuchi, Toshiyuki Tanaka

NEURIPS 2025 • arXiv:2506.05801
2
citations
#12808

ChartPoint: Guiding MLLMs with Grounding Reflection for Chart Reasoning

Zhengzhuo Xu, Sinan Du, Yiyan Qi et al.

ICCV 2025 • arXiv:2512.00305
2
citations
#12809

Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation

Yong Liu, Song-Li Wu, Sule Bai et al.

ICCV 2025 • arXiv:2506.16058
2
citations
#12810

PhyS-EdiT: Physics-aware Semantic Image Editing with Text Description

Ziqi Cai, Shuchen Weng, Yifei Xia et al.

CVPR 2025
2
citations
#12811

MaGS: Reconstructing and Simulating Dynamic 3D Objects with Mesh-adsorbed Gaussian Splatting

Shaojie Ma, Yawei Luo, Wei Yang et al.

ICCV 2025 • highlight • arXiv:2406.01593
2
citations
#12812

A Unifying View of Linear Function Approximation in Off-Policy RL Through Matrix Splitting and Preconditioning

Zechen Wu, Amy Greenwald, Ronald Parr

NEURIPS 2025 • oral • arXiv:2501.01774
2
citations
#12813

Constrained Posterior Sampling: Time Series Generation with Hard Constraints

Sai Shankar Narasimhan, Shubhankar Agarwal, Litu Rout et al.

NEURIPS 2025 • arXiv:2410.12652
2
citations
#12814

Fancy123: One Image to High-Quality 3D Mesh Generation via Plug-and-Play Deformation

Qiao Yu, Xianzhi Li, Yuan Tang et al.

CVPR 2025 • arXiv:2411.16185
2
citations
#12815

Hand-held Object Reconstruction from RGB Video with Dynamic Interaction

Shijian Jiang, Qi Ye, Rengan Xie et al.

CVPR 2025
2
citations
#12816

Exploiting the Asymmetric Uncertainty Structure of Pre-trained VLMs on the Unit Hypersphere

Li Ju, Max Andersson, Stina Fredriksson et al.

NEURIPS 2025 • arXiv:2505.11029
2
citations
#12817

A Practical Guide for Incorporating Symmetry in Diffusion Policy

Dian Wang, Boce Hu, Shuran Song et al.

NEURIPS 2025 • arXiv:2505.13431
2
citations
#12818

Bootstrap Your Own Views: Masked Ego-Exo Modeling for Fine-grained View-invariant Video Representations

Jungin Park, Jiyoung Lee, Kwanghoon Sohn

CVPR 2025 • arXiv:2503.19706
2
citations
#12819

Multivariate Latent Recalibration for Conditional Normalizing Flows

Victor Dheur, Souhaib Ben Taieb

NEURIPS 2025 • arXiv:2505.16636
2
citations
#12820

Reinforcement Learning Meets Masked Generative Models: Mask-GRPO for Text-to-Image Generation

Yifu Luo, Xinhao Hu, Keyu Fan et al.

NEURIPS 2025 • arXiv:2510.13418
2
citations
#12821

Few-Shot Learning from Gigapixel Images via Hierarchical Vision-Language Alignment and Modeling

Bryan Wong, Jongwoo Kim, Huazhu Fu et al.

NEURIPS 2025 • arXiv:2505.17982
2
citations
#12822

GausSim: Foreseeing Reality by Gaussian Simulator for Elastic Objects

Yidi Shao, Mu Huang, Chen Change Loy et al.

ICCV 2025 • arXiv:2412.17804
2
citations
#12823

HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding

Jiahe Zhao, RuiBing Hou, Zejie Tian et al.

ICCV 2025 • arXiv:2503.12955
2
citations
#12824

ProReflow: Progressive Reflow with Decomposed Velocity

Lei Ke, Haohang Xu, Xuefei Ning et al.

CVPR 2025 • arXiv:2503.04824
2
citations
#12825

LP-Diff: Towards Improved Restoration of Real-World Degraded License Plate

Haoyan Gong, Zhenrong Zhang, Yuzheng Feng et al.

CVPR 2025 • highlight
2
citations
#12826

Evaluating Model Perception of Color Illusions in Photorealistic Scenes

Lingjun Mao, Zineng Tang, Alane Suhr

CVPR 2025 • arXiv:2412.06184
2
citations
#12827

Generative Active Learning for Long-tail Trajectory Prediction via Controllable Diffusion Model

Daehee Park, Monu Surana, Pranav Desai et al.

ICCV 2025 • arXiv:2507.22615
2
citations
#12828

The Promise of RL for Autoregressive Image Editing

Saba Ahmadi, Rabiul Awal, Ankur Sikarwar et al.

NEURIPS 2025 • arXiv:2508.01119
2
citations
#12829

GLSim: Detecting Object Hallucinations in LVLMs via Global-Local Similarity

Seongheon Park, Sharon Li

NEURIPS 2025 • arXiv:2508.19972
2
citations
#12830

PointMapPolicy: Structured Point Cloud Processing for Multi-Modal Imitation Learning

Xiaogang Jia, Qian Wang, Anrui Wang et al.

NEURIPS 2025 • arXiv:2510.20406
2
citations
#12831

Teaching Language Models to Reason with Tools

Chengpeng Li, Zhengyang Tang, Ziniu Li et al.

NEURIPS 2025 • arXiv:2510.20342
2
citations
#12832

CLIMB: Class-imbalanced Learning Benchmark on Tabular Data

Zhining Liu, Zihao Li, Ze Yang et al.

NEURIPS 2025 • arXiv:2505.17451
2
citations
#12833

Demeter: A Parametric Model of Crop Plant Morphology from the Real World

Tianhang Cheng, Albert Zhai, Evan Chen et al.

ICCV 2025 • arXiv:2510.16377
2
citations
#12834

OpenWorldSAM: Extending SAM2 for Universal Image Segmentation with Language Prompts

Shiting (Ginny) Xiao, Rishabh Kabra, Yuhang Li et al.

NEURIPS 2025 • spotlight • arXiv:2507.05427
2
citations
#12835

CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation

Jianyu Wu, Yizhou Wang, Xiangyu Yue et al.

ICCV 2025 • arXiv:2504.20830
2
citations
#12836

Scalable In-context Ranking with Generative Models

Nilesh Gupta, Chong You, Srinadh Bhojanapalli et al.

NEURIPS 2025 • arXiv:2510.05396
2
citations
#12837

Perspective-Aware Teaching: Adapting Knowledge for Heterogeneous Distillation

Jhe-Hao Lin, Yi Yao, Chan-Feng Hsu et al.

ICCV 2025 • arXiv:2501.08885
2
citations
#12838

Self-Supervised Spatial Correspondence Across Modalities

Ayush Shrivastava, Andrew Owens

CVPR 2025 • arXiv:2506.03148
2
citations
#12839

Unsupervised Continual Domain Shift Learning with Multi-Prototype Modeling

Haopeng Sun, Yingwei Zhang, Lumin Xu et al.

CVPR 2025 • highlight
2
citations
#12840

High-order Equivariant Flow Matching for Density Functional Theory Hamiltonian Prediction

Seongsu Kim, Nayoung Kim, Dongwoo Kim et al.

NEURIPS 2025 • spotlight • arXiv:2505.18817
2
citations
#12841

SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem

Ahmed Heakl, Yahia Salaheldin Shaaban, Salem Lahlou et al.

NEURIPS 2025 • arXiv:2505.21887
2
citations
#12842

ACCO: Accumulate While You Communicate for Communication-Overlapped Sharded LLM Training

Adel Nabli, Louis Fournier, Pierre Erbacher et al.

NEURIPS 2025 • arXiv:2406.02613
2
citations
#12843

Edit Less, Achieve More: Dynamic Sparse Neuron Masking for Lifelong Knowledge Editing in LLMs

Jinzhe Liu, Junshu Sun, Shufan Shen et al.

NEURIPS 2025 • arXiv:2510.22139
2
citations
#12844

HMARL-CBF – Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems

H M Sabbir Ahmad, Ehsan Sabouni, Alexander Wasilkoff et al.

NEURIPS 2025
2
citations
#12845

FLOWING: Implicit Neural Flows for Structure-Preserving Morphing

Arthur Bizzi, Matias Grynberg Portnoy, Vitor Pereira Matias et al.

NEURIPS 2025 • oral • arXiv:2510.09537
2
citations
#12846

A Structure-aware and Motion-adaptive Framework for 3D Human Pose Estimation with Mamba

Ye Lu, Jie Wang, Jianjun Gao et al.

ICCV 2025 • arXiv:2507.19852
2
citations
#12847

Proximalized Preference Optimization for Diverse Feedback Types: A Decomposed Perspective on DPO

Kaiyang Guo, Yinchuan Li, Zhitang Chen

NEURIPS 2025 • arXiv:2505.23316
2
citations
#12848

PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning

Zongqian Li, Yixuan Su, Nigel Collier

NEURIPS 2025 • arXiv:2505.09519
2
citations
#12849

Continuous Simplicial Neural Networks

Aref Einizade, Dorina Thanou, Fragkiskos Malliaros et al.

NEURIPS 2025 • arXiv:2503.12919
2
citations
#12850

Neural Inverse Rendering from Propagating Light

Anagh Malik, Benjamin Attal, Andrew Xie et al.

CVPR 2025 • arXiv:2506.05347
2
citations
#12851

SplArt: Articulation Estimation and Part-Level Reconstruction with 3D Gaussian Splatting

Shengjie Lin, Jiading Fang, Muhammad Zubair Irshad et al.

ICCV 2025 • arXiv:2506.03594
2
citations
#12852

RAST: Reasoning Activation in LLMs via Small-model Transfer

Siru Ouyang, Xinyu Zhu, Zilin Xiao et al.

NEURIPS 2025 • arXiv:2506.15710
2
citations
#12853

Reasoning Mamba: Hypergraph-Guided Region Relation Calculating for Weakly Supervised Affordance Grounding

Yuxuan Wang, Aming Wu, Muli Yang et al.

CVPR 2025
2
citations
#12854

Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI

Won Jun Kim, Hyungjin Chung, Jaemin Kim et al.

CVPR 2025 • arXiv:2411.15265
2
citations
#12855

Large Language Bayes

Justin Domke

NEURIPS 2025 • arXiv:2504.14025
2
citations
#12856

Hadamax Encoding: Elevating Performance in Model-Free Atari

Jacob Eeuwe Kooi, Zhao Yang, Vincent Francois-Lavet

NEURIPS 2025 • arXiv:2505.15345
2
citations
#12857

2D Gaussian Splatting-based Sparse-view Transparent Object Depth Reconstruction via Physics Simulation for Scene Update

Jeongyun Kim, Seunghoon Jeong, Giseop Kim et al.

ICCV 2025 • arXiv:2507.11069
2
citations
#12858

LookWhere? Efficient Visual Recognition by Learning Where to Look and What to See from Self-Supervision

Anthony Fuller, Yousef Yassin, Junfeng Wen et al.

NEURIPS 2025 • arXiv:2505.18051
2
citations
#12859

DTOS: Dynamic Time Object Sensing with Large Multimodal Model

Jirui Tian, Jinrong Zhang, Shenglan Liu et al.

CVPR 2025
2
citations
#12860

KMD: Koopman Multi-modality Decomposition for Generalized Brain Tumor Segmentation under Incomplete Modalities

Tianyi Liu, Haochuan Jiang, Kaizhu Huang

CVPR 2025
2
citations
#12861

Modeling the Economic Impacts of AI Openness Regulation

Tori Qiu, Benjamin Laufer, Jon Kleinberg et al.

NEURIPS 2025 • arXiv:2507.14193
2
citations
#12862

Blind2Sound: Self-Supervised Image Denoising without Residual Noise

Jiazheng Liu, Zejin Wang, Bohao Chen et al.

ICCV 2025 • arXiv:2303.05183
2
citations
#12863

Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images

Boyang Deng, Kyle Genova, Songyou Peng et al.

ICCV 2025 • highlight • arXiv:2504.08727
2
citations
#12864

IMoRe: Implicit Program-Guided Reasoning for Human Motion Q&A

Chen Li, Chinthani Sugandhika, Ee Yeo Keat et al.

ICCV 2025 • arXiv:2508.01984
2
citations
#12865

VSC: Visual Search Compositional Text-to-Image Diffusion Model

Do Dat, Nam Hyeon-Woo, Po-Yuan Mao et al.

ICCV 2025 • arXiv:2505.01104
2
citations
#12866

Bridging the Skeleton-Text Modality Gap: Diffusion-Powered Modality Alignment for Zero-shot Skeleton-based Action Recognition

Jeonghyeok Do, Munchurl Kim

ICCV 2025 • arXiv:2411.10745
2
citations
#12867

AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data

Zengqun Zhao, Ziquan Liu, Yu Cao et al.

CVPR 2025 • arXiv:2503.05665
2
citations
#12868

ChartCap: Mitigating Hallucination of Dense Chart Captioning

Junyoung Lim, Jaewoo Ahn, Gunhee Kim

ICCV 2025 • highlight • arXiv:2508.03164
2
citations
#12869

PERSONA: Personalized Whole-Body 3D Avatar with Pose-Driven Deformations from a Single Image

Geonhee Sim, Gyeongsik Moon

ICCV 2025 • arXiv:2508.09973
2
citations
#12870

Seeing is Believing? Mitigating OCR Hallucinations in Multimodal Large Language Models

Zhentao He, Can Zhang, Ziheng Wu et al.

NEURIPS 2025 • arXiv:2506.20168
2
citations
#12871

A Unified, Resilient, and Explainable Adversarial Patch Detector

Vishesh Kumar, Akshay Agarwal

CVPR 2025
2
citations
#12872

SVG-Head: Hybrid Surface-Volumetric Gaussians for High-Fidelity Head Reconstruction and Real-Time Editing

Heyi Sun, Cong Wang, Tian-Xing Xu et al.

ICCV 2025 • arXiv:2508.09597
2
citations
#12873

Synchronization of Multiple Videos

Avihai Naaman, Ron Shapira Weber, Oren Freifeld

ICCV 2025 • arXiv:2510.14051
2
citations
#12874

Demystifying Network Foundation Models

Roman Beltiukov, Satyandra Guthula, Wenbo Guo et al.

NEURIPS 2025 • arXiv:2509.23089
2
citations
#12875

Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data

Zhiyuan Ma, Xinyue Liang, Rongyuan Wu et al.

CVPR 2025 • arXiv:2503.21694
2
citations
#12876

PseudoMapTrainer: Learning Online Mapping without HD Maps

Christian Löwens, Thorben Funke, Jingchao Xie et al.

ICCV 2025 • arXiv:2508.18788
2
citations
#12877

MaNGO — Adaptable Graph Network Simulators via Meta-Learning

Philipp Dahlinger, Tai Hoang, Denis Blessing et al.

NEURIPS 2025 • arXiv:2510.05874
2
citations
#12878

Hierarchical Flow Diffusion for Efficient Frame Interpolation

Yang Hai, Guo Wang, Tan Su et al.

CVPR 2025 • arXiv:2504.00380
2
citations
#12879

From Average-Iterate to Last-Iterate Convergence in Games: A Reduction and Its Applications

Yang Cai, Haipeng Luo, Chen-Yu Wei et al.

NEURIPS 2025 • arXiv:2506.03464
2
citations
#12880

Adapting In-Domain Few-Shot Segmentation to New Domains without Source Domain Retraining

Qi Fan, Kaiqi Liu, Nian Liu et al.

ICCV 2025 • arXiv:2504.21414
2
citations
#12881

DisenQ: Disentangling Q-Former for Activity-Biometrics

Shehreen Azad, Yogesh Rawat

ICCV 2025 • highlight • arXiv:2507.07262
2
citations
#12882

Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition

Pulkit Kumar, Shuaiyi Huang, Matthew Walmer et al.

ICCV 2025 • arXiv:2508.03695
2
citations
#12883

Datasets, Documents, and Repetitions: The Practicalities of Unequal Data Quality

Alex Fang, Hadi Pouransari, Matt Jordan et al.

NEURIPS 2025 • arXiv:2503.07879
2
citations
#12884

Improving Noise Efficiency in Privacy-preserving Dataset Distillation

Runkai Zheng, Vishnu Dasu, Yinong Wang et al.

ICCV 2025 • arXiv:2508.01749
2
citations
#12885

What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning

Chi-Hsi Kung, Frangil Ramirez, Juhyung Ha et al.

ICCV 2025 • arXiv:2503.21055
2
citations
#12886

QUT-DV25: A Dataset for Dynamic Analysis of Next-Gen Software Supply Chain Attacks

Sk Tanzir Mehedi, Raja Jurdak, Chadni Islam et al.

NEURIPS 2025 • arXiv:2505.13804
2
citations
#12887

Reconciling Stochastic and Deterministic Strategies for Zero-shot Image Restoration using Diffusion Model in Dual

Chong Wang, Lanqing Guo, Zixuan Fu et al.

CVPR 2025 • arXiv:2503.01288
2
citations
#12888

Slow Transition to Low-Dimensional Chaos in Heavy-Tailed Recurrent Neural Networks

Eva Xie, Stefan Mihalas, Łukasz Kuśmierz

NEURIPS 2025 • arXiv:2505.09816
2
citations
#12889

VoluMe – Authentic 3D Video Calls from Live Gaussian Splat Prediction

Martin de La Gorce, Charlie Hewitt, Tibor Takács et al.

ICCV 2025 • arXiv:2507.21311
2
citations
#12890

A Theoretical Framework for Grokking: Interpolation followed by Riemannian Norm Minimisation

Etienne Boursier, Scott Pesme, Radu-Alexandru Dragomir

NEURIPS 2025 • arXiv:2505.20172
2
citations
#12891

Metric Convolutions: A Unifying Theory to Adaptive Image Convolutions

Thomas Dagès, Michael Lindenbaum, Alfred Bruckstein

ICCV 2025 • arXiv:2406.05400
2
citations
#12892

PatchGuard: Adversarially Robust Anomaly Detection and Localization through Vision Transformers and Pseudo Anomalies

Mojtaba Nafez, Amirhossein Koochakian, Arad Maleki et al.

CVPR 2025 • arXiv:2506.09237
2
citations
#12893

Mixed Signals: A Diverse Point Cloud Dataset for Heterogeneous LiDAR V2X Collaboration

Katie Luo, Minh-Quan Dao, Zhenzhen Liu et al.

ICCV 2025 • arXiv:2502.14156
2
citations
#12894

On-Device Diffusion Transformer Policy for Efficient Robot Manipulation

Yiming Wu, Huan Wang, Zhenghao Chen et al.

ICCV 2025 • arXiv:2508.00697
2
citations
#12895

A Partition Cover Approach to Tokenization

Jia Peng Lim, Shawn Tan, XianJun, Davin Choo et al.

NEURIPS 2025 • arXiv:2501.06246
2
citations
#12896

UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation

Qihui Zhang, Munan Ning, Zheyuan Liu et al.

CVPR 2025 • arXiv:2503.14941
2
citations
#12897

Towards Explicit Exoskeleton for the Reconstruction of Complicated 3D Human Avatars

Yifan Zhan, Qingtian Zhu, Muyao Niu et al.

ICCV 2025 • arXiv:2410.08082
2
citations
#12898

Sound Bridge: Associating Egocentric and Exocentric Videos via Audio Cues

Sihong Huang, Jiaxin Wu, Xiaoyong Wei et al.

CVPR 2025
2
citations
#12899

PhySense: Sensor Placement Optimization for Accurate Physics Sensing

Yuezhou Ma, Haixu Wu, Hang Zhou et al.

NEURIPS 2025 • oral • arXiv:2505.18190
2
citations
#12900

Music-Aligned Holistic 3D Dance Generation via Hierarchical Motion Modeling

Li Xiaojie, Ronghui Li, Shukai Fang et al.

ICCV 2025 • arXiv:2507.14915
2
citations
#12901

Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling

Hao Chen, Guanxi Lu, Yasuyuki Okoshi et al.

NEURIPS 2025 • arXiv:2505.11730
2
citations
#12902

Transfer Your Perspective: Controllable 3D Generation from Any Viewpoint in a Driving Scene

Tai-Yu Daniel Pan, Sooyoung Jeon, Mengdi Fan et al.

CVPR 2025 • arXiv:2502.06682
2
citations
#12903

DIP: Unsupervised Dense In-Context Post-training of Visual Representations

Sophia Sirko-Galouchenko, Spyros Gidaris, Antonin Vobecky et al.

ICCV 2025 • arXiv:2506.18463
2
citations
#12904

On the Global Optimality of Policy Gradient Methods in General Utility Reinforcement Learning

Anas Barakat, Souradip Chakraborty, Peihong Yu et al.

NEURIPS 2025 • arXiv:2410.04108
2
citations
#12905

LUT-Fuse: Towards Extremely Fast Infrared and Visible Image Fusion via Distillation to Learnable Look-Up Tables

Xunpeng Yi, Yibing Zhang, Xinyu Xiang et al.

ICCV 2025 • arXiv:2509.00346
2
citations
#12906

Bridging Symmetry and Robustness: On the Role of Equivariance in Enhancing Adversarial Robustness

Longwei Wang, Ifrat Ikhtear Uddin, KC Santosh et al.

NEURIPS 2025 • spotlight • arXiv:2510.16171
2
citations
#12907

Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain

Jingmin An, Yilong Song, Ruolin Yang et al.

NEURIPS 2025 • oral • arXiv:2510.13255
2
citations
#12908

ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search

Zeyu Shen, Basileal Imana, Tong Wu et al.

NEURIPS 2025 • arXiv:2509.23519
2
citations
#12909

Not All Parameters Matter: Masking Diffusion Models for Enhancing Generation Ability

Lei Wang, Senmao Li, Fei Yang et al.

CVPR 2025 • arXiv:2505.03097
2
citations
#12910

Multi-Modal Synergistic Implicit Image Enhancement for Efficient Optical Flow Estimation

Weichen Dai, Wu Hexing, Xiaoyang Weng et al.

CVPR 2025
2
citations
#12911

Identity Preserving 3D Head Stylization with Multiview Score Distillation

Bahri Batuhan Bilecen, Ahmet Berke Gokmen, Furkan Güzelant et al.

ICCV 2025 • arXiv:2411.13536
2
citations
#12912

Text-Aware Real-World Image Super-Resolution via Diffusion Model with Joint Segmentation Decoders

Qiming Hu, Linlong Fan, Yiyan Luo et al.

NEURIPS 2025 • arXiv:2506.04641
2
citations
#12913

Boosting the Dual-Stream Architecture in Ultra-High Resolution Segmentation with Resolution-Biased Uncertainty Estimation

Rong Qin, Xingyu Liu, Jinglei Shi et al.

CVPR 2025
2
citations
#12914

SG-LDM: Semantic-Guided LiDAR Generation via Latent-Aligned Diffusion

Zhengkang Xiang, Zizhao Li, Amir Khodabandeh et al.

ICCV 2025 • arXiv:2506.23606
2
citations
#12915

IntroStyle: Training-Free Introspective Style Attribution using Diffusion Features

Anand Kumar, Jiteng Mu, Nuno Vasconcelos

ICCV 2025 • arXiv:2412.14432
2
citations
#12916

Towards Million-Scale Adversarial Robustness Evaluation With Stronger Individual Attacks

Yong Xie, Weijie Zheng, Hanxun Huang et al.

CVPR 2025 • arXiv:2411.15210
2
citations
#12917

Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation

Shaowei Liu, Chuan Guo, Bing Zhou et al.

ICCV 2025 • arXiv:2510.14976
2
citations
#12918

SmallKV: Small Model Assisted Compensation of KV Cache Compression for Efficient LLM Inference

Yi Zhao, Yajuan Peng, Nguyen Cam-Tu et al.

NEURIPS 2025 • spotlight • arXiv:2508.02751
2
citations
#12919

MatchDiffusion: Training-free Generation of Match-Cuts

Alejandro Pardo, Fabio Pizzati, Tong Zhang et al.

ICCV 2025 • arXiv:2411.18677
2
citations
#12920

GRAPE: Optimize Data Mixture for Group Robust Multi-target Adaptive Pretraining

Simin Fan, Maria Ios Glarou, Martin Jaggi

NEURIPS 2025 • arXiv:2505.20380
2
citations
#12921

Reversing Flow for Image Restoration

Haina Qin, Wenyang Luo, Bing Li et al.

CVPR 2025 • arXiv:2506.16961
2
citations
#12922

DOGR: Towards Versatile Visual Document Grounding and Referring

Yinan Zhou, Yuxin Chen, Haokun Lin et al.

ICCV 2025 • arXiv:2411.17125
2
citations
#12923

DualOptim: Enhancing Efficacy and Stability in Machine Unlearning with Dual Optimizers

Xuyang Zhong, Haochen Luo, Chen Liu

NEURIPS 2025 • arXiv:2504.15827
2
citations
#12924

GlyphMastero: A Glyph Encoder for High-Fidelity Scene Text Editing

Tong Wang, Ting Liu, Xiaochao Qu et al.

CVPR 2025 • arXiv:2505.04915
2
citations
#12925

MEAT: Multiview Diffusion Model for Human Generation on Megapixels with Mesh Attention

Yuhan Wang, Fangzhou Hong, Shuai Yang et al.

CVPR 2025 • arXiv:2503.08664
2
citations
#12926

SDMatte: Grafting Diffusion Models for Interactive Matting

Longfei Huang, Yu Liang, Hao Zhang et al.

ICCV 2025 • arXiv:2508.00443
2
citations
#12927

Certified Human Trajectory Prediction

Mohammadhossein Bahari, Saeed Saadatnejad, Amirhossein Askari Farsangi et al.

CVPR 2025 • arXiv:2403.13778
2
citations
#12928

SMGDiff: Soccer Motion Generation using Diffusion Probabilistic Models

Hongdi Yang, Chengyang Li, Zhenxuan Wu et al.

ICCV 2025 • arXiv:2411.16216
2
citations
#12929

Activation-Guided Consensus Merging for Large Language Models

Yuxuan Yao, Shuqi Liu, Zehua Liu et al.

NEURIPS 2025 • arXiv:2505.14009
2
citations
#12930

Nested Diffusion Models Using Hierarchical Latent Priors

Xiao Zhang, Ruoxi Jiang, Rebecca Willett et al.

CVPR 2025 • arXiv:2412.05984
2
citations
#12931

Quantum Doubly Stochastic Transformers

Jannis Born, Filip Skogh, Kahn Rhrissorrakrai et al.

NEURIPS 2025 • spotlight • arXiv:2504.16275
2
citations
#12932

Fair Deepfake Detectors Can Generalize

Harry Cheng, Ming-Hui Liu, Yangyang Guo et al.

NEURIPS 2025 • arXiv:2507.02645
2
citations
#12933

Panoptic Captioning: An Equivalence Bridge for Image and Text

Kun-Yu Lin, Hongjun Wang, Weining Ren et al.

NEURIPS 2025 • arXiv:2505.16334
2
citations
#12934

Pairwise Calibrated Rewards for Pluralistic Alignment

Daniel Halpern, Evi Micha, Ariel Procaccia et al.

NEURIPS 2025 • arXiv:2506.06298
2
citations
#12935

Gradient Variance Reveals Failure Modes in Flow-Based Generative Models

Teodora Reu, Sixtine Dromigny, Michael Bronstein et al.

NEURIPS 2025 • spotlight • arXiv:2510.18118
2
citations
#12936

DiffCAM: Data-Driven Saliency Maps by Capturing Feature Differences

Xingjian Li, Qiming Zhao, Neelesh Bisht et al.

CVPR 2025 • highlight
2
citations
#12937

Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization

Kangle Deng, Hsueh-Ti Derek Liu, Yiheng Zhu et al.

ICCV 2025 • arXiv:2504.02817
2
citations
#12938

SPADE: Spatial-Aware Denoising Network for Open-vocabulary Panoptic Scene Graph Generation with Long- and Local-range Context Reasoning

Xin Hu, Ke Qin, Guiduo Duan et al.

ICCV 2025 • arXiv:2507.05798
2
citations
#12939

EvolvingGrasp: Evolutionary Grasp Generation via Efficient Preference Alignment

Yufei Zhu, Yiming Zhong, Zemin Yang et al.

ICCV 2025 • arXiv:2503.14329
2
citations
#12940

RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions

Bimsara Pathiraja, Maitreya Patel, Shivam Singh et al.

ICCV 2025 • arXiv:2506.03448
2
citations
#12941

A Theory for Worst-Case vs. Average-Case Guarantees for LLMs

Noga Amit, Shafi Goldwasser, Orr Paradise et al.

NEURIPS 2025
2
citations
#12942

Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data

Zeyi Sun, Tong Wu, Pan Zhang et al.

ICCV 2025 • arXiv:2406.00093
2
citations
#12943

Enhancing Dance-to-Music Generation via Negative Conditioning Latent Diffusion Model

Changchang Sun, Gaowen Liu, Charles Fleming et al.

CVPR 2025 • arXiv:2503.22138
2
citations
#12944

StyleKeeper: Prevent Content Leakage using Negative Visual Query Guidance

Jaeseok Jeong, Junho Kim, Youngjung Uh et al.

ICCV 2025 • arXiv:2510.06827
2
citations
#12945

FaceCraft4D: Animated 3D Facial Avatar Generation from a Single Image

Fei Yin, Mallikarjun Reddy, Chun-Han Yao et al.

ICCV 2025 • arXiv:2504.15179
2
citations
#12946

Towards Human-Understandable Multi-Dimensional Concept Discovery

Arne Grobrügge, Niklas Kühl, Gerhard Satzger et al.

CVPR 2025 • arXiv:2503.18629
2
citations
#12947

MUSE: Multi-Subject Unified Synthesis via Explicit Layout Semantic Expansion

Fei Peng, Junqiang Wu, Yan Li et al.

ICCV 2025 • arXiv:2508.14440
2
citations
#12948

Sequential keypoint density estimator: an overlooked baseline of skeleton-based video anomaly detection

Anja Delić, Matej Grcic, Siniša Šegvić

ICCV 2025 • highlight • arXiv:2506.18368
2
citations
#12949

Nabla-R2D3: Effective and Efficient 3D Diffusion Alignment with 2D Rewards

Qingming Liu, Zhen Liu, Dinghuai Zhang et al.

NEURIPS 2025 • arXiv:2506.15684
2
citations
#12950

ProbRes: Probabilistic Jump Diffusion for Open-World Egocentric Activity Recognition

Sanjoy Kundu, Shanmukha Vellamcheti, Sathyanarayanan Aakur

ICCV 2025 • arXiv:2504.03948
2
citations
#12951

Towards Identifiability of Hierarchical Temporal Causal Representation Learning

Zijian Li, Minghao Fu, Junxian Huang et al.

NEURIPS 2025 • oral • arXiv:2510.18310
2
citations
#12952

Multi-Object Sketch Animation by Scene Decomposition and Motion Planning

Jingyu Liu, Zijie Xin, Yuhan Fu et al.

ICCV 2025 • arXiv:2503.19351
2
citations
#12953

SAGI: Semantically Aligned and Uncertainty Guided AI Image Inpainting

Paschalis Giakoumoglou, Dimitrios Karageorgiou, Symeon Papadopoulos et al.

ICCV 2025 • arXiv:2502.06593
2
citations
#12954

Toward Efficient Inference Attacks: Shadow Model Sharing via Mixture-of-Experts

Li Bai, Qingqing Ye, Xinwei Zhang et al.

NEURIPS 2025 • arXiv:2510.13451
2
citations
#12955

What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models

Lorenzo Baraldi, Davide Bucciarelli, Federico Betti et al.

ICCV 2025 • arXiv:2505.20405
2
citations
#12956

Joint Asymmetric Loss for Learning with Noisy Labels

Jialiang Wang, Xianming Liu, Xiong Zhou et al.

ICCV 2025 • arXiv:2507.17692
2
citations
#12957

Imagined Autocurricula

Ahmet Hamdi Güzel, Matthew T Jackson, Jarek Liesen et al.

NEURIPS 2025 • arXiv:2509.13341
2
citations
#12958

From Black-box to Causal-box: Towards Building More Interpretable Models

Inwoo Hwang, Yushu Pan, Elias Bareinboim

NEURIPS 2025 • arXiv:2510.21998
2
citations
#12959

DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models

Revant Teotia, Candace Ross, Karen Ullrich et al.

ICCV 2025 • arXiv:2506.05108
2
citations
#12960

Boosting Domain Incremental Learning: Selecting the Optimal Parameters is All You Need

Qiang Wang, Xiang Song, Yuhang He et al.

CVPR 2025 • arXiv:2505.23744
2
citations
#12961

Attack by Yourself: Effective and Unnoticeable Multi-Category Graph Backdoor Attacks with Subgraph Triggers Pool

Jiangtong Li, Dongyi Liu, Kun Zhu et al.

NEURIPS 2025 • arXiv:2412.17213
2
citations
#12962

UniHOPE: A Unified Approach for Hand-Only and Hand-Object Pose Estimation

Yinqiao Wang, Hao Xu, Pheng-Ann Heng et al.

CVPR 2025 • arXiv:2503.13303
2
citations
#12963

Unlocking Constraints: Source-Free Occlusion-Aware Seamless Segmentation

Yihong Cao, Jiaming Zhang, Xu Zheng et al.

ICCV 2025 • arXiv:2506.21198
2
citations
#12964

Exploiting Vocabulary Frequency Imbalance in Language Model Pre-training

Woojin Chung, Jeonghoon Kim

NEURIPS 2025 • arXiv:2508.15390
2
citations
#12965

Path Gradients after Flow Matching

Lorenz Vaitl, Leon Klein

NEURIPS 2025 • arXiv:2505.10139
2
citations
#12966

ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition

Daolang Huang, Xinyi Wen, Ayush Bharti et al.

NEURIPS 2025 • spotlight • arXiv:2506.07259
2
citations
#12967

MAVias: Mitigate any Visual Bias

Ioannis Sarridis, Christos Koutlis, Symeon Papadopoulos et al.

ICCV 2025 • arXiv:2412.06632
2
citations
#12968

LookOut: Real-World Humanoid Egocentric Navigation

Boxiao Pan, Adam Harley, Francis Engelmann et al.

ICCV 2025 • arXiv:2508.14466
2
citations
#12969

AdvDreamer Unveils: Are Vision-Language Models Truly Ready for Real-World 3D Variations?

Shouwei Ruan, Hanqing Liu, Yao Huang et al.

ICCV 2025 • highlight • arXiv:2412.03002
2
citations
#12970

From Objects to Events: Unlocking Complex Visual Understanding in Object Detectors via LLM-guided Symbolic Reasoning

Yuhui Zeng, Haoxiang Wu, Wenjie Nie et al.

ICCV 2025 • arXiv:2502.05843
2
citations
#12971

PLA: Prompt Learning Attack against Text-to-Image Generative Models

Xinqi Lyu, Yihao Liu, Yanjie Li et al.

ICCV 2025 • arXiv:2508.03696
2
citations
#12972

msf-CNN: Patch-based Multi-Stage Fusion with Convolutional Neural Networks for TinyML

Zhaolan Huang, Emmanuel Baccelli

NEURIPS 2025 • arXiv:2505.11483
2
citations
#12973

No Thing, Nothing: Highlighting Safety-Critical Classes for Robust LiDAR Semantic Segmentation in Adverse Weather

Junsung Park, HwiJeong Lee, Inha Kang et al.

CVPR 2025 • arXiv:2503.15910
2
citations
#12974

Holistic Tokenizer for Autoregressive Image Generation

Anlin Zheng, Haochen Wang, Yucheng Zhao et al.

ICCV 2025 • arXiv:2507.02358
2
citations
#12975

Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion

Vinh Tong, Trung-Dung Hoang, Anji Liu et al.

NEURIPS 2025 • arXiv:2502.09890
2
citations
#12976

Rethinking Lanes and Points in Complex Scenarios for Monocular 3D Lane Detection

Yifan Chang, Junjie Huang, Xiaofeng Wang et al.

CVPR 2025 • arXiv:2503.06237
2
citations
#12977

Restricted Spectral Gap Decomposition for Simulated Tempering Targeting Mixture Distributions

Jhanvi Garg, Krishnakumar Balasubramanian, Quan Zhou

NEURIPS 2025 • arXiv:2505.15059
2
citations
#12978

Improving Sound Source Localization with Joint Slot Attention on Image and Audio

Inho Kim, Youngkil Song, Jicheol Park et al.

CVPR 2025 • arXiv:2504.15118
2
citations
#12979

Constructing an Optimal Behavior Basis for the Option Keyboard

Lucas N. Alegre, Ana Bazzan, Andre Barreto et al.

NEURIPS 2025 • arXiv:2505.00787
2
citations
#12980

Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes

Stefano Esposito, Anpei Chen, Christian Reiser et al.

CVPR 2025 • arXiv:2409.02482
2
citations
#12981

Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models

Hyungjin Kim, Seokho Ahn, Young-Duk Seo

ICCV 2025 • arXiv:2508.03481
2
citations
#12982

Diffusion Adaptive Text Embedding for Text-to-Image Diffusion Models

Byeonghu Na, Minsang Park, Gyuwon Sim et al.

NEURIPS 2025 • arXiv:2510.23974
2
citations
#12983

Triad: Empowering LMM-based Anomaly Detection with Expert-guided Region-of-Interest Tokenizer and Manufacturing Process

Yuanze Li, Shihao Yuan, Haolin Wang et al.

ICCV 2025
2
citations
#12984

LLM Meets Diffusion: A Hybrid Framework for Crystal Material Generation

Subhojyoti Khastagir, Kishalay Das, Pawan Goyal et al.

NEURIPS 2025 • arXiv:2510.23040
2
citations
#12985

Auto-Encoded Supervision for Perceptual Image Super-Resolution

MinKyu Lee, Sangeek Hyun, Woojin Jun et al.

CVPR 2025 • arXiv:2412.00124
2
citations
#12986

LOTA: Bit-Planes Guided AI-Generated Image Detection

Renxi Cheng, Hongsong Wang, Yang Zhang et al.

ICCV 2025 • arXiv:2510.14230
2
citations
#12987

Trade-offs in Image Generation: How Do Different Dimensions Interact?

Sicheng Zhang, Binzhu Xie, Zhonghao Yan et al.

ICCV 2025 • arXiv:2507.22100
2
citations
#12988

RaSS: Improving Denoising Diffusion Samplers with Reinforced Active Sampling Scheduler

Xin Ding, Lei Yu, Xin Li et al.

CVPR 2025
2
citations
#12989

EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception

Sanjoy Chowdhury, Subrata Biswas, Sayan Nag et al.

ICCV 2025 • arXiv:2506.21080
2
citations
#12990

Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation

Fei Wang, Li Shen, Liang Ding et al.

NEURIPS 2025 • arXiv:2510.15304
2
citations
#12991

TSP-Mamba: The Travelling Salesman Problem Meets Mamba for Image Super-resolution and Beyond

Kun Zhou, Xinyu Lin, Jiangbo Lu

CVPR 2025
2
citations
#12992

One Filters All: A Generalist Filter For State Estimation

Shiqi Liu, Wenhan Cao, Chang Liu et al.

NEURIPS 2025 • arXiv:2509.20051
2
citations
#12993

A Unified Framework for Motion Reasoning and Generation in Human Interaction

Jeongeun Park, Sungjoon Choi, Sangdoo Yun

ICCV 2025 • arXiv:2410.05628
2
citations
#12994

VTimeCoT: Thinking by Drawing for Video Temporal Grounding and Reasoning

Jinglei Zhang, Yuanfan Guo, Rolandos Alexandros Potamias et al.

ICCV 2025 • arXiv:2510.14672
2
citations
#12995

Progressive Growing of Video Tokenizers for Temporally Compact Latent Spaces

Aniruddha Mahapatra, Long Mai, David Bourgin et al.

ICCV 2025 • arXiv:2501.05442
2
citations
#12996

CorrBEV: Multi-View 3D Object Detection by Correlation Learning with Multi-modal Prototypes

Ziteng Xue, Mingzhe Guo, Heng Fan et al.

CVPR 2025
2
citations
#12997

Adaptive Kernel Design for Bayesian Optimization Is a Piece of CAKE with LLMs

Richard Suwandi, Feng Yin, Juntao Wang et al.

NEURIPS 2025 • arXiv:2509.17998
2
citations
#12998

EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models

Yufei Cai, Hu Han, Yuxiang Wei et al.

ICCV 2025 • arXiv:2503.19369
2
citations
#12999

SPARKE: Scalable Prompt-Aware Diversity and Novelty Guidance in Diffusion Models via RKE Score

Mohammad Jalali, Haoyu Lei, Amin Gohari et al.

NEURIPS 2025 • arXiv:2506.10173
2
citations
#13000

HouseTour: A Virtual Real Estate A(I)gent

Ata Çelen, Iro Armeni, Daniel Barath et al.

ICCV 2025 • arXiv:2510.18054
2
citations