Most Cited 2025 Poster Papers
22,274 papers found • Page 37 of 112
Conference
Reinforced Context Order Recovery for Adaptive Reasoning and Planning
Long Ma, Fangwei Zhong, Yizhou Wang
KL Penalty Control via Perturbation for Direct Preference Optimization
Sangkyu Lee, Janghoon Han, Hosung Song et al.
Sampling 3D Molecular Conformers with Diffusion Transformers
J. Thorben Frank, Winfried Ripken, Gregor Lied et al.
BenchmarkCards: Standardized Documentation for Large Language Model Benchmarks
Anna Sokol, Elizabeth Daly, Michael Hind et al.
Imagined Autocurricula
Ahmet Hamdi Güzel, Matthew T Jackson, Jarek Liesen et al.
Hankel Singular Value Regularization for Highly Compressible State Space Models
Paul Schwerdtner, Jules Berman, Benjamin Peherstorfer
Compass Control: Multi Object Orientation Control for Text-to-Image Generation
Rishubh Parihar, Vaibhav Agrawal, Sachidanand VS et al.
SPADE: Spatial-Aware Denoising Network for Open-vocabulary Panoptic Scene Graph Generation with Long- and Local-range Context Reasoning
XIN Hu, Ke Qin, Guiduo Duan et al.
FLOWING: Implicit Neural Flows for Structure-Preserving Morphing
Arthur Bizzi, Matias Grynberg Portnoy, Vitor Pereira Matias et al.
A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning
Yuzheng Hu, Fan Wu, Haotian Ye et al.
Deferring Concept Bottleneck Models: Learning to Defer Interventions to Inaccurate Experts
Andrea Pugnana, Riccardo Massidda, Francesco Giannini et al.
Decision SpikeFormer: Spike-Driven Transformer for Decision Making
Wei Huang, Qinying Gu, Nanyang Ye
RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions
Bimsara Pathiraja, Maitreya Patel, Shivam Singh et al.
GBlobs: Explicit Local Structure via Gaussian Blobs for Improved Cross-Domain LiDAR-based 3D Object Detection
Dušan Malić, Christian Fruhwirth-Reisinger, Samuel Schulter et al.
Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation
Xingguo Lv, Xingbo Dong, Liwen Wang et al.
CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation
Jianyu Wu, Yizhou Wang, Xiangyu Yue et al.
Diffusion Image Prior
Hamadi Chihaoui, Paolo Favaro
Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data
Zeyi Sun, Tong Wu, Pan Zhang et al.
Color Matching Using Hypernetwork-Based Kolmogorov-Arnold Networks
Artem Nikonorov, Georgy Perevozchikov, Andrei Korepanov et al.
MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs
Tianhao Peng, Haochen Wang, Yuanxing Zhang et al.
Potential Field Based Deep Metric Learning
Shubhang Bhatnagar, Narendra Ahuja
StyleKeeper: Prevent Content Leakage using Negative Visual Query Guidance
Jaeseok Jeong, Junho Kim, Youngjung Uh et al.
StarTrail: Concentric Ring Sequence Parallelism for Efficient Near-Infinite-Context Transformer Model Training
Ziming Liu, Shaoyu Wang, Shenggan Cheng et al.
Unifying Attention Heads and Task Vectors via Hidden State Geometry in In-Context Learning
Haolin Yang, Hakaze Cho, Yiqiao Zhong et al.
Trade-offs in Image Generation: How Do Different Dimensions Interact?
Sicheng Zhang, Binzhu Xie, Zhonghao Yan et al.
ViCTr: Vital Consistency Transfer for Pathology Aware Image Synthesis
Onkar Susladkar, Gayatri Deshmukh, Yalcin Tur et al.
Physics Context Builders: A Modular Framework for Physical Reasoning in Vision-Language Models
Vahid Balazadeh, Mohammadmehdi Ataei, Hyunmin Cheong et al.
EC-Flow: Enabling Versatile Robotic Manipulation from Action-Unlabeled Videos via Embodiment-Centric Flow
Yixiang Chen, Peiyan Li, Yan Huang et al.
CosmoBench: A Multiscale, Multiview, Multitask Cosmology Benchmark for Geometric Deep Learning
Teresa Huang, Richard Stiskalek, Jun-Young Lee et al.
Rethinking Correspondence-based Category-Level Object Pose Estimation
Huan Ren, Wenfei Yang, Shifeng Zhang et al.
Disentanglement Beyond Static vs. Dynamic: A Benchmark and Evaluation Framework for Multi-Factor Sequential Representations
Tal Barami, Nimrod Berman, Ilan Naiman et al.
Can't Slow Me Down: Learning Robust and Hardware-Adaptive Object Detectors against Latency Attacks for Edge Devices
Tianyi Wang, Zichen Wang, Cong Wang et al.
DTOS: Dynamic Time Object Sensing with Large Multimodal Model
Jirui Tian, Jinrong Zhang, Shenglan Liu et al.
Block Coordinate Descent for Neural Networks Provably Finds Global Minima
Shunta Akiyama
Sim-DETR: Unlock DETR for Temporal Sentence Grounding
Jiajin Tang, Zhengxuan Wei, Yuchen Zhu et al.
Datasets, Documents, and Repetitions: The Practicalities of Unequal Data Quality
Alex Fang, Hadi Pouransari, Matt Jordan et al.
beta-FFT: Nonlinear Interpolation and Differentiated Training Strategies for Semi-Supervised Medical Image Segmentation
Ming Hu, Jianfu Yin, Zhuangzhuang Ma et al.
Enhancing Dance-to-Music Generation via Negative Conditioning Latent Diffusion Model
Changchang Sun, Gaowen Liu, Charles Fleming et al.
Auto-Encoded Supervision for Perceptual Image Super-Resolution
MinKyu Lee, Sangeek Hyun, Woojin Jun et al.
Distilled Decoding 2: One-step Sampling of Image Auto-regressive Models with Conditional Score Distillation
Enshu Liu, Qian Chen, Xuefei Ning et al.
TSP-Mamba: The Travelling Salesman Problem Meets Mamba for Image Super-resolution and Beyond
Kun Zhou, Xinyu Lin, Jiangbo Lu
AgMMU: A Comprehensive Agricultural Multimodal Understanding Benchmark
Aruna Gauba, Irene Pi, Yunze Man et al.
Structure-aware Semantic Discrepancy and Consistency for 3D Medical Image Self-supervised Learning
Tan Pan, Zhaorui Tan, Kaiyu Guo et al.
Early-Bird Diffusion: Investigating and Leveraging Timestep-Aware Early-Bird Tickets in Diffusion Models for Efficient Training
Lexington Whalen, Zhenbang Du, Haoran You et al.
Vision‑Language‑Vision Auto‑Encoder: Scalable Knowledge Distillation from Diffusion Models
Tiezheng Zhang, Yitong Li, Yu-Cheng Chou et al.
Solving Instance Detection from an Open-World Perspective
Qianqian Shen, Yunhan Zhao, Nahyun Kwon et al.
Fair Deepfake Detectors Can Generalize
Harry Cheng, Ming-Hui Liu, Yangyang Guo et al.
GenVDM: Generating Vector Displacement Maps From a Single Image
Yuezhi Yang, Qimin Chen, Vladimir G. Kim et al.
Exploring the Noise Robustness of Online Conformal Prediction
HuaJun Xi, Kangdao Liu, Hao Zeng et al.
DGS-LRM: Real-Time Deformable 3D Gaussian Reconstruction From Monocular Videos
Chieh Lin, Zhaoyang Lv, Songyin Wu et al.
Learnable Feature Patches and Vectors for Boosting Low-light Image Enhancement without External Knowledge
Xiaogang Xu, Jiafei Wu, Qingsen Yan et al.
Multi-Modal Aerial-Ground Cross-View Place Recognition with Neural ODEs
Sijie Wang, Rui She, Qiyu Kang et al.
DCT-Shield: A Robust Frequency Domain Defense against Malicious Image Editing
Aniruddha Bala, Rohit Chowdhury, Rohan Jaiswal et al.
Adaptive Kernel Design for Bayesian Optimization Is a Piece of CAKE with LLMs
Richard Suwandi, Feng Yin, Juntao Wang et al.
A2Seek: Towards Reasoning-Centric Benchmark for Aerial Anomaly Understanding
Mengjingcheng Mo, Xinyang Tong, Mingpi Tan et al.
GausSim: Foreseeing Reality by Gaussian Simulator for Elastic Objects
Yidi Shao, Mu Huang, Chen Change Loy et al.
MUSE: Multi-Subject Unified Synthesis via Explicit Layout Semantic Expansion
Fei Peng, Junqiang Wu, Yan Li et al.
AdvDreamer Unveils: Are Vision-Language Models Truly Ready for Real-World 3D Variations?
Shouwei Ruan, Hanqing Liu, Yao Huang et al.
Seek Common Ground While Reserving Differences: Semi-Supervised Image-Text Sentiment Recognition
Wuyou Xia, Guoli Jia, Sicheng Zhao et al.
CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective
Zongheng Tang, Yi Liu, Yifan Sun et al.
Boosting the Dual-Stream Architecture in Ultra-High Resolution Segmentation with Resolution-Biased Uncertainty Estimation
Rong Qin, Xingyu Liu, Jinglei Shi et al.
Enhancing Transformers Through Conditioned Embedded Tokens
Hemanth Saratchandran, Simon Lucey
A Structure-aware and Motion-adaptive Framework for 3D Human Pose Estimation with Mamba
Ye Lu, Jie Wang, Jianjun Gao et al.
GRAPE: Optimize Data Mixture for Group Robust Multi-target Adaptive Pretraining
Simin Fan, Maria Ios Glarou, Martin Jaggi
Triad: Empowering LMM-based Anomaly Detection with Expert-guided Region-of-Interest Tokenizer and Manufacturing Process
Yuanze Li, Shihao Yuan, Haolin Wang et al.
SparseDiT: Token Sparsification for Efficient Diffusion Transformer
Shuning Chang, Pichao WANG, Jiasheng Tang et al.
A Generalized Bisimulation Metric of State Similarity between Markov Decision Processes: From Theoretical Propositions to Applications
Zhenyu Tao, Wei Xu, Xiaohu You
A Unified, Resilient, and Explainable Adversarial Patch Detector
Vishesh Kumar, Akshay Agarwal
FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA
Seanie Lee, Sangwoo Park, Dong Bok Lee et al.
SP3D: Boosting Sparsely-Supervised 3D Object Detection via Accurate Cross-Modal Semantic Prompts
Shijia Zhao, Qiming Xia, Xusheng Guo et al.
Bridging Symmetry and Robustness: On the Role of Equivariance in Enhancing Adversarial Robustness
Longwei Wang, Ifrat Ikhtear Uddin, Prof. KC Santosh (PhD) et al.
DictAS: A Framework for Class-Generalizable Few-Shot Anomaly Segmentation via Dictionary Lookup
Zhen Qu, Xian Tao, Xinyi Gong et al.
Neural-Driven Image Editing
Pengfei Zhou, Jie Xia, Xiaopeng Peng et al.
High-order Equivariant Flow Matching for Density Functional Theory Hamiltonian Prediction
Seongsu Kim, Nayoung Kim, Dongwoo Kim et al.
Training-Free Generation of Temporally Consistent Rewards from VLMs
Yinuo Zhao, Jiale Yuan, Zhiyuan Xu et al.
SAGI: Semantically Aligned and Uncertainty Guided AI Image Inpainting
Paschalis Giakoumoglou, Dimitrios Karageorgiou, Symeon Papadopoulos et al.
Generalizing while preserving monotonicity in comparison-based preference learning models
Julien Fageot, Peva Blanchard, Gilles Bareilles et al.
MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation
Vladislav Bargatin, Egor Chistov, Alexander Yakovenko et al.
SACB-Net: Spatial-awareness Convolutions for Medical Image Registration
Xinxing Cheng, Tianyang Zhang, Wenqi Lu et al.
VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide
Dohun Lee, Bryan Sangwoo Kim, Geon Yeong Park et al.
Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation
Yong Liu, Song-Li Wu, Sule Bai et al.
Factorio Learning Environment
Jack Hopkins, Mart Bakler, Akbir Khan
Towards Unsupervised Domain Bridging via Image Degradation in Semantic Segmentation
Wangkai Li, Rui Sun, Huayu Mai et al.
On Feasible Rewards in Multi-Agent Inverse Reinforcement Learning
Till Freihaut, Giorgia Ramponi
FFN Fusion: Rethinking Sequential Computation in Large Language Models
Akhiad Bercovich, Mohammed Dabbah, Omri Puny et al.
SkyLadder: Better and Faster Pretraining via Context Window Scheduling
Tongyao Zhu, Qian Liu, Haonan Wang et al.
RC-AutoCalib: An End-to-End Radar-Camera Automatic Calibration Network
Van-Tin Luu, Yong-Lin Cai, Vu-Hoang Tran et al.
Demystifying Network Foundation Models
Roman Beltiukov, Satyandra Guthula, Wenbo Guo et al.
A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings
Fitsum Gaim, Hoyun Song, Huije Lee et al.
Differential-informed Sample Selection Accelerates Multimodal Contrastive Learning
Zihua Zhao, Feng Hong, Mengxi Chen et al.
Towards a Universal 3D Medical Multi-modality Generalization via Learning Personalized Invariant Representation
Zhaorui Tan, Xi Yang, Tan Pan et al.
Transfer Your Perspective: Controllable 3D Generation from Any Viewpoint in a Driving Scene
Tai-Yu Daniel Pan, Sooyoung Jeon, Mengdi Fan et al.
Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor
Alexandra Olteanu, Su Lin Blodgett, Agathe Balayn et al.
Follow the Energy, Find the Path: Riemannian Metrics from Energy-Based Models
Louis Bethune, David Vigouroux, Yilun Du et al.
The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition
Otto Brookes, Maksim Kukushkin, Majid Mirmehdi et al.
PartNeXt: A Next-Generation Dataset for Fine-Grained and Hierarchical 3D Part Understanding
Penghao Wang, Yiyang He, Xin Lv et al.
What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models
Lorenzo Baraldi, Davide Bucciarelli, Federico Betti et al.
EBS-EKF: Accurate and High Frequency Event-based Star Tracking
Albert Reed, Connor Hashemi, Dennis Melamed et al.
Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning
Tianyi Bai, Yuxuan Fan, Qiu Jiantao et al.
Filter Like You Test: Data-Driven Data Filtering for CLIP Pretraining
Mikey Shechter, Yair Carmon
A duality framework for analyzing random feature and two-layer neural networks
Hongrui Chen, Jihao Long, Lei Wu
JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data
Runjian Chen, Wenqi Shao, Bo Zhang et al.
Thin-Shell-SfT: Fine-Grained Monocular Non-rigid 3D Surface Tracking with Neural Deformation Fields
Navami Kairanda, Marc Habermann, Shanthika Shankar Naik et al.
Diffusion-Based Hierarchical Graph Neural Networks for Simulating Nonlinear Solid Mechanics
Tobias Würth, Niklas Freymuth, Gerhard Neumann et al.
InstaScene: Towards Complete 3D Instance Decomposition and Reconstruction from Cluttered Scenes
Zesong Yang, Bangbang Yang, Wenqi Dong et al.
Lost in Transmission: When and Why LLMs Fail to Reason Globally
Tobias Schnabel, Kiran Tomlinson, Adith Swaminathan et al.
Reconstruct, Inpaint, Test-Time Finetune: Dynamic Novel-view Synthesis from Monocular Videos
Kaihua Chen, Tarasha Khurana, Deva Ramanan
Towards Cross-modal Backward-compatible Representation Learning for Vision-Language Models
Young Kyun Jang, Ser-Nam Lim
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation
Qihui Zhang, Munan Ning, Zheyuan Liu et al.
Learning Deblurring Texture Prior from Unpaired Data with Diffusion Model
Chengxu Liu, Lu Qi, Jinshan Pan et al.
TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos
Jinxi Li, Ziyang Song, Bo Yang
Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data
Zhiyuan Ma, Xinyue Liang, Rongyuan Wu et al.
ACCO: Accumulate While You Communicate for Communication-Overlapped Sharded LLM Training
Adel Nabli, Louis Fournier, Pierre ERBACHER et al.
Diorama: Unleashing Zero-shot Single-view 3D Indoor Scene Modeling
Qirui Wu, Denys Iliash, Daniel Ritchie et al.
Online Language Splatting
Saimouli Katragadda, Cho-Ying Wu, Yuliang Guo et al.
ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents
Zhenyu Zhang, Tianyi Chen, Weiran Xu et al.
Synthetic Visual Genome
Jae Sung Park, Zixian Ma, Linjie Li et al.
Unlocking Constraints: Source-Free Occlusion-Aware Seamless Segmentation
Yihong Cao, Jiaming Zhang, Xu Zheng et al.
Emergent Risk Awareness in Rational Agents under Resource Constraints
Daniel Jarne Ornia, Nicholas Bishop, Joel Dyer et al.
JailbreakDiffBench: A Comprehensive Benchmark for Jailbreaking Diffusion Models
Xiaolong Jin, Zixuan Weng, Hanxi Guo et al.
Hybrid Concept Bottleneck Models
Yang Liu, Tianwei Zhang, Shi Gu
UGoDIT: Unsupervised Group Deep Image Prior Via Transferable Weights
Shijun Liang, Ismail Alkhouri, Siddhant Gautam et al.
Discretization-free Multicalibration through Loss Minimization over Tree Ensembles
Hongyi Henry Jin, Zijun Ding, Dung Daniel Ngo et al.
Unlearned but Not Forgotten: Data Extraction after Exact Unlearning in LLM
Xiaoyu Wu, Yifei Pang, Terrance Liu et al.
A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking
Gal Fadlon, Idan Arbiv, Nimrod Berman et al.
ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval
Eric Xing, Pranavi Kolouju, Robert Pless et al.
GeoDynamics: A Geometric State‑Space Neural Network for Understanding Brain Dynamics on Riemannian Manifolds
Tingting Dan, Jiaqi Ding, Guorong Wu
Alleviating Textual Reliance in Medical Language-guided Segmentation via Prototype-driven Semantic Approximation
Shuchang Ye, Usman Naseem, Mingyuan Meng et al.
TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence
Feng Jiang, Mangal Prakash, Hehuan Ma et al.
Conformal Information Pursuit for Interactively Guiding Large Language Models
Kwan Ho Ryan Chan, Yuyan Ge, Edgar Dobriban et al.
Scalable In-context Ranking with Generative Models
Nilesh Gupta, Chong You, Srinadh Bhojanapalli et al.
H2ST: Hierarchical Two-Sample Tests for Continual Out-of-Distribution Detection
Yuhang Liu, Wenjie Zhao, Yunhui Guo
Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems
Jongyeong Lee, Junya Honda, Shinji Ito et al.
Improving Visual and Downstream Performance of Low-Light Enhancer with Vision Foundation Models Collaboration
yuxuan Gu, Huaian Chen, Yi Jin et al.
Can3Tok: Canonical 3D Tokenization and Latent Modeling of Scene-Level 3D Gaussians
Quankai Gao, Iliyan Georgiev, Tuanfeng Wang et al.
OpenWorldSAM: Extending SAM2 for Universal Image Segmentation with Language Prompts
Shiting (Ginny) Xiao, Rishabh Kabra, Yuhang Li et al.
Practical Bayes-Optimal Membership Inference Attacks
Marcus Lassila, Johan Oestman, Khac-Hoang Ngo et al.
SHAP values via sparse Fourier representation
Ali Gorji, Andisheh Amrollahi, Andreas Krause
Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI
Won Jun Kim, Hyungjin Chung, Jaemin Kim et al.
Attack by Yourself: Effective and Unnoticeable Multi-Category Graph Backdoor Attacks with Subgraph Triggers Pool
Jiangtong Li, Dongyi Liu, Kun Zhu et al.
MotionMap: Representing Multimodality in Human Pose Forecasting
Reyhaneh Hosseininejad, Megh Shukla, Saeed Saadatnejad et al.
SHAP Meets Tensor Networks: Provably Tractable Explanations with Parallelism
Reda Marzouk, Shahaf Bassan, Guy Katz
Neural Collapse in Cumulative Link Models for Ordinal Regression: An Analysis with Unconstrained Feature Model
Chuang Ma, Tomoyuki Obuchi, Toshiyuki Tanaka
Teaching VLMs to Localize Specific Objects from In-context Examples
Sivan Doveh, Nimrod Shabtay, Eli Schwartz et al.
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders
Yuezhou Hu, Jiaxin Guo, Xinyu Feng et al.
Continuous Simplicial Neural Networks
Aref Einizade, Dorina Thanou, Fragkiskos Malliaros et al.
Learning to cluster neuronal function
Nina Nellen, Polina Turishcheva, Michaela Vystrčilová et al.
HiFi-Portrait: Zero-shot Identity-preserved Portrait Generation with High-fidelity Multi-face Fusion
Yifang Xu, BenXiang Zhai, Yunzhuo Sun et al.
Training-Free Personalization via Retrieval and Reasoning on Fingerprints
Deepayan Das, Davide Talon, Yiming Wang et al.
Gradient Variance Reveals Failure Modes in Flow-Based Generative Models
Teodora Reu, Sixtine Dromigny, Michael Bronstein et al.
LC-Mamba: Local and Continuous Mamba with Shifted Windows for Frame Interpolation
Min Wu Jeong, Chae Eun Rhee
Equilibrium Policy Generalization: A Reinforcement Learning Framework for Cross-Graph Zero-Shot Generalization in Pursuit-Evasion Games
Runyu Lu, Peng Zhang, Ruochuan Shi et al.
High Temporal Consistency through Semantic Similarity Propagation in Semi-Supervised Video Semantic Segmentation for Autonomous Flight
Cédric Vincent, Taehyoung Kim, Henri Meeß
Split Adaptation for Pre-trained Vision Transformers
Lixu Wang, Bingqi Shang, Yi Li et al.
DiffCAM: Data-Driven Saliency Maps by Capturing Feature Differences
Xingjian Li, Qiming Zhao, Neelesh Bisht et al.
Do Your Best and Get Enough Rest for Continual Learning
Hankyul Kang, Gregor Seifer, Donghyun Lee et al.
The Computational Complexity of Counting Linear Regions in ReLU Neural Networks
Moritz Stargalla, Christoph Hertrich, Daniel Reichman
Visual Instruction Bottleneck Tuning
Changdae Oh, Jiatong Li, Shawn Im et al.
Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation
Seokil Ham, Hee-Seon Kim, Sangmin Woo et al.
BLINK-Twice: You see, but do you observe? A Reasoning Benchmark on Visual Perception
junyan ye, Dongzhi JIANG, Jun He et al.
PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning
Zongqian Li, Yixuan Su, Nigel Collier
The Promise of RL for Autoregressive Image Editing
Saba Ahmadi, Rabiul Awal, Ankur Sikarwar et al.
T-CIL: Temperature Scaling using Adversarial Perturbation for Calibration in Class-Incremental Learning
Seong-Hyeon Hwang, Minsu Kim, Steven Euijong Whang
Track Any Anomalous Object:A Granular Video Anomaly Detection Pipeline
Yuzhi Huang, Chenxin Li, Haitao Zhang et al.
High-Fidelity Lightweight Mesh Reconstruction from Point Clouds
Chen Zhang, Wentao Wang, Ximeng Li et al.
Logical Expressiveness of Graph Neural Networks with Hierarchical Node Individualization
Arie Soeteman, Balder ten Cate
Intervene-All-Paths: Unified Mitigation of LVLM Hallucinations across Alignment Formats
Jiaye Qian, Ge Zheng, Yuchen Zhu et al.
RoboPearls: Editable Video Simulation for Robot Manipulation
Tao Tang, Likui Zhang, Youpeng Wen et al.
GraspCoT: Integrating Physical Property Reasoning for 6-DoF Grasping under Flexible Language Instructions
Xiaomeng Chu, Jiajun Deng, Guoliang You et al.
HoloScene: Simulation‑Ready Interactive 3D Worlds from a Single Video
Hongchi Xia, Chih-Hao Lin, Hao-Yu Hsu et al.
The Gaussian Mixing Mechanism: Renyi Differential Privacy via Gaussian Sketches
Omri Lev, Vishwak Srinivasan, Moshe Shenfeld et al.
Multi-modal Multi-platform Person Re-Identification: Benchmark and Method
Ruiyang Ha, Songyi Jiang, Bin Li et al.
Cognitive Mirrors: Exploring the Diverse Functional Roles of Attention Heads in LLM Reasoning
Xueqi Ma, Jun Wang, Yanbei Jiang et al.
Using Powerful Prior Knowledge of Diffusion Model in Deep Unfolding Networks for Image Compressive Sensing
Chen Liao, Yan Shen, Dan Li et al.
PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement
Tewodros W. Ayalew, Xiao Zhang, Kevin Y Wu et al.
CoE: Chain-of-Explanation via Automatic Visual Concept Circuit Description and Polysemanticity Quantification
wenlong yu, Qilong Wang, Chuang Liu et al.
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
Yujia Zhang, Xiaoyang Wu, Yixing Lao et al.
SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations
Krispin Wandel, Hesheng Wang
Few-Shot Learning from Gigapixel Images via Hierarchical Vision-Language Alignment and Modeling
Bryan Wong, Jongwoo Kim, Huazhu Fu et al.
FreeUV: Ground-Truth-Free Realistic Facial UV Texture Recovery via Cross-Assembly Inference Strategy
Xingchao Yang, Takafumi Taketomi, Yuki Endo et al.
Normalization in Attention Dynamics
Nikita Karagodin, Shu Ge, Yury Polyanskiy et al.
A High-Dimensional Statistical Method for Optimizing Transfer Quantities in Multi-Source Transfer Learning
Qingyue Zhang, Haohao Fu, Guanbo Huang et al.
Autoregressive Distillation of Diffusion Transformers
Yeongmin Kim, Sotiris Anagnostidis, Yuming Du et al.
Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models
Xudong Li, Zihao Huang, Yan Zhang et al.
Reasoning Mamba: Hypergraph-Guided Region Relation Calculating for Weakly Supervised Affordance Grounding
Yuxuan Wang, Aming Wu, Muli Yang et al.
GLSim: Detecting Object Hallucinations in LVLMs via Global-Local Similarity
Seongheon Park, Sharon Li
AnyPortal: Zero-Shot Consistent Video Background Replacement
Wenshuo Gao, Xicheng Lan, Shuai Yang
ORIGAMISPACE: Benchmarking Multimodal LLMs in Multi-Step Spatial Reasoning with Mathematical Constraints
Rui Xu, Dakuan Lu, Zicheng Zhao et al.
RSCC: A Large-Scale Remote Sensing Change Caption Dataset for Disaster Events
Zhenyuan Chen, Chenxi Wang, Ningyu Zhang et al.
GnnXemplar: Exemplars to Explanations - Natural Language Rules for Global GNN Interpretability
Burouj Armgaan, Eshan Jain, Harsh Pandey et al.
PASTA: Part-Aware Sketch-to-3D Shape Generation with Text-Aligned Prior
Seunggwan Lee, Hwanhee Jung, ByoungSoo Koh et al.
Differentially Private Quantiles with Smaller Error
Jacob Imola, Fabrizio Boninsegna, Hannah Keller et al.
Diffusion-based Event Generation for High-Quality Image Deblurring
Xinan Xie, Qing Zhang, Wei-Shi Zheng
NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval
Zengrong Lin, Zheng Wang, Tianwen Qian et al.
V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation
Hanyue Lou, Jinxiu Liang, Minggui Teng et al.
Meta-Learning Objectives for Preference Optimization
Carlo Alfano, Silvia Sapora, Jakob Foerster et al.
Towards Human-Understandable Multi-Dimensional Concept Discovery
Arne Grobrügge, Niklas Kühl, Gerhard Satzger et al.
SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs
Guibiao Liao, Qing Li, Zhenyu Bao et al.
A Unified Framework for Motion Reasoning and Generation in Human Interaction
Jeongeun Park, Sungjoon Choi, Sangdoo Yun
Failure Prediction at Runtime for Generative Robot Policies
Ralf Römer, Adrian Kobras, Luca Worbis et al.