Most Cited 2025 "step-by-step reasoning" Papers
Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models?
Yanbo Wang, Jiyang Guan, Jian Liang et al.
Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior
Yulin Li, Haokun Gui, Ziyang Fan et al.
STAR: Efficient Preference-based Reinforcement Learning via Dual Regularization
Fengshuo Bai, Rui Zhao, Hongming Zhang et al.
GAM-Agent: Game-Theoretic and Uncertainty-Aware Collaboration for Complex Visual Reasoning
Jusheng Zhang, Yijia Fan, Wenjun Lin et al.
CAP-Net: A Unified Network for 6D Pose and Size Estimation of Categorical Articulated Parts from a Single RGB-D Image
Jingshun Huang, Haitao Lin, Tianyu Wang et al.
Jamais Vu: Exposing the Generalization Gap in Supervised Semantic Correspondence
Octave Mariotti, Zhipeng Du, Yash Bhalgat et al.
DGH: Dynamic Gaussian Hair
Junying Wang, Yuanlu Xu, Edith Tretschk et al.
OOD Detection with Relative Angles
Berker Demirel, Marco Fumero, Francesco Locatello
FineRS: Fine-grained Reasoning and Segmentation of Small Objects with Reinforcement Learning
Lu Zhang, Jiazuo Yu, Haomiao Xiong et al.
Multi-Scale Neighborhood Occupancy Masked Autoencoder for Self-Supervised Learning in LiDAR Point Clouds
Mohamed Abdelsamad, Michael Ulrich, Claudius Glaeser et al.
Decoupling Fine Detail and Global Geometry for Compressed Depth Map Super-Resolution
Huan Zheng, Wencheng Han, Jianbing Shen
InstructRestore: Region-Customized Image Restoration with Human Instructions
Shuaizheng Liu, Jianqi Ma, Lingchen Sun et al.
SVFR: A Unified Framework for Generalized Video Face Restoration
Zhiyao Wang, Xu Chen, Chengming Xu et al.
Effective SAM Combination for Open-Vocabulary Semantic Segmentation
Minhyeok Lee, Suhwan Cho, Jungho Lee et al.
Extracting task-relevant preserved dynamics from contrastive aligned neural recordings
Yiqi Jiang, Kaiwen Sheng, Yujia Gao et al.
SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
Gabriele Oliaro, Zhihao Jia, Daniel Campos et al.
Covariate-moderated Empirical Bayes Matrix Factorization
William Denault, Karl Tayeb, Peter Carbonetto et al.
A Unified Approach to Interpreting Self-supervised Pre-training Methods for 3D Point Clouds via Interactions
Qiang Li, Jian Ruan, Fanghao Wu et al.
Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay
Yifan Sun, Jingyan Shen, Yibin Wang et al.
A Plug-and-Play Query Synthesis Active Learning Framework for Neural PDE Solvers
Zhiyuan Wang, Jinwoo Go, Byung-Jun Yoon et al.
Open Set Label Shift with Test Time Out-of-Distribution Reference
Changkun Ye, Russell Tsuchida, Lars Petersson et al.
Less is More: an Attention-free Sequence Prediction Modeling for Offline Embodied Learning
Wei Huang, Jianshu Zhang, Leiyu Wang et al.
Dynamic Motion Blending for Versatile Motion Editing
Nan Jiang, Hongjie Li, Ziye Yuan et al.
Learning and Planning Multi-Agent Tasks via an MoE-based World Model
Zijie Zhao, Zhongyue Zhao, Kaixuan Xu et al.
PDFactor: Learning Tri-Perspective View Policy Diffusion Field for Multi-Task Robotic Manipulation
Jingyi Tian, Le Wang, Sanping Zhou et al.
Towards Irreversible Attack: Fooling Scene Text Recognition via Multi-Population Coevolution Search
Jingyu Li, Pengwen Dai, Mingqing Zhu et al.
CG-SSL: Concept-Guided Self-Supervised Learning
Sara Atito, Josef Kittler, Imran Razzak et al.
Generalization Guarantees for Learning Score-Based Branch-and-Cut Policies in Integer Programming
Hongyu Cheng, Amitabh Basu
LangHOPS: Language Grounded Hierarchical Open-Vocabulary Part Segmentation
Yang Miao, Jan-Nico Zaech, Xi Wang et al.
Transferable Black-Box One-Shot Forging of Watermarks via Image Preference Models
Tomas Soucek, Sylvestre-Alvise Rebuffi, Pierre Fernandez et al.
Building Vision Models upon Heat Conduction
Zhaozhi Wang, Yue Liu, Yunjie Tian et al.
Skill-Driven Neurosymbolic State Abstractions
Alper Ahmetoglu, Steven James, Cameron Allen et al.
Last-Iterate Convergence of Smooth Regret Matching$^+$ Variants in Learning Nash Equilibria
Linjian Meng, Youzhi Zhang, Zhenxing Ge et al.
Learning to Clean: Reinforcement Learning for Noisy Label Correction
Marzi Heidari, Hanping Zhang, Yuhong Guo
Incomplete Multi-View Multi-label Learning via Disentangled Representation and Label Semantic Embedding
Xu Yan, Jun Yin, Jie Wen
MIBP-Cert: Certified Training against Data Perturbations with Mixed-Integer Bilinear Programs
Tobias Lorenz, Marta Kwiatkowska, Mario Fritz
Distributional Adversarial Attacks and Training in Deep Hedging
Guangyi He, Tobias Sutter, Lukas Gonon
Calibrating Translation Decoding with Quality Estimation on LLMs
Di Wu, Yibin Lei, Christof Monz
DnLUT: Ultra-Efficient Color Image Denoising via Channel-Aware Lookup Tables
Sidi Yang, Binxiao Huang, Yulun Zhang et al.
Improve Temporal Reasoning in Multimodal Large Language Models via Video Contrastive Decoding
Daiqing Qi, Dongliang Guo, Hanzhang Yuan et al.
h-Edit: Effective and Flexible Diffusion-Based Editing via Doob's h-Transform
Toan Nguyen, Kien Do, Duc Kieu et al.
Boundary-to-Region Supervision for Offline Safe Reinforcement Learning
Huikang Su, Dengyun Peng, Zifeng Zhuang et al.
Sample-Adaptivity Tradeoff in On-Demand Sampling
Nika Haghtalab, Omar Montasser, Mingda Qiao
Efficient Bayesian Experiment Design with Equivariant Networks
Conor Igoe, Tejus Gupta, Jeff Schneider
Luminance-Aware Statistical Quantization: Unsupervised Hierarchical Learning for Illumination Enhancement
Derong Kong, Zhixiong Yang, Shengxi Li et al.
CocoER: Aligning Multi-Level Feature by Competition and Coordination for Emotion Recognition
Xuli Shen, Hua Cai, Weilin Shen et al.
Brain-Inspired Spiking Neural Networks for Energy-Efficient Object Detection
Ziqi Li, Tao Gao, Yisheng An et al.
Understanding the Gain from Data Filtering in Multimodal Contrastive Learning
Divyansh Pareek, Sewoong Oh, Simon Du
Self-Expansion of Pre-trained Models with Mixture of Adapters for Continual Learning
Huiyi Wang, Haodong Lu, Lina Yao et al.
Enhancing Zero-Shot Black-Box Optimization via Pretrained Models with Efficient Population Modeling, Interaction, and Stable Gradient Approximation
Muqi Han, Xiaobin Li, Kai Wu et al.
PointSR: Self-Regularized Point Supervision for Drone-View Object Detection
Weizhuo Li, Yue Xi, Wenjing Jia et al.
MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environments
Ege Özsoy, Chantal Pellegrini, Tobias Czempiel et al.
TANDEM: Bi-Level Data Mixture Optimization with Twin Networks
Jiaxing Wang, Deping Xiang, Jin Xu et al.
Mitigating Instability in High Residual Adaptive Sampling for PINNs via Langevin Dynamics
Minseok Jeong, Giup Seo, Euiseok Hwang
ToxicTextCLIP: Text-Based Poisoning and Backdoor Attacks on CLIP Pre-training
Xin Yao, Haiyang Zhao, Yimin Chen et al.
Reaction Prediction via Interaction Modeling of Symmetric Difference Shingle Sets
Runhan Shi, Letian Chen, Gufeng Yu et al.
Anomaly Detection by an Ensemble of Random Pairs of Hyperspheres
Walid Durani, Collin Leiber, Khalid Durani et al.
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
Jianzong Wu, Chao Tang, Jingbo Wang et al.
Causal Spatio-Temporal Prediction: An Effective and Efficient Multi-Modal Approach
Yuting Huang, Ziquan Fang, Zhihao Zeng et al.
AniDoc: Animation Creation Made Easier
Yihao Meng, Hao Ouyang, Hanlin Wang et al.
Camouflage Anything: Learning to Hide using Controlled Out-painting and Representation Engineering
Biplab Das, Viswanath Gopalakrishnan
Learning to Reason under Off-Policy Guidance
Jianhao Yan, Yafu Li, Zican Hu et al.
Leveraging Temporal Cues for Semi-Supervised Multi-View 3D Object Detection
Jinhyung Park, Navyata Sanghvi, Hiroki Adachi et al.
Backdoor Mitigation via Invertible Pruning Masks
Kealan Dunnett, Reza Arablouei, Volkan Dedeoglu et al.
Data-Dependent Regret Bounds for Constrained MABs
Gianmarco Genalti, Francesco Emanuele Stradi, Matteo Castiglioni et al.
CGMatch: A Different Perspective of Semi-supervised Learning
Bo Cheng, Jueqing Lu, Yuan Tian et al.
ProtoPairNet: Interpretable Regression through Prototypical Pair Reasoning
Rose Gurung, Ronilo Ragodos, Chiyu Ma et al.
Shape-Informed Clustering of Multi-Dimensional Functional Data via Deep Functional Autoencoders
Samuel V. Singh, Shirley Coyle, Mimi Zhang
Improving Model-Based Reinforcement Learning by Converging to Flatter Minima
Shrinivas Ramasubramanian, Benjamin Freed, Alexandre Capone et al.
Compositional Targeted Multi-Label Universal Perturbations
Hassan Mahmood, Ehsan Elhamifar
Online Bilateral Trade With Minimal Feedback: Don’t Waste Seller’s Time
Francesco Bacchiocchi, Matteo Castiglioni, Roberto Colomboni et al.
ODA-GAN: Orthogonal Decoupling Alignment GAN Assisted by Weakly-supervised Learning for Virtual Immunohistochemistry Staining
Tong Wang, Mingkang Wang, Zhongze Wang et al.
Zero-shot Denoising via Neural Compression: Theoretical and algorithmic framework
Ali Zafari, Xi Chen, Shirin Jalali
Fairness-aware Anomaly Detection via Fair Projection
Feng Xiao, Xiaoying Tang, Jicong Fan
Proximal Algorithm Unrolling: Flexible and Efficient Reconstruction Networks for Single-Pixel Imaging
Ping Wang, Lishun Wang, Gang Qu et al.
Scalable Evaluation and Neural Models for Compositional Generalization
Giacomo Camposampiero, Pietro Barbiero, Michael Hersche et al.
IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation
Yuanze Lin, Yi-Wen Chen, Yi-Hsuan Tsai et al.
VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos
Ziyang Wang, Shoubin Yu, Elias Stengel-Eskin et al.
NeuroPath: Neurobiology-Inspired Path Tracking and Reflection for Semantically Coherent Retrieval
Junchen Li, Rongzheng Wang, Yihong Huang et al.
EverybodyDance: Bipartite Graph–Based Identity Correspondence for Multi-Character Animation
Haotian Ling, Zequn Chen, Qiuying Chen et al.
EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction
Hsi-Che Lin, Yu-Chu Yu, Kai-Po Chang et al.
From Head to Tail: Efficient Black-box Model Inversion Attack via Long-tailed Learning
Ziang Li, Hongguang Zhang, Juan Wang et al.
Multi-Scale Finetuning for Encoder-based Time Series Foundation Models
Zhongzheng Qiao, Chenghao Liu, Yiming Zhang et al.
FlowFeat: Pixel-Dense Embedding of Motion Profiles
Nikita Araslanov, Anna Sonnweber, Daniel Cremers
Faster Fixed-Point Methods for Multichain MDPs
Matthew Zurek, Yudong Chen
Learning Source-Free Domain Adaptation for Visible-Infrared Person Re-Identification
Yongxiang Li, Yanglin Feng, Yuan Sun et al.
BlurGuard: A Simple Approach for Robustifying Image Protection Against AI-Powered Editing
Jinsu Kim, Yunhun Nam, Minseon Kim et al.
PANORAMA: A Dataset and Benchmarks Capturing Decision Trails and Rationales in Patent Examination
Hyunseung Lim, Sooyohn Nam, Sungmin Na et al.
The Dual Nature of Plasticity Loss in Deep Continual Learning: Dissection and Mitigation
Haoyu Wang, Wei Dai, Jiawei Zhang et al.
SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing
Xueting Li, Ye Yuan, Shalini De Mello et al.
HOComp: Interaction-Aware Human-Object Composition
Dong Liang, Jinyuan Jia, Yuhao Liu et al.
Finding Low-Rank Matrix Weights in DNNs via Riemannian Optimization: RAdaGrad and RAdamW
Fengmiao Bian, Jinyang Zheng, Ziyun Liu et al.
FAIR Universe HiggsML Uncertainty Dataset and Competition
Wahid Bhimji, Ragansu Chakkappai, Po-Wen Chang et al.
Rare Text Semantics Were Always There in Your Diffusion Transformer
Seil Kang, Woojung Han, Dayun Ju et al.
Document Summarization with Conformal Importance Guarantees
Bruce Kuwahara, Chen-Yuan Lin, Xiao Shi Huang et al.
Multivariate Time Series Anomaly Detection with Idempotent Reconstruction
Xin Sun, Heng Zhou, Chao Li
Aligning Transformers with Continuous Feedback via Energy Rank Alignment
Shriram Chennakesavalu, Frank Hu, Sebastian Ibarraran et al.
Learning with Restricted Boltzmann Machines: Asymptotics of AMP and GD in High Dimensions
Yizhou Xu, Florent Krzakala, Lenka Zdeborová
Preference-Based Dynamic Ranking Structure Recognition
Nan Lu, Jian Shi, Xinyu Tian
A Single-Loop Gradient Algorithm for Pessimistic Bilevel Optimization via Smooth Approximation
Qichao Cao, Shangzhi Zeng, Jin Zhang
SpaceServe: Spatial Multiplexing of Complementary Encoders and Decoders for Multimodal LLMs
Zhicheng Li, Shuoming Zhang, Jiacheng Zhao et al.
Latent Harmony: Synergistic Unified UHD Image Restoration via Latent Space Regularization and Controllable Refinement
Yidi Liu, Xueyang Fu, Jie Huang et al.
CLIPTTA: Robust Contrastive Vision-Language Test-Time Adaptation
Marc Lafon, Gustavo Vargas Hakim, Clément Rambour et al.
SerialGen: Personalized Image Generation by First Standardization Then Personalization
Cong Xie, Han Zou, Ruiqi Yu et al.
Repurposing Pre-trained Video Diffusion Models for Event-based Video Interpolation
Jingxi Chen, Brandon Y. Feng, Haoming Cai et al.
RSAR: Restricted State Angle Resolver and Rotated SAR Benchmark
Xin Zhang, Xue Yang, Yuxuan Li et al.
Few-shot Implicit Function Generation via Equivariance
Suizhi Huang, Xingyi Yang, Hongtao Lu et al.
Model-Informed Flows for Bayesian Inference
Joohwan Ko, Justin Domke
Regression Trees Know Calculus
Nathan Wycoff
UnCLe: Towards Scalable Dynamic Causal Discovery in Non-linear Temporal Systems
Tingzhu Bi, Yicheng Pan, Xinrui Jiang et al.
The Mirage of Performance Gains: Why Contrastive Decoding Fails to Mitigate Object Hallucinations in MLLMs?
Hao Yin, Guangzong Si, Zilei Wang
Salient Concept-Aware Generative Data Augmentation
Tianchen Zhao, Xuanbai Chen, Zhihua Li et al.
From stability of Langevin diffusion to convergence of proximal MCMC for non-log-concave sampling
Marien Renaud, Valentin De Bortoli, Arthur Leclaire et al.
MaIR: A Locality- and Continuity-Preserving Mamba for Image Restoration
Boyun Li, Haiyu Zhao, Wenxin Wang et al.
What Can RL Bring to VLA Generalization? An Empirical Study
Jijia Liu, Feng Gao, Bingwen Wei et al.
Segment Anything Model Meets Semi-supervised Medical Image Segmentation: A Novel Perspective
Haifeng Zhao, Haiyang Li, Lei-Lei Ma et al.
ViKIENet: Towards Efficient 3D Object Detection with Virtual Key Instance Enhanced Network
Zhuochen Yu, Bijie Qiu, Andy W. H. Khong
CoCoGaussian: Leveraging Circle of Confusion for Gaussian Splatting from Defocused Images
Jungho Lee, Suhwan Cho, Taeoh Kim et al.
Re-coding for Uncertainties: Edge-awareness Semantic Concordance for Resilient Event-RGB Segmentation
Nan Bao, Yifan Zhao, Lin Zhu et al.
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
David Junhao Zhang, Roni Paiss, Shiran Zada et al.
Perception Tokens Enhance Visual Reasoning in Multimodal Language Models
Mahtab Bigverdi, Zelun Luo, Cheng-Yu Hsieh et al.
Augmenting Perceptual Super-Resolution via Image Quality Predictors
Fengjia Zhang, Samrudhdhi Rangrej, Tristan T Aumentado-Armstrong et al.
CURE: Co-Evolving Coders and Unit Testers via Reinforcement Learning
Yinjie Wang, Ling Yang, Ye Tian et al.
Beyond Single-Modal Boundary: Cross-Modal Anomaly Detection through Visual Prototype and Harmonization
Kai Mao, Ping Wei, Yiyang Lian et al.
Text Augmented Correlation Transformer For Few-shot Classification & Segmentation
Srinivasa Rao Nandam, Sara Atito, Zhenhua Feng et al.
RoME: Domain-Robust Mixture-of-Experts for MILP Solution Prediction across Domains
Tianle Pu, Zijie Geng, Haoyang Liu et al.
CALICO: Part-Focused Semantic Co-Segmentation with Large Vision-Language Models
Kiet A. Nguyen, Adheesh Juvekar, Tianjiao Yu et al.
Multi-scale Temporal Prediction via Incremental Generation and Multi-agent Collaboration
Zhitao Zeng, Guojian Yuan, Junyuan Mao et al.
Compiler-R1: Towards Agentic Compiler Auto-tuning with Reinforcement Learning
Haolin Pan, Hongyu Lin, Haoran Luo et al.
RAG4GFM: Bridging Knowledge Gaps in Graph Foundation Models through Graph Retrieval Augmented Generation
Xingliang Wang, Zemin Liu, Junxiao Han et al.
Graph Persistence goes Spectral
Mattie Ji, Amauri Souza, Vikas Garg
Graph Few-Shot Learning via Adaptive Spectrum Experts and Cross-Set Distribution Calibration
Yonghao Liu, Yajun Wang, Chunli Guo et al.
Globally Optimal Policy Gradient Algorithms for Reinforcement Learning with PID Control Policies
Vipul Sharma, Wesley Suttle, S Sivaranjani
Why Playing Against Diverse and Challenging Opponents Speeds Up Coevolution: A Theoretical Analysis on Combinatorial Games
Alistair Benford, Per Kristian Lehre
MoManipVLA: Transferring Vision-language-action Models for General Mobile Manipulation
Zhenyu Wu, Yuheng Zhou, Xiuwei Xu et al.
MonoLift: Learning 3D Manipulation Policies from Monocular RGB via Distillation
Ziru Wang, Mengmeng Wang, Guang Dai et al.
MambaVO: Deep Visual Odometry Based on Sequential Matching Refinement and Training Smoothing
Shuo Wang, Wanting Li, Yongcai Wang et al.
Aggregation Hides Out-of-Distribution Generalization Failures from Spurious Correlations
Olawale Salaudeen, Haoran Zhang, Kumail Alhamoud et al.
Approximation and Generalization Abilities of Score-based Neural Network Generative Models for Sub-Gaussian Distributions
Guoji Fu, Wee Sun Lee
TAGA: Self-supervised Learning for Template-free Animatable Gaussian Articulated Model
Zhichao Zhai, Guikun Chen, Wenguan Wang et al.
DIsoN: Decentralized Isolation Networks for Out-of-Distribution Detection in Medical Imaging
Felix Wagner, Pramit Saha, Harry Anthony et al.
MMaDA: Multimodal Large Diffusion Language Models
Ling Yang, Ye Tian, Bowen Li et al.
3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model
Wenbo Hu, Yining Hong, Yanjun Wang et al.
Visual Diversity and Region-aware Prompt Learning for Zero-shot HOI Detection
Chanhyeong Yang, Taehoon Song, Jihwan Park et al.
HoT-VI: Reparameterizable Variational Inference for Capturing Instance-Level High-Order Correlations
Junxi Xiao, Qinliang Su, Zexin Yuan
Incentivizing Dual Process Thinking for Efficient Large Language Model Reasoning
Xiaoxue Cheng, Junyi Li, Zhenduo Zhang et al.
Continuous-time Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space
Mingyang Yi, Bohan Wang
Rethinking Few-Shot Adaptation of Vision-Language Models in Two Stages
Matteo Farina, Massimiliano Mancini, Giovanni Iacca et al.
Uncertainty Estimation by Flexible Evidential Deep Learning
Taeseong Yoon, Heeyoung Kim
TITAN: A Trajectory-Informed Technique for Adaptive Parameter Freezing in Large-Scale VQE
Yifeng Peng, Xinyi Li, Samuel Yen-Chi Chen et al.
Performative Risk Control: Calibrating Models for Reliable Deployment under Performativity
Victor Li, Baiting Chen, Yuzhen Mao et al.
LayerNavigator: Finding Promising Intervention Layers for Efficient Activation Steering in Large Language Models
Hao Sun, Huailiang Peng, Qiong Dai et al.
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents
Jun Chen, Dannong Xu, Junjie Fei et al.
Generative Perception of Shape and Material from Differential Motion
Xinran Han, Ko Nishino, Todd Zickler
Sequential Attention-based Sampling for Histopathological Analysis
Tarun Gogisetty, Naman Malpani, Gugan Chandrashekhar Mallika Thoppe et al.
Generating Full-field Evolution of Physical Dynamics from Irregular Sparse Observations
Panqi Chen, Yifan Sun, Lei Cheng et al.
Collaborative Reasoner: Self-Improving Social Agents with Synthetic Conversations
Ansong Ni, Ruta Desai, Yang Li et al.
All-Day Multi-Camera Multi-Target Tracking
Huijie Fan, Yu Qiao, Yihao Zhen et al.
Task-aware Cross-modal Feature Refinement Transformer with Large Language Models for Visual Grounding
Wenbo Chen, Zhen Xu, Ruotao Xu et al.
Beyond Prediction: Managing the Repercussions of Machine Learning Applications
Aline Weber, Blossom Metevier, Yuriy Brun et al.
Sparse Image Synthesis via Joint Latent and RoI Flow
Ziteng Gao, Jay Zhangjie Wu, Mike Zheng Shou
Segment Any Motion in Videos
Nan Huang, Wenzhao Zheng, Chenfeng Xu et al.
Visual Prompting for One-shot Controllable Video Editing without Inversion
Zhengbo Zhang, Yuxi Zhou, Duo Peng et al.
Risk-Averse Constrained Reinforcement Learning with Optimized Certainty Equivalents
Jane Lee, Baturay Saglam, Spyridon Pougkakiotis et al.
BIMBA: Selective-Scan Compression for Long-Range Video Question Answering
Md Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang et al.
Recognition through Reasoning: Reinforcing Image Geo-localization with Large Vision-Language Models
Ling Li, Yao Zhou, Yuxuan Liang et al.
Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge
Yaqi Zhao, Yuanyang Yin, Lin Li et al.
Towards Effective and Sparse Adversarial Attack on Spiking Neural Networks via Breaking Invisible Surrogate Gradients
Li Lun, Kunyu Feng, Qinglong Ni et al.
VividFace: A Robust and High-Fidelity Video Face Swapping Framework
Hao Shao, Shulun Wang, Yang Zhou et al.
Non-Convex Tensor Recovery from Tube-Wise Sensing
Tongle Wu, Ying Sun
Hazy Low-Quality Satellite Video Restoration Via Learning Optimal Joint Degradation Patterns and Continuous-Scale Super-Resolution Reconstruction
Ning Ni, Libao Zhang
Act Only When It Pays: Efficient Reinforcement Learning for LLM Reasoning via Selective Rollouts
Haizhong Zheng, Yang Zhou, Brian Bartoldson et al.
ADD: Attribution-Driven Data Augmentation Framework for Boosting Image Super-Resolution
Zeyu Mi, Yu-Bin Yang
DeepHalo: A Neural Choice Model with Controllable Context Effects
Shuhan Zhang, Zhi Wang, Rui Gao et al.
SASep: Saliency-Aware Structured Separation of Geometry and Feature for Open Set Learning on Point Clouds
Jinfeng Xu, Xianzhi Li, Yuan Tang et al.
Synthetic Series-Symbol Data Generation for Time Series Foundation Models
Wenxuan Wang, Kai Wu, Yujian Li et al.
LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table
Yusuke Matsui
RNNs perform task computations by dynamically warping neural representations
Arthur Pellegrino, Angus Chadwick
SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model
Yucheng Mao, Boyang Wang, Nilesh Kulkarni et al.
Path-Enhanced Contrastive Learning for Recommendation
Haoran Sun, Fei Xiong, Yuanzhe Hu et al.
A Learning-Augmented Dynamic Programming Approach for Orienteering Problem with Time Windows
Guansheng Peng, Lining Xing, Fuyan Ma et al.
All-Optical Nonlinear Diffractive Deep Network for Ultrafast Image Denoising
Xiaoling Zhou, Zhemg Lee, Wei Ye et al.
DPSeg: Dual-Prompt Cost Volume Learning for Open-Vocabulary Semantic Segmentation
Ziyu Zhao, Xiaoguang Li, Lingjia Shi et al.
A Pre-training Framework for Relational Data with Information-theoretic Principles
Quang Truong, Zhikai Chen, Mingxuan Ju et al.
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios
Kunyu Peng, Junchao Huang, Xiangsheng Huang et al.
Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis
Hengyuan Cao, Yutong Feng, Biao Gong et al.
DejaVid: Encoder-Agnostic Learned Temporal Matching for Video Classification
Darryl Ho, Samuel Madden
DAPO: Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage-Based Policy Optimization
Jiacai Liu, Chaojie Wang, Chris Liu et al.
On Group Sufficiency Under Label Bias
Haoran Zhang, Olawale Salaudeen, Marzyeh Ghassemi
PatchDEMUX: A Certifiably Robust Framework for Multi-label Classifiers Against Adversarial Patches
Dennis Jacob, Chong Xiang, Prateek Mittal
Smooth Regularization for Efficient Video Recognition
Gil Goldman, Raja Giryes, Mahadev Satyanarayanan
Hierarchical Knowledge Prompt Tuning for Multi-task Test-Time Adaptation
Qiang Zhang, Mengsheng Zhao, Jiawei Liu et al.
Performative Validity of Recourse Explanations
Gunnar König, Hidde Fokkema, Timo Freiesleben et al.
LLM-DAMVC: A Large Language Model Assisted Dynamic Agent for Multi-View Clustering
Qianqian Wang, Qianqian Wang
Context-Aware Regularization with Markovian Integration for Attention-Based Nucleotide Analysis
Mohammad Saleh Refahi, Mahdi Abavisani, Bahrad Sokhansanj et al.
Robust Cross-modal Alignment Learning for Cross-Scene Spatial Reasoning and Grounding
Yanglin Feng, Hongyuan Zhu, Dezhong Peng et al.
Multimodal Causal Reasoning for UAV Object Detection
Nianxin Li, Mao Ye, Lihua Zhou et al.
CoSDH: Communication-Efficient Collaborative Perception via Supply-Demand Awareness and Intermediate-Late Hybridization
Junhao Xu, Yanan Zhang, Zhi Cai et al.
Train to Defend: First Defense Against Cryptanalytic Neural Network Parameter Extraction Attacks
Ashley Kurian, Aydin Aysu
ARMesh: Autoregressive Mesh Generation via Next-Level-of-Detail Prediction
Jiabao Lei, Kewei Shi, Zhihao Liang et al.