Most Cited 2025 Poster Papers
SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks
Shining Wang, Yunlong Wang, Ruiqi Wu et al.
Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form
Toshinori Kitamura, Tadashi Kozuno, Wataru Kumagai et al.
V2V3D: View-to-View Denoised 3D Reconstruction for Light Field Microscopy
Jiayin Zhao, Zhenqi Fu, Tao Yu et al.
A Unified Framework for Heterogeneous Semi-supervised Learning
Marzi Heidari, Abdullah Alchihabi, Hao Yan et al.
SLADE: Shielding against Dual Exploits in Large Vision-Language Models
Md Zarif Hossain, AHMED IMTEAJ
MODA: Motion-Drift Augmentation for Inertial Human Motion Analysis
Yinghao Wu, Shihui Guo, Yipeng Qin
Learning to Filter Outlier Edges in Global SfM
Nicole Damblon, Marc Pollefeys, Daniel Barath
PhysPDE: Rethinking PDE Discovery and a Physical HYpothesis Selection Benchmark
Mingquan Feng, Yixin Huang, Yizhou Liu et al.
Improving the Training of Data-Efficient GANs via Quality Aware Dynamic Discriminator Rejection Sampling
Zhaoyu Zhang, Yang Hua, Guanxiong Sun et al.
NLPrompt: Noise-Label Prompt Learning for Vision-Language Models
Bikang Pan, Qun Li, Xiaoying Tang et al.
No Pains, More Gains: Recycling Sub-Salient Patches for Efficient High-Resolution Image Recognition
Rong Qin, Xin Liu, Xingyu Liu et al.
A Coefficient Makes SVRG Effective
Yida Yin, Zhiqiu Xu, Zhiyuan Li et al.
nGPT: Normalized Transformer with Representation Learning on the Hypersphere
Ilya Loshchilov, Cheng-Ping Hsieh, Simeng Sun et al.
Where's the Liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content
Haoyue Bai, Yiyou Sun, Wei Cheng et al.
Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape View
Kaiyue Wen, Zhiyuan Li, Jason Wang et al.
Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content
Qiuheng Wang, Yukai Shi, Jiarong Ou et al.
VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification
Xianwei Zhuang, Zhihong Zhu, Yuxin Xie et al.
Elucidating the Preconditioning in Consistency Distillation
Kaiwen Zheng, Guande He, Jianfei Chen et al.
Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation
Ying Jin, Jinlong Peng, Qingdong He et al.
Beyond Worst-Case Dimensionality Reduction for Sparse Vectors
Sandeep Silwal, David Woodruff, Qiuyi (Richard) Zhang
Towards a Geometric Understanding of Tensor Learning via the t-Product
Andong Wang, Yuning Qiu, Haonan Huang et al.
Diffusion Bridge Implicit Models
Kaiwen Zheng, Guande He, Jianfei Chen et al.
Decoupled Subgraph Federated Learning
Javad Aliakbari, Johan Östman, Alexandre Graell i Amat
CoMatcher: Multi-View Collaborative Feature Matching
Jintao Zhang, Zimin Xia, Mingyue Dong et al.
Repurposing in AI: A Distinct Approach or an Extension of Creative Problem Solving?
Aissatou Diallo, Antonis Bikakis, Luke Dickens et al.
Motif-aware Graph Neural Networks for Networked Time Series Imputation
Nourhan Ahmed, Vijaya Krishna Yalavarthi, Lars Schmidt-Thieme
PillarHist: A Quantization-aware Pillar Feature Encoder based on Height-aware Histogram
Sifan Zhou, Zhihang Yuan, Dawei Yang et al.
SparseMVC: Probing Cross-view Sparsity Variations for Multi-view Clustering
Ruimeng Liu, Xin Zou, Chang Tang et al.
Scalable Quantum-Inspired Optimization Through Dynamic Qubit Compression
Co Tran, Quoc-Bao Tran, Hy Truong Son et al.
Towards Explicit Geometry-Reflectance Collaboration for Generalized LiDAR Segmentation in Adverse Weather
Longyu Yang, Ping Hu, Shangbo Yuan et al.
Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding
Akash Kumar, Zsolt Kira, Yogesh S Rawat
Generalizable Object Keypoint Localization from Generative Priors
Dongkai Wang, Jiang Duan, Liangjian Wen et al.
Integrating Task-Specific and Universal Adapters for Pre-Trained Model-based Class-Incremental Learning
yan wang, Da-Wei Zhou, Han-Jia Ye
From an LLM Swarm to a PDDL-empowered Hive: Planning Self-executed Instructions in a Multi-modal Jungle
Kaustubh Vyas, Damien Graux, Yijun Yang et al.
Port-Hamiltonian Architectural Bias for Long-Range Propagation in Deep Graph Networks
Simon Heilig, Alessio Gravina, Alessandro Trenta et al.
Chebyshev Attention Depth Permutation Texture Network with Latent Texture Attribute Loss
Ravishankar Evani, Deepu Rajan, Shangbo Mao
Universal Image Restoration Pre-training via Degradation Classification
Jiakui Hu, Lujia Jin, Zhengjian Yao et al.
Shining Yourself: High-Fidelity Ornaments Virtual Try-on with Diffusion Model
Yingmao Miao, Zhanpeng Huang, Rui Han et al.
AnomalyCoT: A Multi-Scenario Chain-of-Thought Dataset for Multimodal Large Language Models
Jiaxi Cheng, Yuliang Xu, Shoupeng Wang et al.
DreamCache: Finetuning-Free Lightweight Personalized Image Generation via Feature Caching
Emanuele Aiello, Umberto Michieli, Diego Valsesia et al.
Rethinking Evaluation of Infrared Small Target Detection
Youwei Pang, Xiaoqi Zhao, Lihe Zhang et al.
Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition
Yifei Zhang, Chang Liu, Jin Wei et al.
EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation
Diljeet Jagpal, Xi Chen, Vinay P. Namboodiri
GenSpace: Benchmarking Spatially-Aware Image Generation
Zehan Wang, Jiayang Xu, Ziang Zhang et al.
ReWind: Understanding Long Videos with Instructed Learnable Memory
Anxhelo Diko, Tinghuai Wang, Wassim Swaileh et al.
ABBSPO: Adaptive Bounding Box Scaling and Symmetric Prior based Orientation Prediction for Detecting Aerial Image Objects
Woojin Lee, Hyugjae Chang, Jaeho Moon et al.
Semantic-guided Cross-Modal Prompt Learning for Skeleton-based Zero-shot Action Recognition
Anqi Zhu, Jingmin Zhu, James Bailey et al.
SolidGeo: Measuring Multimodal Spatial Math Reasoning in Solid Geometry
Peijie Wang, Chao Yang, Zhong-Zhi Li et al.
DISCO: DISCrete nOise for Conditional Control in Text-to-Image Diffusion Models
Longquan Dai, Wu Ming, Dejiao Xue et al.
Learning to Contextualize Web Pages for Enhanced Decision Making by LLM Agents
Dongjun Lee, Juyong Lee, Kyuyoung Kim et al.
VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding
Chaoyu Li, Eun Woo Im, Pooyan Fazli
CrypticBio: A Large Multimodal Dataset for Visually Confusing Species
Georgiana Manolache, Gerard Schouten, Joaquin Vanschoren
Reasoning is Periodicity? Improving Large Language Models Through Effective Periodicity Modeling
Yihong Dong, Ge Li, Xue Jiang et al.
All-directional Disparity Estimation for Real-world QPD Images
Hongtao Yu, Shaohui Song, Lihu Sun et al.
Unifying Reconstruction and Density Estimation via Invertible Contraction Mapping in One-Class Classification
Xiaolei Wang, Tianhong Dai, Huihui Bai et al.
COBRA: COmBinatorial Retrieval Augmentation for Few-Shot Adaptation
Arnav Mohanty Das, Gantavya Bhatt, Lilly Kumari et al.
MFogHub: Bridging Multi-Regional and Multi-Satellite Data for Global Marine Fog Detection and Forecasting
Mengqiu XU, Kaixin Chen, Heng Guo et al.
Dual Consolidation for Pre-Trained Model-Based Domain-Incremental Learning
Da-Wei Zhou, Zi-Wen Cai, Han-Jia Ye et al.
Efficient Test-time Adaptive Object Detection via Sensitivity-Guided Pruning
Kunyu Wang, Xueyang Fu, Xin Lu et al.
TEMPO: Temporal Multi-scale Autoregressive Generation of Protein Conformational Ensembles
Yaoyao Xu, Di Wang, Zihan Zhou et al.
Alignment, Mining and Fusion: Representation Alignment with Hard Negative Mining and Selective Knowledge Fusion for Medical Visual Question Answering
Yuanhao Zou, Zhaozheng Yin
Q-Bench-Video: Benchmark the Video Quality Understanding of LMMs
Zicheng Zhang, Ziheng Jia, Haoning Wu et al.
STAIR: Addressing Stage Misalignment through Temporal-Aligned Preference Reinforcement Learning
Yao Luan, Ni Mu, Yiqin Yang et al.
MAD: Memory-Augmented Detection of 3D Objects
Ben Agro, Sergio Casas, Patrick Wang et al.
Continual Release Moment Estimation with Differential Privacy
Nikita Kalinin, Jalaj Upadhyay, Christoph Lampert
Training-free Neural Architecture Search through Variance of Knowledge of Deep Network Weights
Ondrej Tybl, Lukas Neumann
RAEncoder: A Label-Free Reversible Adversarial Examples Encoder for Dataset Intellectual Property Protection
Fan Xing, Zhuo Tian, Xuefeng Fan et al.
NavBench: Probing Multimodal Large Language Models for Embodied Navigation
Yanyuan Qiao, Haodong Hong, Wenqi Lyu et al.
Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition
ZHANG LINTONG, Kang Yin, Seong-Whan Lee
Not Just Text: Uncovering Vision Modality Typographic Threats in Image Generation Models
Hao Cheng, Erjia Xiao, Jiayan Yang et al.
Generation as Search Operator for Test-Time Scaling of Diffusion-based Combinatorial Optimization
Yang Li, Lvda Chen, Haonan Wang et al.
Mamba-Reg: Vision Mamba Also Needs Registers
Feng Wang, Jiahao Wang, Sucheng Ren et al.
OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts
Yuxuan Wang, Yueqian Wang, Bo Chen et al.
When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding
Yan Shu, Hangui Lin, Yexin Liu et al.
Imputation-free and Alignment-free: Incomplete Multi-view Clustering Driven by Consensus Semantic Learning
yuzhuo dai, Jiaqi Jin, Zhibin Dong et al.
Accelerating Block Coordinate Descent for LLM Finetuning via Landscape Expansion
Qijun Luo, Yifei Shen, Liangzu Peng et al.
Autoregressive Sequential Pretraining for Visual Tracking
Shiyi Liang, Yifan Bai, Yihong Gong et al.
Number it: Temporal Grounding Videos like Flipping Manga
Yongliang Wu, Xinting Hu, Yuyang Sun et al.
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
Ling Yang, Xinchen Zhang, Ye Tian et al.
OpenOmni: Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-time Emotional Speech Synthesis
Run Luo, Ting-En Lin, Haonan Zhang et al.
MoodAngels: A Retrieval-augmented Multi-agent Framework for Psychiatry Diagnosis
Mengxi Xiao, Ben Liu, He Li et al.
A Regularized Newton Method for Nonconvex Optimization with Global and Local Complexity Guarantees
Yuhao Zhou, Jintao Xu, Bingrui Li et al.
Uncertainty Meets Diversity: A Comprehensive Active Learning Framework for Indoor 3D Object Detection
Jiangyi Wang, Na Zhao
Interaction-Centric Knowledge Infusion and Transfer for Open Vocabulary Scene Graph Generation
Lin Li, Chuhan ZHANG, Dong Zhang et al.
CVGL: Causal Learning and Geometric Topology
Songsong Ouyang, Yingying Zhu
SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding
chenkai zhang, Yiming Lei, Zeming Liu et al.
GS-2DGS: Geometrically Supervised 2DGS for Reflective Object Reconstruction
Jinguang Tong, Xuesong li, Fahira Afzal Maken et al.
PC-Net: Weakly Supervised Compositional Moment Retrieval via Proposal-Centric Network
Mingyao Zhou, Hao Sun, Wei Xie et al.
PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting
Cheng Zhang, Haofei Xu, Qianyi Wu et al.
GTR-Loc: Geospatial Text Regularization Assisted Outdoor LiDAR Localization
Shangshu Yu, Wen Li, Xiaotian Sun et al.
LEDiff: Latent Exposure Diffusion for HDR Generation
Chao Wang, Zhihao Xia, Thomas Leimkuehler et al.
Flattening Hierarchies with Policy Bootstrapping
John Zhou, Jonathan Kao
Unleashing the Power of One-Step Diffusion based Image Super-Resolution via a Large-Scale Diffusion Discriminator
Jianze Li, Jiezhang Cao, Zichen Zou et al.
FloVD: Optical Flow Meets Video Diffusion Model for Enhanced Camera-Controlled Video Synthesis
Wonjoon Jin, Qi Dai, Chong Luo et al.
LithoSim: A Large, Holistic Lithography Simulation Benchmark for AI-Driven Semiconductor Manufacturing
Hongquan He, Zhen Wang, Jingya Wang et al.
Resounding Acoustic Fields with Reciprocity
Zitong Lan, Yiduo Hao, Mingmin Zhao
NVILA: Efficient Frontier Visual Language Models
Zhijian Liu, Ligeng Zhu, Baifeng Shi et al.
Fuzzy Multimodal Learning for Trusted Cross-modal Retrieval
Siyuan Duan, Yuan Sun, Dezhong Peng et al.
AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction
Lingteng Qiu, Shenhao Zhu, Qi Zuo et al.
Steering Information Utility in Key-Value Memory for Language Model Post-Training
Chunyuan Deng, Ruidi Chang, Hanjie Chen
Analog Foundation Models
Julian Büchel, Iason Chalas, Giovanni Acampa et al.
UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation
Xiaoqi Zhao, Youwei Pang, Chenyang Yu et al.
NTKMTL: Mitigating Task Imbalance in Multi-Task Learning from Neural Tangent Kernel Perspective
Xiaohan Qin, Xiaoxing Wang, Ning Liao et al.
Can LLMs Correct Themselves? A Benchmark of Self-Correction in LLMs
Guiyao Tie, Zenghui Yuan, Zeli Zhao et al.
Causal Discovery and Inference through Next-Token Prediction
Eivinas Butkus, Nikolaus Kriegeskorte
Simultaneous Statistical Inference for Off-Policy Evaluation in Reinforcement Learning
Tianpai Luo, Xinyuan Fan, Weichi Wu
Seeing More with Less: Human-like Representations in Vision Models
Andrey Gizdov, Shimon Ullman, Daniel Harari
How Far Are We from Optimal Reasoning Efficiency?
Jiaxuan Gao, Shu Yan, Qixin Tan et al.
A Unified Reasoning Framework for Holistic Zero-Shot Video Anomaly Analysis
Dongheng Lin, Mengxue Qu, Kunyang Han et al.
EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting
Bohao Liao, Wei Zhai, Zengyu Wan et al.
HPSERec: A Hierarchical Partitioning and Stepwise Enhancement Framework for Long-tailed Sequential Recommendation
Xiaolong Xu, Xudong Zhao, Haolong Xiang et al.
S$^2$M-Former: Spiking Symmetric Mixing Branchformer for Brain Auditory Attention Detection
Jiaqi Wang, Zhengyu Ma, Xiongri Shen et al.
Disentangling Safe and Unsafe Image Corruptions via Anisotropy and Locality
Ramchandran Muthukumar, Ambar Pal, Jeremias Sulam et al.
Noise-Robustness Through Noise: A Framework combining Asymmetric LoRA with Poisoning MoE
Zhaokun Wang, Jinyu Guo, Jingwen Pu et al.
Sample-Conditional Coverage in Split-Conformal Prediction
John Duchi
ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation
Ali Athar, Xueqing Deng, Liang-Chieh Chen
IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images
Chih-Hao Lin, Jia-Bin Huang, Zhengqin Li et al.
World Models Should Prioritize the Unification of Physical and Social Dynamics
Xiaoyuan Zhang, Chengdong Ma, Yizhe Huang et al.
Can Machines Understand Composition? Dataset and Benchmark for Photographic Image Composition Embedding and Understanding
Zhaoran Zhao, Peng Lu, Anran Zhang et al.
NeurIPS should lead scientific consensus on AI policy
Rishi Bommasani
Dense-SfM: Structure from Motion with Dense Consistent Matching
JongMin Lee, Sungjoo Yoo
Foundation Models for Scientific Discovery: From Paradigm Enhancement to Paradigm Transition
Fan LIU, Jindong Han, Tengfei Lyu et al.
Let's Chorus: Partner-aware Hybrid Song-Driven 3D Head Animation
Xiumei Xie, Zikai Huang, Wenhao Xu et al.
SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction
Yutao Tang, Yuxiang Guo, Deming Li et al.
Factored-NeuS: Reconstructing Surfaces, Illumination, and Materials of Possibly Glossy Objects
Yue Fan, Ningjing Fan, Ivan Skorokhodov et al.
ICLScan: Detecting Backdoors in Black-Box Large Language Models via Targeted In-context Illumination
Xiaoyi Pang, Xuanyi Hao, Song Guo et al.
TransPixeler: Advancing Text-to-Video Generation with Transparency
Luozhou Wang, Yijun Li, ZhiFei Chen et al.
On the Stability and Generalization of Meta-Learning: the Impact of Inner-Levels
Wenjun Ding, Jingling Liu, Lixing Chen et al.
RLtools: A Fast, Portable Deep Reinforcement Learning Library for Continuous Control
Jonas Eschmann, Dario Albani, Giuseppe Loianno
FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering
Guofeng Feng, Siyan Chen, Rong Fu et al.
PaceLLM: Brain-Inspired Large Language Models for Long-Context Understanding
Kangcong Li, Peng Ye, Chongjun Tu et al.
Geometric Learning with Positively Decomposable Kernels
Nathael Da Costa, Cyrus Mostajeran, Juan-Pablo Ortega et al.
Constrained Linear Thompson Sampling
Aditya Gangrade, Venkatesh Saligrama
Versatile differentially private learning for general loss functions
Qilong Lu, Songxi Chen, Yumou Qiu
Statistical Inference for Decentralized Federated Learning
Jia Gu, Songxi Chen
LoSplit: Loss-Guided Dynamic Split for Training-Time Defense Against Graph Backdoor Attacks
Di Jin, Yuxiang Zhang, Bingdao Feng et al.
Retrieval is Not Enough: Enhancing RAG through Test-Time Critique and Optimization
Jiaqi Wei, Hao Zhou, Xiang Zhang et al.
Technical Debt in In-Context Learning: Diminishing Efficiency in Long Context
Taejong Joo, Diego Klabjan
Variance-Based Membership Inference Attacks Against Large-Scale Image Captioning Models
Daniel Samira, Edan Habler, Yuval Elovici et al.
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary
Kevin Qinghong Lin, Mike Zheng Shou
From Pose to Muscle: Multimodal Learning for Piano Hand Muscle Electromyography
RUOFAN LIU, YICHEN PENG, Takanori Oku et al.
AdvEDM: Fine-grained Adversarial Attack against VLM-based Embodied Agents
Yichen Wang, Hangtao Zhang, Hewen Pan et al.
ERUPT: Efficient Rendering with Unposed Patch Transformer
Maxim Shugaev, Vincent Chen, Maxim Karrenbach et al.
Improved Monocular Depth Prediction Using Distance Transform Over Pre-semantic Contours with Self-supervised Neural Networks
Marwane Hariat, Antoine Manzanera, David Filliat
Ascent Fails to Forget
Ioannis Mavrothalassitis, Pol Puigdemont, Noam Levi et al.
On the Sample Complexity of Differentially Private Policy Optimization
Yi He, Xingyu Zhou
CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians
Chongjian GE, Chenfeng Xu, Yuanfeng Ji et al.
FIRE: Robust Detection of Diffusion-Generated Images via Frequency-Guided Reconstruction Error
Beilin Chu, Xuan Xu, Xin Wang et al.
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature
Alejandro Lozano, Min Woo Sun, James Burgess et al.
Learning 3D Anisotropic Noise Distributions Improves Molecular Force Fields
Xixian Liu, Rui Jiao, ZHIYUAN LIU et al.
FOCUS: Unified Vision-Language Modeling for Interactive Editing Driven by Referential Segmentation
Fan Yang, Yousong Zhu, Xin Li et al.
DynaNav: Dynamic Feature and Layer Selection for Efficient Visual Navigation
Jiahui Wang, Changhao Chen
GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents
Yuqi Zhou, Sunhao Dai, Shuai Wang et al.
DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models
Haoyang Li, Liang Wang, Chao Wang et al.
DEFT: Decompositional Efficient Fine-Tuning for Text-to-Image Models
Komal Kumar, Rao Anwer, Fahad Shahbaz Khan et al.
Taxonomy-Aware Evaluation of Vision-Language Models
Vésteinn Snæbjarnarson, Kevin Du, Niklas Stoehr et al.
LaViDa: A Large Diffusion Model for Vision-Language Understanding
Shufan Li, Konstantinos Kallidromitis, Hritik Bansal et al.
TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models
Xin Wang, Kai Chen, Jiaming Zhang et al.
Cross City Traffic Flow Generation via Retrieval Augmented Diffusion Model
Yudong Li, Jingyuan Wang, Xie Yu et al.
Towards Practical Real-Time Neural Video Compression
Zhaoyang Jia, Bin Li, Jiahao Li et al.
CDI: Copyrighted Data Identification in Diffusion Models
Jan Dubiński, Antoni Kowalczuk, Franziska Boenisch et al.
Binarized Neural Network for Multi-spectral Image Fusion
Junming Hou, Xiaoyu Chen, Ran Ran et al.
GaussianIP: Identity-Preserving Realistic 3D Human Generation via Human-Centric Diffusion Prior
Zichen Tang, Yuan Yao, Miaomiao Cui et al.
Quantifying Uncertainty in Error Consistency: Towards Reliable Behavioral Comparison of Classifiers
Thomas Klein, Sascha Meyen, Wieland Brendel et al.
Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations
Ahmad Rahimi, Po-Chien Luan, Yuejiang Liu et al.
Automated Model Discovery via Multi-modal & Multi-step Pipeline
Lee Jung-Mok, Nam Hyeon-Woo, Moon Ye-Bin et al.
Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity
Huaxin Zhang, Xiaohao Xu, Xiang Wang et al.
MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework
Ping Guo, Cheng Gong, Fei Liu et al.
Weakly Supervised Semantic Segmentation via Progressive Confidence Region Expansion
Xiangfeng Xu, Pinyi Zhang, Wenxuan Huang et al.
Dynamic Siamese Expansion Framework for Improving Robustness in Online Continual Learning
Fei Ye, Yulong Zhao, Qihe Liu et al.
Prior-Guided Flow Matching for Target-Aware Molecule Design with Learnable Atom Number
Jingyuan Zhou, Hao Qian, Shikui Tu et al.
Confusion-Driven Self-Supervised Progressively Weighted Ensemble Learning for Non-Exemplar Class Incremental Learning
Kai Hu, Zhang Yu, Yuan Zhang et al.
VidEmo: Affective-Tree Reasoning for Emotion-Centric Video Foundation Models
Zhicheng Zhang, Weicheng Wang, Yongjie Zhu et al.
QiMeng-SALV: Signal-Aware Learning for Verilog Code Generation
Yang Zhang, Rui Zhang, Jiaming Guo et al.
Disentangled Pose and Appearance Guidance for Multi-Pose Generation
Tengfei Xiao, Yue Wu, Yuelong Li et al.
Regional Explanations: Bridging Local and Global Variable Importance
Salim I. Amoukou, Nicolas Brunel
ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way
Jiazi Bu, Pengyang Ling, Pan Zhang et al.
Social World Model-Augmented Mechanism Design Policy Learning
Xiaoyuan Zhang, Yizhe Huang, Chengdong Ma et al.
No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes
Jasmine Bayrooti, Sattar Vakili, Amanda Prorok et al.
SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG
Xiaonan Si, Meilin Zhu, Simeng Qin et al.
Can MLLMs Absorb Math Reasoning Abilities from LLMs as Free Lunch?
Yijie Hu, Zihao Zhou, Kaizhu Huang et al.
Learning Conditional Space-Time Prompt Distributions for Video Class-Incremental Learning
Xiaohan Zou, Wenchao Ma, Shu Zhao
Convex Combination Star Shape Prior for Data-driven Image Semantic Segmentation
Xinyu Zhao, Jun Xie, Shengzhe Chen et al.
Problem-Parameter-Free Decentralized Bilevel Optimization
Zhiwei Zhai, Wenjing Yan, Ying-Jun Zhang
Rethinking Personalized Aesthetics Assessment: Employing Physique Aesthetics Assessment as An Exemplification
Haobin Zhong, Shuai He, Anlong Ming et al.
Learning Memory-Enhanced Improvement Heuristics for Flexible Job Shop Scheduling
Jiaqi Wang, Zhiguang Cao, Peng Zhao et al.
Adaptive and Multi-scale Affinity Alignment for Hierarchical Contrastive Learning
Jiawei Huang, Minming Li, Hu Ding
Boosting Knowledge Utilization in Multimodal Large Language Models via Adaptive Logits Fusion and Attention Reallocation
Wenbin An, Jiahao Nie, Feng Tian et al.
Domain Adaptive Diabetic Retinopathy Grading with Model Absence and Flowing Data
Wenxin Su, Song Tang, Xiaofeng Liu et al.
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models
Xiangdong Zhang, Jiaqi Liao, Shaofeng Zhang et al.
SCFlow2: Plug-and-Play Object Pose Refiner with Shape-Constraint Scene Flow
Qingyuan Wang, Rui Song, Jiaojiao Li et al.
GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis
You Wang, Li Fang, Hao Zhu et al.
SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models
Wufei Ma, Luoxin Ye, Nessa McWeeney et al.
Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining
Shangquan Sun, Wenqi Ren, Juxiang Zhou et al.
Streaming Audio Generation from Discrete Tokens via Streaming Flow Matching
Ha-Yeong Choi, Sang-Hoon Lee
Targeted Maximum Likelihood Learning: An Optimization Perspective
Diyang Li, Kyra Gan
EntropyMark: Towards More Harmless Backdoor Watermark via Entropy-based Constraint for Open-source Dataset Copyright Protection
Ming Sun, Rui Wang, Zixuan Zhu et al.
Rethinking the Adversarial Robustness of Multi-Exit Neural Networks in an Attack-Defense Game
Keyizhi Xu, Chi Zhang, Zhan Chen et al.
AegisGuard: RL-Guided Adapter Tuning for TEE-Based Efficient & Secure On-Device Inference
CHE WANG, Ziqi Zhang, Yinggui Wang et al.
DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation
Tianyi Yan, Dongming Wu, Wencheng Han et al.