Most Cited 2025 "limitations" Papers
22,274 papers found
CCMNet: Leveraging Calibrated Color Correction Matrices for Cross-Camera Color Constancy
Dongyoung Kim, Mahmoud Afifi, Dongyun Kim et al.
Towards Real Unsupervised Anomaly Detection Via Confident Meta-Learning
Muhammad Aqeel, Shakiba Sharifi, Marco Cristani et al.
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization
Hongrui Jia, Chaoya Jiang, Haiyang Xu et al.
Provable and Practical Online Learning Rate Adaptation with Hypergradient Descent
Ya-Chi Chu, Wenzhi Gao, Yinyu Ye et al.
From Passive to Active Reasoning: Can Large Language Models Ask the Right Questions under Incomplete Information?
Zhanke Zhou, Xiao Feng, Zhaocheng Zhu et al.
Repurposing Stable Diffusion Attention for Training-Free Unsupervised Interactive Segmentation
Markus Karmann, Onay Urfalioglu
MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?
Yunxiang Zhang, Muhammad Khalifa, Shitanshu Bhushan et al.
Tracing the Representation Geometry of Language Models from Pretraining to Post-training
Melody Li, Kumar Krishna Agrawal, Arna Ghosh et al.
MOS: Modeling Object-Scene Associations in Generalized Category Discovery
Zhengyuan Peng, Jinpeng Ma, Zhimin Sun et al.
Multi-modal Vision Pre-training for Medical Image Analysis
Shaohao Rui, Lingzhi Chen, Zhenyu Tang et al.
Orientation Matters: Making 3D Generative Models Orientation-Aligned
Yichong Lu, Yuzhuo Tian, Zijin Jiang et al.
LLM Alignment as Retriever Optimization: An Information Retrieval Perspective
Bowen Jin, Jinsung Yoon, Zhen Qin et al.
UniMoMo: Unified Generative Modeling of 3D Molecules for De Novo Binder Design
Xiangzhe Kong, Zishen Zhang, Ziting Zhang et al.
Backdoor Token Unlearning: Exposing and Defending Backdoors in Pretrained Language Models
Peihai Jiang, Xixiang Lyu, Yige Li et al.
FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models
Alice Heiman, Xiaoman Zhang, Emma Chen et al.
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning
Aniket Rajiv Didolkar, Andrii Zadaianchuk, Rabiul Awal et al.
FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion
Haosen Yang, Adrian Bulat, Isma Hadji et al.
Mask in the Mirror: Implicit Sparsification
Tom Jacobs, Rebekka Burkholz
On the Relation between Rectified Flows and Optimal Transport
Johannes Hertrich, Antonin Chambolle, Julie Delon
GeoLoRA: Geometric integration for parameter efficient fine-tuning
Steffen Schotthöfer, Emanuele Zangrando, Gianluca Ceruti et al.
Federated Continual Instruction Tuning
Haiyang Guo, Fanhu Zeng, Fei Zhu et al.
Learning Complex Heterogeneous Multimodal Fake News via Social Latent Network Inference
Mingxin Li, Yuchen Zhang, Haowei Xu et al.
Gradient Weight-normalized Low-rank Projection for Efficient LLM Training
Jia-Hong Huang, Yixian Shen, Hongyi Zhu et al.
Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations
Pengcheng Jiang, Cao Xiao, Tianfan Fu et al.
Towards Robustness and Explainability of Automatic Algorithm Selection
Xingyu Wu, Jibin Wu, Yu Zhou et al.
Is Complex Query Answering Really Complex?
Cosimo Gregucci, Bo Xiong, Daniel Hernández et al.
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
Peihao Wang, Ruisi Cai, Yuehao Wang et al.
Physics-Informed Generative Modeling of Wireless Channels
Benedikt Böck, Andreas Oeldemann, Timo Mayer et al.
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models
Cameron Tice, Philipp Kreer, Nathan Helm-Burger et al.
CoPRA: Bridging Cross-domain Pretrained Sequence Models with Complex Structures for Protein-RNA Binding Affinity Prediction
Rong Han, Xiaohong Liu, Tong Pan et al.
Federated Class-Incremental Learning: A Hybrid Approach Using Latent Exemplars and Data-Free Techniques to Address Local and Global Forgetting
Milad Khademi Nori, Il-Min Kim, Guanghui Wang
Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences
Yunhong Lu, Qichao Wang, Hengyuan Cao et al.
Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective
Yiming Liu, Kezhao Liu, Yao Xiao et al.
PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling
Junchao Gong, Siwei Tu, Weidong Yang et al.
FactorGCL: A Hypergraph-Based Factor Model with Temporal Residual Contrastive Learning for Stock Returns Prediction
Yitong Duan, Weiran Wang, Jian Li
TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types
Jiankang Chen, Tianke Zhang, Changyi Liu et al.
Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems
Song Xia, Yi Yu, Wenhan Yang et al.
LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation
Vladan Stojnić, Yannis Kalantidis, Jiri Matas et al.
Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn
Hongyao Tang, Johan Obando-Ceron, Pablo Samuel Castro et al.
Detecting Visual Information Manipulation Attacks in Augmented Reality: A Multimodal Semantic Reasoning Approach
Yanming Xiu, Maria Gorlatova
FlexSelect: Flexible Token Selection for Efficient Long Video Understanding
Yunzhu Zhang, Yu Lu, Tianyi Wang et al.
Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking
Chen-Hao (Lance) Chao, Wei-Fang Sun, Hanwen Liang et al.
S4M: S4 for multivariate time series forecasting with Missing values
Jing Peng, Meiqi Yang, Qiong Zhang et al.
Vad-R1: Towards Video Anomaly Reasoning via Perception-to-Cognition Chain-of-Thought
Chao Huang, Benfeng Wang, Wei Wang et al.
STPro: Spatial and Temporal Progressive Learning for Weakly Supervised Spatio-Temporal Grounding
Aaryan Garg, Akash Kumar, Yogesh S. Rawat
AnoLLM: Large Language Models for Tabular Anomaly Detection
Che-Ping Tsai, Ganyu Teng, Phillip Wallis et al.
Enhancing Uncertainty Modeling with Semantic Graph for Hallucination Detection
Kedi Chen, Qin Chen, Jie Zhou et al.
Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning
Jian Liu, Jing Xu, Song Guo et al.
Privacy Attacks on Image AutoRegressive Models
Antoni Kowalczuk, Jan Dubiński, Franziska Boenisch et al.
FreqDebias: Towards Generalizable Deepfake Detection via Consistency-Driven Frequency Debiasing
Hossein Kashiani, Niloufar Alipour Talemi, Fatemeh Afghah
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization
Siyuan Li, Luyuan Zhang, Zedong Wang et al.
The Bandit Whisperer: Communication Learning for Restless Bandits
Yunfan Zhao, Tonghan Wang, Dheeraj Mysore Nagaraj et al.
Multi-Agent Motion Planning for Differential Drive Robots Through Stationary State Search
Jingtian Yan, Jiaoyang Li
ROPO: Robust Preference Optimization for Large Language Models
Xize Liang, Chao Chen, Shuang Qiu et al.
On Union-Closedness of Language Generation
Steve Hanneke, Amin Karbasi, Anay Mehrotra et al.
Hyperbolic Dataset Distillation
Wenyuan Li, Guang Li, Keisuke Maeda et al.
Fully Test-time Adaptation for Tabular Data
Zhi Zhou, Kun-Yang Yu, Lan-Zhe Guo et al.
Active Task Disambiguation with LLMs
Katarzyna Kobalczyk, Nicolás Astorga, Tennison Liu et al.
MindSimulator: Exploring Brain Concept Localization via Synthetic fMRI
Qi Zhang, Qi Zhang, Zixuan Gong et al.
APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers
Zhuguanyu Wu, Jiayi Zhang, Jiaxin Chen et al.
Factor Augmented Tensor-on-Tensor Neural Networks
Guanhao Zhou, Yuefeng Han, Xiufan Yu
IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning
Quan Zhang, Yuxin Qi, Xi Tang et al.
PBR-NeRF: Inverse Rendering with Physics-Based Neural Fields
Sean Wu, Shamik Basu, Tim Broedermann et al.
Integrative Decoding: Improving Factuality via Implicit Self-consistency
Yi Cheng, Xiao Liang, Yeyun Gong et al.
AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented Efficient Long Video Understanding
Xue Zhucun, Jiangning Zhang, Xie Xurong et al.
DataRater: Meta-Learned Dataset Curation
Dan Andrei Calian, Greg Farquhar, Iurii Kemaev et al.
CoMapGS: Covisibility Map-based Gaussian Splatting for Sparse Novel View Synthesis
Youngkyoon Jang, Eduardo Pérez-Pellitero
Overestimation in LLM Evaluation: A Controlled Large-Scale Study on Data Contamination’s Impact on Machine Translation
Muhammed Yusuf Kocyigit, Eleftheria Briakou, Daniel Deutsch et al.
LibriBrain: Over 50 Hours of Within-Subject MEG to Improve Speech Decoding Methods at Scale
Miran Özdogan, Gilad Landau, Gereon Elvers et al.
CLIPDrag: Combining Text-based and Drag-based Instructions for Image Editing
Ziqi Jiang, Zhen Wang, Long Chen
Generalized Gradient Norm Clipping & Non-Euclidean $(L_0,L_1)$-Smoothness
Thomas Pethick, Wanyun Xie, Mete Erdogan et al.
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models
Jun Zhang, Jue Wang, Huan Li et al.
On Extending Direct Preference Optimization to Accommodate Ties
Jinghong Chen, Guangyu Yang, Weizhe Lin et al.
TOPLOC: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference
Jack Min Ong, Matthew Di Ferrante, Aaron Pazdera et al.
On the Expressiveness and Length Generalization of Selective State Space Models on Regular Languages
Aleksandar Terzic, Michael Hersche, Giacomo Camposampiero et al.
Doubly Robust Conformalized Survival Analysis with Right-Censored Data
Matteo Sesia, Vladimir Svetnik
Cross-Domain Graph Data Scaling: A Showcase with Diffusion Models
Wenzhuo Tang, Haitao Mao, Danial Dervovic et al.
Steepest Descent Density Control for Compact 3D Gaussian Splatting
Peihao Wang, Yuehao Wang, Dilin Wang et al.
Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting
Zhining Liu, Ze Yang, Xiao Lin et al.
Sculpting Features from Noise: Reward-Guided Hierarchical Diffusion for Task-Optimal Feature Transformation
Nanxu Gong, Zijun Li, Sixun Dong et al.
PICD: Versatile Perceptual Image Compression with Diffusion Rendering
Tongda Xu, Jiahao Li, Bin Li et al.
Improving Complex Reasoning with Dynamic Prompt Corruption: A Soft Prompt Optimization Approach
Sinan Fan, Liang Xie, Chen Shen et al.
Empowering LLM Agents with Zero-Shot Optimal Decision-Making through Q-learning
Jiajun Chai, Sicheng Li, Yuqian Fu et al.
Language Agents Meet Causality -- Bridging LLMs and Causal World Models
John Gkountouras, Matthias Lindemann, Phillip Lippe et al.
LASeR: Towards Diversified and Generalizable Robot Design with Large Language Models
Junru Song, Yang Yang, Huan Xiao et al.
Seeking and Updating with Live Visual Knowledge
Mingyang Fu, Yuyang Peng, Dongping Chen et al.
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs
Yunheng Li, Jing Cheng, Shaoyong Jia et al.
MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark
Junjie Xing, Yeye He, Mengyu Zhou et al.
Robust Multimodal Survival Prediction with Conditional Latent Differentiation Variational AutoEncoder
Junjie Zhou, Jiao Tang, Yingli Zuo et al.
Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations
Ahmad Rahimi, Po-Chien Luan, Yuejiang Liu et al.
Error Bounds for Gaussian Process Regression Under Bounded Support Noise with Applications to Safety Certification
Robert Reed, Luca Laurenti, Morteza Lahijanian
Object-Shot Enhanced Grounding Network for Egocentric Video
Yisen Feng, Haoyu Zhang, Meng Liu et al.
AVerImaTeC: A Dataset for Automatic Verification of Image-Text Claims with Evidence from the Web
Rui Cao, Zifeng Ding, Zhijiang Guo et al.
Co-op: Correspondence-based Novel Object Pose Estimation
Sungphill Moon, Hyeontae Son, Dongcheol Hur et al.
TVNet: A Novel Time Series Analysis Method Based on Dynamic Convolution and 3D-Variation
Chenghan Li, Mingchen Li, Ruisheng Diao
BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models
Xingyu Zheng, Xianglong Liu, Haotong Qin et al.
Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.
Dense SAE Latents Are Features, Not Bugs
Xiaoqing Sun, Alessandro Stolfo, Joshua Engels et al.
Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback
Michelle Zhao, Henny Admoni, Reid Simmons et al.
Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection
Yu Li, Xingyu Qiu, Yuqian Fu et al.
Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs
Zhangyin Feng, Qianglong Chen, Ning Lu et al.
Efficient Active Imitation Learning with Random Network Distillation
Emilien Biré, Anthony Kobanda, Ludovic Denoyer et al.
HawkBench: Investigating Resilience of RAG Methods on Stratified Information-Seeking Tasks
Hongjin Qian, Zheng Liu, Chao Gao et al.
Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition
Andy Zou, Maxwell Lin, Eliot Jones et al.
Variational Regularized Unbalanced Optimal Transport: Single Network, Least Action
Yuhao Sun, Zhenyi Zhang, Zihan Wang et al.
COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training
Sanghwan Kim, Rui Xiao, Iuliana Georgescu et al.
Reasoning Models Hallucinate More: Factuality-Aware Reinforcement Learning for Large Reasoning Models
Junyi Li, Hwee Tou Ng
Kinetic Langevin Diffusion for Crystalline Materials Generation
François Cornet, Federico Bergamin, Arghya Bhowmik et al.
Attributing Culture-Conditioned Generations to Pretraining Corpora
Huihan Li, Arnav Goel, Keyu He et al.
Generative Pre-trained Autoregressive Diffusion Transformer
Yuan Zhang, Jiacheng Jiang, Guoqing Ma et al.
ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting
Ruijie Zhu, Mulin Yu, Linning Xu et al.
Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation
Itamar Zimerman, Ameen Ali Ali, Lior Wolf
CO-MOT: Boosting End-to-end Transformer-based Multi-Object Tracking via Coopetition Label Assignment and Shadow Sets
Feng Yan, Weixin Luo, Yujie Zhong et al.
A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking
Zixiang Zhao, Haowen Bai, Bingxin Ke et al.
InteractVLM: 3D Interaction Reasoning from 2D Foundational Models
Sai Kumar Dwivedi, Dimitrije Antić, Shashank Tripathi et al.
Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation
Shuling Zhao, Fa-Ting Hong, Xiaoshui Huang et al.
StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams
Yang Li, Jinglu Wang, Lei Chu et al.
AutoSGNN: Automatic Propagation Mechanism Discovery for Spectral Graph Neural Networks
Shibing Mo, Kai Wu, Qixuan Gao et al.
Boosting the visual interpretability of CLIP via adversarial fine-tuning
Shizhan Gong, Haoyu Lei, Qi Dou et al.
Continual Learning Using a Kernel-Based Method Over Foundation Models
Saleh Momeni, Sahisnu Mazumder, Bing Liu
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering
Rushi Qiang, Yuchen Zhuang, Yinghao Li et al.
GARF: Learning Generalizable 3D Reassembly for Real-World Fractures
Sihang Li, Zeyu Jiang, Grace Chen et al.
Training Consistent Mixture-of-Experts-Based Prompt Generator for Continual Learning
Yue Lu, Shizhou Zhang, De Cheng et al.
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
Andreas Opedal, Haruki Shirakami, Bernhard Schölkopf et al.
Zebra-Llama: Towards Extremely Efficient Hybrid Models
Mingyu Yang, Mehdi Rezagholizadeh, Guihong Li et al.
Neighborhood Self-Dissimilarity Attention for Medical Image Segmentation
Junren Chen, Rui Chen, Wei Wang et al.
Conformal Linguistic Calibration: Trading-off between Factuality and Specificity
Zhengping Jiang, Anqi Liu, Ben Van Durme
DebGCD: Debiased Learning with Distribution Guidance for Generalized Category Discovery
Yuanpei Liu, Kai Han
Expressivity of Neural Networks with Random Weights and Learned Biases
Ezekiel Williams, Alexandre Payeur, Avery Ryoo et al.
On the Completeness of Invariant Geometric Deep Learning Models
Zian Li, Xiyuan Wang, Shijia Kang et al.
Emergent Response Planning in LLMs
Zhichen Dong, Zhanhui Zhou, Zhixuan Liu et al.
Validating LLM-as-a-Judge Systems under Rating Indeterminacy
Luke Guerdan, Solon Barocas, Kenneth Holstein et al.
Depth-Bounds for Neural Networks via the Braid Arrangement
Moritz Grillo, Christoph Hertrich, Georg Loho
MMTL-UniAD: A Unified Framework for Multimodal and Multi-Task Learning in Assistive Driving Perception
Wenzhuo Liu, Wenshuo Wang, Yicheng Qiao et al.
Graph Neural Ricci Flow: Evolving Feature from a Curvature Perspective
Jialong Chen, Bowen Deng, Zhen Wang et al.
BrainOOD: Out-of-distribution Generalizable Brain Network Analysis
Jiaxing Xu, Yongqiang Chen, Xia Dong et al.
AgentRecBench: Benchmarking LLM Agent-based Personalized Recommender Systems
Yu Shang, Peijie Liu, Yuwei Yan et al.
Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning
Yanbiao Ma, Wei Dai, Wenke Huang et al.
M3amba: Memory Mamba is All You Need for Whole Slide Image Classification
Tingting Zheng, Kui Jiang, Yi Xiao et al.
Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors
Tianchun Wang, Yuanzhou Chen, Zichuan Liu et al.
VCT: Training Consistency Models with Variational Noise Coupling
Gianluigi Silvestri, Luca Ambrogioni, Chieh-Hsin Lai et al.
Vulnerability-Aware Alignment: Mitigating Uneven Forgetting in Harmful Fine-Tuning
Liang Chen, Xueting Han, Li Shen et al.
Causal LLM Routing: End-to-End Regret Minimization from Observational Data
Asterios Tsiourvas, Wei Sun, Georgia Perakis
Adaptive Gradient Clipping for Robust Federated Learning
Youssef Allouah, Rachid Guerraoui, Nirupam Gupta et al.
Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs
Zeyi Huang, Yuyang Ji, Xiaofang Wang et al.
Distance-Based Tree-Sliced Wasserstein Distance
Viet-Hoang Tran, Minh-Khoi Nguyen-Nhat, Trang Pham et al.
Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA
Shuangyi Chen, Yuanxin Guo, Yue Ju et al.
UniDetox: Universal Detoxification of Large Language Models via Dataset Distillation
Huimin Lu, Masaru Isonuma, Junichiro Mori et al.
Aligning Protein Conformation Ensemble Generation with Physical Feedback
Jiarui Lu, Xiaoyin Chen, Stephen Lu et al.
GRAPHGPT-O: Synergistic Multimodal Comprehension and Generation on Graphs
Yi Fang, Bowen Jin, Jiacheng Shen et al.
Spatial Understanding from Videos: Structured Prompts Meet Simulation Data
Haoyu Zhang, Meng Liu, Zaijing Li et al.
Feature Responsiveness Scores: Model-Agnostic Explanations for Recourse
Seung Hyun Cheon, Anneke Wernerfelt, Sorelle Friedler et al.
MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism
Zhixiong Nan, Xianghong Li, Tao Xiang et al.
Integral Imprecise Probability Metrics
Siu Lun (Alan) Chau, Michele Caprio, Krikamol Muandet
Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment
Haoyuan Wu, Haisheng Zheng, Yuan Pu et al.
Sequential Conditional Transport on Probabilistic Graphs for Interpretable Counterfactual Fairness
Agathe Fernandes Machado, Arthur Charpentier, Ewen Gallic
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization
Mingzhe Du, Anh Tuan Luu, Yue Liu et al.
AtomSurf: Surface Representation for Learning on Protein Structures
Vincent Mallet, Yangyang Miao, Souhaib Attaiki et al.
SubTrack++: Gradient Subspace Tracking for Scalable LLM Training
Sahar Rajabi, Nayeema Nonta, Sirisha Rambhatla
Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation
Zihan Wang, Seungjun Lee, Gim Hee Lee
Can Students Beyond the Teacher? Distilling Knowledge from Teacher’s Bias
Jianhua Zhang, Yi Gao, Ruyu Liu et al.
Provable Maximum Entropy Manifold Exploration via Diffusion Models
Riccardo De Santi, Marin Vlastelica, Ya-Ping Hsieh et al.
Routing Experts: Learning to Route Dynamic Experts in Existing Multi-modal Large Language Models
Qiong Wu, Zhaoxi Ke, Yiyi Zhou et al.
BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments
Xinghao Wang, Pengyu Wang, Bo Wang et al.
Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation
Chen Dun, Mirian Del Carmen Hipolito Garcia, Guoqing Zheng et al.
AVF-MAE++: Scaling Affective Video Facial Masked Autoencoders via Efficient Audio-Visual Self-Supervised Learning
Xuecheng Wu, Heli Sun, Yifan Wang et al.
Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models
Zichen Miao, Wei Chen, Qiang Qiu
Stochastic Online Instrumental Variable Regression: Regrets for Endogeneity and Bandit Feedback
Riccardo Della Vecchia, Debabrota Basu
MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity
Kanghyun Choi, Hyeyoon Lee, Dain Kwon et al.
Generalized Dimension Reduction Using Semi-Relaxed Gromov-Wasserstein Distance
Ranthony A. Clark, Tom Needham, Thomas Weighill
Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment
Jinhao Jiang, Junyi Li, Xin Zhao et al.
Dense Video Object Captioning from Disjoint Supervision
Xingyi Zhou, Anurag Arnab, Chen Sun et al.
Controllable Satellite-to-Street-View Synthesis with Precise Pose Alignment and Zero-Shot Environmental Control
Xianghui Ze, Zhenbo Song, Qiwei Wang et al.
Improving Language Model Distillation through Hidden State Matching
Sayantan Dasgupta, Trevor Cohn
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models
Byung-Kwan Lee, Ryo Hachiuma, Yu-Chiang Frank Wang et al.
AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos
Felix Wimbauer, Weirong Chen, Dominik Muhle et al.
TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models
Ruidong Chen, Honglin Guo, Lanjun Wang et al.
Understanding Individual Agent Importance in Multi-Agent System via Counterfactual Reasoning
Jianming Chen, Yawen Wang, Junjie Wang et al.
LazyMAR: Accelerating Masked Autoregressive Models via Feature Caching
Feihong Yan, Qingyan Wei, Jiayi Tang et al.
SVasP: Self-Versatility Adversarial Style Perturbation for Cross-Domain Few-Shot Learning
Wenqian Li, Pengfei Fang, Hui Xue
FedAWA: Adaptive Optimization of Aggregation Weights in Federated Learning Using Client Vectors
Changlong Shi, He Zhao, Bingjie Zhang et al.
Loosely Synchronized Rule-Based Planning for Multi-Agent Path Finding with Asynchronous Actions
Shuai Zhou, Shizhe Zhao, Zhongqiang Ren
MultiVENT 2.0: A Massive Multilingual Benchmark for Event-Centric Video Retrieval
Reno Kriz, Kate Sanders, David Etter et al.
MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent
Xinyao Liao, Xianfang Zeng, Liao Wang et al.
On the Transfer of Object-Centric Representation Learning
Aniket Rajiv Didolkar, Andrii Zadaianchuk, Anirudh Goyal et al.
Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections
Xiaomeng Xu, Yifan Hou, Zeyi Liu et al.
Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective
Ruichen Shao, Bei Li, Gangao Liu et al.
The emergence of sparse attention: impact of data distribution and benefits of repetition
Nicolas Zucchet, Francesco D'Angelo, Andrew Lampinen et al.
Int2Planner: An Intention-based Multi-modal Motion Planner for Integrated Prediction and Planning
Xiaolei Chen, Junchi Yan, Wenlong Liao et al.
Safe Planner: Empowering Safety Awareness in Large Pre-Trained Models for Robot Task Planning
Siyuan Li, Feifan Liu, Lingfei Cui et al.
OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images
Ziyue Huang, Yongchao Feng, Ziqi Liu et al.
Discovering Fine-Grained Visual-Concept Relations by Disentangled Optimal Transport Concept Bottleneck Models
Yan Xie, Zequn Zeng, Hao Zhang et al.
Instruction-Augmented Long-Horizon Planning: Embedding Grounding Mechanisms in Embodied Mobile Manipulation
Fangyuan Wang, Shipeng Lyu, Peng Zhou et al.
CREIMBO: Cross-Regional Ensemble Interactions in Multi-view Brain Observations
Noga Mudrik, Ryan Ly, Oliver Ruebel et al.
AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations
Junli Liu, Qizhi Chen, Zhigang Wang et al.
HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location
Ting Sun, Penghan Wang, Fan Lai
Enhancing Target-unspecific Tasks through a Features Matrix
Fangming Cui, Yonggang Zhang, Xuan Wang et al.
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL
Ghada Sokar, Johan S Obando Ceron, Aaron Courville et al.
CWNet: Causal Wavelet Network for Low-Light Image Enhancement
Tongshun Zhang, Pingping Liu, Yubing Lu et al.
SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model
Shuhan Tan, John Wheatley Lambert, Hong Jeon et al.