Most Cited 2025 "key-value pair reuse" Papers
22,274 papers found • Page 22 of 112
QCS: Feature Refining from Quadruplet Cross Similarity for Facial Expression Recognition
Chengpeng Wang, Li Chen, Lili Wang et al.
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
Enshu Liu, Junyi Zhu, Zinan Lin et al.
LotteryCodec: Searching the Implicit Representation in a Random Network for Low-Complexity Image Compression
Haotian Wu, Gongpu Chen, Pier Luigi Dragotti et al.
VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models
Kim Sung-Bin, Jeongsoo Choi, Puyuan Peng et al.
EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
Kaizhi Zheng, Xiaotong Chen, Xuehai He et al.
Gaussian-Det: Learning Closed-Surface Gaussians for 3D Object Detection
Hongru Yan, Yu Zheng, Yueqi Duan
BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals
Qinfan Xiao, Ziyun Cui, Chi Zhang et al.
Learning-Augmented Search Data Structures
Chunkai Fu, Brandon G. Nguyen, Jung Seo et al.
Reference-Based 3D-Aware Image Editing with Triplanes
Bahri Batuhan Bilecen, Yiğit Yalın, Ning Yu et al.
VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment
Shangkun Sun, Xiaoyu Liang, Songlin Fan et al.
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems
Xuanming Zhang, Yuxuan Chen, Samuel (Min-Hsuan) Yeh et al.
Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models
Daoyuan Chen, Yilun Huang, Xuchen Pan et al.
HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning
Zhi Jing, Siyuan Yang, Jicong Ao et al.
StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces
Kyeongmin Yeo, Jaihoon Kim, Minhyuk Sung
Precedence-Constrained Winter Value for Effective Graph Data Valuation
Hongliang Chi, Wei Jin, Charu Aggarwal et al.
Týr-the-Pruner: Structural Pruning LLMs via Global Sparsity Distribution Optimization
Guanchen Li, Yixing Xu, Zeping Li et al.
Dual Conditioned Motion Diffusion for Pose-Based Video Anomaly Detection
Hongsong Wang, Andi Xu, Pinle Ding et al.
LDP: Generalizing to Multilingual Visual Information Extraction by Language Decoupled Pretraining
Huawen Shen, Gengluo Li, Jinwen Zhong et al.
ProcTag: Process Tagging for Assessing the Efficacy of Document Instruction Data
Yufan Shen, Chuwei Luo, Zhaoqing Zhu et al.
Student-Informed Teacher Training
Nico Messikommer, Jiaxu Xing, Elie Aljalbout et al.
Hearing Anywhere in Any Environment
Xiulong Liu, Anurag Kumar, Paul Calamia et al.
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs
Yunheng Li, Jing Cheng, Shaoyong Jia et al.
ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding
Guangda Ji, Silvan Weder, Francis Engelmann et al.
DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation
Jiashuo Sun, Xianrui Zhong, Sizhe Zhou et al.
StreamForest: Efficient Online Video Understanding with Persistent Event Memory
Xiangyu Zeng, Kefan Qiu, Qingyu Zhang et al.
Dense SAE Latents Are Features, Not Bugs
Xiaoqing Sun, Alessandro Stolfo, Joshua Engels et al.
Learning Distances from Data with Normalizing Flows and Score Matching
Peter Sorrenson, Daniel Behrend-Uriarte, Christoph Schnörr et al.
Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations
Lucy Farnik, Tim Lawson, Conor Houghton et al.
Rethinking Bimanual Robotic Manipulation: Learning with Decoupled Interaction Framework
Jian-Jian Jiang, Xiao-Ming Wu, Yi-Xiang He et al.
ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions
Tomas Soucek, Prajwal Gatti, Michael Wray et al.
Language-Guided Audio-Visual Learning for Long-Term Sports Assessment
Huangbiao Xu, Xiao Ke, Huanqi Wu et al.
ZAPBench: A Benchmark for Whole-Brain Activity Prediction in Zebrafish
Jan-Matthis Lueckmann, Alexander Immer, Alex Chen et al.
Spreading Out-of-Distribution Detection on Graphs
Daeho Um, Jongin Lim, Sunoh Kim et al.
EOV-Seg: Efficient Open-Vocabulary Panoptic Segmentation
Hongwei Niu, Jie Hu, Jianghang Lin et al.
Planning from Imagination: Episodic Simulation and Episodic Memory for Vision-and-Language Navigation
Yiyuan Pan, Yunzhe Xu, Zhe Liu et al.
Denoising Functional Maps: Diffusion Models for Shape Correspondence
Aleksei Zhuravlev, Zorah Lähner, Vladislav Golyanik
Parameter Efficient Fine-tuning via Explained Variance Adaptation
Fabian Paischer, Lukas Hauzenberger, Thomas Schmied et al.
Boosting ViT-based MRI Reconstruction from the Perspectives of Frequency Modulation, Spatial Purification, and Scale Diversification
Yucong Meng, Zhiwei Yang, Yonghong Shi et al.
Binarized Mamba-Transformer for Lightweight Quad Bayer HybridEVS Demosaicing
Shiyang Zhou, Haijin Zeng, Yunfan Lu et al.
NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images
Lingen Li, Zhaoyang Zhang, Yaowei Li et al.
Ultra-High Resolution Segmentation via Boundary-Enhanced Patch-Merging Transformer
Haopeng Sun, Yingwei Zhang, Lumin Xu et al.
Functionality Understanding and Segmentation in 3D Scenes
Jaime Corsetti, Francesco Giuliari, Alice Fasoli et al.
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
Yudong Jin, Sida Peng, Xuan Wang et al.
Decouple and Track: Benchmarking and Improving Video Diffusion Transformers For Motion Transfer
Qingyu Shi, Jianzong Wu, Jinbin Bai et al.
SpotActor: Training-Free Layout-Controlled Consistent Image Generation
Jiahao Wang, Caixia Yan, Weizhan Zhang et al.
Language Models over Canonical Byte-Pair Encodings
Tim Vieira, Tianyu Liu, Clemente Pasti et al.
DexGarmentLab: Dexterous Garment Manipulation Environment with Generalizable Policy
Yuran Wang, Ruihai Wu, Yue Chen et al.
SimMotionEdit: Text-Based Human Motion Editing with Motion Similarity Prediction
Zhengyuan Li, Kai Cheng, Anindita Ghosh et al.
AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios
Ziming Huang, Xurui Li, Haotian Liu et al.
Causal Discovery from Conditionally Stationary Time Series
Carles Balsells-Rodas, Xavier Sumba, Tanmayee Narendra et al.
Intrinsic Dimension Correlation: uncovering nonlinear connections in multimodal representations
Lorenzo Basile, Santiago Acevedo, Luca Bortolussi et al.
Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness
Rongzhe Wei, Peizhi Niu, Hans Hao-Hsun Hsu et al.
Exploring Visual Vulnerabilities via Multi-Loss Adversarial Search for Jailbreaking Vision-Language Models
Shuyang Hao, Bryan Hooi, Jun Liu et al.
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution
Gene Chou, Wenqi Xian, Guandao Yang et al.
Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding
Andong Deng, Zhongpai Gao, Anwesa Choudhuri et al.
RCTrans: Radar-Camera Transformer via Radar Densifier and Sequential Decoder for 3D Object Detection
Yiheng Li, Yang Yang, Zhen Lei
Exploring Simple Open-Vocabulary Semantic Segmentation
Zihang Lai
Tracing the Representation Geometry of Language Models from Pretraining to Post-training
Melody Li, Kumar Krishna Agrawal, Arna Ghosh et al.
The Persistence of Neural Collapse Despite Low-Rank Bias
Connall Garrod, Jonathan Keating
Probing Equivariance and Symmetry Breaking in Convolutional Networks
Sharvaree Vadgama, Mohammad Islam, Domas Buracas et al.
Chiron-o1: Igniting Multimodal Large Language Models towards Generalizable Medical Reasoning via Mentor-Intern Collaborative Search
Haoran Sun, Yankai Jiang, Wenjie Lou et al.
No-Regret is not enough! Bandits with General Constraints through Adaptive Regret Minimization
Martino Bernasconi, Matteo Castiglioni, Andrea Celli
Pamba: Enhancing Global Interaction in Point Clouds via State Space Model
Zhuoyuan Li, Yubo Ai, Jiahao Lu et al.
Asymmetric Visual Semantic Embedding Framework for Efficient Vision-Language Alignment
Yang Liu, Mengyuan Liu, Shudong Huang et al.
NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors
Yanrui Bin, Wenbo Hu, Haoyuan Wang et al.
TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state
Xiaowen Ma, Zhen-Liang Ni, Shuai Xiao et al.
Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning
Guozheng Ma, Lu Li, Zilin Wang et al.
Lift3D Policy: Lifting 2D Foundation Models for Robust 3D Robotic Manipulation
Yueru Jia, Jiaming Liu, Sixiang Chen et al.
Prediction-Feedback DETR for Temporal Action Detection
Jihwan Kim, Miso Lee, Cheol-Ho Cho et al.
Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents
Han Lin, Jaemin Cho, Amir Zadeh et al.
AnyCalib: On-Manifold Learning for Model-Agnostic Single-View Camera Calibration
Javier Tirado-Garín, Javier Civera
Event-Enhanced Blurry Video Super-Resolution
Dachun Kai, Yueyi Zhang, Jin Wang et al.
Jailbreak-AudioBench: In-Depth Evaluation and Analysis of Jailbreak Threats for Large Audio Language Models
Hao Cheng, Erjia Xiao, Jing Shao et al.
CL-Splats: Continual Learning of Gaussian Splatting with Local Optimization
Jan Ackermann, Jonas Kulhanek, Shengqu Cai et al.
Active Fine-Tuning of Multi-Task Policies
Marco Bagatella, Jonas Hübotter, Georg Martius et al.
Position: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process
Jing Yang
Order-Robust Class Incremental Learning: Graph-Driven Dynamic Similarity Grouping
Guannan Lai, Yujie Li, Xiangkun Wang et al.
EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization
Mujin Cheon, Jay Lee, Dong-Yeun Koh et al.
AutoURDF: Unsupervised Robot Modeling from Point Cloud Frames Using Cluster Registration
Jiong Lin, Lechen Zhang, Kwansoo Lee et al.
Linear combinations of latents in generative models: subspaces and beyond
Erik Bodin, Alexandru Stere, Dragos Margineantu et al.
LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation
Mufei Li, Viraj Shitole, Eli Chien et al.
Towards Doctor-Like Reasoning: Medical RAG Fusing Knowledge with Patient Analogy through Textual Gradients
Yuxing Lu, Gecheng Fu, Wei Wu et al.
Volume Optimality in Conformal Prediction with Structured Prediction Sets
Chao Gao, Liren Shan, Vaidehi Srinivas et al.
Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning
Yang You, Yixin Li, Congyue Deng et al.
CLIP Under the Microscope: A Fine-Grained Analysis of Multi-Object Representation
Reza Abbasi, Ali Nazari, Aminreza Sefid et al.
SteerConf: Steering LLMs for Confidence Elicitation
Ziang Zhou, Tianyuan Jin, Jieming Shi et al.
Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs
Yaniv Nikankin, Dana Arad, Yossi Gandelsman et al.
From Sparse to Dense: Camera Relocalization with Scene-Specific Detector from Feature Gaussian Splatting
Zhiwei Huang, Hailin Yu, Yichun Shentu et al.
Toward Efficient Kernel-Based Solvers for Nonlinear PDEs
Zhitong Xu, Da Long, Yiming Xu et al.
QT-DoG: Quantization-Aware Training for Domain Generalization
Saqib Javed, Hieu Le, Mathieu Salzmann
Grounding Language with Vision: A Conditional Mutual Information Calibrated Decoding Strategy for Reducing Hallucinations in LVLMs
Hao Fang, Changle Zhou, Jiawei Kong et al.
LibriBrain: Over 50 Hours of Within-Subject MEG to Improve Speech Decoding Methods at Scale
Miran Özdogan, Gilad Landau, Gereon Elvers et al.
On scalable and efficient training of diffusion samplers
Minkyu Kim, Kiyoung Seong, Dongyeop Woo et al.
T2V-OptJail: Discrete Prompt Optimization for Text-to-Video Jailbreak Attacks
Jiayang Liu, Siyuan Liang, Shiqian Zhao et al.
Understanding and Mitigating Memorization in Diffusion Models for Tabular Data
Zhengyu Fang, Zhimeng Jiang, Huiyuan Chen et al.
Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training
Will Merrill, Shane Arora, Dirk Groeneveld et al.
MOS: Modeling Object-Scene Associations in Generalized Category Discovery
Zhengyuan Peng, Jinpeng Ma, Zhimin Sun et al.
Flowing Datasets with Wasserstein over Wasserstein Gradient Flows
Clément Bonet, Christophe Vauthier, Anna Korba
BrainACTIV: Identifying visuo-semantic properties driving cortical selectivity using diffusion-based image manipulation
Diego García Cerdas, Christina Sartzetaki, Magnus Petersen et al.
Auto-Regressive Diffusion for Generating 3D Human-Object Interactions
Zichen Geng, Zeeshan Hayder, Wei Liu et al.
ELICIT: LLM Augmentation Via External In-context Capability
Futing Wang, Jianhao (Elliott) Yan, Yue Zhang et al.
Task Generalization with Autoregressive Compositional Structure: Can Learning from $D$ Tasks Generalize to $D^T$ Tasks?
Amirhesam Abedsoltan, Huaqing Zhang, Kaiyue Wen et al.
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers
Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag
Probabilistic Learning to Defer: Handling Missing Expert Annotations and Controlling Workload Distribution
Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro
Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective
Yiming Liu, Kezhao Liu, Yao Xiao et al.
Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations
Brian Zheng, Alisa Liu, Orevaoghene Ahia et al.
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Rui Dai, Sile Hu, Xu Shen et al.
DiffRetouch: Using Diffusion to Retouch on the Shoulder of Experts
Zheng-Peng Duan, Jiawei Zhang, Zheng Lin et al.
ProxSparse: Regularized Learning of Semi-Structured Sparsity Masks for Pretrained LLMs
Hongyi Liu, Rajarshi Saha, Zhen Jia et al.
Split Gibbs Discrete Diffusion Posterior Sampling
Wenda Chu, Zihui Wu, Yifan Chen et al.
Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence
Shaopeng Fu, Liang Ding, Jingfeng Zhang et al.
Uncertain Multimodal Intention and Emotion Understanding in the Wild
Qu Yang, QingHongYa Shi, Tongxin Wang et al.
H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving
Siran Chen, Yuxiao Luo, Yue Ma et al.
Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data
Yucheng Shi, Quanzheng Li, Jin Sun et al.
WildSeg3D: Segment Any 3D Objects in the Wild from 2D Images
Yansong Guo, Jie Hu, Yansong Qu et al.
Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models
Xingzhuo Guo, Yu Zhang, Baixu Chen et al.
VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment
Darshana Saravanan, Varun Gupta, Darshan Singh S et al.
Active Task Disambiguation with LLMs
Katarzyna Kobalczyk, Nicolás Astorga, Tennison Liu et al.
Mask in the Mirror: Implicit Sparsification
Tom Jacobs, Rebekka Burkholz
Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models
Benjamin Walker, Lingyi Yang, Nicola Muca Cirone et al.
Video Perception Models for 3D Scene Synthesis
Rui Huang, Guangyao Zhai, Zuria Bauer et al.
Text2Relight: Creative Portrait Relighting with Text Guidance
Junuk Cha, Mengwei Ren, Krishna Kumar Singh et al.
WaterDiffusion: Learning a Prior-involved Unrolling Diffusion for Joint Underwater Saliency Detection and Visual Restoration
Laibin Chang, Yunke Wang, Longxiang Deng et al.
Understanding the Limits of Deep Tabular Methods with Temporal Shift
Haorun Cai, Han-Jia Ye
DEALing with Image Reconstruction: Deep Attentive Least Squares
Mehrsa Pourya, Erich Kobler, Michael Unser et al.
Estimating Model Performance Under Covariate Shift Without Labels
Jakub Białek, Juhani Kivimäki, Wojciech Kuberski et al.
PhysAug: A Physical-guided and Frequency-based Data Augmentation for Single-Domain Generalized Object Detection
Xiaoran Xu, Jiangang Yang, Wenhui Shi et al.
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning
Ali Taghibakhshi, Sharath Turuvekere Sreenivas, Saurav Muralidharan et al.
Scaffolding Dexterous Manipulation with Vision-Language Models
Vincent de Bakker, Joey Hejna, Tyler Lum et al.
Improving Gaussian Splatting with Localized Points Management
Haosen Yang, Chenhao Zhang, Wenqing Wang et al.
CSformer: Combining Channel Independence and Mixing for Robust Multivariate Time Series Forecasting
Haoxin Wang, Yipeng Mo, Kunlan Xiang et al.
TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types
Jiankang Chen, Tianke Zhang, Changyi Liu et al.
Dissecting Generalized Category Discovery: Multiplex Consensus under Self-Deconstruction
Luyao Tang, Kunze Huang, Yuxuan Yuan et al.
Exploit Your Latents: Coarse-Grained Protein Backmapping with Latent Diffusion Models
Rongchao Zhang, Yu Huang, Yiwei Lou et al.
DualCP: Rehearsal-Free Domain-Incremental Learning via Dual-Level Concept Prototype
Qiang Wang, Yuhang He, Songlin Dong et al.
Are Expressive Models Truly Necessary for Offline RL?
Guan Wang, Haoyi Niu, Jianxiong Li et al.
Generating Multimodal Driving Scenes via Next-Scene Prediction
Yanhao Wu, Haoyang Zhang, Tianwei Lin et al.
ProtoArgNet: Interpretable Image Classification with Super-Prototypes and Argumentation
Hamed Ayoobi, Nico Potyka, Francesca Toni
Bayesian Experimental Design Via Contrastive Diffusions
Jacopo Iollo, Christophe Heinkelé, Pierre Alliez et al.
Attention Mechanism, Max-Affine Partition, and Universal Approximation
Hude Liu, Jerry Yao-Chieh Hu, Zhao Song et al.
Audio Super-Resolution with Latent Bridge Models
Chang Li, Zehua Chen, Liyuan Wang et al.
IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning
Jiawen Qin, Haonan Yuan, Qingyun Sun et al.
FreqDebias: Towards Generalizable Deepfake Detection via Consistency-Driven Frequency Debiasing
Hossein Kashiani, Niloufar Alipour Talemi, Fatemeh Afghah
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization
Siyuan Li, Luyuan Zhang, Zedong Wang et al.
Runtime Analysis for Multi-Objective Evolutionary Algorithms in Unbounded Integer Spaces
Benjamin Doerr, Martin S. Krejca, Günter Rudolph
Locally Convex Global Loss Network for Decision-Focused Learning
Haeun Jeon, Hyunglip Bae, Minsu Park et al.
Autocorrelation Matters: Understanding the Role of Initialization Schemes for State Space Models
Fusheng Liu, Qianxiao Li
MindSimulator: Exploring Brain Concept Localization via Synthetic fMRI
Qi Zhang, Qi Zhang, Zixuan Gong et al.
IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning
Quan Zhang, Yuxin Qi, Xi Tang et al.
Revisiting a Design Choice in Gradient Temporal Difference Learning
Xiaochi Qian, Shangtong Zhang
Decompile-Bench: Million-Scale Binary-Source Function Pairs for Real-World Binary Decompilation
Hanzhuo Tan, Xiaolong Tian, Hanrui Qi et al.
SUMI-IFL: An Information-Theoretic Framework for Image Forgery Localization with Sufficiency and Minimality Constraints
Ziqi Sheng, Wei Lu, Xiangyang Luo et al.
Why LVLMs Are More Prone to Hallucinations in Longer Responses: The Role of Context
Ge Zheng, Jiaye Qian, Jiajin Tang et al.
Understanding Adam Requires Better Rotation Dependent Assumptions
Tianyue Zhang, Lucas Maes, Alan Milligan et al.
Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning
Roger Creus Castanyer, Johan Obando Ceron, Lu Li et al.
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning
Xianhang Li, Yanqing Liu, Haoqin Tu et al.
Learning a Neural Solver for Parametric PDEs to Enhance Physics-Informed Methods
Lise Le Boudec, Emmanuel de Bézenac, Louis Serrano et al.
HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
Yu Zhou, Xingyu Wu, Jibin Wu et al.
SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization
Xiaofeng Tan, Hongsong Wang, Xin Geng et al.
Oasis: One Image is All You Need for Multimodal Instruction Data Synthesis
Letian Zhang, Quan Cui, Bingchen Zhao et al.
Unisolver: PDE-Conditional Transformers Towards Universal Neural PDE Solvers
Hang Zhou, Yuezhou Ma, Haixu Wu et al.
Hypergraph Attacks via Injecting Homogeneous Nodes into Elite Hyperedges
Meixia He, Peican Zhu, Keke Tang et al.
MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights
Jingjing Hu, Dan Guo, Zhan Si et al.
Aligning Language Models Using Follow-up Likelihood as Reward Signal
Chen Zhang, Dading Chong, Feng Jiang et al.
Prediction-Powered Causal Inferences
Riccardo Cadei, Ilker Demirel, Piersilvio De Bartolomeis et al.
Motion Modes: What Could Happen Next?
Karran Pandey, Yannick Hold-Geoffroy, Matheus Gadelha et al.
Logits DeConfusion with CLIP for Few-Shot Learning
Shuo Li, Fang Liu, Zehua Hao et al.
PICD: Versatile Perceptual Image Compression with Diffusion Rendering
Tongda Xu, Jiahao Li, Bin Li et al.
Provable Scaling Laws for the Test-Time Compute of Large Language Models
Yanxi Chen, Xuchen Pan, Yaliang Li et al.
FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation
Zhuguanyu Wu, Shihe Wang, Jiayi Zhang et al.
ConStellaration: A dataset of QI-like stellarator plasma boundaries and optimization benchmarks
Santiago Cadena, Andrea Merlo, Emanuel Laude et al.
Large Language Models Think Too Fast To Explore Effectively
Lan Pan, Hanbo Xie, Robert Wilson
Golden Cudgel Network for Real-Time Semantic Segmentation
Guoyu Yang, Yuan Wang, Daming Shi et al.
6D Object Pose Tracking in Internet Videos for Robotic Manipulation
Georgy Ponimatkin, Martin Cífka, Tomas Soucek et al.
DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning
Fucai Ke, Vijay Kumar B G, Xingjian Leng et al.
SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training
Yehonathan Refael, Guy Smorodinsky, Tom Tirer et al.
GNNs Getting ComFy: Community and Feature Similarity Guided Rewiring
Celia Rubio-Madrigal, Adarsh Jamadandi, Rebekka Burkholz
Momentum Multi-Marginal Schrödinger Bridge Matching
Panagiotis Theodoropoulos, Augustinos Saravanos, Evangelos Theodorou et al.
Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling
Xiao Li, Zekai Zhang, Xiang Li et al.
Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection
Herun Wan, Jiaying Wu, Minnan Luo et al.
Multimodal Tabular Reasoning with Privileged Structured Information
Jun-Peng Jiang, Yu Xia, Hai-Long Sun et al.
The emergence of sparse attention: impact of data distribution and benefits of repetition
Nicolas Zucchet, Francesco D'Angelo, Andrew Lampinen et al.
EVOS: Efficient Implicit Neural Training via EVOlutionary Selector
Weixiang Zhang, Shuzhao Xie, Chengwei Ren et al.
EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?
Yuqian Yuan, Ronghao Dang, Long Li et al.
Diversity as a Reward: Fine-Tuning LLMs on a Mixture of Domain-Undetermined Data
Zhenqing Ling, Daoyuan Chen, Liuyi Yao et al.
Where, What, Why: Towards Explainable Driver Attention Prediction
Yuchen Zhou, Jiayu Tang, Xiaoyan Xiao et al.
End-to-end Learning of Gaussian Mixture Priors for Diffusion Sampler
Denis Blessing, Xiaogang Jia, Gerhard Neumann
HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting
Jingyu Lin, Jiaqi Gu, Lubin Fan et al.
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models
Jun Zhang, Jue Wang, Huan Li et al.
Cached Multi-Lora Composition for Multi-Concept Image Generation
Xiandong Zou, Mingzhu Shen, Christos-Savvas Bouganis et al.
CLIPDrag: Combining Text-based and Drag-based Instructions for Image Editing
Ziqi Jiang, Zhen Wang, Long Chen
BrainOOD: Out-of-distribution Generalizable Brain Network Analysis
Jiaxing Xu, Yongqiang Chen, Xia Dong et al.
Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval
Guangyuan Ma, Yongliang Ma, Xing Wu et al.
Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models
Itay Benou, Tammy Riklin Raviv
VALLR: Visual ASR Language Model for Lip Reading
Marshall Thomas, Edward Fish, Richard Bowden
Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection
Marc-Antoine Lavoie, Anas Mahmoud, Steven L. Waslander
Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks
Steffen Schotthöfer, Lexie Yang, Stefan Schnake
MMTL-UniAD: A Unified Framework for Multimodal and Multi-Task Learning in Assistive Driving Perception
Wenzhuo Liu, Wenshuo Wang, Yicheng Qiao et al.
Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning
Yonghao Liu, Mengyu Li, Wei Pang et al.
Prompt-CAM: Making Vision Transformers Interpretable for Fine-Grained Analysis
Arpita Chowdhury, Dipanjyoti Paul, Zheda Mai et al.