Most Cited 2025 "benchmark failure modes" Papers
22,274 papers found • Page 26 of 112
Conference
FactorGCL: A Hypergraph-Based Factor Model with Temporal Residual Contrastive Learning for Stock Returns Prediction
Yitong Duan, Weiran Wang, Jian Li
RadarQA: Multi-modal Quality Analysis of Weather Radar Forecasts
Xuming He, Zhiyuan You, Junchao Gong et al.
Calibrating Expressions of Certainty
Peiqi Wang, Barbara Lam, Yingcheng Liu et al.
Convergence of Clipped SGD on Convex $(L_0,L_1)$-Smooth Functions
Ofir Gaash, Kfir Y. Levy, Yair Carmon
PSMGD: Periodic Stochastic Multi-Gradient Descent for Fast Multi-Objective Optimization
Mingjing Xu, Peizhong Ju, Jia Liu et al.
How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions
Tal Herman, Guy Rothblum
Through the Dual-Prism: A Spectral Perspective on Graph Data Augmentation for Graph Classifications
Yutong Xia, Runpeng Yu, Yuxuan Liang et al.
Strategic Classification With Externalities
Safwan Hossain, Evi Micha, Yiling Chen et al.
Interpretable Generative Models through Post-hoc Concept Bottlenecks
Akshay R. Kulkarni, Ge Yan, Chung-En Sun et al.
Cluster Based Heterogeneous Federated Foundation Model Adaptation and Fine-Tuning
Xianda Wang, Yaqi Qiao, Duo Wu et al.
Reinforcement learning with combinatorial actions for coupled restless bandits
Lily Xu, Bryan Wilder, Elias Khalil et al.
CoPRA: Bridging Cross-domain Pretrained Sequence Models with Complex Structures for Protein-RNA Binding Affinity Prediction
Rong Han, Xiaohong Liu, Tong Pan et al.
Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization
Vladimir Boza, Vladimir Macko
HeterGP: Bridging Heterogeneity in Graph Neural Networks with Multi-View Prompting
Fengyu Yan, Xiaobao Wang, Dongxiao He et al.
Latent Radiance Fields with 3D-aware 2D Representations
Chaoyi Zhou, Xi Liu, Feng Luo et al.
Novel View Synthesis with Pixel-Space Diffusion Models
Noam Elata, Bahjat Kawar, Yaron Ostrovsky-Berman et al.
The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition
Shuai Yuan, Xingshuo Han, Hongwei Li et al.
Improved Balanced Classification with Theoretically Grounded Loss Functions
Corinna Cortes, Mehryar Mohri, Yutao Zhong
Spiral: Semantic-Aware Progressive LiDAR Scene Generation and Understanding
Dekai Zhu, Yixuan Hu, Youquan Liu et al.
Towards Learnable Anchor for Deep Multi-View Clustering
Bocheng Wang, Chusheng Zeng, Mulin Chen et al.
Distilled Prompt Learning for Incomplete Multimodal Survival Prediction
Yingxue Xu, Fengtao ZHOU, Chenyu Zhao et al.
When Are Concepts Erased From Diffusion Models?
Kevin Lu, Nicky Kriplani, Rohit Gandikota et al.
Severing Spurious Correlations with Data Pruning
Varun Mulchandani, Jung-Eun Kim
Bridging Molecular Graphs and Large Language Models
Runze Wang, Mingqi Yang, Yanming Shen
Time-o1: Time-Series Forecasting Needs Transformed Label Alignment
Hao Wang, Licheng Pan, Zhichao Chen et al.
Uncertainty Quantification with the Empirical Neural Tangent Kernel
Joseph Wilson, Chris van der Heide, Liam Hodgkinson et al.
Accelerating 3D Molecule Generation via Jointly Geometric Optimal Transport
Haokai Hong, Wanyu LIN, KC Tan
Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity
Wentao Guo, Jikai Long, Yimeng Zeng et al.
Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning
Qi Wang, Zhipeng Zhang, Baao Xie et al.
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
Chuhan ZHANG, Chaoyang Zhu, Pingcheng Dong et al.
GARLIC: GPT-Augmented Reinforcement Learning with Intelligent Control for Vehicle Dispatching
Xiao Han, Zijian Zhang, Xiangyu Zhao et al.
MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs
Tommaso Mencattini, Adrian Robert Minut, Donato Crisostomi et al.
Shape it Up! Restoring LLM Safety during Finetuning
ShengYun Peng, Pin-Yu Chen, Jianfeng Chi et al.
On Generalization Across Environments In Multi-Objective Reinforcement Learning
Jayden Teoh, Pradeep Varakantham, Peter Vamplew
Bayesian WeakS-to-Strong from Text Classification to Generation
Ziyun Cui, Ziyang Zhang, Guangzhi Sun et al.
Learning Normal Flow Directly From Events
Dehao Yuan, Levi Burner, Jiayi Wu et al.
SweetTok: Semantic-Aware Spatial-Temporal Tokenizer for Compact Video Discretization
Zhentao Tan, Ben Xue, Jian Jia et al.
Exact Expressive Power of Transformers with Padding
Will Merrill, Ashish Sabharwal
Learning to Communicate Through Implicit Communication Channels
Han Wang, Binbin Chen, zhang et al.
Rectifying Magnitude Neglect in Linear Attention
Qihang Fan, Huaibo Huang, Yuang Ai et al.
Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation
Kaining Ying, Henghui Ding, Guangquan Jie et al.
Pushing the Limits of All-Atom Geometric Graph Neural Networks: Pre-Training, Scaling, and Zero-Shot Transfer
Zihan Pengmei, Zhengyuan Shen, Zichen Wang et al.
BiggerGait: Unlocking Gait Recognition with Layer-wise Representations from Large Vision Models
Dingqiang Ye, Chao Fan, Zhanbo Huang et al.
From Probability to Counterfactuals: the Increasing Complexity of Satisfiability in Pearl's Causal Hierarchy
Julian Dörfler, Benito van der Zander, Markus Bläser et al.
Reverse Diffusion Sequential Monte Carlo Samplers
Luhuan Wu, Yi Han, Christian Andersson Naesseth et al.
Multilevel neural simulation-based inference
Yuga Hikida, Ayush Bharti, Niall Jeffrey et al.
VLMaterial: Procedural Material Generation with Large Vision-Language Models
Beichen Li, Rundi Wu, Armando Solar-Lezama et al.
InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
Jinlai Liu, Jian Han, Bin Yan et al.
Lay2Story: Extending Diffusion Transformers for Layout-Togglable Story Generation
Ao Ma, Jiasong Feng, Ke Cao et al.
Leveraging Attention to Effectively Compress Prompts for Long-Context LLMs
Yunlong Zhao, Haoran Wu, Bo Xu
AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward
Haonan Han, Xiangzuo Wu, Huan Liao et al.
STAR: Stability-Inducing Weight Perturbation for Continual Learning
Masih Eskandar, Tooba Imtiaz, Davin Hill et al.
Revisiting Source-Free Domain Adaptation: a New Perspective via Uncertainty Control
Gezheng Xu, Hui GUO, Li Yi et al.
Rethinking Fair Representation Learning for Performance-Sensitive Tasks
Charles Jones, Fabio De Sousa Ribeiro, Mélanie Roschewitz et al.
SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision
Kangjie Zheng, Siyue Liang, Junwei Yang et al.
Free360: Layered Gaussian Splatting for Unbounded 360-Degree View Synthesis from Extremely Sparse and Unposed Views
Chong Bao, Xiyu Zhang, Zehao Yu et al.
Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study
Xingxuan Zhang, Haoran Wang, Jiansheng Li et al.
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
Juan A. Rodriguez, Xiangru Jian, Siba Smarak Panigrahi et al.
4Deform: Neural Surface Deformation for Robust Shape Interpolation
Lu Sang, Zehranaz Canfes, Dongliang Cao et al.
CaO2: Rectifying Inconsistencies in Diffusion-Based Dataset Distillation
Haoxuan Wang, Zhenghao Zhao, Junyi Wu et al.
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li, Zichun Yu, Chenyan Xiong
Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion
Minkyoung Cho, Yulong Cao, Jiachen Sun et al.
Improving Multimodal Learning via Imbalanced Learning
Shicai Wei, Chunbo Luo, Yang Luo
GLoRa: A Benchmark to Evaluate the Ability to Learn Long-Range Dependencies in Graphs
Dongzhuoran Zhou, Evgeny Kharlamov, Egor Kostylev
RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs
Jiaxing Wu, Lin Ning, Luyang Liu et al.
Learning Diffusion Models with Flexible Representation Guidance
Chenyu Wang, Cai Zhou, Sharut Gupta et al.
HERO: Human Reaction Generation from Videos
Chengjun Yu, Wei Zhai, Yuhang Yang et al.
PerLDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Model
Jinhua Zhang, Hualian Sheng, Sijia Cai et al.
DecoupledGaussian: Object-Scene Decoupling for Physics-Based Interaction
Miaowei Wang, Yibo Zhang, Rui Ma et al.
Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
Klaus-Rudolf Kladny, Bernhard Schölkopf, Michael Muehlebach
Few-Shot, No Problem: Descriptive Continual Relation Extraction
Nguyen Xuan Thanh, Anh Duc Le, Quyen Tran et al.
ChartSketcher: Reasoning with Multimodal Feedback and Reflection for Chart Understanding
Muye Huang, Lingling Zhang, Jie Ma et al.
Infer Human’s Intentions Before Following Natural Language Instructions
Yanming Wan, Yue Wu, Yiping Wang et al.
Multi-party Collaborative Attention Control for Image Customization
Han Yang, Chuanguang Yang, Qiuli Wang et al.
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao, Shiqian Su, Xizhou Zhu et al.
LiteSearch: Efficient Tree Search with Dynamic Exploration Budget for Math Reasoning
Ante Wang, Linfeng Song, Ye Tian et al.
SEPARATE: A Simple Low-rank Projection for Gradient Compression in Modern Large-scale Model Training Process
Hanzhen Zhao, Xingyu Xie, Cong Fang et al.
From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport
Quentin Bouniot, Ievgen Redko, Anton Mallasto et al.
MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe et al.
Model-Free Offline Reinforcement Learning with Enhanced Robustness
Chi Zhang, Zain Ulabedeen Farhat, George Atia et al.
Hardware-Rasterized Ray-Based Gaussian Splatting
Samuel Rota Bulò, Lorenzo Porzi, Nemanja Bartolovic et al.
Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization
Jianing Wang, Yang Zhou, Xiaocheng Zhang et al.
DOLPHIN: A Programmable Framework for Scalable Neurosymbolic Learning
Aaditya Naik, Jason Liu, Claire Wang et al.
Conformal Language Model Reasoning with Coherent Factuality
Maxon Rubin-Toles, Maya Gambhir, Keshav Ramji et al.
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis, Gregor Bachmann, Yeongmin Kim et al.
MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging
Zihuan Qiu, Yi Xu, Chiyuan He et al.
Beyond FVD: An Enhanced Evaluation Metrics for Video Generation Distribution Quality
Ge Ya Luo, Gian M Favero, Zhi Hao Luo et al.
Removing Reflections from RAW Photos
Eric Kee, Adam Pikielny, Kevin Blackburn-Matzen et al.
Curly Flow Matching for Learning Non-gradient Field Dynamics
Katarina Petrović, Lazar Atanackovic, Viggo Moro et al.
PanoWan: Lifting Diffusion Video Generation Models to 360$^\circ$ with Latitude/Longitude-aware Mechanisms
Yifei Xia, Shuchen Weng, Siqi Yang et al.
Lawma: The Power of Specialization for Legal Annotation
Ricardo Dominguez-Olmedo, Vedant Nanda, Rediet Abebe et al.
Logic.py: Bridging the Gap between LLMs and Constraint Solvers
Pascal Kesseli, Peter O'Hearn, Ricardo Cabral
CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models
Dongfang Li, Zetian Sun, Xinshuo Hu et al.
BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments
Xinghao Wang, Pengyu Wang, Bo Wang et al.
CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation
Zhuoyan Luo, Yinghao Wu, Tianheng Cheng et al.
In-Context Learning Strategies Emerge Rationally
Daniel Wurgaft, Ekdeep S Lubana, Core Francisco Park et al.
HeGTa: Leveraging Heterogeneous Graph-enhanced Large Language Models for Few-shot Complex Table Understanding
Rihui Jin, Yu Li, Guilin Qi et al.
Low-Light Image Enhancement using Event-Based Illumination Estimation
Lei Sun, Yuhan Bao, Jiajun Zhai et al.
Memorize and Rank: Elevating Large Language Models for Clinical Diagnosis Prediction
Mingyu Derek Ma, Xiaoxuan Wang, Yijia Xiao et al.
Many-Objective Multi-Solution Transport
Ziyue Li, Tian Li, Virginia Smith et al.
Make Your Training Flexible: Towards Deployment-Efficient Video Models
Chenting Wang, Kunchang Li, Tianxiang Jiang et al.
Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts
Qizhou Chen, Chengyu Wang, Dakan Wang et al.
CITI: Enhancing Tool Utilizing Ability in Large Language Models Without Sacrificing General Performance
Yupu Hao, Pengfei Cao, Zhuoran Jin et al.
Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models
Linh Tran, Wei Sun, Stacy Patterson et al.
Learning Heterogeneous Tissues with Mixture of Experts for Gigapixel Whole Slide Images
Junxian Wu, Minheng Chen, Xinyi Ke et al.
Importance-Based Token Merging for Efficient Image and Video Generation
Haoyu Wu, Jingyi Xu, Hieu Le et al.
Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws
Zhixuan Pan, Shaowen Wang, Liao Pengfei et al.
Optimal Non-Asymptotic Rates of Value Iteration for Average-Reward Markov Decision Processes
Jongmin Lee, Ernest Ryu
Preference-Oriented Supervised Fine-Tuning: Favoring Target Model over Aligned Large Language Models
Yuchen Fan, Yuzhong Hong, Qiushi Wang et al.
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
Chengyu Du, Jinyi Han, Yizhou Ying et al.
Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series
Ching Chang, Jeehyun Hwang, Yidan Shi et al.
SwitchLingua: The First Large-Scale Multilingual and Multi-Ethnic Code-Switching Dataset
Peng Xie, Xingyuan Liu, Yequan Bie et al.
Semantic and Expressive Variations in Image Captions Across Languages
Andre Ye, Sebastin Santy, Jena D. Hwang et al.
NightAdapter: Learning a Frequency Adapter for Generalizable Night-time Scene Segmentation
Qi Bi, Jingjun Yi, Huimin Huang et al.
Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling
Sirui Li, Wenbin Ouyang, Yining Ma et al.
ABC-Former: Auxiliary Bimodal Cross-domain Transformer with Interactive Channel Attention for White Balance
Yu-Cheng Chiu, GUAN-RONG CHEN, Zihao Chen et al.
BEDLAM2.0: Synthetic humans and cameras in motion
Joachim Tesch, Giorgio Becherini, Prerana Achar et al.
DeblurDiff: Real-Word Image Deblurring with Generative Diffusion Models
Lingshun Kong, Jiawei Zhang, Dongqing Zou et al.
Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks
Nikolaos Tsilivis, Gal Vardi, Julia Kempe
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
Yuning Han, Bingyin Zhao, Rui Chu et al.
LNS2+RL: Combining Multi-agent Reinforcement Learning with Large Neighborhood Search in Multi-agent Path Finding
Yutong Wang, Tanishq Duhan, Jiaoyang Li et al.
SocialGesture: Delving into Multi-person Gesture Understanding
Xu Cao, Pranav Virupaksha, Wenqi Jia et al.
TFCustom: Customized Image Generation with Time-Aware Frequency Feature Guidance
Mushui Liu, Dong She, Qihan Huang et al.
Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation
Xin Yan, Yuxuan Cai, Qiuyue Wang et al.
A Generic Framework for Conformal Fairness
Aditya Vadlamani, Anutam Srinivasan, Pranav Maneriker et al.
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization
Yiyang Du, Xiaochen Wang, Chi Chen et al.
Progressive Compression with Universally Quantized Diffusion Models
Yibo Yang, Justus Will, Stephan Mandt
A Simple Approach to Unifying Diffusion-based Conditional Generation
Xirui Li, Charles Herrmann, Kelvin Chan et al.
EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding
Ege Özsoy, Arda Mamur, Felix Tristram et al.
CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features
Po-han Li, Sandeep Chinchali, ufuk topcu
ProtCLIP: Function-Informed Protein Multi-Modal Learning
Hanjing Zhou, Mingze Yin, Wei Wu et al.
MindGYM: What Matters in Question Synthesis for Thinking-Centric Fine-Tuning?
Zhe Xu, Daoyuan Chen, Zhenqing Ling et al.
Uncertainty and Influence aware Reward Model Refinement for Reinforcement Learning from Human Feedback
Zexu Sun, Yiju Guo, Yankai Lin et al.
Incomplete and Unpaired Multi-View Graph Clustering with Cross-View Feature Fusion
Liang Zhao, Ziyue Wang, Xiao Wang et al.
FedSA: A Unified Representation Learning via Semantic Anchors for Prototype-based Federated Learning
Yanbing Zhou, Xiangmou Qu, Chenlong You et al.
Specifying What You Know or Not for Multi-Label Class-Incremental Learning
Aoting Zhang, Dongbao Yang, Chang Liu et al.
PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions
Daeun Kyung, Hyunseung Chung, Seongsu Bae et al.
Can Students Beyond the Teacher? Distilling Knowledge from Teacher’s Bias
Jianhua Zhang, Yi Gao, Ruyu Liu et al.
Uni-Renderer: Unifying Rendering and Inverse Rendering Via Dual Stream Diffusion
ZhiFei Chen, Tianshuo Xu, Wenhang Ge et al.
Improving Transferable Targeted Attacks with Feature Tuning Mixup
Kaisheng Liang, Xuelong Dai, Yanjie Li et al.
From Elements to Design: A Layered Approach for Automatic Graphic Design Composition
Jiawei Lin, Shizhao Sun, Danqing Huang et al.
Poplar: Efficient Scaling of Distributed DNN Training on Heterogeneous GPU Clusters
WenZheng Zhang, Yang Hu, Jing Shi et al.
DiffVsgg: Diffusion-Driven Online Video Scene Graph Generation
Mu Chen, Liulei Li, Wenguan Wang et al.
Open-World Objectness Modeling Unifies Novel Object Detection
Shan Zhang, Yao Ni, Jinhao Du et al.
Noise-Resilient Symbolic Regression with Dynamic Gating Reinforcement Learning
Chenglu Sun, Shuo Shen, Wenzhi Tao et al.
Feature Clipping for Uncertainty Calibration
Linwei Tao, Minjing Dong, Chang Xu
MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding
YUXIANG WEI, Yanteng Zhang, Xi Xiao et al.
One2Any: One-Reference 6D Pose Estimation for Any Object
Mengya Liu, Siyuan Li, Ajad Chhatkuli et al.
Difficulty-aware Balancing Margin Loss for Long-tailed Recognition
Minseok Son, Inyong Koo, Jinyoung Park et al.
Harmonizing Visual and Textual Embeddings for Zero-Shot Text-to-Image Customization
Yeji Song, Jimyeong Kim, Wonhark Park et al.
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
Weidong Liu, Jiyuan Tu, Xi Chen et al.
Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions
Xiaoran Jiao, Weian Mao, Wengong Jin et al.
MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks
Nayoung Kim, Seongsu Kim, Minsu Kim et al.
Understanding Prompt Tuning and In-Context Learning via Meta-Learning
Tim Genewein, Kevin Li, Jordi Grau-Moya et al.
Every Rollout Counts: Optimal Resource Allocation for Efficient Test-Time Scaling
Xinglin Wang, Yiwei Li, Shaoxiong Feng et al.
Integral Imprecise Probability Metrics
Siu Lun (Alan) Chau, Michele Caprio, Krikamol Muandet
Multimodal Variational Autoencoder: A Barycentric View
Peijie Qiu, Wenhui Zhu, Sayantan Kumar et al.
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation
Ao Wang, Hui Chen, Jianchao Tan et al.
JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems
Yifan Wang, Jian Zhao, Zhaoxin Fan et al.
FedSPU: Personalized Federated Learning for Resource-Constrained Devices with Stochastic Parameter Update
Ziru Niu, Hai Dong, A. K. Qin
Gaussian Splatting with Discretized SDF for Relightable Assets
Zuo-Liang Zhu, jian Yang, Beibei Wang
Atomic Thinking of LLMs: Decoupling and Exploring Mathematical Reasoning Abilities
Jiayi Kuang, Haojing Huang, Yinghui Li et al.
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis
Jonas Belouadi, Eddy Ilg, Margret Keuper et al.
PhysRig: Differentiable Physics-Based Skinning and Rigging Framework for Realistic Articulated Object Modeling
Hao Zhang, Haolan Xu, Chun Feng et al.
Latent Diffusion Models with Masked AutoEncoders
Junho Lee, Jeongwoo Shin, Hyungwook Choi et al.
OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images
Ziyue Huang, Yongchao Feng, Ziqi Liu et al.
Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
Chenyu Zhou, Mengdan Zhang, Peixian Chen et al.
Birth and Death of a Rose
Chen Geng, Yunzhi Zhang, Shangzhe Wu et al.
QA-Calibration of Language Model Confidence Scores
Putra Manggala, Atalanti A Mastakouri, Elke Kirschbaum et al.
SimMLM: A Simple Framework for Multi-modal Learning with Missing Modality
Sijie Li, Chen Chen, Jungong Han
Subgraph Aggregation for Out-of-Distribution Generalization on Graphs
Bowen Liu, Haoyang Li, Shuning Wang et al.
Adversarial Domain Prompt Tuning and Generation for Single Domain Generalization
Zhipeng Xu, De Cheng, XINYANG JIANG et al.
GaussianUpdate: Continual 3D Gaussian Splatting Update for Changing Environments
Lin Zeng, Boming Zhao, Jiarui Hu et al.
InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction
Bin Lei, Weitai Kang, Zijian Zhang et al.
SteerX: Creating Any Camera-Free 3D and 4D Scenes with Geometric Steering
Byeongjun Park, Hyojun Go, Hyelin Nam et al.
Constructing Confidence Intervals for Average Treatment Effects from Multiple Datasets
Yuxin Wang, Maresa Schröder, Dennis Frauen et al.
Multi-Resolution Pathology-Language Pre-training Model with Text-Guided Visual Representation
Shahad Albastaki, Anabia Sohail, IYYAKUTTI IYAPPAN GANAPATHI et al.
Episodic Novelty Through Temporal Distance
Yuhua Jiang, Qihan Liu, Yiqin Yang et al.
SEGS-SLAM: Structure-enhanced 3D Gaussian Splatting SLAM with Appearance Embedding
Tianci Wen, Zhiang Liu, Yongchun Fang
Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration
Junyuan Deng, Wei Yin, Xiaoyang Guo et al.
Natural Language Inference Improves Compositionality in Vision-Language Models
Paola Cascante-Bonilla, Yu (Hope) Hou, Yang Cao et al.
New Perspectives on the Polyak Stepsize: Surrogate Functions and Negative Results
Francesco Orabona, Ryan D'Orazio
Self-Updatable Large Language Models by Integrating Context into Model Parameters
Yu Wang, Xinshuang Liu, Xiusi Chen et al.
Re-evaluating Open-ended Evaluation of Large Language Models
Si-Qi Liu, Ian Gemp, Luke Marris et al.
SymMaP: Improving Computational Efficiency in Linear Solvers through Symbolic Preconditioning
Hong Wang, Jie Wang, Minghao Ma et al.
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHR Data
Michael Wornow, Suhana Bedi, Miguel Angel Fuentes Hernandez et al.
Adaptive Non-Uniform Timestep Sampling for Accelerating Diffusion Model Training
Myunsoo Kim, Donghyeon Ki, Seong-Woong Shim et al.
Frequency-Dynamic Attention Modulation For Dense Prediction
Linwei Chen, Lin Gu, Ying Fu
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
Yuto Nishimura, Takumi Hirose, Masanari Ohi et al.
Orientation Matters: Making 3D Generative Models Orientation-Aligned
Yichong Lu, Yuzhuo Tian, Zijin Jiang et al.
Certification of Speaker Recognition Models to Additive Perturbations
Dmitrii Korzh, Elvir Karimov, Mikhail Pautov et al.
p-Mean Regret for Stochastic Bandits
Anand Krishna, Philips George John, Adarsh Barik et al.
RILQ: Rank-Insensitive LoRA-Based Quantization Error Compensation for Boosting 2-Bit Large Language Model Accuracy
Geonho Lee, Janghwan Lee, Sukjin Hong et al.
SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts
Gengze Zhou, Yicong Hong, Zun Wang et al.
DC-ControlNet: Decoupling Inter- and Intra-Element Conditions in Image Generation with Diffusion Models
hongji yang, Wencheng Han, Yucheng Zhou et al.
SAP: Corrective Machine Unlearning with Scaled Activation Projection for Label Noise Robustness
Sangamesh Kodge, Deepak Ravikumar, Gobinda Saha et al.
WaveMamba: Wavelet-Driven Mamba Fusion for RGB-Infrared Object Detection
Haodong Zhu, Wenhao Dong, Linlin Yang et al.
LongDiff: Training-Free Long Video Generation in One Go
Zhuoling Li, Hossein Rahmani, Qiuhong Ke et al.
X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation
jian ma, Qirong Peng, Xu Guo et al.
Is Large-scale Pretraining the Secret to Good Domain Generalization?
Piotr Teterwak, Kuniaki Saito, Theodoros Tsiligkaridis et al.