Most Cited 2025 Poster Papers
Generalization Bounds and Model Complexity for Kolmogorov–Arnold Networks
Xianyang Zhang, Huijuan Zhou
CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation
Zhuoyan Luo, Yinghao Wu, Tianheng Cheng et al.
Frequency-Dynamic Attention Modulation For Dense Prediction
Linwei Chen, Lin Gu, Ying Fu
Seeing the Arrow of Time in Large Multimodal Models
Zihui (Sherry) Xue, Romy Luo, Kristen Grauman
Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning
Xingjian Ran, Yixuan Li, Linning Xu et al.
Open-World Objectness Modeling Unifies Novel Object Detection
Shan Zhang, Yao Ni, Jinhao Du et al.
ECHOPulse: ECG Controlled Echocardio-gram Video Generation
Yiwei Li, Sekeun Kim, Zihao Wu et al.
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
Juan A. Rodriguez, Xiangru Jian, Siba Smarak Panigrahi et al.
Revisiting Source-Free Domain Adaptation: a New Perspective via Uncertainty Control
Gezheng Xu, Hui Guo, Li Yi et al.
Make Your Training Flexible: Towards Deployment-Efficient Video Models
Chenting Wang, Kunchang Li, Tianxiang Jiang et al.
Calibrating Expressions of Certainty
Peiqi Wang, Barbara Lam, Yingcheng Liu et al.
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li, Zichun Yu, Chenyan Xiong
Infer Human’s Intentions Before Following Natural Language Instructions
Yanming Wan, Yue Wu, Yiping Wang et al.
Few-Shot, No Problem: Descriptive Continual Relation Extraction
Nguyen Xuan Thanh, Anh Duc Le, Quyen Tran et al.
Audio-Visual Semantic Graph Network for Audio-Visual Event Localization
Liang Liu, Shuaiyong Li, Yongqiang Zhu
Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation
Seyedreza Mohseni, Seyedali Mohammadi, Deepa Tilwani et al.
Memorize and Rank: Elevating Large Language Models for Clinical Diagnosis Prediction
Mingyu Derek Ma, Xiaoxuan Wang, Yijia Xiao et al.
Causally Reliable Concept Bottleneck Models
Giovanni De Felice, Arianna Casanova Flores, Francesco De Santis et al.
SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision
Kangjie Zheng, Siyue Liang, Junwei Yang et al.
MultiverSeg: Scalable Interactive Segmentation of Biomedical Imaging Datasets with In-Context Guidance
Hallee Wong, Jose Javier Gonzalez Ortiz, John Guttag et al.
SwitchLingua: The First Large-Scale Multilingual and Multi-Ethnic Code-Switching Dataset
Peng Xie, Xingyuan Liu, Yequan Bie et al.
STAR: Stability-Inducing Weight Perturbation for Continual Learning
Masih Eskandar, Tooba Imtiaz, Davin Hill et al.
Strategic Classification With Externalities
Safwan Hossain, Evi Micha, Yiling Chen et al.
CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models
Dongfang Li, Zetian Sun, Xinshuo Hu et al.
DreamFuse: Adaptive Image Fusion with Diffusion Transformer
Junjia Huang, Pengxiang Yan, Jiyang Liu et al.
Many-Objective Multi-Solution Transport
Ziyue Li, Tian Li, Virginia Smith et al.
Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation
Yujie Zhang, Bingyang Cui, Qi Yang et al.
A Simple Approach to Unifying Diffusion-based Conditional Generation
Xirui Li, Charles Herrmann, Kelvin Chan et al.
Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving of Inequalities
Haoyu Zhao, Yihan Geng, Shange Tang et al.
Preference-Oriented Supervised Fine-Tuning: Favoring Target Model over Aligned Large Language Models
Yuchen Fan, Yuzhong Hong, Qiushi Wang et al.
Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling
Sirui Li, Wenbin Ouyang, Yining Ma et al.
LaTexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending
Jian Jin, Zhenbo Yu, Yang Shen et al.
MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding
Yuxiang Wei, Yanteng Zhang, Xi Xiao et al.
DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization
Wenchuan Wang, Mengqi Huang, Yijing Tu et al.
DeblurDiff: Real-World Image Deblurring with Generative Diffusion Models
Lingshun Kong, Jiawei Zhang, Dongqing Zou et al.
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
Weidong Liu, Jiyuan Tu, Xi Chen et al.
On the Relation between Rectified Flows and Optimal Transport
Johannes Hertrich, Antonin Chambolle, Julie Delon
Episodic Novelty Through Temporal Distance
Yuhua Jiang, Qihan Liu, Yiqin Yang et al.
Can Students Beyond the Teacher? Distilling Knowledge from Teacher’s Bias
Jianhua Zhang, Yi Gao, Ruyu Liu et al.
Specifying What You Know or Not for Multi-Label Class-Incremental Learning
Aoting Zhang, Dongbao Yang, Chang Liu et al.
Atomic Thinking of LLMs: Decoupling and Exploring Mathematical Reasoning Abilities
Jiayi Kuang, Haojing Huang, Yinghui Li et al.
On Extending Direct Preference Optimization to Accommodate Ties
Jinghong Chen, Guangyu Yang, Weizhe Lin et al.
DMWM: Dual-Mind World Model with Long-Term Imagination
Lingyi Wang, Rashed Shelim, Walid Saad et al.
BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments
Xinghao Wang, Pengyu Wang, Bo Wang et al.
CityAnchor: City-scale 3D Visual Grounding with Multi-modality LLMs
Jinpeng Li, Haiping Wang, Jiabin Chen et al.
Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis
Kaiyang Ji, Ye Shi, Zichen Jin et al.
Harmonizing Visual and Textual Embeddings for Zero-Shot Text-to-Image Customization
Yeji Song, Jimyeong Kim, Wonhark Park et al.
Difficulty-aware Balancing Margin Loss for Long-tailed Recognition
Minseok Son, Inyong Koo, Jinyoung Park et al.
FactorGCL: A Hypergraph-Based Factor Model with Temporal Residual Contrastive Learning for Stock Returns Prediction
Yitong Duan, Weiran Wang, Jian Li
ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping
Shun Iwase, Muhammad Zubair Irshad, Katherine Liu et al.
SDFit: 3D Object Pose and Shape by Fitting a Morphable SDF to a Single Image
Dimitrije Antić, Georgios Paschalidis, Shashank Tripathi et al.
Thousand Voices of Trauma: A Large-Scale Synthetic Dataset for Modeling Prolonged Exposure Therapy Conversations
Suhas BN, Andrew Sherrill, Rosa I. Arriaga et al.
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
Jingjing Jiang, Chongjie Si, Jun Luo et al.
FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequencies
Dongyue Lu, Lingdong Kong, Gim Hee Lee et al.
VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification
Patrick Yubeaton, Andre Nakkab, Weihua Xiao et al.
Thinker: Learning to Think Fast and Slow
Stephen Chung, Wenyu Du, Jie Fu
LightsOut: Diffusion-based Outpainting for Enhanced Lens Flare Removal
Shr-Ruei Tsai, Wei-Cheng Chang, Jie-Ying Lee et al.
PFDiff: Training-Free Acceleration of Diffusion Models Combining Past and Future Scores
Guangyi Wang, Yuren Cai, Lijiang Li et al.
On Union-Closedness of Language Generation
Steve Hanneke, Amin Karbasi, Anay Mehrotra et al.
WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception
Zhiheng Liu, Xueqing Deng, Shoufa Chen et al.
A solvable model of learning generative diffusion: theory and insights
Hugo Cui, Cengiz Pehlevan, Yue Lu
Adversarial Robust Memory-Based Continual Learner
Xiaoyue Mi, Fan Tang, Zonghan Yang et al.
Hierarchical Cross-modal Prompt Learning for Vision-Language Models
Hao Zheng, Shunzhi Yang, Zhuoxin He et al.
Video Perception Models for 3D Scene Synthesis
Rui Huang, Guangyao Zhai, Zuria Bauer et al.
Why Do Some Language Models Fake Alignment While Others Don't?
Abhay Sheshadri, John Hughes, Julian Michael et al.
A Simple Graph Contrastive Learning Framework for Short Text Classification
Yonghao Liu, Fausto Giunchiglia, Lan Huang et al.
Constrained Optimization From a Control Perspective via Feedback Linearization
Runyu Zhang, Arvind Raghunathan, Jeff Shamma et al.
Learning Graph Invariance by Harnessing Spuriosity
Tianjun Yao, Yongqiang Chen, Kai Hu et al.
Incorporating Visual Correspondence into Diffusion Model for Virtual Try-On
Siqi Wan, Jingwen Chen, Yingwei Pan et al.
Multi-Modal View Enhanced Large Vision Models for Long-Term Time Series Forecasting
ChengAo Shen, Wenchao Yu, Ziming Zhao et al.
DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Model
Junjia Huang, Pengxiang Yan, Jinhang Cai et al.
Exploring the Design Space of Visual Context Representation in Video MLLMs
Yifan Du, Yuqi Huo, Kun Zhou et al.
Real-Time Recurrent Reinforcement Learning
Julian Lemmel, Radu Grosu
On the Consistency of Video Large Language Models in Temporal Comprehension
Minjoon Jung, Junbin Xiao, Byoung-Tak Zhang et al.
Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology
Wenhao Tang, Rong Qin, Heng Fang et al.
Chiron-o1: Igniting Multimodal Large Language Models towards Generalizable Medical Reasoning via Mentor-Intern Collaborative Search
Haoran Sun, Yankai Jiang, Wenjie Lou et al.
PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly
Liang Ma, Jiajun Wen, Min Lin et al.
Certification of Speaker Recognition Models to Additive Perturbations
Dmitrii Korzh, Elvir Karimov, Mikhail Pautov et al.
Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner
Runa Eschenhagen, Aaron Defazio, Tsung-Hsien Lee et al.
Towards Homogeneous Lexical Tone Decoding from Heterogeneous Intracranial Recordings
Di Wu, Siyuan Li, Chen Feng et al.
Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections
Xiaomeng Xu, Yifan Hou, Zeyi Liu et al.
Learning Spatial-Semantic Features for Robust Video Object Segmentation
Xin Li, Deshui Miao, Zhenyu He et al.
MotionPRO: Exploring the Role of Pressure in Human MoCap and Beyond
Shenghao Ren, Yi Lu, Jiayi Huang et al.
Thinking in Character: Advancing Role-Playing Agents with Role-Aware Reasoning
Yihong Tang, Kehai Chen, Muyun Yang et al.
Denoising with a Joint-Embedding Predictive Architecture
Chen Dengsheng, Jie Hu, Xiaoming Wei et al.
Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
Boyang Wang, Xuweiyi Chen, Matheus Gadelha et al.
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
Bingquan Dai, Luo Li, Qihong Tang et al.
Unlocking Point Processes through Point Set Diffusion
David Lüdke, Enric Rabasseda Raventós, Marcel Kollovieh et al.
MetaBox-v2: A Unified Benchmark Platform for Meta-Black-Box Optimization
Zeyuan Ma, Yue-Jiao Gong, Hongshu Guo et al.
HeMoRa: Unsupervised Heuristic Consensus Sampling for Robust Point Cloud Registration
Shaocheng Yan, Yiming Wang, Kaiyan Zhao et al.
BUFFER-X: Towards Zero-Shot Point Cloud Registration in Diverse Scenes
Minkyun Seo, Hyungtae Lim, Kanghee Lee et al.
Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis
Yunwei Ren, Jason Lee
QMP: Q-switch Mixture of Policies for Multi-Task Behavior Sharing
Grace Zhang, Ayush Jain, Injune Hwang et al.
SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions
Xianzhe Fan, Xuhui Zhou, Chuanyang Jin et al.
Uncertainty-Aware Global-View Reconstruction for Multi-View Multi-Label Feature Selection
Pingting Hao, Kunpeng Liu, Wanfu Gao
OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging
Yijie Tang, Jiazhao Zhang, Yuqing Lan et al.
Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training
Will Merrill, Shane Arora, Dirk Groeneveld et al.
On scalable and efficient training of diffusion samplers
Minkyu Kim, Kiyoung Seong, Dongyeop Woo et al.
Learning Physics Informed Neural ODEs with Partial Measurements
Paul Ghanem, Ahmet Demirkaya, Tales Imbiriba et al.
Stochastic Process Learning via Operator Flow Matching
Yaozhong Shi, Zachary Ross, Domniki Asimaki et al.
Ego4o: Egocentric Human Motion Capture and Understanding from Multi-Modal Input
Jian Wang, Rishabh Dabral, Diogo Luvizon et al.
Bootstrapping Heterogeneous Graph Representation Learning via Large Language Models: A Generalized Approach
Hang Gao, Chenhao Zhang, Fengge Wu et al.
MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation
Kaixing Yang, Xulong Tang, Ziqiao Peng et al.
Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection
Gensheng Pei, Tao Chen, Yujia Wang et al.
EchoONE: Segmenting Multiple Echocardiography Planes in One Model
Jiongtong Hu, Wei Zhuo, Jun Cheng et al.
When Are Concepts Erased From Diffusion Models?
Kevin Lu, Nicky Kriplani, Rohit Gandikota et al.
Predicting Empirical AI Research Outcomes with Language Models
Jiaxin Wen, Chenglei Si, Yueh-Han Chen et al.
Smoothness Really Matters: A Simple Yet Effective Approach for Unsupervised Graph Domain Adaptation
Wei Chen, Guo Ye, Yakun Wang et al.
ZAPBench: A Benchmark for Whole-Brain Activity Prediction in Zebrafish
Jan-Matthis Lueckmann, Alexander Immer, Alex Chen et al.
Efficient Multi-agent Offline Coordination via Diffusion-based Trajectory Stitching
Lei Yuan, Yuqi Bian, Lihe Li et al.
Complementary Advantages: Exploiting Cross-Field Frequency Correlation for NIR-Assisted Image Denoising
Yuchen Wang, Hongyuan Wang, Lizhi Wang et al.
Estimating Model Performance Under Covariate Shift Without Labels
Jakub Białek, Juhani Kivimäki, Wojciech Kuberski et al.
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao, Shiqian Su, Xizhou Zhu et al.
AesthetiQ: Enhancing Graphic Layout Design via Aesthetic-Aware Preference Alignment of Multi-modal Large Language Models
Sohan Patnaik, Rishabh Jain, Balaji Krishnamurthy et al.
Omnidirectional Multi-Object Tracking
Kai Luo, Hao Shi, Sheng Wu et al.
DLF: Extreme Image Compression with Dual-generative Latent Fusion
Naifu Xue, Zhaoyang Jia, Jiahao Li et al.
On the Value of Cross-Modal Misalignment in Multimodal Representation Learning
Yichao Cai, Yuhang Liu, Erdun Gao et al.
BiggerGait: Unlocking Gait Recognition with Layer-wise Representations from Large Vision Models
Dingqiang Ye, Chao Fan, Zhanbo Huang et al.
Loosely Synchronized Rule-Based Planning for Multi-Agent Path Finding with Asynchronous Actions
Shuai Zhou, Shizhe Zhao, Zhongqiang Ren
Robust Message Embedding via Attention Flow-Based Steganography
Huayuan Ye, Shenzhuo Zhang, Shiqi Jiang et al.
One2Any: One-Reference 6D Pose Estimation for Any Object
Mengya Liu, Siyuan Li, Ajad Chhatkuli et al.
When Should We Prefer State-to-Visual DAgger over Visual Reinforcement Learning?
Tongzhou Mu, Zhaoyang Li, Stanisław Wiktor Strzelecki et al.
Multi-Resolution Pathology-Language Pre-training Model with Text-Guided Visual Representation
Shahad Albastaki, Anabia Sohail, Iyyakutti Iyappan Ganapathi et al.
GM-MoE: Low-Light Enhancement with Gated-Mechanism Mixture-of-Experts
Minwen Liao, Hao Dong, Xinyi Wang et al.
Efficient Quadratic Corrections for Frank-Wolfe Algorithms
Jannis Halbey, Seta Rakotomandimby, Mathieu Besançon et al.
Distillation Robustifies Unlearning
Bruce W. Lee, Addie Foote, Alex Infanger et al.
Improved Balanced Classification with Theoretically Grounded Loss Functions
Corinna Cortes, Mehryar Mohri, Yutao Zhong
Learning Diffusion Models with Flexible Representation Guidance
Chenyu Wang, Cai Zhou, Sharut Gupta et al.
Detect Any Mirrors: Boosting Learning Reliability on Large-Scale Unlabeled Data with an Iterative Data Engine
Zhaohu Xing, Lihao Liu, Yijun Yang et al.
Zero-shot 3D Question Answering via Voxel-based Dynamic Token Compression
Hsiang-Wei Huang, Fu-Chen Chen, Wenhao Chai et al.
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Divyansh Srivastava, Xiang Zhang, He Wen et al.
Proportional Representation in Practice: Quantifying Proportionality in Ordinal Elections
Tuva Bardal, Markus Brill, David McCune et al.
Addressing Cold-Start Problem in Click-Through Rate Prediction via Supervised Diffusion Modeling
Wenqiao Zhu, Lulu Wang, Jun Wu
High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity
Qian Yu, Peng-Tao Jiang, Hao Zhang et al.
Beyond FVD: An Enhanced Evaluation Metrics for Video Generation Distribution Quality
Ge Ya Luo, Gian M Favero, Zhi Hao Luo et al.
MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging
Zihuan Qiu, Yi Xu, Chiyuan He et al.
Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling
Aram Davtyan, Leello Dadi, Volkan Cevher et al.
SAVVY: Spatial Awareness via Audio-Visual LLMs through Seeing and Hearing
Mingfei Chen, Zijun Cui, Xiulong Liu et al.
Towards RAW Object Detection in Diverse Conditions
Zhong-Yu Li, Xin Jin, Bo-Yuan Sun et al.
Lawma: The Power of Specialization for Legal Annotation
Ricardo Dominguez-Olmedo, Vedant Nanda, Rediet Abebe et al.
Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis
Zixuan Wang, Duo Peng, Feng Chen et al.
Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias
Rui Lu, Runzhe Wang, Kaifeng Lyu et al.
Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation
Edward Fish, Richard Bowden
Progressive Compression with Universally Quantized Diffusion Models
Yibo Yang, Justus Will, Stephan Mandt
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Gomes, Yanlei Zhang, Eugene Belilovsky et al.
When Selection Meets Intervention: Additional Complexities in Causal Discovery
Haoyue Dai, Ignavier Ng, Jianle Sun et al.
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets
Mathurin Videau, Badr Youbi Idrissi, Alessandro Leite et al.
In-Context Learning and Occam's Razor
Eric Elmoznino, Tom Marty, Tejas Kasetty et al.
Ranked Entropy Minimization for Continual Test-Time Adaptation
Jisu Han, Jaemin Na, Wonjun Hwang
A Polarization-Aided Transformer for Image Deblurring via Motion Vector Decomposition
Duosheng Chen, Shihao Zhou, Jinshan Pan et al.
Contextual Online Decision Making with Infinite-Dimensional Functional Regression
Haichen Hu, Rui Ai, Stephen Bates et al.
DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy
Kaixuan Xu, Jiajun Chai, Sicheng Li et al.
Cross-modal Multi-task Learning for Multimedia Event Extraction
Jianwei Cao, Yanli Hu, Zhen Tan et al.
Auditing Meta-Cognitive Hallucinations in Reasoning Large Language Models
Haolang Lu, Yilian Liu, Jingxin Xu et al.
Automatically Identify and Rectify: Robust Deep Contrastive Multi-view Clustering in Noisy Scenarios
Xihong Yang, Siwei Wang, Fangdi Wang et al.
Constructing Confidence Intervals for Average Treatment Effects from Multiple Datasets
Yuxin Wang, Maresa Schröder, Dennis Frauen et al.
One is Plenty: A Polymorphic Feature Interpreter for Immutable Heterogeneous Collaborative Perception
Yuchen Xia, Quan Yuan, Guiyang Luo et al.
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHR Data
Michael Wornow, Suhana Bedi, Miguel Angel Fuentes Hernandez et al.
Variational Search Distributions
Dan Steinberg, Rafael Oliveira, Cheng Soon Ong et al.
Anti-Exposure Bias in Diffusion Models
Junyu Zhang, Daochang Liu, Eunbyung Park et al.
What should a neuron aim for? Designing local objective functions based on information theory
Andreas C. Schneider, Valentin Neuhaus, David Ehrlich et al.
ReNeg: Learning Negative Embedding with Reward Guidance
Xiaomin Li, Yixuan Liu, Takashi Isobe et al.
AIpparel: A Multimodal Foundation Model for Digital Garments
Kiyohiro Nakayama, Jan Ackermann, Timur Levent Kesdogan et al.
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
Hui Yuan, Yifan Zeng, Yue Wu et al.
MagicNaming: Consistent Identity Generation by Finding a “Name Space” in T2I Diffusion Models
Jing Zhao, Heliang Zheng, Chaoyue Wang et al.
MTGA: Multi-View Temporal Granularity Aligned Aggregation for Event-Based Lip-Reading
Wenhao Zhang, Jun Wang, Yong Luo et al.
LASeR: Learning to Adaptively Select Reward Models with Multi-Arm Bandits
Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin et al.
EgoBridge: Domain Adaptation for Generalizable Imitation from Egocentric Human Data
Ryan Punamiya, Dhruv Patel, Patcharapong Aphiwetsa et al.
ChangeDiff: A Multi-Temporal Change Detection Data Generator with Flexible Text Prompts via Diffusion Model
Qi Zang, Jiayi Yang, Shuang Wang et al.
Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation
Zihan Wang, Seungjun Lee, Gim Hee Lee
What Matters in Data for DPO?
Yu Pan, Zhongze Cai, Huaiyang Zhong et al.
We Should Chart an Atlas of All the World's Models
Eliahu Horwitz, Nitzan Kurer, Jonathan Kahana et al.
AtomSurf: Surface Representation for Learning on Protein Structures
Vincent Mallet, Yangyang Miao, Souhaib Attaiki et al.
Risk-Controlling Model Selection via Guided Bayesian Optimization
Adam Fisch, Regina Barzilay, Bracha Laufer-Goldshtein et al.
LuxDiT: Lighting Estimation with Video Diffusion Transformer
Ruofan Liang, Kai He, Zan Gojcic et al.
A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking
Zixiang Zhao, Haowen Bai, Bingxin Ke et al.
KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models
Fan Wang, Juyong Jiang, Chansung Park et al.
Detection-Friendly Nonuniformity Correction: A Union Framework for Infrared UAV Target Detection
Houzhang Fang, Xiaolin Wang, Zengyang Li et al.
LUCAS: Layered Universal Codec Avatars
Di Liu, Teng Deng, Giljoo Nam et al.
Modeling Thousands of Human Annotators for Generalizable Text-to-Image Person Re-identification
Jiayu Jiang, Changxing Ding, Wentao Tan et al.
KAC: Kolmogorov-Arnold Classifier for Continual Learning
Yusong Hu, Zichen Liang, Fei Yang et al.
Sparis: Neural Implicit Surface Reconstruction of Indoor Scenes from Sparse Views
Yulun Wu, Han Huang, Wenyuan Zhang et al.
Topological Schrödinger Bridge Matching
Maosheng Yang
Constrained Belief Updates Explain Geometric Structures in Transformer Representations
Mateusz Piotrowski, Paul Riechers, Daniel Filan et al.
Language Agents Meet Causality -- Bridging LLMs and Causal World Models
John Gkountouras, Matthias Lindemann, Phillip Lippe et al.
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics
Christoph Jürgen Hemmer, Daniel Durstewitz
Bokehlicious: Photorealistic Bokeh Rendering with Controllable Apertures
Tim Seizinger, Florin-Alexandru Vasluianu, Marcos Conde et al.
On Generalization Across Environments In Multi-Objective Reinforcement Learning
Jayden Teoh, Pradeep Varakantham, Peter Vamplew
Interpretable Image Classification via Non-parametric Part Prototype Learning
Zhijie Zhu, Lei Fan, Maurice Pagnucco et al.
Mosaic of Modalities: A Comprehensive Benchmark for Multimodal Graph Learning
Jing Zhu, Yuhang Zhou, Shengyi Qian et al.
Towards Improving Exploration through Sibling Augmented GFlowNets
Kanika Madan, Alex Lamb, Emmanuel Bengio et al.
Rethinking Fair Representation Learning for Performance-Sensitive Tasks
Charles Jones, Fabio De Sousa Ribeiro, Mélanie Roschewitz et al.
Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference
Weizhi Fei, Xueyan Niu, XIE GUOQING et al.
Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts
Qizhou Chen, Chengyu Wang, Dakan Wang et al.
Alligat0R: Pre-Training through Covisibility Segmentation for Relative Camera Pose Regression
Thibaut Loiseau, Guillaume Bourmaud, Vincent Lepetit
Hierarchically-Structured Open-Vocabulary Indoor Scene Synthesis with Pre-trained Large Language Model
Weilin Sun, Xinran Li, Manyi Li et al.
Unleashing the Potential of Vision-Language Pre-Training for 3D Zero-Shot Lesion Segmentation via Mask-Attribute Alignment
Yankai Jiang, Wenhui Lei, Xiaofan Zhang et al.
Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study
Xingxuan Zhang, Haoran Wang, Jiansheng Li et al.
A Solvable Attention for Neural Scaling Laws
Bochen Lyu, Di Wang, Zhanxing Zhu
TGBFormer: Transformer-GraphFormer Blender Network for Video Object Detection
Qiang Qi, Xiao Wang