Most Cited 2025 Poster Papers
Generative Medical Segmentation
Jiayu Huo, Xi Ouyang, Sébastien Ourselin et al.
Context-Enhanced Memory-Refined Transformer for Online Action Detection
Zhanzhong Pang, Fadime Sener, Angela Yao
Modeling Thousands of Human Annotators for Generalizable Text-to-Image Person Re-identification
Jiayu Jiang, Changxing Ding, Wentao Tan et al.
A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking
Zixiang Zhao, Haowen Bai, Bingxin Ke et al.
Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing
Zhedong Zhang, Liang Li, Chenggang Yan et al.
LuxDiT: Lighting Estimation with Video Diffusion Transformer
Ruofan Liang, Kai He, Zan Gojcic et al.
On the Zero-shot Adversarial Robustness of Vision-Language Models: A Truly Zero-shot and Training-free Approach
Baoshun Tong, Hanjiang Lai, Yan Pan et al.
3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement
Yihang Luo, Shangchen Zhou, Yushi Lan et al.
MonoBox: Tightness-Free Box-Supervised Polyp Segmentation Using Monotonicity Constraint
Qiang Hu, Zhenyu Yi, Ying Zhou et al.
Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization
Lingyun Zhang, Yu Xie, Yanwei Fu et al.
MetaNeRV: Meta Neural Representations for Videos with Spatial-Temporal Guidance
Jialong Guo, Ke Liu, Jiangchao Yao et al.
Variational Search Distributions
Dan Steinberg, Rafael Oliveira, Cheng Soon Ong et al.
Understanding Fine-tuning CLIP for Open-vocabulary Semantic Segmentation in Hyperbolic Space
Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.
Efficient Connectivity-Preserving Instance Segmentation with Supervoxel-Based Loss Function
Anna Grim, Jayaram Chandrashekar, Uygar Sümbül
OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging
Yijie Tang, Jiazhao Zhang, Yuqing Lan et al.
LICORICE: Label-Efficient Concept-Based Interpretable Reinforcement Learning
Zhuorui Ye, Stephanie Milani, Geoff Gordon et al.
PoseLLaVA: Pose Centric Multimodal LLM for Fine-Grained 3D Pose Manipulation
Dong Feng, Ping Guo, Encheng Peng et al.
ViPOcc: Leveraging Visual Priors from Vision Foundation Models for Single-View 3D Occupancy Prediction
Yi Feng, Yu Han, Xijing Zhang et al.
E(3)-equivariant models cannot learn chirality: Field-based molecular generation
Alexandru Dumitrescu, Dani Korpela, Markus Heinonen et al.
DOLPHIN: A Programmable Framework for Scalable Neurosymbolic Learning
Aaditya Naik, Jason Liu, Claire Wang et al.
Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner
Runa Eschenhagen, Aaron Defazio, Tsung-Hsien Lee et al.
MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs
Tommaso Mencattini, Adrian Robert Minut, Donato Crisostomi et al.
Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation
Yi-Chen Li, Fuxiang Zhang, Wenjie Qiu et al.
CoMBO: Conflict Mitigation via Branched Optimization for Class Incremental Segmentation
Kai Fang, Anqi Zhang, Guangyu Gao et al.
Question-Aware Gaussian Experts for Audio-Visual Question Answering
Hongyeob Kim, Inyoung Jung, Dayoon Suh et al.
Enforcing Latent Euclidean Geometry in Single-Cell VAEs for Manifold Interpolation
Alessandro Palma, Sergei Rybakov, Leon Hetzel et al.
Finding Local Diffusion Schrödinger Bridge using Kolmogorov-Arnold Network
Xingyu Qiu, Mengying Yang, Xinghua Ma et al.
Object-aware Sound Source Localization via Audio-Visual Scene Understanding
Sung Jin Um, Dongjin Kim, Sangmin Lee et al.
Inverse Problem Sampling in Latent Space Using Sequential Monte Carlo
Idan Achituve, Hai Victor Habi, Amir Rosenfeld et al.
EvHDR-NeRF: Building High Dynamic Range Radiance Fields with Single Exposure Images and Events
Zehao Chen, Zhanfeng Liao, De Ma et al.
PanDA: Towards Panoramic Depth Anything with Unlabeled Panoramas and Mobius Spatial Augmentation
Zidong Cao, Jinjing Zhu, Weiming Zhang et al.
What should a neuron aim for? Designing local objective functions based on information theory
Andreas C. Schneider, Valentin Neuhaus, David Ehrlich et al.
Φ-GAN: Physics-Inspired GAN for Generating SAR Images Under Limited Data
Xidan Zhang, Yihan Zhuang, Qian Guo et al.
NeurOp-Diff: Continuous Remote Sensing Image Super-Resolution via Neural Operator Diffusion
Zihao Xu, Yuzhi Tang, Bowen Xu et al.
AoP-SAM: Automation of Prompts for Efficient Segmentation
Yi Chen, Muyoung Son, Chuanbo Hua et al.
Language Models Can Predict Their Own Behavior
Dhananjay Ashok, Jonathan May
Causal LLM Routing: End-to-End Regret Minimization from Observational Data
Asterios Tsiourvas, Wei Sun, Georgia Perakis
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
Hui Yuan, Yifan Zeng, Yue Wu et al.
Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit
Qizhou Chen, Taolin Zhang, Chengyu Wang et al.
SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation
Pengfei Chen, Lingxi Xie, Xinyue Huo et al.
Cropper: Vision-Language Model for Image Cropping through In-Context Learning
Seung Hyun Lee, Jijun Jiang, Yiran Xu et al.
EMPLACE: Self-Supervised Urban Scene Change Detection
Tim Alpherts, Sennay Ghebreab, Nanne van Noord
Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis
Woojung Han, Yeonkyung Lee, Chanyoung Kim et al.
Accelerating Training with Neuron Interaction and Nowcasting Networks
Boris Knyazev, Abhinav Moudgil, Guillaume Lajoie et al.
Ego4o: Egocentric Human Motion Capture and Understanding from Multi-Modal Input
Jian Wang, Rishabh Dabral, Diogo Luvizon et al.
Multi-Modal View Enhanced Large Vision Models for Long-Term Time Series Forecasting
ChengAo Shen, Wenchao Yu, Ziming Zhao et al.
Correlated Errors in Large Language Models
Elliot Myunghoon Kim, Avi Garg, Kenny Peng et al.
What makes an Ensemble (Un) Interpretable?
Shahaf Bassan, Guy Amir, Meirav Zehavi et al.
Multi-View Collaborative Learning Network for Speech Deepfake Detection
Kuiyuan Zhang, Zhongyun Hua, Rushi Lan et al.
Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology
Wenhao Tang, Rong Qin, Heng Fang et al.
ZETA: Leveraging $Z$-order Curves for Efficient Top-$k$ Attention
Qiuhao Zeng, Jierui Huang, Peng Lu et al.
mRNA2vec: mRNA Embedding with Language Model in the 5'UTR-CDS for mRNA Design
Honggen Zhang, Xiangrui Gao, June Zhang et al.
Lightweight Predictive 3D Gaussian Splats
Junli Cao, Vidit Goel, Chaoyang Wang et al.
Phoneme-Level Feature Discrepancies: A Key to Detecting Sophisticated Speech Deepfakes
Kuiyuan Zhang, Zhongyun Hua, Rushi Lan et al.
Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis
Yunwei Ren, Jason Lee
DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?
Tianhong Zhou, Xu Yin, Yingtao Zhu et al.
Data-adaptive Differentially Private Prompt Synthesis for In-Context Learning
Fengyu Gao, Ruida Zhou, Tianhao Wang et al.
Self-Supervised Diffusion MRI Denoising via Iterative and Stable Refinement
Chenxu Wu, Qingpeng Kong, Zihang Jiang et al.
Graph Structure Learning for Spatial-Temporal Imputation: Adapting to Node and Feature Scales
Xinyu Yang, Yu Sun, Xinyang Chen et al.
RaSA: Rank-Sharing Low-Rank Adaptation
Zhiwei He, Zhaopeng Tu, Xing Wang et al.
CAT: Content-Adaptive Image Tokenization
Junhong Shen, Kushal Tirumala, Michihiro Yasunaga et al.
Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections
Xiaomeng Xu, Yifan Hou, Zeyi Liu et al.
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
Bingquan Dai, Luo Li, Qihong Tang et al.
GTG: Generalizable Trajectory Generation Model for Urban Mobility
Jingyuan Wang, Yujing Lin, Yudong Li
Autonomous Goal Detection and Cessation in Reinforcement Learning: A Case Study on Source Term Estimation
Yiwei Shi, Muning Wen, Qi Zhang et al.
Tree-Sliced Wasserstein Distance with Nonlinear Projection
Thanh Tran, Viet Hoang Tran, Thanh Chu et al.
Through the Dual-Prism: A Spectral Perspective on Graph Data Augmentation for Graph Classifications
Yutong Xia, Runpeng Yu, Yuxuan Liang et al.
IterIS: Iterative Inference-Solving Alignment for LoRA Merging
Hongxu Chen, Zhen Wang, Runshi Li et al.
PSMGD: Periodic Stochastic Multi-Gradient Descent for Fast Multi-Objective Optimization
Mingjing Xu, Peizhong Ju, Jia Liu et al.
ECHOPulse: ECG Controlled Echocardio-gram Video Generation
Yiwei Li, Sekeun Kim, Zihao Wu et al.
Graph-Based Cross-Domain Knowledge Distillation for Cross-Dataset Text-to-Image Person Retrieval
Bingjun Luo, Jinpeng Wang, Zewen Wang et al.
Bridging Molecular Graphs and Large Language Models
Runze Wang, Mingqi Yang, Yanming Shen
Predicting Empirical AI Research Outcomes with Language Models
Jiaxin Wen, Chenglei Si, Yueh-Han Chen et al.
A Thorough Comparison Between Independent Cascade and Susceptible-Infected-Recovered Models
Panfeng Liu, Guoliang Qiu, Biaoshuai Tao et al.
Cluster Based Heterogeneous Federated Foundation Model Adaptation and Fine-Tuning
Xianda Wang, Yaqi Qiao, Duo Wu et al.
uniINF: Best-of-Both-Worlds Algorithm for Parameter-Free Heavy-Tailed MABs
Yu Chen, Jiatai Huang, Yan Dai et al.
DF-MIA: A Distribution-Free Membership Inference Attack on Fine-Tuned Large Language Models
Zhiheng Huang, Yannan Liu, Daojing He et al.
Towards Learnable Anchor for Deep Multi-View Clustering
Bocheng Wang, Chusheng Zeng, Mulin Chen et al.
On Speeding Up Language Model Evaluation
Jin Zhou, Christian Belardi, Ruihan Wu et al.
EchoONE: Segmenting Multiple Echocardiography Planes in One Model
Jiongtong Hu, Wei Zhuo, Jun Cheng et al.
Neural Interactive Proofs
Lewis Hammond, Sam Adam-Day
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
Junhao Cheng, Yuying Ge, Yixiao Ge et al.
GARLIC: GPT-Augmented Reinforcement Learning with Intelligent Control for Vehicle Dispatching
Xiao Han, Zijian Zhang, Xiangyu Zhao et al.
StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly?
Guobin Shen, Dongcheng Zhao, Aorigele Bao et al.
LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models
Yu Cheng, Fajie Yuan
Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection
Gensheng Pei, Tao Chen, Yujia Wang et al.
HeterGP: Bridging Heterogeneity in Graph Neural Networks with Multi-View Prompting
Fengyu Yan, Xiaobao Wang, Dongxiao He et al.
Skip-Vision: Efficient and Scalable Acceleration of Vision-Language Models via Adaptive Token Skipping
Weili Zeng, Ziyuan Huang, Kaixiang Ji et al.
MOSCATO: Predicting Multiple Object State Change Through Actions
Parnian Zameni, Yuhan Shen, Ehsan Elhamifar
NUDGE: Lightweight Non-Parametric Fine-Tuning of Embeddings for Retrieval
Sepanta Zeighami, Zac Wellmer, Aditya Parameswaran
MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory
Junyeong Park, Junmo Cho, Sungjin Ahn
mmFAS: Multimodal Face Anti-Spoofing Using Multi-Level Alignment and Switch-Attention Fusion
Geng Chen, Wuyuan Xie, Di Lin et al.
FactorGCL: A Hypergraph-Based Factor Model with Temporal Residual Contrastive Learning for Stock Returns Prediction
Yitong Duan, Weiran Wang, Jian Li
RadarQA: Multi-modal Quality Analysis of Weather Radar Forecasts
Xuming He, Zhiyuan You, Junchao Gong et al.
Calibrating Expressions of Certainty
Peiqi Wang, Barbara Lam, Yingcheng Liu et al.
Convergence of Clipped SGD on Convex $(L_0,L_1)$-Smooth Functions
Ofir Gaash, Kfir Y. Levy, Yair Carmon
Strategic Classification With Externalities
Safwan Hossain, Evi Micha, Yiling Chen et al.
Interpretable Generative Models through Post-hoc Concept Bottlenecks
Akshay R. Kulkarni, Ge Yan, Chung-En Sun et al.
How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions
Tal Herman, Guy Rothblum
Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization
Vladimir Boza, Vladimir Macko
Reinforcement learning with combinatorial actions for coupled restless bandits
Lily Xu, Bryan Wilder, Elias Khalil et al.
Severing Spurious Correlations with Data Pruning
Varun Mulchandani, Jung-Eun Kim
Latent Radiance Fields with 3D-aware 2D Representations
Chaoyi Zhou, Xi Liu, Feng Luo et al.
Novel View Synthesis with Pixel-Space Diffusion Models
Noam Elata, Bahjat Kawar, Yaron Ostrovsky-Berman et al.
The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition
Shuai Yuan, Xingshuo Han, Hongwei Li et al.
Improved Balanced Classification with Theoretically Grounded Loss Functions
Corinna Cortes, Mehryar Mohri, Yutao Zhong
Spiral: Semantic-Aware Progressive LiDAR Scene Generation and Understanding
Dekai Zhu, Yixuan Hu, Youquan Liu et al.
Distilled Prompt Learning for Incomplete Multimodal Survival Prediction
Yingxue Xu, Fengtao ZHOU, Chenyu Zhao et al.
When Are Concepts Erased From Diffusion Models?
Kevin Lu, Nicky Kriplani, Rohit Gandikota et al.
Leveraging Attention to Effectively Compress Prompts for Long-Context LLMs
Yunlong Zhao, Haoran Wu, Bo Xu
Time-o1: Time-Series Forecasting Needs Transformed Label Alignment
Hao Wang, Licheng Pan, Zhichao Chen et al.
Uncertainty Quantification with the Empirical Neural Tangent Kernel
Joseph Wilson, Chris van der Heide, Liam Hodgkinson et al.
Accelerating 3D Molecule Generation via Jointly Geometric Optimal Transport
Haokai Hong, Wanyu LIN, KC Tan
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
Chuhan ZHANG, Chaoyang Zhu, Pingcheng Dong et al.
Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity
Wentao Guo, Jikai Long, Yimeng Zeng et al.
Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning
Qi Wang, Zhipeng Zhang, Baao Xie et al.
CoPRA: Bridging Cross-domain Pretrained Sequence Models with Complex Structures for Protein-RNA Binding Affinity Prediction
Rong Han, Xiaohong Liu, Tong Pan et al.
Shape it Up! Restoring LLM Safety during Finetuning
ShengYun Peng, Pin-Yu Chen, Jianfeng Chi et al.
Bayesian WeakS-to-Strong from Text Classification to Generation
Ziyun Cui, Ziyang Zhang, Guangzhi Sun et al.
Learning Normal Flow Directly From Events
Dehao Yuan, Levi Burner, Jiayi Wu et al.
On Generalization Across Environments In Multi-Objective Reinforcement Learning
Jayden Teoh, Pradeep Varakantham, Peter Vamplew
SweetTok: Semantic-Aware Spatial-Temporal Tokenizer for Compact Video Discretization
Zhentao Tan, Ben Xue, Jian Jia et al.
Exact Expressive Power of Transformers with Padding
Will Merrill, Ashish Sabharwal
Rectifying Magnitude Neglect in Linear Attention
Qihang Fan, Huaibo Huang, Yuang Ai et al.
Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation
Kaining Ying, Henghui Ding, Guangquan Jie et al.
Pushing the Limits of All-Atom Geometric Graph Neural Networks: Pre-Training, Scaling, and Zero-Shot Transfer
Zihan Pengmei, Zhengyuan Shen, Zichen Wang et al.
BiggerGait: Unlocking Gait Recognition with Layer-wise Representations from Large Vision Models
Dingqiang Ye, Chao Fan, Zhanbo Huang et al.
From Probability to Counterfactuals: the Increasing Complexity of Satisfiability in Pearl's Causal Hierarchy
Julian Dörfler, Benito van der Zander, Markus Bläser et al.
Rethinking Fair Representation Learning for Performance-Sensitive Tasks
Charles Jones, Fabio De Sousa Ribeiro, Mélanie Roschewitz et al.
Learning to Communicate Through Implicit Communication Channels
Han Wang, Binbin Chen, zhang et al.
Reverse Diffusion Sequential Monte Carlo Samplers
Luhuan Wu, Yi Han, Christian Andersson Naesseth et al.
Multilevel neural simulation-based inference
Yuga Hikida, Ayush Bharti, Niall Jeffrey et al.
VLMaterial: Procedural Material Generation with Large Vision-Language Models
Beichen Li, Rundi Wu, Armando Solar-Lezama et al.
RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs
Jiaxing Wu, Lin Ning, Luyang Liu et al.
LiteSearch: Efficient Tree Search with Dynamic Exploration Budget for Math Reasoning
Ante Wang, Linfeng Song, Ye Tian et al.
Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization
Jianing Wang, Yang Zhou, Xiaocheng Zhang et al.
InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
Jinlai Liu, Jian Han, Bin Yan et al.
Lay2Story: Extending Diffusion Transformers for Layout-Togglable Story Generation
Ao Ma, Jiasong Feng, Ke Cao et al.
AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward
Haonan Han, Xiangzuo Wu, Huan Liao et al.
Revisiting Source-Free Domain Adaptation: a New Perspective via Uncertainty Control
Gezheng Xu, Hui GUO, Li Yi et al.
Few-Shot, No Problem: Descriptive Continual Relation Extraction
Nguyen Xuan Thanh, Anh Duc Le, Quyen Tran et al.
STAR: Stability-Inducing Weight Perturbation for Continual Learning
Masih Eskandar, Tooba Imtiaz, Davin Hill et al.
SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision
Kangjie Zheng, Siyue Liang, Junwei Yang et al.
Infer Human’s Intentions Before Following Natural Language Instructions
Yanming Wan, Yue Wu, Yiping Wang et al.
Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study
Xingxuan Zhang, Haoran Wang, Jiansheng Li et al.
Free360: Layered Gaussian Splatting for Unbounded 360-Degree View Synthesis from Extremely Sparse and Unposed Views
Chong Bao, Xiyu Zhang, Zehao Yu et al.
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
Juan A. Rodriguez, Xiangru Jian, Siba Smarak Panigrahi et al.
Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation
Seyedreza Mohseni, Seyedali Mohammadi, Deepa Tilwani et al.
4Deform: Neural Surface Deformation for Robust Shape Interpolation
Lu Sang, Zehranaz Canfes, Dongliang Cao et al.
CaO2: Rectifying Inconsistencies in Diffusion-Based Dataset Distillation
Haoxuan Wang, Zhenghao Zhao, Junyi Wu et al.
Memorize and Rank: Elevating Large Language Models for Clinical Diagnosis Prediction
Mingyu Derek Ma, Xiaoxuan Wang, Yijia Xiao et al.
Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion
Minkyoung Cho, Yulong Cao, Jiachen Sun et al.
GLoRa: A Benchmark to Evaluate the Ability to Learn Long-Range Dependencies in Graphs
Dongzhuoran Zhou, Evgeny Kharlamov, Egor Kostylev
Improving Multimodal Learning via Imbalanced Learning
Shicai Wei, Chunbo Luo, Yang Luo
Learning Diffusion Models with Flexible Representation Guidance
Chenyu Wang, Cai Zhou, Sharut Gupta et al.
HERO: Human Reaction Generation from Videos
Chengjun Yu, Wei Zhai, Yuhang Yang et al.
PerLDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Model
Jinhua Zhang, Hualian Sheng, Sijia Cai et al.
DecoupledGaussian: Object-Scene Decoupling for Physics-Based Interaction
Miaowei Wang, Yibo Zhang, Rui Ma et al.
Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
Klaus-Rudolf Kladny, Bernhard Schölkopf, Michael Muehlebach
CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models
Dongfang Li, Zetian Sun, Xinshuo Hu et al.
ChartSketcher: Reasoning with Multimodal Feedback and Reflection for Chart Understanding
Muye Huang, Lingling Zhang, Jie Ma et al.
Multi-party Collaborative Attention Control for Image Customization
Han Yang, Chuanguang Yang, Qiuli Wang et al.
HeGTa: Leveraging Heterogeneous Graph-enhanced Large Language Models for Few-shot Complex Table Understanding
Rihui Jin, Yu Li, Guilin Qi et al.
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao, Shiqian Su, Xizhou Zhu et al.
CITI: Enhancing Tool Utilizing Ability in Large Language Models Without Sacrificing General Performance
Yupu Hao, Pengfei Cao, Zhuoran Jin et al.
SEPARATE: A Simple Low-rank Projection for Gradient Compression in Modern Large-scale Model Training Process
Hanzhen Zhao, Xingyu Xie, Cong Fang et al.
From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport
Quentin Bouniot, Ievgen Redko, Anton Mallasto et al.
Model-Free Offline Reinforcement Learning with Enhanced Robustness
Chi Zhang, Zain Ulabedeen Farhat, George Atia et al.
Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias
Rui Lu, Runzhe Wang, Kaifeng Lyu et al.
MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe et al.
Hardware-Rasterized Ray-Based Gaussian Splatting
Samuel Rota Bulò, Lorenzo Porzi, Nemanja Bartolovic et al.
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li, Zichun Yu, Chenyan Xiong
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis, Gregor Bachmann, Yeongmin Kim et al.
MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging
Zihuan Qiu, Yi Xu, Chiyuan He et al.
Beyond FVD: An Enhanced Evaluation Metrics for Video Generation Distribution Quality
Ge Ya Luo, Gian M Favero, Zhi Hao Luo et al.
Removing Reflections from RAW Photos
Eric Kee, Adam Pikielny, Kevin Blackburn-Matzen et al.
Lawma: The Power of Specialization for Legal Annotation
Ricardo Dominguez-Olmedo, Vedant Nanda, Rediet Abebe et al.
Curly Flow Matching for Learning Non-gradient Field Dynamics
Katarina Petrović, Lazar Atanackovic, Viggo Moro et al.
Preference-Oriented Supervised Fine-Tuning: Favoring Target Model over Aligned Large Language Models
Yuchen Fan, Yuzhong Hong, Qiushi Wang et al.
PanoWan: Lifting Diffusion Video Generation Models to 360$^\circ$ with Latitude/Longitude-aware Mechanisms
Yifei Xia, Shuchen Weng, Siqi Yang et al.
Conformal Language Model Reasoning with Coherent Factuality
Maxon Rubin-Toles, Maya Gambhir, Keshav Ramji et al.
Logic.py: Bridging the Gap between LLMs and Constraint Solvers
Pascal Kesseli, Peter O'Hearn, Ricardo Cabral
BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments
Xinghao Wang, Pengyu Wang, Bo Wang et al.
CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation
Zhuoyan Luo, Yinghao Wu, Tianheng Cheng et al.
In-Context Learning Strategies Emerge Rationally
Daniel Wurgaft, Ekdeep S Lubana, Core Francisco Park et al.
Low-Light Image Enhancement using Event-Based Illumination Estimation
Lei Sun, Yuhan Bao, Jiajun Zhai et al.
Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models
Linh Tran, Wei Sun, Stacy Patterson et al.
Many-Objective Multi-Solution Transport
Ziyue Li, Tian Li, Virginia Smith et al.
Make Your Training Flexible: Towards Deployment-Efficient Video Models
Chenting Wang, Kunchang Li, Tianxiang Jiang et al.
Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts
Qizhou Chen, Chengyu Wang, Dakan Wang et al.
LNS2+RL: Combining Multi-agent Reinforcement Learning with Large Neighborhood Search in Multi-agent Path Finding
Yutong Wang, Tanishq Duhan, Jiaoyang Li et al.
Learning Heterogeneous Tissues with Mixture of Experts for Gigapixel Whole Slide Images
Junxian Wu, Minheng Chen, Xinyi Ke et al.
Incomplete and Unpaired Multi-View Graph Clustering with Cross-View Feature Fusion
Liang Zhao, Ziyue Wang, Xiao Wang et al.
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
Chengyu Du, Jinyi Han, Yizhou Ying et al.
Importance-Based Token Merging for Efficient Image and Video Generation
Haoyu Wu, Jingyi Xu, Hieu Le et al.
Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws
Zhixuan Pan, Shaowen Wang, Liao Pengfei et al.
ProtCLIP: Function-Informed Protein Multi-Modal Learning
Hanjing Zhou, Mingze Yin, Wei Wu et al.
Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series
Ching Chang, Jeehyun Hwang, Yidan Shi et al.
SwitchLingua: The First Large-Scale Multilingual and Multi-Ethnic Code-Switching Dataset
Peng Xie, Xingyuan Liu, Yequan Bie et al.
Semantic and Expressive Variations in Image Captions Across Languages
Andre Ye, Sebastin Santy, Jena D. Hwang et al.