Most Cited 2025 Poster Papers
21,856 papers found
ZeroHAR: Sensor Context Augments Zero-Shot Wearable Action Recognition
Ranak Roy Chowdhury, Ritvik Kapila, Ameya Panse et al.
Understanding Individual Agent Importance in Multi-Agent System via Counterfactual Reasoning
Jianming Chen, Yawen Wang, Junjie Wang et al.
VAE-Var: Variational Autoencoder-Enhanced Variational Methods for Data Assimilation in Meteorology
Yi Xiao, Qilong Jia, Kun Chen et al.
Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation
Liliang Ren, Congcong Chen, Haoran Xu et al.
Generalized Dimension Reduction Using Semi-Relaxed Gromov-Wasserstein Distance
Ranthony A. Clark, Tom Needham, Thomas Weighill
DyCON: Dynamic Uncertainty-aware Consistency and Contrastive Learning for Semi-supervised Medical Image Segmentation
Maregu Assefa, Muzammal Naseer, Iyyakutti Iyappan Ganapathi et al.
3D Occupancy Prediction with Low-Resolution Queries via Prototype-aware View Transformation
Gyeongrok Oh, Sung June Kim, Heeju Ko et al.
Learning Distances from Data with Normalizing Flows and Score Matching
Peter Sorrenson, Daniel Behrend-Uriarte, Christoph Schnörr et al.
CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching
Leying Zhang, Yao Qian, Xiaofei Wang et al.
Test3R: Learning to Reconstruct 3D at Test Time
Yuheng Yuan, Qiuhong Shen, Shizun Wang et al.
Fast Inference for Augmented Large Language Models
Rana Shahout, Cong Liang, Shiji Xin et al.
Achieving Maximin Share and EFX/EF1 Guarantees Simultaneously
Hannaneh Akrami, Nidhi Rathi
Braess’s Paradox of Generative AI
Boaz Taitler, Omer Ben-Porat
Inverse Reinforcement Learning by Estimating Expertise of Demonstrators
Mark Beliaev, Ramtin Pedarsani
Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning
Kaihang Pan, Yang Wu, Wendong Bu et al.
Probabilistic Stability Guarantees for Feature Attributions
Helen Jin, Anton Xue, Weiqiu You et al.
Prompt-based Unifying Inference Attack on Graph Neural Networks
Yuecen Wei, Xingcheng Fu, Lingyun Liu et al.
Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification
Yanghao Wang, Long Chen
Lifelong Safety Alignment for Language Models
Haoyu Wang, Yifei Zhao, Zeyu Qin et al.
Language Models over Canonical Byte-Pair Encodings
Tim Vieira, Tianyu Liu, Clemente Pasti et al.
Accessing Vision Foundation Models via ImageNet-1K
Yitian Zhang, Xu Ma, Yue Bai et al.
Conditional Latent Coding with Learnable Synthesized Reference for Deep Image Compression
Siqi Wu, Yinda Chen, Dong Liu et al.
Logits DeConfusion with CLIP for Few-Shot Learning
Shuo Li, Fang Liu, Zehua Hao et al.
Simultaneous Swap Regret Minimization via KL-Calibration
Haipeng Luo, Spandan Senapati, Vatsal Sharan
Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
Michal Nauman, Marek Cygan, Carmelo Sferrazza et al.
A Unified Model for Compressed Sensing MRI Across Undersampling Patterns
Armeet Singh Jatyani, Jiayun Wang, Aditi Chandrashekar et al.
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
Kunjun Li, Zigeng Chen, Cheng-Yen Yang et al.
SimpleStrat: Diversifying Language Model Generation with Stratification
Justin Wong, Yury Orlovskiy, Alexander Shypula et al.
Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
Kazuki Irie, Morris Yau, Samuel J Gershman
Dynamic Graph Learning with Static Relations for Credit Risk Assessment
Qi Yuan, Yang Liu, Yateng Tang et al.
Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning
Guozheng Ma, Lu Li, Zilin Wang et al.
RouterRetriever: Routing over a Mixture of Expert Embedding Models
Hyunji Lee, Luca Soldaini, Arman Cohan et al.
MobileIE: An Extremely Lightweight and Effective ConvNet for Real-Time Image Enhancement on Mobile Devices
Hailong Yan, Ao Li, Xiangtao Zhang et al.
T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning
Yanjun Fu, Faisal Hamman, Sanghamitra Dutta
Zeroth-Order Fine-Tuning of LLMs in Random Subspaces
Ziming Yu, Pan Zhou, Sike Wang et al.
A multiscale analysis of mean-field transformers in the moderate interaction regime
Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi
Statistically Valid Post-Deployment Monitoring Should Be Standard for AI-Based Digital Health
Pavel Dolin, Weizhi Li, Gautam Dasarathy et al.
DexGarmentLab: Dexterous Garment Manipulation Environment with Generalizable Policy
Yuran Wang, Ruihai Wu, Yue Chen et al.
TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state
Xiaowen Ma, Zhen-Liang Ni, Shuai Xiao et al.
Keeping an Eye on LLM Unlearning: The Hidden Risk and Remedy
Jie Ren, Zhenwei Dai, Xianfeng Tang et al.
No-Regret is not enough! Bandits with General Constraints through Adaptive Regret Minimization
Martino Bernasconi, Matteo Castiglioni, Andrea Celli
DataRater: Meta-Learned Dataset Curation
Dan Andrei Calian, Greg Farquhar, Iurii Kemaev et al.
BOOM: Benchmarking Out-Of-distribution Molecular Property Predictions of Machine Learning Models
Evan Antoniuk, Shehtab Zaman, Tal Ben-Nun et al.
EVOS: Efficient Implicit Neural Training via EVOlutionary Selector
Weixiang Zhang, Shuzhao Xie, Chengwei Ren et al.
ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS
Weijie Wang, Donny Y. Chen, Zeyu Zhang et al.
Beyond Spatial Domain: Cross-domain Promoted Fourier Convolution Helps Single Image Dehazing
Xiaozhe Zhang, Haidong Ding, Fengying Xie et al.
Position: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process
Jing Yang
Interpretable Face Anti-Spoofing: Enhancing Generalization with Multimodal Large Language Models
Guosheng Zhang, Keyao Wang, Haixiao Yue et al.
Textured 3D Regenerative Morphing with 3D Diffusion Prior
Songlin Yang, Yushi Lan, Honghua Chen et al.
Snakes and Ladders: Two Steps Up for VideoMamba
Hui Lu, Albert Ali Salah, Ronald Poppe
Active Fine-Tuning of Multi-Task Policies
Marco Bagatella, Jonas Hübotter, Georg Martius et al.
Volume Optimality in Conformal Prediction with Structured Prediction Sets
Chao Gao, Liren Shan, Vaidehi Srinivas et al.
LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration
Yuyao Zhang, Jinghao Li, Yu-Wing Tai
EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization
Mujin Cheon, Jay Lee, Dong-Yeun Koh et al.
Straight-Line Diffusion Model for Efficient 3D Molecular Generation
Yuyan Ni, Shikun Feng, Haohan Chi et al.
Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory
Aymane El Firdoussi, Mohamed El Amine Seddik, Soufiane Hayou et al.
Bridging Compressed Image Latents and Multimodal Large Language Models
Chia-Hao Kao, Cheng Chien, Yu-Jen Tseng et al.
Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection
Herun Wan, Jiaying Wu, Minnan Luo et al.
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning
Ali Taghibakhshi, Sharath Turuvekere Sreenivas, Saurav Muralidharan et al.
Open-Vocabulary Octree-Graph for 3D Scene Understanding
Zhigang Wang, Yifei Su, Chenhui Li et al.
Scaffolding Dexterous Manipulation with Vision-Language Models
Vincent de Bakker, Joey Hejna, Tyler Lum et al.
AugRefer: Advancing 3D Visual Grounding via Cross-Modal Augmentation and Spatial Relation-based Referring
Xinyi Wang, Na Zhao, Zhiyuan Han et al.
Understanding and Mitigating Memorization in Diffusion Models for Tabular Data
Zhengyu Fang, Zhimeng Jiang, Huiyuan Chen et al.
ProxSparse: Regularized Learning of Semi-Structured Sparsity Masks for Pretrained LLMs
Hongyi Liu, Rajarshi Saha, Zhen Jia et al.
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems
Xuanming Zhang, Yuxuan Chen, Samuel (Min-Hsuan) Yeh et al.
Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning
Yu Zhang, Jialei Zhou, Xinchen Li et al.
State Space Models are Provably Comparable to Transformers in Dynamic Token Selection
Naoki Nishikawa, Taiji Suzuki
DEALing with Image Reconstruction: Deep Attentive Least Squares
Mehrsa Pourya, Erich Kobler, Michael Unser et al.
Beyond Content Relevance: Evaluating Instruction Following in Retrieval Models
Jianqun Zhou, Yuanlei Zheng, Wei Chen et al.
VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment
Shangkun Sun, Xiaoyu Liang, Songlin Fan et al.
Harnessing Uncertainty-aware Bounding Boxes for Unsupervised 3D Object Detection
Ruiyang Zhang, Hu Zhang, Zhedong Zheng
LDP: Generalizing to Multilingual Visual Information Extraction by Language Decoupled Pretraining
Huawen Shen, Gengluo Li, Jinwen Zhong et al.
Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment
Haoyuan Wu, Haisheng Zheng, Yuan Pu et al.
BadRobot: Jailbreaking Embodied LLM Agents in the Physical World
Hangtao Zhang, Chenyu Zhu, Xianlong Wang et al.
Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness
Rongzhe Wei, Peizhi Niu, Hans Hao-Hsun Hsu et al.
EOV-Seg: Efficient Open-Vocabulary Panoptic Segmentation
Hongwei Niu, Jie Hu, Jianghang Lin et al.
ConMix: Contrastive Mixup at Representation Level for Long-tailed Deep Clustering
Zhixin Li, Yuheng Jia
Noisy Label Calibration for Multi-View Classification
Shilin Xu, Yuan Sun, Xingfeng Li et al.
AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios
Ziming Huang, Xurui Li, Haotian Liu et al.
Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations
Lucy Farnik, Tim Lawson, Conor Houghton et al.
Probing Equivariance and Symmetry Breaking in Convolutional Networks
Sharvaree Vadgama, Mohammad Islam, Domas Buracas et al.
DualCP: Rehearsal-Free Domain-Incremental Learning via Dual-Level Concept Prototype
Qiang Wang, Yuhang He, Songlin Dong et al.
CSformer: Combining Channel Independence and Mixing for Robust Multivariate Time Series Forecasting
Haoxin Wang, Yipeng Mo, Kunlan Xiang et al.
Are Expressive Models Truly Necessary for Offline RL?
Guan Wang, Haoyi Niu, Jianxiong Li et al.
Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation
Chaoyang Wang, Ashkan Mirzaei, Vidit Goel et al.
Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets
Benjamin Dupuis, Paul Viallard, George Deligiannidis et al.
FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models
Jintao Tong, Wenwei Jin, Pengda Qin et al.
Bridging the Gap between Database Search and De Novo Peptide Sequencing with SearchNovo
Jun Xia, Sizhe Liu, Jingbo Zhou et al.
Towards Doctor-Like Reasoning: Medical RAG Fusing Knowledge with Patient Analogy through Textual Gradients
Yuxing Lu, Gecheng Fu, Wei Wu et al.
Solving Robust Markov Decision Processes: Generic, Reliable, Efficient
Tobias Meggendorfer, Maximilian Weininger, Patrick Wienhöft
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
Yuheng Yuan, Qiuhong Shen, Xingyi Yang et al.
Diff-Prompt: Diffusion-driven Prompt Generator with Mask Supervision
Weicai Yan, Wang Lin, Zirun Guo et al.
A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search
Arnav Kumar Jain, Vibhakar Mohta, Subin Kim et al.
Prediction-Feedback DETR for Temporal Action Detection
Jihwan Kim, Miso Lee, Cheol-Ho Cho et al.
Benign Overfitting in Single-Head Attention
Roey Magen, Shuning Shang, Zhiwei Xu et al.
Utilize the Flow Before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning
Runchuan Zhu, Zhipeng Ma, Jiang Wu et al.
GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning
Minghao Xu, Yunteng Geng, Yihang Zhang et al.
T2V-OptJail: Discrete Prompt Optimization for Text-to-Video Jailbreak Attacks
Jiayang Liu, Siyuan Liang, Shiqian Zhao et al.
Neuron-based Multifractal Analysis of Neuron Interaction Dynamics in Large Models
Xiongye Xiao, Heng Ping, Chenyu Zhou et al.
Growth Inhibitors for Suppressing Inappropriate Image Concepts in Diffusion Models
Die Chen, Zhiwen Li, Mingyuan Fan et al.
StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces
Kyeongmin Yeo, Jaihoon Kim, Minhyuk Sung
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers
Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag
RadarSplat: Radar Gaussian Splatting for High-Fidelity Data Synthesis and 3D Reconstruction of Autonomous Driving Scenes
Pou-Chun Kung, Skanda Harisha, Ram Vasudevan et al.
Breaking Neural Network Scaling Laws with Modularity
Akhilan Boopathy, Sunshine Jiang, William Yue et al.
SVIP: Semantically Contextualized Visual Patches for Zero-Shot Learning
Zhi Chen, Zecheng Zhao, Jingcai Guo et al.
"Principal Components" Enable A New Language of Images
Xin Wen, Bingchen Zhao, Ismail Elezi et al.
Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models
Daoyuan Chen, Yilun Huang, Xuchen Pan et al.
Auto-Regressive Diffusion for Generating 3D Human-Object Interactions
Zichen Geng, Zeeshan Hayder, Wei Liu et al.
Enhancing Uncertainty Modeling with Semantic Graph for Hallucination Detection
Kedi Chen, Qin Chen, Jie Zhou et al.
Seg4Diff: Unveiling Open-Vocabulary Semantic Segmentation in Text-to-Image Diffusion Transformers
Chaehyun Kim, Heeseong Shin, Eunbeen Hong et al.
Dual-Process Image Generation
Grace Luo, Jonathan Granskog, Aleksander Holynski et al.
DyMU: Dynamic Merging and Virtual Unmerging for Efficient Variable-Length VLMs
Zhenhailong Wang, Senthil Purushwalkam, Caiming Xiong et al.
Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction
Yunheng Li, Yuxuan Li, Quan-Sheng Zeng et al.
Attention Mechanism, Max-Affine Partition, and Universal Approximation
Hude Liu, Jerry Yao-Chieh Hu, Zhao Song et al.
D²iT: Dynamic Diffusion Transformer for Accurate Image Generation
Weinan Jia, Mengqi Huang, Nan Chen et al.
Space Group Equivariant Crystal Diffusion
Rees Chang, Angela Pak, Alex Guerra et al.
Let Me Grok for You: Accelerating Grokking via Embedding Transfer from a Weaker Model
Zhiwei Xu, Zhiyu Ni, Yixin Wang et al.
REVECA: Adaptive Planning and Trajectory-Based Validation in Cooperative Language Agents Using Information Relevance and Relative Proximity
SeungWon Seo, SeongRae Noh, Junhyeok Lee et al.
H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving
Siran Chen, Yuxiao Luo, Yue Ma et al.
CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension
Rui Li, Zeyu Zhang, Xiaohe Bo et al.
ELICIT: LLM Augmentation Via External In-context Capability
Futing Wang, Jianhao (Elliott) Yan, Yue Zhang et al.
Understanding Adam Requires Better Rotation Dependent Assumptions
Tianyue Zhang, Lucas Maes, Alan Milligan et al.
Split Gibbs Discrete Diffusion Posterior Sampling
Wenda Chu, Zihui Wu, Yifan Chen et al.
Prediction-Powered Causal Inferences
Riccardo Cadei, Ilker Demirel, Piersilvio De Bartolomeis et al.
Federated Class-Incremental Learning: A Hybrid Approach Using Latent Exemplars and Data-Free Techniques to Address Local and Global Forgetting
Milad Khademi Nori, Il-Min Kim, Guanghui Wang
TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types
Jiankang Chen, Tianke Zhang, Changyi Liu et al.
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models
Cameron Tice, Philipp Kreer, Nathan Helm-Burger et al.
HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning
Zhi Jing, Siyuan Yang, Jicong Ao et al.
TR-PTS: Task-Relevant Parameter and Token Selection for Efficient Tuning
Siqi Luo, Haoran Yang, Yi Xin et al.
Neural Eulerian Scene Flow Fields
Kyle Vedder, Neehar Peri, Ishan Khatri et al.
MambaVLT: Time-Evolving Multimodal State Space Model for Vision-Language Tracking
Xinqi Liu, Li Zhou, Zikun Zhou et al.
Event-based Tiny Object Detection: A Benchmark Dataset and Baselines
Nuo Chen, Chao Xiao, Yimian Dai et al.
Týr-the-Pruner: Structural Pruning LLMs via Global Sparsity Distribution Optimization
Guanchen Li, Yixing Xu, Zeping Li et al.
Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling
Xiao Li, Zekai Zhang, Xiang Li et al.
StableCodec: Taming One-Step Diffusion for Extreme Image Compression
Tianyu Zhang, Xin Luo, Li Li et al.
Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models
Jingcheng Deng, Zihao Wei, Liang Pang et al.
Zebra-Llama: Towards Extremely Efficient Hybrid Models
Mingyu Yang, Mehdi Rezagholizadeh, Guihong Li et al.
Precedence-Constrained Winter Value for Effective Graph Data Valuation
Hongliang Chi, Wei Jin, Charu Aggarwal et al.
AutoSGNN: Automatic Propagation Mechanism Discovery for Spectral Graph Neural Networks
Shibing Mo, Kai Wu, Qixuan Gao et al.
Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs
Yaniv Nikankin, Dana Arad, Yossi Gandelsman et al.
LibriBrain: Over 50 Hours of Within-Subject MEG to Improve Speech Decoding Methods at Scale
Miran Özdogan, Gilad Landau, Gereon Elvers et al.
AURELIA: Test-time Reasoning Distillation in Audio-Visual LLMs
Sanjoy Chowdhury, Hanan Gani, Nishit Anand et al.
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models
Jun Zhang, Jue Wang, Huan Li et al.
Expressivity of Neural Networks with Random Weights and Learned Biases
Ezekiel Williams, Alexandre Payeur, Avery Ryoo et al.
Validating LLM-as-a-Judge Systems under Rating Indeterminacy
Luke Guerdan, Solon Barocas, Kenneth Holstein et al.
Trust Region Constrained Measure Transport in Path Space for Stochastic Optimal Control and Inference
Denis Blessing, Julius Berner, Lorenz Richter et al.
NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images
Lingen Li, Zhaoyang Zhang, Yaowei Li et al.
Stealthy Shield Defense: A Conditional Mutual Information-Based Approach against Black-Box Model Inversion Attacks
Tianqu Zhuang, Hongyao Yu, Yixiang Qiu et al.
Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning
Hung Le, Dung Nguyen, Kien Do et al.
Learned Image Compression with Hierarchical Progressive Context Modeling
Yuqi Li, Haotian Zhang, Li Li et al.
HiBug2: Efficient and Interpretable Error Slice Discovery for Comprehensive Model Debugging
Muxi Chen, Chenchen Zhao, Qiang Xu
EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?
Yuqian Yuan, Ronghao Dang, Long Li et al.
Mask in the Mirror: Implicit Sparsification
Tom Jacobs, Rebekka Burkholz
Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification
Hsun-Yu Kuo, Yin-Hsiang Liao, Yu-Chieh Chao et al.
CODA: Repurposing Continuous VAEs for Discrete Tokenization
Zeyu Liu, Zanlin Ni, Yeguo Hua et al.
GS-LIVM: Real-Time Photo-Realistic LiDAR-Inertial-Visual Mapping with Gaussian Splatting
Yusen Xie, Zhenmin Huang, Jin Wu et al.
DRL: Decomposed Representation Learning for Tabular Anomaly Detection
Hangting Ye, He Zhao, Wei Fan et al.
Rethinking Bimanual Robotic Manipulation: Learning with Decoupled Interaction Framework
Jian-Jian Jiang, Xiao-Ming Wu, Yi-Xiang He et al.
Functionality Understanding and Segmentation in 3D Scenes
Jaime Corsetti, Francesco Giuliari, Alice Fasoli et al.
Bi-Directional Multi-Scale Graph Dataset Condensation via Information Bottleneck
Xingcheng Fu, Yisen Gao, Beining Yang et al.
SynFER: Towards Boosting Facial Expression Recognition with Synthetic Data
Xilin He, Cheng Luo, Xiaole Xian et al.
IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning
Quan Zhang, Yuxin Qi, Xi Tang et al.
Tight Clusters Make Specialized Experts
Stefan Nielsen, Rachel Teo, Laziz Abdullaev et al.
Autocorrelation Matters: Understanding the Role of Initialization Schemes for State Space Models
Fusheng Liu, Qianxiao Li
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution
Gene Chou, Wenqi Xian, Guandao Yang et al.
ReAL-AD: Towards Human-Like Reasoning in End-to-End Autonomous Driving
Yuhang Lu, Jiadong Tu, Yuexin Ma et al.
HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
Yu Zhou, Xingyu Wu, Jibin Wu et al.
REDUCIO! Generating 1K Video within 16 Seconds using Extremely Compressed Motion Latents
Rui Tian, Qi Dai, Jianmin Bao et al.
Generating Counterfactual Explanations Under Temporal Constraints
Andrei Buliga, Chiara Di Francescomarino, Chiara Ghidini et al.
Momentum Multi-Marginal Schrödinger Bridge Matching
Panagiotis Theodoropoulos, Augustinos Saravanos, Evangelos Theodorou et al.
SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training
Yehonathan Refael, Guy Smorodinsky, Tom Tirer et al.
Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness
Boqian Wu, Qiao Xiao, Shunxin Wang et al.
OSDA Agent: Leveraging Large Language Models for De Novo Design of Organic Structure Directing Agents
Zhaolin Hu, Yixiao Zhou, Zhongan Wang et al.
Decoupling Angles and Strength in Low-rank Adaptation
Massimo Bini, Leander Girrbach, Zeynep Akata
Always Skip Attention
Yiping Ji, Hemanth Saratchandran, Peyman Moghadam et al.
Multimodal Tabular Reasoning with Privileged Structured Information
Jun-Peng Jiang, Yu Xia, Hai-Long Sun et al.
Dissecting Generalized Category Discovery: Multiplex Consensus under Self-Deconstruction
Luyao Tang, Kunze Huang, Yuxuan Yuan et al.
When Maximum Entropy Misleads Policy Optimization
Ruipeng Zhang, Ya-Chien Chang, Sicun Gao
Improving Energy Natural Gradient Descent through Woodbury, Momentum, and Randomization
Andrés Guzmán-Cordero, Felix Dangel, Gil Goldshlager et al.
Vgent: Graph-based Retrieval-Reasoning-Augmented Generation For Long Video Understanding
Xiaoqian Shen, Wenxuan Zhang, Jun Chen et al.
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation
Runze Zhang, Guoguang Du, Xiaochuan Li et al.
Gradient descent with generalized Newton’s method
Zhiqi Bu, Shiyun Xu
Turbo3D: Ultra-fast Text-to-3D Generation
Hanzhe Hu, Tianwei Yin, Fujun Luan et al.
Mixture of Experts Based Multi-Task Supervise Learning from Crowds
Tao Han, Huaixuan Shi, Xinyi Ding et al.
R-LiViT: A LiDAR-Visual-Thermal Dataset Enabling Vulnerable Road User Focused Roadside Perception
Jonas Mirlach, Lei Wan, Andreas Wiedholz et al.
MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights
Jingjing Hu, Dan Guo, Zhan Si et al.
CompCap: Improving Multimodal Large Language Models with Composite Captions
Xiaohui Chen, Satya Narayan Shukla, Mahmoud Azab et al.
Detail-Preserving Latent Diffusion for Stable Shadow Removal
Jiamin Xu, Yuxin Zheng, Zelong Li et al.
FedTMOS: Efficient One-Shot Federated Learning with Tsetlin Machine
Shannon How, Jagmohan Chauhan, Geoff Merrett et al.
Bringing RNNs Back to Efficient Open-Ended Video Understanding
Weili Xu, Enxin Song, Wenhao Chai et al.
Federated Residual Low-Rank Adaption of Large Language Models
Yunlu Yan, Chun-Mei Feng, Wangmeng Zuo et al.
Tuning-Free Bilevel Optimization: New Algorithms and Convergence Analysis
Yifan Yang, Hao Ban, Minhui Huang et al.
MIEB: Massive Image Embedding Benchmark
Chenghao Xiao, Isaac Chung, Imene Kerboua et al.
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
Jingli Lin, Chenming Zhu, Runsen Xu et al.
Visual Persona: Foundation Model for Full-Body Human Customization
Jisu Nam, Soowon Son, Zhan Xu et al.
Where, What, Why: Towards Explainable Driver Attention Prediction
Yuchen Zhou, Jiayu Tang, Xiaoyan Xiao et al.
Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation
Aishik Konwer, Zhijian Yang, Erhan Bas et al.
EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
Kaizhi Zheng, Xiaotong Chen, Xuehai He et al.
Prediction-Powered E-Values
Daniel Csillag, Claudio Struchiner, Guilherme Tegoni Goedert
Stiefel Flow Matching for Moment-Constrained Structure Elucidation
Austin H Cheng, Alston Lo, Kin Long Kelvin Lee et al.