Most Cited 2025 "latent dimension alignment" Papers
FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations
Hmrishav Bandyopadhyay, Yi-Zhe Song
RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings
Aayush Dhakal, Srikumar Sastry, Subash Khanal et al.
DiffRetouch: Using Diffusion to Retouch on the Shoulder of Experts
Zheng-Peng Duan, Jiawei Zhang, Zheng Lin et al.
Auto-Regressive Diffusion for Generating 3D Human-Object Interactions
Zichen Geng, Zeeshan Hayder, Wei Liu et al.
Revisiting a Design Choice in Gradient Temporal Difference Learning
Xiaochi Qian, Shangtong Zhang
Autocorrelation Matters: Understanding the Role of Initialization Schemes for State Space Models
Fusheng Liu, Qianxiao Li
Augmented Deep Contexts for Spatially Embedded Video Coding
Yifan Bian, Chuanbo Tang, Li Li et al.
QT-DoG: Quantization-Aware Training for Domain Generalization
Saqib Javed, Hieu Le, Mathieu Salzmann
PEACE: Empowering Geologic Map Holistic Understanding with MLLMs
Yangyu Huang, Tianyi Gao, Haoran Xu et al.
Position: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process
Jing Yang
Unisolver: PDE-Conditional Transformers Towards Universal Neural PDE Solvers
Hang Zhou, Yuezhou Ma, Haixu Wu et al.
Toward Efficient Kernel-Based Solvers for Nonlinear PDEs
Zhitong Xu, Da Long, Yiming Xu et al.
H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving
Siran Chen, Yuxiao Luo, Yue Ma et al.
Large Language Models Think Too Fast To Explore Effectively
Lan Pan, Hanbo Xie, Robert Wilson
Stable-SCore: A Stable Registration-based Framework for 3D Shape Correspondence
Haolin Liu, Xiaohang Zhan, Zizheng Yan et al.
MindSimulator: Exploring Brain Concept Localization via Synthetic fMRI
Qi Zhang, Qi Zhang, Zixuan Gong et al.
Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets
Benjamin Dupuis, Paul Viallard, George Deligiannidis et al.
Understanding and Mitigating Memorization in Diffusion Models for Tabular Data
Zhengyu Fang, Zhimeng Jiang, Huiyuan Chen et al.
RealEdit: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations
Peter Sushko, Ayana Bharadwaj, Zhi Yang Lim et al.
The emergence of sparse attention: impact of data distribution and benefits of repetition
Nicolas Zucchet, Francesco D'Angelo, Andrew Lampinen et al.
Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection
Herun Wan, Jiaying Wu, Minnan Luo et al.
Task Generalization with Autoregressive Compositional Structure: Can Learning from $D$ Tasks Generalize to $D^T$ Tasks?
Amirhesam Abedsoltan, Huaqing Zhang, Kaiyue Wen et al.
FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models
Jintao Tong, Wenwei Jin, Pengda Qin et al.
DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction
Ben Kaye, Tomas Jakab, Shangzhe Wu et al.
Token Perturbation Guidance for Diffusion Models
Javad Rajabi, Soroush Mehraban, Seyedmorteza Sadat et al.
WaterDiffusion: Learning a Prior-involved Unrolling Diffusion for Joint Underwater Saliency Detection and Visual Restoration
Laibin Chang, Yunke Wang, Longxiang Deng et al.
Generative Sparse-View Gaussian Splatting
Hanyang Kong, Xingyi Yang, Xinchao Wang
ProtoArgNet: Interpretable Image Classification with Super-Prototypes and Argumentation
Hamed Ayoobi, Nico Potyka, Francesca Toni
ProxSparse: Regularized Learning of Semi-Structured Sparsity Masks for Pretrained LLMs
Hongyi Liu, Rajarshi Saha, Zhen Jia et al.
Exploit Your Latents: Coarse-Grained Protein Backmapping with Latent Diffusion Models
Rongchao Zhang, Yu Huang, Yiwei Lou et al.
StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer
Ruojun Xu, Weijie Xi, Xiaodi Wang et al.
Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA
Shuangyi Chen, Yuanxin Guo, Yue Ju et al.
4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians
Hidenobu Matsuki, Gwangbin Bae, Andrew J. Davison
DEALing with Image Reconstruction: Deep Attentive Least Squares
Mehrsa Pourya, Erich Kobler, Michael Unser et al.
Manipulating Feature Visualizations with Gradient Slingshots
Dilyara Bareeva, Marina Höhne, Alexander Warnecke et al.
TurboFill: Adapting Few-step Text-to-image Model for Fast Image Inpainting
Liangbin Xie, Daniil Pakhomov, Zhonghao Wang et al.
IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning
Quan Zhang, Yuxin Qi, Xi Tang et al.
Flowing Datasets with Wasserstein over Wasserstein Gradient Flows
Clément Bonet, Christophe Vauthier, Anna Korba
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization
Mingzhe Du, Anh Tuan Luu, Yue Liu et al.
PhysAug: A Physical-guided and Frequency-based Data Augmentation for Single-Domain Generalized Object Detection
Xiaoran Xu, Jiangang Yang, Wenhui Shi et al.
GNNs Getting ComFy: Community and Feature Similarity Guided Rewiring
Celia Rubio-Madrigal, Adarsh Jamadandi, Rebekka Burkholz
Cached Multi-Lora Composition for Multi-Concept Image Generation
Xiandong Zou, Mingzhu Shen, Christos-Savvas Bouganis et al.
MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights
Jingjing Hu, Dan Guo, Zhan Si et al.
Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations
Pengcheng Jiang, Cao Xiao, Tianfan Fu et al.
StableCodec: Taming One-Step Diffusion for Extreme Image Compression
Tianyu Zhang, Xin Luo, Li Li et al.
BrainOOD: Out-of-distribution Generalizable Brain Network Analysis
Jiaxing Xu, Yongqiang Chen, Xia Dong et al.
Hypergraph Attacks via Injecting Homogeneous Nodes into Elite Hyperedges
Meixia He, Peican Zhu, Keke Tang et al.
Are Expressive Models Truly Necessary for Offline RL?
Guan Wang, Haoyi Niu, Jianxiong Li et al.
Multi-modal Knowledge Distillation-based Human Trajectory Forecasting
Jaewoo Jeong, Seohee Lee, Daehee Park et al.
CSformer: Combining Channel Independence and Mixing for Robust Multivariate Time Series Forecasting
Haoxin Wang, Yipeng Mo, Kunlan Xiang et al.
FluxSpace: Disentangled Semantic Editing in Rectified Flow Models
Yusuf Dalva, Kavana Venkatesh, Pinar Yanardag
DualCP: Rehearsal-Free Domain-Incremental Learning via Dual-Level Concept Prototype
Qiang Wang, Yuhang He, Songlin Dong et al.
End-to-end Learning of Gaussian Mixture Priors for Diffusion Sampler
Denis Blessing, Xiaogang Jia, Gerhard Neumann
SDGOCC: Semantic and Depth-Guided Bird's-Eye View Transformation for 3D Multimodal Occupancy Prediction
ZaiPeng Duan, Xuzhong Hu, Pei An et al.
Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities
Liuyi Wang, Xinyuan Xia, Hui Zhao et al.
Locally Convex Global Loss Network for Decision-Focused Learning
Haeun Jeon, Hyunglip Bae, Minsu Park et al.
Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation
Liliang Ren, Congcong Chen, Haoran Xu et al.
Utilize the Flow Before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning
Runchuan Zhu, Zhipeng Ma, Jiang Wu et al.
Expressivity of Neural Networks with Random Weights and Learned Biases
Ezekiel Williams, Alexandre Payeur, Avery Ryoo et al.
Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory
Aymane El Firdoussi, Mohamed El Amine Seddik, Soufiane Hayou et al.
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Gomes, Yanlei Zhang, Eugene Belilovsky et al.
T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data
Hugo Thimonier, José Lucas De Melo Costa, Fabrice Popineau et al.
COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection
Jinqi Xiao, Shen Sang, Tiancheng Zhi et al.
LVFace: Progressive Cluster Optimization for Large Vision Models in Face Recognition
Jinghan You, Shanglin Li, Yuanrui Sun et al.
AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations
Junli Liu, Qizhi Chen, Zhigang Wang et al.
Feature Responsiveness Scores: Model-Agnostic Explanations for Recourse
Seung Hyun Cheon, Anneke Wernerfelt, Sorelle Friedler et al.
6D Object Pose Tracking in Internet Videos for Robotic Manipulation
Georgy Ponimatkin, Martin Cífka, Tomas Soucek et al.
CODA: Repurposing Continuous VAEs for Discrete Tokenization
Zeyu Liu, Zanlin Ni, Yeguo Hua et al.
Aligning Language Models Using Follow-up Likelihood as Reward Signal
Chen Zhang, Dading Chong, Feng Jiang et al.
Runtime Analysis for Multi-Objective Evolutionary Algorithms in Unbounded Integer Spaces
Benjamin Doerr, Martin S. Krejca, Günter Rudolph
R-LiViT: A LiDAR-Visual-Thermal Dataset Enabling Vulnerable Road User Focused Roadside Perception
Jonas Mirlach, Lei Wan, Andreas Wiedholz et al.
ReDit: Reward Dithering for Improved LLM Policy Optimization
Chenxing Wei, Jiarui Yu, Ying He et al.
Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning
Hung Le, Dung Nguyen, Kien Do et al.
DeRIS: Decoupling Perception and Cognition for Enhanced Referring Image Segmentation through Loopback Synergy
Ming Dai, Wenxuan Cheng, Jiang-Jiang Liu et al.
CATSplat: Context-Aware Transformer with Spatial Guidance for Generalizable 3D Gaussian Splatting from A Single-View Image
Wonseok Roh, Hwanhee Jung, JongWook Kim et al.
ZeroStereo: Zero-shot Stereo Matching from Single Images
Xianqi Wang, Hao Yang, Gangwei Xu et al.
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models
Jun Zhang, Jue Wang, Huan Li et al.
Cross-modal Ship Re-Identification via Optical and SAR Imagery: A Novel Dataset and Method
Han Wang, Shengyang Li, Jian Yang et al.
How Do I Do That? Synthesizing 3D Hand Motion and Contacts for Everyday Interactions
Aditya Prakash, Benjamin E Lundell, Dmitry Andreychuk et al.
GroupMamba: Efficient Group-Based Visual State Space Model
Abdelrahman Shaker, Syed Talal Wasim, Salman Khan et al.
CLIPDrag: Combining Text-based and Drag-based Instructions for Image Editing
Ziqi Jiang, Zhen Wang, Long Chen
UV-Attack: Physical-World Adversarial Attacks on Person Detection via Dynamic-NeRF-based UV Mapping
Yanjie Li, Kaisheng Liang, Bin Xiao
RadarSplat: Radar Gaussian Splatting for High-Fidelity Data Synthesis and 3D Reconstruction of Autonomous Driving Scenes
Pou-Chun Kung, Skanda Harisha, Ram Vasudevan et al.
Enhancing Reward Models for High-quality Image Generation: Beyond Text-Image Alignment
Ying Ba, Tianyu Zhang, Yalong Bai et al.
3D Occupancy Prediction with Low-Resolution Queries via Prototype-aware View Transformation
Gyeongrok Oh, Sung June Kim, Heeju Ko et al.
Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning
Yu Zhang, Jialei Zhou, Xinchen Li et al.
Unleashing the Potential of Vision-Language Pre-Training for 3D Zero-Shot Lesion Segmentation via Mask-Attribute Alignment
Yankai Jiang, Wenhui Lei, Xiaofan Zhang et al.
Learning with Calibration: Exploring Test-Time Computing of Spatio-Temporal Forecasting
Wei Chen, Yuxuan Liang
SUMI-IFL: An Information-Theoretic Framework for Image Forgery Localization with Sufficiency and Minimality Constraints
Ziqi Sheng, Wei Lu, Xiangyang Luo et al.
MobileIE: An Extremely Lightweight and Effective ConvNet for Real-Time Image Enhancement on Mobile Devices
Hailong Yan, Ao Li, Xiangtao Zhang et al.
From Easy to Hard: Progressive Active Learning Framework for Infrared Small Target Detection with Single Point Supervision
Chuang Yu, Jinmiao Zhao, Yunpeng Liu et al.
SpatialSplat: Efficient Semantic 3D from Sparse Unposed Images
Yu Sheng, Jiajun Deng, Xinran Zhang et al.
TurboReg: TurboClique for Robust and Efficient Point Cloud Registration
Shaocheng Yan, Pengcheng Shi, Zhenjun Zhao et al.
Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning
Yonghao Liu, Mengyu Li, Wei Pang et al.
DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection
Jaewoo Song, Daemin Park, Kanghyun Baek et al.
"Principal Components" Enable A New Language of Images
Xin Wen, Bingchen Zhao, Ismail Elezi et al.
A Comprehensive Study of Decoder-Only LLMs for Text-to-Image Generation
Andrew Z Wang, Songwei Ge, Tero Karras et al.
Bidirectional Decoding: Improving Action Chunking via Guided Test-Time Sampling
Yuejiang Liu, Jubayer Hamid, Annie Xie et al.
SVIP: Semantically Contextualized Visual Patches for Zero-Shot Learning
Zhi Chen, Zecheng Zhao, Jingcai Guo et al.
Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval
Guangyuan Ma, Yongliang Ma, Xing Wu et al.
Robust Machine Unlearning for Quantized Neural Networks via Adaptive Gradient Reweighting with Similar Labels
Yujia Tong, Yuze Wang, Jingling Yuan et al.
BF-STVSR: B-Splines and Fourier - Best Friends for High Fidelity Spatial-Temporal Video Super-Resolution
Eunjin Kim, Hyeonjin Kim, Kyong Hwan Jin et al.
ReAL-AD: Towards Human-Like Reasoning in End-to-End Autonomous Driving
Yuhang Lu, Jiadong Tu, Yuexin Ma et al.
Guiding Human-Object Interactions with Rich Geometry and Relations
Mengqing Xue, Yifei Liu, Ling Guo et al.
Textured 3D Regenerative Morphing with 3D Diffusion Prior
Songlin Yang, Yushi Lan, Honghua Chen et al.
Snakes and Ladders: Two Steps Up for VideoMamba
Hui Lu, Albert Ali Salah, Ronald Poppe
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
Yuheng Yuan, Qiuhong Shen, Xingyi Yang et al.
Dual-Process Image Generation
Grace Luo, Jonathan Granskog, Aleksander Holynski et al.
IDInit: A Universal and Stable Initialization Method for Neural Network Training
Yu Pan, Chaozheng Wang, Zekai Wu et al.
Test3R: Learning to Reconstruct 3D at Test Time
Yuheng Yuan, Qiuhong Shen, Shizun Wang et al.
Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning
Xiaolei Wang, Xinyu Tang, Junyi Li et al.
Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction
Yunheng Li, Yuxuan Li, Quan-Sheng Zeng et al.
Zero-resource Hallucination Detection for Text Generation via Graph-based Contextual Knowledge Triples Modeling
Xinyue Fang, Zhen Huang, Zhiliang Tian et al.
Emulating Self-attention with Convolution for Efficient Image Super-Resolution
Dongheon Lee, Seokju Yun, Youngmin Ro
CAMEx: Curvature-aware Merging of Experts
Dung Viet Nguyen, Minh Nguyen, Luc Nguyen et al.
REDUCIO! Generating 1K Video within 16 Seconds using Extremely Compressed Motion Latents
Rui Tian, Qi Dai, Jianmin Bao et al.
Advantage Alignment Algorithms
Juan Duque, Milad Aghajohari, Timotheus Cooijmans et al.
Tight Clusters Make Specialized Experts
Stefan Nielsen, Rachel Teo, Laziz Abdullaev et al.
CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching
Leying Zhang, Yao Qian, Xiaofei Wang et al.
Beyond Content Relevance: Evaluating Instruction Following in Retrieval Models
Jianqun Zhou, Yuanlei Zheng, Wei Chen et al.
Open-Vocabulary Octree-Graph for 3D Scene Understanding
Zhigang Wang, Yifei Su, Chenhui Li et al.
SimXRD-4M: Big Simulated X-ray Diffraction Data and Crystal Symmetry Classification Benchmark
Bin Cao, Yang Liu, Zinan Zheng et al.
CompCap: Improving Multimodal Large Language Models with Composite Captions
Xiaohui Chen, Satya Narayan Shukla, Mahmoud Azab et al.
Bringing RNNs Back to Efficient Open-Ended Video Understanding
Weili Xu, Enxin Song, Wenhao Chai et al.
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
Jingli Lin, Chenming Zhu, Runsen Xu et al.
Enhancing Uncertainty Modeling with Semantic Graph for Hallucination Detection
Kedi Chen, Qin Chen, Jie Zhou et al.
AlphaPre: Amplitude-Phase Disentanglement Model for Precipitation Nowcasting
Kenghong Lin, Baoquan Zhang, Demin Yu et al.
Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation
Chaoyang Wang, Ashkan Mirzaei, Vidit Goel et al.
PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model
Mingju Gao, Yike Pan, Huan-ang Gao et al.
AURELIA: Test-time Reasoning Distillation in Audio-Visual LLMs
Sanjoy Chowdhury, Hanan Gani, Nishit Anand et al.
Harnessing Uncertainty-aware Bounding Boxes for Unsupervised 3D Object Detection
Ruiyang Zhang, Hu Zhang, Zhedong Zheng
Bridging Training and Execution via Dynamic Directed Graph-Based Communication in Cooperative Multi-Agent Systems
Zhuohui Zhang, Bin He, Bin Cheng et al.
Satellite Observations Guided Diffusion Model for Accurate Meteorological States at Arbitrary Resolution
Siwei Tu, Ben Fei, Weidong Yang et al.
Federated Learning with Domain Shift Eraser
Zheng Wang, Zihui Wang, Zheng Wang et al.
Probabilistic Stability Guarantees for Feature Attributions
Helen Jin, Anton Xue, Weiqiu You et al.
REVECA: Adaptive Planning and Trajectory-Based Validation in Cooperative Language Agents Using Information Relevance and Relative Proximity
SeungWon Seo, SeongRae Noh, Junhyeok Lee et al.
Stealthy Shield Defense: A Conditional Mutual Information-Based Approach against Black-Box Model Inversion Attacks
Tianqu Zhuang, Hongyao Yu, Yixiang Qiu et al.
MIEB: Massive Image Embedding Benchmark
Chenghao Xiao, Isaac Chung, Imene Kerboua et al.
BadRobot: Jailbreaking Embodied LLM Agents in the Physical World
Hangtao Zhang, Chenyu Zhu, Xianlong Wang et al.
Multi-Agent Motion Planning for Differential Drive Robots Through Stationary State Search
Jingtian Yan, Jiaoyang Li
D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation
Weinan Jia, Mengqi Huang, Nan Chen et al.
On the Expressive Power of Sparse Geometric MPNNs
Yonatan Sverdlov, Nadav Dym
Student-Informed Teacher Training
Nico Messikommer, Jiaxu Xing, Elie Aljalbout et al.
Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions
He Zhu, Quyu Kong, Kechun Xu et al.
Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference
Matt Riemer, Gopeshh Raaj Subbaraj, Glen Berseth et al.
Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment
Haoyuan Wu, Haisheng Zheng, Yuan Pu et al.
Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification
Hsun-Yu Kuo, Yin-Hsiang Liao, Yu-Chieh Chao et al.
Event-based Tiny Object Detection: A Benchmark Dataset and Baselines
Nuo Chen, Chao Xiao, Yimian Dai et al.
GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting
Xiaobao Wei, Peng Chen, Guangyu Li et al.
Zeroth-Order Fine-Tuning of LLMs in Random Subspaces
Ziming Yu, Pan Zhou, Sike Wang et al.
Understanding the Limits of Deep Tabular Methods with Temporal Shift
Haorun Cai, Han-Jia Ye
LICORICE: Label-Efficient Concept-Based Interpretable Reinforcement Learning
Zhuorui Ye, Stephanie Milani, Geoff Gordon et al.
Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws
Gerard Ben Arous, Murat Erdogdu, Nuri Mert Vural et al.
MultiverSeg: Scalable Interactive Segmentation of Biomedical Imaging Datasets with In-Context Guidance
Hallee Wong, Jose Javier Gonzalez Ortiz, John Guttag et al.
L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models
Xiaohao Liu, Xiaobo Xia, Weixiang Zhao et al.
Learning from Neighbors: Category Extrapolation for Long-Tail Learning
Shizhen Zhao, Xin Wen, Jiahui Liu et al.
On Generalization Across Environments In Multi-Objective Reinforcement Learning
Jayden Teoh, Pradeep Varakantham, Peter Vamplew
Feature Clipping for Uncertainty Calibration
Linwei Tao, Minjing Dong, Chang Xu
Bringing CLIP to the Clinic: Dynamic Soft Labels and Negation-Aware Learning for Medical Analysis
Hanbin Ko, Chang Min Park
nvBench 2.0: Resolving Ambiguity in Text-to-Visualization through Stepwise Reasoning
Tianqi Luo, Chuhan Huang, Leixian Shen et al.
Audio-Visual Semantic Graph Network for Audio-Visual Event Localization
Liang Liu, Shuaiyong Li, Yongqiang Zhu
E(3)-equivariant models cannot learn chirality: Field-based molecular generation
Alexandru Dumitrescu, Dani Korpela, Markus Heinonen et al.
Many-Objective Multi-Solution Transport
Ziyue Li, Tian Li, Virginia Smith et al.
Detect Any Mirrors: Boosting Learning Reliability on Large-Scale Unlabeled Data with an Iterative Data Engine
Zhaohu Xing, Lihao Liu, Yijun Yang et al.
BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments
Xinghao Wang, Pengyu Wang, Bo Wang et al.
AniMo: Species-Aware Model for Text-Driven Animal Motion Generation
Xuan Wang, Kai Ruan, Xing Zhang et al.
Noise-Resilient Symbolic Regression with Dynamic Gating Reinforcement Learning
Chenglu Sun, Shuo Shen, Wenzhi Tao et al.
Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation
Yi-Chen Li, Fuxiang Zhang, Wenjie Qiu et al.
SimVS: Simulating World Inconsistencies for Robust View Synthesis
Alex Trevithick, Roni Paiss, Philipp Henzler et al.
Distilling Parallel Gradients for Fast ODE Solvers of Diffusion Models
Beier Zhu, Ruoyu Wang, Tong Zhao et al.
Efficient Quadratic Corrections for Frank-Wolfe Algorithms
Jannis Halbey, Seta Rakotomandimby, Mathieu Besançon et al.
From Probability to Counterfactuals: the Increasing Complexity of Satisfiability in Pearl's Causal Hierarchy
Julian Dörfler, Benito van der Zander, Markus Bläser et al.
ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code
Tianyu Hua, Harper Hua, Violet Xiang et al.
Auditing Meta-Cognitive Hallucinations in Reasoning Large Language Models
Haolang Lu, Yilian Liu, Jingxin Xu et al.
DLF: Extreme Image Compression with Dual-generative Latent Fusion
Naifu Xue, Zhaoyang Jia, Jiahao Li et al.
Difficulty-aware Balancing Margin Loss for Long-tailed Recognition
Minseok Son, Inyong Koo, Jinyoung Park et al.
Point-to-Region Loss for Semi-Supervised Point-Based Crowd Counting
Wei Lin, Chenyang Zhao, Antoni B. Chan
A Simple Approach to Unifying Diffusion-based Conditional Generation
Xirui Li, Charles Herrmann, Kelvin Chan et al.
Multimodal Variational Autoencoder: A Barycentric View
Peijie Qiu, Wenhui Zhu, Sayantan Kumar et al.
Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling
Sirui Li, Wenbin Ouyang, Yining Ma et al.
Rethinking Fair Representation Learning for Performance-Sensitive Tasks
Charles Jones, Fabio De Sousa Ribeiro, Mélanie Roschewitz et al.
Learning to Communicate Through Implicit Communication Channels
Han Wang, Binbin Chen, zhang et al.
Dense2MoE: Restructuring Diffusion Transformer to MoE for Efficient Text-to-Image Generation
Youwei Zheng, Yuxi Ren, Xin Xia et al.
Harmonizing Visual and Textual Embeddings for Zero-Shot Text-to-Image Customization
Yeji Song, Jimyeong Kim, Wonhark Park et al.
Specifying What You Know or Not for Multi-Label Class-Incremental Learning
Aoting Zhang, Dongbao Yang, Chang Liu et al.
Accelerating Training with Neuron Interaction and Nowcasting Networks
Boris Knyazev, Abhinav Moudgil, Guillaume Lajoie et al.
Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation
Yujie Zhang, Bingyang Cui, Qi Yang et al.
Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning
Xingjian Ran, Yixuan Li, Linning Xu et al.
FedSPU: Personalized Federated Learning for Resource-Constrained Devices with Stochastic Parameter Update
Ziru Niu, Hai Dong, A. K. Qin
Distillation Robustifies Unlearning
Bruce W. Lee, Addie Foote, Alex Infanger et al.
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model
Yukang Cao, Chenyang Si, Jinghao Wang et al.
Contrastive Test-Time Composition of Multiple LoRA Models for Image Generation
Tuna Meral, Enis Simsar, Federico Tombari et al.
SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation
Pengfei Chen, Lingxi Xie, Xinyue Huo et al.
GarmentPile: Point-Level Visual Affordance Guided Retrieval and Adaptation for Cluttered Garments Manipulation
Ruihai Wu, Ziyu Zhu, Yuran Wang et al.
Scaling Physical Reasoning with the PHYSICS Dataset
Shenghe Zheng, Qianjia Cheng, Junchi Yao et al.
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Divyansh Srivastava, Xiang Zhang, He Wen et al.
Shape My Moves: Text-Driven Shape-Aware Synthesis of Human Motions
Ting-Hsuan Liao, Yi Zhou, Yu Shen et al.
What Matters in Data for DPO?
Yu Pan, Zhongze Cai, Huaiyang Zhong et al.
We Should Chart an Atlas of All the World's Models
Eliahu Horwitz, Nitzan Kurer, Jonathan Kahana et al.
Zero-shot 3D Question Answering via Voxel-based Dynamic Token Compression
Hsiang-Wei Huang, Fu-Chen Chen, Wenhao Chai et al.