Most Cited 2025 "multi-view image generation" Papers
22,274 papers found • Page 53 of 112
Conference
Feed-Forward SceneDINO for Unsupervised Semantic Scene Completion
Aleksandar Jevtić, Christoph Reich, Felix Wimbauer et al.
ARMesh: Autoregressive Mesh Generation via Next-Level-of-Detail Prediction
Jiabao Lei, Kewei Shi, Zhihao Liang et al.
Performative Validity of Recourse Explanations
Gunnar König, Hidde Fokkema, Timo Freiesleben et al.
Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis
Hengyuan Cao, Yutong Feng, Biao Gong et al.
LightFair: Towards an Efficient Alternative for Fair T2I Diffusion via Debiasing Pre-trained Text Encoders
Boyu Han, Qianqian Xu, Shilong Bao et al.
VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption
Tianxiong Zhong, Xingye Tian, Boyuan Jiang et al.
TITAN: A Trajectory-Informed Technique for Adaptive Parameter Freezing in Large-Scale VQE
Yifeng Peng, Xinyi Li, Samuel Yen-Chi Chen et al.
A Hidden Stumbling Block in Generalized Category Discovery: Distracted Attention
Qiyu Xu, Zhanxuan Hu, Yu Duan et al.
GenM3: Generative Pretrained Multi-path Motion Model for Text Conditional Human Motion Generation
Junyu Shi, Lijiang LIU, Yong Sun et al.
PRVQL: Progressive Knowledge-guided Refinement for Robust Egocentric Visual Query Localization
Bing Fan, Yunhe Feng, Yapeng Tian et al.
Learning long range dependencies through time reversal symmetry breaking
Guillaume Pourcel, Maxence Ernoult
Enhancing Graph Of Thought: Enhancing Prompts with LLM Rationales and Dynamic Temperature Control
Sunguk Shin, Youngjoon Kim
Aligning Transformers with Continuous Feedback via Energy Rank Alignment
Shriram Chennakesavalu, Frank Hu, Sebastian Ibarraran et al.
IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation
Yuanze Lin, Yi-Wen Chen, Yi-Hsuan Tsai et al.
When Lighting Deceives: Exposing Vision-Language Models' Illumination Vulnerability Through Illumination Transformation Attack
Hanqing Liu, Shouwei Ruan, Yao Huang et al.
CLOVER: Cross-Layer Orthogonal Vectors Pruning and Fine-Tuning
Fanxu Meng, Muhan Zhang
Scale Efficient Training for Large Datasets
Qing Zhou, Junyu Gao, Qi Wang
STEP: A Unified Spiking Transformer Evaluation Platform for Fair and Reproducible Benchmarking
Sicheng Shen, Dongcheng Zhao, Linghao Feng et al.
Measuring Scientific Capabilities of Language Models with a Systems Biology Dry Lab
Haonan Duan, Stephen Lu, Caitlin F Harrigan et al.
Transformed Low-rank Adaptation via Tensor Decomposition and Its Applications to Text-to-image Models
Zerui Tao, Yuhta Takida, Naoki Murata et al.
DiffDoctor: Diagnosing Image Diffusion Models Before Treating
Yiyang Wang, Xi Chen, Xiaogang Xu et al.
Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning
Xinyao Liu, Diping Song
Hallucinatory Image Tokens: A Training-free EAZY Approach to Detecting and Mitigating Object Hallucinations in LVLMs
Liwei Che, Qingze T Liu, Jing Jia et al.
ToVE: Efficient Vision-Language Learning via Knowledge Transfer from Vision Experts
Yuanchen Wu, Junlong Du, Ke Yan et al.
A Differentiable Wave Optics Model for End-to-End Computational Imaging System Optimization
Chi-Jui Ho, Yash Belhe, Steve Rotenberg et al.
Bring Your Rear Cameras for Egocentric 3D Human Pose Estimation
HIroyasu Akada, Jian Wang, Vladislav Golyanik et al.
FGBench: A Dataset and Benchmark for Molecular Property Reasoning at Functional Group-Level in Large Language Models
Xuan Liu, Siru Ouyang, Xianrui Zhong et al.
SC-Captioner: Improving Image Captioning with Self-Correction by Reinforcement Learning
Lin Zhang, Xianfang Zeng, Kangcong Li et al.
Free-Lunch Color-Texture Disentanglement for Stylized Image Generation
Jiang Qin, Alexandra Gomez-Villa, Senmao Li et al.
Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning
Tianyi Bai, Yuxuan Fan, Qiu Jiantao et al.
BoltzNCE: Learning likelihoods for Boltzmann Generation with Stochastic Interpolants and Noise Contrastive Estimation
Rishal Aggarwal, Jacky Chen, Nicholas Boffi et al.
CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models
Quang-Binh Nguyen, Minh Luu, Quang Nguyen et al.
Self-Reinforcing Prototype Evolution with Dual-Knowledge Cooperation for Semi-Supervised Lifelong Person Re-Identification
Kunlun Xu, Fan Zhuo, Jiangmeng Li et al.
On the Coexistence and Ensembling of Watermarks
Aleksandar Petrov, Shruti Agarwal, Philip Torr et al.
Integrating Visual Interpretation and Linguistic Reasoning for Geometric Problem Solving
Zixian Guo, Ming Liu, Qilong Wang et al.
SMMILE: An expert-driven benchmark for multimodal medical in-context learning
Melanie Rieff, Maya Varma, Ossian Rabow et al.
Latent Mixture of Symmetries for Sample-Efficient Dynamic Learning
Haoran Li, CHENHAN XIAO, Muhao Guo et al.
LLM-Driven Treatment Effect Estimation Under Inference Time Text Confounding
Yuchen Ma, Dennis Frauen, Jonas Schweisthal et al.
Graph Neural Networks for Edge Signals: Orientation Equivariance and Invariance
Dominik Fuchsgruber, Tim Postuvan, Stephan Günnemann et al.
Dist Loss: Enhancing Regression in Few-Shot Region through Distribution Distance Constraint
Guangkun Nie, Gongzheng Tang, Shenda Hong
Multi-focal Conditioned Latent Diffusion for Person Image Synthesis
Jiaqi Liu, Jichao Zhang, Paolo Rota et al.
Multi-modal contrastive learning adapts to intrinsic dimensions of shared latent variables
Yu Gui, Cong Ma, Zongming Ma
R-KV: Redundancy-aware KV Cache Compression for Reasoning Models
Zefan Cai, Wen Xiao, Hanshi Sun et al.
Invisible Backdoor Attack against Self-supervised Learning
Hanrong Zhang, Zhenting Wang, Boheng Li et al.
Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown
Emile Anand, Sarah Liaw
Joint Diffusion Models in Continual Learning
Paweł Skierś, Kamil Deja
GT-Loc: Unifying When and Where in Images through a Joint Embedding Space
David G. Shatwell, Ishan Rajendrakumar Dave, Swetha Sirnam et al.
Credal Prediction based on Relative Likelihood
Timo Löhr, Paul Hofman, Felix Mohr et al.
ScenePainter: Semantically Consistent Perpetual 3D Scene Generation with Concept Relation Alignment
Chong Xia, Shengjun Zhang, Fangfu Liu et al.
Is `Right' Right? Enhancing Object Orientation Understanding in Multimodal Large Language Models through Egocentric Instruction Tuning
JiHyeok Jung, EunTae Kim, SeoYeon Kim et al.
ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents
Zhenyu Zhang, Tianyi Chen, Weiran Xu et al.
Trans-EnV: A Framework for Evaluating the Linguistic Robustness of LLMs Against English Varieties
Jiyoung Lee, Seungho Kim, Jieun Han et al.
Demystifying Spectral Feature Learning for Instrumental Variable Regression
Dimitri Meunier, Antoine Moulin, Jakub Wornbard et al.
ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges
Jiaxin Ai, Pengfei Zhou, xu Pan et al.
Efficient Preference-Based Reinforcement Learning: Randomized Exploration meets Experimental Design
Andreas Schlaginhaufen, Reda Ouhamma, Maryam Kamgarpour
T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning
Latte: Collaborative Test-Time Adaptation of Vision-Language Models in Federated Learning
Wenxuan Bao, Ruxi Deng, Ruizhong Qiu et al.
AutoSSVH: Exploring Automated Frame Sampling for Efficient Self-Supervised Video Hashing
Niu Lian, Jun Li, Jinpeng Wang et al.
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space
Zhengrui Ma, Yang Feng, Chenze Shao et al.
SGFormer: Satellite-Ground Fusion for 3D Semantic Scene Completion
Xiyue Guo, Jiarui Hu, Junjie Hu et al.
Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation
Jiaer Xia, Bingkui Tong, Yuhang Zang et al.
ICPC-Eval: Probing the Frontiers of LLM Reasoning with Competitive Programming Contests
Shiyi Xu, Hu Yiwen, Yingqian Min et al.
Offline Goal-conditioned Reinforcement Learning with Quasimetric Representations
Vivek Myers, Bill Zheng, Benjamin Eysenbach et al.
Video Individual Counting for Moving Drones
Yaowu Fan, Jia Wan, Tao Han et al.
Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video
Xueyang Yu, Cheng Shi, Yang Wang et al.
TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation
Jiaben Chen, Zixin Wang, AILING ZENG et al.
SAMA: Towards Multi-Turn Referential Grounded Video Chat with Large Language Models
Ye Sun, Hao Zhang, Henghui Ding et al.
LIFT: Latent Implicit Functions for Task- and Data-Agnostic Encoding
Amirhossein Kazerouni, Soroush Mehraban, Michael Brudno et al.
CAPability: A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness
Zhihang Liu, Chen-Wei Xie, Bin Wen et al.
MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation
Sankalp Sinha, Mohammad Sadil Khan, Muhammad Usama et al.
Moderating the Generalization of Score-based Generative Model
Wan Jiang, He Wang, Xin Zhang et al.
Charm: The Missing Piece in ViT Fine-Tuning for Image Aesthetic Assessment
Fatemeh Behrad, Tinne Tuytelaars, Johan Wagemans
A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision
Chensheng Peng, Ido Sobol, Masayoshi Tomizuka et al.
GraSS: Scalable Data Attribution with Gradient Sparsification and Sparse Projection
Pingbang Hu, Joseph Melkonian, Weijing Tang et al.
Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion
Qing-Yuan Jiang, Longfei Huang, Yang Yang
PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?
Atharva Gundawar, Som Sagar, Ransalu Senanayake
Open-set Cross Modal Generalization via Multimodal Unified Representation
Hai Huang, Yan Xia, Shulei Wang et al.
KL-Regularized RLHF with Multiple Reference Models: Exact Solutions and Sample Complexity
Gholamali Aminian, Amir R. Asadi, Idan Shenfeld et al.
Normalization in Attention Dynamics
Nikita Karagodin, Shu Ge, Yury Polyanskiy et al.
Who You Are Matters: Bridging Interests and Social Roles via LLM-Enhanced Logic Recommendation
Qing Yu, Xiaobei Wang, Shuchang Liu et al.
Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing
XianJun, Davin Choo, Yuqi Pan, Tonghan Wang et al.
On the creation of narrow AI: hierarchy and nonlocality of neural network skills
Eric Michaud, Asher Parker-Sartori, Max Tegmark
Towards Straggler-Resilient Split Federated Learning: An Unbalanced Update Approach
Dandan Liang, Jianing Zhang, Evan Chen et al.
PETRA: Parallel End-to-end Training with Reversible Architectures
Stéphane Rivaud, Louis Fournier, Thomas Pumir et al.
Who is a Better Talker: Subjective and Objective Quality Assessment for AI-Generated Talking Heads
Yingjie Zhou, Jiezhang Cao, Zicheng Zhang et al.
Flatness is Necessary, Neural Collapse is Not: Rethinking Generalization via Grokking
Ting Han, Linara Adilova, Henning Petzka et al.
Seeing the Abstract: Translating the Abstract Language for Vision Language Models
Davide Talon, Federico Girella, Ziyue Liu et al.
A Statistical Theory of Contrastive Learning via Approximate Sufficient Statistics
Licong Lin, Song Mei
VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions
Marko Mihajlovic, Siwei Zhang, Gen Li et al.
Hierarchical Retrieval: The Geometry and a Pretrain-Finetune Recipe
Chong You, Rajesh Jayaram, Ananda Theertha Suresh et al.
Balanced Conic Rectified Flow
Kim Shin seong, Mingi Kwon, Jaeseok Jeong et al.
Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers
Andrew Nam, Henry Conklin, Yukang Yang et al.
EA-KD: Entropy-based Adaptive Knowledge Distillation
Chi-Ping Su, Ching-Hsun Tseng, Bin Pu et al.
Efficient Parametric SVD of Koopman Operator for Stochastic Dynamical Systems
Minchan Jeong, Jongha (Jon) Ryu, Se-Young Yun et al.
AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise
Dhruv Agarwal, Bodhisattwa Prasad Majumder, Reece Adamson et al.
CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays
Hyungyung Lee, Geon Choi, Jung-Oh Lee et al.
MuHBoost: Multi-Label Boosting For Practical Longitudinal Human Behavior Modeling
Nguyen Thach, Patrick Habecker, Anika Eisenbraun et al.
Learning Sparse Approximate Inverse Preconditioners for Conjugate Gradient Solvers on GPUs
Zhehao Li, Zhehao Li, Kangbo Lyu et al.
Reinforced Context Order Recovery for Adaptive Reasoning and Planning
Long Ma, Fangwei Zhong, Yizhou Wang
I Am Big, You Are Little; I Am Right, You Are Wrong
David A Kelly, Akchunya Chanchal, Nathan Blake
HCRMP: An LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving
Zhiwen Chen, Hanming Deng, Zhuoren Li et al.
SparseDiT: Token Sparsification for Efficient Diffusion Transformer
Shuning Chang, Pichao WANG, Jiasheng Tang et al.
GeoComplete: Geometry-Aware Diffusion for Reference-Driven Image Completion
Beibei Lin, Tingting Chen, Robby Tan
Non-stationary Bandit Convex Optimization: A Comprehensive Study
Xiaoqi Liu, Dorian Baudry, Julian Zimmert et al.
Martian World Model: Controllable Video Synthesis with Physically Accurate 3D Reconstructions
Longfei Li, Zhiwen Fan, Wenyan Cong et al.
From Sequence to Structure: Uncovering Substructure Reasoning in Transformers
Xinnan Dai, Kai Yang, Jay Revolinsky et al.
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
Maksim Siniukov, Di Chang, Minh Tran et al.
EA-Vit: Efficient Adaptation for Elastic Vision Transformer
Chen Zhu, Wangbo Zhao, Huiwen Zhang et al.
Generate, Refine, and Encode: Leveraging Synthesized Novel Samples for On-the-Fly Fine-Grained Category Discovery
Xiao Liu, Nan Pu, Haiyang Zheng et al.
Tiled Diffusion
Or Madar, Ohad Fried
Memory-Efficient 4-bit Preconditioned Stochastic Optimization
Jingyang Li, Kuangyu Ding, Kim-chuan Toh et al.
Taming generative video models for zero-shot optical flow extraction
Seungwoo Kim, Khai Loong Aw, Klemen Kotar et al.
Associative Transformer
Yuwei Sun, Hideya Ochiai, Zhirong Wu et al.
On the Surprising Effectiveness of Large Learning Rates under Standard Width Scaling
Moritz Haas, Sebastian Bordt, Ulrike Luxburg et al.
Kernel Density Steering: Inference-Time Scaling via Mode Seeking for Image Restoration
Yuyang Hu, Kangfu Mei, Mojtaba Ardakani et al.
AdaDrive: Self-Adaptive Slow-Fast System for Language-Grounded Autonomous Driving
Ruifei Zhang, Junlin Xie, Wei Zhang et al.
PRISM: Privacy-Preserving Improved Stochastic Masking for Federated Generative Models
Kyeongkook Seo, Dong-Jun Han, Jaejun Yoo
CodeCrash: Exposing LLM Fragility to Misleading Natural Language in Code Reasoning
Man Ho Lam, Chaozheng Wang, Jen-Tse Huang et al.
Extending Foundational Monocular Depth Estimators to Fisheye Cameras with Calibration Tokens
Suchisrit Gangopadhyay, Jung Hee Kim, Xien Chen et al.
HybridMQA: Exploring Geometry-Texture Interactions for Colored Mesh Quality Assessment
Armin Shafiee Sarvestani, Sheyang Tang, Zhou Wang
Improved Regret Bounds for Gaussian Process Upper Confidence Bound in Bayesian Optimization
Shogo Iwazaki
AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models
Kwan Yun, Seokhyeon Hong, Chaelin Kim et al.
Graph-based Document Structure Analysis
Yufan Chen, Ruiping Liu, Junwei Zheng et al.
LocDiff: Identifying Locations on Earth by Diffusing in the Hilbert Space
Zhangyu Wang, Zeping Liu, Jielu Zhang et al.
Towards Fully FP8 GEMM LLM Training at Scale
Alejandro Hernández Cano, Dhia Garbaya, Imanol Schlag et al.
On the Robustness of Transformers against Context Hijacking for Linear Classification
Tianle Li, Chenyang Zhang, Xingwu Chen et al.
RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers
Ahmet Berke Gökmen, Yiğit Ekin, Bahri Batuhan Bilecen et al.
Synthesize Privacy-Preserving High-Resolution Images via Private Textual Intermediaries
Haoxiang Wang, Zinan Lin, Da Yu et al.
Generalization Guarantees for Representation Learning via Data-Dependent Gaussian Mixture Priors
Milad Sefidgaran, Abdellatif Zaidi, Piotr Krasnowski
Improving the Euclidean Diffusion Generation of Manifold Data by Mitigating Score Function Singularity
Zichen Liu, Wei Zhang, Tiejun Li
Towards General Modality Translation with Contrastive and Predictive Latent Diffusion Bridge
Nimrod Berman, Omkar Joglekar, Eitan Kosman et al.
World-aware Planning Narratives Enhance Large Vision-Language Model Planner
Junhao Shi, Zhaoye Fei, Siyin Wang et al.
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
Zhiyuan Liang, Dongwen Tang, Yuhao Zhou et al.
ODHSR: Online Dense 3D Reconstruction of Humans and Scenes from Monocular Videos
Zetong Zhang, Manuel Kaufmann, Lixin Xue et al.
Probably Approximately Precision and Recall Learning
Lee Cohen, Yishay Mansour, Shay Moran et al.
SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought
Guanghao Li, Wenhao Jiang, Mingfeng Chen et al.
ExCap3D: Expressive 3D Scene Understanding via Object Captioning with Varying Detail
Chandan Yeshwanth, David Rozenberszki, Angela Dai
LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models
Qianyue Hao, Yiwen Song, Qingmin Liao et al.
ConformalSAM: Unlocking the Potential of Foundational Segmentation Models in Semi-Supervised Semantic Segmentation with Conformal Prediction
Danhui Chen, Ziquan Liu, Chuxi Yang et al.
Scaling Language-centric Omnimodal Representation Learning
Chenghao Xiao, Hou Pong (Ken) Chan, Hao Zhang et al.
FG-OrIU: Towards Better Forgetting via Feature-Gradient Orthogonality for Incremental Unlearning
qian feng, Jiahang Tu, Mintong Kang et al.
Human-assisted Robotic Policy Refinement via Action Preference Optimization
Wenke Xia, Yichu Yang, Hongtao Wu et al.
Underwater Visual SLAM with Depth Uncertainty and Medium Modeling
Rui Liu, Sheng Fan, Wenguan Wang et al.
Low Rank Gradients and Where to Find Them
Rishi Sonthalia, Michael Murray, Guido Montufar
EVOREFUSE: Evolutionary Prompt Optimization for Evaluation and Mitigation of LLM Over-Refusal to Pseudo-Malicious Instructions
Xiaorui Wu, Fei Li, Xiaofeng Mao et al.
Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees
Sourav Ganguly, Kishan Panaganti, Arnob Ghosh et al.
ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding
Qihang Peng, Henry Zheng, Gao Huang
Conditional Balance: Improving Multi-Conditioning Trade-Offs in Image Generation
Nadav Z. Cohen, Oron Nir, Ariel Shamir
AdaDetectGPT: Adaptive Detection of LLM-Generated Text with Statistical Guarantees
Hongyi Zhou, Jin Zhu, Pingfan Su et al.
Silencer: From Discovery to Mitigation of Self-Bias in LLM-as-Benchmark-Generator
Peiwen Yuan, Yiwei Li, Shaoxiong Feng et al.
A Unified Framework for the Transportability of Population-Level Causal Measures
Ahmed Boughdiri, Clément Berenfeld, Julie Josse et al.
Visual Modality Prompt for Adapting Vision-Language Object Detectors
Heitor Rapela Medeiros, Atif Belal, Srikanth Muralidharan et al.
Efficient Unsupervised Shortcut Learning Detection and Mitigation in Transformers
Lukas Kuhn, sari sadiya, Jörg Schlötterer et al.
Generalization Bounds for Canonicalization: A Comparative Study with Group Averaging
Behrooz Tahmasebi, Stefanie Jegelka
Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets
Marianna Nezhurina, Tomer Porian, Giovanni Puccetti et al.
Quantifying Cross-Modality Memorization in Vision-Language Models
Yuxin Wen, Yangsibo Huang, Tom Goldstein et al.
Learning from positive and unlabeled examples -Finite size sample bounds
Farnam Mansouri, Shai Ben-David
Mitigating Ambiguities in 3D Classification with Gaussian Splatting
Ruiqi Zhang, Hao Zhu, Jingyi Zhao et al.
I2-NeRF: Learning Neural Radiance Fields Under Physically-Grounded Media Interactions
Shuhong Liu, Lin Gu, Ziteng Cui et al.
AVCD: Mitigating Hallucinations in Audio-Visual Large Language Models through Contrastive Decoding
Chaeyoung Jung, Youngjoon Jang, Joon Son Chung
Attention! Your Vision Language Model Could Be Maliciously Manipulated
Xiaosen Wang, Shaokang Wang, Zhijin Ge et al.
Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs
Yuheng Zhang, Nan Jiang
ReDi: Rectified Discrete Flow
Jaehoon Yoo, Wonjung Kim, Seunghoon Hong
Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models
Taha Entesari, Arman Hatami, Rinat Khaziev et al.
BTL-UI: Blink-Think-Link Reasoning Model for GUI Agent
Shaojie Zhang, Ruoceng Zhang, Pei Fu et al.
ElasticMM: Efficient Multimodal LLMs Serving with Elastic Multimodal Parallelism
Zedong Liu, Shenggan Cheng, Guangming Tan et al.
Diffusion Models and the Manifold Hypothesis: Log-Domain Smoothing is Geometry Adaptive
Tyler Farghly, Peter Potaptchik, Samuel Howard et al.
Bisimulation Metric for Model Predictive Control
Yutaka Shimizu, Masayoshi Tomizuka
SynBrain: Enhancing Visual-to-fMRI Synthesis via Probabilistic Representation Learning
Weijian Mai, Jiamin Wu, Yu Zhu et al.
Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention
Arya Honarpisheh, Mustafa Bozdag, Octavia Camps et al.
Who Reasons in the Large Language Models?
Jie Shao, Jianxin Wu
RoboTron-Nav: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction
Yufeng Zhong, Chengjian Feng, Feng yan et al.
BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset
Zhiheng Xi, Guanyu Li, Yutao Fan et al.
SkyLadder: Better and Faster Pretraining via Context Window Scheduling
Tongyao Zhu, Qian Liu, Haonan Wang et al.
VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding
Minchao Jiang, Shunyu Jia, Jiaming Gu et al.
ATLAS: Decoupling Skeletal and Shape Parameters for Expressive Parametric Human Modeling
Jinhyung Park, Javier Romero, Shunsuke Saito et al.
VQToken: Neural Discrete Token Representation Learning for Extreme Token Reduction in Video Large Language Models
Haichao Zhang, Yun Fu
On the Generalization of Representation Uncertainty in Earth Observation
Spyros Kondylatos, Nikolaos Ioannis Bountos, Dimitrios Michail et al.
FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images
Rong Wang, Fabian Prada, Ziyan Wang et al.
E-BATS: Efficient Backpropagation-Free Test-Time Adaptation for Speech Foundation Models
Jiaheng Dong, Hong Jia, Soumyajit Chatterjee et al.
Predict-Optimize-Distill: A Self-Improving Cycle for 4D Object Understanding
Mingxuan Wu, Huang Huang, Justin Kerr et al.
Distribution-Free Data Uncertainty for Neural Network Regression
Domokos M. Kelen, Ádám Jung, Péter Kersch et al.
Learning (Approximately) Equivariant Networks via Constrained Optimization
Andrei Manolache, Luiz Chamon, Mathias Niepert
Jigsaw++: Imagining Complete Shape Priors for Object Reassembly
Jiaxin Lu, Gang Hua, Qixing Huang
Language Modeling by Language Models
Junyan Cheng, Peter Clark, Kyle Richardson
Second-Order Convergence in Private Stochastic Non-Convex Optimization
Youming Tao, Zuyuan Zhang, Dongxiao Yu et al.
Disentangled Clothed Avatar Generation with Layered Representation
Weitian Zhang, Yichao Yan, Sijing Wu et al.
Mixture-of-Experts Meets In-Context Reinforcement Learning
Wenhao Wu, Fuhong Liu, Haoru Li et al.
HoliGS: Holistic Gaussian Splatting for Embodied View Synthesis
Xiaoyuan Wang, Yizhou Zhao, Botao Ye et al.
ResQ: A Novel Framework to Implement Residual Neural Networks on Analog Rydberg Atom Quantum Computers
Nicholas DiBrita, Jason Han, Tirthak Patel
Contrastive Representations for Temporal Reasoning
Alicja Ziarko, Michał Bortkiewicz, Michał Zawalski et al.
Emergence of Linear Truth Encodings in Language Models
Shauli Ravfogel, Gilad Yehudai, Tal Linzen et al.
Monitoring Risks in Test-Time Adaptation
Mona Schirmer, Metod Jazbec, Christian Andersson Naesseth et al.
Color Matching Using Hypernetwork-Based Kolmogorov-Arnold Networks
Artem Nikonorov, Georgy Perevozchikov, Andrei Korepanov et al.
Learning to price with resource constraints: from full information to machine-learned prices
Ruicheng Ao, Jiashuo Jiang, David Simchi-Levi
Avoiding exp(R) scaling in RLHF through Preference-based Exploration
Mingyu Chen, Yiding Chen, Wen Sun et al.
Hierarchical-aware Orthogonal Disentanglement Framework for Fine-grained Skeleton-based Action Recognition
Haochen Chang, Pengfei Ren, Haoyang Zhang et al.
GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations
Fabian Paischer, Gianluca Galletti, William Hornsby et al.
An Analytical Theory of Spectral Bias in the Learning Dynamics of Diffusion Models
Binxu Wang, Cengiz Pehlevan
More of the Same: Persistent Representational Harms Under Increased Representation
Jennifer Mickel, Maria De-Arteaga, Liu Leqi et al.