Most Cited 2025 "private estimators" Papers
22,274 papers found • Page 20 of 112
Conference
Unified Multimodal Understanding via Byte-Pair Visual Encoding
Wanpeng Zhang, Yicheng Feng, Hao Luo et al.
SSL-STMFormer Self-Supervised Learning Spatio-Temporal Entanglement Transformer for Traffic Flow Prediction
Zetao Li, Zheng Hu, Peng Han et al.
Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation
Aishik Konwer, Zhijian Yang, Erhan Bas et al.
Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers
Lei Chen, Joan Bruna, Alberto Bietti
Vision-Language Models Create Cross-Modal Task Representations
Grace Luo, Trevor Darrell, Amir Bar
Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning
Jaehyeon Son, Soochan Lee, Gunhee Kim
Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents
Qizheng Zhang, Michael Wornow, Kunle Olukotun
GDiffRetro: Retrosynthesis Prediction with Dual Graph Enhanced Molecular Representation and Diffusion Generation
Shengyin Sun, Wenhao Yu, Yuxiang Ren et al.
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models
Hongbo Liu, Jingwen He, Yi Jin et al.
From Attention to Activation: Unraveling the Enigmas of Large Language Models
Prannay Kaul, Chengcheng Ma, Ismail Elezi et al.
Towards hyperparameter-free optimization with differential privacy
Ruixuan Liu, Zhiqi Bu
Activation-Informed Merging of Large Language Models
Amin Heyrani Nobari, Kaveh Alimohammadi, Ali ArjomandBigdeli et al.
TAMER: Tree-Aware Transformer for Handwritten Mathematical Expression Recognition
Jianhua Zhu, Wenqi Zhao, Yu Li et al.
ShEPhERD: Diffusing shape, electrostatics, and pharmacophores for bioisosteric drug design
Keir Adams, Kento Abeywardane, Jenna Fromer et al.
Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation
Jingyu Liu, Beidi Chen, Ce Zhang
DoF: A Diffusion Factorization Framework for Offline Multi-Agent Reinforcement Learning
Chao Li, Ziwei Deng, Chenxing Lin et al.
HAMoBE: Hierarchical and Adaptive Mixture of Biometric Experts for Video-based Person ReID
Yiyang Su, Yunping Shi, Feng Liu et al.
Cross-modal Causal Relation Alignment for Video Question Grounding
weixing chen, Yang Liu, Binglin Chen et al.
Hyperbolic Category Discovery
Yuanpei Liu, Zhenqi He, Kai Han
Locality in Image Diffusion Models Emerges from Data Statistics
Artem Lukoianov, Chenyang Yuan, Justin Solomon et al.
Enhancing Rating-Based Reinforcement Learning to Effectively Leverage Feedback from Large Vision-Language Models
Minh-Tung Luu, Younghwan Lee, Donghoon Lee et al.
HUMOTO: A 4D Dataset of Mocap Human Object Interactions
Jiaxin Lu, Chun-Hao Huang, Uttaran Bhattacharya et al.
ContextualStory: Consistent Visual Storytelling with Spatially-Enhanced and Storyline Context
Sixiao Zheng, Yanwei Fu
ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering
Yuki Imajuku, Kohki Horie, Yoichi Iwata et al.
Selective Prompt Anchoring for Code Generation
Yuan Tian, Tianyi Zhang
Dynamic Updates for Language Adaptation in Visual-Language Tracking
Xiaohai Li, Bineng Zhong, Qihua Liang et al.
GPS: A Probabilistic Distributional Similarity with Gumbel Priors for Set-to-Set Matching
Ziming Zhang, Fangzhou Lin, Haotian Liu et al.
LookCloser: Frequency-aware Radiance Field for Tiny-Detail Scene
Xiaoyu Zhang, Weihong Pan, Chong Bao et al.
Physics-Informed Deep Inverse Operator Networks for Solving PDE Inverse Problems
Sung Woong Cho, Hwijae Son
Towards Stable and Storage-efficient Dataset Distillation: Matching Convexified Trajectory
Wenliang Zhong, Haoyu Tang, Qinghai Zheng et al.
Panorama Generation From NFoV Image Done Right
Dian Zheng, Cheng Zhang, Xiao-Ming Wu et al.
Enhancing 3D Gaze Estimation in the Wild using Weak Supervision with Gaze Following Labels
Pierre Vuillecard, Jean-marc Odobez
DiffGrasp: Whole-Body Grasping Synthesis Guided by Object Motion Using a Diffusion Model
Yonghao Zhang, Qiang He, Yanguang Wan et al.
SynFER: Towards Boosting Facial Expression Recognition with Synthetic Data
Xilin He, Cheng Luo, Xiaole Xian et al.
MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines
Dongzhi Jiang, Renrui Zhang, Ziyu Guo et al.
Rethinking Pseudo-Label Guided Learning for Weakly Supervised Temporal Action Localization from the Perspective of Noise Correction
Quan Zhang, Yuxin Qi, Xi Tang et al.
Stochastic Process Learning via Operator Flow Matching
Yaozhong Shi, Zachary Ross, Domniki Asimaki et al.
Hand1000: Generating Realistic Hands from Text with Only 1,000 Images
Haozhuo Zhang, Bin Zhu, Yu Cao et al.
FlashMD: long-stride, universal prediction of molecular dynamics
Filippo Bigi, Sanggyu Chong, Agustinus Kristiadi et al.
UltraHR-100K: Enhancing UHR Image Synthesis with A Large-Scale High-Quality Dataset
Chen Zhao, En Ci, Yunzhe Xu et al.
What Has Been Overlooked in Contrastive Source-Free Domain Adaptation: Leveraging Source-Informed Latent Augmentation within Neighborhood Context
JING WANG, Wonho Bae, Jiahong Chen et al.
Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model
Xu Yuan, Li Zhou, Zenghui Sun et al.
Predicting the Original Appearance of Damaged Historical Documents
Zhenhua Yang, Dezhi Peng, Yongxin Shi et al.
Detail-Preserving Latent Diffusion for Stable Shadow Removal
Jiamin Xu, Yuxin Zheng, Zelong Li et al.
Towards Robustness and Explainability of Automatic Algorithm Selection
Xingyu Wu, Jibin Wu, Yu Zhou et al.
Geometry Aware Operator Transformer as an efficient and accurate neural surrogate for PDEs on arbitrary domains
Shizheng Wen, Arsh Kumbhat, Levi Lingsch et al.
JAFAR: Jack up Any Feature at Any Resolution
Paul Couairon, Loïck Chambon, Louis Serrano et al.
VORTA: Efficient Video Diffusion via Routing Sparse Attention
Wenhao Sun, Rong-Cheng Tu, Yifu Ding et al.
Rethinking Verification for LLM Code Generation: From Generation to Testing
Zihan Ma, Taolin Zhang, Maosongcao et al.
Towards Generalizable Scene Change Detection
Jae-Woo KIM, Ue-Hwan Kim
Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It
Guoxuan Xia, Olivier Laurent, Gianni Franchi et al.
Valid Conformal Prediction for Dynamic GNNs
Ed Davis, Ian Gallagher, Daniel Lawson et al.
DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models
Radu Alexandru Rosu, Keyu Wu, Yao Feng et al.
HarmonySet: A Comprehensive Dataset for Understanding Video-Music Semantic Alignment and Temporal Synchronization
Zitang Zhou, Ke Mei, Yu Lu et al.
Details Enhancement in Unsigned Distance Field Learning for High-fidelity 3D Surface Reconstruction
Cheng Xu, Fei Hou, Wencheng Wang et al.
AgroBench: Vision-Language Model Benchmark in Agriculture
Risa Shinoda, Nakamasa Inoue, Hirokatsu Kataoka et al.
Progress-Aware Video Frame Captioning
Zihui Xue, Joungbin An, Xitong Yang et al.
Language Driven Occupancy Prediction
Zhu Yu, Bowen Pang, Lizhe Liu et al.
Finite-Sample Analysis of Policy Evaluation for Robust Average Reward Reinforcement Learning
Yang Xu, Washim Mondal, Vaneet Aggarwal
Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators
Wentao Zhang, Junliang Guo, Tianyu He et al.
DanceFix: An Exploration in Group Dance Neatness Assessment Through Fixing Abnormal Challenges of Human Pose
Huangbiao Xu, Xiao Ke, Huanqi Wu et al.
GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs
Xinli Xu, Wenhang Ge, Dicong Qiu et al.
Doubly Robust Conformalized Survival Analysis with Right-Censored Data
Matteo Sesia, vladimir svetnik
Privacy Attacks on Image AutoRegressive Models
Antoni Kowalczuk, Jan Dubiński, Franziska Boenisch et al.
GSRF: Complex-Valued 3D Gaussian Splatting for Efficient Radio-Frequency Data Synthesis
Kang Yang, Gaofeng Dong, Sijie Ji et al.
DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting
Seungjun Lee, Gim Hee Lee
InteractAnything: Zero-shot Human Object Interaction Synthesis via LLM Feedback and Object Affordance Parsing
Jinlu Zhang, Yixin Chen, Zan Wang et al.
SMT: Fine-Tuning Large Language Models with Sparse Matrices
Haoze He, Juncheng Li, Xuan Jiang et al.
Gradient-Guided Annealing for Domain Generalization
Aristotelis Ballas, Christos Diou
CVLUE: A New Benchmark Dataset for Chinese Vision-Language Understanding Evaluation
Yuxuan Wang, Yijun Liu, Fei Yu et al.
FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement
Ian Huang, Yanan Bao, Karen Truong et al.
MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation
Zhaoning Yu, Hongyang Gao
Corvid: Improving Multimodal Large Language Models Towards Chain-of-Thought Reasoning
Jingjing Jiang, Chao Ma, Xurui Song et al.
EchoShot: Multi-Shot Portrait Video Generation
Jiahao Wang, Hualian Sheng, Sijia Cai et al.
Effective and Efficient Time-Varying Counterfactual Prediction with State-Space Models
Haotian Wang, Haoxuan Li, Hao Zou et al.
Towards Real Unsupervised Anomaly Detection Via Confident Meta-Learning
Muhammad Aqeel, Shakiba Sharifi, Marco Cristani et al.
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization
Hongrui Jia, Chaoya Jiang, Haiyang Xu et al.
UniCoTT: A Unified Framework for Structural Chain-of-Thought Distillation
Xianwei Zhuang, Zhihong Zhu, Zhichang Wang et al.
Learning Safety Constraints for Large Language Models
Xin Chen, Yarden As, Andreas Krause
BRAID: Input-driven Nonlinear Dynamical Modeling of Neural-Behavioral Data
Parsa Vahidi, Omid G. Sani, Maryam Shanechi
Glauber Generative Model: Discrete Diffusion Models via Binary Classification
Harshit Varma, Dheeraj Nagaraj, Karthikeyan Shanmugam
Alignment-Free RGB-T Salient Object Detection: A Large-Scale Dataset and Progressive Correlation Network
Kunpeng Wang, Keke Chen, Chenglong Li et al.
Understanding Fairness Surrogate Functions in Algorithmic Fairness
Yong Liu, (Andrew) Zhanke Zhou, Zhicong Li et al.
MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models
Hengzhi Li, Megan Tjandrasuwita, Yi R. (May) Fung et al.
CoreGuard: Safeguarding Foundational Capabilities of LLMs Against Model Stealing in Edge Deployment
Qinfeng Li, Tianyue Luo, Xuhong Zhang et al.
Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning
Kaihang Pan, Yang Wu, Wendong Bu et al.
DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
Hyogon Ryu, NaHyeon Park, Hyunjung Shim
SMITE: Segment Me In TimE
Amirhossein Alimohammadi, Sauradip Nag, Saeid Asgari et al.
SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance
Peishan Cong, Ziyi Wang, Yuexin Ma et al.
FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion
Haosen Yang, Adrian Bulat, Isma Hadji et al.
Anywhere: A Multi-Agent Framework for User-Guided, Reliable, and Diverse Foreground-Conditioned Image Generation
Xie Tianyidan, Rui Ma, Qian Wang et al.
Federated Continual Instruction Tuning
Haiyang Guo, Fanhu Zeng, Fei Zhu et al.
Beyond Human Data: Aligning Multimodal Large Language Models by Iterative Self-Evolution
Wentao Tan, Qiong Cao, Yibing Zhan et al.
Impossible Videos
Zechen Bai, Hai Ci, Mike Zheng Shou
Compression via Pre-trained Transformers: A Study on Byte-Level Multimodal Data
David Heurtel-Depeiges, Anian Ruoss, Joel Veness et al.
Cross-Domain Graph Data Scaling: A Showcase with Diffusion Models
Wenzhuo Tang, Haitao Mao, Danial Dervovic et al.
Slot-Guided Adaptation of Pre-trained Diffusion Models for Object-Centric Learning and Compositional Generation
adil kaan akan, Yucel Yemez
What Do Latent Action Models Actually Learn?
Chuheng Zhang, Tim Pearce, Pushi Zhang et al.
TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation
Yabiao Wang, Shuo Wang, Jiangning Zhang et al.
InterAct: Advancing Large-Scale Versatile 3D Human-Object Interaction Generation
Sirui Xu, Dongting Li, Yucheng Zhang et al.
Sculpting Features from Noise: Reward-Guided Hierarchical Diffusion for Task-Optimal Feature Transformation
Nanxu Gong, Zijun Li, Sixun Dong et al.
Privacy amplification by random allocation
Moshe Shenfeld, Vitaly Feldman
MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI
Huanjin Yao, Jiaxing Huang, Yawen Qiu et al.
Medical Multimodal Model Stealing Attacks via Adversarial Domain Alignment
Yaling Shen, Zhixiong Zhuang, Kun Yuan et al.
STPro: Spatial and Temporal Progressive Learning for Weakly Supervised Spatio-Temporal Grounding
Aaryan Garg, Akash Kumar, Yogesh S. Rawat
Arbitrary Reading Order Scene Text Spotter with Local Semantics Guidance
Jiahao Lyu, Wei Wang, Dongbao Yang et al.
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
Ulyana Piterbarg, Lerrel Pinto, Rob Fergus
Scaling Laws for Task-Optimized Models of the Primate Visual Ventral Stream
Abdulkadir Gokce, Martin Schrimpf
GenesisTex2: Stable, Consistent and High-Quality Text-to-Texture Generation
Jiawei Lu, YingPeng Zhang, Zengjun Zhao et al.
GraphLand: Evaluating Graph Machine Learning Models on Diverse Industrial Data
Gleb Bazhenov, Oleg Platonov, Liudmila Prokhorenkova
SwiftTry: Fast and Consistent Video Virtual Try-On with Diffusion Models
Hung Nguyen, Quang Qui-Vinh Nguyen, Khoi Nguyen et al.
Overcoming the Curse of Dimensionality in Reinforcement Learning Through Approximate Factorization
Chenbei Lu, Laixi Shi, Zaiwei Chen et al.
Perception in Reflection
Yana Wei, Liang Zhao, Kangheng Lin et al.
Motion-aware Contrastive Learning for Temporal Panoptic Scene Graph Generation
Thong Thanh Nguyen, Xiaobao Wu, Yi Bin et al.
PBR-NeRF: Inverse Rendering with Physics-Based Neural Fields
Sean Wu, Shamik Basu, Tim Broedermann et al.
Neighborhood Self-Dissimilarity Attention for Medical Image Segmentation
Junren Chen, Rui Chen, Wei Wang et al.
Robust and Conjugate Spatio-Temporal Gaussian Processes
William Laplante, Matias Altamirano, Andrew Duncan et al.
Loss Functions and Operators Generated by f-Divergences
Vincent Roulet, Tianlin Liu, Nino Vieillard et al.
ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer
Jiayi Gao, Zijin Yin, Changcheng Hua et al.
FreSh: Frequency Shifting for Accelerated Neural Representation Learning
Adam Kania, Marko Mihajlovic, Sergey Prokudin et al.
Validating LLM-as-a-Judge Systems under Rating Indeterminacy
Luke Guerdan, Solon Barocas, Kenneth Holstein et al.
Ultra-Resolution Adaptation with Ease
Ruonan Yu, Songhua Liu, Zhenxiong Tan et al.
Segment Any 3D Object with Language
Seungjun Lee, Yuyang Zhao, Gim H Lee
Robust Multimodal Survival Prediction with Conditional Latent Differentiation Variational AutoEncoder
Junjie Zhou, Jiao Tang, Yingli Zuo et al.
Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length
Zihan Yu, Jingtao Ding, Yong Li et al.
Object-Shot Enhanced Grounding Network for Egocentric Video
Yisen Feng, Haoyu Zhang, Meng Liu et al.
Spatial Understanding from Videos: Structured Prompts Meet Simulation Data
Haoyu Zhang, Meng Liu, Zaijing Li et al.
System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts
Xiaoqiang Wang, Suyuchen Wang, Yun Zhu et al.
Uncertainty Modeling in Graph Neural Networks via Stochastic Differential Equations
Richard Bergna, Sergio Calvo Ordoñez, Felix Opolka et al.
SEMU: Singular Value Decomposition for Efficient Machine Unlearning
Marcin Sendera, Łukasz Struski, Kamil Książek et al.
Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation
Bolin Lai, Felix Juefei-Xu, Miao Liu et al.
COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training
Sanghwan Kim, Rui Xiao, Iuliana Georgescu et al.
COLUMBUS: Evaluating COgnitive Lateral Understanding Through Multiple-Choice reBUSes
Koen Kraaijveld, Yifan Jiang, Kaixin Ma et al.
PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations
Namgyu Kang, Jaemin Oh, Youngjoon Hong et al.
MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent
Xinyao Liao, Xianfang Zeng, Liao Wang et al.
Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation
Shuling Zhao, Fa-Ting Hong, Xiaoshui Huang et al.
StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams
Yang LI, Jinglu Wang, Lei Chu et al.
ManiVideo: Generating Hand-Object Manipulation Video with Dexterous and Generalizable Grasping
Youxin Pang, Ruizhi Shao, Jiajun Zhang et al.
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
Max Wilcoxson, Qiyang Li, Kevin Frans et al.
AutoElicit: Using Large Language Models for Expert Prior Elicitation in Predictive Modelling
Alexander Capstick, Rahul G. Krishnan, Payam Barnaghi
ESE: Espresso Sentence Embeddings
Xianming Li, Zongxi Li, Jing Li et al.
Second Order Bounds for Contextual Bandits with Function Approximation
Aldo Pacchiano
Doubly Contrastive Learning for Source-Free Domain Adaptive Person Search
Yizhen Jia, Rong Quan, Yue Feng et al.
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
Liang Chen, Sinan Tan, Zefan Cai et al.
TruthPrInt: Mitigating Large Vision-Language Models Object Hallucination Via Latent Truthful-Guided Pre-Intervention
Jinhao Duan, Fei Kong, Hao Cheng et al.
Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning
Yanbiao Ma, Wei Dai, Wenke Huang et al.
HQGS: High-Quality Novel View Synthesis with Gaussian Splatting in Degraded Scenes
Xin Lin, Shi Luo, Xiaojun Shan et al.
M3amba: Memory Mamba is All You Need for Whole Slide Image Classification
Tingting Zheng, Kui Jiang, Yi Xiao et al.
A multiscale analysis of mean-field transformers in the moderate interaction regime
Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi
KinMo: Kinematic-aware Human Motion Understanding and Generation
Pengfei Zhang, Pinxin Liu, Pablo Garrido et al.
SliderSpace: Decomposing the Visual Capabilities of Diffusion Models
Rohit Gandikota, Zongze Wu, Richard Zhang et al.
Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs
Zeyi Huang, Yuyang Ji, Xiaofang Wang et al.
Position: We Need An Algorithmic Understanding of Generative AI
Oliver Eberle, Thomas McGee, Hamza Giaffar et al.
DuMo: Dual Encoder Modulation Network for Precise Concept Erasure
Feng Han, Kai Chen, Chao Gong et al.
Pursuing Feature Separation based on Neural Collapse for Out-of-Distribution Detection
Yingwen Wu, Ruiji Yu, Xinwen Cheng et al.
MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism
Zhixiong Nan, Xianghong Li, Tao Xiang et al.
A General Adaptive Dual-level Weighting Mechanism for Remote Sensing Pansharpening
Jie Huang, Haorui Chen, Jiaxuan Ren et al.
Emergence and Evolution of Interpretable Concepts in Diffusion Models
Berk Tinaz, Zalan Fabian, Mahdi Soltanolkotabi
Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development
Daoyuan Chen, Haibin Wang, Yilun Huang et al.
Conformal Linguistic Calibration: Trading-off between Factuality and Specificity
Zhengping Jiang, Anqi Liu, Ben Van Durme
PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation
Pablo Lemos, Sammy Sharief, Nikolay Malkin et al.
Generating Freeform Endoskeletal Robots
Muhan Li, Lingji Kong, Sam Kriegman
Triples as the Key: Structuring Makes Decomposition and Verification Easier in LLM-based TableQA
Zhen Yang, Ziwei Du, Minghan Zhang et al.
AVF-MAE++: Scaling Affective Video Facial Masked Autoencoders via Efficient Audio-Visual Self-Supervised Learning
Xuecheng Wu, Heli Sun, Yifan Wang et al.
Out of Length Text Recognition with Sub-String Matching
Yongkun Du, Zhineng Chen, Caiyan Jia et al.
GS-LIVM: Real-Time Photo-Realistic LiDAR-Inertial-Visual Mapping with Gaussian Splatting
Yusen XIE, Zhenmin Huang, Jin Wu et al.
Evaluating Neuron Explanations: A Unified Framework with Sanity Checks
Tuomas Oikarinen, Ge Yan, Lily Weng
RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness
Fanhu Zeng, Haiyang Guo, Fei Zhu et al.
SVGBuilder: Component-Based Colored SVG Generation with Text-Guided Autoregressive Transformers
Zehao Chen, Rong Pan
CausalRivers - Scaling up benchmarking of causal discovery for real-world time-series
Gideon Stein, Maha Shadaydeh, Jan Blunk et al.
Training-Free Constrained Generation With Stable Diffusion Models
Stefano Zampini, Jacob K Christopher, Luca Oneto et al.
Keeping an Eye on LLM Unlearning: The Hidden Risk and Remedy
Jie Ren, Zhenwei Dai, Xianfeng Tang et al.
Robustness Auditing for Linear Regression: To Singularity and Beyond
Ittai Rubinstein, Samuel Hopkins
PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer
Pierre-David Letourneau, Manish Singh, Hsin-Pai Cheng et al.
Multi-Perspective Data Augmentation for Few-shot Object Detection
Anh-Khoa Nguyen Vu, Quoc Truong Truong, Vinh-Tiep Nguyen et al.
Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation
Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation
Yuying Ge, Yizhuo Li, Yixiao Ge et al.
Enhancing Privacy-Utility Trade-offs to Mitigate Memorization in Diffusion Models
Chen Chen, Daochang Liu, Mubarak Shah et al.
Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing
Yudong Liu, Jingwei Sun, Yueqian Lin et al.
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li, Cristiano Saltori, Fabio Poiesi et al.
Adaptive Calibration: A Unified Conversion Framework of Spiking Neural Networks
Ziqing Wang, Yuetong Fang, Jiahang Cao et al.
ARIG: Autoregressive Interactive Head Generation for Real-time Conversations
Ying Guo, Xi Liu, Cheng Zhen et al.
DISCO: learning to DISCover an evolution Operator for multi-physics-agnostic prediction
Rudy Morel, Jiequn Han, Edouard Oyallon
CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension
Rui Li, Zeyu Zhang, Xiaohe Bo et al.
Generalizable Sensor-Based Activity Recognition via Categorical Concept Invariant Learning
Di Xiong, Shuoyuan Wang, Lei Zhang et al.
Noisy Label Calibration for Multi-View Classification
Shilin Xu, Yuan Sun, Xingfeng Li et al.
NOVA: A Benchmark for Rare Anomaly Localization and Clinical Reasoning in Brain MRI
Cosmin Bercea, Jun Li, Philipp Raffler et al.
Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity
Artavazd Maranjyan, Alexander Tyurin, Peter Richtarik
ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints
Divij Handa, Pavel Dolin, Shrinidhi Kumbhar et al.
Scene Map-based Prompt Tuning for Navigation Instruction Generation
Sheng Fan, Rui Liu, Wenguan Wang et al.
NeSyC: A Neuro-symbolic Continual Learner For Complex Embodied Tasks in Open Domains
Wonje Choi, Jinwoo Park, Sanghyun Ahn et al.
Value-Guided Search for Efficient Chain-of-Thought Reasoning
Kaiwen Wang, Jin Zhou, Jonathan Chang et al.
HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene
Jianing Chen, Zehao Li, Yujun Cai et al.
Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
Michal Nauman, Marek Cygan, Carmelo Sferrazza et al.
Learning Complex Heterogeneous Multimodal Fake News via Social Latent Network Inference
Mingxin Li, Yuchen Zhang, Haowei Xu et al.
Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution
Zhanyi Sun, Shuran Song
DAMO: Decoding by Accumulating Activations Momentum for Mitigating Hallucinations in Vision-Language Models
Kaishen Wang, Hengrui Gu, Meijun Gao et al.
Learning Fine-Grained Representations through Textual Token Disentanglement in Composed Video Retrieval
Yue Wu, Zhaobo Qi, Yiling Wu et al.
Solving Robust Markov Decision Processes: Generic, Reliable, Efficient
Tobias Meggendorfer, Maximilian Weininger, Patrick Wienhöft