Most Cited 2025 "hardware robotic control" Papers
Learning Visual Hierarchies in Hyperbolic Space for Image Retrieval
Ziwei Wang, Sameera Ramasinghe, Chenchen Xu et al.
Doctor Approved: Generating Medically Accurate Skin Disease Images through AI-Expert Feedback
Janet Wang, Yunbei Zhang, Zhengming Ding et al.
Estimating Interventional Distributions with Uncertain Causal Graphs through Meta-Learning
Anish Dhir, Cristiana Diaconu, Valentinian Lungu et al.
CellCLIP - Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning
MingYu Lu, Ethan Weinberger, Chanwoo Kim et al.
Exploring the Translation Mechanism of Large Language Models
Hongbin Zhang, Kehai Chen, Xuefeng Bai et al.
ViCTr: Vital Consistency Transfer for Pathology Aware Image Synthesis
Onkar Susladkar, Gayatri Deshmukh, Yalcin Tur et al.
Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding
Huy Ta, Duy Anh Huynh, Yutong Xie et al.
Understanding Co-speech Gestures in-the-wild
Sindhu Hegde, K R Prajwal, Taein Kwon et al.
Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework
Thomson Yen, Andrew Siah, Haozhe Chen et al.
Template-Guided 3D Molecular Pose Generation via Flow Matching and Differentiable Optimization
Noémie Bergues, Arthur Carré, Paul Join-Lambert et al.
UMotion: Uncertainty-driven Human Motion Estimation from Inertial and Ultra-wideband Units
Huakun Liu, Hiroki Ota, Xin Wei et al.
D2ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition
Wenjie Pei, Qizhong Tan, Guangming Lu et al.
Everything is a Video: Unifying Modalities through Next-Frame Prediction
G Thomas Hudson, Dean Slack, Thomas Winterbottom et al.
The Bias-Variance Tradeoff in Data-Driven Optimization: A Local Misspecification Perspective
Haixiang Lan, Luofeng Liao, Adam N. Elmachtoub et al.
Provable Ordering and Continuity in Vision-Language Pretraining for Generalizable Embodied Agents
Zhizhen Zhang, Lei Zhu, Zhen Fang et al.
MESS+: Dynamically Learned Inference-Time LLM Routing in Model Zoos with Service Level Guarantees
Herbert Woisetschläger, Ryan Zhang, Shiqiang Wang et al.
On Transferring Transferability: Towards a Theory for Size Generalization
Eitan Levin, Yuxin Ma, Mateo Diaz et al.
LINR-PCGC: Lossless Implicit Neural Representations for Point Cloud Geometry Compression
Wenjie Huang, Qi Yang, Shuting Xia et al.
LoRAverse: A Submodular Framework to Retrieve Diverse Adapters for Diffusion Models
Mert Sonmezer, Matthew Zheng, Pinar Yanardag
The Effect of Optimal Self-Distillation in Noisy Gaussian Mixture Model
Kaito Takanami, Takashi Takahashi, Ayaka Sakata
Improving SAM for Camouflaged Object Detection via Dual Stream Adapters
Jiaming Liu, Linghe Kong, Guihai Chen
Improved Regret Bounds for Linear Bandits with Heavy-Tailed Rewards
Artin Tajdini, Jonathan Scarlett, Kevin Jamieson
Quantifying Distributional Invariance in Causal Subgraph for IRM-Free Graph Generalization
Yang Qiu, Yixiong Zou, Jun Wang et al.
Learning Interactive World Model for Object-Centric Reinforcement Learning
Fan Feng, Phillip Lippe, Sara Magliacane
SAS: Segment Any 3D Scene with Integrated 2D Priors
Zhuoyuan Li, Jiahao Lu, Jiacheng Deng et al.
MosaicDiff: Training-free Structural Pruning for Diffusion Model Acceleration Reflecting Pretraining Dynamics
Bowei Guo, Shengkun Tang, Cong Zeng et al.
Simultaneous Motion And Noise Estimation with Event Cameras
Shintaro Shiba, Yoshimitsu Aoki, Guillermo Gallego
MMCR: Benchmarking Cross-Source Reasoning in Scientific Papers
Yang Tian, Zheng Lu, Mingqi Gao et al.
On the Mechanisms of Weak-to-Strong Generalization: A Theoretical Perspective
Behrad Moniri, Hamed Hassani
Structure Matters: Revisiting Boundary Refinement in Video Object Segmentation
Guanyi Qin, Ziyue Wang, Daiyun Shen et al.
MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models
Hang Hua, Ziyun Zeng, Yizhi Song et al.
Web Artifact Attacks Disrupt Vision Language Models
Maan Qraitem, Piotr Teterwak, Kate Saenko et al.
A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging
Sajad Khodadadian, Martin Zubeldia
Bayes optimal learning of attention-indexed models
Fabrizio Boncoraglio, Emanuele Troiani, Vittorio Erba et al.
Far from the Shallow: Brain-Predictive Reasoning Embedding through Residual Disentanglement
Linyang He, Tianjun Zhong, Richard Antonello et al.
Autoregressive Denoising Score Matching is a Good Video Anomaly Detector
Hanwen Zhang, Congqi Cao, Qinyi Lv et al.
On-Device Self-Supervised Learning of Low-Latency Monocular Depth from Only Events
Jesse Hagenaars, Yilun Wu, Federico Paredes Valles et al.
Fast Monte Carlo Tree Diffusion: 100× Speedup via Parallel and Sparse Planning
Jaesik Yoon, Hyeonseo Cho, Yoshua Bengio et al.
Transformers for Mixed-type Event Sequences
Felix Draxler, Yang Meng, Kai Nelson et al.
Neural Emulator Superiority: When Machine Learning for PDEs Surpasses its Training Data
Felix Koehler, Nils Thuerey
Learn2Synth: Learning Optimal Data Synthesis Using Hypergradients for Brain Image Segmentation
Xiaoling Hu, Xiangrui Zeng, Oula Puonti et al.
GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
Yeonjoon Jung, Daehyun Ahn, Hyungjun Kim et al.
Track Any Anomalous Object: A Granular Video Anomaly Detection Pipeline
Yuzhi Huang, Chenxin Li, Haitao Zhang et al.
DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing
Yang JingYi, Xun Lin, Zitong Yu et al.
Quantifying and Narrowing the Unknown: Interactive Text-to-Video Retrieval via Uncertainty Minimization
Bingqing Zhang, Zhuo Cao, Heming Du et al.
LiON-LoRA: Rethinking LoRA Fusion to Unify Controllable Spatial and Temporal Generation for Video Diffusion
Yisu Zhang, Chenjie Cao, Chaohui Yu et al.
InstaScene: Towards Complete 3D Instance Decomposition and Reconstruction from Cluttered Scenes
Zesong Yang, Bangbang Yang, Wenqi Dong et al.
Ferret: An Efficient Online Continual Learning Framework under Varying Memory Constraints
Yuhao Zhou, Yuxin Tian, Jindi Lv et al.
Where's the Liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content
Haoyue Bai, Yiyou Sun, Wei Cheng et al.
One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory
Chenhao Zheng, Jieyu Zhang, Mohammadreza Salehi et al.
Incomplete Multi-modal Brain Tumor Segmentation via Learnable Sorting State Space Model
Zheyu Zhang, Yayuan Lu, Feipeng Ma et al.
UniFoil: A Universal Dataset of Airfoils in Transitional and Turbulent Regimes for Subsonic and Transonic Flows
Rohit Kanchi, Benjamin Melanson, Nithin Somasekharan et al.
Deep Change Monitoring: A Hyperbolic Representative Learning Framework and a Dataset for Long-term Fine-grained Tree Change Detection
Yante Li, Hanwen Qi, Haoyu Chen et al.
Solving Continuous Mean Field Games: Deep Reinforcement Learning for Non-Stationary Dynamics
Lorenzo Magnino, Kai Shao, Zida Wu et al.
CoMoGaussian: Continuous Motion-Aware Gaussian Splatting from Motion-Blurred Images
Jungho Lee, DongHyeong Kim, Dogyoon Lee et al.
A Provable Approach for End-to-End Safe Reinforcement Learning
Akifumi Wachi, Kohei Miyaguchi, Takumi Tanabe et al.
Flow Density Control: Generative Optimization Beyond Entropy-Regularized Fine-Tuning
Riccardo De Santi, Marin Vlastelica, Ya-Ping Hsieh et al.
Timestep-Aware Diffusion Model for Extreme Image Rescaling
Ce Wang, Zhenyu Hu, Wanjie Sun et al.
Representation Consistency for Accurate and Coherent LLM Answer Aggregation
Junqi Jiang, Tom Bewley, Salim I. Amoukou et al.
SAEMark: Steering Personalized Multilingual LLM Watermarks with Sparse Autoencoders
Zhuohao Yu, Xingru Jiang, Weizheng Gu et al.
metaTextGrad: Automatically optimizing language model optimizers
Guowei Xu, Mert Yuksekgonul, Carlos Guestrin et al.
MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning
Xu Han, Yuan Tang, Jinfeng Xu et al.
Beyond Components: Singular Vector-Based Interpretability of Transformer Circuits
Areeb Ahmad, Abhinav Joshi, Ashutosh Modi
Adversarial Attention Perturbations for Large Object Detection Transformers
Zachary Yahn, Selim Tekin, Fatih Ilhan et al.
FSBench: A Figure Skating Benchmark for Advancing Artistic Sports Understanding
Rong Gao, Xin Liu, Zhuozhao Hu et al.
Towards Large-Scale In-Context Reinforcement Learning by Meta-Training in Randomized Worlds
Fan Wang, Pengtao Shao, Yiming Zhang et al.
Learning Visual Composition through Improved Semantic Guidance
Austin Stone, Hagen Soltau, Robert Geirhos et al.
Fixed-Point RNNs: Interpolating from Diagonal to Dense
Sajad Movahedi, Felix Sarnthein, Nicola Muca Cirone et al.
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
Yonggan Fu, Xin Dong, Shizhe Diao et al.
Clean First, Align Later: Benchmarking Preference Data Cleaning for Reliable LLM Alignment
Samuel (Min-Hsuan) Yeh, Sharon Li
JailbreakDiffBench: A Comprehensive Benchmark for Jailbreaking Diffusion Models
Xiaolong Jin, Zixuan Weng, Hanxi Guo et al.
REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving
Annabelle Sujun Tang, Christopher Priebe, Rohan Mahapatra et al.
DiN: Diffusion Model for Robust Medical VQA with Semantic Noisy Labels
Erjian Guo, Zhen Zhao, Zicheng Wang et al.
V2V3D: View-to-View Denoised 3D Reconstruction for Light Field Microscopy
Jiayin Zhao, Zhenqi Fu, Tao Yu et al.
SKALD: Learning-Based Shot Assembly for Coherent Multi-Shot Video Creation
Chen Yi Lu, Mehrab Tanjim, Ishita Dasgupta et al.
SING: SDE Inference via Natural Gradients
Amber Hu, Henry Smith, Scott Linderman
Disentanglement Beyond Static vs. Dynamic: A Benchmark and Evaluation Framework for Multi-Factor Sequential Representations
Tal Barami, Nimrod Berman, Ilan Naiman et al.
AI Testing Should Account for Sophisticated Strategic Behaviour
Vojta Kovarik, Eric Chen, Sami Petersen et al.
Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text
Guotao Liang, Baoquan Zhang, Zhiyuan Wen et al.
GST-UNet: A Neural Framework for Spatiotemporal Causal Inference with Time-Varying Confounding
Miruna Oprescu, David Park, Xihaier Luo et al.
Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
Tyler Chang, Benjamin Bergen
MUSTAFAR: Promoting Unstructured Sparsity for KV Cache Pruning in LLM Inference
Donghyeon Joo, Helya Hosseini, Ramyad Hadidi et al.
VA-GS: Enhancing the Geometric Representation of Gaussian Splatting via View Alignment
Qing Li, Huifang Feng, Xun Gong et al.
DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior
Junzhe Lu, Jing Lin, Hongkun Dou et al.
Handling Spatial-Temporal Data Heterogeneity for Federated Continual Learning via Tail Anchor
Hao Yu, Xin Yang, Le Zhang et al.
Cross-Modal Representational Knowledge Distillation for Enhanced Spike-informed LFP Modeling
Eray Erturk, Saba Hashemi, Maryam Shanechi
Infighting in the Dark: Multi-Label Backdoor Attack in Federated Learning
Ye Li, Yanchao Zhao, Chengcheng Zhu et al.
HiGarment: Cross-modal Harmony Based Diffusion Model for Flat Sketch to Realistic Garment Image
Junyi Guo, Jingxuan Zhang, Fangyu Wu et al.
Robo2VLM: Improving Visual Question Answering using Large-Scale Robot Manipulation Data
Kaiyuan Eric Chen, Shuangyu Xie, Zehan Ma et al.
MoE-Gyro: Self-Supervised Over-Range Reconstruction and Denoising for MEMS Gyroscopes
Feiyang Pan, Shenghe Zheng, Chunyan Yin et al.
Generalized Linear Mode Connectivity for Transformers
Alexander Theus, Alessandro Cabodi, Sotiris Anagnostidis et al.
SynTab-LLaVA: Enhancing Multimodal Table Understanding with Decoupled Synthesis
Bangbang Zhou, Zuan Gao, Zixiao Wang et al.
What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers
Pulkit Gopalani, Wei Hu
Affine-Invariant Global Non-Asymptotic Convergence Analysis of BFGS under Self-Concordance
Qiujiang Jin, Aryan Mokhtari
Fast Globally Optimal and Geometrically Consistent 3D Shape Matching
Paul Roetzer, Florian Bernard
Beyond Human Perception: Understanding Multi-Object World from Monocular View
Keyu Guo, Yongle Huang, Shijie Sun et al.
Breaking the Performance Ceiling in Reinforcement Learning requires Inference Strategies
Felix Chalumeau, Daniel Rajaonarivonivelomanantsoa, Ruan John de Kock et al.
CosmoBench: A Multiscale, Multiview, Multitask Cosmology Benchmark for Geometric Deep Learning
Teresa Huang, Richard Stiskalek, Jun-Young Lee et al.
Detoxifying Large Language Models via Autoregressive Reward Guided Representation Editing
Yisong Xiao, Aishan Liu, Siyuan Liang et al.
CALM-PDE: Continuous and Adaptive Convolutions for Latent Space Modeling of Time-dependent PDEs
Jan Hagnberger, Daniel Musekamp, Mathias Niepert
Unsupervised Learning for Optimal Transport plan prediction between unbalanced graphs
Sonia Mazelet, Rémi Flamary, Bertrand Thirion
Connecting Neural Models Latent Geometries with Relative Geodesic Representations
Hanlin Yu, Berfin Inal, Georgios Arvanitidis et al.
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy and Research
A. Feder Cooper, Christopher A. Choquette-Choo, Miranda Bogen et al.
BOE-ViT: Boosting Orientation Estimation with Equivariance in Self-Supervised 3D Subtomogram Alignment
Runmin Jiang, Jackson Daggett, Shriya Pingulkar et al.
Optimal Transport-Guided Source-Free Adaptation for Face Anti-Spoofing
Zhuowei Li, Tianchen Zhao, Xiang Xu et al.
Stable Matching with Ties: Approximation Ratios and Learning
Shiyun Lin, Simon Mauras, Nadav Merlis et al.
Incomplete Multi-view Clustering via Hierarchical Semantic Alignment and Cooperative Completion
Xiaojian Ding, Lin Zhao, Xian Li et al.
PreFM: Online Audio-Visual Event Parsing via Predictive Future Modeling
Xiao Yu, Yan Fang, Yao Zhao et al.
Efficient Multimodal Dataset Distillation via Generative Models
Zhenghao Zhao, Haoxuan Wang, Junyi Wu et al.
LoRASuite: Efficient LoRA Adaptation Across Large Language Model Upgrades
Yanan Li, Fanxu Meng, Muhan Zhang et al.
Dynamic Regret Reduces to Kernelized Static Regret
Andrew Jacobsen, Alessandro Rudi, Francesco Orabona et al.
GBC-Splat: Generalizable Gaussian-Based Clothed Human Digitalization under Sparse RGB Cameras
Hanzhang Tu, Zhanfeng Liao, Boyao Zhou et al.
MDP: Multidimensional Vision Model Pruning with Latency Constraint
Xinglong Sun, Barath Lakshmanan, Maying Shen et al.
DH-Set: Improving Vision-Language Alignment with Diverse and Hybrid Set-Embeddings Learning
Kun Zhang, Jingyu Li, Zhe Li et al.
DMQ: Dissecting Outliers of Diffusion Models for Post-Training Quantization
Dongyeun Lee, Jiwan Hur, Hyounguk Shon et al.
Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor
Alexandra Olteanu, Su Lin Blodgett, Agathe Balayn et al.
Where the Devil Hides: Deepfake Detectors Can No Longer Be Trusted
Shuaiwei Yuan, Junyu Dong, Yuezun Li
Dyn-O: Building Structured World Models with Object-Centric Representations
Zizhao Wang, Kaixin Wang, Li Zhao et al.
Hyperbolic Uncertainty-Aware Few-Shot Incremental Point Cloud Segmentation
Tanuj Sur, Samrat Mukherjee, Kaizer Rahaman et al.
Sparse Polyak: an adaptive step size rule for high-dimensional M-estimation
Tianqi Qiao, Marie Maros
Tackling Feature-Classifier Mismatch in Federated Learning via Prompt-Driven Feature Transformation
Xinghao Wu, Xuefeng Liu, Jianwei Niu et al.
Convolution Goes Higher-Order: A Biologically Inspired Mechanism Empowers Image Classification
Simone Azeglio, Olivier Marre, Peter Neri et al.
JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data
Runjian Chen, Wenqi Shao, Bo Zhang et al.
S'MoRE: Structural Mixture of Residual Experts for Parameter-Efficient LLM Fine-tuning
Hanqing Zeng, Yinglong Xia, Zhuokai Zhao et al.
MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs
Tianhao Peng, Haochen Wang, Yuanxing Zhang et al.
SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior
Bo Zhao, Haoran Wang, Jinghui Wang et al.
MultiHuman-Testbench: Benchmarking Image Generation for Multiple Humans
Shubhankar Borse, Seokeon Choi, Sunghyun Park et al.
Selective Learning for Deep Time Series Forecasting
Yisong Fu, Zezhi Shao, Chengqing Yu et al.
Conformal Risk Training: End-to-End Optimization of Conformal Risk Control
Christopher Yeh, Nicolas Christianson, Adam Wierman et al.
Zero-Shot Head Swapping in Real-World Scenarios
Sohyun Jeong, Taewoong Kang, Hyojin Jang et al.
Perceptual Video Compression with Neural Wrapping
Muhammad Umar Karim Khan, Aaron Chadha, Mohammad Ashraful Anam et al.
RobustSplat: Decoupling Densification and Dynamics for Transient-Free 3DGS
Chuanyu Fu, Yuqi Zhang, Kunbin Yao et al.
DALIP: Distribution Alignment-based Language-Image Pre-Training for Domain-Specific Data
Junjie Wu, Jiangtao Xie, Zhaolin Zhang et al.
MagShield: Towards Better Robustness in Sparse Inertial Motion Capture Under Magnetic Disturbances
Yunzhe Shao, Xinyu Yi, Lu Yin et al.
Noise Matters: Optimizing Matching Noise for Diffusion Classifiers
Yanghao Wang, Long Chen
Distributionally Robust Performative Optimization
Zhuangzhuang Jia, Yijie Wang, Roy Dong et al.
Improving Diffusion-based Inverse Algorithms under Few-Step Constraint via Linear Extrapolation
Jiawei Zhang, Ziyuan Liu, Leon Yan et al.
Towards Cross-modal Backward-compatible Representation Learning for Vision-Language Models
Young Kyun Jang, Ser-Nam Lim
A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking
Gal Fadlon, Idan Arbiv, Nimrod Berman et al.
Open-World Drone Active Tracking with Goal-Centered Rewards
Haowei Sun, Jinwu Hu, Zhirui Zhang et al.
Stronger, Steadier & Superior: Geometric Consistency in Depth VFM Forges Domain Generalized Semantic Segmentation
Siyu Chen, Ting Han, Changshe Zhang et al.
Variational Supervised Contrastive Learning
Ziwen Wang, Jiajun Fan, Thao Nguyen et al.
Beyond Single Images: Retrieval Self-Augmented Unsupervised Camouflaged Object Detection
Ji Du, Xin Wang, Fangwei Hao et al.
ADPretrain: Advancing Industrial Anomaly Detection via Anomaly Representation Pretraining
Xincheng Yao, Yan Luo, Zefeng Qian et al.
HAODiff: Human-Aware One-Step Diffusion via Dual-Prompt Guidance
Jue Gong, Tingyu Yang, Jingkai Wang et al.
On the Optimal Construction of Unbiased Gradient Estimators for Zeroth-Order Optimization
Shaocong Ma, Heng Huang
Learning Individual Behavior in Agent-Based Models with Graph Diffusion Networks
Francesco Cozzi, Marco Pangallo, Alan Perotti et al.
MixAT: Combining Continuous and Discrete Adversarial Training for LLMs
Csaba Dékány, Stefan Balauca, Dimitar I. Dimitrov et al.
MetaShadow: Object-Centered Shadow Detection, Removal, and Synthesis
Tianyu Wang, Jianming Zhang, Haitian Zheng et al.
Exploiting Domain Properties in Language-Driven Domain Generalization for Semantic Segmentation
Seogkyu Jeon, Kibeom Hong, Hyeran Byun
Inv-Entropy: A Fully Probabilistic Framework for Uncertainty Quantification in Language Models
Haoyi Song, Ruihan Ji, Naichen Shi et al.
Three Cars Approaching within 100m! Enhancing Distant Geometry by Tri-Axis Voxel Scanning for Camera-based Semantic Scene Completion
Jongseong Bae, Junwoo Ha, Ha Young Kim
Distributional Autoencoders Know the Score
Andrej Leban
Infinite-Width Limit of a Single Attention Layer: Analysis via Tensor Programs
Mana Sakai, Ryo Karakida, Masaaki Imaizumi
Deep Fair Multi-View Clustering with Attention KAN
HaiMing Xu, Qianqian Wang, Boyue Wang et al.
PBR-SR: Mesh PBR Texture Super Resolution from 2D Image Priors
Yujin Chen, Yinyu Nie, Benjamin Ummenhofer et al.
Aligning Constraint Generation with Design Intent in Parametric CAD
Evan Casey, Tianyu Zhang, Shu Ishida et al.
Distinguish Then Exploit: Source-free Open Set Domain Adaptation via Weight Barcode Estimation and Sparse Label Assignment
Weiming Liu, Jun Dan, Fan Wang et al.
StruMamba3D: Exploring Structural Mamba for Self-supervised Point Cloud Representation Learning
Chuxin Wang, Yixin Zha, Wenfei Yang et al.
LinEAS: End-to-end Learning of Activation Steering with a Distributional Loss
Pau Rodriguez, Michal Klein, Eleonora Gualdoni et al.
Decouple to Reconstruct: High Quality UHD Restoration via Active Feature Disentanglement and Reversible Fusion
Yidi Liu, Dong Li, Yuxin Ma et al.
CCL-LGS: Contrastive Codebook Learning for 3D Language Gaussian Splatting
Lei Tian, Xiaomin Li, Liqian Ma et al.
Diffusion Generative Modeling on Lie Group Representations
Marco Bertolini, Tuan Le, Djork-Arné Clevert
Group-Level Data Selection for Efficient Pretraining
Zichun Yu, Fei Peng, Jie Lei et al.
Chirality in Action: Time-Aware Video Representation Learning by Latent Straightening
Piyush Nitin Bagad, Andrew Zisserman
The Generative Leap: Tight Sample Complexity for Efficiently Learning Gaussian Multi-Index Models
Alex Damian, Jason Lee, Joan Bruna
Elucidated Rolling Diffusion Models for Probabilistic Forecasting of Complex Dynamics
Salva Rühling Cachay, Miika Aittala, Karsten Kreis et al.
Structure-aware Semantic Discrepancy and Consistency for 3D Medical Image Self-supervised Learning
Tan Pan, Zhaorui Tan, Kaiyu Guo et al.
FairImagen: Post-Processing for Bias Mitigation in Text-to-Image Models
Zihao Fu, Ryan Brown, Shun Shao et al.
Synthetic Visual Genome
Jae Sung Park, Zixian Ma, Linjie Li et al.
Foresight: Adaptive Layer Reuse for Accelerated and High-Quality Text-to-Video Generation
Muhammad Adnan, Nithesh Kurella, Akhil Arunkumar et al.
Towards Open-World Generation of Stereo Images and Unsupervised Matching
Feng Qiao, Zhexiao Xiong, Eric Xing et al.
Influence Guided Context Selection for Effective Retrieval-Augmented Generation
Jiale Deng, Yanyan Shen, Ziyuan Pei et al.
Autoregressive Distillation of Diffusion Transformers
Yeongmin Kim, Sotiris Anagnostidis, Yuming Du et al.
ShiQ: Bringing back Bellman to LLMs
Pierre Clavier, Nathan Grinsztajn, Raphaël Avalos et al.
SimSort: A Data-Driven Framework for Spike Sorting by Large-Scale Electrophysiology Simulation
Yimu Zhang, Dongqi Han, Yansen Wang et al.
Do ImageNet-trained Models Learn Shortcuts? The Impact of Frequency Shortcuts on Generalization
Shunxin Wang, Raymond Veldhuis, Nicola Strisciuglio
One-Step Event-Driven High-Speed Autofocus
Yuhan Bao, Shaohua Gao, Wenyong Li et al.
From Prototypes to General Distributions: An Efficient Curriculum for Masked Image Modeling
Jinhong Lin, Cheng-En Wu, Huanran Li et al.
Overcoming Shortcut Problem in VLM for Robust Out-of-Distribution Detection
Zhuo Xu, Xiang Xiang, Yifan Liang
Blurry-Edges: Photon-Limited Depth Estimation from Defocused Boundaries
Wei Xu, Charlie Wagner, Junjie Luo et al.
Advancing Visual Large Language Model for Multi-granular Versatile Perception
Wentao Xiang, Haoxian Tan, Cong Wei et al.
MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation
Vladislav Bargatin, Egor Chistov, Alexander Yakovenko et al.
Galaxy Walker: Geometry-aware VLMs For Galaxy-scale Understanding
Tianyu Chen, Xingcheng Fu, Yisen Gao et al.
Tripartite Weight-Space Ensemble for Few-Shot Class-Incremental Learning
Juntae Lee, Munawar Hayat, Sungrack Yun
ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way
Jiazi Bu, Pengyang Ling, Pan Zhang et al.
Guiding Diffusion-Based Articulated Object Generation by Partial Point Cloud Alignment and Physical Plausibility Constraints
Jens U. Kreber, Joerg Stueckler
Feedback-Aware MCTS for Goal-Oriented Information Seeking
Harshita Chopra, Chirag Shah
ROGR: Relightable 3D Objects using Generative Relighting
Jiapeng Tang, Matthew Levine, Dor Verbin et al.
Scale-invariant attention
Ben Anson, Xi Wang, Laurence Aitchison
Color Alignment in Diffusion
Ka Chun Shum, Binh-Son Hua, Thanh Nguyen et al.
Self-Supervised Monocular 4D Scene Reconstruction for Egocentric Videos
Chengbo Yuan, Geng Chen, Li Yi et al.
Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs
Zhixin Xie, Xurui Song, Jun Luo
Consistency Trajectory Matching for One-Step Generative Super-Resolution
Weiyi You, Mingyang Zhang, Leheng Zhang et al.
Scaling Image Geo-Localization to Continent Level
Philipp Lindenberger, Paul-Edouard Sarlin, Jan Hosang et al.
With Limited Data for Multimodal Alignment, Let the STRUCTURE Guide You
Fabian Gröger, Shuo Wen, Huyen Le et al.
RADAR: Benchmarking Language Models on Imperfect Tabular Data
Ken Gu, Zhihan Zhang, Kate Lin et al.
On the Convergence of Single-Timescale Actor-Critic
Navdeep Kumar, Priyank Agrawal, Giorgia Ramponi et al.
LV-MAE: Learning Long Video Representations through Masked-Embedding Autoencoders
Ilan Naiman, Emanuel Baruch Baruch, Oron Anschel et al.
Seek Common Ground While Reserving Differences: Semi-Supervised Image-Text Sentiment Recognition
Wuyou Xia, Guoli Jia, Sicheng Zhao et al.