Most Cited 2025 Poster Papers
Universal Scene Graph Generation
Shengqiong Wu, Hao Fei, Tat-seng Chua
SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost
Haiyang Mei, Pengyu Zhang, Mike Zheng Shou
FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design
Asal Mehradfar, Xuzhe Zhao, Yilun Huang et al.
ForestLPR: LiDAR Place Recognition in Forests Attentioning Multiple BEV Density Images
Yanqing Shen, Turcan Tuna, Marco Hutter et al.
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
Rick Akkerman, Haiwen Feng, Michael J. Black et al.
OptiScene: LLM-driven Indoor Scene Layout Generation via Scaled Human-aligned Data Synthesis and Multi-Stage Preference Optimization
Yixuan Yang, Zhen Luo, Tongsheng Ding et al.
Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
Jitesh Jain, Zhengyuan Yang, Humphrey Shi et al.
SViM3D: Stable Video Material Diffusion for Single Image 3D Generation
Andreas Engelhardt, Mark Boss, Vikram Voleti et al.
DistillDrive: End-to-End Multi-Mode Autonomous Driving Distillation by Isomorphic Hetero-Source Planning Model
Rui Yu, Xianghang Zhang, Runkai Zhao et al.
A Token-level Text Image Foundation Model for Document Understanding
Tongkun Guan, Zining Wang, Pei Fu et al.
GS-ID: Illumination Decomposition on Gaussian Splatting via Adaptive Light Aggregation and Diffusion-Guided Material Priors
Kang Du, Zhihao Liang, Yulin Shen et al.
Thought Communication in Multiagent Collaboration
Yujia Zheng, Zhuokai Zhao, Zijian Li et al.
GS-DiT: Advancing Video Generation with Dynamic 3D Gaussian Fields through Efficient Dense 3D Point Tracking
Weikang Bian, Zhaoyang Huang, Xiaoyu Shi et al.
DEAL: Data-Efficient Adversarial Learning for High-Quality Infrared Imaging
Zhu Liu, Zijun Wang, Jinyuan Liu et al.
Are Images Indistinguishable to Humans Also Indistinguishable to Classifiers?
Zebin You, Xinyu Zhang, Hanzhong Guo et al.
PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement
Teng Hu, Zhentao Yu, Zhengguang Zhou et al.
DH-FaceVid-1K: A Large-Scale High-Quality Dataset for Face Video Generation
Donglin Di, He Feng, Wenzhang Sun et al.
Learning 3D Object Spatial Relationships from Pre-trained 2D Diffusion Models
Sangwon Baik, Hyeonwoo Kim, Hanbyul Joo
GuardSplat: Efficient and Robust Watermarking for 3D Gaussian Splatting
Zixuan Chen, Guangcong Wang, Jiahao Zhu et al.
MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network
Jianfei Jiang, Qiankun Liu, Haochen Yu et al.
SMoLoRA: Exploring and Defying Dual Catastrophic Forgetting in Continual Visual Instruction Tuning
Ziqi Wang, Chang Che, Qi Wang et al.
Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework
Laura Kopf, Nils Feldhus, Kirill Bykov et al.
EAMamba: Efficient All-Around Vision State Space Model for Image Restoration
Yu-Cheng Lin, Yu-Syuan Xu, Hao-Wei Chen et al.
7DGS: Unified Spatial-Temporal-Angular Gaussian Splatting
Zhongpai Gao, Benjamin Planche, Meng Zheng et al.
EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Network
Michael Arbel, David Salinas, Frank Hutter
HyperLoRA: Parameter-Efficient Adaptive Generation for Portrait Synthesis
Mengtian Li, Jinshu Chen, Wanquan Feng et al.
SPRINT: Enabling Interleaved Planning and Parallelized Execution in Reasoning Models
Emil Biju, Shayan Talaei, Zhemin Huang et al.
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
Xin Wen, Bingchen Zhao, Yilun Chen et al.
CAVIS: Context-Aware Video Instance Segmentation
Seunghun Lee, Jiwan Seo, Kiljoon Han et al.
Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization
Cong Wang, Zexuan Deng, Zhiwei Jiang et al.
ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools
Shaofeng Yin, Ting Lei, Yang Liu
On Inductive Biases That Enable Generalization in Diffusion Transformers
Jie An, De Wang, Pengsheng Guo et al.
StochasticSplats: Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting
Shakiba Kheradmand, Delio Vicini, George Kopanas et al.
TurboTrain: Towards Efficient and Balanced Multi-Task Learning for Multi-Agent Perception and Prediction
Zewei Zhou, Zhihao Zhao, Tianhui Cai et al.
Rectified Point Flow: Generic Point Cloud Pose Estimation
Tao Sun, Liyuan Zhu, Shengyu Huang et al.
BIGS: Bimanual Category-agnostic Interaction Reconstruction from Monocular Videos via 3D Gaussian Splatting
Jeongwan On, Kyeonghwan Gwak, Gunyoung Kang et al.
SP2T: Sparse Proxy Attention for Dual-stream Point Transformer
Jiaxu Wan, Hong Zhang, Ziqi He et al.
Simplification Is All You Need against Out-of-Distribution Overconfidence
Keke Tang, Chao Hou, Weilong Peng et al.
On Denoising Walking Videos for Gait Recognition
Dongyang Jin, Chao Fan, Jingzhe Ma et al.
Exploring Diffusion Transformer Designs via Grafting
Keshigeyan Chandrasegaran, Michael Poli, Dan Fu et al.
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
Rui Zhao, Weijia Mao, Mike Zheng Shou
GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation
Ziqin Huang, Gu Wang, Chenyangguang Zhang et al.
SVG-IR: Spatially-Varying Gaussian Splatting for Inverse Rendering
Hanxiao Sun, Yupeng Gao, Jin Xie et al.
CASAGPT: Cuboid Arrangement and Scene Assembly for Interior Design
Weitao Feng, Hang Zhou, Jing Liao et al.
Robust Multi-View Learning via Representation Fusion of Sample-Level Attention and Alignment of Simulated Perturbation
Jie Xu, Na Zhao, Gang Niu et al.
Learning to Unlearn while Retaining: Combating Gradient Conflicts in Machine Unlearning
Gaurav Patel, Qiang Qiu
The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training
Weize Chen, Jiarui Yuan, Jin Tailin et al.
Multi-Label Prototype Visual Spatial Search for Weakly Supervised Semantic Segmentation
Songsong Duan, Xi Yang, Nannan Wang
Flexible Language Modeling in Continuous Space with Transformer-based Autoregressive Flows
Ruixiang Zhang, Shuangfei Zhai, Jiatao Gu et al.
Alternating Gradient Flows: A Theory of Feature Learning in Two-layer Neural Networks
Daniel Kunin, Giovanni Luca Marchetti, Feng Chen et al.
NeurIPT: Foundation Model for Neural Interfaces
Zitao Fang, Chenxuan Li, Hongting Zhou et al.
Computational Algebra with Attention: Transformer Oracles for Border Basis Algorithms
Hiroshi Kera, Nico Pelleriti, Yuki Ishihara et al.
Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations
Chaofan Gan, Yuanpeng Tu, Xi Chen et al.
Learning to Generalize without Bias for Open-Vocabulary Action Recognition
Yating Yu, Congqi Cao, Yifan Zhang et al.
USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting
Kang Chen, Jiyuan Zhang, Zecheng Hao et al.
DaCapo: Score Distillation as Stacked Bridge for Fast and High-quality 3D Editing
Yufei Huang, Bangyan Liao, Yuqi Hu et al.
Enhancing Online Continual Learning with Plug-and-Play State Space Model and Class-Conditional Mixture of Discretization
Sihao Liu, Yibo Yang, Xiaojie Li et al.
Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation
Xiang Li, Zixuan Huang, Anh Thai et al.
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models
Yulei Qin, Gang Li, Zongyi Li et al.
Learning Linear Attention in Polynomial Time
Morris Yau, Ekin Akyürek, Jiayuan Mao et al.
HOP: Heterogeneous Topology-based Multimodal Entanglement for Co-Speech Gesture Generation
Hongye Cheng, Tianyu Wang, Guangsi Shi et al.
Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective
Yang Zhang, Xinran Li, Jianing Ye et al.
Gatekeeper: Improving Model Cascades Through Confidence Tuning
Stephan Rabanser, Nathalie Rauschmayr, Achin Kulshrestha et al.
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering
Rushi Qiang, Yuchen Zhuang, Yinghao Li et al.
UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming
Hao Lin, Ke Wu, Jie Li et al.
Anchored Diffusion Language Model
Litu Rout, Constantine Caramanis, Sanjay Shakkottai
Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling
Mónika Farsang, Radu Grosu
INS-MMBench: A Comprehensive Benchmark for Evaluating LVLMs' Performance in Insurance
Chenwei Lin, Hanjia Lyu, Xian Xu et al.
The Devil is in Low-Level Features for Cross-Domain Few-Shot Segmentation
Yuhan Liu, Yixiong Zou, Yuhua Li et al.
Let Humanoids Hike! Integrative Skill Development on Complex Trails
Kwan-Yee Lin, Stella X. Yu
NuiScene: Exploring Efficient Generation of Unbounded Outdoor Scenes
Han-Hung Lee, Qinghong Han, Angel Chang
Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors
Zhengfei Kuang, Tianyuan Zhang, Kai Zhang et al.
QuickSplat: Fast 3D Surface Reconstruction via Learned Gaussian Initialization
Yueh-Cheng Liu, Lukas Höllein, Matthias Nießner et al.
ETA: Efficiency through Thinking Ahead, A Dual Approach to Self-Driving with Large Models
Shadi Hamdan, Chonghao Sima, Zetong Yang et al.
Modeling Microenvironment Trajectories on Spatial Transcriptomics with NicheFlow
Kristiyan Sakalyan, Alessandro Palma, Filippo Guerranti et al.
Enhancing Partially Relevant Video Retrieval with Hyperbolic Learning
Jun Li, Jinpeng Wang, Chaolei Tan et al.
VideoLLaMB: Long Streaming Video Understanding with Recurrent Memory Bridges
Yuxuan Wang, Yiqi Song, Cihang Xie et al.
Better Language Model Inversion by Compactly Representing Next-Token Distributions
Murtaza Nazir, Matthew Finlayson, John Morris et al.
Learning single index models via harmonic decomposition
Nirmit Joshi, Hugo Koubbi, Theodor Misiakiewicz et al.
Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions
Siqi Kou, Qingyuan Tian, Hanwen Xu et al.
Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning
Jiashun Liu, Zihao Wu, Johan Obando Ceron et al.
Vision Function Layer in Multimodal LLMs
Cheng Shi, Yizhou Yu, Sibei Yang
LISAt: Language-Instructed Segmentation Assistant for Satellite Imagery
Jerome Quenum, Wen-Han Hsieh, Tsung-Han (Patrick) Wu et al.
Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation
Yiming Qin, Zhu Xu, Yang Liu
Interpretable Global Minima of Deep ReLU Neural Networks on Sequentially Separable Data
Thomas Chen, Patricia Muñoz Ewald
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment
Bryan Sangwoo Kim, Jeongsol Kim, Jong Chul Ye
FreeScene: Mixed Graph Diffusion for 3D Scene Synthesis from Free Prompts
Tongyuan Bai, Wangyuanfan Bai, Dong Chen et al.
Taste More, Taste Better: Diverse Data and Strong Model Boost Semi-Supervised Crowd Counting
Maochen Yang, Zekun Li, Jian Zhang et al.
Segment Any-Quality Images with Generative Latent Space Enhancement
Guangqian Guo, Yong Guo, Xuehui Yu et al.
Neural Thermodynamics: Entropic Forces in Deep and Universal Representation Learning
Liu Ziyin, Yizhou Xu, Isaac Chuang
Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations
Haitong Liu, Kuofeng Gao, Yang Bai et al.
Deterministic Image-to-Image Translation via Denoising Brownian Bridge Models with Dual Approximators
Bohan Xiao, Peiyong Wang, Qisheng He et al.
Balanced Direction from Multifarious Choices: Arithmetic Meta-Learning for Domain Generalization
Xiran Wang, Jian Zhang, Lei Qi et al.
Optimized Minimal 3D Gaussian Splatting
Joo Chan Lee, Jong Hwan Ko, Eunbyung Park
SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception
Yaniv Benny, Lior Wolf
Permissioned LLMs: Enforcing Access Control in Large Language Models
Bargav Jayaraman, Virendra Marathe, Hamid Mozaffari et al.
Single Domain Generalization for Few-Shot Counting via Universal Representation Matching
Xianing Chen, Si Huo, Borui Jiang et al.
ZeroSep: Separate Anything in Audio with Zero Training
Chao Huang, Yuesheng Ma, Junxuan Huang et al.
Gene Regulatory Network Inference in the Presence of Selection Bias and Latent Confounders
Gongxu Luo, Haoyue Dai, Longkang Li et al.
End-to-End Implicit Neural Representations for Classification
Alexander Gielisse, Jan van Gemert
PTDiffusion: Free Lunch for Generating Optical Illusion Hidden Pictures with Phase-Transferred Diffusion Model
Xiang Gao, Shuai Yang, Jiaying Liu
System Prompt Optimization with Meta-Learning
Yumin Choi, Jinheon Baek, Sung Ju Hwang
StelLA: Subspace Learning in Low-rank Adaptation using Stiefel Manifold
Zhizhong Li, Sina Sajadmanesh, Jingtao Li et al.
Generative Zoo
Tomasz Niewiadomski, Anastasios Yiannakidis, Hanz Cuevas Velasquez et al.
AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees
Yangning Li, Shaoshen Chen, Yinghui Li et al.
Steering Generative Models with Experimental Data for Protein Fitness Optimization
Jason Yang, Wenda Chu, Daniel Khalil et al.
Temporal Unlearnable Examples: Preventing Personal Video Data from Unauthorized Exploitation by Object Tracking
Qiangqiang Wu, Yi Yu, Chenqi Kong et al.
Understanding Multi-Task Activities from Single-Task Videos
Yuhan Shen, Ehsan Elhamifar
Brain-Informed Fine-Tuning for Improved Multilingual Understanding in Language Models
Anuja Negi, Subbareddy Oota, Anwar Nunez-Elizalde et al.
Characterizing the Expressivity of Fixed-Precision Transformer Language Models
Jiaoda Li, Ryan Cotterell
Spectral State Space Model for Rotation-Invariant Visual Representation Learning
Sahar Dastani, Ali Bahri, Moslem Yazdanpanah et al.
LightSwitch: Multi-view Relighting with Material-guided Diffusion
Yehonathan Litman, Fernando De la Torre, Shubham Tulsiani
URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration
Rui Xu, Yuzhen Niu, Yuezhou Li et al.
Mono3DVLT: Monocular-Video-Based 3D Visual Language Tracking
Hongkai Wei, Yang Yang, Shijie Sun et al.
G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning
Xiaojun Guo, Ang Li, Yifei Wang et al.
Conditional Panoramic Image Generation via Masked Autoregressive Modeling
Chaoyang Wang, Xiangtai Li, Lu Qi et al.
ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting
Ruijie Zhu, Mulin Yu, Linning Xu et al.
CellVerse: Do Large Language Models Really Understand Cell Biology?
Fan Zhang, Tianyu Liu, Zhihong Zhu et al.
From Style to Facts: Mapping the Boundaries of Knowledge Injection with Finetuning
Eric Zhao, Pranjal Awasthi, Nika Haghtalab
DuoLoRA: Cycle-consistent and Rank-disentangled Content-Style Personalization
Aniket Roy, Shubhankar Borse, Shreya Kadambi et al.
Order-One Rolling Shutter Cameras
Marvin Anas Hahn, Kathlén Kohn, Orlando Marigliano et al.
Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation
Erfan Baghaei Potraghloo, Seyedarmin Azizi, Souvik Kundu et al.
PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution
Shian Du, Menghan Xia, Chang Liu et al.
Watermarking Autoregressive Image Generation
Nikola Jovanović, Ismail Labiad, Tomas Soucek et al.
Synchronized Video-to-Audio Generation via Mel Quantization-Continuum Decomposition
Juncheng Wang, Chao Xu, Cheng Yu et al.
Floxels: Fast Unsupervised Voxel Based Scene Flow Estimation
David T. Hoffmann, Syed Haseeb Raza, Hanqiu Jiang et al.
ResGS: Residual Densification of 3D Gaussian for Efficient Detail Recovery
Yanzhe Lyu, Kai Cheng, Kang Xin et al.
Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation
Mehrdad Noori, David Osowiechi, Gustavo Vargas Hakim et al.
Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization
Jamie Wynn, Zawar Qureshi, Jakub Powierza et al.
Auto-Regressively Generating Multi-View Consistent Images
JiaKui Hu, Yuxiao Yang, Jialun Liu et al.
EgoThinker: Unveiling Egocentric Reasoning with Spatio-Temporal CoT
Baoqi Pei, Yifei Huang, Jilan Xu et al.
Dynamic Multimodal Prototype Learning in Vision-Language Models
Xingyu Zhu, Shuo Wang, Beier Zhu et al.
SSHNet: Unsupervised Cross-modal Homography Estimation via Problem Reformulation and Split Optimization
Junchen Yu, Siyuan Cao, Runmin Zhang et al.
PriorMotion: Generative Class-Agnostic Motion Prediction with Raster-Vector Motion Field Priors
Kangan Qian, Jinyu Miao, Xinyu Jiao et al.
EgoM2P: Egocentric Multimodal Multitask Pretraining
Gen Li, Yutong Chen, Yiqian Wu et al.
Bisecle: Binding and Separation in Continual Learning for Video Language Understanding
Yue Tan, Xiaoqian Hu, Hao Xue et al.
DistinctAD: Distinctive Audio Description Generation in Contexts
Bo Fang, Wenhao Wu, Qiangqiang Wu et al.
The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation
Patrick Kahardipraja, Reduan Achtibat, Thomas Wiegand et al.
BiLoRA: Almost-Orthogonal Parameter Spaces for Continual Learning
Hao Zhu, Yifei Zhang, Junhao Dong et al.
Towards Generalizable Trajectory Prediction using Dual-Level Representation Learning and Adaptive Prompting
Kaouther Messaoud, Matthieu Cord, Alex Alahi
SnowMaster: Comprehensive Real-world Image Desnowing via MLLM with Multi-Model Feedback Optimization
Jianyu Lai, Sixiang Chen, Yunlong Lin et al.
Understanding Contrastive Learning via Gaussian Mixture Models
Parikshit Bansal, Ali Kavis, Sujay Sanghavi
AgentBreeder: Mitigating the AI Safety Risks of Multi-Agent Scaffolds via Self-Improvement
J Rosser, Jakob Foerster
TrustMark: Robust Watermarking and Watermark Removal for Arbitrary Resolution Images
Tu Bui, Shruti Agarwal, John Collomosse
Optimal Spectral Transitions in High-Dimensional Multi-Index Models
Leonardo Defilippis, Yatin Dandi, Pierre Mergny et al.
DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval
Leqi Shen, Guoqiang Gong, Tianxiang Hao et al.
QuCOOP: A Versatile Framework for Solving Composite and Binary-Parametrised Problems on Quantum Annealers
Natacha Kuete Meli, Vladislav Golyanik, Marcel Seelbach Benkner et al.
Repurposing 2D Diffusion Models with Gaussian Atlas for 3D Generation
Tiange Xiang, Kai Li, Chengjiang Long et al.
Doubly Robust Alignment for Large Language Models
Erhan Xu, Kai Ye, Hongyi Zhou et al.
LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion
Muchen Li, Sammy Christen, Chengde Wan et al.
Towards foundational LiDAR world models with efficient latent flow matching
Tianran Liu, Shengwen Zhao, Nicholas Rhinehart
ZeroVO: Visual Odometry with Minimal Assumptions
Lei Lai, Zekai Yin, Eshed Ohn-Bar
Do different prompting methods yield a common task representation in language models?
Guy Davidson, Todd Gureckis, Brenden Lake et al.
CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects
Huaijin Pi, Zhi Cen, Zhiyang Dou et al.
Dual-Agent Optimization framework for Cross-Domain Few-Shot Segmentation
Zhaoyang Li, Yuan Wang, Wangkai Li et al.
Self-Calibrated Variance-Stabilizing Transformations for Real-World Image Denoising
Sébastien Herbreteau, Michael Unser
GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning
Kelin Yu, Sheng Zhang, Harshit Soora et al.
Enhanced then Progressive Fusion with View Graph for Multi-View Clustering
Zhibin Dong, Meng Liu, Siwei Wang et al.
Unity in Diversity: Video Editing via Gradient-Latent Purification
Junyu Gao, Kunlin Yang, Xuan Yao et al.
Lie Detector: Unified Backdoor Detection via Cross-Examination Framework
Xuan Wang, Siyuan Liang, Dongping Liao et al.
Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs
Jie Ma, Ning Qu, Zhitao Gao et al.
Capturing Individual Human Preferences with Reward Features
Andre Barreto, Vincent Dumoulin, Yiran Mao et al.
BG-Triangle: Bézier Gaussian Triangle for 3D Vectorization and Rendering
Minye Wu, Haizhao Dai, Kaixin Yao et al.
Feedback Guidance of Diffusion Models
Felix Koulischer, Florian Handke, Johannes Deleu et al.
Robust-MVTON: Learning Cross-Pose Feature Alignment and Fusion for Robust Multi-View Virtual Try-On
Nannan Zhang, Yijiang Li, Dong Du et al.
PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection
Wei Li, Pin-Yu Chen, Sijia Liu et al.
TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving
Yanping Fu, Xinyuan Liu, Tianyu Li et al.
Decomposing Interventional Causality into Synergistic, Redundant, and Unique Components
Abel Jansma
UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation
Chaitanya Patel, Hiroki Nakamura, Yuta Kyuragi et al.
CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation
Xiangyang Luo, Ye Zhu, Yunfei Liu et al.
Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs
Hao Kang, Qingru Zhang, Han Cai et al.
Knowledge Distillation with Refined Logits
Wujie Sun, Defang Chen, Siwei Lyu et al.
GECKO: Gigapixel Vision-Concept Contrastive Pretraining in Histopathology
Saarthak Kapse, Pushpak Pati, Srikar Yellapragada et al.
STRCMP: Integrating Graph Structural Priors with Language Models for Combinatorial Optimization
Xijun Li, Jiexiang Yang, Jinghao Wang et al.
PanoLlama: Generating Endless and Coherent Panoramas with Next-Token-Prediction LLMs
Teng Zhou, Xiaoyu Zhang, Yongchuan Tang
MIRA: Medical Time Series Foundation Model for Real-World Health Data
Hao Li, Bowen Deng, Chang Xu et al.
Joint Relational Database Generation via Graph-Conditional Diffusion Models
Mohamed Amine Ketata, David Lüdke, Leo Schwinn et al.
3D Dental Model Segmentation with Geometrical Boundary Preserving
Shufan Xi, Zexian Liu, Junlin Chang et al.
Statistical inference for Linear Stochastic Approximation with Markovian Noise
Sergey Samsonov, Marina Sheshukova, Eric Moulines et al.
Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs
Yi Hu, Shijia Kang, Haotong Yang et al.
CutS3D: Cutting Semantics in 3D for 2D Unsupervised Instance Segmentation
Leon Sick, Dominik Engel, Sebastian Hartwig et al.
Small Singular Values Matter: A Random Matrix Analysis of Transformer Models
Max Staats, Matthias Thamm, Bernd Rosenow
Neural Hierarchical Decomposition for Single Image Plant Modeling
Zhihao Liu, Zhanglin Cheng, Naoto Yokoya
BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions
Wonyong Seo, Jihyong Oh, Munchurl Kim
GRAVER: Generative Graph Vocabularies for Robust Graph Foundation Models Fine-tuning
Haonan Yuan, Qingyun Sun, Junhua Shi et al.
ANNEXE: Unified Analyzing, Answering, and Pixel Grounding for Egocentric Interaction
Yuejiao Su, Yi Wang, Qiongyang Hu et al.
Tight Lower Bounds and Improved Convergence in Performative Prediction
Pedram Khorsandi, Rushil Gupta, Mehrnaz Mofakhami et al.
MARBLE: Material Recomposition and Blending in CLIP-Space
Ta-Ying Cheng, Prafull Sharma, Mark Boss et al.
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale
Joya Chen, Yiqi Lin, Ziyun Zeng et al.
DSV-LFS: Unifying LLM-Driven Semantic Cues with Visual Features for Robust Few-Shot Segmentation
Amin Karimi, Charalambos Poullis
Memories of Forgotten Concepts
Matan Rusanovsky, Shimon Malnick, Amir Jevnisek et al.
Doppelgangers++: Improved Visual Disambiguation with Geometric 3D Features
Yuanbo Xiangli, Ruojin Cai, Hanyu Chen et al.
MixerMDM: Learnable Composition of Human Motion Diffusion Models
Pablo Ruiz-Ponce, German Barquero, Cristina Palmero et al.
PolarFree: Polarization-based Reflection-Free Imaging
Mingde Yao, Menglu Wang, King Man Tam et al.
OmniStereo: Real-time Omnidirectional Depth Estimation with Multiview Fisheye Cameras
Jiaxi Deng, Yushen Wang, Haitao Meng et al.
H-MoRe: Learning Human-centric Motion Representation for Action Analysis
Zhanbo Huang, Xiaoming Liu, Yu Kong
Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene
Shengqiong Wu, Hao Fei, Jingkang Yang et al.
Scalable and Cost-Efficient de Novo Template-Based Molecular Generation
Piotr Gaiński, Oussama Boussif, Andrei Rekesh et al.
SmartCLIP: Modular Vision-language Alignment with Identification Guarantees
Shaoan Xie, Lingjing Kong, Yujia Zheng et al.
When Thinking Drifts: Evidential Grounding for Robust Video Reasoning
Romy Luo, Zihui (Sherry) Xue, Alex Dimakis et al.