Most Cited 2025 Poster Papers
A Regularization-Guided Equivariant Approach for Image Restoration
Yulu Bai, Jiahong Fu, Qi Xie et al.
NeedleInATable: Exploring Long-Context Capability of Large Language Models towards Long-Structured Tables
Lanrui Wang, Mingyu Zheng, Hongyin Tang et al.
Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models
Taha Entesari, Arman Hatami, Rinat Khaziev et al.
MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning
Yuxuan Luo, Ryan Yuan, Junwen Chen et al.
GeoRanker: Distance-Aware Ranking for Worldwide Image Geolocalization
Pengyue Jia, Seongheon Park, Song Gao et al.
SFDM: Robust Decomposition of Geometry and Reflectance for Realistic Face Rendering from Sparse-view Images
Daisheng Jin, Jiangbei Hu, Baixin Xu et al.
Finsler Multi-Dimensional Scaling: Manifold Learning for Asymmetric Dimensionality Reduction and Embedding
Thomas Dagès, Simon Weber, Ya-Wei Eileen Lin et al.
Dense Associative Memory with Epanechnikov Energy
Benjamin Hoover, Zhaoyang Shi, Krishnakumar Balasubramanian et al.
FADE: Frequency-Aware Diffusion Model Factorization for Video Editing
Yixuan Zhu, Haolin Wang, Shilin Ma et al.
A3: Few-shot Prompt Learning of Unlearnable Examples with Cross-Modal Adversarial Feature Alignment
Xuan Wang, Xitong Gao, Dongping Liao et al.
SynBrain: Enhancing Visual-to-fMRI Synthesis via Probabilistic Representation Learning
Weijian Mai, Jiamin Wu, Yu Zhu et al.
Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization
Subhojyoti Mukherjee, Viet Lai, Raghavendra Addanki et al.
Generating Computational Cognitive Models using Large Language Models
Milena Rmus, Akshay Kumar Jagadish, Marvin Mathony et al.
Gen3DEval: Using vLLMs for Automatic Evaluation of Generated 3D Objects
Shalini Maiti, Lourdes Agapito, Filippos Kokkinos
MaRI: Material Retrieval Integration across Domains
Jianhui Wang, Zhifei Yang, Yangfan He et al.
Uncertainty Weighted Gradients for Model Calibration
Jinxu Lin, Linwei Tao, Minjing Dong et al.
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
Kaisi Guan, Zhengfeng Lai, Yuchong Sun et al.
Optimistic Query Routing in Clustering-based Approximate Maximum Inner Product Search
Sebastian Bruch, Aditya Krishnan, Franco Maria Nardini
CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models
Xiao An, Jiaxing Sun, Zihan Gui et al.
Efficient Video Face Enhancement with Enhanced Spatial-Temporal Consistency
Yutong Wang, Jiajie Teng, Jiajiong Cao et al.
ODG: Occupancy Prediction Using Dual Gaussians
Yunxiao Shi, Yinhao Zhu, Herbert Cai et al.
A Lightweight UDF Learning Framework for 3D Reconstruction Based on Local Shape Functions
Jiangbei Hu, Yanggeng Li, Fei Hou et al.
Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention
Arya Honarpisheh, Mustafa Bozdag, Octavia Camps et al.
EigenGS Representation: From Eigenspace to Gaussian Image Space
Lo-Wei Tai, Ching-En Li et al.
IM-Zero: Instance-level Motion Controllable Video Generation in a Zero-shot Manner
Yuyang Huang, Yabo Chen, Li Ding et al.
Watermarking One for All: A Robust Watermarking Scheme Against Partial Image Theft
Gaozhi Liu, Silu Cao, Zhenxing Qian et al.
Regression-adjusted Monte Carlo Estimators for Shapley Values and Probabilistic Values
R. Teal Witter, Yurong Liu, Christopher Musco
TRAP: Targeted Redirecting of Agentic Preferences
Hangoo Kang, Jehyeok Yeon, Gagandeep Singh
Metropolis Adjusted Microcanonical Hamiltonian Monte Carlo
Jakob Robnik, Reuben Cohn-Gordon, Uros Seljak
Towards High-fidelity 3D Talking Avatar with Personalized Dynamic Texture
Xuanchen Li, Jianyu Wang, Yuhao Cheng et al.
Decoder Gradient Shield: Provable and High-Fidelity Prevention of Gradient-Based Box-Free Watermark Removal
Haonan An, Guang Hua, Zhengru Fang et al.
Learning from Synchronization: Self-Supervised Uncalibrated Multi-View Person Association in Challenging Scenes
Keqi Chen, Vinkle Srivastav, Didier Mutter et al.
Decoupling Training-Free Guided Diffusion by ADMM
Youyuan Zhang, Zehua Liu, Zenan Li et al.
CMMLoc: Advancing Text-to-PointCloud Localization with Cauchy-Mixture-Model Based Framework
Yanlong Xu, Haoxuan Qu, Jun Liu et al.
Fine-Grained Image-Text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation
Jiho Choi, Seonho Lee, Minhyun Lee et al.
Go With the Flow: Fast Diffusion for Gaussian Mixture Models
George Rapakoulias, Ali Reza Pedram, Fengjiao Liu et al.
Search and Detect: Training-Free Long Tail Object Detection via Web-Image Retrieval
Mankeerat Sidhu, Hetarth Chopra, Ansel Blume et al.
Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation
Xingguang Zhang, Nicholas M Chimitt, Xijun Wang et al.
PiKE: Adaptive Data Mixing for Large-Scale Multi-Task Learning Under Low Gradient Conflicts
Zeman Li, Yuan Deng, Peilin Zhong et al.
VQToken: Neural Discrete Token Representation Learning for Extreme Token Reduction in Video Large Language Models
Haichao Zhang, Yun Fu
Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers
Quentin Guimard, Moreno D'Incà, Massimiliano Mancini et al.
Rethinking Decoder Design: Improving Biomarker Segmentation Using Depth-to-Space Restoration and Residual Linear Attention
Saad Wazir, Daeyoung Kim
Towards Efficient Foundation Model for Zero-shot Amodal Segmentation
Zhaochen Liu, Limeng Qiao, Xiangxiang Chu et al.
Diffusion-based Realistic Listening Head Generation via Hybrid Motion Modeling
Yinuo Wang, Yanbo Fan, Xuan Wang et al.
E-BATS: Efficient Backpropagation-Free Test-Time Adaptation for Speech Foundation Models
Jiaheng Dong, Hong Jia, Soumyajit Chatterjee et al.
GGTalker: Talking Head Synthesis with Generalizable Gaussian Priors and Identity-Specific Adaptation
Wentao Hu, Shunkai Li, Ziqiao Peng et al.
From Panels to Prose: Generating Literary Narratives from Comics
Ragav Sachdeva, Andrew Zisserman
SMTPD: A New Benchmark for Temporal Prediction of Social Media Popularity
Yijie Xu, Bolun Zheng, Wei Zhu et al.
Spherical Manifold Guided Diffusion Model for Panoramic Image Generation
Xiancheng Sun, Mai Xu, Shengxi Li et al.
Zero-Shot Blind-spot Image Denoising via Implicit Neural Sampling
Yuhui Quan, Tianxiang Zheng, Zhiyuan Ma et al.
Accurate and Efficient Low-Rank Model Merging in Core Space
Aniello Panariello, Daniel Marczak, Simone Magistri et al.
Learning (Approximately) Equivariant Networks via Constrained Optimization
Andrei Manolache, Luiz Chamon, Mathias Niepert
Acquire and then Adapt: Squeezing out Text-to-Image Model for Image Restoration
Junyuan Deng, Xinyi Wu, Yongxing Yang et al.
Positive2Negative: Breaking the Information-Lossy Barrier in Self-Supervised Single Image Denoising
Tong Li, Lizhi Wang, Zhiyuan Xu et al.
CoSpace: Benchmarking Continuous Space Perception Ability for Vision-Language Models
Yiqi Zhu, Ziyue Wang, Can Zhang et al.
RivuletMLP: An MLP-based Architecture for Efficient Compressed Video Quality Enhancement
Gang He, Weiran Wang, Guancheng Quan et al.
Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks
Debargha Ganguly, Vikash Singh, Sreehari Sankar et al.
KLASS: KL-Guided Fast Inference in Masked Diffusion Models
Seo Hyun Kim, Sunwoo Hong, Hojung Jung et al.
DOF-GS: Adjustable Depth-of-Field 3D Gaussian Splatting for Post-Capture Refocusing, Defocus Rendering and Blur Removal
Yujie Wang, Praneeth Chakravarthula, Baoquan Chen
3D-AVS: LiDAR-based 3D Auto-Vocabulary Segmentation
Weijie Wei, Osman Ülger, Fatemeh Karimi Nejadasl et al.
Plug-and-Play Context Feature Reuse for Efficient Masked Generation
Xuejie Liu, Anji Liu, Guy Van den Broeck et al.
PVChat: Personalized Video Chat with One-Shot Learning
Yufei Shi, Weilong Yan, Gang Xu et al.
Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning
Maosen Zhao, Pengtao Chen, Chong Yu et al.
Cheb-GR: Rethinking K-nearest Neighbor Search in Re-ranking for Person Re-identification
Jinxi Yang, He Li, Bo Du et al.
Empowering Large Language Models with 3D Situation Awareness
Zhihao Yuan, Yibo Peng, Jinke Ren et al.
Predictability Enables Parallelization of Nonlinear State Space Models
Xavier Gonzalez, Leo Kozachkov, David Zoltowski et al.
ConformalSAM: Unlocking the Potential of Foundational Segmentation Models in Semi-Supervised Semantic Segmentation with Conformal Prediction
Danhui Chen, Ziquan Liu, Chuxi Yang et al.
Learnable Infinite Taylor Gaussian for Dynamic View Rendering
Bingbing Hu, Yanyan Li, Rui Xie et al.
NSD-Imagery: A Benchmark Dataset for Extending fMRI Vision Decoding Methods to Mental Imagery
Reese Kneeland, Paul Scotti, Ghislain St-Yves et al.
Better Training Data Attribution via Better Inverse Hessian-Vector Products
Andrew Wang, Elisa Nguyen, Runshi Yang et al.
DocSAM: Unified Document Image Segmentation via Query Decomposition and Heterogeneous Mixed Learning
Xiao-Hui Li, Fei Yin, Cheng-Lin Liu
GRAPHGPT-O: Synergistic Multimodal Comprehension and Generation on Graphs
Yi Fang, Bowen Jin, Jiacheng Shen et al.
Making Old Film Great Again: Degradation-aware State Space Model for Old Film Restoration
Yudong Mao, Hao Luo, Zhiwei Zhong et al.
HoliGS: Holistic Gaussian Splatting for Embodied View Synthesis
Xiaoyuan Wang, Yizhou Zhao, Botao Ye et al.
Contrastive Representations for Temporal Reasoning
Alicja Ziarko, Michał Bortkiewicz, Michał Zawalski et al.
ControlFace: Harnessing Facial Parametric Control for Face Rigging
Wooseok Jang, Youngjun Hong, Geonho Cha et al.
InvFusion: Bridging Supervised and Zero-shot Diffusion for Inverse Problems
Noam Elata, Hyungjin Chung, Jong Chul Ye et al.
Universal Sequence Preconditioning
Annie Marsden, Elad Hazan
Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos
Sagnik Majumder, Tushar Nagarajan, Ziad Al-Halah et al.
Emergence of Linear Truth Encodings in Language Models
Shauli Ravfogel, Gilad Yehudai, Tal Linzen et al.
TopNet: Transformer-Efficient Occupancy Prediction Network for Octree-Structured Point Cloud Geometry Compression
Xinjie Wang, Yifan Zhang, Ting Liu et al.
A Visual Leap in CLIP Compositionality Reasoning through Generation of Counterfactual Sets
Zexi Jia, Chuanwei Huang, Yeshuang Zhu et al.
Fourier Analysis Network
Yihong Dong, Ge Li, Yongding Tao et al.
SATURN: SAT-based Reinforcement Learning to Unleash LLMs Reasoning
Huanyu Liu, Jia Li, Hao Zhu et al.
FilmComposer: LLM-Driven Music Production for Silent Film Clips
Zhifeng Xie, Qile He, Youjia Zhu et al.
MuTri: Multi-view Tri-alignment for OCT to OCTA 3D Image Translation
Zhuangzhuang Chen, Hualiang Wang, Chubin Ou et al.
Diffusion Classifiers Understand Compositionality, but Conditions Apply
Yujin Jeong, Arnas Uselis, Seong Joon Oh et al.
Pixel-aligned RGB-NIR Stereo Imaging and Dataset for Robot Vision
Jinneyong Kim, Seung-Hwan Baek
Adaptive Dropout: Unleashing Dropout across Layers for Generalizable Image Super-Resolution
Hang Xu, Jie Huang, Wei Yu et al.
DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation
Kefei Zhu, Fengshuo Bai, YuanHao Xiang et al.
AeSPa: Attention-guided Self-supervised Parallel Imaging for MRI Reconstruction
Jinho Joo, Hyeseong Kim, Hyeyeon Won et al.
Position: AI Should Sense Better, Not Just Scale Bigger: Adaptive Sensing as a Paradigm Shift
Eunsu Baek, Keondo Park, Jeonggil Ko et al.
Learning to price with resource constraints: from full information to machine-learned prices
Ruicheng Ao, Jiashuo Jiang, David Simchi-Levi
Avoiding exp(R) scaling in RLHF through Preference-based Exploration
Mingyu Chen, Yiding Chen, Wen Sun et al.
Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion
Zhenglin Zhou, Fan Ma, Hehe Fan et al.
Enhanced Visual-Semantic Interaction with Tailored Prompts for Pedestrian Attribute Recognition
Junyi Wu, Yan Huang, Min Gao et al.
ShapeEmbed: a self-supervised learning framework for 2D contour quantification
Anna Foix-Romero, Craig Russell, Alexander Krull et al.
I2VGuard: Safeguarding Images against Misuse in Diffusion-based Image-to-Video Models
Dongnan Gui, Xun Guo, Wengang Zhou et al.
SceneCrafter: Controllable Multi-View Driving Scene Editing
Zehao Zhu, Yuliang Zou, Chiyu “Max” Jiang et al.
Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding
Han Xiao, Yina Xie, Guanxin Tan et al.
Underwater Visual SLAM with Depth Uncertainty and Medium Modeling
Rui Liu, Sheng Fan, Wenguan Wang et al.
Differentiable Inverse Rendering with Interpretable Basis BRDFs
Hoon-Gyu Chung, Seokjun Choi, Seung-Hwan Baek
See Further When Clear: Curriculum Consistency Model
Yunpeng Liu, Boxiao Liu, Yi Zhang et al.
WeakMCN: Multi-task Collaborative Network for Weakly Supervised Referring Expression Comprehension and Segmentation
Silin Cheng, Yang Liu, Xinwei He et al.
VAFlow: Video-to-Audio Generation with Cross-Modality Flow Matching
Xihua Wang, Xin Cheng, Yuyue Wang et al.
DoF-Gaussian: Controllable Depth-of-Field for 3D Gaussian Splatting
Liao Shen, Tianqi Liu, Huiqiang Sun et al.
GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations
Fabian Paischer, Gianluca Galletti, William Hornsby et al.
UrbanCAD: Towards Highly Controllable and Photorealistic 3D Vehicles for Urban Scene Simulation
Yichong Lu, Yichi Cai, Shangzhan Zhang et al.
Supervising Sound Localization by In-the-wild Egomotion
Anna Min, Ziyang Chen, Hang Zhao et al.
Unifying Re-Identification, Attribute Inference, and Data Reconstruction Risks in Differential Privacy
Bogdan Kulynych, Juan Gomez, Georgios Kaissis et al.
One-shot 3D Object Canonicalization based on Geometric and Semantic Consistency
Li Jin, Yujie Wang, Wenzheng Chen et al.
SLVR: Super-Light Visual Reconstruction via Blueprint Controllable Convolutions and Exploring Feature Diversity Representation
Ning Ni, Libao Zhang
Vision-Language Embodiment for Monocular Depth Estimation
Jinchang Zhang, Guoyu Lu
Towards Consistent Multi-Task Learning: Unlocking the Potential of Task-Specific Parameters
Xiaohan Qin, Xiaoxing Wang, Junchi Yan
A Tale of Two Classes: Adapting Supervised Contrastive Learning to Binary Imbalanced Datasets
David Mildenberger, Paul Hager, Daniel Rueckert et al.
A Hubness Perspective on Representation Learning for Graph-Based Multi-View Clustering
Zheming Xu, He Liu, Congyan Lang et al.
Recognition-Synergistic Scene Text Editing
Zhengyao Fang, Pengyuan Lyu, Jingjing Wu et al.
MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation
Fu Rong, Meng Lan, Qian Zhang et al.
RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression
Uri Gadot, Shie Mannor, Assaf Shocher et al.
TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition
Yilong Wang, Zilin Gao, Qilong Wang et al.
Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control
Zijie Xu, Tong Bu, Zecheng Hao et al.
More of the Same: Persistent Representational Harms Under Increased Representation
Jennifer Mickel, Maria De-Arteaga, Liu Leqi et al.
Brain-Like Processing Pathways Form in Models With Heterogeneous Experts
Jack Cook, Danyal Akarca, Rui Costa et al.
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
Jiazhi Guan, Kaisiyuan Wang, Zhiliang Xu et al.
Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering
Wenlong Fang, Qiaofeng Wu, Jing Chen et al.
Fair Generation without Unfair Distortions: Debiasing Text-to-Image Generation with Entanglement-Free Attention
Jeonghoon Park, Juyoung Lee, Chaeyeon Chung et al.
BACON: Improving Clarity of Image Captions via Bag-of-Concept Graphs
Zhantao Yang, Ruili Feng, Keyu Yan et al.
MPMAvatar: Learning 3D Gaussian Avatars with Accurate and Robust Physics-Based Dynamics
Changmin Lee, Jihyun Lee, Tae-Kyun Kim
Tracing Back the Malicious Clients in Poisoning Attacks to Federated Learning
Yuqi Jia, Minghong Fang, Hongbin Liu et al.
Identity-Clothing Similarity Modeling for Unsupervised Clothing Change Person Re-Identification
Zhiqi Pang, Junjie Wang, Lingling Zhao et al.
Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities
Mayank Jobanputra, Yana Veitsman, Yash Sarrof et al.
Improving Generalization of Neural Combinatorial Optimization for Vehicle Routing Problems via Test-Time Projection Learning
Yuanyao Chen, Rongsheng Chen, Fu Luo et al.
STRAP: Spatio-Temporal Pattern Retrieval for Out-of-Distribution Generalization
Haoyu Zhang, Wentao Zhang, Hao Miao et al.
Sensitivity-Aware Efficient Fine-Tuning via Compact Dynamic-Rank Adaptation
Tianran Chen, Jiarui Chen, Baoquan Zhang et al.
Six-CD: Benchmarking Concept Removals for Text-to-image Diffusion Models
Jie Ren, Kangrui Chen, Yingqian Cui et al.
Less Attention is More: Prompt Transformer for Generalized Category Discovery
Wei Zhang, Baopeng Zhang, Zhu Teng et al.
Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models
Davide Berasi, Matteo Farina, Massimiliano Mancini et al.
Deep RL Needs Deep Behavior Analysis: Exploring Implicit Planning by Model-Free Agents in Open-Ended Environments
Riley Simmons-Edler, Ryan Badman, Felix Berg et al.
Codifying Character Logic in Role-Playing
Letian Peng, Jingbo Shang
Angular Steering: Behavior Control via Rotation in Activation Space
Minh Hieu Vu, Tan Nguyen
BrepGiff: Lightweight Generation of Complex B-rep with 3D GAT Diffusion
Hao Guo, Xiaoshui Huang, Hao Jiacheng et al.
Tightening Robustness Verification of MaxPool-based Neural Networks via Minimizing the Over-Approximation Zone
Yuan Xiao, Yuchen Chen, Shiqing Ma et al.
DriveScape: High-Resolution Driving Video Generation by Multi-View Feature Fusion
Wei Wu, Xi Guo, Weixuan Tang et al.
SD-VLM: Spatial Measuring and Understanding with Depth-Encoded Vision-Language Models
Pingyi Chen, Yujing Lou, Shen Cao et al.
Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation
Yuan Gan, Jiaxu Miao, Yunze Wang et al.
Leveraging SD Map to Augment HD Map-based Trajectory Prediction
Zhiwei Dong, Ran Ding, Wei Li et al.
VODiff: Controlling Object Visibility Order in Text-to-Image Generation
Dong Liang, Jinyuan Jia, Yuhao Liu et al.
Efficient Federated Learning against Byzantine Attacks and Data Heterogeneity via Aggregating Normalized Gradients
Shiyuan Zuo, Xingrun Yan, Rongfei Fan et al.
Exploiting Diffusion Prior for Task-driven Image Restoration
Jaeha Kim, Junghun Oh, Kyoung Mu Lee
Optimizing for the Shortest Path in Denoising Diffusion Model
Ping Chen, Xingpeng Zhang, Zhaoxiang Liu et al.
ICP: Immediate Compensation Pruning for Mid-to-high Sparsity
Xin Luo, Fu Xueming, Zihang Jiang et al.
Wavelet and Prototype Augmented Query-based Transformer for Pixel-level Surface Defect Detection
Feng Yan, Xiaoheng Jiang, Yang Lu et al.
REOBench: Benchmarking Robustness of Earth Observation Foundation Models
Xiang Li, Yong Tao, Siyuan Zhang et al.
Enhancing Few-Shot Vision-Language Classification with Large Multimodal Model Features
Chancharik Mitra, Brandon Huang, Tianning Chai et al.
CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems
Rui Liu, Yu Shen, Peng Gao et al.
Information-Theoretic Reward Decomposition for Generalizable RLHF
Liyuan Mao, Haoran Xu, Amy Zhang et al.
GPAvatar: High-fidelity Head Avatars by Learning Efficient Gaussian Projections
Weiqi Feng, Dong Han, Zekang Zhou et al.
Brain network science modelling of sparse neural networks enables Transformers and LLMs to perform as fully connected
Yingtao Zhang, Diego Cerretti, Jialin Zhao et al.
PlugMark: A Plug-in Zero-Watermarking Framework for Diffusion Models
Pengzhen Chen, Yanwei Liu, Xiaoyan Gu et al.
PI-HMR: Towards Robust In-bed Temporal Human Shape Reconstruction with Contact Pressure Sensing
Ziyu Wu, Yufan Xiong, Mengting Niu et al.
Learning to Instruct for Visual Instruction Tuning
Zhihan Zhou, Feng Hong, Jiaan Luo et al.
GeneMAN: Generalizable Single-Image 3D Human Reconstruction from Multi-Source Human Data
Wentao Wang, Hang Ye, Fangzhou Hong et al.
Flow Equivariant Recurrent Neural Networks
Andy Keller
Colors See Colors Ignore: Clothes Changing ReID with Color Disentanglement
Priyank Pathak, Yogesh Rawat
A Implies B: Circuit Analysis in LLMs for Propositional Logical Reasoning
Guan Zhe Hong, Nishanth Dikkala, Enming Luo et al.
Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy
Xiaoxiao Ma, Feng Zhao, Pengyang Ling et al.
Topology-Aware Conformal Prediction for Stream Networks
Jifan Zhang, Fangxin Wang, Zihe Song et al.
Optimal Neural Compressors for the Rate-Distortion-Perception Tradeoff
Eric Lei, Hamed Hassani, Shirin Saeedi Bidokhti
Shallow Diffuse: Robust and Invisible Watermarking through Low-Dim Subspaces in Diffusion Models
Wenda Li, Huijie Zhang, Qing Qu
CAP: Evaluation of Persuasive and Creative Image Generation
Aysan Aghazadeh, Adriana Kovashka
A Closer Look at Model Collapse: From a Generalization-to-Memorization Perspective
Lianghe Shi, Meng Wu, Huijie Zhang et al.
PolyGuard: Massive Multi-Domain Safety Policy-Grounded Guardrail Dataset
Mintong Kang, Zhaorun Chen, Chejian Xu et al.
OmniTalker: One-shot Real-time Text-Driven Talking Audio-Video Generation With Multimodal Style Mimicking
Zhongjian Wang, Peng Zhang, Jinwei Qi et al.
Local-Global Associative Frames for Symmetry-Preserving Crystal Structure Modeling
Haowei Hua, Wanyu Lin
FedCALM: Conflict-aware Layer-wise Mitigation for Selective Aggregation in Deeper Personalized Federated Learning
Hao Zheng, Zhigang Hu, Boyu Wang et al.
LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions
Faridoun Mehri, Mahdieh Baghshah, Mohammad Taher Pilehvar
Register and [CLS] tokens induce a decoupling of local and global features in large ViTs
Alexander Lappe, Martin Giese
What Makes a Good Dataset for Knowledge Distillation?
Logan Frank, Jim Davis
HollowFlow: Efficient Sample Likelihood Evaluation using Hollow Message Passing
Johann Flemming Gloy, Simon Olsson
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It
Yulu Qin, Dheeraj Varghese, Adam Dahlgren Lindström et al.
RePO: Understanding Preference Learning Through ReLU-Based Optimization
Junkang Wu, Kexin Huang, Xue Wang et al.
CoP: Agentic Red-teaming for Large Language Models using Composition of Principles
Chen Xiong, Pin-Yu Chen, Tsung-Yi Ho
O-MaMa: Learning Object Mask Matching between Egocentric and Exocentric Views
Lorenzo Mur-Labadia, Maria Santos-Villafranca, Jesus Bermudez-Cameo et al.
BlockScan: Detecting Anomalies in Blockchain Transactions
Jiahao Yu, Xian Wu, Hao Liu et al.
Annotation Ambiguity Aware Semi-Supervised Medical Image Segmentation
Suruchi Kumari, Pravendra Singh
BWFormer: Building Wireframe Reconstruction from Airborne LiDAR Point Cloud with Transformer
Yuzhou Liu, Lingjie Zhu, Hanqiao Ye et al.
Are Spatial-Temporal Graph Convolution Networks for Human Action Recognition Over-Parameterized?
Jianyang Xie, Yitian Zhao, Yanda Meng et al.
Masking meets Supervision: A Strong Learning Alliance
Byeongho Heo, Taekyung Kim, Sangdoo Yun et al.
Optimal Control for Transformer Architectures: Enhancing Generalization, Robustness and Efficiency
Kelvin Kan, Xingjian Li, Benjamin Zhang et al.
Reproducible Vision-Language Models Meet Concepts Out of Pre-Training
Ziliang Chen, Xin Huang, Xiaoxuan Fan et al.
LayerD: Decomposing Raster Graphic Designs into Layers
Tomoyuki Suzuki, Kang-Jun Liu, Naoto Inoue et al.
Distribution-Aligned Decoding for Efficient LLM Task Adaptation
Senkang Hu, Xudong Han, Jinqi Jiang et al.
ODE: Open-Set Evaluation of Hallucinations in Multimodal Large Language Models
Yahan Tu, Rui Hu, Jitao Sang
OW-OVD: Unified Open World and Open Vocabulary Object Detection
Xing Xi, Yangyang Huang, Ronghua Luo et al.
Effortless, Simulation-Efficient Bayesian Inference using Tabular Foundation Models
Julius Vetter, Manuel Gloeckler, Daniel Gedon et al.
Shading Meets Motion: Self-supervised Indoor 3D Reconstruction Via Simultaneous Shape-from-Shading and Structure-from-Motion
Guoyu Lu
Towards Understanding How Knowledge Evolves in Large Vision-Language Models
Sudong Wang, Yunjian Zhang, Yao Zhu et al.
PlayerOne: Egocentric World Simulator
Yuanpeng Tu, Hao Luo, Xi Chen et al.
PoLAR: Polar-Decomposed Low-Rank Adapter Representation
Kai Lion, Liang Zhang, Bingcong Li et al.
Anatomical Consistency and Adaptive Prior-informed Transformation for Multi-contrast MR Image Synthesis via Diffusion Model
Yejee Shin, Yeeun Lee, Hanbyol Jang et al.