Most Cited CVPR "interpretable neural networks" Papers
Balanced Direction from Multifarious Choices: Arithmetic Meta-Learning for Domain Generalization
Xiran Wang, Jian Zhang, Lei Qi et al.
ATA: Adaptive Transformation Agent for Text-Guided Subject-Position Variable Background Inpainting
Yizhe Tang, Zhimin Sun, Yuzhen Du et al.
SGFormer: Satellite-Ground Fusion for 3D Semantic Scene Completion
Xiyue Guo, Jiarui Hu, Junjie Hu et al.
SUM Parts: Benchmarking Part-Level Semantic Segmentation of Urban Meshes
Weixiao Gao, Liangliang Nan, Hugo Ledoux
A Flag Decomposition for Hierarchical Datasets
Nathan Mankovich, Ignacio Santamaria, Gustau Camps-Valls et al.
Finding Local Diffusion Schrödinger Bridge using Kolmogorov-Arnold Network
Xingyu Qiu, Mengying Yang, Xinghua Ma et al.
On the Out-Of-Distribution Generalization of Large Multimodal Models
Xingxuan Zhang, Jiansheng Li, Wenjing Chu et al.
Decoupling Training-Free Guided Diffusion by ADMM
Youyuan Zhang, Zehua Liu, Zenan Li et al.
Towards Unbiased and Robust Spatio-Temporal Scene Graph Generation and Anticipation
Rohith Peddi, Saurabh ., Ayush Abhay Shrivastava et al.
Fine-Grained Image-Text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation
Jiho Choi, Seonho Lee, Minhyun Lee et al.
GO-N3RDet: Geometry Optimized NeRF-enhanced 3D Object Detector
Zechuan Li, Hongshan Yu, Yihao Ding et al.
CMMLoc: Advancing Text-to-PointCloud Localization with Cauchy-Mixture-Model Based Framework
Yanlong Xu, Haoxuan Qu, Jun Liu et al.
Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation
Xingguang Zhang, Nicholas M Chimitt, Xijun Wang et al.
UniPTS: A Unified Framework for Proficient Post-Training Sparsity
JingJing Xie, Yuxin Zhang, Mingbao Lin et al.
Is 'Right' Right? Enhancing Object Orientation Understanding in Multimodal Large Language Models through Egocentric Instruction Tuning
JiHyeok Jung, EunTae Kim, SeoYeon Kim et al.
TailedCore: Few-Shot Sampling for Unsupervised Long-Tail Noisy Anomaly Detection
Yoon Gyo Jung, Jaewoo Park, Jaeho Yoon et al.
Towards Efficient Foundation Model for Zero-shot Amodal Segmentation
Zhaochen Liu, Limeng Qiao, Xiangxiang Chu et al.
UrbanCAD: Towards Highly Controllable and Photorealistic 3D Vehicles for Urban Scene Simulation
Yichong Lu, Yichi Cai, Shangzhan Zhang et al.
Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers
Quentin Guimard, Moreno D'Incà, Massimiliano Mancini et al.
Charm: The Missing Piece in ViT Fine-Tuning for Image Aesthetic Assessment
Fatemeh Behrad, Tinne Tuytelaars, Johan Wagemans
Diffusion-based Realistic Listening Head Generation via Hybrid Motion Modeling
Yinuo Wang, Yanbo Fan, Xuan Wang et al.
ArcPro: Architectural Programs for Structured 3D Abstraction of Sparse Points
Qirui Huang, Runze Zhang, Kangjun Liu et al.
RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression
Uri Gadot, Shie Mannor, Assaf Shocher et al.
HumanMM: Global Human Motion Recovery from Multi-shot Videos
Yuhong Zhang, Guanlin Wu, Ling-Hao Chen et al.
TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition
Yilong Wang, Zilin Gao, Qilong Wang et al.
ControlFace: Harnessing Facial Parametric Control for Face Rigging
Wooseok Jang, Youngjun Hong, Geonho Cha et al.
Optimizing for the Shortest Path in Denoising Diffusion Model
Ping Chen, Xingpeng Zhang, Zhaoxiang Liu et al.
PGC: Physics-Based Gaussian Cloth from a Single Pose
Michelle Guo, Matt Jen-Yuan Chiang, Igor Santesteban et al.
Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models
Davide Berasi, Matteo Farina, Massimiliano Mancini et al.
Scale Efficient Training for Large Datasets
Qing Zhou, Junyu Gao, Qi Wang
GPAvatar: High-fidelity Head Avatars by Learning Efficient Gaussian Projections
Weiqi Feng, Dong Han, Zekang Zhou et al.
Empowering Large Language Models with 3D Situation Awareness
Zhihao Yuan, Yibo Peng, Jinke Ren et al.
PS-Diffusion: Photorealistic Subject-Driven Image Editing with Disentangled Control and Attention
Weicheng Wang, Guoli Jia, Zhongqi Zhang et al.
Learnable Infinite Taylor Gaussian for Dynamic View Rendering
Bingbing Hu, Yanyan Li, Rui Xie et al.
PolarMatte: Fully Computational Ground-Truth-Quality Alpha Matte Extraction for Images and Video using Polarized Screen Matting
Kenji Enomoto, TJ Rhodes, Brian Price et al.
All Rivers Run to the Sea: Private Learning with Asymmetric Flows
Yue Niu, Ramy E. Ali, Saurav Prakash et al.
JDEC: JPEG Decoding via Enhanced Continuous Cosine Coefficients
Woo Kyoung Han, Sunghoon Im, Jaedeok Kim et al.
Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization
Junying Wang, Jingyuan Liu, Xin Sun et al.
One-shot 3D Object Canonicalization based on Geometric and Semantic Consistency
Li Jin, Yujie Wang, Wenzheng Chen et al.
Making Old Film Great Again: Degradation-aware State Space Model for Old Film Restoration
Yudong Mao, Hao Luo, Zhiwei Zhong et al.
Reproducible Vision-Language Models Meet Concepts Out of Pre-Training
Ziliang Chen, Xin Huang, Xiaoxuan Fan et al.
FilmComposer: LLM-Driven Music Production for Silent Film Clips
Zhifeng Xie, Qile He, Youjia Zhu et al.
MuTri: Multi-view Tri-alignment for OCT to OCTA 3D Image Translation
Zhuangzhuang Chen, Hualiang Wang, Chubin Ou et al.
Reconstruction-free Cascaded Adaptive Compressive Sensing
Chenxi Qiu, Tao Yue, Xuemei Hu
Tightening Robustness Verification of MaxPool-based Neural Networks via Minimizing the Over-Approximation Zone
Yuan Xiao, Yuchen Chen, Shiqing Ma et al.
Self-Supervised Dual Contouring
Ramana Sundararaman, Roman Klokov, Maks Ovsjanikov
USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting
Kang Chen, Jiyuan Zhang, Zecheng Hao et al.
SceneCrafter: Controllable Multi-View Driving Scene Editing
Zehao Zhu, Yuliang Zou, Chiyu “Max” Jiang et al.
SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity
Ke Ma, Jiaqi Tang, Bin Guo et al.
ODE: Open-Set Evaluation of Hallucinations in Multimodal Large Language Models
Yahan Tu, Rui Hu, Jitao Sang
ATP: Adaptive Threshold Pruning for Efficient Data Encoding in Quantum Neural Networks
Mohamed Afane, Gabrielle Ebbrecht, Ying Wang et al.
TimeTracker: Event-based Continuous Point Tracking for Video Frame Interpolation with Non-linear Motion
Haoyue Liu, Jinghan Xu, Yi Chang et al.
A Hubness Perspective on Representation Learning for Graph-Based Multi-View Clustering
Zheming Xu, He Liu, Congyan Lang et al.
Uncertainty Visualization via Low-Dimensional Posterior Projections
Omer Yair, Tomer Michaeli, Elias Nehme
A Tale of Two Classes: Adapting Supervised Contrastive Learning to Binary Imbalanced Datasets
David Mildenberger, Paul Hager, Daniel Rueckert et al.
Believing is Seeing: Unobserved Object Detection using Generative Models
Subhransu S. Bhattacharjee, Dylan Campbell, Rahul Shome
Vision-Language Embodiment for Monocular Depth Estimation
Jinchang Zhang, Guoyu Lu
DoF-Gaussian: Controllable Depth-of-Field for 3D Gaussian Splatting
Liao Shen, Tianqi Liu, Huiqiang Sun et al.
WeakMCN: Multi-task Collaborative Network for Weakly Supervised Referring Expression Comprehension and Segmentation
Silin Cheng, Yang Liu, Xinwei He et al.
Leak and Learn: An Attacker's Cookbook to Train Using Leaked Data from Federated Learning
Joshua C. Zhao, Ahaan Dabholkar, Atul Sharma et al.
I2VGuard: Safeguarding Images against Misuse in Diffusion-based Image-to-Video Models
Dongnan Gui, Xun Guo, Wengang Zhou et al.
Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding
Han Xiao, Yina Xie, Guanxin Tan et al.
Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness
Beier Zhu, Jiequan Cui, Hanwang Zhang et al.
Rethinking Lanes and Points in Complex Scenarios for Monocular 3D Lane Detection
Yifan Chang, Junjie Huang, Xiaofeng Wang et al.
Improving Sound Source Localization with Joint Slot Attention on Image and Audio
Inho Kim, Youngkil Song, Jicheol Park et al.
EntityErasure: Erasing Entity Cleanly via Amodal Entity Segmentation and Completion
Yixing Zhu, Qing Zhang, Yitong Wang et al.
Decision SpikeFormer: Spike-Driven Transformer for Decision Making
Wei Huang, Qinying Gu, Nanyang Ye
Infighting in the Dark: Multi-Label Backdoor Attack in Federated Learning
Ye Li, Yanchao Zhao, Chengcheng Zhu et al.
ToNNO: Tomographic Reconstruction of a Neural Network’s Output for Weakly Supervised Segmentation of 3D Medical Images
Marius Schmidt-Mengin, Alexis Benichoux, Shibeshih Belachew et al.
DocSAM: Unified Document Image Segmentation via Query Decomposition and Heterogeneous Mixed Learning
Xiao-Hui Li, Fei Yin, Cheng-Lin Liu
BOE-ViT: Boosting Orientation Estimation with Equivariance in Self-Supervised 3D Subtomogram Alignment
Runmin Jiang, Jackson Daggett, Shriya Pingulkar et al.
HORP: Human-Object Relation Priors Guided HOI Detection
Pei Geng, Jian Yang, Shanshan Zhang
RaSS: Improving Denoising Diffusion Samplers with Reinforced Active Sampling Scheduler
Xin Ding, Lei Yu, Xin Li et al.
Perceptual Video Compression with Neural Wrapping
Muhammad Umar Karim Khan, Aaron Chadha, Mohammad Ashraful Anam et al.
Devil is in the Detail: Towards Injecting Fine Details of Image Prompt in Image Generation via Conflict-free Guidance and Stratified Attention
Kyungmin Jo, Jooyeol Yun, Jaegul Choo
Do ImageNet-trained Models Learn Shortcuts? The Impact of Frequency Shortcuts on Generalization
Shunxin Wang, Raymond Veldhuis, Nicola Strisciuglio
Pre-training Vision Models with Mandelbulb Variations
Benjamin N. Chiche, Yuto Horikawa, Ryo Fujita
VISTREAM: Improving Computation Efficiency of Visual Streaming Perception via Law-of-Charge-Conservation Inspired Spiking Neural Network
Kang You, Ziling Wei, Jing Yan et al.
One-Step Event-Driven High-Speed Autofocus
Yuhan Bao, Shaohua Gao, Wenyong Li et al.
DTOS: Dynamic Time Object Sensing with Large Multimodal Model
Jirui Tian, Jinrong Zhang, Shenglan Liu et al.
TexGarment: Consistent Garment UV Texture Generation via Efficient 3D Structure-Guided Diffusion Transformer
Jialun Liu, Jinbo Wu, Xiaobo Gao et al.
Enhancing Dance-to-Music Generation via Negative Conditioning Latent Diffusion Model
Changchang Sun, Gaowen Liu, Charles Fleming et al.
Auto-Encoded Supervision for Perceptual Image Super-Resolution
MinKyu Lee, Sangeek Hyun, Woojin Jun et al.
Harnessing Frozen Unimodal Encoders for Flexible Multimodal Alignment
Mayug Maniparambil, Raiymbek Akshulakov, Yasser Abdelaziz Dahou Djilali et al.
Early-Bird Diffusion: Investigating and Leveraging Timestep-Aware Early-Bird Tickets in Diffusion Models for Efficient Training
Lexington Whalen, Zhenbang Du, Haoran You et al.
Detecting Adversarial Data Using Perturbation Forgery
Qian Wang, Chen Li, Yuchen Luo et al.
Three Cars Approaching within 100m! Enhancing Distant Geometry by Tri-Axis Voxel Scanning for Camera-based Semantic Scene Completion
Jongseong Bae, Junwoo Ha, Ha Young Kim
Incomplete Multi-modal Brain Tumor Segmentation via Learnable Sorting State Space Model
Zheyu Zhang, Yayuan Lu, Feipeng Ma et al.
JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data
Runjian Chen, Wenqi Shao, Bo Zhang et al.
DynaMoDe-NeRF: Motion-aware Deblurring Neural Radiance Field for Dynamic Scenes
Ashish Kumar, A. N. Rajagopalan
T-CIL: Temperature Scaling using Adversarial Perturbation for Calibration in Class-Incremental Learning
Seong-Hyeon Hwang, Minsu Kim, Steven Euijong Whang
CoE: Chain-of-Explanation via Automatic Visual Concept Circuit Description and Polysemanticity Quantification
Wenlong Yu, Qilong Wang, Chuang Liu et al.
A2XP: Towards Private Domain Generalization
Geunhyeok Yu, Hyoseok Hwang
Rethinking Correspondence-based Category-Level Object Pose Estimation
Huan Ren, Wenfei Yang, Shifeng Zhang et al.
Progressive Correspondence Regenerator for Robust 3D Registration
Guiyu Zhao, Sheng Ao, Ye Zhang et al.
DV-Matcher: Deformation-based Non-rigid Point Cloud Matching Guided by Pre-trained Visual Features
Zhangquan Chen, Puhua Jiang, Ruqi Huang
GenVDM: Generating Vector Displacement Maps From a Single Image
Yuezhi Yang, Qimin Chen, Vladimir G. Kim et al.
Maintaining Consistent Inter-Class Topology in Continual Test-Time Adaptation
Chenggong Ni, Fan Lyu, Jiayao Tan et al.
Fancy123: One Image to High-Quality 3D Mesh Generation via Plug-and-Play Deformation
Qiao Yu, Xianzhi Li, Yuan Tang et al.
VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide
Dohun Lee, Bryan Sangwoo Kim, Geon Yeong Park et al.
Multi-Modal Synergistic Implicit Image Enhancement for Efficient Optical Flow Estimation
Weichen Dai, Wu Hexing, Xiaoyang Weng et al.
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment
Edson Araujo, Andrew Rouditchenko, Yuan Gong et al.
SACB-Net: Spatial-awareness Convolutions for Medical Image Registration
Xinxing Cheng, Tianyang Zhang, Wenqi Lu et al.
Dense Match Summarization for Faster Two-view Estimation
Jonathan Astermark, Anders Heyden, Viktor Larsson
Do Your Best and Get Enough Rest for Continual Learning
Hankyul Kang, Gregor Seifer, Donghyun Lee et al.
Encapsulated Composition of Text-to-Image and Text-to-Video Models for High-Quality Video Synthesis
Tongtong Su, Chengyu Wang, Bingyan Liu et al.
High-Fidelity Lightweight Mesh Reconstruction from Point Clouds
Chen Zhang, Wentao Wang, Ximeng Li et al.
OmniSplat: Taming Feed-Forward 3D Gaussian Splatting for Omnidirectional Images with Editable Capabilities
Suyoung Lee, Jaeyoung Chung, Kihoon Kim et al.
UCOD-DPL: Unsupervised Camouflaged Object Detection via Dynamic Pseudo-label Learning
Weiqi Yan, Lvhai Chen, Huaijia Kou et al.
SEC-Prompt: SEmantic Complementary Prompting for Few-Shot Class-Incremental Learning
Ye Liu, Meng Yang
Solving Instance Detection from an Open-World Perspective
Qianqian Shen, Yunhan Zhao, Nahyun Kwon et al.
MICAS: Multi-grained In-Context Adaptive Sampling for 3D Point Cloud Processing
Feifei Shao, Ping Liu, Zhao Wang et al.
CASP: Compression of Large Multimodal Models Based on Attention Sparsity
Mohsen Gholami, Mohammad Akbari, Kevin Cannons et al.
SP3D: Boosting Sparsely-Supervised 3D Object Detection via Accurate Cross-Modal Semantic Prompts
Shijia Zhao, Qiming Xia, Xusheng Guo et al.
Certified Human Trajectory Prediction
Mohammadhossein Bahari, Saeed Saadatnejad, Amirhossein Askari Farsangi et al.
Track Any Anomalous Object: A Granular Video Anomaly Detection Pipeline
Yuzhi Huang, Chenxin Li, Haitao Zhang et al.
FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting
Hengyu Liu, Yuehao Wang, Chenxin Li et al.
The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition
Otto Brookes, Maksim Kukushkin, Majid Mirmehdi et al.
Recovering Dynamic 3D Sketches from Videos
Jaeah Lee, Changwoon Choi, Young Min Kim et al.
Autoregressive Distillation of Diffusion Transformers
Yeongmin Kim, Sotiris Anagnostidis, Yuming Du et al.
ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval
Eric Xing, Pranavi Kolouju, Robert Pless et al.
Style Quantization for Data-Efficient GAN Training
Jian Wang, Xin Lan, Ji-Zhe Zhou et al.
Point Cloud Upsampling Using Conditional Diffusion Module with Adaptive Noise Suppression
Boqian Zhang, Shen Yang, Hao Chen et al.
Self-Supervised Spatial Correspondence Across Modalities
Ayush Shrivastava, Andrew Owens
Balanced Rate-Distortion Optimization in Learned Image Compression
Yichi Zhang, Zhihao Duan, Yuning Huang et al.
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation
Qihui Zhang, Munan Ning, Zheyuan Liu et al.
Discontinuity-preserving Normal Integration with Auxiliary Edges
Hyomin Kim, Yucheol Jung, Seungyong Lee
Rethinking Token Reduction with Parameter-Efficient Fine-Tuning in ViT for Pixel-Level Tasks
Cheng Lei, Ao Li, Hu Yao et al.
Enhancing Adversarial Transferability with Checkpoints of a Single Model’s Training
Shixin Li, Chaoxiang He, Xiaojing Ma et al.
Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation
Yuan Gan, Jiaxu Miao, Yunze Wang et al.
Unsupervised Continual Domain Shift Learning with Multi-Prototype Modeling
Haopeng Sun, Yingwei Zhang, Lumin Xu et al.
See Further When Clear: Curriculum Consistency Model
Yunpeng Liu, Boxiao Liu, Yi Zhang et al.
Towards Understanding How Knowledge Evolves in Large Vision-Language Models
Sudong Wang, Yunjian Zhang, Yao Zhu et al.
GBC-Splat: Generalizable Gaussian-Based Clothed Human Digitalization under Sparse RGB Cameras
Hanzhang Tu, Zhanfeng Liao, Boyao Zhou et al.
EBS-EKF: Accurate and High Frequency Event-based Star Tracking
Albert Reed, Connor Hashemi, Dennis Melamed et al.
STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection
Divya Velayudhan, Abdelfatah Ahmed, Mohamad Alansari et al.
Where the Devil Hides: Deepfake Detectors Can No Longer Be Trusted
Shuaiwei Yuan, Junyu Dong, Yuezun Li
FSBench: A Figure Skating Benchmark for Advancing Artistic Sports Understanding
Rong Gao, Xin Liu, Zhuozhao Hu et al.
BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions
Wonyong Seo, Jihyong Oh, Munchurl Kim
Forensic Self-Descriptions Are All You Need for Zero-Shot Detection, Open-Set Source Attribution, and Clustering of AI-generated Images
Tai Nguyen, Aref Azizpour, Matthew Stamm
SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations
Krispin Wandel, Hesheng Wang
Transfer Your Perspective: Controllable 3D Generation from Any Viewpoint in a Driving Scene
Tai-Yu Daniel Pan, Sooyoung Jeon, Mengdi Fan et al.
Preconditioners for the Stochastic Training of Neural Fields
Shin-Fang Chng, Hemanth Saratchandran, Simon Lucey
Handling Spatial-Temporal Data Heterogeneity for Federated Continual Learning via Tail Anchor
Hao Yu, Xin Yang, Le Zhang et al.
PhyS-EdiT: Physics-aware Semantic Image Editing with Text Description
Ziqi Cai, Shuchen Weng, Yifei Xia et al.
Hand-held Object Reconstruction from RGB Video with Dynamic Interaction
Shijian Jiang, Qi Ye, Rengan Xie et al.
Evaluating Model Perception of Color Illusions in Photorealistic Scenes
Lingjun Mao, Zineng Tang, Alane Suhr
SynTab-LLaVA: Enhancing Multimodal Table Understanding with Decoupled Synthesis
Bangbang Zhou, Zuan Gao, Zixiao Wang et al.
Towards Source-Free Machine Unlearning
Sk Miraj Ahmed, Umit Basaran, Dripta S. Raychaudhuri et al.
Towards Million-Scale Adversarial Robustness Evaluation With Stronger Individual Attacks
Yong Xie, Weijie Zheng, Hanxun Huang et al.
SAM-REF: Introducing Image-Prompt Synergy during Interaction for Detail Enhancement in the Segment Anything Model
Chongkai Yu, Ting Liu, Li Anqi et al.
Sound Bridge: Associating Egocentric and Exocentric Videos via Audio Cues
Sihong Huang, Jiaxin Wu, Xiaoyong Wei et al.
DiffCAM: Data-Driven Saliency Maps by Capturing Feature Differences
Xingjian Li, Qiming Zhao, Neelesh Bisht et al.
Hierarchical Flow Diffusion for Efficient Frame Interpolation
Yang Hai, Guo Wang, Tan Su et al.
Observation-Guided Diffusion Probabilistic Models
Junoh Kang, Jinyoung Choi, Sungik Choi et al.
Towards Human-Understandable Multi-Dimensional Concept Discovery
Arne Grobrügge, Niklas Kühl, Gerhard Satzger et al.
PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation
Uyoung Jeong, Jonathan Freer, Seungryul Baek et al.
ESR-NeRF: Emissive Source Reconstruction Using LDR Multi-view Images
Jinseo Jeong, Junseo Koo, Qimeng Zhang et al.
Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data
Zhiyuan Ma, Xinyue Liang, Rongyuan Wu et al.
Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation
Seokil Ham, Hee-Seon Kim, Sangmin Woo et al.
Tripartite Weight-Space Ensemble for Few-Shot Class-Incremental Learning
Juntae Lee, Munawar Hayat, Sungrack Yun
Unleashing the Potential of Consistency Learning for Detecting and Grounding Multi-Modal Media Manipulation
Yiheng Li, Yang Yang, Zichang Tan et al.
Compass Control: Multi Object Orientation Control for Text-to-Image Generation
Rishubh Parihar, Vaibhav Agrawal, Sachidanand VS et al.
Watermarking One for All: A Robust Watermarking Scheme Against Partial Image Theft
Gaozhi Liu, Silu Cao, Zhenxing Qian et al.
Exploring Semantic Feature Discrimination for Perceptual Image Super-Resolution and Opinion-Unaware No-Reference Image Quality Assessment
Guanglu Dong, Xiangyu Liao, Mingyang Li et al.
Diffusion-based Event Generation for High-Quality Image Deblurring
Xinan Xie, Qing Zhang, Wei-Shi Zheng
LOCORE: Image Re-ranking with Long-Context Sequence Modeling
Zilin Xiao, Pavel Suma, Ayush Sachdeva et al.
SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs
Guibiao Liao, Qing Li, Zhenyu Bao et al.
Reasoning Mamba: Hypergraph-Guided Region Relation Calculating for Weakly Supervised Affordance Grounding
Yuxuan Wang, Aming Wu, Muli Yang et al.
Deep Fair Multi-View Clustering with Attention KAN
HaiMing Xu, Qianqian Wang, Boyue Wang et al.
Coherence As Texture – Passive Textureless 3D Reconstruction by Self-interference
Wei-Yu Chen, Aswin C. Sankaranarayanan, Anat Levin et al.
Distinguish Then Exploit: Source-free Open Set Domain Adaptation via Weight Barcode Estimation and Sparse Label Assignment
Weiming Liu, Jun Dan, Fan Wang et al.
Escaping Plato's Cave: Towards the Alignment of 3D and Text Latent Spaces
Souhail Hadgi, Luca Moschella, Andrea Santilli et al.
Boosting the Dual-Stream Architecture in Ultra-High Resolution Segmentation with Resolution-Biased Uncertainty Estimation
Rong Qin, Xingyu Liu, Jinglei Shi et al.
LC-Mamba: Local and Continuous Mamba with Shifted Windows for Frame Interpolation
Min Wu Jeong, Chae Eun Rhee
Learning from Synchronization: Self-Supervised Uncalibrated Multi-View Person Association in Challenging Scenes
Keqi Chen, Vinkle Srivastav, Didier Mutter et al.
Overcoming Shortcut Problem in VLM for Robust Out-of-Distribution Detection
Zhuo Xu, Xiang Xiang, Yifan Liang
Blurry-Edges: Photon-Limited Depth Estimation from Defocused Boundaries
Wei Xu, Charlie Wagner, Junjie Luo et al.
VITED: Video Temporal Evidence Distillation
Yujie Lu, Yale Song, Lorenzo Torresani et al.
LightLoc: Learning Outdoor LiDAR Localization at Light Speed
Wen Li, Chen Liu, Shangshu Yu et al.
KMD: Koopman Multi-modality Decomposition for Generalized Brain Tumor Segmentation under Incomplete Modalities
Tianyi Liu, Haochuan Jiang, Kaizhu Huang
3D-AVS: LiDAR-based 3D Auto-Vocabulary Segmentation
Weijie Wei, Osman Ülger, Fatemeh Karimi Nejadasl et al.
CustAny: Customizing Anything from A Single Example
Lingjie Kong, Kai Wu, Chengming Xu et al.
Test-Time Fine-Tuning of Image Compression Models for Multi-Task Adaptability
Unki Park, Seongmoon Jeong, Jang Youngchan et al.
Robust Multi-Object 4D Generation for In-the-wild Videos
Wen-Hsuan Chu, Lei Ke, Jianmeng Liu et al.
ViUniT: Visual Unit Tests for More Robust Visual Programming
Artemis Panagopoulou, Honglu Zhou, Silvio Savarese et al.
DH-Set: Improving Vision-Language Alignment with Diverse and Hybrid Set-Embeddings Learning
Kun Zhang, Jingyu Li, Zhe Li et al.
EAP-GS: Efficient Augmentation of Pointcloud for 3D Gaussian Splatting in Few-shot Scene Reconstruction
Dongrui Dai, Yuxiang Xing
Using Powerful Prior Knowledge of Diffusion Model in Deep Unfolding Networks for Image Compressive Sensing
Chen Liao, Yan Shen, Dan Li et al.
Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI
Won Jun Kim, Hyungjin Chung, Jaemin Kim et al.
FreeUV: Ground-Truth-Free Realistic Facial UV Texture Recovery via Cross-Assembly Inference Strategy
Xingchao Yang, Takafumi Taketomi, Yuki Endo et al.
Sample- and Parameter-Efficient Auto-Regressive Image Models
Elad Amrani, Leonid Karlinsky, Alex M. Bronstein
Color Alignment in Diffusion
Ka Chun Shum, Binh-Son Hua, Thanh Nguyen et al.
On the Generalization of Handwritten Text Recognition Models
Carlos Garrido-Munoz, Jorge Calvo-Zaragoza
FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images
Rong Wang, Fabian Prada, Ziyan Wang et al.
Galaxy Walker: Geometry-aware VLMs For Galaxy-scale Understanding
Tianyu Chen, Xingcheng Fu, Yisen Gao et al.
Adapting to the Unknown: Training-Free Audio-Visual Event Perception with Dynamic Thresholds
Eitan Shaar, Ariel Shaulov, Gal Chechik et al.
Finsler Multi-Dimensional Scaling: Manifold Learning for Asymmetric Dimensionality Reduction and Embedding
Thomas Dagès, Simon Weber, Ya-Wei Eileen Lin et al.
UA-Pose: Uncertainty-Aware 6D Object Pose Estimation and Online Object Completion with Partial References
Ming-Feng Li, Xin Yang, Fu-En Wang et al.
Mitigating Ambiguities in 3D Classification with Gaussian Splatting
Ruiqi Zhang, Hao Zhu, Jingyi Zhao et al.