Most Cited ECCV "multimodal recommendation" Papers
2,387 papers found • Page 12 of 12
Learning Representations from Foundation Models for Domain Generalized Stereo Matching
Yongjian Zhang, Longguang Wang, Kunhong Li et al.
Pick-a-back: Selective Device-to-Device Knowledge Transfer in Federated Continual Learning
JinYi Yoon, HyungJune Lee
Improving Unsupervised Domain Adaptation: A Pseudo-Candidate Set Approach
Aveen Dayal, Rishabh Lalla, Linga Reddy Cenkeramaddi et al.
Enhancing Optimization Robustness in 1-bit Neural Networks through Stochastic Sign Descent
NianHui Guo, Hong Guo, Christoph Meinel et al.
Holodepth: Programmable Depth-Varying Projection via Computer-Generated Holography
Dorian Chan, Matthew O'Toole, Sizhuo Ma et al.
Linearly Controllable GAN: Unsupervised Feature Categorization and Decomposition for Image Generation and Manipulation
Sehyung Lee, Mijung Kim, Yeongnam Chae et al.
High-Fidelity Modeling of Generalizable Wrinkle Deformation
Jingfan Guo, Jae Shin Yoon, Shunsuke Saito et al.
Two-Stage Video Shadow Detection via Temporal-Spatial Adaption
Xin Duan, Yu Cao, Lei Zhu et al.
CLIP-DINOiser: Teaching CLIP a few DINO tricks for open-vocabulary semantic segmentation
Monika Wysoczanska, Oriane Siméoni, Michaël Ramamonjisoa et al.
IGNORE: Information Gap-based False Negative Loss Rejection for Single Positive Multi-Label Learning
Gyeong Ryeol Song, Noo-ri Kim, Jin-Seop Lee et al.
FMBoost: Boosting Latent Diffusion with Flow Matching
Johannes Schusterbauer-Fischer, Ming Gui, Pingchuan Ma et al.
Merlin: Empowering Multimodal LLMs with Foresight Minds
En Yu, Liang Zhao, Yana Wei et al.
MemBN: Robust Test-Time Adaptation via Batch Norm with Statistics Memory
Juwon Kang, Nayeong Kim, Jungseul Ok et al.
High-Resolution and Few-shot View Synthesis from Asymmetric Dual-lens Inputs
Ruikang Xu, Mingde Yao, Yue Li et al.
SparseSSP: 3D Subcellular Structure Prediction from Sparse-View Transmitted Light Images
Jintu Zheng, Yi Ding, Qizhe Liu et al.
SEDiff: Structure Extraction for Domain Adaptive Depth Estimation via Denoising Diffusion Models
Dongseok Shim, Hyoun Jin Kim
M^2Depth: Self-supervised Two-Frame Multi-camera Metric Depth Estimation
Yingshuang Zou, Yikang Ding, Xi Qiu et al.
Pseudo-Labelling Should Be Aware of Disguising Channel Activations
Changrui Chen, Kurt Debattista, Jungong Han
3DSA: Multi-View 3D Human Pose Estimation With 3D Space Attention Mechanisms
Po Han Chen, Chia-Chi Tsai
Gaze Target Detection Based on Head-Local-Global Coordination
Yaokun Yang, Feng Lu
Visual Relationship Transformation
Xiaoyu Xu, Jiayan Qiu, Baosheng Yu et al.
SPHINX: A Mixer of Weights, Visual Embeddings and Image Scales for Multi-modal Large Language Models
Ziyi Lin, Dongyang Liu, Renrui Zhang et al.
Computing the Lipschitz constant needed for fast scene recovery from CASSI measurements
Niels Chr. Overgaard, Anders Holst
Unsupervised Exposure Correction
Ruodai Cui, Li Niu, Guosheng Hu
Cut out the Middleman: Revisiting Pose-based Gait Recognition
Yang Fu, Saihui Hou, Shibei Meng et al.
Elysium: Exploring Object-level Perception in Videos through Semantic Integration Using MLLMs
Han Wang, Yanjie Wang, Ye Yongjie et al.
CPT-VR: Improving Surface Rendering via Closest Point Transform with View-Reflection Appearance
Zhipeng Hu, Yongqiang Zhang, Chen Liu et al.
Rethinking Fast Adversarial Training: A Splitting Technique To Overcome Catastrophic Overfitting
Masoumeh Zareapoor, Pourya Shamsolmoali
FedHARM: Harmonizing Model Architectural Diversity in Federated Learning
Anestis Kastellos, Athanasios Psaltis, Charalampos Z Patrikakis et al.
3D Hand Sequence Recovery from Real Blurry Images and Event Stream
Joonkyu Park, Gyeongsik Moon, Weipeng Xu et al.
Rethinking Data Bias: Dataset Copyright Protection via Embedding Class-wise Hidden Bias
Jinhyeok Jang, ByungOk Han, Jaehong Kim et al.
Federated Learning with Local Openset Noisy Labels
Zonglin Di, Zhaowei Zhu, Xiaoxiao Li et al.
TCC-Det: Temporarily consistent cues for weakly-supervised 3D detection
Jan Skvrna, Lukas Neumann
3D Congealing: 3D-Aware Image Alignment in the Wild
Yunzhi Zhang, Zizhang Li, Amit Raj et al.
Learning Exhaustive Correlation for Spectral Super-Resolution: Where Spatial-Spectral Attention Meets Linear Dependence
Hongyuan Wang, Lizhi Wang, Jiang Xu et al.
RCS-Prompt: Learning Prompt to Rearrange Class Space for Prompt-based Continual Learning
Longrong Yang, Hanbin Zhao, Yunlong Yu et al.
High-Fidelity 3D Textured Shapes Generation by Sparse Encoding and Adversarial Decoding
Qi Zuo, Xiaodong Gu, Yuan Dong et al.
SemTrack: A Large-scale Dataset for Semantic Tracking in the Wild
Pengfei Wang, Xiaofei Hui, Jing Wu et al.
IVTP: Instruction-guided Visual Token Pruning for Large Vision-Language Models
Kai Huang, Hao Zou, Ye Xi et al.
3DFG-PIFu: 3D Feature Grids for Human Digitization from Sparse Views
Kennard Yanting Chan, Fayao Liu, Guosheng Lin et al.
All You Need is Your Voice: Emotional Face Representation with Audio Perspective for Emotional Talking Face Generation
Seongho Kim, Byung Cheol Song
TrafficNight: An Aerial Multimodal Benchmark For Nighttime Vehicle Surveillance
Guoxing Zhang, Yiming Liu, Xiaoyu Yang et al.
PreLAR: World Model Pre-training with Learnable Action Representation
Lixuan Zhang, Meina Kan, Shiguang Shan et al.
G2fR: Frequency Regularization in Grid-based Feature Encoding Neural Radiance Fields
Shuxiang Xie, Shuyi Zhou, Ken Sakurada et al.
Let the Avatar Talk using Texts without Paired Training Data
Xiuzhe Wu, Yang-Tian Sun, Handi Chen et al.
Loc3Diff: Local Diffusion for 3D Human Head Synthesis and Editing
Yushi Lan, Feitong Tan, Qiangeng Xu et al.
SceneTeller: Language-to-3D Scene Generation
Basak Melis Ocal, Maxim Tatarchenko, Sezer Karaoglu et al.
PoseSOR: Human Pose Can Guide Our Attention
Huankang Guan, Rynson W.H. Lau
MarineInst: A Foundation Model for Marine Image Analysis with Instance Visual Description
Ziqiang Zheng, Yiwei Chen, Huimin Zeng et al.
SpeedUpNet: A Plug-and-Play Adapter Network for Accelerating Text-to-Image Diffusion Models
Weilong Chai, Dandan Zheng, Jiajiong Cao et al.
FuseTeacher: Modality-fused Encoders are Strong Vision Supervisors
Chen-Wei Xie, Siyang Sun, Liming Zhao et al.
WAVE: Warping DDIM Inversion Features for Zero-shot Text-to-Video Editing
Yutang Feng, Sicheng Gao, Yuxiang Bao et al.
Cocktail Universal Adversarial Attack on Deep Neural Networks
Shaoxin Li, Xiaofeng Liao, Xin Che et al.
Learning to Distinguish Samples for Generalized Category Discovery
Fengxiang Yang, Pu Nan, Wenjing Li et al.
COM Kitchens: An Unedited Overhead-view Procedural Videos Dataset as a Vision-Language Benchmark
Atsushi Hashimoto, Koki Maeda, Tosho Hirasawa et al.
Optimal Transport of Diverse Unsupervised Tasks for Robust Learning from Noisy Few-Shot Data
Xiaofan Que, Qi Yu
WBP: Training-time Backdoor Attacks through Hardware-based Weight Bit Poisoning
Kunbei Cai, Zhenkai Zhang, Qian Lou et al.
Towards Dual Transparent Liquid Level Estimation in Biomedical Lab: Dataset, Methods and Practice
Xiayu Wang, Ke Ma, Ruiyun Zhong et al.
Free-Viewpoint Video of Outdoor Sports Using a Drone
Zhengdong Hong
Blind image deblurring with noise-robust kernel estimation
Chanseok Lee, Jeongsol Kim, Seungmin Lee et al.
4Diff: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation
Feng Cheng, Mi Luo, Huiyu Wang et al.
LetsMap: Unsupervised Representation Learning for Label-Efficient Semantic BEV Mapping
Nikhil Gosala, Kürsat Petek, B Ravi Kiran et al.
Get Your Embedding Space in Order: Domain-Adaptive Regression for Forest Monitoring
Sizhuo Li, Dimitri Gominski, Martin Brandt et al.
Unsupervised Dense Prediction using Differentiable Normalized Cuts
Yanbin Liu, Stephen Gould
L-DiffER: Single Image Reflection Removal with Language-based Diffusion Model
Yuchen Hong, Haofeng Zhong, Shuchen Weng et al.
uCAP: An Unsupervised Prompting Method for Vision-Language Models
A. Tuan Nguyen, Kai Sheng Tai, Bor-Chun Chen et al.
How Video Meetings Change Your Expression
Sumit Sarin, Utkarsh Mall, Purva Tendulkar et al.
Learning Chain of Counterfactual Thought for Bias-Robust Vision-Language Reasoning
Yifeng Zhang, Ming Jiang, Qi Zhao
Delving into Adversarial Robustness on Document Tampering Localization
Huiru Shao, Zhuang Qian, Kaizhu Huang et al.
Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning
Jiahao Xiao, Ming-Kun Xie, Heng-Bo Fan et al.
Common Sense Reasoning for Deep Fake Detection
Yue Zhang, Ben Colman, Xiao Guo et al.
Efficient Frequency-Domain Image Deraining with Contrastive Regularization
Ning Gao, Xingyu Jiang, Xiuhui Zhang et al.
Rethinking Deep Unrolled Model for Accelerated MRI Reconstruction
Bingyu Xin, Meng Ye, Leon Axel et al.
Pose Guided Fine-Grained Sign Language Video Generation
Tongkai Shi, Lianyu Hu, Fanhua Shang et al.
Retrieval Robust to Object Motion Blur
Rong Zou, Marc Pollefeys, Denys Rozumnyi
SeA: Semantic Adversarial Augmentation for Last Layer Features from Unsupervised Representation Learning
Qi Qian, Yuanhong Xu, Juhua Hu
HVCLIP: High-dimensional Vector in CLIP for Unsupervised Domain Adaptation
Noranart Vesdapunt, Kah Kuen Fu, Yue Wu et al.
Improving 3D Semi-supervised Learning by Effectively Utilizing All Unlabelled Data
Sneha Paul, Zachary Patterson, Nizar Bouguila
Norma: A Noise Robust Memory-Augmented Framework for Whole Slide Image Classification
Yu Bai, Bo Zhang, Zheng Zhang et al.
To Supervise or Not to Supervise: Understanding and Addressing the Key Challenges of Point Cloud Transfer Learning
Souhail Hadgi, Lei Li, Maks Ovsjanikov
Context-Aware Action Recognition: Introducing a Comprehensive Dataset for Behavior Contrast
Tatsuya Sasaki, Yoshiki Ito, Satoshi Kondo
Uni3DL: A Unified Model for 3D Vision-Language Understanding
Xiang Li, Jian Ding, Zhaoyang Chen et al.
Learning to Robustly Reconstruct Dynamic Scenes from Low-light Spike Streams
Liwen Hu, Gang Ding, Mianzhi Liu et al.
A Riemannian Approach for Spatiotemporal Analysis and Generation of 4D Tree-shaped Structures
Tahmina Khanam, Mohammed Bennamoun, Guan Wang et al.
Rethinking Image Super Resolution from Training Data Perspectives
Go Ohtani, Ryu Tadokoro, Ryosuke Yamada et al.
Robustness Preserving Fine-tuning using Neuron Importance
Guangrui Li, Rahul Duggal, Aaditya Singh et al.
Optimization-based Uncertainty Attribution Via Learning Informative Perturbations
Hanjing Wang, Bashirul Azam Biswas, Qiang Ji
Ex2Eg-MAE: A Framework for Adaptation of Exocentric Video Masked Autoencoders for Egocentric Social Role Understanding
Minh Tran, Yelin Kim, Che-Chun Su et al.
Dual-Rain: Video Rain Removal using Assertive and Gentle Teachers
Tingting Chen, Beibei Lin, Yeying Jin et al.
ProtoComp: Diverse Point Cloud Completion with Controllable Prototype
Xumin Yu, Yanbo Wang, Jie Zhou et al.
3D Gaussian Parametric Head Model
Yuelang Xu, Lizhen Wang, Zerong Zheng et al.
Forbes: Face Obfuscation Rendering via Backpropagation Refinement Scheme
Jintae Kim, Seungwon Yang, Seong-Gyun Jeong et al.
Information Bottleneck Based Data Correction in Continual Learning
Shuai Chen, Mingyi Zhang, Junge Zhang et al.
Soft Shadow Diffusion (SSD): Physics-inspired Learning for 3D Computational Periscopy
Fadlullah Raji, John Murray-Bruce
FisherRF: Active View Selection and Mapping with Radiance Fields using Fisher Information
Wen Jiang, Boshu Lei, Kostas Daniilidis
Learning Representation for Multitask Learning through Self-Supervised Auxiliary Learning
Seokwon Shin, Hyungrok Do, Youngdoo Son
Generalizing to Unseen Domains via Text-guided Augmentation
Daiqing Qi, Handong Zhao, Aidong Zhang et al.
Adaptive Parametric Activation
Konstantinos P Alexandridis, Jiankang Deng, Anh Nguyen et al.
GAURA: Generalizable Approach for Unified Restoration and Rendering of Arbitrary Views
Vinayak Gupta, Rongali Simhachala Venkata Girish, Mukund Varma T et al.
Generalizable Symbolic Optimizer Learning
Xiaotian Song, Peng Zeng, Yanan Sun et al.
AEDNet: Adaptive Embedding and Multiview-Aware Disentanglement for Point Cloud Completion
Zhiheng Fu, Longguang Wang, Lian Xu et al.
Contextual Correspondence Matters: Bidirectional Graph Matching for Video Summarization
Yunzuo Zhang, Yameng Liu
Lost in Translation: Latent Concept Misalignment in Text-to-Image Diffusion Models
Juntu Zhao, Junyu Deng, Yixin Ye et al.
Think before Placement: Common Sense Enhanced Transformer for Object Placement
Yaxuan Qin, Jiayu Xu, Ruiping Wang et al.
Reinforcement Learning via Auxiliary Task Distillation
Abhinav Narayan Harish, Larry Heck, Josiah P Hanna et al.
On the Viability of Monocular Depth Pre-training for Semantic Segmentation
Dong Lao, Fengyu Yang, Daniel Wang et al.
Causal Subgraphs and Information Bottlenecks: Redefining OOD Robustness in Graph Neural Networks
Weizhi An, Wenliang Zhong, Feng Jiang et al.
Revisit Self-supervision with Local Structure-from-Motion
Shengjie Zhu, Xiaoming Liu
Easing 3D Pattern Reasoning with Side-view Features for Semantic Scene Completion
Linxi Huan, Mingyue Dong, Linwei Yue et al.
Unsupervised Representation Learning by Balanced Self Attention Matching
Daniel Shalam, Simon Korman
Occluded Gait Recognition with Mixture of Experts: An Action Detection Perspective
Panjian Huang, Yunjie Peng, Saihui Hou et al.
Early Anticipation of Driving Maneuvers
Abdul Wasi Lone, Shankar Gangisetty, Shyam Nandan et al.
Revisiting Domain-Adaptive Object Detection in Adverse Weather by the Generation and Composition of High-Quality Pseudo-Labels
Rui Zhao, Huibin Yan, Shuoyao Wang
DreamReward: Aligning Human Preference in Text-to-3D Generation
Junliang Ye, Fangfu Liu, Qixiu Li et al.
Plug and Play: A Representation Enhanced Domain Adapter for Collaborative Perception
Tianyou Luo, Quan Yuan, Yuchen Xia et al.
FedHide: Federated Learning by Hiding in the Neighbors
Hyunsin Park, Sungrack Yun
HoloADMM: High-Quality Holographic Complex Field Recovery
Mazen Mel, Paul Springer, Pietro Zanuttigh et al.
GGRt: Towards Generalizable 3D Gaussians without Pose Priors in Real-Time
Hao Li, Yuanyuan Gao, Dingwen Zhang et al.
Learning Equilibrium Transformation for Gamut Expansion and Color Restoration
Jun Xiao, Changjian Shui, Zhi-Song Liu et al.
Physics-informed Knowledge Transfer for Underwater Monocular Depth Estimation
Jinghe Yang, Mingming Gong, Ye Pu
Motion Keyframe Interpolation for Any Human Skeleton using Point Cloud-based Human Motion Data Homogenisation
Clinton Mo, Kun Hu, Chengjiang Long et al.
DCDM: Diffusion-Conditioned-Diffusion Model for Scene Text Image Super-Resolution
Shrey Singh, Prateek Keserwani, Masakazu Iwamura et al.
Shapefusion: 3D localized human diffusion models
Rolandos Alexandros Potamias, Michael Tarasiou, Stylianos Ploumpis et al.
Robust Nearest Neighbors for Source-Free Domain Adaptation under Class Distribution Shift
Antonio Tejero-de-Pablos, Riku Togashi, Mayu Otani et al.
Linking in Style: Understanding learned features in deep learning models
Maren Wehrheim, Pamela Osuna Vargas, Matthias Kaschube
Chains of Diffusion Models
Yanheng Wei, Lianghua Huang, Zhi-Fan Wu et al.
Integrating Markov Blanket Discovery into Causal Representation Learning for Domain Generalization
Naiyu Yin, Hanjing Wang, Yue Yu et al.
Discovering Unwritten Visual Classifiers with Large Language Models
Mia Chiquier, Utkarsh Mall, Carl Vondrick
FLAT: Flux-aware Imperceptible Adversarial Attacks on 3D Point Clouds
Keke Tang, Lujie Huang, Weilong Peng et al.
Learn to Optimize Denoising Scores: A Unified and Improved Diffusion Prior for 3D Generation
Xiaofeng Yang, Yiwen Chen, Cheng Chen et al.
Learning Quantized Adaptive Conditions for Diffusion Models
Yuchen Liang, Yuchuan Tian, Lei Yu et al.
Attention Decomposition for Cross-Domain Semantic Segmentation
Liqiang He, Sinisa Todorovic
MANIKIN: Biomechanically Accurate Neural Inverse Kinematics for Human Motion Estimation
Jiaxi Jiang, Paul Streli, Xuejing Luo et al.
A Unified Image Compression Method for Human Perception and Multiple Vision Tasks
Sha Guo, Sui Lin, Chen-Lin Zhang et al.
IAM-VFI: Interpolate Any Motion for Video Frame Interpolation with motion complexity map
Kihwan Yoon, Yong Han Kim, Sungjei Kim et al.
On the Approximation Risk of Few-Shot Class-Incremental Learning
Xuan Wang, Zhong Ji, Xiyao Liu et al.
Learning Neural Deformation Representation for 4D Dynamic Shape Generation
Gyojin Han, Jiwan Hur, Jaehyun Choi et al.
Echoes of the Past: Boosting Long-tail Recognition via Reflective Learning
Qihao Zhao, Yalun Dai, Shen Lin et al.
Diffusion-Guided Weakly Supervised Semantic Segmentation
Sung-Hoon Yoon, Hoyong Kwon, Jaeseok Jeong et al.
Real-data-driven 2000 FPS Color Video from Mosaicked Chromatic Spikes
Siqi Yang, Zhaojun Huang, Yakun Chang et al.
Panel-Specific Degradation Representation for Raw Under-Display Camera Image Restoration
Youngjin Oh, Keuntek Lee, Jooyoung Lee et al.
SignGen: End-to-End Sign Language Video Generation with Latent Diffusion
Fan Qi, Yu Duan, Changsheng Xu et al.
Idling Neurons, Appropriately Lenient Workload During Fine-tuning Leads to Better Generalization
Hongjing Niu, Hanting Li, Bin Li et al.
Hierarchically Structured Neural Bones for Reconstructing Animatable Objects from Casual Videos
Subin Jeon, In Cho, Minsu Kim et al.
APL: Anchor-based Prompt Learning for One-stage Weakly Supervised Referring Expression Comprehension
Yaxin Luo, Jiayi Ji, Xiaofu Chen et al.
Depth-Aware Blind Image Decomposition for Real-World Adverse Weather Recovery
Chao Wang, Zhedong Zheng, Ruijie Quan et al.
Convex Relaxations for Manifold-Valued Markov Random Fields with Approximation Guarantees
Robin Kenis, Emanuel Laude, Panagiotis Patrinos
Label-free Neural Semantic Image Synthesis
Jiayi Wang, Kevin Alexander Laube, Yumeng Li et al.
Revisiting Feature Disentanglement Strategy in Diffusion Training and Breaking Conditional Independence Assumption in Sampling
Wonwoong Cho, Hareesh Ravi, Midhun Harikumar et al.
HiEI: A Universal Framework for Generating High-quality Emerging Images from Natural Images
Jingmeng Li, Lukang Fu, Surun Yang et al.
Co-Student: Collaborating Strong and Weak Students for Sparsely Annotated Object Detection
Lianjun Wu, Jiangxiao Han, Zengqiang Zheng et al.
Oulu Remote-photoplethysmography Physical Domain Attacks Database (ORPDAD)
Marko Savic, Guoying Zhao
V-Trans4Style: Visual Transition Recommendation for Video Production Style Adaptation
Pooja Guhan, Tsung-Wei Huang, Guan-Ming Su et al.
Privacy-Preserving Adaptive Re-Identification without Image Transfer
Hamza Rami, Jhony H. Giraldo, Nicolas Winckler et al.
Structured-NeRF: Hierarchical Scene Graph with Neural Representation
Zhide Zhong, Jiakai Cao, Songen Gu et al.
Motion Aware Event Representation-driven Image Deblurring
Zhijing Sun, Xueyang Fu, Longzhuo Huang et al.
Uncertainty-Driven Spectral Compressive Imaging with Spatial-Frequency Transformer
Lintao Peng, Siyu Xie, Liheng Bian
Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields
Yonggan Fu, Huaizhi Qu, Zhifan Ye et al.
Weakly-Supervised Spatio-Temporal Video Grounding with Variational Cross-Modal Alignment
Yang Jin, Yadong Mu
The Sky's the Limit: Relightable Outdoor Scenes via a Sky-pixel Constrained Illumination Prior and Outside-In Visibility
James Gardner, Evgenii Kashin, Bernhard Egger et al.
Hierarchical Unsupervised Relation Distillation for Source Free Domain Adaptation
Bowei Xing, Xianghua Ying, Ruibin Wang et al.
Category Adaptation Meets Projected Distillation in Generalized Continual Category Discovery
Grzegorz Rypesc, Daniel Marczak, Sebastian Cygert et al.
TAG: Text Prompt Augmentation for Zero-Shot Out-of-Distribution Detection
Xixi Liu, Christopher Zach
Deep Companion Learning: Enhancing Generalization Through Historical Consistency
Ruizhao Zhu, Venkatesh Saligrama
Domain Reduction Strategy for Non-Line-of-Sight Imaging
Hyunbo Shim, In Cho, Daekyu Kwon et al.
Textual Grounding for Open-vocabulary Visual Information Extraction in Layout-diversified Documents
Mengjun Cheng, Chengquan Zhang, Chang Liu et al.
GMT: Enhancing Generalizable Neural Rendering via Geometry-Driven Multi-Reference Texture Transfer
Youngho Yoon, Hyun-Kurl Jang, Kuk-Jin Yoon
Scalar Function Topology Divergence: Comparing Topology of 3D Objects
Ilya Trofimov, Daria Voronkova, Eduard Tulchinskii et al.
BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering
Xinmin Qiu, Congying Han, Zicheng Zhang et al.
SAFARI: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation
Sayan Nag, Koustava Goswami, Srikrishna Karanam
Interaction-centric Spatio-Temporal Context Reasoning for Multi-Person Video HOI Recognition
Yisong Wang, Nan Xi, Jingjing Meng et al.
MAD-DR: Map Compression for Visual Localization with Matchness Aware Descriptor Dimension Reduction
Qiang Wang
Continual Learning and Unknown Object Discovery in 3D Scenes via Self-Distillation
Mohamed El Amine Boudjoghra, Jean Lahoud, Salman Khan et al.
LLM as Copilot for Coarse-grained Vision-and-Language Navigation
Yanyuan Qiao, Qianyi Liu, Jiajun Liu et al.
ExMatch: Self-guided Exploitation for Semi-Supervised Learning with Scarce Labeled Samples
Noo-ri Kim, Jin-Seop Lee, Jee-Hyong Lee
Physically Plausible Color Correction for Neural Radiance Fields
Qi Zhang, Ying Feng, Hongdong Li
Domesticating SAM for Breast Ultrasound Image Segmentation via Spatial-frequency Fusion and Uncertainty Correction
Wanting Zhang, Huisi Wu, Jing Qin
Analysis-by-Synthesis Transformer for Single-View 3D Reconstruction
Dian Jia, Xiaoqian Ruan, Kun Xia et al.
Open-Vocabulary RGB-Thermal Semantic Segmentation
Guoqiang Zhao, JunJie Huang, Xiaoyun Yan et al.
DMiT: Deformable Mipmapped Tri-Plane Representation for Dynamic Scenes
Jing-Wen Yang, Jia-Mu Sun, Yong-Liang Yang et al.
Textual-Visual Logic Challenge: Understanding and Reasoning in Text-to-Image Generation
Peixi Xiong, Michael A Kozuch, Nilesh Jain
Spline-based Transformers
Prashanth Chandran, Agon Serifi, Markus Gross et al.
KeypointDETR: An End-to-End 3D Keypoint Detector
Hairong Jin, Yuefan Shen, Jianwen Lou et al.
Lost and Found: Overcoming Detector Failures in Online Multi-Object Tracking
Lorenzo Vaquero, Yihong Xu, Xavier Alameda-Pineda et al.
Unsupervised Moving Object Segmentation with Atmospheric Turbulence
Dehao Qin, Ripon Saha, Woojeh Chung et al.
Modeling Label Correlations with Latent Context for Multi-Label Recognition
Zhao-Min Chen, Quan Cui, Ruoxi Deng et al.
Bayesian Self-Training for Semi-Supervised 3D Segmentation
Ozan Unal, Christos Sakaridis, Luc Van Gool