Most Cited ECCV "capsule networks" Papers
VISAGE: Video Instance Segmentation with Appearance-Guided Enhancement
Hanjung Kim, Jaehyun Kang, Miran Heo et al.
MeshVPR: Citywide Visual Place Recognition Using 3D Meshes
Gabriele Berton, Lorenz Junglas, Riccardo Zaccone et al.
AvatarPose: Avatar-guided 3D Pose Estimation of Close Human Interaction from Sparse Multi-view Videos
Feichi Lu, Zijian Dong, Jie Song et al.
Temporal Residual Jacobians for Rig-free Motion Transfer
Sanjeev Muralikrishnan, Niladri Shekhar Dutt, Siddhartha Chaudhuri et al.
CONDA: Condensed Deep Association Learning for Co-Salient Object Detection
Long Li, Nian Liu, Dingwen Zhang et al.
Self-Supervised Audio-Visual Soundscape Stylization
Tingle Li, Renhao Wang, Po-Yao Huang et al.
Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations
Kilichbek Haydarov, Xiaoqian Shen, Avinash Madasu et al.
Domain Generalization of 3D Object Detection by Density-Resampling
Shuangzhi Li, Lei Ma, Xingyu Li
From Fake to Real: Pretraining on Balanced Synthetic Images to Prevent Spurious Correlations in Image Recognition
Maan Qraitem, Kate Saenko, Bryan Plummer
SemReg: Semantics Constrained Point Cloud Registration
Sheldon Fung, Xuequan Lu, Dasith de Silva Edirimuni et al.
OneTrack: Demystifying the Conflict Between Detection and Tracking in End-to-End 3D Trackers
Qitai Wang, Jiawei He, Yuntao Chen et al.
MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection
Youngmin Oh, Hyung-Il Kim, Seong Tae Kim et al.
Agglomerative Token Clustering
Joakim Bruslund Haurum, Sergio Escalera, Graham W. Taylor et al.
Fast Encoding and Decoding for Implicit Video Representation
Hao Chen, Saining Xie, Ser-Nam Lim et al.
Learn to Memorize and to Forget: A Continual Learning Perspective of Dynamic SLAM
Baicheng Li, Zike Yan, Dong Wu et al.
The Devil is in the Statistics: Mitigating and Exploiting Statistics Difference for Generalizable Semi-supervised Medical Image Segmentation
Muyang Qiu, Jian Zhang, Lei Qi et al.
Generalized Coverage for More Robust Low-Budget Active Learning
Wonho Bae, Junhyug Noh, Danica J. Sutherland
Enhancing Tampered Text Detection through Frequency Feature Fusion and Decomposition
Zhongxi Chen, Shen Chen, Taiping Yao et al.
Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection
Youheng Sun, Shengming Yuan, Xuanhan Wang et al.
Appearance-based Refinement for Object-Centric Motion Segmentation
Junyu Xie, Weidi Xie, Andrew Zisserman
Vision-Language Dual-Pattern Matching for Out-of-Distribution Detection
Zihan Zhang, Zhuo Xu, Xiang Xiang
Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment
Yuxiao Chen, Kai Li, Wentao Bao et al.
Modelling Competitive Behaviors in Autonomous Driving Under Generative World Model
Guanren Qiao, Guiliang Liu, Guorui Quan et al.
DySeT: a Dynamic Masked Self-distillation Approach for Robust Trajectory Prediction
Mozhgan Pourkeshavarz, Arielle Zhang, Amir Rasouli
Semantically Guided Representation Learning For Action Anticipation
Anxhelo Diko, Danilo Avola, Bardh Prenkaj et al.
Rawformer: Unpaired Raw-to-Raw Translation for Learnable Camera ISPs
Georgy Perevozchikov, Nancy Mehta, Mahmoud Afifi et al.
Understanding Physical Dynamics with Counterfactual World Modeling
Rahul Mysore Venkatesh, Honglin Chen, Kevin Feigelis et al.
Spherical World-Locking for Audio-Visual Localization in Egocentric Videos
Heeseung Yun, Ruohan Gao, Ishwarya Ananthabhotla et al.
Improving Domain Generalization in Self-Supervised Monocular Depth Estimation via Stabilized Adversarial Training
Yuanqi Yao, Gang Wu, Kui Jiang et al.
PoseEmbroider: Towards a 3D, Visual, Semantic-aware Human Pose Representation
Ginger Delmas, Philippe Weinzaepfel, Francesc Moreno et al.
Shedding More Light on Robust Classifiers under the lens of Energy-based Models
Mujtaba Hussain Mirza, Maria Rosaria Briglia, Senad Beadini et al.
Generating 3D House Wireframes with Semantics
Xueqi Ma, Yilin Liu, Wenjun Zhou et al.
Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning
Shibo Jie, Yehui Tang, Jianyuan Guo et al.
Synergy of Sight and Semantics: Visual Intention Understanding with CLIP
Qu Yang, Mang Ye, Dacheng Tao
SCPNet: Unsupervised Cross-modal Homography Estimation via Intra-modal Self-supervised Learning
Runmin Zhang, Jun Ma, Lun Luo et al.
Synchronous Diffusion for Unsupervised Smooth Non-Rigid 3D Shape Matching
Dongliang Cao, Zorah Laehner, Florian Bernard
De-confounded Gaze Estimation
Ziyang Liang, Yiwei Bao, Feng Lu
GenQ: Quantization in Low Data Regimes with Generative Synthetic Data
Yuhang Li, Youngeun Kim, Donghyun Lee et al.
Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach
Taolin Zhang, Jiawang Bai, Zhihe Lu et al.
MAP-ADAPT: Real-Time Quality-Adaptive Semantic 3D Maps
Jianhao Zheng, Daniel Barath, Marc Pollefeys et al.
MedRAT: Unpaired Medical Report Generation via Auxiliary Tasks
Elad Hirsch, Gefen Dawidowicz, Ayellet Tal
DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation
Junkai Yan, Yipeng Gao, Qize Yang et al.
Scene-aware Human Motion Forecasting via Mutual Distance Prediction
Chaoyue Xing, Wei Mao, Miaomiao Liu
PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers
Ananthu Aniraj, Cassio F. Dantas, Dino Ienco et al.
CMD: A Cross Mechanism Domain Adaptation Dataset for 3D Object Detection
Jinhao Deng, Wei Ye, Hai Wu et al.
Bidirectional Uncertainty-Based Active Learning for Open-Set Annotation
ChenChen Zong, Ye-Wen Wang, Kun-Peng Ning et al.
EAFormer: Scene Text Segmentation with Edge-Aware Transformers
Haiyang Yu, Teng Fu, Bin Li et al.
ProDepth: Boosting Self-Supervised Multi-Frame Monocular Depth with Probabilistic Fusion
Sungmin Woo, Wonjoon Lee, Woo Jin Kim et al.
X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and its Emergent Cross-modal Reasoning
Artemis Panagopoulou, Le Xue, Ning Yu et al.
ADMap: Anti-disturbance Framework for Vectorized HD Map Construction
Haotian Hu, Fanyi Wang, Yaonong Wang et al.
This Probably Looks Exactly Like That: An Invertible Prototypical Network
Zachariah Carmichael, Timothy Redgrave, Daniel Gonzalez Cedre et al.
3D-GOI: 3D GAN Omni-Inversion for Multifaceted and Multi-object Editing
Haoran Li, Long Ma, Haolin Shi et al.
Edge-Guided Fusion and Motion Augmentation for Event-Image Stereo
Fengan Zhao, Qianang Zhou, Junlin Xiong
Pose-Aware Self-Supervised Learning with Viewpoint Trajectory Regularization
Jiayun Wang, Yubei Chen, Stella Yu
ML-SemReg: Boosting Point Cloud Registration with Multi-level Semantic Consistency
Shaocheng Yan, Pengcheng Shi, Jiayuan Li
Operational Open-Set Recognition and PostMax Refinement
Steve Cruz, Ryan Rabinowitz, Manuel Günther et al.
Rethinking LiDAR Domain Generalization: Single Source as Multiple Density Domains
Jaeyeul Kim, Jungwan Woo, Jeonghoon Kim et al.
DECOLLAGE: 3D Detailization by Controllable, Localized, and Learned Geometry Enhancement
Qimin Chen, Zhiqin Chen, Vladimir Kim et al.
DA-BEV: Unsupervised Domain Adaptation for Bird's Eye View Perception
Kai Jiang, Jiaxing Huang, Weiying Xie et al.
Investigating Style Similarity in Diffusion Models
Gowthami Somepalli, Anubhav Anubhav, Kamal Gupta et al.
Occlusion-Aware Seamless Segmentation
Yihong Cao, Jiaming Zhang, Hao Shi et al.
Augmented Neural Fine-tuning for Efficient Backdoor Purification
Md Nazmul Karim, Abdullah Al Arafat, Umar Khalid et al.
Scalable Group Choreography via Variational Phase Manifold Learning
Nhat Le, Khoa Do, Xuan Bui et al.
Bi-directional Contextual Attention for 3D Dense Captioning
Minjung Kim, Hyung Suk Lim, Soonyoung Lee et al.
Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture
Xuanchen Li, Yuhao Cheng, Xingyu Ren et al.
Towards High-Quality 3D Motion Transfer with Realistic Apparel Animation
Rong Wang, Wei Mao, Changsheng Lu et al.
DEVIAS: Learning Disentangled Video Representations of Action and Scene
Kyungho Bae, Youngrae Kim, Geo Ahn et al.
Noise-assisted Prompt Learning for Image Forgery Detection and Localization
Dong Li, Jiaying Zhu, Xueyang Fu et al.
Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation
Seongsu Ha, Chaeyun Kim, Donghwa Kim et al.
LMT-GP: Combined Latent Mean-Teacher and Gaussian Process for Semi-supervised Low-light Image Enhancement
Ye Yu, Fengxin Chen, Jun Yu et al.
Viewpoint textual inversion: discovering scene representations and 3D view control in 2D diffusion models
James Burgess, Kuan-Chieh Wang, Serena Yeung-Levy
SPIN: Hierarchical Segmentation with Subpart Granularity in Natural Images
Josh Myers-Dean, Jarek T Reynolds, Brian Price et al.
Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression
Animesh Sinha, Bo Sun, Anmol Kalia et al.
High-Quality Mesh Blendshape Generation from Face Videos via Neural Inverse Rendering
Xin Ming, Jiawei Li, Jingwang Ling et al.
Delving Deep into Engagement Prediction of Short Videos
Dasong Li, Wenjie Li, Baili Lu et al.
Quanta Video Restoration
Prateek Chennuri, Yiheng Chi, Enze Jiang et al.
Adapt without Forgetting: Distill Proximity from Dual Teachers in Vision-Language Models
Mengyu Zheng, Yehui Tang, Zhiwei Hao et al.
Towards Open-World Object-based Anomaly Detection via Self-Supervised Outlier Synthesis
Brian Isaac Medina, Yona Falinie Abdul Gaus, Neelanjan Bhowmik et al.
Efficient Bias Mitigation Without Privileged Information
Mateo Espinosa Zarlenga, Sankaranarayanan, Jerone Andrews et al.
WaSt-3D: Wasserstein-2 Distance for Scene-to-Scene Stylization on 3D Gaussians
Dmytro Kotovenko, Olga Grebenkova, Nikolaos Sarafianos et al.
Watching it in Dark: A Target-aware Representation Learning Framework for High-Level Vision Tasks in Low Illumination
Yunan Li, Yihao Zhang, Shoude Li et al.
Risk-Aware Self-Consistent Imitation Learning for Trajectory Planning in Autonomous Driving
Yixuan Fan, Ya-Li Li, Shengjin Wang
Co-synthesis of Histopathology Nuclei Image-Label Pairs using a Context-Conditioned Joint Diffusion Model
Seonghui Min, Hyun-Jic Oh, Won-Ki Jeong
Two-Stage Active Learning for Efficient Temporal Action Segmentation
Yuhao Su, Ehsan Elhamifar
Adaptive Multi-task Learning for Few-shot Object Detection
Yan Ren, Yanling Li, Wai-Kin Adams Kong
Temporal Residual Guided Diffusion Framework for Event-Driven Video Reconstruction
Lin Zhu, Yunlong Zheng, Yijun Zhang et al.
Improving image synthesis with diffusion-negative sampling
Alakh Desai, Nuno Vasconcelos
LLMCO4MR: LLMs-aided Neural Combinatorial Optimization for Ancient Manuscript Restoration from Fragments with Case Studies on Dunhuang
Yuqing Zhang, Hangqi Li, Shengyu Zhang et al.
LiDAR-Event Stereo Fusion with Hallucinations
Luca Bartolomei, Matteo Poggi, Andrea Conti et al.
HARIVO: Harnessing Text-to-Image Models for Video Generation
Mingi Kwon, Seoung Wug Oh, Yang Zhou et al.
AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation
Jiannan Ge, Lingxi Xie, Hongtao Xie et al.
Weak-to-Strong Compositional Learning from Generative Models for Language-based Object Detection
Kwanyong Park, Kuniaki Saito, Donghyun Kim
Event Trojan: Asynchronous Event-based Backdoor Attacks
Ruofei Wang, Qing Guo, Haoliang Li et al.
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
Ofir Abramovich, Niv Nayman, Sharon Fogel et al.
STSP: Spatial-Temporal Subspace Projection for Video Class-incremental Learning
Hao Cheng, Siyuan Yang, Chong Wang et al.
SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments
Niklas Gard, Anna Hilsmann, Peter Eisert
Efficient Diffusion-Driven Corruption Editor for Test-Time Adaptation
Yeongtak Oh, Jonghyun Lee, Jooyoung Choi et al.
PRET: Planning with Directed Fidelity Trajectory for Vision and Language Navigation
Renjie Lu, Jing-Ke Meng, Weishi Zheng
Long-range Turbulence Mitigation: A Large-scale Dataset and A Coarse-to-fine Framework
Shengqi Xu, Run Sun, Yi Chang et al.
Versatile Incremental Learning: Towards Class and Domain-Agnostic Incremental Learning
Minyeong Park, Jae-Ho Lee, Gyeong-Moon Park
PointRegGPT: Boosting 3D Point Cloud Registration using Generative Point-Cloud Pairs for Training
Suyi Chen, Hao Xu, Haipeng Li et al.
FastCAD: Real-Time CAD Retrieval and Alignment from Scans and Videos
Florian Langer, Jihong Ju, Georgi Dikov et al.
MERLiN: Single-Shot Material Estimation and Relighting for Photometric Stereo
Ashish Tiwari, Satoshi Ikehata, Shanmuganathan Raman
E3M: Zero-Shot Spatio-Temporal Video Grounding with Expectation-Maximization Multimodal Modulation
Peijun Bao, Zihao Shao, Wenhan Yang et al.
Learning to Enhance Aperture Phasor Field for Non-Line-of-Sight Imaging
In Cho, Hyunbo Shim, Seon Joo Kim
Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas
Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli et al.
UDA-Bench: Revisiting Common Assumptions in Unsupervised Domain Adaptation Using a Standardized Framework
Tarun Kalluri, Sreyas Ravichandran, Manmohan Chandraker
Hierarchical Conditioning of Diffusion Models Using Tree-of-Life for Studying Species Evolution
Mridul Khurana, Arka Daw, M. Maruf et al.
Constructing Concept-based Models to Mitigate Spurious Correlations with Minimal Human Effort
Jeeyung Kim, Ze Wang, Qiang Qiu
Tight and Efficient Upper Bound on Spectral Norm of Convolutional Layers
Ekaterina Grishina, Mikhail Gorbunov, Maxim Rakhuba
AlignDiff: Aligning Diffusion Models for General Few-Shot Segmentation
Ri-Zhao Qiu, Yu-Xiong Wang, Kris Hauser
Geometry Fidelity for Spherical Images
Anders Christensen, Nooshin Mojab, Khushman Patel et al.
DHR: Dual Features-Driven Hierarchical Rebalancing in Inter- and Intra-Class Regions for Weakly-Supervised Semantic Segmentation
Sanghyun Jo, Fei Pan, In-Jae Yu et al.
VLAD-BuFF: Burst-aware Fast Feature Aggregation for Visual Place Recognition
Ahmad Khaliq, Ming Xu, Stephen Hausler et al.
Better Regression Makes Better Test-time Adaptive 3D Object Detection
Jiakang Yuan, Bo Zhang, Kaixiong Gong et al.
Test-Time Stain Adaptation with Diffusion Models for Histopathology Image Classification
Cheng-Chang Tsai, Yuan-Chih Chen, Chun-Shien Lu
Discovering Novel Actions from Open World Egocentric Videos with Object-Grounded Visual Commonsense Reasoning
Sanjoy Kundu, Shubham Trehan, Sathyanarayanan Aakur
Six-Point Method for Multi-Camera Systems with Reduced Solution Space
Banglei Guan, Ji Zhao, Laurent Kneip
Interleaving One-Class and Weakly-Supervised Models with Adaptive Thresholding for Unsupervised Video Anomaly Detection
Yongwei Nie, Hao Huang, Chengjiang Long et al.
LiDAR-based All-weather 3D Object Detection via Prompting and Distilling 4D Radar
Yujeong Chae, Hyeonseong Kim, Changgyoon Oh et al.
Pseudo-keypoint RKHS Learning for Self-supervised 6DoF Pose Estimation
Yangzheng Wu, Michael Alan Greenspan
Domain-Adaptive 2D Human Pose Estimation via Dual Teachers in Extremely Low-Light Conditions
Yihao Ai, Yifei Qi, Bo Wang et al.
Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation
Yuwen Pan, Rui Sun, Naisong Luo et al.
Smoothness, Synthesis, and Sampling: Re-thinking Unsupervised Multi-View Stereo with DIV Loss
Alex Rich, Noah Stier, Pradeep Sen et al.
RS-NeRF: Neural Radiance Fields from Rolling Shutter Images
Muyao Niu, Tong Chen, Yifan Zhan et al.
MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation
Yuxiang Wei, Zhilong Ji, Jinfeng Bai et al.
Learning High-resolution Vector Representation from Multi-Camera Images for 3D Object Detection
Zhili Chen, Shuangjie Xu, Maosheng Ye et al.
Fine-Grained Scene Graph Generation via Sample-Level Bias Prediction
Yansheng Li, Tingzhu Wang, Kang Wu et al.
DiffPMAE: Diffusion Masked Autoencoders for Point Cloud Reconstruction
Yanlong Li, Chamara Madarasingha, Kanchana Thilakarathna
SCOMatch: Alleviating Overtrusting in Open-set Semi-supervised Learning
Zerun Wang, Liuyu Xiang, Lang Huang et al.
Spike-Temporal Latent Representation for Energy-Efficient Event-to-Video Reconstruction
Jianxiong Tang, Jian-Huang Lai, Lingxiao Yang et al.
Personalized Privacy Protection Mask Against Unauthorized Facial Recognition
Ka Ho Chow, Sihao Hu, Tiansheng Huang et al.
Reprojection Errors as Prompts for Efficient Scene Coordinate Regression
Ting-Ru Liu, Hsuan-Kung Yang, Jou-Min Liu et al.
Feature Diversification and Adaptation for Federated Domain Generalization
Seunghan Yang, Seokeon Choi, Hyunsin Park et al.
Fundamental Matrix Estimation Using Relative Depths
Yaqing Ding, Václav Vávra, Snehal Bhayani et al.
Are Synthetic Data Useful for Egocentric Hand-Object Interaction Detection?
Rosario Leonardi, Antonino Furnari, Francesco Ragusa et al.
Self-Cooperation Knowledge Distillation for Novel Class Discovery
Yuzheng Wang, Zhaoyu Chen, Dingkang Yang et al.
A Diffusion Model for Simulation Ready Coronary Anatomy with Morpho-skeletal Control
Karim Kadry, Shreya Gupta, Jonas Sogbadji et al.
Adapt2Reward: Adapting Video-Language Models to Generalizable Robotic Rewards via Failure Prompts
Yanting Yang, Minghao Chen, Qibo Qiu et al.
AddMe: Zero-shot Group-photo Synthesis by Inserting People into Scenes
Dongxu Yue, Maomao Li, Yunfei Liu et al.
Rejection Sampling IMLE: Designing Priors for Better Few-Shot Image Synthesis
Chirag Vashist, Shichong Peng, Ke Li
Online Continuous Generalized Category Discovery
Keon-Hee Park, Hakyung Lee, Kyungwoo Song et al.
Instance-dependent Noisy-label Learning with Graphical Model Based Noise-rate Estimation
Arpit Garg, Cuong Cao Nguyen, Rafael Felix et al.
Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning
Jihai Zhang, Xiang Lan, Xiaoye Qu et al.
cDP-MIL: Robust Multiple Instance Learning via Cascaded Dirichlet Process
Yihang Chen, Tsai Hor Chan, Guosheng Yin et al.
RaFE: Generative Radiance Fields Restoration
Zhongkai Wu, Ziyu Wan, Jing Zhang et al.
DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism
Zhen Wang, Xinyun Jiang, Jun Xiao et al.
SelfGeo: Self-supervised and Geodesic-consistent Estimation of Keypoints on Deformable Shapes
Mohammad Zohaib, Luca Cosmo, Alessio Del Bue
On the Evaluation Consistency of Attribution-based Explanations
Jiarui Duan, Haoling Li, Haofei Zhang et al.
Open-World Dynamic Prompt and Continual Visual Representation Learning
Youngeun Kim, Jun Fang, Qin Zhang et al.
Text-Anchored Score Composition: Tackling Condition Misalignment in Text-to-Image Diffusion Models
Luozhou Wang, Guibao Shen, Wenhang Ge et al.
ProSub: Probabilistic Open-Set Semi-Supervised Learning with Subspace-Based Out-of-Distribution Detection
Erik Wallin, Lennart Svensson, Fredrik Kahl et al.
Learning Dual-Level Deformable Implicit Representation for Real-World Scale Arbitrary Super-Resolution
Zhiheng Li, Muheng Li, Jixuan Fan et al.
Open Vocabulary Multi-Label Video Classification
Rohit Gupta, Mamshad Nayeem Rizve, Jayakrishnan Unnikrishnan et al.
Pre-trained Visual Dynamics Representations for Efficient Policy Learning
Hao Luo, Bohan Zhou, Zongqing Lu
FTBC: Forward Temporal Bias Correction for Optimizing ANN-SNN Conversion
Xiaofeng Wu, Velibor Bojkovic, Bin Gu et al.
A Fair Ranking and New Model for Panoptic Scene Graph Generation
Julian Lorenz, Alexander Pest, Daniel Kienzle et al.
EgoPoseFormer: A Simple Baseline for Stereo Egocentric 3D Human Pose Estimation
Chenhongyi Yang, Anastasia Tkach, Shreyas Hampali et al.
Boost Your NeRF: A Model-Agnostic Mixture of Experts Framework for High Quality and Efficient Rendering
Francesco Di Sario, Riccardo Renzulli, Marco Grangetto et al.
Semi-Supervised Teacher-Reference-Student Architecture for Action Quality Assessment
Wulian Yun, Mengshi Qi, Fei Peng et al.
Learning from the Web: Language Drives Weakly-Supervised Incremental Learning for Semantic Segmentation
Chang Liu, Giulia Rizzoli, Pietro Zanuttigh et al.
FRI-Net: Floorplan Reconstruction via Room-wise Implicit Representation
Honghao Xu, Juzhan Xu, Zeyu Huang et al.
Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution
Junxiong Lin, Yan Wang, Zeng Tao et al.
Towards Stable 3D Object Detection
Jiabao Wang, Qiang Meng, Guochao Liu et al.
LiveHPS++: Robust and Coherent Motion Capture in Dynamic Free Environment
Yiming Ren, Xiao Han, Yichen Yao et al.
VSViG: Real-time Video-based Seizure Detection via Skeleton-based Spatiotemporal ViG
Yankun Xu, Junzhe Wang, Yun-Hsuan Chen et al.
GlobalPointer: Large-Scale Plane Adjustment with Bi-Convex Relaxation
Bangyan Liao, Zhenjun Zhao, Lu Chen et al.
Bucketed Ranking-based Losses for Efficient Training of Object Detectors
Feyza Yavuz, Baris Can Cam, Adnan Harun Dogan et al.
HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization
Sakib Reza, Yuexi Zhang, Mohsen Moghaddam et al.
DATENeRF: Depth-Aware Text-based Editing of NeRFs
Sara Rojas Martinez, Julien Philip, Kai Zhang et al.
Revisiting Calibration of Wide-Angle Radially Symmetric Cameras
Andrea Porfiri Dal Cin, Francesco Azzoni, Giacomo Boracchi et al.
Event-based Mosaicing Bundle Adjustment
Shuang Guo, Guillermo Gallego
SuperFedNAS: Cost-Efficient Federated Neural Architecture Search for On-Device Inference
Alind Khare, Animesh Agrawal, Aditya Annavajjala et al.
Local and Global Flatness for Federated Domain Generalization
Hao Yan, Yuhong Guo
TPA3D: Triplane Attention for Fast Text-to-3D Generation
Bin-Shih Wu, Hong-En Chen, Sheng-Yu Huang et al.
EgoBody3M: Egocentric Body Tracking on a VR Headset using a Diverse Dataset
Amy Zhao, Chengcheng Tang, Lezi Wang et al.
Correspondence-Free SE(3) Point Cloud Registration in RKHS via Unsupervised Equivariant Learning
Ray Zhang, Zheming Zhou, Min Sun et al.
Unified Medical Image Pre-training in Language-Guided Common Semantic Space
Xiaoxuan He, Yifan Yang, Xinyang Jiang et al.
Understanding Multi-compositional learning in Vision and Language models via Category Theory
Sotirios Panagiotis Takis Chytas, Hyunwoo J. Kim, Vikas Singh
Comprehensive Attribution: Inherently Explainable Vision Model with Feature Detector
Xianren Zhang, Dongwon Lee, Suhang Wang
Spatial-Temporal Multi-level Association for Video Object Segmentation
Deshui Miao, Xin Li, Zhenyu He et al.
GAReT: Cross-view Video Geolocalization with Adapters and Auto-Regressive Transformers
Manu S Pillai, Mamshad Nayeem Rizve, Mubarak Shah
View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields
Haodi He, Colton Stearns, Adam Harley et al.
Efficient Depth-Guided Urban View Synthesis
Sheng Miao, Jiaxin Huang, Dongfeng Bai et al.
Efficient Training with Denoised Neural Weights
Yifan Gong, Zheng Zhan, Yanyu Li et al.
OpenKD: Opening Prompt Diversity for Zero- and Few-shot Keypoint Detection
Changsheng Lu, Zheyuan Liu, Piotr Koniusz
SkyMask: Attack-agnostic Robust Federated Learning with Fine-grained Learnable Masks
Peishen Yan, Hao Wang, Tao Song et al.
FREST: Feature RESToration for Semantic Segmentation under Multiple Adverse Conditions
Sohyun Lee, Namyup Kim, Sungyeon Kim et al.
Seeing Faces in Things: A Model and Dataset for Pareidolia
Mark T Hamilton, Simon Stent, Vasha G DuTell et al.
Mahalanobis Distance-based Multi-view Optimal Transport for Multi-view Crowd Localization
Qi Zhang, Kaiyi Zhang, Antoni Chan et al.
Region-Aware Sequence-to-Sequence Learning for Hyperspectral Denoising
JiaHua Xiao, Yang Liu, Xing Wei
Reliable Spatial-Temporal Voxels For Multi-Modal Test-Time Adaptation
Haozhi Cao, Yuecong Xu, Jianfei Yang et al.
Cross-Domain Semantic Segmentation on Inconsistent Taxonomy using VLMs
Jeongkee Lim, Yusung Kim
OvSW: Overcoming Silent Weights for Accurate Binary Neural Networks
Jingyang Xiang, Zuohui Chen, Siqi Li et al.
Using My Artistic Style? You Must Obtain My Authorization
Xiuli Bi, Haowei Liu, Weisheng Li et al.
RPBG: Towards Robust Neural Point-based Graphics in the Wild
Qingtian Zhu, Zizhuang Wei, Zhongtian Zheng et al.
Surface-Centric Modeling for High-Fidelity Generalizable Neural Surface Reconstruction
Rui Peng, Shihe Shen, Kaiqiang Xiong et al.
SCAPE: A Simple and Strong Category-Agnostic Pose Estimator
Yujia Liang, Zixuan Ye, Wenze Liu et al.
Minimalist Vision with Freeform Pixels
Jeremy Klotz, Shree Nayar
SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization
Xixu Hu, Runkai Zheng, Jindong Wang et al.