Most Cited ECCV "non-centroid clustering" Papers
2,387 papers found • Page 5 of 12
Conference
Free Lunch for Gait Recognition: A Novel Relation Descriptor
Jilong Wang, Saihui Hou, Yan Huang et al.
Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement
Lingyu Zhu, Wenhan Yang, Baoliang Chen et al.
VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space
Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda et al.
Temporal-Mapping Photography for Event Cameras
Yuhan Bao, Lei Sun, Yuqin Ma et al.
Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition
Lilang Lin, Lehong Wu, Jiahang Zhang et al.
PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation
Ning Gao, Sanping Zhou, Le Wang et al.
Real-time 3D-aware Portrait Editing from a Single Image
Qingyan Bai, Zifan Shi, Yinghao Xu et al.
Imaging Interiors: An Implicit Solution to Electromagnetic Inverse Scattering Problems
Ziyuan Luo, Boxin Shi, Haoliang Li et al.
Training-free Composite Scene Generation for Layout-to-Image Synthesis
Jiaqi Liu, Tao Huang, Chang Xu
KDProR: A Knowledge-Decoupling Probabilistic Framework for Video-Text Retrieval
Xianwei Zhuang, Hongxiang Li, Xuxin Cheng et al.
Adversarially Robust Distillation by Reducing the Student-Teacher Variance Gap
Junhao Dong, Piotr Koniusz, Junxi Chen et al.
Rasterized Edge Gradients: Handling Discontinuities Differentially
Stanislav Pidhorskyi, Tomas Simon, Gabriel Schwartz et al.
Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective
Fangzhou Song, Bin Zhu, Yanbin Hao et al.
VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving
Yibo Liu, Zheyuan Yang, Guile Wu et al.
Generating Physically Realistic and Directable Human Motions from Multi-Modal Inputs
Aayam Shrestha, Pan Liu, German Ros et al.
Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation
Yingshan Chang, Yasi Zhang, Zhiyuan Fang et al.
DNI: Dilutional Noise Initialization for Diffusion Video Editing
Sunjae Yoon, Gwanhyeong Koo, Ji Woo Hong et al.
CanonicalFusion: Generating Drivable 3D Human Avatars from Multiple Images
Jisu Shin, Junmyeong Lee, Seongmin Lee et al.
Towards More Practical Group Activity Detection: A New Benchmark and Model
Dongkeun Kim, Youngkil Song, Minsu Cho et al.
Volumetric Rendering with Baked Quadrature Fields
Gopal Sharma, Daniel Rebain, Kwang Moo Yi et al.
MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment
Kanglei Zhou, Liyuan Wang, Xingxing Zhang et al.
Graph Neural Network Causal Explanation via Neural Causal Models
Arman Behnam, Binghui Wang
Open-Set Recognition in the Age of Vision-Language Models
Dimity Miller, Niko Suenderhauf, Alex Kenna et al.
Uncertainty-aware sign language video retrieval with probability distribution modeling
Xuan Wu, Hongxiang Li, yuanjiang luo et al.
Few-shot NeRF by Adaptive Rendering Loss Regularization
Qingshan Xu, Xuanyu Yi, Jianyao Xu et al.
REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models
Agneet Chatterjee, Yiran Luo, Tejas Gokhale et al.
Distributionally Robust Loss for Long-Tailed Multi-Label Image Classification
Dekun Lin, Zhe Cui, Rui Chen et al.
Recursive Visual Programming
Jiaxin Ge, Sanjay Subramanian, Baifeng Shi et al.
Leveraging Thermal Modality to Enhance Reconstruction in Low-Light Conditions
Jiacong Xu, Mingqian Liao, Ram Prabhakar Kathirvel et al.
Leveraging temporal contextualization for video action recognition
Minji Kim, Dongyoon Han, Taekyung Kim et al.
Multi-Person Pose Forecasting with Individual Interaction Perceptron and Prior Learning
Peng Xiao, Yi Xie, Xuemiao Xu et al.
CoSIGN: Few-Step Guidance of ConSIstency Model to Solve General INverse Problems
Jiankun Zhao, Bowen Song, Liyue Shen
RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception
Shen Jianbing, Chunliang Li, Wencheng Han et al.
NOVUM: Neural Object Volumes for Robust Object Classification
Artur Jesslen, Guofeng Zhang, Angtian Wang et al.
Part2Object: Hierarchical Unsupervised 3D Instance Segmentation
cheng Shi, Yulin zhang, Bin Yang et al.
Surf-D: Generating High-Quality Surfaces of Arbitrary Topologies Using Diffusion Models
Zhengming Yu, Zhiyang Dou, Xiaoxiao Long et al.
Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence
Mengyao Lyu, Tianxiang Hao, Xinhao Xu et al.
EgoPet: Egomotion and Interaction Data from an Animal's Perspective
Amir Bar, Arya Bakhtiar, Danny L Tran et al.
CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing
Faegheh Sardari, Armin Mustafa, Philip JB Jackson et al.
Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos
Keqiang Sun, Dori Litvak, Yunzhi Zhang et al.
X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs
Swetha Sirnam, Jinyu Yang, Tal Neiman et al.
GarmentAligner: Text-to-Garment Generation via Retrieval-augmented Multi-level Corrections
Shiyue Zhang, Zheng Chong, Xujie Zhang et al.
Emerging Property of Masked Token for Effective Pre-training
Hyesong Choi, Hunsang Lee, Seyoung Joung et al.
CloudFixer: Test-Time Adaptation for 3D Point Clouds via Diffusion-Guided Geometric Transformation
Hajin Shim, Changhun Kim, Eunho Yang
Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos
Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang et al.
Hiding Imperceptible Noise in Curvature-Aware Patches for 3D Point Cloud Attack
Mingyu Yang, Daizong Liu, Keke Tang et al.
ViPer: Visual Personalization of Generative Models via Individual Preference Learning
Sogand Salehi, Mahdi Shafiei, Roman Bachmann et al.
AFF-ttention! Affordances and Attention models for Short-Term Object Interaction Anticipation
Lorenzo Mur Labadia, Ruben Martinez-Cantin, Jose J Guerrero et al.
PiTe: Pixel-Temporal Alignment for Large Video-Language Model
Yang Liu, Pengxiang Ding, Siteng Huang et al.
HSR: Holistic 3D Human-Scene Reconstruction from Monocular Videos
Lixin Xue, Chen Guo, Chengwei Zheng et al.
Perceptual Evaluation of Audio-Visual Synchrony Grounded in Viewers’ Opinion Scores
Lucas Goncalves, Prashant Mathur, Chandrashekhar Lavania et al.
BK-SDM: A Lightweight, Fast, and Cheap Version of Stable Diffusion
Bo-Kyeong Kim, Hyoung-Kyu Song, Thibault Castells et al.
3D Small Object Detection with Dynamic Spatial Pruning
Xiuwei Xu, Zhihao Sun, Ziwei Wang et al.
PCF-Lift: Panoptic Lifting by Probabilistic Contrastive Fusion
Runsong Zhu, Shi Qiu, Qianyi Wu et al.
Event-Aided Time-To-Collision Estimation for Autonomous Driving
Jinghang Li, Bangyan Liao, Xiuyuan LU et al.
MinD-3D: Reconstruct High-quality 3D objects in Human Brain
Jianxiong Gao, Yuqian Fu, Yun Wang et al.
Make a Strong Teacher with Label Assistance: A Novel Knowledge Distillation Approach for Semantic Segmentation
Shoumeng Qiu, Jie Chen, Xinrun Li et al.
T-CorresNet: Template Guided 3D Point Cloud Completion with Correspondence Pooling Query Generation Strategy
Fan Duan, Jiahao Yu, Li Chen
Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model
Danni Yang, Ruohan Dong, Jiayi Ji et al.
RoomTex: Texturing Compositional Indoor Scenes via Iterative Inpainting
Qi Wang, Ruijie Lu, Xudong XU et al.
Layer-Wise Relevance Propagation with Conservation Property for ResNet
Seitaro Otsuki, Tsumugi Iida, Félix Doublet et al.
GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence
Pengyuan Wang, Takuya Ikeda, Robert Lee et al.
GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning
Xiaojie Li, Yibo Yang, Xiangtai Li et al.
Brain Netflix: Scaling Data to Reconstruct Videos from Brain Signals
Camilo Fosco, Benjamin Lahner, Bowen Pan et al.
Exploring Vulnerabilities in Spiking Neural Networks: Direct Adversarial Attacks on Raw Event Data
Yanmeng Yao, Xiaohan Zhao, Bin Gu
Enhancing Cross-Subject fMRI-to-Video Decoding with Global-Local Functional Alignment
Chong Li, Xuelin Qian, Yun Wang et al.
Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing
Wonjun Kang, Kevin Galim, Hyung Il Koo
Safeguard Text-to-Image Diffusion Models with Human Feedback Inversion
Sanghyun Kim, Seohyeon Jung, Balhae Kim et al.
Bi-TTA: Bidirectional Test-Time Adapter for Remote Physiological Measurement
Haodong LI, Hao LU, Yingcong Chen
NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image
Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim et al.
Statewide Visual Geolocalization in the Wild
Florian Fervers, Sebastian Bullinger, Christoph Bodensteiner et al.
Enhancing Perceptual Quality in Video Super-Resolution through Temporally-Consistent Detail Synthesis using Diffusion Models
Claudio Rota, Marco Buzzelli, Joost Van de Weijer
LG-Gaze: Learning Geometry-aware Continuous Prompts for Language-Guided Gaze Estimation
Pengwei Yin, Jingjing Wang, Guanzhong Zeng et al.
MetaCap: Meta-learning Priors from Multi-View Imagery for Sparse-view Human Performance Capture and Rendering
Guoxing Sun, Rishabh Dabral, Pascal Fua et al.
Learning Semantic Latent Directions for Accurate and Controllable Human Motion Prediction
Guowei Xu, Jiale Tao, Wen Li et al.
Flying with Photons: Rendering Novel Views of Propagating Light
Anagh Malik, Noah Juravsky, Ryan Po et al.
CityGuessr: City-Level Video Geo-Localization on a Global Scale
Parth Parag Kulkarni, Gaurav Kumar Nayak, Shah Mubarak
ZoLA: Zero-Shot Creative Long Animation Generation with Short Video Model
Fu-Yun Wang, Zhaoyang Huang, Qiang Ma et al.
Certifiably Robust Image Watermark
Zhengyuan Jiang, Moyang Guo, Yuepeng Hu et al.
Localization and Expansion: A Decoupled Framework for Point Cloud Few-shot Semantic Segmentation
Zhaoyang Li, Yuan Wang, Wangkai Li et al.
EraseDraw : Learning to Insert Objects by Erasing Them from Images
Alper Canberk, Maksym Bondarenko, Ege Ozguroglu et al.
Take A Step Back: Rethinking the Two Stages in Visual Reasoning
Mingyu Zhang, Jiting Cai, Mingyu Liu et al.
ActionVOS: Actions as Prompts for Video Object Segmentation
LIANGYANG OUYANG, Ruicong Liu, Yifei Huang et al.
RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception
Xiaosu Zhu, Hualian Sheng, Sijia Cai et al.
PQ-SAM: Post-training Quantization for Segment Anything Model
Xiaoyu Liu, Xin Ding, Lei Yu et al.
CountFormer: Multi-View Crowd Counting Transformer
Hong Mo, Xiong Zhang, Jianchao Tan et al.
Intrinsic Single-Image HDR Reconstruction
Sebastian Dille, Chris Careaga, Yagiz Aksoy
3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance
Xiaoxu Xu, Yitian Yuan, Jinlong Li et al.
O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation
Muer Tie, Julong Wei, Zhengjun Wang et al.
Quantized Prompt for Efficient Generalization of Vision-Language Models
Tianxiang Hao, Xiaohan Ding, Juexiao Feng et al.
Dual-level Adaptive Self-Labeling for Novel Class Discovery in Point Cloud Segmentation
Ruijie Xu, Chuyu Zhang, Hui Ren et al.
Toward Tiny and High-quality Facial Makeup with Data Amplify Learning
Qiaoqiao Jin, Xuanhong Chen, Meiguang Jin et al.
GenRC: Generative 3D Room Completion from Sparse Image Collections
Ming-Feng Li, Yueh-Feng Ku, Hong-Xuan Yen et al.
FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation
Xinzhi MU, Li Chen, Bohan CHEN et al.
TriNeRFLet: A Wavelet Based Triplane NeRF Representation
Rajaei Khatib, RAJA GIRYES
PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines
Zidong Wang, Zeyu Lu, Di Huang et al.
Rethinking Video Deblurring with Wavelet-Aware Dynamic Transformer and Diffusion Model
chen rao, Guangyuan Li, Zehua Lan et al.
Quality Assured: Rethinking Annotation Strategies in Imaging AI
Tim Rädsch, Annika Reinke, Vivienn Weru et al.
Taming Lookup Tables for Efficient Image Retouching
Sidi Yang, Binxiao Huang, Mingdeng Cao et al.
Compress3D: a Compressed Latent Space for 3D Generation from a Single Image
Bowen Zhang, Tianyu Yang, Yu Li et al.
AddBiomechanics Dataset: Capturing the Physics of Human Motion at Scale
Keenon Werling, Janelle M Kaneda, Tian Tan et al.
Learning Diffusion Models for Multi-View Anomaly Detection
Chieh Liu, Yu-Min Chu, Ting-I Hsieh et al.
DynoSurf: Neural Deformation-based Temporally Consistent Dynamic Surface Reconstruction
Yuxin Yao, Siyu Ren, Junhui Hou et al.
Self-supervised visual learning from interactions with objects
Arthur Aubret, Céline Teulière, Jochen Triesch
GeometrySticker: Enabling Ownership Claim of Recolorized Neural Radiance Fields
Xiufeng HUANG, Ka Chun Cheung, Simon See et al.
SMILe: Leveraging Submodular Mutual Information For Robust Few-Shot Object Detection
Anay Majee, Ryan X Sharp, Rishabh Iyer
Cs2K: Class-specific and Class-shared Knowledge Guidance for Incremental Semantic Segmentation
Wei Cong, Yang Cong, Yuyang Liu et al.
On Pretraining Data Diversity for Self-Supervised Learning
Hasan Abed El Kader Hammoud, Tuhin Das, Fabio Pizzati et al.
Hyperion – A fast, versatile symbolic Gaussian Belief Propagation framework for Continuous-Time SLAM
David Hug, Ignacio Alzugaray Lopez, Margarita Chli
COMO: Compact Mapping and Odometry
Eric Dexheimer, Andrew Davison
FSD-BEV: Foreground Self-Distillation for Multi-view 3D Object Detection
Zheng Jiang, Jinqing Zhang, Yanan Zhang et al.
Sur^2f: A Hybrid Representation for High-Quality and Efficient Surface Reconstruction from Multi-view Images
Zhangjin Huang, Zhihao Liang, Kui Jia
RecurrentBEV: A Long-term Temporal Fusion Framework for Multi-view 3D Detection
Ming Chang, Xishan Zhang, Rui Zhang et al.
Learning Neural Volumetric Pose Features for Camera Localization
Jingyu Lin, Jiaqi Gu, Bojian Wu et al.
Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection
Tim Salzmann, Markus Ryll, Alex Bewley et al.
Enriching Information and Preserving Semantic Consistency in Expanding Curvilinear Object Segmentation Datasets
Qin Lei, Jiang Zhong, Qizhu Dai
ConDense: Consistent 2D-3D Pre-training for Dense and Sparse Features from Multi-View Images
Xiaoshuai Zhang, Zhicheng Wang, Howard Zhou et al.
FARSE-CNN: Fully Asynchronous, Recurrent and Sparse Event-Based CNN
Riccardo Santambrogio, Marco Cannici, Matteo Matteucci
Power Variable Projection for Initialization-Free Large-Scale Bundle Adjustment
Simon Weber, Je Hyeong Hong, Daniel Cremers
Preventing Catastrophic Overfitting in Fast Adversarial Training: A Bi-level Optimization Perspective
Zhaoxin Wang, Handing Wang, Cong Tian et al.
Camera Calibration using a Collimator System
Shunkun Liang, Banglei Guan, Zhenbao Yu et al.
Flash Cache: Reducing Bias in Radiance Cache Based Inverse Rendering
Benjamin Attal, Dor Verbin, Ben Mildenhall et al.
IFTR: An Instance-Level Fusion Transformer for Visual Collaborative Perception
Shaohong Wang, Lu Bin, Xinyu Xiao et al.
Layered Rendering Diffusion Model for Controllable Zero-Shot Image Synthesis
Zipeng Qi, Guoxi Huang, Chenyang Liu et al.
Plan, Posture and Go: Towards Open-vocabulary Text-to-Motion Generation
Jinpeng Liu, Wenxun Dai, Chunyu Wang et al.
Parameterized Quasi-Physical Simulators for Dexterous Manipulations Transfer
Xueyi Liu, Kangbo Lyu, jieqiong zhang et al.
DIAL: Dense Image-text ALignment for Weakly Supervised Semantic Segmentation
Soojin Jang, JungMin Yun, JuneHyoung Kwon et al.
PEA-Diffusion: Parameter-Efficient Adapter with Knowledge Distillation in non-English Text-to-Image Generation
Jian Ma, Chen Chen, Qingsong Xie et al.
Semantic Diversity-aware Prototype-based Learning for Unbiased Scene Graph Generation
Jaehyeong Jeon, Kibum Kim, Kanghoon Yoon et al.
Within the Dynamic Context: Inertia-aware 3D Human Modeling with Pose Sequence
Yutong Chen, Yifan Zhan, Zhihang Zhong et al.
Memory-Efficient Fine-Tuning for Quantized Diffusion Model
Hyogon Ryu, Seohyun Lim, Hyunjung Shim
An Explainable Vision Question Answer Model via Diffusion Chain-of-Thought
Chunhao LU, Qiang Lu, Jake Luo
Markov Knowledge Distillation: Make Nasty Teachers trained by Self-undermining Knowledge Distillation Fully Distillable
En-Hui Yang, Linfeng Ye
Towards Physical World Backdoor Attacks against Skeleton Action Recognition
Qichen Zheng, Yi Yu, SIYUAN YANG et al.
Snuffy: Efficient Whole Slide Image Classifier
Hossein Jafarinia, Alireza Alipanah, Saeed Razavi et al.
PolyOculus: Simultaneous Multi-view Image-based Novel View Synthesis
Jason Yu, Tristan Aumentado-Armstrong, Fereshteh Forghani et al.
Zero-Shot Image Feature Consensus with Deep Functional Maps
Xinle Cheng, Congyue Deng, Adam Harley et al.
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rui Qian, Shuangrui Ding, Dahua Lin
Zero-Shot Adaptation for Approximate Posterior Sampling of Diffusion Models in Inverse Problems
Yasar Utku Alcalar, Mehmet Akcakaya
Prompting Future Driven Diffusion Model for Hand Motion Prediction
Bowen Tang, Kaihao Zhang, Wenhan Luo et al.
Auto-DAS: Automated Proxy Discovery for Training-free Distillation-aware Architecture Search
Haosen SUN, Lujun Li, Peijie Dong et al.
SLIM: Spuriousness Mitigation with Minimal Human Annotations
Xiwei Xuan, Ziquan Deng, Hsuan-Tien Lin et al.
D4-VTON: Dynamic Semantics Disentangling for Differential Diffusion based Virtual Try-On
Zhaotong Yang, Zicheng Jiang, Xinzhe Li et al.
High-Precision Self-Supervised Monocular Depth Estimation with Rich-Resource Prior
Shen Jianbing, Wencheng Han
PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control
Yong Zhong, Min Zhao, Zebin You et al.
Towards Scene Graph Anticipation
Rohith Peddi, Saksham Singh, Saurabh . et al.
FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance
Jiedong Zhuang, Jiaqi Hu, Lianrui Mu et al.
ARoFace: Alignment Robustness to Improve Low-quality Face Recognition
Mohammad Saeed Ebrahimi Saadabadi, Sahar Rahimi Malakshan, Ali Dabouei et al.
ProCreate, Don't Reproduce! Propulsive Energy Diffusion for Creative Generation
Jack Lu, Ryan Teehan, Mengye Ren
Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models
Vitali Petsiuk, Kate Saenko
Un-EVIMO: Unsupervised Event-based Independent Motion Segmentation
Ziyun Wang, Jinyuan Guo, Kostas Daniilidis
DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation
Wenliang Zhao, Haolin Wang, Jie Zhou et al.
FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models
Zhikai Zhang, Yitang Li, Haofeng Huang et al.
Distill Gold from Massive Ores: Bi-level Data Pruning towards Efficient Dataset Distillation
YUE XU, Yong-Lu Li, Kaitong Cui et al.
RePOSE: 3D Human Pose Estimation via Spatio-Temporal Depth Relational Consistency
Ziming Sun, Yuan Liang, Zejun Ma et al.
Trainable Highly-expressive Activation Functions
Irit Chelly, Shahaf Finder, Shira Ifergane et al.
Anytime Continual Learning for Open Vocabulary Classification
Zhen Zhu, Yiming Gong, Derek Hoiem
Bottom-Up Domain Prompt Tuning for Generalized Face Anti-Spoofing
SI-QI LIU, Qirui Wang, Pong Chi Yuen
Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model
Zhening Liu, XINJIE ZHANG, Jiawei Shao et al.
Relightable Neural Actor with Intrinsic Decomposition and Pose Control
Diogo Carbonera Luvizon, Vladislav Golyanik, Adam Kortylewski et al.
Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable Repainting
Junwu Zhang, Zhenyu Tang, Yatian Pang et al.
UniINR: Event-guided Unified Rolling Shutter Correction, Deblurring, and Interpolation
Yunfan Lu, Guoqiang Liang, Yusheng Wang et al.
AMES: Asymmetric and Memory-Efficient Similarity Estimation for Instance-level Retrieval
Pavel Suma, Giorgos Kordopatis-Zilos, Ahmet Iscen et al.
Shape from Heat Conduction
Sriram Narayanan, Mani Ramanagopal, Mark Sheinin et al.
WorldPose: A World Cup Dataset for Global 3D Human Pose Estimation
Tianjian Jiang, Johsan Billingham, Sebastian Müksch et al.
DiffuX2CT: Diffusion Learning to Reconstruct CT Images from Biplanar X-Rays
Baochang Zhang, Zhi Qiao, Runkun Liu et al.
Bidirectional Progressive Transformer for Interaction Intention Anticipation
Zichen Zhang, Hongchen Luo, Wei Zhai et al.
TTT-MIM: Test-Time Training with Masked Image Modeling for Denoising Distribution Shifts
Youssef Mansour, Xuyang Zhong, Serdar Caglar et al.
Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos
Ekta Prashnani, Koki Nagano, Shalini De Mello et al.
Insect Identification in the Wild: The AMI Dataset
Aditya Jain, Fagner Cunha, Michael J Bunsen et al.
UniCal: Unified Neural Sensor Calibration
Ze Yang, George G Chen, Haowei Zhang et al.
AWOL: Analysis WithOut synthesis using Language
Silvia Zuffi, Michael J. Black
EBDM: Exemplar-guided Image Translation with Brownian-bridge Diffusion Models
Lee Eungbean, Somi Jeong, Kwanghoon Sohn
Accelerating Online Mapping and Behavior Prediction via Direct BEV Feature Attention
Xunjiang Gu, Guanyu Song, Igor Gilitschenski et al.
SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking
Siyuan Li, Lei Ke, Yung-Hsu Yang et al.
Improving Knowledge Distillation via Regularizing Feature Direction and Norm
Yuzhu Wang, Lechao Cheng, Manni Duan et al.
You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception
Sheng Jin, Shuhuai Li, Tong Li et al.
PartImageNet++ Dataset: Scaling up Part-based Models for Robust Recognition
Xiao Li, Yining Liu, Na Dong et al.
Self-supervised co-salient object detection via feature correspondences at multiple scales
Souradeep Chakraborty, Dimitris Samaras
Made to Order: Discovering monotonic temporal changes via self-supervised video ordering
Charig Yang, Weidi Xie, Andrew ZISSERMAN
RAVE: Residual Vector Embedding for CLIP-Guided Backlit Image Enhancement
Tatiana Gaintseva, Martin Benning, Greg Slabaugh
Self-Supervised Representation Learning for Adversarial Attack Detection
Yi Li, Plamen Angelov, Neeraj Suri
GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths
Xianyu Chen, Ming Jiang, Qi Zhao
DiffCD: A Symmetric Differentiable Chamfer Distance for Neural Implicit Surface Fitting
Linus Härenstam-Nielsen, Lu Sang, Abhishek Saroha et al.
Dissolving Is Amplifying: Towards Fine-Grained Anomaly Detection
Jian Shi, Pengyi Zhang, Ni Zhang et al.
Overcome Modal Bias in Multi-modal Federated Learning via Balanced Modality Selection
Yunfeng Fan, Wenchao Xu, Haozhao Wang et al.
Gradient-Aware for Class-Imbalanced Semi-supervised Medical Image Segmentation
Wenbo Qi, Jiafei Wu, S. C. Chan
Shedding More Light on Robust Classifiers under the lens of Energy-based Models
Mujtaba Hussain Mirza, Maria Rosaria Briglia, Senad Beadini et al.
Removing Distributional Discrepancies in Captions Improves Image-Text Alignment
Mu Cai, Haotian Liu, Yuheng Li et al.
Data Augmentation via Latent Diffusion for Saliency Prediction
Bahar Aydemir, Deblina Bhattacharjee, Tong Zhang et al.
PARIS3D: Reasoning-based 3D Part Segmentation Using Large Multimodal Model
Amrin Kareem, Jean Lahoud, Hisham Cholakkal
Latent-INR: A Flexible Framework for Implicit Representations of Videos with Discriminative Semantics
Shishira R Maiya, Anubhav Anubhav, Matthew Gwilliam et al.
The Devil is in the Statistics: Mitigating and Exploiting Statistics Difference for Generalizable Semi-supervised Medical Image Segmentation
Muyang Qiu, Jian Zhang, Lei Qi et al.
Diffusion Prior-Based Amortized Variational Inference for Noisy Inverse Problems
Sojin Lee, Dogyun Park, Inho Kong et al.
AvatarPose: Avatar-guided 3D Pose Estimation of Close Human Interaction from Sparse Multi-view Videos
Feichi Lu, Zijian Dong, Jie Song et al.
Audio-visual Generalized Zero-shot Learning the Easy Way
Shentong Mo, Pedro Morgado
OneTrack: Demystifying the Conflict Between Detection and Tracking in End-to-End 3D Trackers
Qitai Wang, Jiawei He, Yuntao Chen et al.
Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation
Shentong Mo, Enze Xie, Yue Wu et al.
A high-quality robust diffusion framework for corrupted dataset
Quan Dao, Binh Ta, Tung Pham et al.
Hetecooper: Feature Collaboration Graph for Heterogeneous Collaborative Perception
Congzhang Shao, Guiyang Luo, Quan Yuan et al.