Most Cited ECCV "visual prompt strategy" Papers
2,387 papers found • Page 6 of 12
Toward Tiny and High-quality Facial Makeup with Data Amplify Learning
Qiaoqiao Jin, Xuanhong Chen, Meiguang Jin et al.
DeCo: Decoupled Human-Centered Diffusion Video Editing with Motion Consistency
Xiaojing Zhong, Xinyi Huang, Xiaofeng Yang et al.
Learning to Generate Conditional Tri-plane for 3D-aware Expression Controllable Portrait Animation
Taekyung Ki, Dongchan Min, Gyeongsu Chae
Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors
Ruicheng Wang, Jianfeng Xiang, Jiaolong Yang et al.
FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models
Wei Wu, Qingnan Fan, Shuai Qin et al.
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
Ian Huang, Guandao Yang, Leonidas Guibas
Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation
Shuangrui Ding, Rui Qian, Haohang Xu et al.
Self-Supervised Any-Point Tracking by Contrastive Random Walks
Ayush Shrivastava, Andrew Owens
ProMerge: Prompt and Merge for Unsupervised Instance Segmentation
Dylan Li, Gyungin Shin
Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models
Xiaoyu Zhu, Hao Zhou, Pengfei Xing et al.
Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning
Thanh Thong Nguyen, Yi Bin, Xiaobao Wu et al.
EDformer: Transformer-Based Event Denoising Across Varied Noise Levels
Bin Jiang, Bo Xiong, Bohan Qu et al.
CPM: Class-conditional Prompting Machine for Audio-visual Segmentation
Yuanhong Chen, Chong Wang, Yuyuan Liu et al.
MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment
Anurag Das, Xinting Hu, Li Jiang et al.
Few-shot NeRF by Adaptive Rendering Loss Regularization
Qingshan Xu, Xuanyu Yi, Jianyao Xu et al.
GarmentAligner: Text-to-Garment Generation via Retrieval-augmented Multi-level Corrections
Shiyue Zhang, Zheng Chong, Xujie Zhang et al.
Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation
Mengchen Zhang, Tong Wu, Tai Wang et al.
Nonverbal Interaction Detection
Jianan Wei, Tianfei Zhou, Yi Yang et al.
Efficient Few-Shot Action Recognition via Multi-Level Post-Reasoning
Cong Wu, Xiao-Jun Wu, Linze Li et al.
Deblur e-NeRF: NeRF from Motion-Blurred Events under High-speed or Low-light Conditions
Weng Fei Low, Gim Hee Lee
Neural Spectral Decomposition for Dataset Distillation
Yang Shaolei, Shen Cheng, Mingbo Hong et al.
CaesarNeRF: Calibrated Semantic Representation for Few-Shot Generalizable Neural Rendering
Haidong Zhu, Tianyu Ding, Tianyi Chen et al.
OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework
Wanyun Li, Pinxue Guo, Xinyu Zhou et al.
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
Haiwen Diao, Bo Wan, Xu Jia et al.
Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors
Wei Shang, Dongwei Ren, Wanying Zhang et al.
RoadPainter: Points Are Ideal Navigators for Topology transformER
Zhongxing Ma, Liang Shuang, Yongkun Wen et al.
DiscoMatch: Fast Discrete Optimisation for Geometrically Consistent 3D Shape Matching
Paul Roetzer, Ahmed Abbas, Dongliang Cao et al.
MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion
Lehong Wu, Lilang Lin, Jiahang Zhang et al.
Multiscale Sliced Wasserstein Distances as Perceptual Color Difference Measures
Jiaqi He, Zhihua Wang, Leon Wang et al.
DEAL: Disentangle and Localize Concept-level Explanations for VLMs
Tang Li, Mengmeng Ma, Xi Peng
DiffusionPen: Towards Controlling the Style of Handwritten Text Generation
Konstantina Nikolaidou, George Retsinas, Giorgos Sfikas et al.
Weakly-supervised Camera Localization by Ground-to-satellite Image Registration
Yujiao Shi, Hongdong Li, Akhil Perincherry et al.
NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model
Zhongqun Zhang, Hengfei Wang, Ziwei Yu et al.
ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation
Yi Zhang, Yun Tang, Wenjie Ruan et al.
SwapAnything: Enabling Arbitrary Object Swapping in Personalized Image Editing
Jing Gu, Nanxuan Zhao, Wei Xiong et al.
Unlocking Attributes' Contribution to Successful Camouflage: A Combined Textual and Visual Analysis Strategy
Hong Zhang, Yixuan Lyu, Qian Yu et al.
IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation
Yuanhao Zhai, Kevin Lin, Linjie Li et al.
Class-Agnostic Object Counting with Text-to-Image Diffusion Model
Xiaofei Hui, Qian Wu, Hossein Rahmani et al.
CLAP: Isolating Content from Style through Contrastive Learning with Augmented Prompts
Yichao Cai, Yuhang Liu, Zhen Zhang et al.
Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding
Ruihuang Li, Zhengqiang Zhang, Chenhang He et al.
Inf-DiT: Upsampling any-resolution image with memory-efficient diffusion transformer
Zhuoyi Yang, Heyang Jiang, Wenyi Hong et al.
TimeCraft: Navigate Weakly-Supervised Temporal Grounded Video Question Answering via Bi-directional Reasoning
Huabin Liu, Xiao Ma, Cheng Zhong et al.
DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation
Haibo Yang, Yang Chen, Yingwei Pan et al.
Few-Shot Image Generation by Conditional Relaxing Diffusion Inversion
Yu Cao, Shaogang Gong
Placing Objects in Context via Inpainting for Out-of-distribution Segmentation
Pau de Jorge Aranda, Riccardo Volpi, Puneet Dokania et al.
Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Jinrui Zhang, Teng Wang, Haigang Zhang et al.
Diff-Reg: Diffusion Model in Doubly Stochastic Matrix Space for Registration Problem
Qianliang Wu, Haobo Jiang, Lei Luo et al.
CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing
Faegheh Sardari, Armin Mustafa, Philip JB Jackson et al.
SAVE: Protagonist Diversification with Structure Agnostic Video Editing
Yeji Song, Wonsik Shin, Junsoo Lee et al.
DreamStruct: Understanding Slides and User Interfaces via Synthetic Data Generation
Yi-Hao Peng, Faria Huq, Yue Jiang et al.
Timestep-Aware Correction for Quantized Diffusion Models
Yuzhe Yao, Feng Tian, Jun Chen et al.
INTRA: Interaction Relationship-aware Weakly Supervised Affordance Grounding
Jiha Jang, Hoigi Seo, Se Young Chun
OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model
Runyi Li, Xuhan Sheng, Weiqi Li et al.
TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models
Jeongho Kim, Min-Jung Kim, Junsoo Lee et al.
Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation
Kihong Kim, Haneol Lee, Jihye Park et al.
Salience-Based Adaptive Masking: Revisiting Token Dynamics for Enhanced Pre-training
Hyesong Choi, Hyejin Park, Kwang Moo Yi et al.
Subspace Prototype Guidance for Mitigating Class Imbalance in Point Cloud Semantic Segmentation
Jiawei Han, Kaiqi Liu, Wei Li et al.
TF-FAS: Twofold-Element Fine-Grained Semantic Guidance for Generalizable Face Anti-Spoofing
Xudong Wang, Ke-Yue Zhang, Taiping Yao et al.
EgoPet: Egomotion and Interaction Data from an Animal's Perspective
Amir Bar, Arya Bakhtiar, Danny L Tran et al.
Real-time 3D-aware Portrait Editing from a Single Image
Qingyan Bai, Zifan Shi, Yinghao Xu et al.
Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge
Haibo Wang, Weifeng Ge
MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment
Kanglei Zhou, Liyuan Wang, Xingxing Zhang et al.
Real-time Holistic Robot Pose Estimation with Unknown States
Shikun Ban, Juling Fan, Xiaoxuan Ma et al.
RT-Pose: A 4D Radar-Tensor based 3D Human Pose Estimation and Localization Benchmark
Yuan-Hao Ho, Jen-Hao Cheng, Sheng Yao Kuan et al.
X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs
Swetha Sirnam, Jinyu Yang, Tal Neiman et al.
Learning to Make Keypoints Sub-Pixel Accurate
Shinjeong Kim, Marc Pollefeys, Daniel Barath
DNI: Dilutional Noise Initialization for Diffusion Video Editing
Sunjae Yoon, Gwanhyeong Koo, Ji Woo Hong et al.
JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation
ChenHan Jiang, Yihan Zeng, Tianyang Hu et al.
Uncertainty-aware sign language video retrieval with probability distribution modeling
Xuan Wu, Hongxiang Li, Yuanjiang Luo et al.
CoSIGN: Few-Step Guidance of ConSIstency Model to Solve General INverse Problems
Jiankun Zhao, Bowen Song, Liyue Shen
Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective
Fangzhou Song, Bin Zhu, Yanbin Hao et al.
Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement
Lingyu Zhu, Wenhan Yang, Baoliang Chen et al.
Recursive Visual Programming
Jiaxin Ge, Sanjay Subramanian, Baifeng Shi et al.
Rasterized Edge Gradients: Handling Discontinuities Differentially
Stanislav Pidhorskyi, Tomas Simon, Gabriel Schwartz et al.
Leveraging Thermal Modality to Enhance Reconstruction in Low-Light Conditions
Jiacong Xu, Mingqian Liao, Ram Prabhakar Kathirvel et al.
Distributionally Robust Loss for Long-Tailed Multi-Label Image Classification
Dekun Lin, Zhe Cui, Rui Chen et al.
VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space
Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda et al.
Length-Aware Motion Synthesis via Latent Diffusion
Alessio Sampieri, Alessio Palma, Indro Spinelli et al.
Free Lunch for Gait Recognition: A Novel Relation Descriptor
Jilong Wang, Saihui Hou, Yan Huang et al.
The Role of Masking for Efficient Supervised Knowledge Distillation of Vision Transformers
Seungwoo Son, Jegwang Ryu, Namhoon Lee et al.
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models
Samuele Poppi, Tobia Poppi, Federico Cocchi et al.
Strike a Balance in Continual Panoptic Segmentation
Jinpeng Chen, Runmin Cong, Yuxuan Luo et al.
Event-Aided Time-To-Collision Estimation for Autonomous Driving
Jinghang Li, Bangyan Liao, Xiuyuan Lu et al.
Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning
Fanyue Wei, Wei Zeng, Zhenyang Li et al.
BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models
Ye-Bin Moon, Nam Hyeon-Woo, Wonseok Choi et al.
Surf-D: Generating High-Quality Surfaces of Arbitrary Topologies Using Diffusion Models
Zhengming Yu, Zhiyang Dou, Xiaoxiao Long et al.
ByteEdit: Boost, Comply and Accelerate Generative Image Editing
Yuxi Ren, Jie Wu, Yanzuo Lu et al.
Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data
Tuo Feng, Wenguan Wang, Ruijie Quan et al.
AdaDistill: Adaptive Knowledge Distillation for Deep Face Recognition
Fadi Boutros, Vitomir Struc, Naser Damer
Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence
Mengyao Lyu, Tianxiang Hao, Xinhao Xu et al.
Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition
Lilang Lin, Lehong Wu, Jiahang Zhang et al.
Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection
Xingyu Peng, Yan Bai, Chen Gao et al.
Motion and Structure from Event-based Normal Flow
Zhongyang Ren, Bangyan Liao, Delei Kong et al.
Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models
Minchan Kim, Minyeong Kim, Junik Bae et al.
Adversarially Robust Distillation by Reducing the Student-Teacher Variance Gap
Junhao Dong, Piotr Koniusz, Junxi Chen et al.
Hiding Imperceptible Noise in Curvature-Aware Patches for 3D Point Cloud Attack
Mingyu Yang, Daizong Liu, Keke Tang et al.
Part2Object: Hierarchical Unsupervised 3D Instance Segmentation
Cheng Shi, Yulin Zhang, Bin Yang et al.
Self-Adapting Large Visual-Language Models to Edge Devices across Visual Modalities
Kaiwen Cai, ZheKai Duan, Gaowen Liu et al.
RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception
Shen Jianbing, Chunliang Li, Wencheng Han et al.
Siamese Vision Transformers are Scalable Audio-visual Learners
Yan-Bo Lin, Gedas Bertasius
Layer-Wise Relevance Propagation with Conservation Property for ResNet
Seitaro Otsuki, Tsumugi Iida, Félix Doublet et al.
NOVUM: Neural Object Volumes for Robust Object Classification
Artur Jesslen, Guofeng Zhang, Angtian Wang et al.
AFF-ttention! Affordances and Attention models for Short-Term Object Interaction Anticipation
Lorenzo Mur Labadia, Ruben Martinez-Cantin, Jose J Guerrero et al.
Temporal-Mapping Photography for Event Cameras
Yuhan Bao, Lei Sun, Yuqin Ma et al.
Graph Neural Network Causal Explanation via Neural Causal Models
Arman Behnam, Binghui Wang
Improving Adversarial Transferability via Model Alignment
Avery Ma, Amir-massoud Farahmand, Yangchen Pan et al.
Leveraging temporal contextualization for video action recognition
Minji Kim, Dongyoon Han, Taekyung Kim et al.
Emerging Property of Masked Token for Effective Pre-training
Hyesong Choi, Hunsang Lee, Seyoung Joung et al.
Enhancing Perceptual Quality in Video Super-Resolution through Temporally-Consistent Detail Synthesis using Diffusion Models
Claudio Rota, Marco Buzzelli, Joost Van de Weijer
Generating Physically Realistic and Directable Human Motions from Multi-Modal Inputs
Aayam Shrestha, Pan Liu, German Ros et al.
NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image
Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim et al.
VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving
Yibo Liu, Zheyuan Yang, Guile Wu et al.
Safeguard Text-to-Image Diffusion Models with Human Feedback Inversion
Sanghyun Kim, Seohyeon Jung, Balhae Kim et al.
Training-free Composite Scene Generation for Layout-to-Image Synthesis
Jiaqi Liu, Tao Huang, Chang Xu
Volumetric Rendering with Baked Quadrature Fields
Gopal Sharma, Daniel Rebain, Kwang Moo Yi et al.
T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning
Weijie Wei, Fatemeh Karimi Nejadasl, Theo Gevers et al.
PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation
Ning Gao, Sanping Zhou, Le Wang et al.
RoomTex: Texturing Compositional Indoor Scenes via Iterative Inpainting
Qi Wang, Ruijie Lu, Xudong Xu et al.
DIAL: Dense Image-text ALignment for Weakly Supervised Semantic Segmentation
Soojin Jang, JungMin Yun, JuneHyoung Kwon et al.
Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos
Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang et al.
Adaptive Multi-head Contrastive Learning
Lei Wang, Piotr Koniusz, Tom Gedeon et al.
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation
Pengfei Wang, Yuxi Wang, Shuai Li et al.
Instruction Tuning-free Visual Token Complement for Multimodal LLMs
Dongsheng Wang, Jiequan Cui, Miaoge Li et al.
REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models
Agneet Chatterjee, Yiran Luo, Tejas Gokhale et al.
Domain Shifting: A Generalized Solution for Heterogeneous Cross-Modality Person Re-Identification
Yan Jiang, Xu Cheng, Hao Yu et al.
PISR: Polarimetric Neural Implicit Surface Reconstruction for Textureless and Specular Objects
Guangcheng Chen, Yicheng He, Li He et al.
Multi-Person Pose Forecasting with Individual Interaction Perceptron and Prior Learning
Peng Xiao, Yi Xie, Xuemiao Xu et al.
Open-Set Recognition in the Age of Vision-Language Models
Dimity Miller, Niko Suenderhauf, Alex Kenna et al.
DoubleTake: Geometry Guided Depth Estimation
Mohamed Sayed, Filippo Aleotti, Jamie Watson et al.
CloudFixer: Test-Time Adaptation for 3D Point Clouds via Diffusion-Guided Geometric Transformation
Hajin Shim, Changhun Kim, Eunho Yang
Self-supervised visual learning from interactions with objects
Arthur Aubret, Céline Teulière, Jochen Triesch
MVDD: Multi-View Depth Diffusion Models
Zhen Wang, Qiangeng Xu, Feitong Tan et al.
Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation
Yingshan Chang, Yasi Zhang, Zhiyuan Fang et al.
Exploring the Feature Extraction and Relation Modeling For Light-Weight Transformer Tracking
Jikai Zheng, Mingjiang Liang, Shaoli Huang et al.
Curved Diffusion: A Generative Model With Optical Geometry Control
Andrey Voynov, Amir Hertz, Moab Arar et al.
AnimateMe: 4D Facial Expressions via Diffusion Models
Dimitrios Gerogiannis, Foivos Paraperas Papantoniou, Rolandos Alexandros Potamias et al.
Imaging Interiors: An Implicit Solution to Electromagnetic Inverse Scattering Problems
Ziyuan Luo, Boxin Shi, Haoliang Li et al.
Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos
Keqiang Sun, Dori Litvak, Yunzhi Zhang et al.
KDProR: A Knowledge-Decoupling Probabilistic Framework for Video-Text Retrieval
Xianwei Zhuang, Hongxiang Li, Xuxin Cheng et al.
Implicit Filtering for Learning Neural Signed Distance Functions from 3D Point Clouds
Shengtao Li, Ge Gao, Yudong Liu et al.
ABC Easy as 123: A Blind Counter for Exemplar-Free Multi-Class Class-agnostic Counting
Michael A Hobley, Victor Adrian Prisacariu
Towards More Practical Group Activity Detection: A New Benchmark and Model
Dongkeun Kim, Youngkil Song, Minsu Cho et al.
CanonicalFusion: Generating Drivable 3D Human Avatars from Multiple Images
Jisu Shin, Junmyeong Lee, Seongmin Lee et al.
Certifiably Robust Image Watermark
Zhengyuan Jiang, Moyang Guo, Yuepeng Hu et al.
Brain Netflix: Scaling Data to Reconstruct Videos from Brain Signals
Camilo Fosco, Benjamin Lahner, Bowen Pan et al.
EraseDraw: Learning to Insert Objects by Erasing Them from Images
Alper Canberk, Maksym Bondarenko, Ege Ozguroglu et al.
Fully Authentic Visual Question Answering Dataset from Online Communities
Chongyan Chen, Mengchen Liu, Noel C Codella et al.
Un-EVIMO: Unsupervised Event-based Independent Motion Segmentation
Ziyun Wang, Jinyuan Guo, Kostas Daniilidis
3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance
Xiaoxu Xu, Yitian Yuan, Jinlong Li et al.
Towards Physical World Backdoor Attacks against Skeleton Action Recognition
Qichen Zheng, Yi Yu, Siyuan Yang et al.
Diffusion Prior-Based Amortized Variational Inference for Noisy Inverse Problems
Sojin Lee, Dogyun Park, Inho Kong et al.
MetaCap: Meta-learning Priors from Multi-View Imagery for Sparse-view Human Performance Capture and Rendering
Guoxing Sun, Rishabh Dabral, Pascal Fua et al.
Intrinsic Single-Image HDR Reconstruction
Sebastian Dille, Chris Careaga, Yagiz Aksoy
Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model
Danni Yang, Ruohan Dong, Jiayi Ji et al.
ZoLA: Zero-Shot Creative Long Animation Generation with Short Video Model
Fu-Yun Wang, Zhaoyang Huang, Qiang Ma et al.
Self-Supervised Representation Learning for Adversarial Attack Detection
Yi Li, Plamen Angelov, Neeraj Suri
CountFormer: Multi-View Crowd Counting Transformer
Hong Mo, Xiong Zhang, Jianchao Tan et al.
RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception
Xiaosu Zhu, Hualian Sheng, Sijia Cai et al.
Towards Reliable Advertising Image Generation Using Human Feedback
Zhenbang Du, Wei Feng, Haohan Wang et al.
MinD-3D: Reconstruct High-quality 3D objects in Human Brain
Jianxiong Gao, Yuqian Fu, Yun Wang et al.
WRIM-Net: Wide-Ranging Information Mining Network for Visible-Infrared Person Re-Identification
Yonggan Wu, Ling-Chao Meng, Yuan Zichao et al.
PCF-Lift: Panoptic Lifting by Probabilistic Contrastive Fusion
Runsong Zhu, Shi Qiu, Qianyi Wu et al.
ActionVOS: Actions as Prompts for Video Object Segmentation
Liangyang Ouyang, Ruicong Liu, Yifei Huang et al.
FlexAttention for Efficient High-Resolution Vision-Language Models
Junyan Li, Delin Chen, Tianle Cai et al.
EGIC: Enhanced Low-Bit-Rate Generative Image Compression Guided by Semantic Segmentation
Nikolai Körber, Eduard Kromer, Andreas Siebert et al.
PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines
Zidong Wang, Zeyu Lu, Di Huang et al.
DynoSurf: Neural Deformation-based Temporally Consistent Dynamic Surface Reconstruction
Yuxin Yao, Siyu Ren, Junhui Hou et al.
Localization and Expansion: A Decoupled Framework for Point Cloud Few-shot Semantic Segmentation
Zhaoyang Li, Yuan Wang, Wangkai Li et al.
3D Small Object Detection with Dynamic Spatial Pruning
Xiuwei Xu, Zhihao Sun, Ziwei Wang et al.
DiffuX2CT: Diffusion Learning to Reconstruct CT Images from Biplanar X-Rays
Baochang Zhang, Zhi Qiao, Runkun Liu et al.
GeometrySticker: Enabling Ownership Claim of Recolorized Neural Radiance Fields
Xiufeng Huang, Ka Chun Cheung, Simon See et al.
GenRC: Generative 3D Room Completion from Sparse Image Collections
Ming-Feng Li, Yueh-Feng Ku, Hong-Xuan Yen et al.
AdaDiffSR: Adaptive Region-aware Dynamic acceleration Diffusion Model for Real-World Image Super-Resolution
Yuanting Fan, Chengxu Liu, Nengzhong Yin et al.
Dual-level Adaptive Self-Labeling for Novel Class Discovery in Point Cloud Segmentation
Ruijie Xu, Chuyu Zhang, Hui Ren et al.
FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation
Xinzhi Mu, Li Chen, Bohan Chen et al.
T-CorresNet: Template Guided 3D Point Cloud Completion with Correspondence Pooling Query Generation Strategy
Fan Duan, Jiahao Yu, Li Chen
Rethinking Video Deblurring with Wavelet-Aware Dynamic Transformer and Diffusion Model
Chen Rao, Guangyuan Li, Zehua Lan et al.
Learning Diffusion Models for Multi-View Anomaly Detection
Chieh Liu, Yu-Min Chu, Ting-I Hsieh et al.
Make a Strong Teacher with Label Assistance: A Novel Knowledge Distillation Approach for Semantic Segmentation
Shoumeng Qiu, Jie Chen, Xinrun Li et al.
Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360°
Yuxiao He, Yiyu Zhuang, Yanwen Wang et al.
PQ-SAM: Post-training Quantization for Segment Anything Model
Xiaoyu Liu, Xin Ding, Lei Yu et al.
CityGuessr: City-Level Video Geo-Localization on a Global Scale
Parth Parag Kulkarni, Gaurav Kumar Nayak, Mubarak Shah
SCP-Diff: Spatial-Categorical Joint Prior for Diffusion Based Semantic Image Synthesis
Huan-ang Gao, Mingju Gao, Jiaju Li et al.
PiTe: Pixel-Temporal Alignment for Large Video-Language Model
Yang Liu, Pengxiang Ding, Siteng Huang et al.
SMILe: Leveraging Submodular Mutual Information For Robust Few-Shot Object Detection
Anay Majee, Ryan X Sharp, Rishabh Iyer
LG-Gaze: Learning Geometry-aware Continuous Prompts for Language-Guided Gaze Estimation
Pengwei Yin, Jingjing Wang, Guanzhong Zeng et al.
Quantized Prompt for Efficient Generalization of Vision-Language Models
Tianxiang Hao, Xiaohan Ding, Juexiao Feng et al.
On Pretraining Data Diversity for Self-Supervised Learning
Hasan Abed El Kader Hammoud, Tuhin Das, Fabio Pizzati et al.
Preventing Catastrophic Overfitting in Fast Adversarial Training: A Bi-level Optimization Perspective
Zhaoxin Wang, Handing Wang, Cong Tian et al.
WorldPose: A World Cup Dataset for Global 3D Human Pose Estimation
Tianjian Jiang, Johsan Billingham, Sebastian Müksch et al.
Statewide Visual Geolocalization in the Wild
Florian Fervers, Sebastian Bullinger, Christoph Bodensteiner et al.
Bi-TTA: Bidirectional Test-Time Adapter for Remote Physiological Measurement
Haodong Li, Hao Lu, Yingcong Chen
SLIM: Spuriousness Mitigation with Minimal Human Annotations
Xiwei Xuan, Ziquan Deng, Hsuan-Tien Lin et al.
Animate Your Motion: Turning Still Images into Dynamic Videos
Mingxiao Li, Bo Wan, Marie-Francine Moens et al.
On the Vulnerability of Skip Connections to Model Inversion Attacks
Jun Hao Koh, Sy-Tuyen Ho, Ngoc-Bao Nguyen et al.
Learning Semantic Latent Directions for Accurate and Controllable Human Motion Prediction
Guowei Xu, Jiale Tao, Wen Li et al.
Quality Assured: Rethinking Annotation Strategies in Imaging AI
Tim Rädsch, Annika Reinke, Vivienn Weru et al.
DualDn: Dual-domain Denoising via Differentiable ISP
Ruikang Li, Yujin Wang, Shiqi Chen et al.
Compress3D: a Compressed Latent Space for 3D Generation from a Single Image
Bowen Zhang, Tianyu Yang, Yu Li et al.
Efficient and Versatile Robust Fine-Tuning of Zero-shot Models
Sungyeon Kim, Boseung Jeong, Donghyun Kim et al.