Most Cited ECCV "multi-exposure image fusion" Papers
Is Retain Set All You Need in Machine Unlearning? Restoring Performance of Unlearned Models with Out-Of-Distribution Images
Jacopo Bonato, Marco Cotogni, Luigi Sabetta
WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models
Zijian He, Peixin Chen, Guangrun Wang et al.
Protecting NeRFs' Copyright via Plug-And-Play Watermarking Base Model
Qi Song, Ziyuan Luo, Ka Chun Cheung et al.
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset
Yi Zhang, Wang Zeng, Sheng Jin et al.
CadVLM: Bridging Language and Vision in the Generation of Parametric CAD Sketches
Sifan Wu, Amir Hosein Khasahmadi, Mor Katz et al.
SmartControl: Enhancing ControlNet for Handling Rough Visual Conditions
Xiaoyu Liu, Yuxiang Wei, Ming Liu et al.
ConGeo: Robust Cross-view Geo-localization across Ground View Variations
Li Mi, Chang Xu, Javiera Castillo Navarro et al.
Improving Video Segmentation via Dynamic Anchor Queries
Yikang Zhou, Tao Zhang, Xiangtai Li et al.
Diffusion for Natural Image Matting
Yihan Hu, Yiheng Lin, Wei Wang et al.
ColorPeel: Color Prompt Learning with Diffusion Models via Color and Shape Disentanglement
Muhammad Atif Butt, Kai Wang, Javier Vazquez-Corral et al.
iHuman: Instant Animatable Digital Humans From Monocular Videos
Pramish Paudel, Anubhav Khanal, Danda Pani Paudel et al.
CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction
Zhangchen Ye, Tao Jiang, Chenfeng Xu et al.
Simplifying Source-Free Domain Adaptation for Object Detection: Effective Self-Training Strategies and Performance Insights
Yan Hao, Florent Forest, Olga Fink
SPIRE: Semantic Prompt-Driven Image Restoration
Chenyang Qi, Zhengzhong Tu, Keren Ye et al.
Diff-Tracker: Text-to-Image Diffusion Models are Unsupervised Trackers
Zhengbo Zhang, Li Xu, Duo Peng et al.
TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling
Dong Huo, Zixin Guo, Xinxin Zuo et al.
MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections
Jiayue Liu, Tang Xiao, Freeman Cheng et al.
Towards Open Domain Text-Driven Synthesis of Multi-Person Motions
Shan Mengyi, Lu Dong, Yutao Han et al.
Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering
Antoine Guedon, Vincent Lepetit
You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation
Mehdi Noroozi, Isma Hadji, Brais Martinez et al.
Cross-Platform Video Person ReID: A New Benchmark Dataset and Adaptation Approach
Shizhou Zhang, Wenlong Luo, De Cheng et al.
WordRobe: Text-Guided Generation of Textured 3D Garments
Astitva Srivastava, Pranav Manu, Amit Raj et al.
Revisit Human-Scene Interaction via Space Occupancy
Xinpeng Liu, Haowen Hou, Yanchao Yang et al.
MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception
Mohammad Mahbubur Rahman, Ryoma Yataka, Sorachi Kato et al.
Fast View Synthesis of Casual Videos with Soup-of-Planes
Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen et al.
Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation
Zikai Huang, Xuemiao Xu, Cheng Xu et al.
A Graph-Based Approach for Category-Agnostic Pose Estimation
Or Hirschorn, Shai Avidan
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation
Marco Mistretta, Alberto Baldrati, Marco Bertini et al.
OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects
Akshay Krishnan, Abhijit Kundu, Kevis Maninis et al.
SSL-Cleanse: Trojan Detection and Mitigation in Self-Supervised Learning
Mengxin Zheng, Jiaqi Xue, Zihao Wang et al.
Exemplar-free Continual Representation Learning via Learnable Drift Compensation
Alex Gomez-Villa, Dipam Goswami, Kai Wang et al.
Diffusion-Driven Data Replay: A Novel Approach to Combat Forgetting in Federated Class Continual Learning
Jinglin Liang, Jin Zhong, Hanlin Gu et al.
StructLDM: Structured Latent Diffusion for 3D Human Generation
Tao Hu, Fangzhou Hong, Ziwei Liu
PromptCCD: Learning Gaussian Mixture Prompt Pool for Continual Category Discovery
Fernando Julio Cendra, Bingchen Zhao, Kai Han
StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion
Ming Tao, Bingkun Bao, Hao Tang et al.
Source Prompt Disentangled Inversion for Boosting Image Editability with Diffusion Models
Ruibin Li, Ruihuang Li, Song Guo et al.
CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs
Akshat Ramachandran, Souvik Kundu, Tushar Krishna
Good Teachers Explain: Explanation-Enhanced Knowledge Distillation
Amin Parchami, Moritz Böhle, Sukrut Rao et al.
Learning to Unlearn for Robust Machine Unlearning
Mark Huang, Lin Geng Foo, Jun Liu
Osmosis: RGBD Diffusion Prior for Underwater Image Restoration
Opher Bar Nathan, Deborah Steinberger-Levy, Tali Treibitz et al.
DreamSampler: Unifying Diffusion Sampling and Score Distillation for Image Manipulation
Jeongsol Kim, Geon Yeong Park, Jong Chul Ye
Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction
Alexander Timans, Christoph-Nikolas Straehle, Kaspar Sakmann et al.
NICP: Neural ICP for 3D Human Registration at Scale
Riccardo Marin, Enric Corona, Gerard Pons-Moll
FreestyleRet: Retrieving Images from Style-Diversified Queries
Hao Li, Yanhao Jia, Peng Jin et al.
OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation
Zhenyu Wang, Ya-Li Li, Taichi Liu et al.
EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval
Thomas Hummel, Shyamgopal Karthik, Mariana-Iuliana Georgescu et al.
LAPT: Label-driven Automated Prompt Tuning for OOD Detection with Vision-Language Models
Yabin Zhang, Wenjie Zhu, Chenhang He et al.
Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts
Andong Tan, Fengtao Zhou, Hao Chen
SparseCraft: Few-Shot Neural Reconstruction through Stereopsis Guided Geometric Linearization
Mae Younes, Amine Ouasfi, Adnane Boukhayma
Towards Real-world Event-guided Low-light Video Enhancement and Deblurring
Taewoo Kim, Jaeseok Jeong, Hoonhee Cho et al.
Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models
Yang Zhang, Tze Tzun Teoh, Wei Hern Lim et al.
GarmentCodeData: A Dataset of 3D Made-to-Measure Garments With Sewing Patterns
Maria Korosteleva, Timur Levent Kesdogan, Fabian Kemper et al.
CoMusion: Towards Consistent Stochastic Human Motion Prediction via Motion Diffusion
Jiarui Sun, Girish Chowdhary
SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs
Yang Miao, Francis Engelmann, Olga Vysotska et al.
Any2Point: Empowering Any-modality Transformers for Efficient 3D Understanding
Yiwen Tang, Renrui Zhang, Jiaming Liu et al.
Towards Neuro-Symbolic Video Understanding
Minkyu Choi, Harsh Goel, Mohammad Omama et al.
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
Weitai Kang, Gaowen Liu, Shah Mubarak et al.
HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras
Zhongyu Xia, ZhiWei Lin, Xinhao Wang et al.
LatentEditor: Text Driven Local Editing of 3D Scenes
Umar Khalid, Hasan Iqbal, Muhammad Tayyab et al.
LPViT: Low-Power Semi-structured Pruning for Vision Transformers
Kaixin Xu, Zhe Wang, Chunyun Chen et al.
UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model
Xiangyu Fan, Jiaqi Li, Zhiqian Lin et al.
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation
Hao Fang, Peng Wu, Yawei Li et al.
Contourlet Residual for Prompt Learning Enhanced Infrared Image Super-Resolution
Xingyuan Li, Jinyuan Liu, Zhixin Chen et al.
Centering the Value of Every Modality: Towards Efficient and Resilient Modality-agnostic Semantic Segmentation
Xu Zheng, Yuanhuiyi Lyu, Jiazhou Zhou et al.
ZeroI2V: Zero-Cost Adaptation of Pre-Trained Transformers from Image to Video
Xinhao Li, Yuhan Zhu, Limin Wang
InfMAE: A Foundation Model in The Infrared Modality
Fangcen Liu, Chenqiang Gao, Yaming Zhang et al.
InterFusion: Text-Driven Generation of 3D Human-Object Interaction
Sisi Dai, Wenhao Li, Haowen Sun et al.
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation
Bolin Lai, Fiona Ryan, Wenqi Jia et al.
HO-Gaussian: Hybrid Optimization of 3D Gaussian Splatting for Urban Scenes
Zhuopeng Li, Yilin Zhang, Chenming Wu et al.
UNIC: Universal Classification Models via Multi-teacher Distillation
Yannis Kalantidis, Larlus Diane, Mert Bulent Sariyildiz et al.
Spiking Wavelet Transformer
Yuetong Fang, Ziqing Wang, Lingfeng Zhang et al.
Crowd-SAM: SAM as a smart annotator for object detection in crowded scenes
Zhi Cai, Yingjie Gao, Yaoyan Zheng et al.
MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing
Haoyu Zhao, Tianyi Lu, Jiaxi Gu et al.
Caltech Aerial RGB-Thermal Dataset in the Wild
Connor Lee, Matthew Anderson, Nikhil Ranganathan et al.
Latent Diffusion Prior Enhanced Deep Unfolding for Snapshot Spectral Compressive Imaging
Zongliang Wu, Ruiying Lu, Ying Fu et al.
VividDreamer: Invariant Score Distillation for Hyper-Realistic Text-to-3D Generation
Wenjie Zhuo, Fan Ma, Hehe Fan et al.
Thinking Outside the BBox: Unconstrained Generative Object Compositing
Gemma Canet Tarrés, Zhe Lin, Zhifei Zhang et al.
Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation
Duy Tho Le, Hengcan Shi, Jianfei Cai et al.
Connecting Consistency Distillation to Score Distillation for Text-to-3D Generation
Zongrui Li, Minghui Hu, Qian Zheng et al.
Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt
Bin-Bin Gao
AnyControl: Create Your Artwork with Versatile Control on Text-to-Image Generation
Sun Yanan, Yanchen Liu, Yinhao Tang et al.
Dataset Enhancement with Instance-Level Augmentations
Orest Kupyn, Christian Rupprecht
RAW-Adapter: Adapting Pretrained Visual Model to Camera RAW Images
Ziteng Cui, Tatsuya Harada
Raindrop Clarity: A Dual-Focused Dataset for Day and Night Raindrop Removal
Yeying Jin, Xin Li, Jiadong Wang et al.
Multi-branch Collaborative Learning Network for 3D Visual Grounding
Zhipeng Qian, Yiwei Ma, Zhekai Lin et al.
CC-SAM: Enhancing SAM with Cross-feature Attention and Context for Ultrasound Image Segmentation
Shreyank Narayana Gowda, David A Clifton
Synchronization is All You Need: Exocentric-to-Egocentric Transfer for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs
Camillo Quattrocchi, Antonino Furnari, Daniele Di Mauro et al.
Implicit Concept Removal of Diffusion Models
Zhili Liu, Kai Chen, Yifan Zhang et al.
GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction
Yuxuan Mu, Xinxin Zuo, Chuan Guo et al.
Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models
Yasi Zhang, Peiyu Yu, Ying Nian Wu
GRA: Detecting Oriented Objects through Group-wise Rotating and Attention
Jiangshan Wang, Yifan Pu, Yizeng Han et al.
Improving Virtual Try-On with Garment-focused Diffusion Models
Siqi Wan, Yehao Li, Jingwen Chen et al.
Unlocking the Potential of Federated Learning: The Symphony of Dataset Distillation via Deep Generative Latents
Yuqi Jia, Saeed Vahidian, Jingwei Sun et al.
Pathology-knowledge Enhanced Multi-instance Prompt Learning for Few-shot Whole Slide Image Classification
Linhao Qu, Dingkang Yang, Dan Huang et al.
PILoRA: Prototype Guided Incremental LoRA for Federated Class-Incremental Learning
Haiyang Guo, Fei Zhu, Wenzhuo Liu et al.
Beta-Tuned Timestep Diffusion Model
Tianyi Zheng, Peng-Tao Jiang, Ben Wan et al.
SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds
Yanbo Wang, Wentao Zhao, Cao Chuan et al.
Keypoint Promptable Re-Identification
Vladimir Somers, Alexandre Alahi, Christophe De Vleeschouwer
Robust Incremental Structure-from-Motion with Hybrid Features
Shaohui Liu, Yidan Gao, Tianyi Zhang et al.
CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning
Ziyang Gong, FuHao Li, Yupeng Deng et al.
Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition
Mingfang Zhang, Yifei Huang, Ruicong Liu et al.
CoReS: Orchestrating the Dance of Reasoning and Segmentation
Xiaoyi Bao, Siyang Sun, Shuailei Ma et al.
PetFace: A Large-Scale Dataset and Benchmark for Animal Identification
Risa Shinoda, Kaede Shiohara
InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping
Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang Zhao
BeNeRF: Neural Radiance Fields from a Single Blurry Image and Event Stream
Wenpu Li, Pian Wan, Peng Wang et al.
PartSTAD: 2D-to-3D Part Segmentation Task Adaptation
Hyunjin Kim, Minhyuk Sung
Diffusion Model is a Good Pose Estimator from 3D RF-Vision
Junqiao Fan, Jianfei Yang, Yuecong Xu et al.
Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation
Yixiao Wang, Chen Tang, Lingfeng Sun et al.
SuperGaussian: Repurposing Video Models for 3D Super Resolution
Yuan Shen, Duygu Ceylan, Paul Guerrero et al.
PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts
Zewen Chen, Haina Qin, Juan Wang et al.
Prompt-Based Test-Time Real Image Dehazing: A Novel Pipeline
Zixuan Chen, Zewei He, Ziqian Lu et al.
Tackling Structural Hallucination in Image Translation with Local Diffusion
Seunghoi Kim, Chen Jin, Tom Diethe et al.
Text to Layer-wise 3D Clothed Human Generation
Junting Dong, Qi Fang, Zehuan Huang et al.
Rethinking Few-shot Class-incremental Learning: Learning from Yourself
Yu-Ming Tang, Yi-Xing Peng, Jing-Ke Meng et al.
Lazy Diffusion Transformer for Interactive Image Editing
Yotam Nitzan, Zongze Wu, Richard Zhang et al.
Visual Alignment Pre-training for Sign Language Translation
Peiqi Jiao, Yuecong Min, Xilin Chen
CLOSER: Towards Better Representation Learning for Few-Shot Class-Incremental Learning
Junghun Oh, Sungyong Baik, Kyoung Mu Lee
Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders
Alexandre Eymaël, Renaud Vandeghen, Anthony Cioppa et al.
Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition
Masashi Hatano, Ryo Hachiuma, Ryo Fujii et al.
An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding
Wei Chen, Long Chen, Yu Wu
Deep Diffusion Image Prior for Efficient OOD Adaptation in 3D Inverse Problems
Hyungjin Chung, Jong Chul Ye
Self-Supervised Video Desmoking for Laparoscopic Surgery
Renlong Wu, Zhilu Zhang, Shuohao Zhang et al.
A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis
Xiang Liu, Zhaoxiang Liu, Huan Hu et al.
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation
Xuelu Feng, Dongdong Chen, Junsong Yuan et al.
PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer
Tongkun Guan, Chengyu Lin, Wei Shen et al.
GaussReg: Fast 3D Registration with Gaussian Splatting
Jiahao Chang, Yinglin Xu, Yihao Li et al.
Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning
Yunbin Tu, Liang Li, Li Su et al.
Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation
Tao Chen, Xiruo Jiang, Gensheng Pei et al.
PartCraft: Crafting Creative Objects by Parts
Kam Woh Ng, Xiatian Zhu, Yi-Zhe Song et al.
One-stage Prompt-based Continual Learning
Youngeun Kim, Yuhang Li, Priyadarshini Panda
LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors
Saksham Suri, Matthew Walmer, Kamal Gupta et al.
Emergent Visual-Semantic Hierarchies in Image-Text Representations
Morris Alper, Hadar Averbuch-Elor
Motion-prior Contrast Maximization for Dense Continuous-Time Motion Estimation
Friedhelm Hamann, Ziyun Wang, Ioannis Asmanis et al.
AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking
Yuheng Li, Tianyu Luan, Yizhou Wu et al.
E.T. the Exceptional Trajectory: Text-to-camera-trajectory generation with character awareness
Robin Courant, Nicolas Dufour, Xi Wang et al.
UniM2AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving
Jian Zou, Tianyu Huang, Guanglei Yang et al.
Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance
Tien Toan Nguyen, Minh Nhat Vu, Baoru Huang et al.
M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models
Seunggeun Chi, Hyung-gun Chi, Hengbo Ma et al.
Context Diffusion: In-Context Aware Image Generation
Ivona Najdenkoska, Animesh Sinha, Abhimanyu Dubey et al.
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
Brian Gordon, Yonatan Bitton, Yonatan Shafir et al.
PARE-Net: Position-Aware Rotation-Equivariant Networks for Robust Point Cloud Registration
Runzhao Yao, Shaoyi Du, Wenting Cui et al.
Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision
Hao Dong, Eleni Chatzi, Olga Fink
MSD: A Benchmark Dataset for Floor Plan Generation of Building Complexes
Casper van Engelenburg, Fatemeh Mostafavi, Emanuel Kuhn et al.
City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web
Kaiwen Song, Xiaoyi Zeng, Chenqu Ren et al.
Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning
Yibing Wei, Abhinav Gupta, Pedro Morgado
Taming Latent Diffusion Model for Neural Radiance Field Inpainting
Chieh Lin, Changil Kim, Jia-Bin Huang et al.
TCLC-GS: Tightly Coupled LiDAR-Camera Gaussian Splatting for Autonomous Driving
Cheng Zhao, Su Sun, Ruoyu Wang et al.
Towards Multi-modal Transformers in Federated Learning
Guangyu Sun, Matias Mendieta, Aritra Dutta et al.
Controllable Navigation Instruction Generation with Chain of Thought Prompting
Xianghao Kong, Jinyu Chen, Wenguan Wang et al.
Explicitly Guided Information Interaction Network for Cross-modal Point Cloud Completion
Hang Xu, Chen Long, Wenxiao Zhang et al.
Learning with Counterfactual Explanations for Radiology Report Generation
Mingjie Li, Haokun Lin, Liang Qiu et al.
Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis
Qian Chen, Shihao Shu, Xiangzhi Bai
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
Shihao Zhao, Shaozhe Hao, Bojia Zi et al.
Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models
Yu-Chu Yu, Chi-Pin Huang, Jr-Jen Chen et al.
The Hard Positive Truth about Vision-Language Compositionality
Amita Kamath, Cheng-Yu Hsieh, Kai-Wei Chang et al.
Close, But Not There: Boosting Geographic Distance Sensitivity in Visual Place Recognition
Sergio Izquierdo, Javier Civera
RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation
Luis Li, Hubert P. H. Shum, Toby P Breckon
PatchRefiner: Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation
Zhenyu Li, Shariq Farooq Bhat, Peter Wonka
EMIE-MAP: Large-Scale Road Surface Reconstruction Based on Explicit Mesh and Implicit Encoding
Wenhua Wu, Qi Wang, Guangming Wang et al.
LookupViT: Compressing visual information to a limited number of tokens
Rajat Koner, Gagan Jain, Sujoy Paul et al.
Image Demoireing in RAW and sRGB Domains
Shuning Xu, Binbin Song, Xiangyu Chen et al.
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction
Penghui Du, Yu Wang, Yifan Sun et al.
Auto-GAS: Automated Proxy Discovery for Training-free Generative Architecture Search
Lujun Li, Haosen Sun, Shiwen Li et al.
Beyond MOT: Semantic Multi-Object Tracking
Yunhao Li, Qin Li, Hao Wang et al.
Diffusion Bridges for 3D Point Cloud Denoising
Mathias Vogel, Keisuke Tateno, Marc Pollefeys et al.
Tri²-plane: Thinking Head Avatar via Feature Pyramid
Luchuan Song, Pinxin Liu, Lele Chen et al.
A Simple Background Augmentation Method for Object Detection with Diffusion Model
Yuhang Li, Xin Dong, Chen Chen et al.
TP2O: Creative Text Pair-to-Object Generation using Balance Swap-Sampling
Jun Li, Zedong Zhang, Jian Yang
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni, Yulin Wang, Renping Zhou et al.
MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration
Yulin Ren, Xin Li, Bingchen Li et al.
TimeLens-XL: Real-time Event-based Video Frame Interpolation with Large Motion
Shi Guo, Yutian Chen, Tianfan Xue et al.
DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition
Qi Wang, Zhou Xu, Yuming Lin et al.
Teach CLIP to Develop a Number Sense for Ordinal Regression
Yao Du, Qiang Zhai, Weihang Dai et al.
I Can't Believe It's Not Scene Flow!
Ishan Khatri, Kyle Vedder, Neehar Peri et al.
TransCAD: A Hierarchical Transformer for CAD Sequence Inference from Point Clouds
Dupont Elona, Kseniya Cherenkova, Dimitrios Mallis et al.
ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization
Yixin Yang, Jiangxin Dong, Jinhui Tang et al.
SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis
Hanrong Ye, Jason Wen Yong Kuen, Qing Liu et al.
Improving Text-guided Object Inpainting with Semantic Pre-inpainting
Yifu Chen, Jingwen Chen, Yingwei Pan et al.
Just a Hint: Point-Supervised Camouflaged Object Detection
Huafeng Chen, Dian Shao, Guangqian Guo et al.
Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance
Reyhane Askari Hemmat, Melissa Hall, Alicia Yi Sun et al.
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
Wouter Van Gansbeke, Bert De Brabandere
Visual Text Generation in the Wild
Yuanzhi Zhu, Jiawei Liu, Feiyu Gao et al.
CLIP-Guided Generative Networks for Transferable Targeted Adversarial Attacks
Hao Fang, Jiawei Kong, Bin Chen et al.
Large-Scale Multi-Hypotheses Cell Tracking Using Ultrametric Contours Maps
Jordao Bragantini, Merlin Lange, Loïc A Royer
Learning Camouflaged Object Detection from Noisy Pseudo Label
Jin Zhang, Ruiheng Zhang, Yanjiao Shi et al.
FRDiff: Feature Reuse for Universal Training-free Acceleration of Diffusion Models
Junhyuk So, Jungwon Lee, Eunhyeok Park
Reinforcement Learning Friendly Vision-Language Model for Minecraft
Haobin Jiang, Junpeng Yue, Hao Luo et al.
LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation
Ruida Zhang, Ziqin Huang, Gu Wang et al.
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
Guian Fang, Wenbiao Yan, Yuanfan Guo et al.
SUMix: Mixup with Semantic and Uncertain Information
Huafeng Qin, Xin Jin, Hongyu Zhu et al.
Instant 3D Human Avatar Generation using Image Diffusion Models
Nikos Kolotouros, Thiemo Alldieck, Enric Corona et al.
VEON: Vocabulary-Enhanced Occupancy Prediction
Jilai Zheng, Pin Tang, Zhongdao Wang et al.
DG-PIC: Domain Generalized Point-In-Context Learning for Point Cloud Understanding
Jincen Jiang, Qianyu Zhou, Yuhang Li et al.
UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection
Yingsen Zeng, Yujie Zhong, Chengjian Feng et al.
CardiacNet: Learning to Reconstruct Abnormalities for Cardiac Disease Assessment from Echocardiogram Videos
Jiewen Yang, Yiqun Lin, Bin Pu et al.
Textual Knowledge Matters: Cross-Modality Co-Teaching for Generalized Visual Class Discovery
Haiyang Zheng, Pu Nan, Wenjing Li et al.
Event Camera Data Dense Pre-training
Yan Yang, Liyuan Pan, Liu Liu
PLOT: Text-based Person Search with Part Slot Attention for Corresponding Part Discovery
Jicheol Park, Dongwon Kim, Boseung Jeong et al.
Animal Avatars: Reconstructing Animatable 3D Animals from Casual Videos
Remy Sabathier, David Novotny, Niloy Mitra
SAFNet: Selective Alignment Fusion Network for Efficient HDR Imaging
Lingtong Kong, Bo Li, Yike Xiong et al.