Most Cited ECCV "generative enhancement model" Papers
GRAPE: Generalizable and Robust Multi-view Facial Capture
Jing Li, Di Kang, Zhenyu He
Forecasting Future Videos from Novel Views via Disentangled 3D Scene Representation
Sudhir Kumar Reddy Yarram, Junsong Yuan
Diverse Text-to-3D Synthesis with Augmented Text Embedding
Uy Tran, Minh N. Hoang Luu, Phong Nguyen et al.
Dependency-aware Differentiable Neural Architecture Search
Buang Zhang, Xinle Wu, Hao Miao et al.
ADen: Adaptive Density Representations for Sparse-view Camera Pose Estimation
Hao Tang, Weiyao Wang, Pierre Gleize et al.
Diffusion-Refined VQA Annotations for Semi-Supervised Gaze Following
Qiaomu Miao, Alexandros Graikos, Jingwei Zhang et al.
Group Testing for Accurate and Efficient Range-Based Near Neighbor Search for Plagiarism Detection
Harsh Shah, Kashish Mittal, Ajit Rajwade
Open-set Domain Adaptation via Joint Error based Multi-class Positive and Unlabeled Learning
Dexuan Zhang, Thomas Westfechtel, Tatsuya Harada
Semantic-guided Robustness Tuning for Few-Shot Transfer Across Extreme Domain Shift
Kangyu Xiao, Zilei Wang, Junjie Li
REFRAME: Reflective Surface Real-Time Rendering for Mobile Devices
Chaojie Ji, Yufeng Li, Yiyi Liao
Human Motion Forecasting in Dynamic Domain Shifts: A Homeostatic Continual Test-time Adaptation Framework
Qiongjie Cui, Huaijiang Sun, Bin Li et al.
Rethinking Unsupervised Outlier Detection via Multiple Thresholding
Zhonghang Liu, Panzhong Lu, Guoyang Xie et al.
BaSIC: BayesNet Structure Learning for Computational Scalable Neural Image Compression
Yufeng Zhang, Hang Yu, Shizhan Liu et al.
Multimodal Label Relevance Ranking via Reinforcement Learning
Taian Guo, Taolin Zhang, Haoqian Wu et al.
Compositional Substitutivity of Visual Reasoning for Visual Question Answering
Chuanhao Li, Zhen Li, Chenchen Jing et al.
CriSp: Leveraging Tread Depth Maps for Enhanced Crime-Scene Shoeprint Matching
Samia Shafique, Shu Kong, Charless Fowlkes
Learning Anomalies with Normality Prior for Unsupervised Video Anomaly Detection
Haoyue Shi, Le Wang, Sanping Zhou et al.
Möbius Transform for Mitigating Perspective Distortions in Representation Learning
Prakash Chandra Chhipa, Meenakshi Subhash Chippa, Kanjar De et al.
Efficient Pre-training for Localized Instruction Generation of Procedural Videos
Anil Batra, Davide Moltisanti, Laura Sevilla-Lara et al.
AID-AppEAL: Automatic Image Dataset and Algorithm for Content Appeal Enhancement and Assessment Labeling
Sherry Chen, Yaron Vaxman, Elad Ben Baruch et al.
GOEmbed: Gradient Origin Embeddings for Representation Agnostic 3D Feature Learning
Animesh Karnewar, Roman Shapovalov, Tom Monnier et al.
Leveraging Imperfect Restoration for Data Availability Attack
Yi Huang, Jeremy Styborski, Mingzhi Lyu et al.
Aligning Neuronal Coding of Dynamic Visual Scenes with Foundation Vision Models
Rining Wu, Feixiang Zhou, Ziwei Yin et al.
Efficient Snapshot Spectral Imaging: Calibration-Free Parallel Structure with Aperture Diffraction Fusion
Tao Lv, Lihao Hu, Shiqiao Li et al.
Adaptive Selection of Sampling-Reconstruction in Fourier Compressed Sensing
Seongmin Hong, Jaehyeok Bae, Jongho Lee et al.
Commonly Interesting Images
Fitim Abdullahu, Helmut Grabner
Refine, Discriminate and Align: Stealing Encoders via Sample-Wise Prototypes and Multi-Relational Extraction
Shuchi Wu, Chuan Ma, Kang Wei et al.
Mutual Learning for Acoustic Matching and Dereverberation via Visual Scene-driven Diffusion
Jian Ma, Wenguan Wang, Yi Yang et al.
Learning Non-Linear Invariants for Unsupervised Out-of-Distribution Detection
Lars Doorenbos, Raphael Sznitman, Pablo Márquez Neila
Neural Poisson Solver: A Universal and Continuous Framework for Natural Signal Blending
Delong Wu, Hao Zhu, Qi Zhang et al.
FedHARM: Harmonizing Model Architectural Diversity in Federated Learning
Anestis Kastellos, Athanasios Psaltis, Charalampos Z Patrikakis et al.
Cut out the Middleman: Revisiting Pose-based Gait Recognition
Yang Fu, Saihui Hou, Shibei Meng et al.
DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields
Yu Chi, Fangneng Zhan, Sibo Wu et al.
Linking in Style: Understanding learned features in deep learning models
Maren Wehrheim, Pamela Osuna Vargas, Matthias Kaschube
Computing the Lipschitz constant needed for fast scene recovery from CASSI measurements
Niels Chr. Overgaard, Anders Holst
Factorizing Text-to-Video Generation by Explicit Image Conditioning
Rohit Girdhar, Mannat Singh, Andrew Brown et al.
Accelerating Image Super-Resolution Networks with Pixel-Level Classification
Jinho Jeong, Jinwoo Kim, Younghyun Jo et al.
Embodied Understanding of Driving Scenarios
Yunsong Zhou, Linyan Huang, Qingwen Bu et al.
Instant Uncertainty Calibration of NeRFs Using a Meta-Calibrator
Niki Amini-Naieni, Tomas Jakab, Andrea Vedaldi et al.
SHIC: Shape-Image Correspondences with no Keypoint Supervision
Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios
Wenhao Ding, Yulong Cao, Ding Zhao et al.
Improving Adversarial Transferability via Model Alignment
Avery Ma, Amir-massoud Farahmand, Yangchen Pan et al.
SceneTeller: Language-to-3D Scene Generation
Basak Melis Ocal, Maxim Tatarchenko, Sezer Karaoglu et al.
MagMax: Leveraging Model Merging for Seamless Continual Learning
Daniel Marczak, Bartlomiej Twardowski, Tomasz Trzcinski et al.
SPHINX: A Mixer of Weights, Visual Embeddings and Image Scales for Multi-modal Large Language Models
Ziyi Lin, Dongyang Liu, Renrui Zhang et al.
Diffusion Models as Data Mining Tools
Ioannis Siglidis, Aleksander Holynski, Alexei Efros et al.
Debiasing surgeon: fantastic weights and how to find them
Remi Nahon, Ivan Luiz De Moura Matos, Van-Tam Nguyen et al.
FoundPose: Unseen Object Pose Estimation with Foundation Features
Evin Pınar Örnek, Yann Labbé, Bugra Tekin et al.
Spline-based Transformers
Prashanth Chandran, Agon Serifi, Markus Gross et al.
M^2Depth: Self-supervised Two-Frame Multi-camera Metric Depth Estimation
Yingshuang Zou, Yikang Ding, Xi Qiu et al.
On Calibration of Object Detectors: Pitfalls, Evaluation and Baselines
Selim Kuzucu, Kemal Oksuz, Jonathan Sadeghi et al.
Efficient NeRF Optimization - Not All Samples Remain Equally Hard
Juuso Korhonen, Goutham Rangu, Hamed Rezazadegan Tavakoli et al.
FMBoost: Boosting Latent Diffusion with Flow Matching
Johannes Schusterbauer-Fischer, Ming Gui, Pingchuan Ma et al.
CLIP-DINOiser: Teaching CLIP a few DINO tricks for open-vocabulary semantic segmentation
Monika Wysoczanska, Oriane Siméoni, Michaël Ramamonjisoa et al.
Semi-Supervised Video Desnowing Network via Temporal Decoupling Experts and Distribution-Driven Contrastive Regularization
Hongtao Wu, Yijun Yang, Angelica I Aviles-Rivero et al.
Two-Stage Video Shadow Detection via Temporal-Spatial Adaption
Xin Duan, Yu Cao, Lei Zhu et al.
Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
Yifan Pu, Xia Zhuofan, Jiayi Guo et al.
A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis
Xiang Liu, Zhaoxiang Liu, Huan Hu et al.
Missing Modality Prediction for Unpaired Multimodal Learning via Joint Embedding of Unimodal Models
Taesup Kim, Donggeun Kim
IVTP: Instruction-guided Visual Token Pruning for Large Vision-Language Models
Kai Huang, Hao Zou, Ye Xi et al.
Holodepth: Programmable Depth-Varying Projection via Computer-Generated Holography
Dorian Chan, Matthew O'Toole, Sizhuo Ma et al.
Situated Instruction Following
So Yeon Min, Xavier Puig, Devendra Singh Chaplot et al.
Fast Diffusion-Based Counterfactuals for Shortcut Removal and Generation
Nina Weng, Paraskevas Pegios, Eike Petersen et al.
Pick-a-back: Selective Device-to-Device Knowledge Transfer in Federated Continual Learning
JinYi Yoon, HyungJune Lee
Knowledge-enhanced Visual-Language Pretraining for Computational Pathology
Xiao Zhou, Xiaoman Zhang, Chaoyi Wu et al.
UAV First-Person Viewers Are Radiance Field Learners
Liqi Yan, Qifan Wang, Junhan Zhao et al.
Occupancy as Set of Points
Yiang Shi, Tianheng Cheng, Qian Zhang et al.
VideoStudio: Generating Consistent-Content and Multi-Scene Videos
Fuchen Long, Zhaofan Qiu, Ting Yao et al.
Probabilistic Image-Driven Traffic Modeling via Remote Sensing
Scott Workman, Armin Hadzic
GroCo: Ground Constraint for Metric Self-Supervised Monocular Depth
Aurélien Cecille, Stefan Duffner, Franck Davoine et al.
EMIE-MAP: Large-Scale Road Surface Reconstruction Based on Explicit Mesh and Implicit Encoding
Wenhua Wu, Qi Wang, Guangming Wang et al.
OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving
Guoqing Wang, Zhongdao Wang, Pin Tang et al.
AttentionHand: Text-driven Controllable Hand Image Generation for 3D Hand Reconstruction in the Wild
Junho Park, Kyeongbo Kong, Suk-Ju Kang
HyperSpaceX: Radial and Angular Exploration of HyperSpherical Dimensions
Chiranjeev Chiranjeev, Muskan Dosi, Kartik Thakral et al.
ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction
Shaozhe Hao, Kai Han, Zhengyao Lv et al.
MotionDirector: Motion Customization of Text-to-Video Diffusion Models
Rui Zhao, Yuchao Gu, Jay Zhangjie Wu et al.
Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation
Zhihang Zhong, Gurunandan Krishnan, Xiao Sun et al.
SCPNet: Unsupervised Cross-modal Homography Estimation via Intra-modal Self-supervised Learning
Runmin Zhang, Jun Ma, Lun Luo et al.
Distributed Active Client Selection With Noisy Clients Using Model Association Scores
Kwang In Kim
SimPB: A Single Model for 2D and 3D Object Detection from Multiple Cameras
Yingqi Tang, Zhaotie Meng, Guoliang Chen et al.
Responsible Visual Editing
Minheng Ni, Yeli Shen, Yabin Zhang et al.
Assessing Sample Quality via the Latent Space of Generative Models
Jingyi Xu, Hieu Le, Dimitris Samaras
Learning Unsigned Distance Functions from Multi-view Images with Volume Rendering Priors
Wen Yuan Zhang, Kanle Shi, Yushen Liu et al.
Free-ATM: Harnessing Free Attention Masks for Representation Learning on Diffusion-Generated Images
Junhao Zhang, Mutian Xu, Jay Zhangjie Wu et al.
ReCON: Training-Free Acceleration for Text-to-Image Synthesis with Retrieval of Concept Prompt Trajectories
Chen-yi Lu, Shubham Agarwal, Mehrab Tanjim et al.
Bridging the Gap Between Human Motion and Action Semantics via Kinematics Phrases
Xinpeng Liu, Yong-Lu Li, Ailing Zeng et al.
Common Sense Reasoning for Deep Fake Detection
Yue Zhang, Ben Colman, Xiao Guo et al.
Tight and Efficient Upper Bound on Spectral Norm of Convolutional Layers
Ekaterina Grishina, Mikhail Gorbunov, Maxim Rakhuba
Deciphering the Role of Representation Disentanglement: Investigating Compositional Generalization in CLIP Models
Reza Abbasi, Mohammad Rohban, Mahdieh Soleymani Baghshah
Merlin: Empowering Multimodal LLMs with Foresight Minds
En Yu, Liang Zhao, Yana Wei et al.
EINet: Point Cloud Completion via Extrapolation and Interpolation
Pingping Cai, Canyu Zhang, Lingjia Shi et al.
Robustness Tokens: Towards Adversarial Robustness of Transformers
Brian Pulfer, Yury Belousov, Slava Voloshynovskiy
Event-based Head Pose Estimation: Benchmark and Method
Jiahui Yuan, Hebei Li, Yansong Peng et al.
Rethinking Fast Adversarial Training: A Splitting Technique To Overcome Catastrophic Overfitting
Masoumeh Zareapoor, Pourya Shamsolmoali
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
Haobo Yuan, Xiangtai Li, Chong Zhou et al.
AdaIFL: Adaptive Image Forgery Localization via a Dynamic and Importance-aware Transformer Network
Yuxi Li, Fuyuan Cheng, Wangbo Yu et al.
GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition
Ruijie Yao, Sheng Jin, Lumin Xu et al.
Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation
Hyunwoo Yu, Yubin Cho, Beoungwoo Kang et al.
Phase Concentration and Shortcut Suppression for Weakly Supervised Semantic Segmentation
Hoyong Kwon, Jaeseok Jeong, Sung-Hoon Yoon et al.
StableDrag: Stable Dragging for Point-based Image Editing
Yutao Cui, Xiaotong Zhao, Guozhen Zhang et al.
Finding Meaning in Points: Weakly Supervised Semantic Segmentation for Event Cameras
Hoonhee Cho, Sung-Hoon Yoon, Hyeokjun Kweon et al.
Syn-to-Real Domain Adaptation for Point Cloud Completion via Part-based Approach
Yunseo Yang, Jihun Kim, Kuk-Jin Yoon
DualDn: Dual-domain Denoising via Differentiable ISP
Ruikang Li, Yujin Wang, Shiqi Chen et al.
Object-Aware NIR-to-Visible Translation
Yunyi Gao, Lin Gu, Qiankun Liu et al.
MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model
Muyao Niu, Xiaodong Cun, Xintao Wang et al.
RING-NeRF: Rethinking Inductive Biases for Versatile and Efficient Neural Fields
Doriand Petit, Steve Bourgeois, Dumitru Pavel et al.
StereoGlue: Joint Feature Matching and Robust Estimation
Daniel Barath, Dmytro Mishkin, Luca Cavalli et al.
To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Unsafe Images ... For Now
Yimeng Zhang, Jinghan Jia, Xin Chen et al.
Combining Generative and Geometry Priors for Wide-Angle Portrait Correction
Lan Yao, Chaofeng Chen, Xiaoming Li et al.
Regulating Model Reliance on Non-Robust Features by Smoothing Input Marginal Density
Peiyu Yang, Naveed Akhtar, Shah Mubarak et al.
Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors
Wei Shang, Dongwei Ren, Wanying Zhang et al.
Textual Query-Driven Mask Transformer for Domain Generalized Segmentation
Byeonghyun Pak, Byeongju Woo, Sunghwan Kim et al.
Align before Collaborate: Mitigating Feature Misalignment for Robust Multi-Agent Perception
Dingkang Yang, Ke Li, Dongling Xiao et al.
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Shilong Liu, Hao Cheng, Haotian Liu et al.
DISCO: Embodied Navigation and Interaction via Differentiable Scene Semantics and Dual-level Control
Xinyu Xu, Shengcheng Luo, Yanchao Yang et al.
ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling
Siming Yan, Min Bai, Weifeng Chen et al.
Learning Scalable Model Soup on a Single GPU: An Efficient Subspace Training Strategy
Tao Li, Weisen Jiang, Fanghui Liu et al.
Post-training Quantization with Progressive Calibration and Activation Relaxing for Text-to-Image Diffusion Models
Siao Tang, Xin Wang, Hong Chen et al.
KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding
Zhihao Xu, Shengjie Gong, Jiapeng Tang et al.
Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360°
Yuxiao He, Yiyu Zhuang, Yanwen Wang et al.
Deep Companion Learning: Enhancing Generalization Through Historical Consistency
Ruizhao Zhu, Venkatesh Saligrama
Unveiling Privacy Risks in Stochastic Neural Networks Training: Effective Image Reconstruction from Gradients
Yiming Chen, Xiangyu Yang, Nikos Deligiannis
Straightforward Layer-wise Pruning for More Efficient Visual Adaptation
Ruizi Han, Jinglei Tang
ABC Easy as 123: A Blind Counter for Exemplar-Free Multi-Class Class-agnostic Counting
Michael A Hobley, Victor Adrian Prisacariu
CrossScore: A Multi-View Approach to Image Evaluation and Scoring
Zirui Wang, Wenjing Bian, Victor Adrian Prisacariu
CPM: Class-conditional Prompting Machine for Audio-visual Segmentation
Yuanhong Chen, Chong Wang, Yuyuan Liu et al.
DiffClass: Diffusion-Based Class Incremental Learning
Zichong Meng, Jie Zhang, Changdi Yang et al.
Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning
Jiahao Xiao, Ming-Kun Xie, Heng-Bo Fan et al.
AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization
Shixiong Xu, Chenghao Zhang, Lubin Fan et al.
SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation
Lingchen Meng, Shiyi Lan, Hengduo Li et al.
Tiny Models are the Computational Saver for Large Models
Qingyuan Wang, Barry Cardiff, Antoine Frappé et al.
Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling
Jaehyeok Kim, Dongyoon Wee, Dan Xu
DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing
Minghao Chen, Iro Laina, Andrea Vedaldi
Handling The Non-Smooth Challenge in Tensor SVD: A Multi-Objective Tensor Recovery Framework
Jingjing Zheng, Wanglong Lu, Wenzhe Wang et al.
Bayesian Self-Training for Semi-Supervised 3D Segmentation
Ozan Unal, Christos Sakaridis, Luc Van Gool
DMiT: Deformable Mipmapped Tri-Plane Representation for Dynamic Scenes
Jing-Wen Yang, Jia-Mu Sun, Yong-Liang Yang et al.
Analysis-by-Synthesis Transformer for Single-View 3D Reconstruction
Dian Jia, Xiaoqian Ruan, Kun Xia et al.
3D Gaussian Parametric Head Model
Yuelang Xu, Lizhen Wang, Zerong Zheng et al.
Dynamic Neural Radiance Field From Defocused Monocular Video
Xianrui Luo, Huiqiang Sun, Juewen Peng et al.
WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation Models
Xinjian Wu, Ruisong Zhang, Jie Qin et al.
4Diff: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation
Feng Cheng, Mi Luo, Huiyu Wang et al.
SNeRV: Spectra-preserving Neural Representation for Video
Jina Kim, Jihoo Lee, Jewon Kang
GMT: Enhancing Generalizable Neural Rendering via Geometry-Driven Multi-Reference Texture Transfer
Youngho Yoon, Hyun-Kurl Jang, Kuk-Jin Yoon
Realistic Human Motion Generation with Cross-Diffusion Models
Zeping Ren, Shaoli Huang, Xiu Li
Hierarchical Unsupervised Relation Distillation for Source Free Domain Adaptation
Bowei Xing, Xianghua Ying, Ruibin Wang et al.
Labeled Data Selection for Category Discovery
Bingchen Zhao, Nico Lang, Serge Belongie et al.
UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model
Xiangyu Fan, Jiaqi Li, Zhiqian Lin et al.
HERGen: Elevating Radiology Report Generation with Longitudinal Data
Fuying Wang, Shenghui Du, Lequan Yu
Mew: Multiplexed Immunofluorescence Image Analysis through an Efficient Multiplex Network
Sukwon Yun, Jie Peng, Alexandro E Trevino et al.
Understanding and Mitigating Human-Labelling Errors in Supervised Contrastive Learning
Zijun Long, Lipeng Zhuang, George W Killick et al.
PartCraft: Crafting Creative Objects by Parts
Kam Woh Ng, Xiatian Zhu, Yi-Zhe Song et al.
Causality-inspired Discriminative Feature Learning in Triple Domains for Gait Recognition
Haijun Xiong, Bin Feng, Xinggang Wang et al.
Neural Spectral Decomposition for Dataset Distillation
Yang Shaolei, Shen Cheng, Mingbo Hong et al.
AnyControl: Create Your Artwork with Versatile Control on Text-to-Image Generation
Sun Yanan, Yanchen Liu, Yinhao Tang et al.
The Sky's the Limit: Relightable Outdoor Scenes via a Sky-pixel Constrained Illumination Prior and Outside-In Visibility
James Gardner, Evgenii Kashin, Bernhard Egger et al.
Nonverbal Interaction Detection
Jianan Wei, Tianfei Zhou, Yi Yang et al.
HiEI: A Universal Framework for Generating High-quality Emerging Images from Natural Images
Jingmeng Li, Lukang Fu, Surun Yang et al.
Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning
Fanyue Wei, Wei Zeng, Zhenyang Li et al.
Label-free Neural Semantic Image Synthesis
Jiayi Wang, Kevin Alexander Laube, Yumeng Li et al.
Track Everything Everywhere Fast and Robustly
Yunzhou Song, Jiahui Lei, Ziyun Wang et al.
MERLiN: Single-Shot Material Estimation and Relighting for Photometric Stereo
Ashish Tiwari, Satoshi Ikehata, Shanmuganathan Raman
HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects
Xintao Lv, Liang Xu, Yichao Yan et al.
GRA: Detecting Oriented Objects through Group-wise Rotating and Attention
Jiangshan Wang, Yifan Pu, Yizeng Han et al.
Data Overfitting for On-Device Super-Resolution with Dynamic Algorithm and Compiler Co-Design
Li, Zhihao Shu, Jie Ji et al.
BI-MDRG: Bridging Image History in Multimodal Dialogue Response Generation
Hee Suk Yoon, Eunseop Yoon, Joshua Tian Jin Tee et al.
PRET: Planning with Directed Fidelity Trajectory for Vision and Language Navigation
Renjie Lu, Jing-Ke Meng, Weishi Zheng
Rethinking Few-shot Class-incremental Learning: Learning from Yourself
Yu-Ming Tang, Yi-Xing Peng, Jing-Ke Meng et al.
Idling Neurons, Appropriately Lenient Workload During Fine-tuning Leads to Better Generalization
Hongjing Niu, Hanting Li, Bin Li et al.
SignGen: End-to-End Sign Language Video Generation with Latent Diffusion
Fan Qi, Yu Duan, Changsheng Xu et al.
Improving image synthesis with diffusion-negative sampling
Alakh Desai, Nuno Vasconcelos
Length-Aware Motion Synthesis via Latent Diffusion
Alessio Sampieri, Alessio Palma, Indro Spinelli et al.
Long-CLIP: Unlocking the Long-Text Capability of CLIP
Beichen Zhang, Pan Zhang, Xiaoyi Dong et al.
RoGUENeRF: A Robust Geometry-Consistent Universal Enhancer for NeRF
Sibi Catley-Chandar, Richard Shaw, Greg Slabaugh et al.
Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models
Chao Gong, Kai Chen, Zhipeng Wei et al.
FuseTeacher: Modality-fused Encoders are Strong Vision Supervisors
Chen-Wei Xie, Siyang Sun, Liming Zhao et al.
MVDD: Multi-View Depth Diffusion Models
Zhen Wang, Qiangeng Xu, Feitong Tan et al.
WRIM-Net: Wide-Ranging Information Mining Network for Visible-Infrared Person Re-Identification
Yonggan Wu, Ling-Chao Meng, Yuan Zichao et al.
HARIVO: Harnessing Text-to-Image Models for Video Generation
Mingi Kwon, Seoung Wug Oh, Yang Zhou et al.
CatchBackdoor: Backdoor Detection via Critical Trojan Neural Path Fuzzing
Haibo Jin, Ruoxi Chen, Jinyin Chen et al.
Scissorhands: Scrub Data Influence via Connection Sensitivity in Networks
Jing Wu, Mehrtash Harandi
Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge
Haibo Wang, Weifeng Ge
Learning with Counterfactual Explanations for Radiology Report Generation
Mingjie Li, Haokun Lin, Liang Qiu et al.
Pseudo-Embedding for Generalized Few-Shot Point Cloud Segmentation
Chih-Jung Tsai, Hwann-Tzong Chen, Tyng-Luh Liu
Wavelet Convolutions for Large Receptive Fields
Shahaf Finder, Roy Amoyal, Eran Treister et al.
AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
Zhuguanyu Wu, Jiaxin Chen, Hanwen Zhong et al.
Gradient-based Out-of-Distribution Detection
Taha Entesari, Sina Sharifi, Bardia Safaei et al.
Veil Privacy on Visual Data: Concealing Privacy for Humans, Unveiling for DNNs
Shuchao Pang, Ruhao Ma, Bing Li et al.
DriveLM: Driving with Graph Visual Question Answering
Chonghao Sima, Katrin Renz, Kashyap Chitta et al.
Simple Unsupervised Knowledge Distillation With Space Similarity
Aditya Singh, Haohan Wang
DIFFender: Diffusion-Based Adversarial Defense against Patch Attacks
Caixin Kang, Yinpeng Dong, Zhengyi Wang et al.
PointRegGPT: Boosting 3D Point Cloud Registration using Generative Point-Cloud Pairs for Training
Suyi Chen, Hao Xu, Haipeng Li et al.
Learning Natural Consistency Representation for Face Forgery Video Detection
Daichi Zhang, Zihao Xiao, Shikun Li et al.
MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation
Yuxiang Wei, Zhilong Ji, Jinfeng Bai et al.
Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning
Meixuan Li, Tianyu Li, Guoqing Wang et al.
HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models
Shen Zhang, Zhaowei Chen, Zhenyu Zhao et al.
Diffusion Models as Optimizers for Efficient Planning in Offline RL
Renming Huang, Yunqiang Pei, Guoqing Wang et al.
Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights
Shunqi Mao, Chaoyi Zhang, Hang Su et al.
Attention Decomposition for Cross-Domain Semantic Segmentation
Liqiang He, Sinisa Todorovic
FYI: Flip Your Images for Dataset Distillation
Byunggwan Son, Youngmin Oh, Donghyeon Baek et al.
View-Consistent 3D Editing with Gaussian Splatting
Yuxuan Wang, Xuanyu Yi, Zike Wu et al.