Most Cited ECCV "training-data free generation" Papers
HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models
Shen Zhang, Zhaowei CHEN, Zhenyu Zhao et al.
Imaging Interiors: An Implicit Solution to Electromagnetic Inverse Scattering Problems
Ziyuan Luo, Boxin Shi, Haoliang Li et al.
EgoBody3M: Egocentric Body Tracking on a VR Headset using a Diverse Dataset
Amy Zhao, Chengcheng Tang, Lezi Wang et al.
StableDrag: Stable Dragging for Point-based Image Editing
Yutao Cui, Xiaotong Zhao, Guozhen Zhang et al.
MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation
Yuxiang WEI, Zhilong Ji, Jinfeng Bai et al.
PointRegGPT: Boosting 3D Point Cloud Registration using Generative Point-Cloud Pairs for Training
SUYI CHEN, Hao Xu, Haipeng Li et al.
DIFFender: Diffusion-Based Adversarial Defense against Patch Attacks
Caixin Kang, Yinpeng Dong, Zhengyi Wang et al.
Rethinking Data Bias: Dataset Copyright Protection via Embedding Class-wise Hidden Bias
Jinhyeok Jang, ByungOk Han, Jaehong Kim et al.
Visual Relationship Transformation
Xiaoyu Xu, Jiayan Qiu, Baosheng Yu et al.
Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge
Haibo Wang, Weifeng Ge
HARIVO: Harnessing Text-to-Image Models for Video Generation
Mingi Kwon, Seoung Wug Oh, Yang Zhou et al.
Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models
Chao Gong, Kai Chen, Zhipeng Wei et al.
Length-Aware Motion Synthesis via Latent Diffusion
Alessio Sampieri, Alessio Palma, Indro Spinelli et al.
Clean & Compact: Efficient Data-Free Backdoor Defense with Model Compactness
Huy Phan, Jinqi Xiao, Yang Sui et al.
Evaluating the Adversarial Robustness of Semantic Segmentation: Trying Harder Pays Off
Levente Ferenc Halmosi, Bálint Mohos, Márk Jelasity
SignGen: End-to-End Sign Language Video Generation with Latent Diffusion
Fan Qi, Yu Duan, Changsheng Xu et al.
GRA: Detecting Oriented Objects through Group-wise Rotating and Attention
Jiangshan Wang, Yifan Pu, Yizeng Han et al.
Label-free Neural Semantic Image Synthesis
Jiayi Wang, Kevin Alexander Laube, Yumeng Li et al.
Causal Subgraphs and Information Bottlenecks: Redefining OOD Robustness in Graph Neural Networks
Weizhi An, Wenliang Zhong, Feng Jiang et al.
Image-to-Lidar Relational Distillation for Autonomous Driving Data
Anas Mahmoud, Ali Harakeh, Steven Waslander
Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning
Fanyue Wei, Wei Zeng, Zhenyang Li et al.
SemanticHuman-HD: High Resolution Semantic disentangled 3D Human Generation
Peng Zheng, Tao Liu, Zili Yi et al.
HiEI: A Universal Framework for Generating High-quality Emerging Images from Natural Images
Jingmeng Li, Lukang Fu, Surun Yang et al.
The Sky's the Limit: Relightable Outdoor Scenes via a Sky-pixel Constrained Illumination Prior and Outside-In Visibility
James Gardner, Evgenii Kashin, Bernhard Egger et al.
Neural Spectral Decomposition for Dataset Distillation
Yang Shaolei, Shen Cheng, Mingbo Hong et al.
COSMU: Complete 3D human shape from monocular unconstrained images
Marco Pesavento, Marco Volino, Adrian Hilton
Phase Concentration and Shortcut Suppression for Weakly Supervised Semantic Segmentation
Hoyong Kwon, Jaeseok Jeong, Sung-Hoon Yoon et al.
Language-Assisted Skeleton Action Understanding for Skeleton-Based Temporal Action Segmentation
Haoyu Ji, Bowen Chen, Xinglong Xu et al.
HERGen: Elevating Radiology Report Generation with Longitudinal Data
Fuying Wang, Shenghui Du, Lequan Yu
Hierarchical Unsupervised Relation Distillation for Source Free Domain Adaptation
Bowei Xing, Xianghua Ying, Ruibin Wang et al.
GMT: Enhancing Generalizable Neural Rendering via Geometry-Driven Multi-Reference Texture Transfer
Youngho Yoon, Hyun-Kurl Jang, Kuk-Jin Yoon
SNeRV: Spectra-preserving Neural Representation for Video
Jina Kim, Jihoo Lee, Jewon Kang
L-DiffER: Single Image Reflection Removal with Language-based Diffusion Model
Yuchen Hong, Haofeng Zhong, Shuchen Weng et al.
WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation Models
Xinjian Wu, Ruisong Zhang, Jie Qin et al.
Analysis-by-Synthesis Transformer for Single-View 3D Reconstruction
Dian Jia, Xiaoqian Ruan, Kun Xia et al.
DMiT: Deformable Mipmapped Tri-Plane Representation for Dynamic Scenes
Jing-Wen Yang, Jia-Mu Sun, Yong-Liang Yang et al.
A Diffusion Model for Simulation Ready Coronary Anatomy with Morpho-skeletal Control
Karim Kadry, Shreya Gupta, Jonas Sogbadji et al.
Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling
Jaehyeok Kim, Dongyoon Wee, Dan Xu
KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding
Zhihao Xu, Shengjie Gong, Jiapeng Tang et al.
Revisiting Domain-Adaptive Object Detection in Adverse Weather by the Generation and Composition of High-Quality Pseudo-Labels
Rui Zhao, Huibin Yan, Shuoyao Wang
Post-training Quantization with Progressive Calibration and Activation Relaxing for Text-to-Image Diffusion Models
Siao Tang, Xin Wang, Hong Chen et al.
DISCO: Embodied Navigation and Interaction via Differentiable Scene Semantics and Dual-level Control
Xinyu Xu, Shengcheng Luo, Yanchao Yang et al.
Textual Query-Driven Mask Transformer for Domain Generalized Segmentation
Byeonghyun Pak, Byeongju Woo, Sunghwan Kim et al.
Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors
Wei Shang, Dongwei Ren, Wanying Zhang et al.
A Unified Image Compression Method for Human Perception and Multiple Vision Tasks
Sha Guo, Sui Lin, Chen-Lin Zhang et al.
APL: Anchor-based Prompt Learning for One-stage Weakly Supervised Referring Expression Comprehension
Yaxin Luo, Jiayi Ji, Xiaofu Chen et al.
Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation
Hyunwoo Yu, Yubin Cho, Beoungwoo Kang et al.
Combining Generative and Geometry Priors for Wide-Angle Portrait Correction
Lan Yao, Chaofeng Chen, Xiaoming Li et al.
To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Unsafe Images ... For Now
Yimeng Zhang, Jinghan Jia, Xin Chen et al.
To Supervise or Not to Supervise: Understanding and Addressing the Key Challenges of Point Cloud Transfer Learning
Souhail Hadgi, Lei Li, Maks Ovsjanikov
3D Reconstruction of Objects in Hands without Real World 3D Supervision
Aditya Prakash, Matthew Chang, Matthew Jin et al.
Forbes: Face Obfuscation Rendering via Backpropagation Refinement Scheme
Jintae Kim, Seungwon Yang, Seong-Gyun Jeong et al.
SeA: Semantic Adversarial Augmentation for Last Layer Features from Unsupervised Representation Learning
Qi Qian, Yuanhong Xu, JUHUA HU
DualDn: Dual-domain Denoising via Differentiable ISP
Ruikang Li, Yujin Wang, Shiqi Chen et al.
VideoStudio: Generating Consistent-Content and Multi-Scene Videos
Fuchen Long, Zhaofan Qiu, Ting Yao et al.
Localization and Expansion: A Decoupled Framework for Point Cloud Few-shot Semantic Segmentation
Zhaoyang Li, Yuan Wang, Wangkai Li et al.
AdaIFL: Adaptive Image Forgery Localization via a Dynamic and Importance-aware Transformer Network
Yuxi Li, Fuyuan Cheng, Wangbo Yu et al.
Event-based Head Pose Estimation: Benchmark and Method
Jiahui Yuan, Hebei Li, Yansong Peng et al.
CoLA: Conditional Dropout and Language-driven Robust Dual-modal Salient Object Detection
Shuang Hao, Chunlin Zhong, He Tang
Visual Alignment Pre-training for Sign Language Translation
Peiqi Jiao, Yuecong Min, Xilin CHEN
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rui Qian, Shuangrui Ding, Dahua Lin
Siamese Vision Transformers are Scalable Audio-visual Learners
Yan-Bo Lin, Gedas Bertasius
Learning Unsigned Distance Functions from Multi-view Images with Volume Rendering Priors
Wen Yuan Zhang, Kanle Shi, Yushen Liu et al.
Assessing Sample Quality via the Latent Space of Generative Models
Jingyi Xu, Hieu Le, Dimitris Samaras
Responsible Visual Editing
Minheng Ni, Yeli Shen, Yabin Zhang et al.
Consistent 3D Line Mapping
Xulong Bai, Hainan Cui, Shuhan Shen
Physical-Based Event Camera Simulator
Haiqian Han, Jiacheng Lyu, Jianing Li et al.
LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis
Kevin Xie, Tianshi Cao, Jonathan P Lorraine et al.
Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation
Zhihang Zhong, Gurunandan Krishnan, Xiao Sun et al.
Scene-aware Human Motion Forecasting via Mutual Distance Prediction
Chaoyue Xing, Wei Mao, Miaomiao LIU
MotionDirector: Motion Customization of Text-to-Video Diffusion Models
Rui Zhao, Yuchao Gu, Jay Zhangjie Wu et al.
FunQA: Towards Surprising Video Comprehension
Binzhu Xie, Sicheng Zhang, Zitang Zhou et al.
Photon Inhibition for Energy-Efficient Single-Photon Imaging
Lucas Koerner, Shantanu Gupta, Atul N Ingle et al.
OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving
Guoqing Wang, Zhongdao Wang, Pin Tang et al.
Probabilistic Image-Driven Traffic Modeling via Remote Sensing
Scott Workman, Armin Hadzic
Smoothness, Synthesis, and Sampling: Re-thinking Unsupervised Multi-View Stereo with DIV Loss
Alex Rich, Noah Stier, Pradeep Sen et al.
Beyond MOT: Semantic Multi-Object Tracking
Yunhao Li, Qin Li, Hao Wang et al.
UAV First-Person Viewers Are Radiance Field Learners
Liqi Yan, Qifan Wang, Junhan Zhao et al.
Knowledge-enhanced Visual-Language Pretraining for Computational Pathology
Xiao Zhou, Xiaoman Zhang, Chaoyi Wu et al.
Pick-a-back: Selective Device-to-Device Knowledge Transfer in Federated Continual Learning
JinYi Yoon, HyungJune Lee
Situated Instruction Following
So Yeon Min, Xavier Puig, Devendra Singh Chaplot et al.
Curved Diffusion: A Generative Model With Optical Geometry Control
Andrey Voynov, Amir Hertz, Moab Arar et al.
Holodepth: Programmable Depth-Varying Projection via Computer-Generated Holography
Dorian Chan, Matthew O'Toole, Sizhuo Ma et al.
Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
Yifan Pu, Xia Zhuofan, Jiayi Guo et al.
Two-Stage Video Shadow Detection via Temporal-Spatial Adaption
Xin Duan, Yu Cao, Lei Zhu et al.
CLIP-DINOiser: Teaching CLIP a few DINO tricks for open-vocabulary semantic segmentation
Monika Wysoczanska, Oriane Siméoni, Michaël Ramamonjisoa et al.
M^2Depth: Self-supervised Two-Frame Multi-camera Metric Depth Estimation
Yingshuang Zou, Yikang Ding, Xi Qiu et al.
3D Gaussian Parametric Head Model
Yuelang Xu, Lizhen Wang, Zerong Zheng et al.
Improving Adversarial Transferability via Model Alignment
Avery Ma, Amir-massoud Farahmand, Yangchen Pan et al.
RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios
Wenhao Ding, Yulong Cao, DING ZHAO et al.
Information Bottleneck Based Data Correction in Continual Learning
Shuai Chen, Mingyi Zhang, Junge Zhang et al.
Factorizing Text-to-Video Generation by Explicit Image Conditioning
Rohit Girdhar, Mannat Singh, Andrew Brown et al.
REDIR: Refocus-free Event-based De-occlusion Image Reconstruction
Qi Guo, Hailong Shi, Huan Li et al.
Cut out the Middleman: Revisiting Pose-based Gait Recognition
YANG FU, Saihui Hou, Shibei Meng et al.
Fast Registration of Photorealistic Avatars for VR Facial Animation
Chaitanya Patel, Shaojie Bai, Te-Li Wang et al.
Shapefusion: 3D localized human diffusion models
Rolandos Alexandros Potamias, Michael Tarasiou, Stylianos Ploumpis et al.
Frontier-enhanced Topological Memory with Improved Exploration Awareness for Embodied Visual Navigation
Xinru Cui, Qiming Liu, Zhe Liu et al.
Caltech Aerial RGB-Thermal Dataset in the Wild
Connor Lee, Matthew Anderson, Nikhil Ranganathan et al.
Diagnosing and Re-learning for Balanced Multimodal Learning
Yake Wei, Siwei Li, Ruoxuan Feng et al.
MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning
Vishal Nedungadi, Ankit Kariryaa, Stefan Oehmcke et al.
Loc3Diff: Local Diffusion for 3D Human Head Synthesis and Editing
Yushi Lan, Feitong Tan, Qiangeng Xu et al.
Learning to Distinguish Samples for Generalized Category Discovery
Fengxiang Yang, Pu Nan, Wenjing Li et al.
WBP: Training-time Backdoor Attacks through Hardware-based Weight Bit Poisoning
Kunbei Cai, Zhenkai Zhang, Qian Lou et al.
UL-VIO: Ultra-lightweight Visual-Inertial Odometry with Noise Robust Test-time Adaptation
Jinho Park, Se Young Chun, Mingoo Seok
Get Your Embedding Space in Order: Domain-Adaptive Regression for Forest Monitoring
Sizhuo Li, Dimitri Gominski, Martin Brandt et al.
CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs
Yassine Ouali, Adrian Bulat, Brais Martinez et al.
HVCLIP: High-dimensional Vector in CLIP for Unsupervised Domain Adaptation
Noranart Vesdapunt, Kah Kuen Fu, Yue Wu et al.
Improving 3D Semi-supervised Learning by Effectively Utilizing All Unlabelled Data
Sneha Paul, Zachary Patterson, Nizar Bouguila
Leveraging Text Localization for Scene Text Removal via Text-aware Masked Image Modeling
Zixiao Wang, Hongtao Xie, YuXin Wang et al.
EAGLES: Efficient Accelerated 3D Gaussians with Lightweight EncodingS
Sharath Girish, Kamal Gupta, Abhinav Shrivastava
Thinking Outside the BBox: Unconstrained Generative Object Compositing
Gemma Canet Tarrés, Zhe Lin, Zhifei Zhang et al.
CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs
Akshat Ramachandran, Souvik Kundu, Tushar Krishna
A Riemannian Approach for Spatiotemporal Analysis and Generation of 4D Tree-shaped Structures
Tahmina Khanam, Mohammed Bennamoun, Guan Wang et al.
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
Haobo Yuan, Xiangtai Li, Chong Zhou et al.
ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation
Mengcheng Lan, Chaofeng Chen, Yiping Ke et al.
LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model
Runhui Huang, Kaixin Cai, Jianhua Han et al.
Unsupervised Variational Translator for Bridging Image Restoration and High-Level Vision Tasks
Jiawei Wu, Zhi Jin
Fast Point Cloud Geometry Compression with Context-based Residual Coding and INR-based Refinement
Hao Xu, Xi Zhang, Xiaolin Wu
Scene-Conditional 3D Object Stylization and Composition
Jinghao Zhou, Tomas Jakab, Philip Torr et al.
RANRAC: Robust Neural Scene Representations via Random Ray Consensus
Benno Buschmann, Andreea Dogaru, Elmar Eisemann et al.
Vision-Language Dual-Pattern Matching for Out-of-Distribution Detection
Zihan Zhang, Zhuo Xu, Xiang Xiang
MarineInst: A Foundation Model for Marine Image Analysis with Instance Visual Description
Ziqiang Zheng, Yiwei Chen, Huimin Zeng et al.
Contextual Correspondence Matters: Bidirectional Graph Matching for Video Summarization
Yunzuo Zhang, Yameng Liu
Linking in Style: Understanding learned features in deep learning models
Maren Wehrheim, Pamela Osuna Vargas, Matthias Kaschube
COD: Learning Conditional Invariant Representation for Domain Adaptation Regression
Hao-Ran Yang, Chuan-Xian Ren, You-Wei Luo
Easing 3D Pattern Reasoning with Side-view Features for Semantic Scene Completion
Linxi Huan, Mingyue Dong, Linwei Yue et al.
Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression
Animesh Sinha, Bo Sun, Anmol Kalia et al.
High-Quality Mesh Blendshape Generation from Face Videos via Neural Inverse Rendering
Xin Ming, Jiawei Li, Jingwang Ling et al.
InfoNorm: Mutual Information Shaping of Normals for Sparse-View Reconstruction
Xulong Wang, Siyan Dong, Youyi Zheng et al.
DreamReward: Aligning Human Preference in Text-to-3D Generation
Junliang Ye, Fangfu Liu, Qixiu Li et al.
Towards Image Ambient Lighting Normalization
Florin-Alexandru Vasluianu, Tim Seizinger, Zongwei Wu et al.
FedHide: Federated Learning by Hiding in the Neighbors
Hyunsin Park, Sungrack Yun
Superpixel-informed Implicit Neural Representation for Multi-Dimensional Data
Jiayi Li, Xi-Le Zhao, Jian-Li Wang et al.
FedHARM: Harmonizing Model Architectural Diversity in Federated Learning
Anestis Kastellos, Athanasios Psaltis, Charalampos Z Patrikakis et al.
Towards Open-World Object-based Anomaly Detection via Self-Supervised Outlier Synthesis
Brian Isaac Medina, Yona Falinie Abdul Gaus, Neelanjan Bhowmik et al.
DiffSurf: A Transformer-based Diffusion Model for Generating and Reconstructing 3D Surfaces in Pose
Yoshiyasu Yusuke, Leyuan Sun
LPViT: Low-Power Semi-structured Pruning for Vision Transformers
KAIXIN Xu, Zhe Wang, Chunyun Chen et al.
Weighted Ensemble Models Are Strong Continual Learners
Imad Eddine Marouf, Subhankar Roy, Enzo Tartaglione et al.
GGRt: Towards Generalizable 3D Gaussians without Pose Priors in Real-Time
Hao Li, Yuanyuan Gao, Dingwen Zhang et al.
Chains of Diffusion Models
Yanheng Wei, Lianghua Huang, Zhi-Fan Wu et al.
Robustness Tokens: Towards Adversarial Robustness of Transformers
Brian Pulfer, Yury Belousov, Slava Voloshynovskiy
Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking
Jiyao Zhang, Weiyao Huang, Bo Peng et al.
EINet: Point Cloud Completion via Extrapolation and Interpolation
Pingping Cai, Canyu Zhang, LINGJIA SHI et al.
Bridging the Gap Between Human Motion and Action Semantics via Kinematics Phrases
Xinpeng Liu, Yong-Lu Li, AILING ZENG et al.
DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks
Sarah Jabbour, Gregory Kondas, Ella Kazerooni et al.
Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance
Donghoon Ahn, Hyoungwon Cho, Jaewon Min et al.
MONTRAGE: Monitoring Training for Attribution of Generative Diffusion Models
Jonathan Brokman, Omer Hofman, Roman Vainshtein et al.
Close, But Not There: Boosting Geographic Distance Sensitivity in Visual Place Recognition
Sergio Izquierdo, Javier Civera
SWAG: Splatting in the Wild images with Appearance-conditioned Gaussians
Hiba Dahmani, Moussab Bennehar, Nathan Piasco et al.
TAG: Text Prompt Augmentation for Zero-Shot Out-of-Distribution Detection
Xixi Liu, Christopher Zach
Can Textual Semantics Mitigate Sounding Object Segmentation Preference?
Yaoting Wang, Peiwen Sun, Yuanchao Li et al.
Continual Learning and Unknown Object Discovery in 3D Scenes via Self-Distillation
Mohamed El Amine Boudjoghra, Jean Lahoud, Salman Khan et al.
Lost and Found: Overcoming Detector Failures in Online Multi-Object Tracking
Lorenzo Vaquero, Yihong XU, Xavier Alameda-Pineda et al.
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time
Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta et al.
MLPHand: Real Time Multi-View 3D Hand Reconstruction via MLP Modeling
Jian Yang, Jiakun Li, Guoming Li et al.
How Far Can a 1-Pixel Camera Go? Solving Vision Tasks using Photoreceptors and Computationally Designed Visual Morphology
Andrei Atanov, Rishubh Singh, Jiawei Fu et al.
Beyond Pixels: Semi-Supervised Semantic Segmentation with a Multi-scale Patch-based Multi-Label Classifier
Prantik Howlader, Srijan Das, Hieu Le et al.
Spiking Wavelet Transformer
Yuetong Fang, Ziqing Wang, Lingfeng Zhang et al.
WAVE: Warping DDIM Inversion Features for Zero-shot Text-to-Video Editing
Yutang Feng, Sicheng Gao, Yuxiang Bao et al.
HoloADMM: High-Quality Holographic Complex Field Recovery
Mazen Mel, Paul Springer, Pietro Zanuttigh et al.
Few-shot Defect Image Generation based on Consistency Modeling
Qingfeng Shi, Jing Wei, Fei Shen et al.
BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion
Gwanghyun Kim, Hayeon Kim, Hoigi Seo et al.
All You Need is Your Voice: Emotional Face Representation with Audio Perspective for Emotional Talking Face Generation
Seongho Kim, Byung Cheol Song
AnimateMe: 4D Facial Expressions via Diffusion Models
Dimitrios Gerogiannis, Foivos Paraperas Papantoniou, Rolandos Alexandros Potamias et al.
iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning
Tom Fischer, Yaoyao Liu, Artur Jesslen et al.
Pose Guided Fine-Grained Sign Language Video Generation
Tongkai Shi, Lianyu Hu, Fanhua Shang et al.
POET: Prompt Offset Tuning for Continual Human Action Adaptation
Prachi Garg, Joseph K J, Vineeth N Balasubramanian et al.
SAFT: Towards Out-of-Distribution Generalization in Fine-Tuning
Bac Nguyen, Stefan Uhlich, Fabien Cardinaux et al.
SAH-SCI: Self-Supervised Adapter for Efficient Hyperspectral Snapshot Compressive Imaging
Haijin Zeng, Yuxi Liu, Yongyong Chen et al.
Optimization-based Uncertainty Attribution Via Learning Informative Perturbations
Hanjing Wang, Bashirul Azam Biswas, Qiang Ji
ReCON: Training-Free Acceleration for Text-to-Image Synthesis with Retrieval of Concept Prompt Trajectories
Chen-yi Lu, Shubham Agarwal, Mehrab Tanjim et al.
Object-Aware Query Perturbation for Cross-Modal Image-Text Retrieval
Naoya Sogi, Takashi Shibata, Makoto Terao
GRiT: A Generative Region-to-text Transformer for Object Understanding
Jialian Wu, Jianfeng Wang, Zhengyuan Yang et al.
LRSLAM: Low-rank Representation of Signed Distance Fields in Dense Visual SLAM System
Hongbeen Park, Minjeong Park, Giljoo Nam et al.
BAGS: Blur Agnostic Gaussian Splatting through Multi-Scale Kernel Modeling
Cheng Peng, Yutao Tang, Yifan Zhou et al.
DPA-Net: Structured 3D Abstraction from Sparse Views via Differentiable Primitive Assembly
Fenggen Yu, Yiming Qian, Xu Zhang et al.
Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation
Juncheng Ma, Peiwen Sun, Yaoting Wang et al.
Reinforcement Learning via Auxiliary Task Distillation
Abhinav Narayan Harish, Larry Heck, Josiah P Hanna et al.
Plug and Play: A Representation Enhanced Domain Adapter for Collaborative Perception
TIANYOU LUO, Quan Yuan, Yuchen Xia et al.
Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models
Yuchen Yang, Kwonjoon Lee, Behzad Dariush et al.
Computing the Lipschitz constant needed for fast scene recovery from CASSI measurements
Niels Chr. Overgaard, Anders Holst
DreamScene: 3D Gaussian-based Text-to-3D Scene Generation via Formation Pattern Sampling
Haoran Li, Haolin Shi, Wenli Zhang et al.
Improving Hyperbolic Representations via Gromov-Wasserstein Regularization
Yifei Yang, Wonjun Lee, Dongmian Zou et al.
IAM-VFI: Interpolate Any Motion for Video Frame Interpolation with motion complexity map
Kihwan Yoon, Yong Han Kim, Sungjei Kim et al.
Depth-Aware Blind Image Decomposition for Real-World Adverse Weather Recovery
Chao Wang, Zhedong Zheng, Ruijie Quan et al.
DreamSampler: Unifying Diffusion Sampling and Score Distillation for Image Manipulation
Jeongsol Kim, Geon Yeong Park, Jong Chul Ye
Training A Small Emotional Vision Language Model for Visual Art Comprehension
Jing Zhang, Liang Zheng, Meng Wang et al.
PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology
YUXUAN SUN, Hao Wu, Chenglu Zhu et al.
Kinetic Typography Diffusion Model
Seonmi Park, Inhwan Bae, Seunghyun Shin et al.
Free-ATM: Harnessing Free Attention Masks for Representation Learning on Diffusion-Generated Images
Junhao Zhang, Mutian Xu, Jay Zhangjie Wu et al.
EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding
Yuanming Li, Wei-Jin Huang, An-Lan Wang et al.
TrafficNight: An Aerial Multimodal Benchmark For Nighttime Vehicle Surveillance
Guoxing Zhang, Yiming Liu, Xiaoyu Yang et al.
Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning
Amandeep Kumar, Muhammad Awais, Sanath Narayan et al.
COM Kitchens: An Unedited Overhead-view Procedural Videos Dataset as a Vision-Language Benchmark
Atsushi Hashimoto, Koki Maeda, Tosho Hirasawa et al.
DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields
Yu Chi, Fangneng Zhan, Sibo Wu et al.
Unsupervised Representation Learning by Balanced Self Attention Matching
Daniel Shalam, Simon Korman
A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment
Tianhe Wu, Kede Ma, Jie Liang et al.
Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation
Fangfu Liu, Hanyang Wang, Weiliang Chen et al.
Towards Dual Transparent Liquid Level Estimation in Biomedical Lab: Dataset, Methods and Practice
Xiayu Wang, Ke Ma, Ruiyun Zhong et al.
On the Topology Awareness and Generalization Performance of Graph Neural Networks
Junwei Su, Chuan Wu