Most Cited CVPR "log anomaly detection" Papers
SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks
Yaxu Xie, Alain Pagani, Didier Stricker
Epistemic Uncertainty Quantification For Pre-Trained Neural Networks
Hanjing Wang, Qiang Ji
Federated Learning with Domain Shift Eraser
Zheng Wang, Zihui Wang, Zheng Wang et al.
Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models
Zichen Miao, WEI CHEN, Qiang Qiu
Can Generative Video Models Help Pose Estimation?
Ruojin Cai, Jason Y. Zhang, Philipp Henzler et al.
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Bingjie Gao, Xinyu Gao, Xiaoxue Wu et al.
DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model
Lirui Zhao, Yue Yang, Kaipeng Zhang et al.
SuperPrimitive: Scene Reconstruction at a Primitive Level
Kirill Mazur, Gwangbin Bae, Andrew J. Davison
Gaussian Splatting for Efficient Satellite Image Photogrammetry
Luca Savant Aira, Gabriele Facciolo, Thibaud Ehret
FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training
Anjia Cao, Xing Wei, Zhiheng Ma
RealEdit: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations
Peter Sushko, Ayana Bharadwaj, Zhi Yang Lim et al.
FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models
Alice Heiman, Xiaoman Zhang, Emma Chen et al.
Detail-Preserving Latent Diffusion for Stable Shadow Removal
Jiamin Xu, Yuxin Zheng, Zelong Li et al.
Image is All You Need to Empower Large-scale Diffusion Models for In-Domain Generation
Pu Cao, Feng Zhou, Lu Yang et al.
Classifier-Free Guidance Inside the Attraction Basin May Cause Memorization
Anubhav Jain, Yuya Kobayashi, Takashi Shibuya et al.
Makeup Prior Models for 3D Facial Makeup Estimation and Applications
Xingchao Yang, Takafumi Taketomi, Yuki Endo et al.
Video Recognition in Portrait Mode
Mingfei Han, Linjie Yang, Xiaojie Jin et al.
Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems
Song Xia, Yi Yu, Wenhan Yang et al.
PICD: Versatile Perceptual Image Compression with Diffusion Rendering
Tongda Xu, Jiahao Li, Bin Li et al.
FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion
Haosen Yang, Adrian Bulat, Isma Hadji et al.
APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers
Zhuguanyu Wu, Jiayi Zhang, Jiaxin Chen et al.
Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions
He Zhu, Quyu Kong, Kechun Xu et al.
Progress-Aware Video Frame Captioning
Zihui Xue, Joungbin An, Xitong Yang et al.
Efficient Stitchable Task Adaptation
Haoyu He, Zizheng Pan, Jing Liu et al.
Polarization Wavefront Lidar: Learning Large Scene Reconstruction from Polarized Wavefronts
Dominik Scheuble, Chenyang Lei, Mario Bijelic et al.
Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning
Yanbiao Ma, Wei Dai, Wenke Huang et al.
Physics-Aware Hand-Object Interaction Denoising
Haowen Luo, Yunze Liu, Li Yi
Temporal Alignment-Free Video Matching for Few-shot Action Recognition
SuBeen Lee, WonJun Moon, Hyun Seok Seong et al.
UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation
Lunhao Duan, Shanshan Zhao, Wenjun Yan et al.
ROICtrl: Boosting Instance Control for Visual Generation
Yuchao Gu, Yipin Zhou, Yunfan Ye et al.
U-VAP: User-specified Visual Appearance Personalization via Decoupled Self Augmentation
You Wu, Kean Liu, Xiaoyue Mi et al.
DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting
Seungjun Lee, Gim Hee Lee
Image Quality Assessment: From Human to Machine Preference
Chunyi Li, Yuan Tian, Xiaoyue Ling et al.
Zero-Shot Novel View and Depth Synthesis with Multi-View Geometric Diffusion
Vitor Guizilini, Muhammad Zubair Irshad, Dian Chen et al.
FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification
Zhengrui Guo, Conghao Xiong, Jiabo MA et al.
UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image
Xingyu Liu, Gu Wang, Ruida Zhang et al.
ModeSeq: Taming Sparse Multimodal Motion Prediction with Sequential Mode Modeling
Zikang Zhou, Hengjian Zhou, Haibo Hu et al.
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images
Yuze He, Yanning Zhou, Wang Zhao et al.
Fun with Flags: Robust Principal Directions via Flag Manifolds
Tolga Birdal, Nathan Mankovich
Towards a Perceptual Evaluation Framework for Lighting Estimation
Justine Giroux, Mohammad Reza Karimi Dastjerdi, Yannick Hold-Geoffroy et al.
MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval
Bowen Zhang, Xiaojie Jin, Weibo Gong et al.
ArtiScene: Language-Driven Artistic 3D Scene Generation Through Image Intermediary
Zeqi Gu, Yin Cui, Max Li et al.
MOS: Modeling Object-Scene Associations in Generalized Category Discovery
Zhengyuan Peng, Jinpeng Ma, Zhimin Sun et al.
Real-time High-fidelity Gaussian Human Avatars with Position-based Interpolation of Spatially Distributed MLPs
Youyi Zhan, Tianjia Shao, Yin Yang et al.
Reconstructing Humans with a Biomechanically Accurate Skeleton
Yan Xia, Xiaowei Zhou, Etienne Vouga et al.
Spectral and Polarization Vision: Spectro-polarimetric Real-world Dataset
Yujin Jeon, Eunsue Choi, Youngchan Kim et al.
PEER Pressure: Model-to-Model Regularization for Single Source Domain Generalization
Dongkyu Cho, Inwoo Hwang, Sanghack Lee
Parametric Point Cloud Completion for Polygonal Surface Reconstruction
Zhaiyu Chen, Yuqing Wang, Liangliang Nan et al.
FluxSpace: Disentangled Semantic Editing in Rectified Flow Models
Yusuf Dalva, Kavana Venkatesh, Pinar Yanardag
Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion
Jona Ballé, Luca Versari, Emilien Dupont et al.
EVOS: Efficient Implicit Neural Training via EVOlutionary Selector
Weixiang Zhang, Shuzhao Xie, Chengwei Ren et al.
4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians
Hidenobu Matsuki, Gwangbin Bae, Andrew J. Davison
Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions
Chan Hur, Jeong-hun Hong, Dong-hun Lee et al.
Scene-Centric Unsupervised Panoptic Segmentation
Oliver Hahn, Christoph Reich, Nikita Araslanov et al.
HandOS: 3D Hand Reconstruction in One Stage
Xingyu Chen, Zhuheng Song, Xiaoke Jiang et al.
RainyGS: Efficient Rain Synthesis with Physically-Based Gaussian Splatting
Qiyu Dai, Xingyu Ni, Qianfan Shen et al.
BF-STVSR: B-Splines and Fourier---Best Friends for High Fidelity Spatial-Temporal Video Super-Resolution
Eunjin Kim, HYEONJIN KIM, Kyong Hwan Jin et al.
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
Yuning Han, Bingyin Zhao, Rui Chu et al.
COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection
Jinqi Xiao, Shen Sang, Tiancheng Zhi et al.
Memories of Forgotten Concepts
Matan Rusanovsky, Shimon Malnick, Amir Jevnisek et al.
Augmented Deep Contexts for Spatially Embedded Video Coding
Yifan Bian, Chuanbo Tang, Li Li et al.
Neuro-Symbolic Evaluation of Text-to-Video Models using Formal Verification
S P Sharan, Minkyu Choi, Sahil Shah et al.
IMFine: 3D Inpainting via Geometry-guided Multi-view Refinement
Zhihao Shi, Dong Huo, Yuhongze Zhou et al.
RePerformer: Immersive Human-centric Volumetric Videos from Playback to Photoreal Reperformance
Yuheng Jiang, Zhehao Shen, Chengcheng Guo et al.
Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation
Edward LOO, Jiacheng Deng
Golden Cudgel Network for Real-Time Semantic Segmentation
Guoyu Yang, Yuan Wang, Daming Shi et al.
Logits DeConfusion with CLIP for Few-Shot Learning
Shuo Li, Fang Liu, Zehua Hao et al.
RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments
Haisheng Su, Feixiang Song, CONG MA et al.
InteractionMap: Improving Online Vectorized HDMap Construction with Interaction
Kuang Wu, Chuan Yang, Zhanbin Li
Effective Cloud Removal for Remote Sensing Images by an Improved Mean-Reverting Denoising Model with Elucidated Design Space
Yi Liu, Wengen Li, Jihong Guan et al.
Nonisotropic Gaussian Diffusion for Realistic 3D Human Motion Prediction
Cecilia Curreli, Dominik Muhle, Abhishek Saroha et al.
ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding
Guangda Ji, Silvan Weder, Francis Engelmann et al.
Zero-Shot Image Restoration Using Few-Step Guidance of Consistency Models (and Beyond)
Tomer Garber, Tom Tirer
Realistic Test-Time Adaptation of Vision-Language Models
Maxime Zanella, Clément Fuchs, Christophe De Vleeschouwer et al.
Masked Point-Entity Contrast for Open-Vocabulary 3D Scene Understanding
Yan Wang, Baoxiong Jia, Ziyu Zhu et al.
Hyperbolic Safety-Aware Vision-Language Models
Tobia Poppi, Tejaswi Kasarla, Pascal Mettes et al.
3D Occupancy Prediction with Low-Resolution Queries via Prototype-aware View Transformation
Gyeongrok Oh, Sung June Kim, Heeju Ko et al.
DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction
Ben Kaye, Tomas Jakab, Shangzhe Wu et al.
Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval
Davide Caffagni, Sara Sarto, Marcella Cornia et al.
Robust 3D Shape Reconstruction in Zero-Shot from a Single Image in the Wild
Junhyeong Cho, Kim Youwang, Hunmin Yang et al.
Multitwine: Multi-Object Compositing with Text and Layout Control
Gemma Canet Tarrés, Zhe Lin, Zhifei Zhang et al.
GaussHDR: High Dynamic Range Gaussian Splatting via Learning Unified 3D and 2D Local Tone Mapping
Jinfeng Liu, Lingtong Kong, Bo Li et al.
SaMam: Style-aware State Space Model for Arbitrary Image Style Transfer
Hongda Liu, Longguang Wang, Ye Zhang et al.
Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers
Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris et al.
Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models
Itay Benou, Tammy Riklin Raviv
HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting
Jingyu Lin, Jiaqi Gu, Lubin Fan et al.
FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation
Zhuguanyu Wu, Shihe Wang, Jiayi Zhang et al.
OpenMIBOOD: Open Medical Imaging Benchmarks for Out-Of-Distribution Detection
Max Gutbrod, David Rauber, Danilo Weber Nunes et al.
From Sparse to Dense: Camera Relocalization with Scene-Specific Detector from Feature Gaussian Splatting
Zhiwei Huang, Hailin Yu, Yichun Shentu et al.
Blurred LiDAR for Sharper 3D: Robust Handheld 3D Scanning with Diffuse LiDAR and RGB
Nikhil Behari, Aaron Young, Siddharth Somasundaram et al.
AutoURDF: Unsupervised Robot Modeling from Point Cloud Frames Using Cluster Registration
Jiong Lin, Lechen Zhang, Kwansoo Lee et al.
GIFStream: 4D Gaussian-based Immersive Video with Feature Stream
Hao Li, Sicheng Li, Xiang Gao et al.
Functionality Understanding and Segmentation in 3D Scenes
Jaime Corsetti, Francesco Giuliari, Alice Fasoli et al.
NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images
Lingen Li, Zhaoyang Zhang, Yaowei Li et al.
Multi-View Pose-Agnostic Change Localization with Zero Labels
Chamuditha Jayanga Galappaththige, Jason Lai, Lloyd Windrim et al.
ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions
Tomas Soucek, Prajwal Gatti, Michael Wray et al.
Hearing Anywhere in Any Environment
Xiulong Liu, Anurag Kumar, Paul Calamia et al.
GuardSplat: Efficient and Robust Watermarking for 3D Gaussian Splatting
Zixuan Chen, Guangcong Wang, Jiahao Zhu et al.
CountLLM: Towards Generalizable Repetitive Action Counting via Large Language Model
Ziyu Yao, Xuxin Cheng, Zhiqi Huang et al.
LaTexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending
Jian Jin, Zhenbo Yu, Yang Shen et al.
Point Clouds Meets Physics: Dynamic Acoustic Field Fitting Network for Point Cloud Understanding
Changshuo Wang, Shuting He, Xiang Fang et al.
MATCHA: Towards Matching Anything
Fei Xue, Sven Elflein, Laura Leal-Taixe et al.
PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model
Mingju Gao, Yike Pan, Huan-ang Gao et al.
AlphaPre: Amplitude-Phase Disentanglement Model for Precipitation Nowcasting
Kenghong Lin, Baoquan Zhang, Demin Yu et al.
DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection
Jaewoo Song, Daemin Park, Kanghyun Baek et al.
From Elements to Design: A Layered Approach for Automatic Graphic Design Composition
Jiawei Lin, Shizhao Sun, Danqing Huang et al.
CoMatcher: Multi-View Collaborative Feature Matching
Jintao Zhang, Zimin Xia, Mingyue Dong et al.
Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts
Qizhou Chen, Chengyu Wang, Dakan Wang et al.
ReWind: Understanding Long Videos with Instructed Learnable Memory
Anxhelo Diko, Tinghuai Wang, Wassim Swaileh et al.
Distilled Prompt Learning for Incomplete Multimodal Survival Prediction
Yingxue Xu, Fengtao ZHOU, Chenyu Zhao et al.
Imputation-free and Alignment-free: Incomplete Multi-view Clustering Driven by Consensus Semantic Learning
Yuzhuo Dai, Jiaqi Jin, Zhibin Dong et al.
TSAM: Temporal SAM Augmented with Multimodal Prompts for Referring Audio-Visual Segmentation
Abduljalil Radman, Jorma Laaksonen
On Denoising Walking Videos for Gait Recognition
Dongyang Jin, Chao Fan, Jingzhe Ma et al.
Weakly Supervised Temporal Action Localization via Dual-Prior Collaborative Learning Guided by Multimodal Large Language Models
Quan Zhang, Jinwei Fang, Rui Yuan et al.
VasTSD: Learning 3D Vascular Tree-state Space Diffusion Model for Angiography Synthesis
Zhifeng Wang, Renjiao Yi, Xin Wen et al.
Towards Realistic Example-based Modeling via 3D Gaussian Stitching
Xinyu Gao, Ziyi Yang, Bingchen Gong et al.
Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images
Jie Mei, Chenyu Lin, Yu Qiu et al.
Towards In-the-wild 3D Plane Reconstruction from a Single Image
Jiachen Liu, Rui Yu, Sili Chen et al.
SparseAlign: a Fully Sparse Framework for Cooperative Object Detection
Yunshuang Yuan, Yan Xia, Daniel Cremers et al.
KAC: Kolmogorov-Arnold Classifier for Continual Learning
Yusong Hu, Zichen Liang, Fei Yang et al.
Dense-SfM: Structure from Motion with Dense Consistent Matching
JongMin Lee, Sungjoo Yoo
Prompt-CAM: Making Vision Transformers Interpretable for Fine-Grained Analysis
Arpita Chowdhury, Dipanjyoti Paul, Zheda Mai et al.
Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection
Marc-Antoine Lavoie, Anas Mahmoud, Steven L. Waslander
CDI: Copyrighted Data Identification in Diffusion Models
Jan Dubiński, Antoni Kowalczuk, Franziska Boenisch et al.
Learning from Streaming Video with Orthogonal Gradients
Tengda Han, Dilara Gokay, Joseph Heyward et al.
Motion Modes: What Could Happen Next?
Karran Pandey, Yannick Hold-Geoffroy, Matheus Gadelha et al.
Rotation-Equivariant Self-Supervised Method in Image Denoising
Hanze Liu, Jiahong Fu, Qi Xie et al.
ReNeg: Learning Negative Embedding with Reward Guidance
Xiaomin Li, Yixuan Liu, Takashi Isobe et al.
Improving Gaussian Splatting with Localized Points Management
Haosen Yang, Chenhao Zhang, Wenqing Wang et al.
VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment
Darshana Saravanan, Varun Gupta, Darshan Singh S et al.
CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model
Xiaoding Yuan, Shitao Tang, Kejie Li et al.
EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance
Yang Yue, Yulin Wang, Haojun Jiang et al.
Uncertain Multimodal Intention and Emotion Understanding in the Wild
Qu Yang, QingHongYa Shi, Tongxin Wang et al.
CLIP Under the Microscope: A Fine-Grained Analysis of Multi-Object Representation
Reza Abbasi, Ali Nazari, Aminreza Sefid et al.
Order-Robust Class Incremental Learning: Graph-Driven Dynamic Similarity Grouping
Guannan Lai, Yujie Li, Xiangkun Wang et al.
Self-Learning Hyperspectral and Multispectral Image Fusion via Adaptive Residual Guided Subspace Diffusion Model
Jian Zhu, He Wang, Yang Xu et al.
Lift3D Policy: Lifting 2D Foundation Models for Robust 3D Robotic Manipulation
Yueru Jia, Jiaming Liu, Sixiang Chen et al.
Exploring Simple Open-Vocabulary Semantic Segmentation
Zihang Lai
Exploring Visual Vulnerabilities via Multi-Loss Adversarial Search for Jailbreaking Vision-Language Models
Shuyang Hao, Bryan Hooi, Jun Liu et al.
SimMotionEdit: Text-Based Human Motion Editing with Motion Similarity Prediction
Zhengyuan Li, Kai Cheng, Anindita Ghosh et al.
One is Plenty: A Polymorphic Feature Interpreter for Immutable Heterogeneous Collaborative Perception
Yuchen Xia, Quan Yuan, Guiyang Luo et al.
Denoising Functional Maps: Diffusion Models for Shape Correspondence
Aleksei Zhuravlev, Zorah Lähner, Vladislav Golyanik
Language-Guided Audio-Visual Learning for Long-Term Sports Assessment
Huangbiao Xu, Xiao Ke, Huanqi Wu et al.
Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis
Zixuan Wang, DUO PENG, Feng Chen et al.
Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration
Haipeng Fang, Sheng Tang, Juan Cao et al.
Decoupling Fine Detail and Global Geometry for Compressed Depth Map Super-Resolution
Huan Zheng, Wencheng Han, Jianbing Shen
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
Ruotian Peng, Haiying He, Yake Wei et al.
Towards RAW Object Detection in Diverse Conditions
Zhong-Yu Li, Xin Jin, Bo-Yuan Sun et al.
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models
Yongqi Huang, Peng Ye, Chenyu Huang et al.
TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation
Ruineng Li, Daitao Xing, Huiming Sun et al.
GarmentPile: Point-Level Visual Affordance Guided Retrieval and Adaptation for Cluttered Garments Manipulation
Ruihai Wu, Ziyu Zhu, Yuran Wang et al.
UniPre3D: Unified Pre-training of 3D Point Cloud Models with Cross-Modal Gaussian Splatting
Ziyi Wang, Yanran Zhang, Jie Zhou et al.
MambaVLT: Time-Evolving Multimodal State Space Model for Vision-Language Tracking
Xinqi Liu, Li Zhou, Zikun Zhou et al.
D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation
Weinan Jia, Mengqi Huang, Nan Chen et al.
Satellite Observations Guided Diffusion Model for Accurate Meteorological States at Arbitrary Resolution
Siwei Tu, Ben Fei, Weidong Yang et al.
Unified Dense Prediction of Video Diffusion
Lehan Yang, Lu Qi, Xiangtai Li et al.
A Comprehensive Study of Decoder-Only LLMs for Text-to-Image Generation
Andrew Z Wang, Songwei Ge, Tero Karras et al.
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization
Yiyang Du, Xiaochen Wang, Chi Chen et al.
SDGOCC: Semantic and Depth-Guided Bird's-Eye View Transformation for 3D Multimodal Occupancy Prediction
ZaiPeng Duan, Xuzhong Hu, Pei An et al.
Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes
Aodi Li, Liansheng Zhuang, Xiao Long et al.
Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization
Feifei Li, Mi Zhang, Yiming Sun et al.
Beyond Local Sharpness: Communication-Efficient Global Sharpness-aware Minimization for Federated Learning
Debora Caldarola, Pietro Cagnasso, Barbara Caputo et al.
Navigating Image Restoration with VAR’s Distribution Alignment Prior
Siyang Wang, Naishan Zheng, Jie Huang et al.
UVGS: Reimagining Unstructured 3D Gaussian Splatting using UV Mapping
Aashish Rai, Dilin Wang, Mihir Jain et al.
StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer
Ruojun Xu, Weijie Xi, Xiaodi Wang et al.
Generative Sparse-View Gaussian Splatting
Hanyang Kong, Xingyi Yang, Xinchao Wang
Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition
Zhiyuan Chen, Keyi Li, Yifan Jia et al.
Stable-SCore: A Stable Registration-based Framework for 3D Shape Correspondence
Haolin Liu, Xiaohang Zhan, Zizheng Yan et al.
RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings
Aayush Dhakal, Srikumar Sastry, Subash Khanal et al.
Rethinking Spiking Self-Attention Mechanism: Implementing α-XNOR Similarity Calculation in Spiking Transformers
Yichen Xiao, Shuai Wang, Dehao Zhang et al.
Ego4o: Egocentric Human Motion Capture and Understanding from Multi-Modal Input
Jian Wang, Rishabh Dabral, Diogo Luvizon et al.
PanDA: Towards Panoramic Depth Anything with Unlabeled Panoramas and Mobius Spatial Augmentation
Zidong Cao, Jinjing Zhu, Weiming Zhang et al.
TCFG: Tangential Damping Classifier-free Guidance
Mingi Kwon, Shin seong Kim, Jaeseok Jeong et al.
Real-IAD D³: A Real-World 2D/Pseudo-3D/3D Dataset for Industrial Anomaly Detection
Wenbing Zhu, Lidong Wang, Ziqing Zhou et al.
DVHGNN: Multi-Scale Dilated Vision HGNN for Efficient Vision Recognition
Caoshuo Li, Tanzhe Li, Xiaobin Hu et al.
Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation
Reza Qorbani, Gianluca Villani, Theodoros Panagiotakopoulos et al.
DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding
Yudong Han, Qingpei Guo, Liyuan Pan et al.
Keyframe-Guided Creative Video Inpainting
Yuwei Guo, Ceyuan Yang, Anyi Rao et al.
POT: Prototypical Optimal Transport for Weakly Supervised Semantic Segmentation
Jian Wang, Tianhong Dai, Bingfeng Zhang et al.
3D-MVP: 3D Multiview Pretraining for Manipulation
Shengyi Qian, Kaichun Mo, Valts Blukis et al.
Visual Persona: Foundation Model for Full-Body Human Customization
Jisu Nam, Soowon Son, Zhan Xu et al.
Improving the Transferability of Adversarial Attacks on Face Recognition with Diverse Parameters Augmentation
Fengfan Zhou, Bangjie Yin, Hefei Ling et al.
MIRE: Matched Implicit Neural Representations
Dhananjaya Jayasundara, Heng Zhao, Demetrio Labate et al.
U-Know-DiffPAN: An Uncertainty-aware Knowledge Distillation Diffusion Framework with Details Enhancement for PAN-Sharpening
Sungpyo Kim, Jeonghyeok Do, Jaehyup Lee et al.
Chat-based Person Retrieval via Dialogue-Refined Cross-Modal Alignment
Yang Bai, Yucheng Ji, Min Cao et al.
Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation
Shivam Duggal, Yushi Hu, Oscar Michel et al.
Unsupervised Template-assisted Point Cloud Shape Correspondence Network
Jiacheng Deng, Jiahao Lu, Tianzhu Zhang
In-Context Matting
He Guo, Zixuan Ye, Zhiguo Cao et al.
S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes
Xingyi Li, Zhiguo Cao, Yizheng Wu et al.
Cross-view and Cross-pose Completion for 3D Human Understanding
Matthieu Armando, Salma Galaaoui, Fabien Baradel et al.
CoG-DQA: Chain-of-Guiding Learning with Large Language Models for Diagram Question Answering
Shaowei Wang, Lingling Zhang, Longji Zhu et al.
GLID: Pre-training a Generalist Encoder-Decoder Vision Model
Jihao Liu, Jinliang Zheng, Yu Liu et al.
CNC-Net: Self-Supervised Learning for CNC Machining Operations
Mohsen Yavartanoo, Sangmin Hong, Reyhaneh Neshatavar et al.
Flexible Depth Completion for Sparse and Varying Point Densities
Jinhyung Park, Yu-Jhe Li, Kris Kitani
Bayesian Exploration of Pre-trained Models for Low-shot Image Classification
Yibo Miao, Yu Lei, Feng Zhou et al.
3D-Aware Face Editing via Warping-Guided Latent Direction Learning
Yuhao Cheng, Zhuo Chen, Xingyu Ren et al.
CMA: A Chromaticity Map Adapter for Robust Detection of Screen-Recapture Document Images
Changsheng Chen, Liangwei Lin, Yongqi Chen et al.
OpticalDR: A Deep Optical Imaging Model for Privacy-Protective Depression Recognition
Yuchen Pan, Junjun Jiang, Kui Jiang et al.
Fully Geometric Panoramic Localization
Junho Kim, Jiwon Jeong, Young Min Kim
Learning to Rank Patches for Unbiased Image Redundancy Reduction
Yang Luo, Zhineng Chen, Peng Zhou et al.