CVPR Papers

Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation

Yiftach Edelstein, Or Patashnik, Dana Cohen-Bar et al.

CVPR 2025 • poster • arXiv:2412.02631 • 1 citation

Shift the Lens: Environment-Aware Unsupervised Camouflaged Object Detection

Ji Du, Fangwei Hao, Mingyang Yu et al.

CVPR 2025 • poster

ShiftwiseConv: Small Convolutional Kernel with Large Kernel Effect

Dachong Li, Li Li, Zhuangzhuang Chen et al.

CVPR 2025 • poster • arXiv:2401.12736

Shining Yourself: High-Fidelity Ornaments Virtual Try-on with Diffusion Model

Yingmao Miao, Zhanpeng Huang, Rui Han et al.

CVPR 2025 • poster • arXiv:2503.16065

ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models

Ozgur Kara, Krishna Kumar Singh, Feng Liu et al.

CVPR 2025 • poster • arXiv:2505.07652 • 13 citations

Show and Segment: Universal Medical Image Segmentation via In-Context Learning

Yunhe Gao, Di Liu, Zhuowei Li et al.

CVPR 2025 • poster • arXiv:2503.19359 • 8 citations

Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models

Itay Benou, Tammy Riklin Raviv

CVPR 2025 • highlight • arXiv:2502.20134 • 6 citations

ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions

Tomas Soucek, Prajwal Gatti, Michael Wray et al.

CVPR 2025 • poster • arXiv:2412.01987 • 6 citations

ShowMak3r: Compositional TV Show Reconstruction

Sangmin Kim, Seunguk Do, Jaesik Park

CVPR 2025 • poster • arXiv:2504.19584

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Kevin Qinghong Lin, Linjie Li, Difei Gao et al.

CVPR 2025 • poster • arXiv:2411.17465 • 128 citations

SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model

Zhenglin Huang, Jinwei Hu, Yiwei He et al.

CVPR 2025 • poster • arXiv:2412.04292 • 64 citations

Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation

Yuan Gan, Jiaxu Miao, Yunze Wang et al.

CVPR 2025 • poster • arXiv:2506.01591 • 3 citations

Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models

Sangwon Jang, June Suk Choi, Jaehyeong Jo et al.

CVPR 2025 • poster • arXiv:2503.09669

SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation

Leigang Qu, Haochuan Li, Wenjie Wang et al.

CVPR 2025 • poster • arXiv:2412.05818 • 9 citations

SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing

Xueting Li, Ye Yuan, Shalini De Mello et al.

CVPR 2025 • poster • arXiv:2412.09545

Similarity-Guided Layer-Adaptive Vision Transformer for UAV Tracking

Chaocan Xue, Bineng Zhong, Qihua Liang et al.

CVPR 2025 • poster • arXiv:2503.06625

SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment

Katrin Renz, Long Chen, Elahe Arani et al.

CVPR 2025 • highlight • arXiv:2503.09594 • 45 citations

SimLTD: Simple Supervised and Semi-Supervised Long-Tailed Object Detection

Phi Vu Tran

CVPR 2025 • poster • arXiv:2412.20047

SimMotionEdit: Text-Based Human Motion Editing with Motion Similarity Prediction

Zhengyuan Li, Kai Cheng, Anindita Ghosh et al.

CVPR 2025 • poster • arXiv:2503.18211 • 6 citations

Simpler Diffusion: 1.5 FID on ImageNet512 with Pixel-space Diffusion

Emiel Hoogeboom, Thomas Mensink, Jonathan Heek et al.

CVPR 2025 • poster • 3 citations

Simplification Is All You Need against Out-of-Distribution Overconfidence

Keke Tang, Chao Hou, Weilong Peng et al.

CVPR 2025 • poster • 4 citations

Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations

Ahmad Rahimi, Po-Chien Luan, Yuejiang Liu et al.

CVPR 2025 • poster • arXiv:2312.04540

Simulator HC: Regression-based Online Simulation of Starting Problem-Solution Pairs for Homotopy Continuation in Geometric Vision

Xinyue Zhang, Zijia Dai, Wanting Xu et al.

CVPR 2025 • highlight • arXiv:2411.03745

SimVS: Simulating World Inconsistencies for Robust View Synthesis

Alex Trevithick, Roni Paiss, Philipp Henzler et al.

CVPR 2025 • poster • arXiv:2412.07696 • 5 citations

Single Domain Generalization for Few-Shot Counting via Universal Representation Matching

Xianing Chen, Si Huo, Borui Jiang et al.

CVPR 2025 • poster • arXiv:2505.16778 • 4 citations

SinGS: Animatable Single-Image Human Gaussian Splats with Kinematic Priors

Yufan Wu, Xuanhong Chen, Wen Li et al.

CVPR 2025 • poster • 1 citation

SINR: Sparsity Driven Compressed Implicit Neural Representations

Dhananjaya Jayasundara, Sudarshan Rajagopalan, Yasiru Ranasinghe et al.

CVPR 2025 • poster • arXiv:2503.19576

SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model

Yucheng Mao, Boyang Wang, Nilesh Kulkarni et al.

CVPR 2025 • poster • arXiv:2503.14463

Six-CD: Benchmarking Concept Removals for Text-to-image Diffusion Models

Jie Ren, Kangrui Chen, Yingqian Cui et al.

CVPR 2025 • poster • 3 citations

SKDream: Controllable Multi-view and 3D Generation with Arbitrary Skeletons

Yuanyou Xu, Zongxin Yang, Yi Yang

CVPR 2025 • highlight

SKE-Layout: Spatial Knowledge Enhanced Layout Generation with LLMs

Junsheng Wang, Nieqing Cao, Yan Ding et al.

CVPR 2025 • poster

SketchAgent: Language-Driven Sequential Sketch Generation

Yael Vinker, Tamar Rott Shaham, Kristine Zheng et al.

CVPR 2025 • poster • arXiv:2411.17673 • 17 citations

Sketch Down the FLOPs: Towards Efficient Networks for Human Sketch

Aneeshan Sain, Subhajit Maity, Pinaki Nath Chowdhury et al.

CVPR 2025 • poster • arXiv:2505.23763

SketchFusion: Learning Universal Sketch Features through Fusing Foundation Models

Subhadeep Koley, Tapas Kumar Dutta, Aneeshan Sain et al.

CVPR 2025 • poster • arXiv:2503.14129

Sketchtopia: A Dataset and Foundational Agents for Benchmarking Asynchronous Multimodal Communication with Iconic Feedback

Mohd Hozaifa Khan, Ravi Kiran Sarvadevabhatla

CVPR 2025 • poster • 1 citation

SketchVideo: Sketch-based Video Generation and Editing

Feng-Lin Liu, Hongbo Fu, Xintao Wang et al.

CVPR 2025 • poster • arXiv:2503.23284 • 9 citations

Sketchy Bounding-box Supervision for 3D Instance Segmentation

Qian Deng, Le Hui, Jin Xie et al.

CVPR 2025 • poster • arXiv:2505.16399

SkillMimic: Learning Basketball Interaction Skills from Demonstrations

Yinhuai Wang, Qihan Zhao, Runyi Yu et al.

CVPR 2025 • highlight • arXiv:2408.15270 • 13 citations

Skip Tuning: Pre-trained Vision-Language Models are Effective and Efficient Adapters Themselves

Shihan Wu, Ji Zhang, Pengpeng Zeng et al.

CVPR 2025 • poster • arXiv:2412.11509 • 8 citations

SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling

Qi Zhu, Jiangwei Lao, Deyi Ji et al.

CVPR 2025 • poster

SLADE: Shielding against Dual Exploits in Large Vision-Language Models

Md Zarif Hossain, Ahmed Imteaj

CVPR 2025 • poster

SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos

Yuzheng Liu, Siyan Dong, Shuzhe Wang et al.

CVPR 2025 • highlight • arXiv:2412.09401

SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models

Zilan Wang, Junfeng Guo, Jiacheng Zhu et al.

CVPR 2025 • poster • arXiv:2412.04852 • 14 citations

SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding

Ying Chen, Guoan Wang, Yuanfeng Ji et al.

CVPR 2025 • poster • arXiv:2410.11761 • 27 citations

SLVR: Super-Light Visual Reconstruction via Blueprint Controllable Convolutions and Exploring Feature Diversity Representation

Ning Ni, Libao Zhang

CVPR 2025 • poster

SmartCLIP: Modular Vision-language Alignment with Identification Guarantees

Shaoan Xie, Lingjing Kong, Yujia Zheng et al.

CVPR 2025 • highlight • arXiv:2507.22264 • 4 citations

SmartEraser: Remove Anything from Images using Masked-Region Guidance

Longtao Jiang, Zhendong Wang, Jianmin Bao et al.

CVPR 2025 • poster • arXiv:2501.08279 • 12 citations

SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning

Fida Mohammad Thoker, Letian Jiang, Chen Zhao et al.

CVPR 2025 • poster • arXiv:2504.00527 • 3 citations

SMTPD: A New Benchmark for Temporal Prediction of Social Media Popularity

Yijie Xu, Bolun Zheng, Wei Zhu et al.

CVPR 2025 • poster • arXiv:2503.04446 • 3 citations

SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

Jierun Chen, Dongting Hu, Xijie Huang et al.

CVPR 2025 • highlight • arXiv:2412.09619