ICCV Highlight Papers
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation
Shiqi Huang, Shuting He, Huaiyuan Qin et al.
Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding
Huy Ta, Duy Anh Huynh, Yutong Xie et al.
SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior
Bo Zhao, Haoran Wang, Jinghui Wang et al.
Self-Calibrating Gaussian Splatting for Large Field-of-View Reconstruction
Youming Deng, Wenqi Xian, Guandao Yang et al.
Sequential keypoint density estimator: an overlooked baseline of skeleton-based video anomaly detection
Anja Delić, Matej Grcic, Siniša Šegvić
SGAD: Semantic and Geometric-aware Descriptor for Local Feature Matching
Xiangzeng Liu, Chi Wang, Guanglu Shi et al.
Shape of Motion: 4D Reconstruction from a Single Video
Qianqian Wang, Vickie Ye, Hang Gao et al.
Similarity Memory Prior is All You Need for Medical Image Segmentation
Hao Tang, Zhiqing Guo, Liejun Wang et al.
Sliced Wasserstein Bridge for Open-Vocabulary Video Instance Segmentation
Zheyun Qin, Deng Yu, Chuanchen Luo et al.
SMSTracker: Tri-path Score Mask Sigma Fusion for Multi-Modal Tracking
Sixian Chan, Zedong Li, Xiaoqin Zhang et al.
Sparfels: Fast Reconstruction from Sparse Unposed Imagery
Shubhendu Jena, Amine Ouasfi, Mae Younes et al.
Spatio-Spectral Pattern Illumination for Direct and Indirect Separation from a Single Hyperspectral Image
Shin Ishihara, Imari Sato
SRefiner: Soft-Braid Attention for Multi-Agent Trajectory Refinement
Liwen Xiao, Zhiyu Pan, Zhicheng Wang et al.
StableDepth: Scene-Consistent and Scale-Invariant Monocular Depth
Zheng Zhang, Lihe Yang, Tianyu Yang et al.
Stable-Sim2Real: Exploring Simulation of Real-Captured 3D Data with Two-Stage Depth Diffusion
Mutian Xu, Chongjie Ye, Haolin Liu et al.
Stereo Any Video: Temporally Consistent Stereo Matching
Junpeng Jing, Weixun Luo, Ye Mao et al.
Stochastic Gradient Estimation for Higher-Order Differentiable Rendering
Zican Wang, Michael Fischer, Tobias Ritschel
StolenLoRA: Exploring LoRA Extraction Attacks via Synthetic Data
Yixu Wang, Yan Teng, Yingchun Wang et al.
Straighten Viscous Rectified Flow via Noise Optimization
Jimin Dai, Jiexi Yan, Jian Yang et al.
Structure Matters: Revisiting Boundary Refinement in Video Object Segmentation
Guanyi Qin, Ziyue Wang, Daiyun Shen et al.
SummDiff: Generative Modeling of Video Summarization with Diffusion
Kwanseok Kim, Jaehoon Hahm, Sumin Kim et al.
SuperEvent: Cross-Modal Learning of Event-based Keypoint Detection for SLAM
Yannick Burkhardt, Simon Schaefer, Stefan Leutenegger
Super Resolved Imaging with Adaptive Optics
Robin Swanson, Esther Y. H. Lin, Masen Lamb et al.
Synthesizing Near-Boundary OOD Samples for Out-of-Distribution Detection
Jinglun Li, Kaixun Jiang, Zhaoyu Chen et al.
Test-time Adaptation for Foundation Medical Segmentation Model Without Parametric Updates
Kecheng Chen, Xinyu Luo, Tiexin Qin et al.
Test-Time Prompt Tuning for Zero-Shot Depth Completion
Chanhwi Jeong, Inhwan Bae, Jin-Hwi Park et al.
Thermal Polarimetric Multi-view Stereo
Takahiro Kushida, Kenichiro Tanaka
The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer
Weixian Lei, Jiacong Wang, Haochen Wang et al.
The Silent Assistant: NoiseQuery as Implicit Guidance for Goal-Driven Image Generation
Ruoyu Wang, Huayang Huang, Ye Zhu et al.
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis
Jonas Belouadi, Eddy Ilg, Margret Keuper et al.
Tiling artifacts and trade-offs of feature normalization in the segmentation of large biological images
Elena Buglakova, Anwai Archit, Edoardo D'Imprima et al.
Token-Efficient VLM: High-Resolution Image Understanding via Dynamic Region Proposal
Yitong Jiang, Jinwei Gu, Tianfan Xue et al.
Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis
Kaiyang Ji, Ye Shi, Zichen Jin et al.
Towards Robust Defense against Customization via Protective Perturbation Resistant to Diffusion-based Purification
Wenkui Yang, Jie Cao, Junxian Duan et al.
Towards Scalable Spatial Intelligence via 2D-to-3D Data Lifting
Xingyu Miao, Haoran Duan, Quanhao Qian et al.
TPG-INR: Target Prior-Guided Implicit 3D CT Reconstruction for Enhanced Sparse-view Imaging
Qinglei Cao, Ziyao Tang, Xiaoqin Tang
TurboVSR: Fantastic Video Upscalers and Where to Find Them
Zhongdao Wang, Guodongfang Zhao, Jingjing Ren et al.
Two Losses, One Goal: Balancing Conflict Gradients for Semi-supervised Semantic Segmentation
Rui Sun, Huayu Mai, Wangkai Li et al.
UDC-VIT: A Real-World Video Dataset for Under-Display Cameras
Kyusu Ahn, JiSoo Kim, Sangik Lee et al.
Underwater Visual SLAM with Depth Uncertainty and Medium Modeling
Rui Liu, Sheng Fan, Wenguan Wang et al.
UniDxMD: Towards Unified Representation for Cross-Modal Unsupervised Domain Adaptation in 3D Semantic Segmentation
Zhengyin Liang, Hui Yin, Min Liang et al.
Unified Multimodal Understanding via Byte-Pair Visual Encoding
Wanpeng Zhang, Yicheng Feng, Hao Luo et al.
UniPhys: Unified Planner and Controller with Diffusion for Flexible Physics-Based Character Control
Yan Wu, Korrawe Karunratanakul, Zhengyi Luo et al.
Unleashing Vecset Diffusion Model for Fast Shape Generation
Zeqiang Lai, Zhao Yunfei, Zibo Zhao et al.
UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI
Fangwei Zhong, Kui Wu, Churan Wang et al.
Unsupervised Joint Learning of Optical Flow and Intensity with Event Cameras
Shuang Guo, Friedhelm Hamann, Guillermo Gallego
UnZipLoRA: Separating Content and Style from a Single Image
Chang Liu, Viraj Shah, Aiyu Cui et al.
Video Individual Counting for Moving Drones
Yaowu Fan, Jia Wan, Tao Han et al.
Video Motion Graphs
Haiyang Liu, Zhan Xu, Fating Hong et al.
Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images
Boyang Deng, Kyle Genova, Songyou Peng et al.