Most Cited CVPR "robustness metrics" Papers
DiSRT-In-Bed: Diffusion-Based Sim-to-Real Transfer Framework for In-Bed Human Mesh Recovery
Jing Gao, Ce Zheng, Laszlo Jeni et al.
DirectTriGS: Triplane-based Gaussian Splatting Field Representation for 3D Generation
Xiaoliang Ju, Hongsheng Li
Learning Triangular Distribution in Visual World
Ping Chen, Xingpeng Zhang, Chengtao Zhou et al.
MetaWriter: Personalized Handwritten Text Recognition Using Meta-Learned Prompt Tuning
Wenhao Gu, Li Gu, Ching Suen et al.
Align-A-Video: Deterministic Reward Tuning of Image Diffusion Models for Consistent Video Editing
Shengzhi Wang, Yingkang Zhong, Jiangchuan Mu et al.
GeoDepth: From Point-to-Depth to Plane-to-Depth Modeling for Self-Supervised Monocular Depth Estimation
Haifeng Wu, Shuhang Gu, Lixin Duan et al.
TAROT: Towards Essentially Domain-Invariant Robustness with Theoretical Justification
Dongyoon Yang, Jihu Lee, Yongdai Kim
Image Reconstruction from Readout-Multiplexed Single-Photon Detector Arrays
Shashwath Bharadwaj, Ruangrawee Kitichotkul, Akshay Agarwal et al.
Concept Lancet: Image Editing with Compositional Representation Transplant
Jinqi Luo, Tianjiao Ding, Kwan Ho Ryan Chan et al.
Efficient Dynamic Scene Editing via 4D Gaussian-based Static-Dynamic Separation
Joohyun Kwon, Hanbyel Cho, Junmo Kim
Test-time Augmentation Improves Efficiency in Conformal Prediction
Divya M Shanmugam, Helen Lu, Swami Sankaranarayanan et al.
MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World
Ankit Dhiman, Manan Shah, R. Venkatesh Babu
Attraction Diminishing and Distributing for Few-Shot Class-Incremental Learning
Li-Jun Zhao, Zhen-Duo Chen, Yongxin Wang et al.
Towards Smart Point-and-Shoot Photography
Jiawan Li, Fei Zhou, Zhipeng Zhong et al.
Anchor-Aware Similarity Cohesion in Target Frames Enables Predicting Temporal Moment Boundaries in 2D
Jiawei Tan, Hongxing Wang, Junwu Weng et al.
VSNet: Focusing on the Linguistic Characteristics of Sign Language
Yuhao Li, Xinyue Chen, Hongkai Li et al.
MaSS13K: A Matting-level Semantic Segmentation Benchmark
Chenxi Xie, Minghan LI, Hui Zeng et al.
Sketchtopia: A Dataset and Foundational Agents for Benchmarking Asynchronous Multimodal Communication with Iconic Feedback
Mohd Hozaifa Khan, Ravi Kiran Sarvadevabhatla
DART: Disease-aware Image-Text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation
Sang-Jun Park, Keun-Soo Heo, Dong-Hee Shin et al.
Simulator HC: Regression-based Online Simulation of Starting Problem-Solution Pairs for Homotopy Continuation in Geometric Vision
Xinyue Zhang, Zijia Dai, Wanting Xu et al.
Automatic Spectral Calibration of Hyperspectral Images: Method, Dataset and Benchmark
Zhuoran Du, Shaodi You, Cheng Cheng et al.
Animating General Image with Large Visual Motion Model
Dengsheng Chen, Xiaoming Wei, Xiaolin Wei
Self-Supervised Large Scale Point Cloud Completion for Archaeological Site Restoration
Aocheng Li, James R. Zimmer-Dauphinee, Rajesh Kalyanam et al.
GA3CE: Unconstrained 3D Gaze Estimation with Gaze-Aware 3D Context Encoding
Yuki Kawana, Shintaro Shiba, Quan Kong et al.
LIM: Large Interpolator Model for Dynamic Reconstruction
Remy Sabathier, Niloy J. Mitra, David Novotny
Unlocking the Potential of Unlabeled Data in Semi-Supervised Domain Generalization
Dongkwan Lee, Kyomin Hwang, Nojun Kwak
Pose-Guided Temporal Enhancement for Robust Low-Resolution Hand Reconstruction
Kaixin Fan, Pengfei Ren, Jingyu Wang et al.
Fitted Neural Lossless Image Compression
Zhe Zhang, Zhenzhong Chen, Shan Liu
MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark
Sanghyun Woo, Kwanyong Park, Inkyu Shin et al.
Attribute-Missing Multi-view Graph Clustering
Bowen Zhao, Qianqian Wang, Zhengming Ding et al.
D^3CTTA: Domain-Dependent Decorrelation for Continual Test-Time Adaption of 3D LiDAR Segmentation
Jichun Zhao, Haiyong Jiang, Haoxuan Song et al.
Self-Supervised Learning for Color Spike Camera Reconstruction
Yanchen Dong, Ruiqin Xiong, Xiaopeng Fan et al.
OpticalNet: An Optical Imaging Dataset and Benchmark Beyond the Diffraction Limit
Benquan Wang, Ruyi An, Jin-Kyu So et al.
CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction
Yuan Zhou, Qingshan Xu, Jiequan Cui et al.
Efficient Data Driven Mixture-of-Expert Extraction from Trained Networks
Uranik Berisha, Jens Mehnert, Alexandru Paul Condurache
SVFR: A Unified Framework for Generalized Video Face Restoration
Zhiyao Wang, Xu Chen, Chengming Xu et al.
SemiDAViL: Semi-supervised Domain Adaptation with Vision-Language Guidance for Semantic Segmentation
Hritam Basak, Zhaozheng Yin
EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Priors
Zhipeng Hu, Minda Zhao, Chaoyi Zhao et al.
Quad-Pixel Image Defocus Deblurring: A New Benchmark and Model
Hang Chen, Yin Xie, Xiaoxiu Peng et al.
Generalized Gaussian Entropy Model for Point Cloud Attribute Compression with Dynamic Likelihood Intervals
Changhao Peng
Hierarchical Gaussian Mixture Model Splatting for Efficient and Part Controllable 3D Generation
Qitong Yang, Mingtao Feng, Zijie Wu et al.
EnliveningGS: Active Locomotion of 3DGS
Siyuan Shen, Tianjia Shao, Kun Zhou et al.
Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression
Dohyun Kim, Sehwan Park, GeonHee Han et al.
Soft Self-labeling and Potts Relaxations for Weakly-supervised Segmentation
Zhongwen Zhang, Yuri Boykov
TADFormer: Task-Adaptive Dynamic TransFormer for Efficient Multi-Task Learning
Seungmin Baek, Soyul Lee, Hayeon Jo et al.
Towards Scalable Human-aligned Benchmark for Text-guided Image Editing
Suho Ryu, Kihyun Kim, Eugene Baek et al.
HeatFormer: A Neural Optimizer for Multiview Human Mesh Recovery
Yuto Matsubara, Ko Nishino
HELVIPAD: A Real-World Dataset for Omnidirectional Stereo Depth Estimation
Mehdi Zayene, Albias Havolli, Jannik Endres et al.
HSI: A Holistic Style Injector for Arbitrary Style Transfer
Shuhao Zhang, Hui Kang, Yang Liu et al.
Empowering Vector Graphics with Consistently Arbitrary Viewing and View-dependent Visibility
Yidi Li, Jun Xiao, Zhengda Lu et al.
Adapting to Observation Length of Trajectory Prediction via Contrastive Learning
Ruiqi Qiu, JUN GONG, Xinyu Zhang et al.
Towards Explicit Geometry-Reflectance Collaboration for Generalized LiDAR Segmentation in Adverse Weather
Longyu Yang, Ping Hu, Shangbo Yuan et al.
IM-Portrait: Learning 3D-aware Video Diffusion for Photorealistic Talking Heads from Monocular Videos
Yuan Li, Ziqian Bai, Feitong Tan et al.
APT: Adaptive Personalized Training for Diffusion Models with Limited Data
JungWoo Chae, Jiyoon Kim, Jaewoong Choi et al.
Multi-modal Contrastive Learning with Negative Sampling Calibration for Phenotypic Drug Discovery
Jiahua Rao, Hanjing Lin, Leyu Chen et al.
Creating Your Editable 3D Photorealistic Avatar with Tetrahedron-constrained Gaussian Splatting
Hanxi Liu, Yifang Men, Zhouhui Lian
Dual Semantic Guidance for Open Vocabulary Semantic Segmentation
ZhengYang Wang, Tingliang Feng, Fan Lyu et al.
Boost the Inference with Co-training: A Depth-guided Mutual Learning Framework for Semi-supervised Medical Polyp Segmentation
Yuxin Li, Zihao Zhu, Yuxiang Zhang et al.
Targeted Forgetting of Image Subgroups in CLIP Models
Zeliang Zhang, Gaowen Liu, Charles Fleming et al.
HyperPose: Hypernetwork-Infused Camera Pose Localization and an Extended Cambridge Landmarks Dataset
Ron Ferens, Yosi Keller
MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection
Rishubh Parihar, Srinjay Sarkar, Sarthak Vora et al.
Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling
Nannan Li, Kevin Shih, Bryan A. Plummer
WildAvatar: Learning In-the-wild 3D Avatars from the Web
Zihao Huang, Shoukang Hu, Guangcong Wang et al.
Sparse Point Cloud Patches Rendering via Splitting 2D Gaussians
Changfeng Ma, Ran Bi, Jie Guo et al.
Libra-Merging: Importance-redundancy and Pruning-merging Trade-off for Acceleration Plug-in in Large Vision-Language Model
Longrong Yang, Dong Shen, Chaoxiang Cai et al.
AdaptCMVC: Robust Adaption to Incremental Views in Continual Multi-view Clustering
Jing Wang, Songhe Feng, Kristoffer Knutsen Wickstrøm et al.
ArtiFade: Learning to Generate High-quality Subject from Blemished Images
Shuya Yang, Shaozhe Hao, Yukang Cao et al.
DynPose: Largely Improving the Efficiency of Human Pose Estimation by a Simple Dynamic Framework
Yalong Xu, Lin Zhao, Chen Gong et al.
Twinner: Shining Light on Digital Twins in a Few Snaps
Jesus Zarzar, Tom Monnier, Roman Shapovalov et al.
Acc3D: Accelerating Single Image to 3D Diffusion Models via Edge Consistency Guided Score Distillation
Kendong Liu, Zhiyu Zhu, Hui LIU et al.
MAGE: Single Image to Material-Aware 3D via the Multi-View G-Buffer Estimation Model
Haoyuan Wang, Zhenwei Wang, Xiaoxiao Long et al.
GroundingFace: Fine-grained Face Understanding via Pixel Grounding Multimodal Large Language Model
Yue Han, Jiangning Zhang, Junwei Zhu et al.
Self-Supervised Cross-View Correspondence with Predictive Cycle Consistency
Alan Baade, Changan Chen
Training Data Provenance Verification: Did Your Model Use Synthetic Data from My Generative Model for Training?
Yuechen Xie, Jie Song, Huiqiong Wang et al.
Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata
Dongsu Zhang, Francis Williams, Žan Gojčič et al.
Minimal Interaction Separated Tuning: A New Paradigm for Visual Adaptation
Ningyuan Tang, Minghao Fu, Jianxin Wu
PARC: A Quantitative Framework Uncovering the Symmetries within Vision Language Models
Jenny Schmalfuss, Nadine Chang, Vibashan VS et al.
Mamba-Adaptor: State Space Model Adaptor for Visual Recognition
Fei Xie, Jiahao Nie, Yujin Tang et al.
CURSOR: Scalable Mixed-Order Hypergraph Matching with CUR Decomposition
Qixuan Zheng, Ming Zhang, Hong Yan
3D Prior Is All You Need: Cross-Task Few-shot 2D Gaze Estimation
Yihua Cheng, Hengfei Wang, Zhongqun Zhang et al.
Hierarchical Compact Clustering Attention (COCA) for Unsupervised Object-Centric Learning
Can Küçüksözen, Yucel Yemez
Revisiting Audio-Visual Segmentation with Vision-Centric Transformer
Shaofei Huang, Rui Ling, Tianrui Hui et al.
Towards Natural Language-Based Document Image Retrieval: New Dataset and Benchmark
Hao Guo, Xugong Qin, Jun Jie Ou Yang et al.
Revisiting Fairness in Multitask Learning: A Performance-Driven Approach for Variance Reduction
Xiaohan Qin, Xiaoxing Wang, Junchi Yan
PHGC: Procedural Heterogeneous Graph Completion for Natural Language Task Verification in Egocentric Videos
Xun Jiang, Zhiyi Huang, Xing Xu et al.
CSC-PA: Cross-image Semantic Correlation via Prototype Attentions for Single-network Semi-supervised Breast Tumor Segmentation
Zhenhui Ding, Guilian Chen, Qin Zhang et al.
Unlocking Generalization Power in LiDAR Point Cloud Registration
Zhenxuan Zeng, Qiao Wu, Xiyu Zhang et al.
Dual Energy-Based Model with Open-World Uncertainty Estimation for Out-of-distribution Detection
Qi Chen, Hu Ding
Few-shot Implicit Function Generation via Equivariance
Suizhi Huang, Xingyi Yang, Hongtao Lu et al.
PAVE: Patching and Adapting Video Large Language Models
Zhuoming Liu, Yiquan Li, Khoi D Nguyen et al.
Seeing A 3D World in A Grain of Sand
Yufan Zhang, Yu Ji, Yu Guo et al.
Feature Spectrum Learning for Remote Sensing Change Detection
Qi Zang, Dong Zhao, Shuang Wang et al.
Unified Reconstruction of Static and Dynamic Scenes from Events
Qiyao Gao, Peiqi Duan, Hanyue Lou et al.
Explicit Depth-Aware Blurry Video Frame Interpolation Guided by Differential Curves
Yan Zaoming, Pengcheng Lei, Tingting Wang et al.
Active Event-based Stereo Vision
Jianing Li, Yunjian Zhang, Haiqian Han et al.
ESC: Erasing Space Concept for Knowledge Deletion
Tae-Young Lee, Sundong Park, Minwoo Jeon et al.
Latent Space Imaging
Matheus Souza, Yidan Zheng, Kaizhang Kang et al.
VideoSPatS: Video SPatiotemporal Splines for Disentangled Occlusion, Appearance and Motion Modeling and Editing
Juan Luis Gonzalez Bello, Xu Yao, Alex Whelan et al.
PIAD: Pose and Illumination agnostic Anomaly Detection
Kaichen Yang, Junjie Cao, Zeyu Bai et al.
HiLoTs: High-Low Temporal Sensitive Representation Learning for Semi-Supervised LiDAR Segmentation in Autonomous Driving
R.D. Lin, Pengcheng Weng, Yinqiao Wang et al.
Odd-One-Out: Anomaly Detection by Comparing with Neighbors
Ankan Kumar Bhunia, Changjian Li, Hakan Bilen
Occlusion-aware Text-Image-Point Cloud Pretraining for Open-World 3D Object Recognition
Khanh Nguyen, Ghulam Mubashar Hassan, Ajmal Mian
Implicit Correspondence Learning for Image-to-Point Cloud Registration
Xinjun Li, Wenfei Yang, Jiacheng Deng et al.
Link to the Past: Temporal Propagation for Fast 3D Human Reconstruction from Monocular Video
Marchellus Matthew, Nadhira Noor, In Kyu Park
Symbolic Representation for Any-to-Any Generative Tasks
Jiaqi Chen, Xiaoye Zhu, Yue Wang et al.
Depth-Guided Bundle Sampling for Efficient Generalizable Neural Radiance Field Reconstruction
Li Fang, Hao Zhu, Longlong Chen et al.
Practical Solutions to the Relative Pose of Three Calibrated Cameras
Charalambos Tzamos, Viktor Kocur, Yaqing Ding et al.
Take the Bull by the Horns: Learning to Segment Hard Samples
Yuan Guo, Jingyu Kong, Yu Wang et al.
Non-Rigid Structure-from-Motion: Temporally-Smooth Procrustean Alignment and Spatially-Variant Deformation Modeling
Jiawei Shi, Hui Deng, Yuchao Dai
Can Text-to-Video Generation help Video-Language Alignment?
Luca Zanella, Massimiliano Mancini, Willi Menapace et al.
Data-free Universal Adversarial Perturbation with Pseudo-semantic Prior
Chanhui Lee, Yeonghwan Song, Jeany Son
SAMBLE: Shape-Specific Point Cloud Sampling for an Optimal Trade-Off Between Local Detail and Global Uniformity
Chengzhi Wu, Yuxin Wan, Hao Fu et al.
CamPoint: Boosting Point Cloud Segmentation with Virtual Camera
Jianhui Zhang, Luo Yizhi, Zicheng Zhang et al.
Vision-Guided Action: Enhancing 3D Human Motion Prediction with Gaze-informed Affordance in 3D Scenes
Ting Yu, Yi Lin, Jun Yu et al.
VRetouchEr: Learning Cross-frame Feature Interdependence with Imperfection Flow for Face Retouching in Videos
Wen Xue, Le Jiang, Lianxin Xie et al.
RASP: Revisiting 3D Anamorphic Art for Shadow-Guided Packing of Irregular Objects
Soumyaratna Debnath, Ashish Tiwari, Kaustubh Sadekar et al.
MFogHub: Bridging Multi-Regional and Multi-Satellite Data for Global Marine Fog Detection and Forecasting
Mengqiu XU, Kaixin Chen, Heng Guo et al.
EvOcc: Accurate Semantic Occupancy for Automated Driving Using Evidence Theory
Jonas Kälble, Sascha Wirges, Maxim Tatarchenko et al.
FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video
Andrea Boscolo Camiletto, Jian Wang, Eduardo Alvarado et al.
Dense Dispersed Structured Light for Hyperspectral 3D Imaging of Dynamic Scenes
Suhyun Shin, Seungwoo Yoon, Ryota Maeda et al.
Customized Condition Controllable Generation for Video Soundtrack
Fan Qi, KunSheng Ma, Changsheng Xu
Object Dynamics Modeling with Hierarchical Point Cloud-based Representations
Chanho Kim, Li Fuxin
Beyond Image Classification: A Video Benchmark and Dual-Branch Hybrid Discrimination Framework for Compositional Zero-Shot Learning
Dongyao Jiang, Haodong Jing, Yongqiang Ma et al.
Explaining Domain Shifts in Language: Concept Erasing for Interpretable Image Classification
Zequn Zeng, Yudi Su, Jianqiao Sun et al.
LoKi: Low-dimensional KAN for Efficient Fine-tuning Image Models
Xuan Cai, Renjie Pan, Hua Yang
WISNet: Pseudo Label Generation on Unbalanced and Patch Annotated Waste Images
Shifan Zhang, Hongzi Zhu, Yinan He et al.
Semantic Line Combination Detector
JINWON KO, Dongkwon Jin, Chang-Su Kim
MAC-Ego3D: Multi-Agent Gaussian Consensus for Real-Time Collaborative Ego-Motion and Photorealistic 3D Reconstruction
Xiaohao Xu, Feng Xue, Shibo Zhao et al.
Difference Inversion: Interpolate and Isolate the Difference with Token Consistency for Image Analogy Generation
Hyunsoo Kim, Donghyun Kim, Suhyun Kim
GaPT-DAR: Category-level Garments Pose Tracking via Integrated 2D Deformation and 3D Reconstruction
Li Zhang, Mingliang Xu, Jianan Wang et al.
Meta-Learning Hyperparameters for Parameter Efficient Fine-Tuning
Zichen Tian, Yaoyao Liu, Qianru Sun
Attention IoU: Examining Biases in CelebA using Attention Maps
Aaron Serianni, Tyler Zhu, Olga Russakovsky et al.
CoSER: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation
Bonan Li, Zicheng Zhang, Xingyi Yang et al.
Sketchy Bounding-box Supervision for 3D Instance Segmentation
Qian Deng, Le Hui, Jin Xie et al.
Medusa: A Multi-Scale High-order Contrastive Dual-Diffusion Approach for Multi-View Clustering
Liang Chen, Zhe Xue, Yawen Li et al.
PhysicsGen: Can Generative Models Learn from Images to Predict Complex Physical Relations?
Martin Spitznagel, Jan Vaillant, Janis Keuper
FSboard: Over 3 Million Characters of ASL Fingerspelling Collected via Smartphones
Manfred Georg, Garrett Tanzer, Esha Uboweja et al.
CaMuViD: Calibration-Free Multi-View Detection
Amir Etefaghi Daryani, M. Usman Maqbool Bhutta, Byron Hernandez et al.
De^2Gaze: Deformable and Decoupled Representation Learning for 3D Gaze Estimation
Yunfeng Xiao, Xiaowei Bai, Baojun Chen et al.
Saliuitl: Ensemble Salience Guided Recovery of Adversarial Patches against CNNs
Mauricio Byrd Victorica, György Dán, Henrik Sandberg
Improving Personalized Search with Regularized Low-Rank Parameter Updates
Fiona Ryan, Josef Sivic, Fabian Caba Heilbron et al.
COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts
Jiansheng Li, Xingxuan Zhang, Hao Zou et al.
Data Distributional Properties As Inductive Bias for Systematic Generalization
Felipe del Rio, Alain Raymond, Daniel Florea et al.
Polarized Color Screen Matting
Kenji Enomoto, Scott Cohen, Brian Price et al.
UMFN: Unified Multi-Domain Face Normalization for Joint Cross-domain Prototype Learning and Heterogeneous Face Recognition
Meng Pang, Wenjun Zhang, Nanrun Zhou et al.
Poly-Autoregressive Prediction for Modeling Interactions
Neerja Thakkar, Tara Sadjadpour, Jathushan Rajasegaran et al.
Incorporating Dense Knowledge Alignment into Unified Multimodal Representation Models
Yuhao Cui, Xinxing Zu, Wenhua Zhang et al.
PURA: Parameter Update-Recovery Test-Time Adaption for RGB-T Tracking
Zekai Shao, Yufan Hu, Bin Fan et al.
SinGS: Animatable Single-Image Human Gaussian Splats with Kinematic Priors
Yufan Wu, Xuanhong Chen, Wen Li et al.
AUEditNet: Dual-Branch Facial Action Unit Intensity Manipulation with Implicit Disentanglement
Shiwei Jin, Zhen Wang, Lei Wang et al.
ShapeShifter: 3D Variations Using Multiscale and Sparse Point-Voxel Diffusion
Nissim Maruani, Wang Yifan, Matthew Fisher et al.
Spk2SRImgNet: Super-Resolve Dynamic Scene from Spike Stream via Motion Aligned Collaborative Filtering
Yuanlin Wang, Yiyang Zhang, Ruiqin Xiong et al.
Graph Neural Network Combining Event Stream and Periodic Aggregation for Low-Latency Event-based Vision
Manon Dampfhoffer, Thomas Mesquida, Damien Joubert et al.
Revisiting Generative Replay for Class Incremental Object Detection
Shizhou Zhang, Xueqiang Lv, Yinghui Xing et al.
Homogeneous Dynamics Space for Heterogeneous Humans
Xinpeng Liu, Junxuan Liang, Chenshuo Zhang et al.
MetricGrids: Arbitrary Nonlinear Approximation with Elementary Metric Grids based Implicit Neural Representation
Shu Wang, Yanbo Gao, Shuai Li et al.
Boosting Point-Supervised Temporal Action Localization through Integrating Query Reformation and Optimal Transport
Mengnan Liu, Le Wang, Sanping Zhou et al.
DITTO: Dual and Integrated Latent Topologies for Implicit 3D Reconstruction
Jaehyeok Shim, Kyungdon Joo
Fooling Polarization-Based Vision using Locally Controllable Polarizing Projection
Zhuoxiao Li, Zhihang Zhong, Shohei Nobuhara et al.
SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding
Chenkai Zhang, Yiming Lei, Zeming Liu et al.
PaReNeRF: Toward Fast Large-scale Dynamic NeRF with Patch-based Reference
Xiao Tang, Min Yang, Penghui Sun et al.
Tuning Stable Rank Shrinkage: Aiming at the Overlooked Structural Risk in Fine-tuning
Sicong Shen, Yang Zhou, Bingzheng Wei et al.
SD2Event: Self-supervised Learning of Dynamic Detectors and Contextual Descriptors for Event Cameras
Yuan Gao, Yuqing Zhu, Xinjun Li et al.
See, Say, and Segment: Teaching LMMs to Overcome False Premises
Tsung-Han Wu, Giscard Biamby, David Chan et al.
MTLoRA: Low-Rank Adaptation Approach for Efficient Multi-Task Learning
Ahmed Agiza, Marina Neseem, Sherief Reda
LiDAR-Net: A Real-scanned 3D Point Cloud Dataset for Indoor Scenes
Yanwen Guo, Yuanqi Li, Dayong Ren et al.
D3still: Decoupled Differential Distillation for Asymmetric Image Retrieval
Yi Xie, Yihong Lin, Wenjie Cai et al.
Attack To Defend: Exploiting Adversarial Attacks for Detecting Poisoned Models
Samar Fares, Karthik Nandakumar
Shallow-Deep Collaborative Learning for Unsupervised Visible-Infrared Person Re-Identification
Bin Yang, Jun Chen, Mang Ye
Shadow-Enlightened Image Outpainting
Hang Yu, Ruilin Li, Shaorong Xie et al.
Validating Privacy-Preserving Face Recognition under a Minimum Assumption
Hui Zhang, Xingbo Dong, YenLungLai et al.
Spatial-Aware Regression for Keypoint Localization
Dongkai Wang, Shiliang Zhang
Don’t Drop Your Samples! Coherence-Aware Training Benefits Conditional Diffusion
Nicolas Dufour, Victor Besnier, Vicky Kalogeiton et al.
IDGuard: Robust General Identity-centric POI Proactive Defense Against Face Editing Abuse
Yunshu Dai, Jianwei Fei, Fangjun Huang
Edge-Aware 3D Instance Segmentation Network with Intelligent Semantic Prior
Wonseok Roh, Hwanhee Jung, Giljoo Nam et al.
Forecasting of 3D Whole-body Human Poses with Grasping Objects
Yan Haitao, Qiongjie Cui, Jiexin Xie et al.
DIOD: Self-Distillation Meets Object Discovery
Sandra Kara, Hejer AMMAR, Julien Denize et al.
Pose-Transformed Equivariant Network for 3D Point Trajectory Prediction
Ruixuan Yu, Jian Sun
Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding
Wujian Peng, Sicheng Xie, Zuyao You et al.
3DToonify: Creating Your High-Fidelity 3D Stylized Avatar Easily from 2D Portrait Images
Yifang Men, Hanxi Liu, Yuan Yao et al.
View From Above: Orthogonal-View aware Cross-view Localization
Shan Wang, Chuong Nguyen, Jiawei Liu et al.
Pixel-level Semantic Correspondence through Layout-aware Representation Learning and Multi-scale Matching Integration
Yixuan Sun, Zhangyue Yin, Haibo Wang et al.
GPLD3D: Latent Diffusion of 3D Shape Generative Models by Enforcing Geometric and Physical Priors
Yuan Dong, Qi Zuo, Xiaodong Gu et al.
JoAPR: Cleaning the Lens of Prompt Learning for Vision-Language Models
YUNCHENG GUO, Xiaodong Gu
Compositional Video Understanding with Spatiotemporal Structure-based Transformers
Hoyeoung Yun, Jinwoo Ahn, Minseo Kim et al.
Improving Training Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architecture
Huijie Zhang, Yifu Lu, Ismail Alkhouri et al.
Class Tokens Infusion for Weakly Supervised Semantic Segmentation
Sung-Hoon Yoon, Hoyong Kwon, Hyeonseong Kim et al.
Dual-Consistency Model Inversion for Non-Exemplar Class Incremental Learning
Zihuan Qiu, Yi Xu, Fanman Meng et al.
Training Vision Transformers for Semi-Supervised Semantic Segmentation
Xinting Hu, Li Jiang, Bernt Schiele
Person in Place: Generating Associative Skeleton-Guidance Maps for Human-Object Interaction Image Editing
ChangHee Yang, ChanHee Kang, Kyeongbo Kong et al.
Estimating Extreme 3D Image Rotations using Cascaded Attention
Shay Dekel, Yosi Keller, Martin Čadík
Open-Vocabulary 3D Semantic Segmentation with Foundation Models
Li Jiang, Shaoshuai Shi, Bernt Schiele
Draw Step by Step: Reconstructing CAD Construction Sequences from Point Clouds via Multimodal Diffusion
Weijian Ma, Shuaiqi Chen, Yunzhong Lou et al.
Absolute Pose from One or Two Scaled and Oriented Features
Jonathan Ventura, Zuzana Kukelova, Torsten Sattler et al.
Higher-order Relational Reasoning for Pedestrian Trajectory Prediction
Sungjune Kim, Hyung-gun Chi, Hyerin Lim et al.
TransLoc4D: Transformer-based 4D Radar Place Recognition
Guohao Peng, Heshan Li, Yangyang Zhao et al.
Domain Gap Embeddings for Generative Dataset Augmentation
Yinong Oliver Wang, Younjoon Chung, Chen Henry Wu et al.
DeMatch: Deep Decomposition of Motion Field for Two-View Correspondence Learning
Shihua Zhang, Zizhuo Li, Yuan Gao et al.
CORES: Convolutional Response-based Score for Out-of-distribution Detection
Keke Tang, Chao Hou, Weilong Peng et al.
HOI-M^3: Capture Multiple Humans and Objects Interaction within Contextual Environment
Juze Zhang, Jingyan Zhang, Zining Song et al.