Most Cited ICCV "open-set domain generalization" Papers
2,701 papers found • Page 8 of 14
Learnable Fractional Reaction-Diffusion Dynamics for Under-Display ToF Imaging and Beyond
Xin Qiao, Matteo Poggi, Xing Wei et al.
Discretized Gaussian Representation for Tomographic Reconstruction
Shaokai Wu, Yuxiang Lu, Yapan Guo et al.
3D Test-time Adaptation via Graph Spectral Driven Point Shift
Xin Wei, Qin Yang, Yijie Fang et al.
EMoTive: Event-guided Trajectory Modeling for 3D Motion Estimation
Zengyu Wan, Wei Zhai, Yang Cao et al.
KDA: Knowledge Diffusion Alignment with Enhanced Context for Video Temporal Grounding
Ran Ran, Jiwei Wei, Shiyuan He et al.
VisNumBench: Evaluating Number Sense of Multimodal Large Language Models
Tengjin Weng, Jingyi Wang, Wenhao Jiang et al.
STEP-DETR: Advancing DETR-based Semi-Supervised Object Detection with Super Teacher and Pseudo-Label Guided Text Queries
Tahira Shehzadi, Khurram Azeem Hashmi, Shalini Sarode et al.
Completing 3D Partial Assemblies with View-Consistent 2D-3D Correspondence
Weihao Wang, Yu Lan, Mingyu You et al.
Aligning Global Semantics and Local Textures in Generative Video Enhancement
Zhikai Chen, Fuchen Long, Zhaofan Qiu et al.
Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampling
Hayeon Kim, Ji Ha Jang, Se Young Chun
Structure Matters: Revisiting Boundary Refinement in Video Object Segmentation
Guanyi Qin, Ziyue Wang, Daiyun Shen et al.
AIM: Amending Inherent Interpretability via Self-Supervised Masking
Eyad Alshami, Shashank Agnihotri, Bernt Schiele et al.
One Last Attention for Your Vision-Language Model
Liang Chen, Ghazi Shazan Ahmad, Tianjun Yao et al.
RobustSplat: Decoupling Densification and Dynamics for Transient-Free 3DGS
Chuanyu Fu, Yuqi Zhang, Kunbin Yao et al.
High-Resolution Spatiotemporal Modeling with Global-Local State Space Models for Video-Based Human Pose Estimation
Runyang Feng, Hyung Jin Chang, Tze Ho Elden Tse et al.
Pi-GPS: Enhancing Geometry Problem Solving by Unleashing the Power of Diagrammatic Information
Junbo Zhao, Ting Zhang, Jiayu Sun et al.
Mitigating Catastrophic Overfitting in Fast Adversarial Training via Label Information Elimination
Chao Pan, Ke Tang, Li Qing et al.
Consistency Trajectory Matching for One-Step Generative Super-Resolution
Weiyi You, Mingyang Zhang, Leheng Zhang et al.
Amodal Depth Anything: Amodal Depth Estimation in the Wild
Zhenyu Li, Mykola Lavreniuk, Jian Shi et al.
One Perturbation is Enough: On Generating Universal Adversarial Perturbations against Vision-Language Pre-training Models
Hao Fang, Jiawei Kong, Wenbo Yu et al.
CVPT: Cross Visual Prompt Tuning
Lingyun Huang, Jianxu Mao, Junfei Yi et al.
DDB: Diffusion Driven Balancing to Address Spurious Correlations
Aryan Yazdan Parast, Basim Azam, Naveed Akhtar
Geometric Alignment and Prior Modulation for View-Guided Point Cloud Completion on Unseen Categories
Jingqiao Xiu, Yicong Li, Na Zhao et al.
AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation
Guanxing Lu, Tengbo Yu, Haoyuan Deng et al.
FVGen: Accelerating Novel-View Synthesis with Adversarial Video Diffusion Distillation
Wenbin Teng, Gonglin Chen, Haiwei Chen et al.
CoralSRT: Revisiting Coral Reef Semantic Segmentation by Feature Rectifying via Self-supervised Guidance
Zheng Ziqiang, Wong Kwan, Binh-Son Hua et al.
Learning Dense Feature Matching via Lifting Single 2D Image to 3D Space
Yingping Liang, Yutao Hu, Wenqi Shao et al.
Diagnosing Pretrained Models for Out-of-distribution Detection
Haipeng Xiong, Kai Xu, Angela Yao
What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?
Jinhong Ni, Chang-Bin Zhang, Qiang Zhang et al.
Learning Normals of Noisy Points by Local Gradient-Aware Surface Filtering
Qing Li, Huifang Feng, Xun Gong et al.
Bayesian-Inspired Space-Time Superpixels
Kent Gauen, Stanley Chan
INSTINCT: Instance-Level Interaction Architecture for Query-Based Collaborative Perception
Yunjiang Xu, Yupeng Ouyang, Lingzhi Li et al.
Debiased Curriculum Adaptation for Safe Transfer Learning in Chest X-ray Classification
Mingyang Liu, Xinyang Chen, Yang Shu et al.
PHATNet: A Physics-guided Haze Transfer Network for Domain-adaptive Real-world Image Dehazing
Fu-Jen Tsai, Yan-Tsung Peng, Yen-Yu Lin et al.
Forensic-MoE: Exploring Comprehensive Synthetic Image Detection Traces with Mixture of Experts
Mingqi Fang, Ziguang Li, Lingyun Yu et al.
Information-Bottleneck Driven Binary Neural Network for Change Detection
Kaijie Yin, Zhiyuan Zhang, Shu Kong et al.
Entropy-Adaptive Diffusion Policy Optimization with Dynamic Step Alignment
Renye Yan, Jikang Cheng, Yaozhong Gan et al.
Time-Aware Auto White Balance in Mobile Photography
Mahmoud Afifi, Luxi Zhao, Abhijith Punnappurath et al.
Leveraging Panoptic Scene Graph for Evaluating Fine-Grained Text-to-Image Generation
Xueqing Deng, Linjie Yang, Qihang Yu et al.
ViewSRD: 3D Visual Grounding via Structured Multi-View Decomposition
Ronggang Huang, Haoxin Yang, Yan Cai et al.
Physical Degradation Model-Guided Interferometric Hyperspectral Reconstruction with Unfolding Transformer
Yuansheng Li, Yunhao Zou, Linwei Chen et al.
VPR-Cloak: A First Look at Privacy Cloak Against Visual Place Recognition
Shuting Dong, Mingzhi Chen, Feng Lu et al.
GUAVA: Generalizable Upper Body 3D Gaussian Avatar
Dongbin Zhang, Yunfei Liu, Lijian Lin et al.
HOMO-Feature: Cross-Arbitrary-Modal Image Matching with Homomorphism of Organized Major Orientation
Chenzhong Gao, Wei Li, Desheng Weng
GSOT3D: Towards Generic 3D Single Object Tracking in the Wild
Yifan Jiao, Yunhao Li, Junhua Ding et al.
Dynamic-DINO: Fine-Grained Mixture of Experts Tuning for Real-time Open-Vocabulary Object Detection
Yehao Lu, Minghe Weng, Zekang Xiao et al.
WAVE: Warp-Based View Guidance for Consistent Novel View Synthesis Using a Single Image
Jiwoo Park, Tae Choi, Youngjun Jun et al.
Lightweight and Fast Real-time Image Enhancement via Decomposition of the Spatial-aware Lookup Tables
Wontae Kim, Keuntek Lee, Nam Ik Cho
Neural Multi-View Self-Calibrated Photometric Stereo without Photometric Stereo Cues
Xu Cao, Takafumi Taketomi
EMatch: A Unified Framework for Event-based Optical Flow and Stereo Matching
Pengjie Zhang, Lin Zhu, Xiao Wang et al.
CounterPC: Counterfactual Feature Realignment for Unsupervised Domain Adaptation on Point Clouds
Feng Yang, Yichao Cao, Xiu Su et al.
Liberated-GS: 3D Gaussian Splatting Independent from SfM Point Clouds
Weihong Pan, Xiaoyu Zhang, Hongjia Zhai et al.
Unlocking the Potential of Diffusion Priors in Blind Face Restoration
Yunqi Miao, Zhiyu Qu, Mingqi Gao et al.
Implicit Counterfactual Learning for Audio-Visual Segmentation
Mingfeng Zha, Tianyu Li, Guoqing Wang et al.
STaR: Seamless Spatial-Temporal Aware Motion Retargeting with Penetration and Consistency Constraints
Xiaohang Yang, Qing Wang, Jiahao Yang et al.
MRGen: Segmentation Data Engine For Underrepresented MRI Modalities
Haoning Wu, Ziheng Zhao, Ya Zhang et al.
Rethink Sparse Signals for Pose-guided Text-to-image Generation
Wenjie Xuan, Jing Zhang, Juhua Liu et al.
Single-Scanline Relative Pose Estimation for Rolling Shutter Cameras
Petr Hruby, Marc Pollefeys
Enhancing Transferability of Targeted Adversarial Examples via Inverse Target Gradient Competition and Spatial Distance Stretching
Zhankai Li, Weiping Wang, Jie Li et al.
LDPose: Towards Inclusive Human Pose Estimation for Limb-Deficient Individuals in the Wild
Jiaying Ying, Heming Du, Kaihao Zhang et al.
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion
Ahmed Nassar, Matteo Omenetti, Maksym Lysak et al.
Images as Noisy Labels: Unleashing the Potential of the Diffusion Model for Open-Vocabulary Semantic Segmentation
Fan Li, Xuanbin Wang, Xuan Wang et al.
ContextFace: Generating Facial Expressions from Emotional Contexts
Minjung Kim, Minsang Kim, Seung Jun Baek
SMP-Attack: Boosting the Transferability of Feature Importance-based Adversarial Attack with Semantics-aware Multi-granularity Patchout
Wen Yang, Guodong Liu, Di Ming
Spatial-Temporal Forgery Trace based Forgery Image Identification
Yilin Wang, Zunlei Feng, Jiachi Wang et al.
Towards Annotation-Free Evaluation: KPAScore for Human Keypoint Detection
Xiaoxiao Wang, Chunxiao Li, Peng Sun et al.
Ultra High-Resolution Image Inpainting with Patch-Based Content Consistency Adapter
JianHui Zhang, Shen Cheng, Qirui Sun et al.
Agreement aware and dissimilarity oriented GLOM
Ru Zeng, Yan Song, Yang Zhang et al.
MeasureXpert: Automatic Anthropometric Measurement Extraction from Two Unregistered, Partial, Posed, and Dressed Body Scans
Ran Zhao, Xinxin Dai, Pengpeng Hu et al.
DiMPLe - Disentangled Multi-Modal Prompt Learning: Enhancing Out-Of-Distribution Alignment with Invariant and Spurious Feature Separation
Umaima Rahman, Mohammad Yaqub, Dwarikanath Mahapatra
ResidualViT for Efficient Temporally Dense Video Encoding
Mattia Soldan, Fabian Caba Heilbron, Bernard Ghanem et al.
Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics
Ruining Li, Chuanxia Zheng, Christian Rupprecht et al.
Randomized Autoregressive Visual Generation
Qihang Yu, Ju He, Xueqing Deng et al.
Unsupervised RGB-D Point Cloud Registration for Scenes with Low Overlap and Photometric Inconsistency
Yejun Shou, Haocheng Wang, Lingfeng Shen et al.
TOGA: Temporally Grounded Open-Ended Video QA with Weak Supervision
Ayush Gupta, Anirban Roy, Rama Chellappa et al.
Training-free Geometric Image Editing on Diffusion Models
Hanshen Zhu, Zhen Zhu, Kaile Zhang et al.
Monocular Facial Appearance Capture in the Wild
Yingyan Xu, Kate Gadola, Prashanth Chandran et al.
Growing a Twig to Accelerate Large Vision-Language Models
Zhenwei Shao, Mingyang Wang, Zhou Yu et al.
SignRep: Enhancing Self-Supervised Sign Representations
Ryan Wong, Necati Cihan Camgoz, Richard Bowden
MixA: A Mixed Attention approach with Stable Lightweight Linear Attention to enhance Efficiency of Vision Transformers at the Edge
Sabbir Ahmed, Jingtao Li, Weiming Zhuang et al.
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
Junyuan Zhang, Qintong Zhang, Bin Wang et al.
Efficient Event Camera Data Pretraining with Adaptive Prompt Fusion
Quanmin Liang, Qiang Li, Shuai Liu et al.
Head2Body: Body Pose Generation from Multi-sensory Head-mounted Inputs
Minh Tran, Hongda Mao, Qingshuang Chen et al.
Looking in the Mirror: A Faithful Counterfactual Explanation Method for Interpreting Deep Image Classification Models
Townim Chowdhury, Vu Phan, Kewen Liao et al.
FLSeg: Enhancing Privacy and Robustness in Federated Learning under Heterogeneous Data via Model Segmentation
Zichun Su, Zhi Lu, Yutong Wu et al.
Self-Calibrating Gaussian Splatting for Large Field-of-View Reconstruction
Youming Deng, Wenqi Xian, Guandao Yang et al.
Gradient Decomposition and Alignment for Incremental Object Detection
Wenlong Luo, Shizhou Zhang, De Cheng et al.
MSQ: Memory-Efficient Bit Sparsification Quantization
Seokho Han, Seoyeon Yoon, Jinhee Kim et al.
When and Where do Data Poisons Attack Textual Inversion?
Jeremy Styborski, Mingzhi Lyu, Jiayou Lu et al.
SRefiner: Soft-Braid Attention for Multi-Agent Trajectory Refinement
Liwen Xiao, Zhiyu Pan, Zhicheng Wang et al.
Rethinking Few Shot CLIP Benchmarks: A Critical Analysis in the Inductive Setting
Alexey Kravets, Da Chen, Vinay Namboodiri
AU-Blendshape for Fine-grained Stylized 3D Facial Expression Manipulation
Hao Li, Ju Dai, Feng Zhou et al.
BokehDiff: Neural Lens Blur with One-Step Diffusion
Chengxuan Zhu, Qingnan Fan, Qi Zhang et al.
Trial-Oriented Visual Rearrangement
Yuyi Liu, Xinhang Song, Tianliang Qi et al.
Debiased Teacher for Day-to-Night Domain Adaptive Object Detection
Yiming Cui, Liang Li, Haibing Yin et al.
SpikePack: Enhanced Information Flow in Spiking Neural Networks with High Hardware Compatibility
Guobin Shen, Jindong Li, Tenglong Li et al.
Social Debiasing for Fair Multi-modal LLMs
Harry Cheng, Yangyang Guo, Qingpei Guo et al.
Hierarchy-Aware Pseudo Word Learning with Text Adaptation for Zero-Shot Composed Image Retrieval
Zhe Li, Lei Zhang, Zheren Fu et al.
UPP: Unified Point-Level Prompting for Robust Point Cloud Analysis
Zixiang Ai, Zhenyu Cui, Yuxin Peng et al.
AV-Flow: Transforming Text to Audio-Visual Human-like Interactions
Aggelina Chatziagapi, Louis-Philippe Morency, Hongyu Gong et al.
Probabilistic Inertial Poser (ProbIP): Uncertainty-aware Human Motion Modeling from Sparse Inertial Sensors
Min Kim, Younho Jeon, Sungho Jo
SFUOD: Source-Free Unknown Object Detection
Keon-Hee Park, Seun-An Choe, Gyeong-Moon Park
Compression-Aware One-Step Diffusion Model for JPEG Artifact Removal
Jinpei Guo, Zheng Chen, Wenbo Li et al.
ConstStyle: Robust Domain Generalization with Unified Style Transformation
Nam Duong Tran, Nam Nguyen Phuong, Hieu Pham et al.
Golden Noise for Diffusion Models: A Learning Framework
Zikai Zhou, Shitong Shao, Lichen Bai et al.
Vision-Language Interactive Relation Mining for Open-Vocabulary Scene Graph Generation
Yukuan Min, Muli Yang, Jinhao Zhang et al.
OrderChain: Towards General Instruct-Tuning for Stimulating the Ordinal Understanding Ability of MLLM
Jinhong Wang, Shuo Tong, Jintai Chen et al.
Unified Open-World Segmentation with Multi-Modal Prompts
Yang Liu, Yufei Yin, Chenchen Jing et al.
LayerAnimate: Layer-level Control for Animation
Yuxue Yang, Lue Fan, Zuzeng Lin et al.
Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion
Shengyuan Zhang, An Zhao, Ling Yang et al.
SuperEvent: Cross-Modal Learning of Event-based Keypoint Detection for SLAM
Yannick Burkhardt, Simon Schaefer, Stefan Leutenegger
FedAGC: Federated Continual Learning with Asymmetric Gradient Correction
Chengchao Zhang, Fanhua Shang, Hongying Liu et al.
Joint Learning of Pose Regression and Denoising Diffusion with Score Scaling Sampling for Category-level 6D Pose Estimation
Seunghyun Lee, Tae-Kyun Kim
Intra-modal and Cross-modal Synchronization for Audio-visual Deepfake Detection and Temporal Localization
Ashutosh Anshul, Shreyas Gopal, Deepu Rajan et al.
CoTMR: Chain-of-Thought Multi-Scale Reasoning for Training-Free Zero-Shot Composed Image Retrieval
Zelong Sun, Dong Jing, Zhiwu Lu
The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation
Ho Kei Cheng, Alex Schwing
DepthSync: Diffusion Guidance-Based Depth Synchronization for Scale- and Geometry-Consistent Video Depth Estimation
Yue-Jiang Dong, Wang Zhao, Jiale Xu et al.
InfiniDreamer: Arbitrarily Long Human Motion Generation via Segment Score Distillation
Wenjie Zhuo, Fan Ma, Hehe Fan
Client2Vec: Improving Federated Learning by Distribution Shifts Aware Client Indexing
Yongxin Guo, Lin Wang, Xiaoying Tang et al.
Instance-Level Video Depth in Groups Beyond Occlusions
Yuan Liang, Yang Zhou, Ziming Sun et al.
Future-Aware Interaction Network For Motion Forecasting
Shijie Li, Chunyu Liu, Xun Xu et al.
DreamCube: RGB-D Panorama Generation via Multi-plane Synchronization
Yukun Huang, Yanning Zhou, Jianan Wang et al.
Optical Model-Driven Sharpness Mapping for Autofocus in Small Depth-of-Field and Severe Defocus Scenarios
Chen-Liang Fan, Mingpei Cao, Chih-Chien Hung et al.
HyPiDecoder: Hybrid Pixel Decoder for Efficient Segmentation and Detection
Fengzhe Zhou, Humphrey Shi
MMAD: Multi-label Micro-Action Detection in Videos
Kun Li, Pengyu Liu, Dan Guo et al.
Omni-scene Perception-oriented Point Cloud Geometry Enhancement for Coordinate Quantization
Wang Liu, Wei Gao
Auto-Regressive Transformation for Image Alignment
Kanggeon Lee, Soochahn Lee, Kyoung Mu Lee
Training-Free Industrial Defect Generation with Diffusion Models
Ruyi Xu, Yen-Tzu Chiu, Tai-I Chen et al.
Principles of Visual Tokens for Efficient Video Understanding
Xinyue Hao, Li, Shreyank Gowda et al.
TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models
Ruidong Chen, Honglin Guo, Lanjun Wang et al.
Task-Aware Prompt Gradient Projection for Parameter-Efficient Tuning Federated Class-Incremental Learning
Hualong Ke, Yachao Zhang, Jiangming Shi et al.
Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks
Jiawei Wang, Yushen Zuo, Yuanjun Chai et al.
Diff2I2P: Differentiable Image-to-Point Cloud Registration with Diffusion Prior
Juncheng Mu, Chengwei Ren, Weixiang Zhang et al.
KOEnsAttack: Towards Efficient Data-Free Black-Box Adversarial Attacks via Knowledge-Orthogonalized Substitute Ensembles
Chaoyong Yang, Jia-Li Yin, Bin Chen et al.
CLIP-Adapted Region-to-Text Learning for Generative Open-Vocabulary Semantic Segmentation
Jiannan Ge, Lingxi Xie, Hongtao Xie et al.
PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation
Hengjia Li, Haonan Qiu, Shiwei Zhang et al.
FuXi-RTM: A Physics-Guided Prediction Framework with Radiative Transfer Modeling
Qiusheng Huang, Xiaohui Zhong, Xu Fan et al.
DMQ: Dissecting Outliers of Diffusion Models for Post-Training Quantization
Dongyeun Lee, Jiwan Hur, Hyounguk Shon et al.
ImHead: A Large-scale Implicit Morphable Model for Localized Head Modeling
Rolandos Alexandros Potamias, Stathis Galanakis, Jiankang Deng et al.
Discovering Divergent Representations between Text-to-Image Models
Lisa Dunlap, Trevor Darrell, Joseph Gonzalez et al.
Clink! Chop! Thud! - Learning Object Sounds from Real-World Interactions
Mengyu Yang, Yiming Chen, Haozheng Pei et al.
Learning Visual Proxy for Compositional Zero-Shot Learning
Shiyu Zhang, Cheng Yan, Yang Liu et al.
LGA-Net: Learning Local and Global Affinities for Sparse Scribble based Image Colorization
Hongjin Lyu, Bo Li, Paul Rosin et al.
DISTIL: Data-Free Inversion of Suspicious Trojan Inputs via Latent Diffusion
Hossein Mirzaei, Zeinab Taghavi, Sepehr Rezaee et al.
Factorized Learning for Temporally Grounded Video-Language Models
Wenzheng Zeng, Difei Gao, Mike Zheng Shou et al.
MissRAG: Addressing the Missing Modality Challenge in Multimodal Large Language Models
Vittorio Pipoli, Alessia Saporita, Federico Bolelli et al.
Unknown Text Learning for CLIP-based Few-Shot Open-set Recognition
Rui Ma, Qilong Wang, Bing Cao et al.
Learning Separable Fine-Grained Representation via Dendrogram Construction from Coarse Labels for Fine-grained Visual Recognition
Guanghui Shi, Xuefeng Liang, Wenjie Li et al.
Rethinking the Upsampling Process in Light Field Super-Resolution with Spatial-Epipolar Implicit Image Function
Ruixuan Cong, Yu Wang, Mingyuan Zhao et al.
Harnessing Input-Adaptive Inference for Efficient VLN
Dongwoo Kang, Akhil Perincherry, Zachary Coalson et al.
LA-MOTR: End-to-End Multi-Object Tracking by Learnable Association
Peng Wang, Yongcai Wang, Hualong Cao et al.
Mind the Gap: Preserving and Compensating for the Modality Gap in CLIP-Based Continual Learning
Linlan Huang, Xusheng Cao, Haori Lu et al.
Uncalibrated Structure from Motion on a Sphere
Jonathan Ventura, Viktor Larsson, Fredrik Kahl
Attention to the Burstiness in Visual Prompt Tuning!
Yuzhu Wang, Manni Duan, Shu Kong
A Linear N-Point Solver for Structure and Motion from Asynchronous Tracks
Hang Su, Yunlong Feng, Daniel Gehrig et al.
Multimodal Large Language Model-Guided ISP Hyperparameter Optimization with Dynamic Preference Learning
Xinyu Sun, Zhikun Zhao, Congyan Lang et al.
Deciphering Cross-Modal Alignment in Large Vision-Language Models via Modality Integration Rate
Qidong Huang, Xiaoyi Dong, Pan Zhang et al.
Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection
Subhajit Maity, Ayan Bhunia, Subhadeep Koley et al.
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers
Qianhao Yuan, Qingyu Zhang, Yanjiang Liu et al.
Improving Noise Efficiency in Privacy-preserving Dataset Distillation
Runkai Zheng, Vishnu Dasu, Yinong Wang et al.
HAMSt3R: Human-Aware Multi-view Stereo 3D Reconstruction
Sara Rojas Martinez, Matthieu Armando, Bernard Ghanem et al.
NegRefine: Refining Negative Label-Based Zero-Shot OOD Detection
Amirhossein Ansari, Ke Wang, Pulei Xiong
GCAV: A Global Concept Activation Vector Framework for Cross-Layer Consistency in Interpretability
Zhenghao He, Sanchit Sinha, Guangzhi Xiong et al.
Semi-ViM: Bidirectional State Space Model for Mitigating Label Imbalance in Semi-Supervised Learning
Hongyang He, Hongyang Xie, Haochen You et al.
Beyond the Limits: Overcoming Negative Correlation of Activation-Based Training-Free NAS
Haidong Kang, Lianbo Ma, Pengjun Chen et al.
Semi-supervised Deep Transfer for Regression without Domain Alignment
Mainak Biswas, Ambedkar Dukkipati, Devarajan Sridharan
From Easy to Hard: The MIR Benchmark for Progressive Interleaved Multi-Image Reasoning
Hang Du, Jiayang Zhang, Guoshun Nan et al.
Diffusion Guided Adaptive Augmentation for Generalization in Visual Reinforcement Learning
Jeong Woon Lee, Hyoseok Hwang
RainbowPrompt: Diversity-Enhanced Prompt-Evolving for Continual Learning
Kiseong Hong, Gyeong-Hyeon Kim, Eunwoo Kim
A Good Teacher Adapts Their Knowledge for Distillation
Chengyao Qian, Trung Le, Mehrtash Harandi
SIGMAN: Scaling 3D Human Gaussian Generation with Millions of Assets
Yuhang Yang, Fengqi Liu, Yixing Lu et al.
Extending Foundational Monocular Depth Estimators to Fisheye Cameras with Calibration Tokens
Suchisrit Gangopadhyay, Jung Hee Kim, Xien Chen et al.
TWIST & SCOUT: Grounding Multimodal LLM-Experts by Forget-Free Tuning
Aritra Bhowmik, Mohammad Mahdi Derakhshani, Dennis Koelma et al.
CE-FAM: Concept-Based Explanation via Fusion of Activation Maps
Michihiro Kuroki, Toshihiko Yamasaki
PEFTDiff: Diffusion-Guided Transferability Estimation for Parameter-Efficient Fine-Tuning
Prafful Khoba, Zijian Wang, Chetan Arora et al.
FedMeNF: Privacy-Preserving Federated Meta-Learning for Neural Fields
Junhyeog Yun, Minui Hong, Gunhee Kim
Boosting Class Representation via Semantically Related Instances for Robust Long-Tailed Learning with Noisy Labels
Yuhang Li, Zhuying Li, Yuheng Jia
CAT: A Unified Click-and-Track Framework for Realistic Tracking
Yongsheng Yuan, Jie Zhao, Dong Wang et al.
Diffusion-Based Extreme High-speed Scenes Reconstruction with the Complementary Vision Sensor
Yapeng Meng, Yihan Lin, Taoyi Wang et al.
DiffuMatch: Category-Agnostic Spectral Diffusion Priors for Robust Non-rigid Shape Matching
Emery Pierson, Lei Li, Angela Dai et al.
SAC-GNC: SAmple Consensus for adaptive Graduated Non-Convexity
Valter Piedade, Chitturi Sidhartha, José Gaspar et al.
Towards Real Unsupervised Anomaly Detection Via Confident Meta-Learning
Muhammad Aqeel, Shakiba Sharifi, Marco Cristani et al.
NavQ: Learning a Q-Model for Foresighted Vision-and-Language Navigation
Peiran Xu, Xicheng Gong, Yadong Mu
GeoDiffusion: A Training-Free Framework for Accurate 3D Geometric Conditioning in Image Generation
Phillip Mueller, Talip Ünlü, Sebastian Schmidt et al.
Scaling 3D Compositional Models for Robust Classification and Pose Estimation
Xiaoding Yuan, Prakhar Kaushik, Guofeng Zhang et al.
Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations
Jeong Hun Yeo, Minsu Kim, Chae Won Kim et al.
StableDepth: Scene-Consistent and Scale-Invariant Monocular Depth
Zheng Zhang, Lihe Yang, Tianyu Yang et al.
4DSegStreamer: Streaming 4D Panoptic Segmentation via Dual Threads
Ling Liu, Jun Tian, Li Yi
Dynamic Point Maps: A Versatile Representation for Dynamic 3D Reconstruction
Edgar Sucar, Zihang Lai, Eldar Insafutdinov et al.
VOVTrack: Exploring the Potentiality in Raw Videos for Open-Vocabulary Multi-Object Tracking
Zekun Qian, Ruize Han, Junhui Hou et al.
C4D: 4D Made from 3D through Dual Correspondences
Shizun Wang, Zhenxiang Jiang, Xingyi Yang et al.
Princeton365: A Diverse Dataset with Accurate Camera Pose
Karhan Kayan, Stamatis Alexandropoulos, Rishabh Jain et al.
From Abyssal Darkness to Blinding Glare: A Benchmark on Extreme Exposure Correction in Real World
Bo Wang, Huiyuan Fu, Zhiye Huang et al.
Voyaging into Perpetual Dynamic Scenes from a Single View
Fengrui Tian, Tianjiao Ding, Jinqi Luo et al.
AJAHR: Amputated Joint Aware 3D Human Mesh Recovery
Hyunjin Cho, Giyun Choi, Jongwon Choi
One Look is Enough: Seamless Patchwise Refinement for Zero-Shot Monocular Depth Estimation on High-Resolution Images
Byeongjun Kwon, Munchurl Kim
DialNav: Multi-turn Dialog Navigation with a Remote Guide
Leekyeung Han, Hyunji Min, Gyeom Hwangbo et al.
PS-Mamba: Spatial-Temporal Graph Mamba for Pose Sequence Refinement
Haoye Dong, Gim Hee Lee
RnGCam: High-speed video from rolling & global shutter measurements
Kevin Tandi, Xiang Dai, Chinmay Talegaonkar et al.
Easi3R: Estimating Disentangled Motion from DUSt3R Without Training
Xingyu Chen, Yue Chen, Yuliang Xiu et al.