Most Cited CVPR "automated fact-checking" Papers
Harnessing Frozen Unimodal Encoders for Flexible Multimodal Alignment
Mayug Maniparambil, Raiymbek Akshulakov, Yasser Abdelaziz Dahou Djilali et al.
ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval
Eric Xing, Pranavi Kolouju, Robert Pless et al.
UA-Pose: Uncertainty-Aware 6D Object Pose Estimation and Online Object Completion with Partial References
Ming-Feng Li, Xin Yang, Fu-En Wang et al.
Preconditioners for the Stochastic Training of Neural Fields
Shin-Fang Chng, Hemanth Saratchandran, Simon Lucey
EntitySAM: Segment Everything in Video
Mingqiao Ye, Seoung Wug Oh, Lei Ke et al.
EAP-GS: Efficient Augmentation of Pointcloud for 3D Gaussian Splatting in Few-shot Scene Reconstruction
Dongrui Dai, Yuxiang Xing
Potential Field Based Deep Metric Learning
Shubhang Bhatnagar, Narendra Ahuja
Solving Instance Detection from an Open-World Perspective
Qianqian Shen, Yunhan Zhao, Nahyun Kwon et al.
Mitigating Ambiguities in 3D Classification with Gaussian Splatting
Ruiqi Zhang, Hao Zhu, Jingyi Zhao et al.
Handling Spatial-Temporal Data Heterogeneity for Federated Continual Learning via Tail Anchor
Hao Yu, Xin Yang, Le Zhang et al.
Enhancing Dance-to-Music Generation via Negative Conditioning Latent Diffusion Model
Changchang Sun, Gaowen Liu, Charles Fleming et al.
Reasoning Mamba: Hypergraph-Guided Region Relation Calculating for Weakly Supervised Affordance Grounding
Yuxuan Wang, Aming Wu, Muli Yang et al.
RC-AutoCalib: An End-to-End Radar-Camera Automatic Calibration Network
Van-Tin Luu, Yong-Lin Cai, Vu-Hoang Tran et al.
Split Adaptation for Pre-trained Vision Transformers
Lixu Wang, Bingqi Shang, Yi Li et al.
Self-Supervised Spatial Correspondence Across Modalities
Ayush Shrivastava, Andrew Owens
Few-shot Personalized Scanpath Prediction
Ruoyu Xue, Jingyi Xu, Sounak Mondal et al.
STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection
Divya Velayudhan, Abdelfatah Ahmed, Mohamad Alansari et al.
Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI
Won Jun Kim, Hyungjin Chung, Jaemin Kim et al.
COSMIC: Clique-Oriented Semantic Multi-space Integration for Robust CLIP Test-Time Adaptation
Fanding Huang, Jingyan Jiang, Qinting Jiang et al.
Adapting Text-to-Image Generation with Feature Difference Instruction for Generic Image Restoration
Chao Wang, Hehe Fan, Huichen Yang et al.
Test-Time Fine-Tuning of Image Compression Models for Multi-Task Adaptability
Unki Park, Seongmoon Jeong, Jang Youngchan et al.
Boltzmann Attention Sampling for Image Analysis with Small Objects
Theodore Zhao, Sid Kiblawi, Mu Wei et al.
Compass Control: Multi Object Orientation Control for Text-to-Image Generation
Rishubh Parihar, Vaibhav Agrawal, Sachidanand VS et al.
Ferret: An Efficient Online Continual Learning Framework under Varying Memory Constraints
Yuhao Zhou, Yuxin Tian, Jindi Lv et al.
DynaMoDe-NeRF: Motion-aware Deblurring Neural Radiance Field for Dynamic Scenes
Ashish Kumar, A. N. Rajagopalan
VISTREAM: Improving Computation Efficiency of Visual Streaming Perception via Law-of-Charge-Conservation Inspired Spiking Neural Network
Kang You, Ziling Wei, Jing Yan et al.
EZSR: Event-based Zero-Shot Recognition
Yan Yang, Liyuan Pan, Dongxu Li et al.
GenVDM: Generating Vector Displacement Maps From a Single Image
Yuezhi Yang, Qimin Chen, Vladimir G. Kim et al.
CustAny: Customizing Anything from A Single Example
Lingjie Kong, Kai Wu, Chengming Xu et al.
Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data
Zhiyuan Ma, Xinyue Liang, Rongyuan Wu et al.
Language-Assisted Debiasing and Smoothing for Foundation Model-Based Semi-Supervised Learning
Na Zheng, Xuemeng Song, Xue Dong et al.
Feature Spectrum Learning for Remote Sensing Change Detection
Qi Zang, Dong Zhao, Shuang Wang et al.
CaMuViD: Calibration-Free Multi-View Detection
Amir Etefaghi Daryani, M. Usman Maqbool Bhutta, Byron Hernandez et al.
D^3CTTA: Domain-Dependent Decorrelation for Continual Test-Time Adaption of 3D LiDAR Segmentation
Jichun Zhao, Haiyong Jiang, Haoxuan Song et al.
Identity-preserving Distillation Sampling by Fixed-Point Iterator
SeonHwa Kim, Jiwon Kim, Soobin Park et al.
DynPose: Largely Improving the Efficiency of Human Pose Estimation by a Simple Dynamic Framework
Yalong Xu, Lin Zhao, Chen Gong et al.
Zero-Shot Head Swapping in Real-World Scenarios
Sohyun Jeong, Taewoong Kang, Hyojin Jang et al.
Data Distributional Properties As Inductive Bias for Systematic Generalization
Felipe del Rio, Alain Raymond, Daniel Florea et al.
SyncSDE: A Probabilistic Framework for Diffusion Synchronization
Hyunjun Lee, Hyunsoo Lee, Sookwan Han
Odd-One-Out: Anomaly Detection by Comparing with Neighbors
Ankan Kumar Bhunia, Changjian Li, Hakan Bilen
SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation
Hao Du, Bo Wu, Yan Lu et al.
CamPoint: Boosting Point Cloud Segmentation with Virtual Camera
Jianhui Zhang, Luo Yizhi, Zicheng Zhang et al.
MAC-Ego3D: Multi-Agent Gaussian Consensus for Real-Time Collaborative Ego-Motion and Photorealistic 3D Reconstruction
Xiaohao Xu, Feng Xue, Shibo Zhao et al.
LoKi: Low-dimensional KAN for Efficient Fine-tuning Image Models
Xuan Cai, Renjie Pan, Hua Yang
VSNet: Focusing on the Linguistic Characteristics of Sign Language
Yuhao Li, Xinyue Chen, Hongkai Li et al.
Provoking Multi-modal Few-Shot LVLM via Exploration-Exploitation In-Context Learning
Cheng Chen, Yunpeng Zhai, Yifan Zhao et al.
Latent Space Imaging
Matheus Souza, Yidan Zheng, Kaizhang Kang et al.
GroundingFace: Fine-grained Face Understanding via Pixel Grounding Multimodal Large Language Model
Yue Han, Jiangning Zhang, Junwei Zhu et al.
Acc3D: Accelerating Single Image to 3D Diffusion Models via Edge Consistency Guided Score Distillation
Kendong Liu, Zhiyu Zhu, Hui Liu et al.
DiSRT-In-Bed: Diffusion-Based Sim-to-Real Transfer Framework for In-Bed Human Mesh Recovery
Jing Gao, Ce Zheng, Laszlo Jeni et al.
FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video
Andrea Boscolo Camiletto, Jian Wang, Eduardo Alvarado et al.
Remote Photoplethysmography in Real-World and Extreme Lighting Scenarios
Hang Shao, Lei Luo, Jianjun Qian et al.
SemiDAViL: Semi-supervised Domain Adaptation with Vision-Language Guidance for Semantic Segmentation
Hritam Basak, Zhaozheng Yin
Concept Lancet: Image Editing with Compositional Representation Transplant
Jinqi Luo, Tianjiao Ding, Kwan Ho Ryan Chan et al.
ODHSR: Online Dense 3D Reconstruction of Humans and Scenes from Monocular Videos
Zetong Zhang, Manuel Kaufmann, Lixin Xue et al.
ArtiFade: Learning to Generate High-quality Subject from Blemished Images
Shuya Yang, Shaozhe Hao, Yukang Cao et al.
Perturb-and-Revise: Flexible 3D Editing with Generative Trajectories
Susung Hong, Johanna Suvi Karras, Ricardo Martin et al.
GeoDepth: From Point-to-Depth to Plane-to-Depth Modeling for Self-Supervised Monocular Depth Estimation
Haifeng Wu, Shuhang Gu, Lixin Duan et al.
Pose-Guided Temporal Enhancement for Robust Low-Resolution Hand Reconstruction
Kaixin Fan, Pengfei Ren, Jingyu Wang et al.
PhysicsGen: Can Generative Models Learn from Images to Predict Complex Physical Relations?
Martin Spitznagel, Jan Vaillant, Janis Keuper
F^3OCUS - Federated Finetuning of Vision-Language Foundation Models with Optimal Client Layer Updating Strategy via Multi-objective Meta-Heuristics
Pramit Saha, Felix Wagner, Divyanshu Mishra et al.
HeatFormer: A Neural Optimizer for Multiview Human Mesh Recovery
Yuto Matsubara, Ko Nishino
GlyphMastero: A Glyph Encoder for High-Fidelity Scene Text Editing
Tong Wang, Ting Liu, Xiaochao Qu et al.
MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection
Rishubh Parihar, Srinjay Sarkar, Sarthak Vora et al.
Customized Condition Controllable Generation for Video Soundtrack
Fan Qi, KunSheng Ma, Changsheng Xu
Percept, Memory, and Imagine: World Feature Simulating for Open-Domain Unknown Object Detection
Aming Wu, Cheng Deng
Towards Scalable Human-aligned Benchmark for Text-guided Image Editing
Suho Ryu, Kihyun Kim, Eugene Baek et al.
Implicit Correspondence Learning for Image-to-Point Cloud Registration
Xinjun Li, Wenfei Yang, Jiacheng Deng et al.
Minimal Interaction Separated Tuning: A New Paradigm for Visual Adaptation
Ningyuan Tang, Minghao Fu, Jianxin Wu
Quad-Pixel Image Defocus Deblurring: A New Benchmark and Model
Hang Chen, Yin Xie, Xiaoxiu Peng et al.
HELVIPAD: A Real-World Dataset for Omnidirectional Stereo Depth Estimation
Mehdi Zayene, Albias Havolli, Jannik Endres et al.
NTClick: Achieving Precise Interactive Segmentation With Noise-tolerant Clicks
Chenyi Zhang, Ting Liu, Xiaochao Qu et al.
Named Entity Driven Zero-Shot Image Manipulation
Zhida Feng, Li Chen, Jing Tian et al.
Mamba-Adaptor: State Space Model Adaptor for Visual Recognition
Fei Xie, Jiahao Nie, Yujin Tang et al.
HSI: A Holistic Style Injector for Arbitrary Style Transfer
Shuhao Zhang, Hui Kang, Yang Liu et al.
UWAV: Uncertainty-weighted Weakly-supervised Audio-Visual Video Parsing
Yung-Hsuan Lai, Janek Ebbers, Yu-Chiang Frank Wang et al.
EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Priors
Zhipeng Hu, Minda Zhao, Chaoyi Zhao et al.
SGCR: Spherical Gaussians for Efficient 3D Curve Reconstruction
Xinran Yang, Donghao Ji, Yuanqi Li et al.
Saliuitl: Ensemble Salience Guided Recovery of Adversarial Patches against CNNs
Mauricio Byrd Victorica, György Dán, Henrik Sandberg
UMFN: Unified Multi-Domain Face Normalization for Joint Cross-domain Prototype Learning and Heterogeneous Face Recognition
Meng Pang, Wenjun Zhang, Nanrun Zhou et al.
A New Statistical Model of Star Speckles for Learning to Detect and Characterize Exoplanets in Direct Imaging Observations
Theo Bodrito, Olivier Flasseur, Julien Mairal et al.
PURA: Parameter Update-Recovery Test-Time Adaption for RGB-T Tracking
Zekai Shao, Yufan Hu, Bin Fan et al.
Leveraging Global Stereo Consistency for Category-Level Shape and 6D Pose Estimation from Stereo Images
Junning Qiu, Minglei Lu, Fei Wang et al.
Data-free Universal Adversarial Perturbation with Pseudo-semantic Prior
Chanhui Lee, Yeonghwan Song, Jeany Son
Adapting to Observation Length of Trajectory Prediction via Contrastive Learning
Ruiqi Qiu, Jun Gong, Xinyu Zhang et al.
Targeted Forgetting of Image Subgroups in CLIP Models
Zeliang Zhang, Gaowen Liu, Charles Fleming et al.
Self-Supervised Large Scale Point Cloud Completion for Archaeological Site Restoration
Aocheng Li, James R. Zimmer-Dauphinee, Rajesh Kalyanam et al.
Hierarchical Gaussian Mixture Model Splatting for Efficient and Part Controllable 3D Generation
Qitong Yang, Mingtao Feng, Zijie Wu et al.
Dynamic Group Normalization: Spatio-Temporal Adaptation to Evolving Data Statistics
Yair Smadar, Assaf Hoogi
ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams
Chris Dongjoo Kim, Jihwan Moon, Sangwoo Moon et al.
End-to-End HOI Reconstruction Transformer with Graph-based Encoding
Zhenrong Wang, Qi Zheng, Sihan Ma et al.
SCSA: A Plug-and-Play Semantic Continuous-Sparse Attention for Arbitrary Semantic Style Transfer
Chunnan Shang, Zhizhong Wang, Hongwei Wang et al.
HiLoTs: High-Low Temporal Sensitive Representation Learning for Semi-Supervised LiDAR Segmentation in Autonomous Driving
R.D. Lin, Pengcheng Weng, Yinqiao Wang et al.
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
Yoojin Jung, Byung Cheol Song
NN-Former: Rethinking Graph Structure in Neural Architecture Representation
Ruihan Xu, Haokui Zhang, Yaowei Wang et al.
Advancing Manga Analysis: Comprehensive Segmentation Annotations for the Manga109 Dataset
Minshan Xie, Jian Lin, Hanyuan Liu et al.
ESC: Erasing Space Concept for Knowledge Deletion
Tae-Young Lee, Sundong Park, Minwoo Jeon et al.
High-quality Point Cloud Oriented Normal Estimation via Hybrid Angular and Euclidean Distance Encoding
Yuanqi Li, Jingcheng Huang, Hongshen Wang et al.
POMP: Physics-constrainable Motion Generative Model through Phase Manifolds
Bin Ji, Ye Pan, Zhimeng Liu et al.
MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification
Jianwei Zhao, Xin Li, Fan Yang et al.
Automatic Spectral Calibration of Hyperspectral Images: Method, Dataset and Benchmark
Zhuoran Du, Shaodi You, Cheng Cheng et al.
Active Event-based Stereo Vision
Jianing Li, Yunjian Zhang, Haiqian Han et al.
Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression
Dohyun Kim, Sehwan Park, GeonHee Han et al.
R2C: Mapping Room to Chessboard to Unlock LLM As Low-Level Action Planner
Ziyi Bai, Hanxuan Li, Bin Fu et al.
A Semantic Knowledge Complementarity based Decoupling Framework for Semi-supervised Class-imbalanced Medical Image Segmentation
Zheng Zhang, Guanchun Yin, Bo Zhang et al.
Uncertainty-Instructed Structure Injection for Generalizable HD Map Construction
Xiaolu Liu, Ruizi Yang, Song Wang et al.
PIDLoc: Cross-View Pose Optimization Network Inspired by PID Controllers
Wooju Lee, Juhye Park, Dasol Hong et al.
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models
Jinhui Yi, Syed Talal Wasim, Yanan Luo et al.
Semantic Line Combination Detector
Jinwon Ko, Dongkwon Jin, Chang-Su Kim
Difference Inversion: Interpolate and Isolate the Difference with Token Consistency for Image Analogy Generation
Hyunsoo Kim, Donghyun Kim, Suhyun Kim
Dual Energy-Based Model with Open-World Uncertainty Estimation for Out-of-distribution Detection
Qi Chen, Hu Ding
An Image-like Diffusion Method for Human-Object Interaction Detection
Xiaofei Hui, Haoxuan Qu, Hossein Rahmani et al.
Homogeneous Dynamics Space for Heterogeneous Humans
Xinpeng Liu, Junxuan Liang, Chenshuo Zhang et al.
RCP-Bench: Benchmarking Robustness for Collaborative Perception Under Diverse Corruptions
Shihang Du, Sanqing Qu, Tianhang Wang et al.
Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning
Huajie Jiang, Zhengxian Li, Xiaohan Yu et al.
Hyperdimensional Uncertainty Quantification for Multimodal Uncertainty Fusion in Autonomous Vehicles Perception
Luke Chen, Junyao Wang, Trier Mortlock et al.
GaPT-DAR: Category-level Garments Pose Tracking via Integrated 2D Deformation and 3D Reconstruction
Li Zhang, Mingliang Xu, Jianan Wang et al.
Boost the Inference with Co-training: A Depth-guided Mutual Learning Framework for Semi-supervised Medical Polyp Segmentation
Yuxin Li, Zihao Zhu, Yuxiang Zhang et al.
Polarized Color Screen Matting
Kenji Enomoto, Scott Cohen, Brian Price et al.
Neural 3D Strokes: Creating Stylized 3D Scenes with Vectorized 3D Strokes
Haobin Duan, Miao Wang, Yanxun Li et al.
Let Samples Speak: Mitigating Spurious Correlation by Exploiting the Clusterness of Samples
Weiwei Li, Junzhuo Liu, Yuanyuan Ren et al.
Nested Diffusion Models Using Hierarchical Latent Priors
Xiao Zhang, Ruoxi Jiang, Rebecca Willett et al.
AirRoom: Objects Matter in Room Reidentification
Runmao Yao, Yi Du, Zhuoqun Chen et al.
Towards Cost-Effective Learning: A Synergy of Semi-Supervised and Active Learning
Tianxiang Yin, Ningzhong Liu, Han Sun
PersonaHOI: Effortlessly Improving Face Personalization in Human-Object Interaction Generation
Xinting Hu, Haoran Wang, Jan Lenssen et al.
Temporal Action Detection Model Compression by Progressive Block Drop
Xiaoyong Chen, Yong Guo, Jiaming Liang et al.
Unlocking Generalization Power in LiDAR Point Cloud Registration
Zhenxuan Zeng, Qiao Wu, Xiyu Zhang et al.
Keep the Balance: A Parameter-Efficient Symmetrical Framework for RGB+X Semantic Segmentation
Jiaxin Cai, Jingze Su, Qi Li et al.
Link-based Contrastive Learning for One-Shot Unsupervised Domain Adaptation
Yue Zhang, Mingyue Bin, Yuyang Zhang et al.
Point-Cache: Test-time Dynamic and Hierarchical Cache for Robust and Generalizable Point Cloud Analysis
Hongyu Sun, Qiuhong Ke, Ming Cheng et al.
WISNet: Pseudo Label Generation on Unbalanced and Patch Annotated Waste Images
Shifan Zhang, Hongzi Zhu, Yinan He et al.
Twinner: Shining Light on Digital Twins in a Few Snaps
Jesus Zarzar, Tom Monnier, Roman Shapovalov et al.
ONDA-Pose: Occlusion-Aware Neural Domain Adaptation for Self-Supervised 6D Object Pose Estimation
Tao Tan, Qiulei Dong
Object Dynamics Modeling with Hierarchical Point Cloud-based Representations
Chanho Kim, Li Fuxin
Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling
Nannan Li, Kevin Shih, Bryan A. Plummer
Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering
Zhen Yang, Zhuo Tao, Qi Chen et al.
Deep Video Inverse Tone Mapping Based on Temporal Clues
Yuyao Ye, Ning Zhang, Yang Zhao et al.
SPARC: Score Prompting and Adaptive Fusion for Zero-Shot Multi-Label Recognition in Vision-Language Models
Kevin Miller, Aditya Gangrade, Samarth Mishra et al.
TADFormer: Task-Adaptive Dynamic TransFormer for Efficient Multi-Task Learning
Seungmin Baek, Soyul Lee, Hayeon Jo et al.
Sketchtopia: A Dataset and Foundational Agents for Benchmarking Asynchronous Multimodal Communication with Iconic Feedback
Mohd Hozaifa Khan, Ravi Kiran Sarvadevabhatla
DIO: Decomposable Implicit 4D Occupancy-Flow World Model
Christopher Diehl, Quinlan Sykora, Ben Agro et al.
CURSOR: Scalable Mixed-Order Hypergraph Matching with CUR Decomposition
Qixuan Zheng, Ming Zhang, Hong Yan
Type-R: Automatically Retouching Typos for Text-to-Image Generation
Wataru Shimoda, Naoto Inoue, Daichi Haraguchi et al.
FSboard: Over 3 Million Characters of ASL Fingerspelling Collected via Smartphones
Manfred Georg, Garrett Tanzer, Esha Uboweja et al.
MaDCoW: Marginal Distortion Correction for Wide-Angle Photography with Arbitrary Objects
Kevin Zhang, Jia-Bin Huang, Jose Echevarria et al.
Dual Semantic Guidance for Open Vocabulary Semantic Segmentation
ZhengYang Wang, Tingliang Feng, Fan Lyu et al.
Understanding Multi-layered Transmission Matrices
Marina Alterman, Anat Levin
Creating Your Editable 3D Photorealistic Avatar with Tetrahedron-constrained Gaussian Splatting
Hanxi Liu, Yifang Men, Zhouhui Lian
EASEMVC: Efficient Dual Selection Mechanism for Deep Multi-View Clustering
Baili Xiao, Zhibin Dong, Ke Liang et al.
Blind Bitstream-corrupted Video Recovery via Metadata-guided Diffusion Model
Shuyun Wang, Hu Zhang, Xin Shen et al.
Meta-Learning Hyperparameters for Parameter Efficient Fine-Tuning
Zichen Tian, Yaoyao Liu, Qianru Sun
Dense Dispersed Structured Light for Hyperspectral 3D Imaging of Dynamic Scenes
Suhyun Shin, Seungwoo Yoon, Ryota Maeda et al.
Symbolic Representation for Any-to-Any Generative Tasks
Jiaqi Chen, Xiaoye Zhu, Yue Wang et al.
Incorporating Dense Knowledge Alignment into Unified Multimodal Representation Models
Yuhao Cui, Xinxing Zu, Wenhua Zhang et al.
3D Prior Is All You Need: Cross-Task Few-shot 2D Gaze Estimation
Yihua Cheng, Hengfei Wang, Zhongqun Zhang et al.
SAMBLE: Shape-Specific Point Cloud Sampling for an Optimal Trade-Off Between Local Detail and Global Uniformity
Chengzhi Wu, Yuxin Wan, Hao Fu et al.
Instance-wise Supervision-level Optimization in Active Learning
Shinnosuke Matsuo, Riku Togashi, Ryoma Bise et al.
Black Hole-Driven Identity Absorbing in Diffusion Models
Muhammad Shaheryar, Jong Taek Lee, Soon Ki Jung
Gaussian Splatting Feature Fields for (Privacy-Preserving) Visual Localization
Maxime Pietrantoni, Gabriela Csurka, Torsten Sattler
Non-Rigid Structure-from-Motion: Temporally-Smooth Procrustean Alignment and Spatially-Variant Deformation Modeling
Jiawei Shi, Hui Deng, Yuchao Dai
DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer
Ho-Joong Kim, Yearang Lee, Jung-Ho Hong et al.
Vision-Guided Action: Enhancing 3D Human Motion Prediction with Gaze-informed Affordance in 3D Scenes
Ting Yu, Yi Lin, Jun Yu et al.
Attraction Diminishing and Distributing for Few-Shot Class-Incremental Learning
Li-Jun Zhao, Zhen-Duo Chen, Yongxin Wang et al.
PIAD: Pose and Illumination agnostic Anomaly Detection
Kaichen Yang, Junjie Cao, Zeyu Bai et al.
Link to the Past: Temporal Propagation for Fast 3D Human Reconstruction from Monocular Video
Marchellus Matthew, Nadhira Noor, In Kyu Park
Argus: A Compact and Versatile Foundation Model for Vision
Weiming Zhuang, Chen Chen, Zhizhong Li et al.
CSC-PA: Cross-image Semantic Correlation via Prototype Attentions for Single-network Semi-supervised Breast Tumor Segmentation
Zhenhui Ding, Guilian Chen, Qin Zhang et al.
OFER: Occluded Face Expression Reconstruction
Pratheba Selvaraju, Victoria Abrevaya, Timo Bolkart et al.
DAMM-Diffusion: Learning Divergence-Aware Multi-Modal Diffusion Model for Nanoparticles Distribution Prediction
Junjie Zhou, Shouju Wang, Yuxia Tang et al.
Depth-Guided Bundle Sampling for Efficient Generalizable Neural Radiance Field Reconstruction
Li Fang, Hao Zhu, Longlong Chen et al.
Improving Personalized Search with Regularized Low-Rank Parameter Updates
Fiona Ryan, Josef Sivic, Fabian Caba Heilbron et al.
MetaWriter: Personalized Handwritten Text Recognition Using Meta-Learned Prompt Tuning
Wenhao Gu, Li Gu, Ching Suen et al.
VIRES: Video Instance Repainting via Sketch and Text Guided Generation
Shuchen Weng, Haojie Zheng, Peixuan Zhang et al.
Composing Parts for Expressive Object Generation
Harsh Rangwani, Aishwarya Agarwal, Kuldeep Kulkarni et al.
Relation-Rich Visual Document Generator for Visual Information Extraction
Zi-Han Jiang, Chien-Wei Lin, WeiHua Li et al.
Soft Self-labeling and Potts Relaxations for Weakly-supervised Segmentation
Zhongwen Zhang, Yuri Boykov
IM-Portrait: Learning 3D-aware Video Diffusion for Photorealistic Talking Heads from Monocular Videos
Yuan Li, Ziqian Bai, Feitong Tan et al.
Align-A-Video: Deterministic Reward Tuning of Image Diffusion Models for Consistent Video Editing
Shengzhi Wang, Yingkang Zhong, Jiangchuan Mu et al.
LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping
Pascal Chang, Sergio Sancho, Jingwei Tang et al.
Libra-Merging: Importance-redundancy and Pruning-merging Trade-off for Acceleration Plug-in in Large Vision-Language Model
Longrong Yang, Dong Shen, Chaoxiang Cai et al.
STEPS: Sequential Probability Tensor Estimation for Text-to-Image Hard Prompt Search
Yuning Qiu, Andong Wang, Chao Li et al.
Balancing Two Classifiers via A Simplex ETF Structure for Model Calibration
Jiani Ni, He Zhao, Jintong Gao et al.
FFaceNeRF: Few-shot Face Editing in Neural Radiance Fields
Kwan Yun, Chaelin Kim, Hangyeul Shin et al.
TIDE: Training Locally Interpretable Domain Generalization Models Enables Test-time Correction
Aishwarya Agarwal, Srikrishna Karanam, Vineet Gandhi
CroCoDL: Cross-device Collaborative Dataset for Localization
Hermann Blum, Alessandro Mercurio, Joshua O'Reilly et al.
Video Language Model Pretraining with Spatio-temporal Masking
Yue Wu, Zhaobo Qi, Junshu Sun et al.
Self-Supervised Learning for Color Spike Camera Reconstruction
Yanchen Dong, Ruiqin Xiong, Xiaopeng Fan et al.
TAROT: Towards Essentially Domain-Invariant Robustness with Theoretical Justification
Dongyoon Yang, Jihu Lee, Yongdai Kim
ShapeShifter: 3D Variations Using Multiscale and Sparse Point-Voxel Diffusion
Nissim Maruani, Wang Yifan, Matthew Fisher et al.
Multi-modal Contrastive Learning with Negative Sampling Calibration for Phenotypic Drug Discovery
Jiahua Rao, Hanjing Lin, Leyu Chen et al.
Touch2Shape: Touch-Conditioned 3D Diffusion for Shape Exploration and Reconstruction
Yuanbo Wang, Zhaoxuan Zhang, Jiajin Qiu et al.
ADU: Adaptive Detection of Unknown Categories in Black-Box Domain Adaptation
Yushan Lai, Guowen Li, Haoyuan Liang et al.
Beyond Image Classification: A Video Benchmark and Dual-Branch Hybrid Discrimination Framework for Compositional Zero-Shot Learning
Dongyao Jiang, Haodong Jing, Yongqiang Ma et al.
Seeing A 3D World in A Grain of Sand
Yufan Zhang, Yu Ji, Yu Guo et al.
VEU-Bench: Towards Comprehensive Understanding of Video Editing
Bozheng Li, Yongliang Wu, Yi Lu et al.
LiSu: A Dataset and Method for LiDAR Surface Normal Estimation
Dušan Malić, Christian Fruhwirth-Reisinger, Samuel Schulter et al.
Transferable and Principled Efficiency for Open-Vocabulary Segmentation
Jingxuan Xu, Wuyang Chen, Yao Zhao et al.
Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation
Yiftach Edelstein, Or Patashnik, Dana Cohen-Bar et al.
SocialMOIF: Multi-Order Intention Fusion for Pedestrian Trajectory Prediction
Kai Chen, Xiaodong Zhao, Yujie Huang et al.
Foveated Instance Segmentation
Hongyi Zeng, Wenxuan Liu, Tianhua Xia et al.