Most Cited CVPR "biomedical outbreak alerts" Papers
5,589 papers found • Page 23 of 28
Conference
LP-Diff: Towards Improved Restoration of Real-World Degraded License Plate
Haoyan Gong, Zhenrong Zhang, Yuzheng Feng et al.
L-SWAG: Layer-Sample Wise Activation with Gradients Information for Zero-Shot NAS on Vision Transformers
Sofia Casarin, Sergio Escalera, Oswald Lanz
Tripartite Weight-Space Ensemble for Few-Shot Class-Incremental Learning
Juntae Lee, Munawar Hayat, Sungrack Yun
Where the Devil Hides: Deepfake Detectors Can No Longer Be Trusted
Shuaiwei Yuan, Junyu Dong, Yuezun Li
Nested Diffusion Models Using Hierarchical Latent Priors
Xiao Zhang, Ruoxi Jiang, Rebecca Willett et al.
Point Cloud Upsampling Using Conditional Diffusion Module with Adaptive Noise Suppression
Boqian Zhang, shen yang, Hao Chen et al.
ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way
Jiazi Bu, Pengyang Ling, Pan Zhang et al.
Do ImageNet-trained Models Learn Shortcuts? The Impact of Frequency Shortcuts on Generalization
Shunxin Wang, Raymond Veldhuis, Nicola Strisciuglio
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation
Qihui Zhang, Munan Ning, Zheyuan Liu et al.
Neural Inverse Rendering from Propagating Light
Anagh Malik, Benjamin Attal, Andrew Xie et al.
No Thing, Nothing: Highlighting Safety-Critical Classes for Robust LiDAR Semantic Segmentation in Adverse Weather
Junsung Park, HwiJeong Lee, Inha Kang et al.
FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting
Hengyu Liu, Yuehao Wang, Chenxin Li et al.
Shining Yourself: High-Fidelity Ornaments Virtual Try-on with Diffusion Model
Yingmao Miao, Zhanpeng Huang, Rui Han et al.
A Unified, Resilient, and Explainable Adversarial Patch Detector
Vishesh Kumar, Akshay Agarwal
Seek Common Ground While Reserving Differences: Semi-Supervised Image-Text Sentiment Recognition
Wuyou Xia, Guoli Jia, Sicheng Zhao et al.
VISTREAM: Improving Computation Efficiency of Visual Streaming Perception via Law-of-Charge-Conservation Inspired Spiking Neural Network
Kang You, Ziling Wei, Jing Yan et al.
Self-Evolving Visual Concept Library using Vision-Language Critics
Atharva Sehgal, Patrick Yuan, Ziniu Hu et al.
MICAS: Multi-grained In-Context Adaptive Sampling for 3D Point Cloud Processing
Feifei Shao, Ping Liu, Zhao Wang et al.
Split Adaptation for Pre-trained Vision Transformers
Lixu Wang, Bingqi Shang, Yi Li et al.
Synthetic Visual Genome
Jae Sung Park, Zixian Ma, Linjie Li et al.
Autoregressive Distillation of Diffusion Transformers
Yeongmin Kim, Sotiris Anagnostidis, Yuming Du et al.
Adaptive Softassign via Hadamard-Equipped Sinkhorn
Binrui Shen, Qiang Niu, Shengxin Zhu
DynaMoDe-NeRF: Motion-aware Deblurring Neural Radiance Field for Dynamic Scenes
Ashish Kumar, A. N. Rajagopalan
From Prototypes to General Distributions: An Efficient Curriculum for Masked Image Modeling
Jinhong Lin, Cheng-En Wu, Huanran Li et al.
Solving Instance Detection from an Open-World Perspective
Qianqian Shen, Yunhan Zhao, Nahyun Kwon et al.
Leveraging 3D Geometric Priors in 2D Rotation Symmetry Detection
Ahyun Seo, Minsu Cho
Potential Field Based Deep Metric Learning
Shubhang Bhatnagar, Narendra Ahuja
A2XP: Towards Private Domain Generalization
Geunhyeok Yu, Hyoseok Hwang
Self-Supervised Spatial Correspondence Across Modalities
Ayush Shrivastava, Andrew Owens
Towards Better Vision-Inspired Vision-Language Models
Yun-Hao Cao, Kaixiang Ji, Ziyuan Huang et al.
Evaluating Model Perception of Color Illusions in Photorealistic Scenes
Lingjun Mao, Zineng Tang, Alane Suhr
ProReflow: Progressive Reflow with Decomposed Velocity
Lei Ke, Haohang Xu, Xuefei Ning et al.
Exploiting Deblurring Networks for Radiance Fields
Haeyun Choi, Heemin Yang, Janghyeok Han et al.
Hand-held Object Reconstruction from RGB Video with Dynamic Interaction
Shijian Jiang, Qi Ye, Rengan Xie et al.
KMD: Koopman Multi-modality Decomposition for Generalized Brain Tumor Segmentation under Incomplete Modalities
Tianyi Liu, Haochuan Jiang, Kaizhu Huang
The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition
Otto Brookes, Maksim Kukushkin, Majid Mirmehdi et al.
beta-FFT: Nonlinear Interpolation and Differentiated Training Strategies for Semi-Supervised Medical Image Segmentation
Ming Hu, Jianfu Yin, Zhuangzhuang Ma et al.
Generative Quanta Color Imaging
Vishal Purohit, Junjie Luo, Yiheng Chi et al.
NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval
Zengrong Lin, Zheng Wang, Tianwen Qian et al.
PhyS-EdiT: Physics-aware Semantic Image Editing with Text Description
Ziqi Cai, Shuchen Weng, Yifei Xia et al.
WinSyn: : A High Resolution Testbed for Synthetic Data
Tom Kelly, John Femiani, Peter Wonka
JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data
Runjian Chen, Wenqi Shao, Bo Zhang et al.
Flexible Group Count Enables Hassle-Free Structured Pruning
Jiamu Zhang, Shaochen Zhong, Andrew Ye et al.
Stop Walking in Circles! Bailing Out Early in Projected Gradient Descent
Philip Doldo, Derek Everett, Amol Khanna et al.
Adapting Text-to-Image Generation with Feature Difference Instruction for Generic Image Restoration
Chao Wang, Hehe Fan, Huichen Yang et al.
DTOS: Dynamic Time Object Sensing with Large Multimodal Model
Jirui Tian, Jinrong Zhang, Shenglan Liu et al.
FreeUV: Ground-Truth-Free Realistic Facial UV Texture Recovery via Cross-Assembly Inference Strategy
Xingchao Yang, Takafumi Taketomi, Yuki Endo et al.
ToonerGAN: Reinforcing GANs for Obfuscating Automated Facial Indexing
Kartik Thakral, Shashikant Prasad, Stuti Aswani et al.
Using Powerful Prior Knowledge of Diffusion Model in Deep Unfolding Networks for Image Compressive Sensing
Chen Liao, Yan Shen, Dan Li et al.
High-Fidelity Lightweight Mesh Reconstruction from Point Clouds
Chen Zhang, Wentao Wang, Ximeng Li et al.
MotionMap: Representing Multimodality in Human Pose Forecasting
Reyhaneh Hosseininejad, Megh Shukla, Saeed Saadatnejad et al.
LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping
Pascal Chang, Sergio Sancho, Jingwei Tang et al.
NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows
Zhenggang Tang, Jason Ren, Xiaoming Zhao et al.
Reversing Flow for Image Restoration
Haina Qin, Wenyang Luo, Bing Li et al.
Do Your Best and Get Enough Rest for Continual Learning
Hankyul Kang, Gregor Seifer, Donghyun Lee et al.
Certified Human Trajectory Prediction
Mohammadhossein Bahari, Saeed Saadatnejad, Amirhossein Askari Farsangi et al.
ScaleLSD: Scalable Deep Line Segment Detection Streamlined
Zeran Ke, Bin Tan, Xianwei Zheng et al.
On-Device Self-Supervised Learning of Low-Latency Monocular Depth from Only Events
Jesse Hagenaars, Yilun Wu, Federico Paredes Valles et al.
LC-Mamba: Local and Continuous Mamba with Shifted Windows for Frame Interpolation
Min Wu Jeong, Chae Eun Rhee
LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table
Yusuke Matsui
HashPoint: Accelerated Point Searching and Sampling for Neural Rendering
Jiahao Ma, Miaomiao Liu, David Ahmedt-Aristizabal et al.
Bias for Action: Video Implicit Neural Representations with Bias Modulation
Alper Kayabasi, Anil Kumar Vadathya, Guha Balakrishnan et al.
DiffCAM: Data-Driven Saliency Maps by Capturing Feature Differences
Xingjian Li, Qiming Zhao, Neelesh Bisht et al.
EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation
Diljeet Jagpal, Xi Chen, Vinay P. Namboodiri
Embracing Collaboration Over Competition: Condensing Multiple Prompts for Visual In-Context Learning
Jinpeng Wang, Tianci Luo, Yaohua Zha et al.
FLAVC: Learned Video Compression with Feature Level Attention
Chun Zhang, Heming Sun, Jiro Katto
Hybrid Concept Bottleneck Models
Yang Liu, Tianwei Zhang, Shi Gu
Segment Anything, Even Occluded
Wei-En Tai, Yu-Lin Shih, Cheng Sun et al.
Style Quantization for Data-Efficient GAN Training
Jian Wang, Xin Lan, Ji-Zhe Zhou et al.
Condensing Action Segmentation Datasets via Generative Network Inversion
Guodong Ding, Rongyu Chen, Angela Yao
Robust Multi-Object 4D Generation for In-the-wild Videos
Wen-Hsuan Chu, Lei Ke, Jianmeng Liu et al.
Recovering Dynamic 3D Sketches from Videos
Jaeah Lee, Changwoon Choi, Young Min Kim et al.
Preconditioners for the Stochastic Training of Neural Fields
Shin-Fang Chng, Hemanth Saratchandran, Simon Lucey
Provoking Multi-modal Few-Shot LVLM via Exploration-Exploitation In-Context Learning
Cheng Chen, Yunpeng Zhai, Yifan Zhao et al.
VinaBench: Benchmark for Faithful and Consistent Visual Narratives
Silin Gao, Sheryl Mathew, Li Mi et al.
Dual-Granularity Semantic Guided Sparse Routing Diffusion Model for General Pansharpening
Yinghui Xing, Qu Li Tao, Shizhou Zhang et al.
Thin-Shell-SfT: Fine-Grained Monocular Non-rigid 3D Surface Tracking with Neural Deformation Fields
Navami Kairanda, Marc Habermann, Shanthika Shankar Naik et al.
HOTFormerLoc: Hierarchical Octree Transformer for Versatile Lidar Place Recognition Across Ground and Aerial Views
Ethan Griffiths, Maryam Haghighat, Simon Denman et al.
Universal Semi-Supervised Domain Adaptation by Mitigating Common-Class Bias
Wenyu Zhang, Qingmu Liu, Felix Ong et al.
EBS-EKF: Accurate and High Frequency Event-based Star Tracking
Albert Reed, Connor Hashemi, Dennis Melamed et al.
VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide
Dohun Lee, Bryan Sangwoo Kim, Geon Yeong Park et al.
SACB-Net: Spatial-awareness Convolutions for Medical Image Registration
Xinxing Cheng, Tianyang Zhang, Wenqi Lu et al.
VITED: Video Temporal Evidence Distillation
Yujie Lu, Yale Song, Lorenzo Torresani et al.
Discovering Hidden Visual Concepts Beyond Linguistic Input in Infant Learning
Xueyi Ke, Satoshi Tsutsui, Yayun Zhang et al.
Boosting Domain Incremental Learning: Selecting the Optimal Parameters is All You Need
Qiang Wang, Xiang Song, Yuhang He et al.
T-CIL: Temperature Scaling using Adversarial Perturbation for Calibration in Class-Incremental Learning
Seong-Hyeon Hwang, Minsu Kim, Steven Euijong Whang
SEC-Prompt:SEmantic Complementary Prompting for Few-Shot Class-Incremental Learning
Ye Liu, Meng Yang
SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs
Guibiao Liao, Qing Li, Zhenyu Bao et al.
Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation
Xingguo Lv, Xingbo Dong, Liwen Wang et al.
VEU-Bench: Towards Comprehensive Understanding of Video Editing
Bozheng Li, Yongliang Wu, YI LU et al.
CorrBEV: Multi-View 3D Object Detection by Correlation Learning with Multi-modal Prototypes
ziteng xue, Mingzhe Guo, Heng Fan et al.
PersonaHOI: Effortlessly Improving Face Personalization in Human-Object Interaction Generation
Xinting Hu, Haoran Wang, Jan Lenssen et al.
HyperNVD: Accelerating Neural Video Decomposition via Hypernetworks
Maria Pilligua, Danna Xue, Javier Vazquez-Corral
Uncertainty-Instructed Structure Injection for Generalizable HD Map Construction
Xiaolu Liu, Ruizi Yang, Song Wang et al.
UWAV: Uncertainty-weighted Weakly-supervised Audio-Visual Video Parsing
Yung-Hsuan Lai, Janek Ebbers, Yu-Chiang Frank Wang et al.
Directional Label Diffusion Model for Learning from Noisy Labels
Senyu Hou, Gaoxia Jiang, Jia Zhang et al.
CroCoDL: Cross-device Collaborative Dataset for Localization
Hermann Blum, Alessandro Mercurio, Joshua O'Reilly et al.
Classic Video Denoising in a Machine Learning World: Robust, Fast, and Controllable
Xin Jin, Simon Niklaus, Zhoutong Zhang et al.
Deterministic Certification of Graph Neural Networks against Graph Poisoning Attacks with Arbitrary Perturbations
Jiate Li, Meng Pang, Yun Dong et al.
Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering
Zhen Yang, Zhuo Tao, Qi Chen et al.
PIDLoc: Cross-View Pose Optimization Network Inspired by PID Controllers
Wooju Lee, Juhye Park, Dasol Hong et al.
Style Evolving along Chain-of-Thought for Unknown-Domain Object Detection
Zihao Zhang, Aming Wu, Yahong Han
IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images
Yushuang Wu, Luyue Shi, Junhao Cai et al.
The Photographer's Eye: Teaching Multimodal Large Language Models to See, and Critique Like Photographers
Daiqing Qi, Handong Zhao, Jing Shi et al.
MEET: Towards Memory-Efficient Temporal Sparse Deep Neural Networks
Zeqi Zhu, Ibrahim Batuhan Akkaya, Luc Waeijen et al.
MaDCoW: Marginal Distortion Correction for Wide-Angle Photography with Arbitrary Objects
Kevin Zhang, Jia-Bin Huang, Jose Echevarria et al.
Touch2Shape: Touch-Conditioned 3D Diffusion for Shape Exploration and Reconstruction
Yuanbo Wang, Zhaoxuan Zhang, Jiajin Qiu et al.
Open Ad-hoc Categorization with Contextualized Feature Learning
Zilin Wang, Sangwoo Mo, Stella X. Yu et al.
ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization
Weiyao Wang, Pierre Gleize, Hao Tang et al.
MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification
Jianwei Zhao, XIN LI, Fan Yang et al.
ReCon: Enhancing True Correspondence Discrimination through Relation Consistency for Robust Noisy Correspondence Learning
Quanxing Zha, Xin Liu, Shu-Juan Peng et al.
Neural 3D Strokes: Creating Stylized 3D Scenes with Vectorized 3D Strokes
Haobin Duan, Miao Wang, Yanxun Li et al.
Conformal Prediction and MLLM aided Uncertainty Quantification in Scene Graph Generation
Sayak Nag, Udita Ghosh, Calvin-Khang Ta et al.
Benchmarking Object Detectors under Real-World Distribution Shifts in Satellite Imagery
Sara Al-Emadi, Yin Yang, Ferda Ofli
TIDE: Training Locally Interpretable Domain Generalization Models Enables Test-time Correction
Aishwarya Agarwal, Srikrishna Karanam, Vineet Gandhi
CARL: A Framework for Equivariant Image Registration
Hastings Greer, Lin Tian, François-Xavier Vialard et al.
Stop Learning it all to Mitigate Visual Hallucination, Focus on the Hallucination Target.
Dokyoon Yoon, Youngsook Song, Woomyoung Park
Composing Parts for Expressive Object Generation
Harsh Rangwani, Aishwarya Agarwal, Kuldeep Kulkarni et al.
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
Yoojin Jung, Byung Cheol Song
R2C: Mapping Room to Chessboard to Unlock LLM As Low-Level Action Planner
Ziyi Bai, Hanxuan Li, Bin Fu et al.
Balancing Two Classifiers via A Simplex ETF Structure for Model Calibration
Jiani Ni, He Zhao, Jintong Gao et al.
Sampling Innovation-Based Adaptive Compressive Sensing
Zhifu Tian, Tao Hu, Chaoyang Niu et al.
Efficient Event-Based Object Detection: A Hybrid Neural Network with Spatial and Temporal Attention
Soikat Hasan Ahmed, Jan Finkbeiner, Emre Neftci
VIRES: Video Instance Repainting via Sketch and Text Guided Generation
Shuchen Weng, Haojie Zheng, Peixuan Zhang et al.
Universal Domain Adaptation for Semantic Segmentation
Seun-An Choe, Keon Hee Park, Jinwoo Choi et al.
Light Transport-aware Diffusion Posterior Sampling for Single-View Reconstruction of 3D Volumes
Ludwic Leonard, Nils Thuerey, rüdiger westermann
RCP-Bench: Benchmarking Robustness for Collaborative Perception Under Diverse Corruptions
Shihang Du, Sanqing Qu, Tianhang Wang et al.
DIO: Decomposable Implicit 4D Occupancy-Flow World Model
Christopher Diehl, Quinlan Sykora, Ben Agro et al.
Deep Video Inverse Tone Mapping Based on Temporal Clues
Yuyao Ye, Ning Zhang, Yang Zhao et al.
Gaussian Splatting Feature Fields for (Privacy-Preserving) Visual Localization
Maxime Pietrantoni, Gabriela Csurka, Torsten Sattler
Improving Editability in Image Generation with Layer-wise Memory
Daneul Kim, Jaeah Lee, Jaesik Park
DAMM-Diffusion: Learning Divergence-Aware Multi-Modal Diffusion Model for Nanoparticles Distribution Prediction
Junjie Zhou, Shouju Wang, Yuxia Tang et al.
Probabilistic Prompt Distribution Learning for Animal Pose Estimation
Jiyong Rao, Brian Nlong Zhao, Yu Wang
Open Vocabulary Semantic Scene Sketch Understanding
Ahmed Bourouis, Judith Fan, Yulia Gryaditskaya
F^3OCUS - Federated Finetuning of Vision-Language Foundation Models with Optimal Client Layer Updating Strategy via Multi-objective Meta-Heuristics
Pramit Saha, Felix Wagner, Divyanshu Mishra et al.
FFaceNeRF: Few-shot Face Editing in Neural Radiance Fields
Kwan Yun, Chaelin Kim, Hangyeul Shin et al.
A New Statistical Model of Star Speckles for Learning to Detect and Characterize Exoplanets in Direct Imaging Observations
Theo Bodrito, Olivier Flasseur, Julien Mairal et al.
Recover and Match: Open-Vocabulary Multi-Label Recognition through Knowledge-Constrained Optimal Transport
Hao Tan, Zichang Tan, Jun Li et al.
STEPS: Sequential Probability Tensor Estimation for Text-to-Image Hard Prompt Search
Yuning Qiu, Andong Wang, Chao Li et al.
Let Samples Speak: Mitigating Spurious Correlation by Exploiting the Clusterness of Samples
WEIWEI LI, Junzhuo Liu, Yuanyuan Ren et al.
High-quality Point Cloud Oriented Normal Estimation via Hybrid Angular and Euclidean Distance Encoding
Yuanqi Li, Jingcheng Huang, Hongshen Wang et al.
Pseudo Visible Feature Fine-Grained Fusion for Thermal Object Detection
Ting Li, Mao Ye, Tianwen Wu et al.
G-FARS: Gradient-Field-based Auto-Regressive Sampling for 3D Part Grouping
Junfeng Cheng, Tania Stathaki
Link-based Contrastive Learning for One-Shot Unsupervised Domain Adaptation
Yue Zhang, Mingyue Bin, Yuyang Zhang et al.
Language-Assisted Debiasing and Smoothing for Foundation Model-Based Semi-Supervised Learning
Na Zheng, Xuemeng Song, Xue Dong et al.
Towards General Robustness Verification of MaxPool-based Convolutional Neural Networks via Tightening Linear Approximation
Yuan Xiao, Shiqing Ma, Juan Zhai et al.
NTClick: Achieving Precise Interactive Segmentation With Noise-tolerant Clicks
Chenyi Zhang, Ting Liu, Xiaochao Qu et al.
Transferable and Principled Efficiency for Open-Vocabulary Segmentation
Jingxuan Xu, Wuyang Chen, Yao Zhao et al.
Dynamic Group Normalization: Spatio-Temporal Adaptation to Evolving Data Statistics
Yair Smadar, Assaf Hoogi
Argus: A Compact and Versatile Foundation Model for Vision
Weiming Zhuang, Chen Chen, Zhizhong Li et al.
ETAP: Event-based Tracking of Any Point
Friedhelm Hamann, Daniel Gehrig, Filbert Febryanto et al.
EASEMVC:Efficient Dual Selection Mechanism for Deep Multi-View Clustering
Baili Xiao, Zhibin Dong, KE LIANG et al.
Multi-modal Topology-embedded Graph Learning for Spatially Resolved Genes Prediction from Pathology Images with Prior Gene Similarity Information
Hang Shi, Chi Changxi, Peng Wan et al.
A Semantic Knowledge Complementarity based Decoupling Framework for Semi-supervised Class-imbalanced Medical Image Segmentation
Zheng Zhang, Guanchun Yin, Bo Zhang et al.
Unbiasing through Textual Descriptions: Mitigating Representation Bias in Video Benchmarks
Nina Shvetsova, Arsha Nagrani, Bernt Schiele et al.
UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines
Chen Tang, Xinzhu Ma, Encheng Su et al.
Efficient Personalization of Quantized Diffusion Model without Backpropagation
Hoigi Seo, Wongi Jeong, Kyungryeol Lee et al.
GLane3D: Detecting Lanes with Graph of 3D Keypoints
Halil İbrahim Öztürk, Muhammet Esat Kalfaoglu, Ozsel Kilinc
An Image-like Diffusion Method for Human-Object Interaction Detection
Xiaofei Hui, Haoxuan Qu, Hossein Rahmani et al.
Leveraging Global Stereo Consistency for Category-Level Shape and 6D Pose Estimation from Stereo Images
Junning Qiu, Minglei Lu, Fei Wang et al.
Percept, Memory, and Imagine: World Feature Simulating for Open-Domain Unknown Object Detection
Aming Wu, Cheng Deng
SGCR: Spherical Gaussians for Efficient 3D Curve Reconstruction
Xinran Yang, Donghao Ji, Yuanqi Li et al.
ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams
Chris Dongjoo Kim, Jihwan Moon, Sangwoo Moon et al.
Real-time Acquisition and Reconstruction of Dynamic Volumes with Neural Structured Illumination
Yixin Zeng, Zoubin Bi, Yin Mingrui et al.
Flash3D: Super-scaling Point Transformers through Joint Hardware-Geometry Locality
Liyan Chen, Gregory P. Meyer, Zaiwei Zhang et al.
Relation-Rich Visual Document Generator for Visual Information Extraction
Zi-Han Jiang, Chien-Wei Lin, WeiHua Li et al.
UNEM: UNrolled Generalized EM for Transductive Few-Shot Learning
Long Zhou, Fereshteh Shakeri, Aymen Sadraoui et al.
Video Language Model Pretraining with Spatio-temporal Masking
Yue Wu, Zhaobo Qi, Junshu Sun et al.
OFER: Occluded Face Expression Reconstruction
Pratheba Selvaraju, Victoria Abrevaya, Timo Bolkart et al.
Blind Bitstream-corrupted Video Recovery via Metadata-guided Diffusion Model
Shuyun Wang, Hu Zhang, Xin Shen et al.
NN-Former: Rethinking Graph Structure in Neural Architecture Representation
Ruihan Xu, Haokui Zhang, Yaowei Wang et al.
ONDA-Pose: Occlusion-Aware Neural Domain Adaptation for Self-Supervised 6D Object Pose Estimation
Tao Tan, Qiulei Dong
Towards Cost-Effective Learning: A Synergy of Semi-Supervised and Active Learning
Tianxiang Yin, Ningzhong Liu, Han Sun
Named Entity Driven Zero-Shot Image Manipulation
Zhida Feng, Li Chen, Jing Tian et al.
Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning
Huajie Jiang, Zhengxian Li, Xiaohan Yu et al.
Black Hole-Driven Identity Absorbing in Diffusion Models
Muhammad Shaheryar, Jong Taek Lee, Soon Ki Jung
ADU: Adaptive Detection of Unknown Categories in Black-Box Domain Adaptation
Yushan Lai, Guowen Li, Haoyuan Liang et al.
POMP: Physics-constrainable Motion Generative Model through Phase Manifolds
Bin Ji, Ye Pan, zhimeng Liu et al.
AirRoom: Objects Matter in Room Reidentification
Runmao Yao, Yi Du, Zhuoqun Chen et al.
Temporal Action Detection Model Compression by Progressive Block Drop
Xiaoyong Chen, Yong Guo, Jiaming Liang et al.
SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting
Dongliang Luo, Hanshen Zhu, Ziyang Zhang et al.
Instance-wise Supervision-level Optimization in Active Learning
Shinnosuke Matsuo, Riku Togashi, Ryoma Bise et al.
Type-R: Automatically Retouching Typos for Text-to-Image Generation
Wataru Shimoda, Naoto Inoue, Daichi Haraguchi et al.
SketchFusion: Learning Universal Sketch Features through Fusing Foundation Models
Subhadeep Koley, Tapas Kumar Dutta, Aneeshan Sain et al.
Reducing Class-wise Confusion for Incremental Learning with Disentangled Manifolds
Huitong Chen, Yu Wang, Yan Fan et al.
Point-Cache: Test-time Dynamic and Hierarchical Cache for Robust and Generalizable Point Cloud Analysis
Hongyu Sun, Qiuhong Ke, Ming Cheng et al.
Advancing Manga Analysis: Comprehensive Segmentation Annotations for the Manga109 Dataset
Minshan Xie, Jian Lin, Hanyuan Liu et al.
SyncSDE: A Probabilistic Framework for Diffusion Synchronization
Hyunjun Lee, Hyunsoo Lee, Sookwan Han
Incremental Object Keypoint Learning
Mingfu Liang, Jiahuan Zhou, Xu Zou et al.
SPARC: Score Prompting and Adaptive Fusion for Zero-Shot Multi-Label Recognition in Vision-Language Models
Kevin Miller, Aditya Gangrade, Samarth Mishra et al.
SCSA: A Plug-and-Play Semantic Continuous-Sparse Attention for Arbitrary Semantic Style Transfer
Chunnan Shang, Zhizhong Wang, Hongwei Wang et al.
LiSu: A Dataset and Method for LiDAR Surface Normal Estimation
Dušan Malić, Christian Fruhwirth-Reisinger, Samuel Schulter et al.
RipVIS: Rip Currents Video Instance Segmentation Benchmark for Beach Monitoring and Safety
Andrei Dumitriu, Florin Tatui, Florin Miron et al.
Foveated Instance Segmentation
Hongyi Zeng, Wenxuan Liu, Tianhua Xia et al.
Keep the Balance: A Parameter-Efficient Symmetrical Framework for RGB+X Semantic Segmentation
Jiaxin Cai, Jingze Su, Qi Li et al.
Perturb-and-Revise: Flexible 3D Editing with Generative Trajectories
Susung Hong, Johanna Suvi Karras, Ricardo Martin et al.
Perceptual Inductive Bias Is What You Need Before Contrastive Learning
Junru Zhao, Tianqin Li, Dunhan Jiang et al.
Identity-preserving Distillation Sampling by Fixed-Point Iterator
SeonHwa Kim, Jiwon Kim, Soobin Park et al.
End-to-End HOI Reconstruction Transformer with Graph-based Encoding
Zhenrong Wang, Qi Zheng, Sihan Ma et al.
SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation
Hao Du, Bo Wu, Yan Lu et al.