Most Cited 2025 "homogeneous data distributions" Papers
Robustifying Zero-Shot Vision Language Models by Subspaces Alignment
Junhao Dong, Piotr Koniusz, Liaoyuan Feng et al.
Test-Time Retrieval-Augmented Adaptation for Vision-Language Models
Xinqi Fan, Xueli CHEN, Luoxiao Yang et al.
CHARM3R: Towards Unseen Camera Height Robust Monocular 3D Detector
Abhinav Kumar, Yuliang Guo, Zhihao Zhang et al.
Multi-View Slot Attention Using Paraphrased Texts for Face Anti-Spoofing
Jeongmin Yu, Susang Kim, Kisu Lee et al.
SAMora: Enhancing SAM through Hierarchical Self-Supervised Pre-Training for Medical Images
Shuhang Chen, Hangjie Yuan, Pengwei Liu et al.
FE-CLIP: Frequency Enhanced CLIP Model for Zero-Shot Anomaly Detection and Segmentation
Tao Gong, Qi Chu, Bin Liu et al.
Cypher-RI: Reinforcement Learning for Integrating Schema Selection into Cypher Generation
Hanchen Su, Xuyuan Li, Yan Zhou et al.
Environment-Agnostic Pose: Generating Environment-independent Object Representations for 6D Pose Estimation
Shaobo Zhang, Yuhang Huang, Wanqing Zhao et al.
Bias-Resilient Weakly Supervised Semantic Segmentation Using Normalizing Flows
Xianglin Qiu, Xiaoyang Wang, Zhen Zhang et al.
Cracking Instance Jigsaw Puzzles: A Superior Alternative to Multiple Instance Learning for Whole Slide Image Analysis
Xiwen Chen, Peijie Qiu, Wenhui Zhu et al.
STDDNet: Harnessing Mamba for Video Polyp Segmentation via Spatial-aligned Temporal Modeling and Discriminative Dynamic Representation Learning
Guilian Chen, Huisi Wu, Jing Qin
DecAD: Decoupling Anomalies in Latent Space for Multi-Class Unsupervised Anomaly Detection
Xiaolei Wang, Xiaoyang Wang, Huihui Bai et al.
Few-Shot Pattern Detection via Template Matching and Regression
Eunchan Jo, Dahyun Kang, Sanghyun Kim et al.
Hierarchical Event Memory for Accurate and Low-latency Online Video Temporal Grounding
Minghang Zheng, Yuxin Peng, Benyuan Sun et al.
RA-BUSSeg: Relation-aware Semi-supervised Breast Ultrasound Image Segmentation via Adjacent Propagation and Cross-layer Alignment
Wanting ZHANG, Zhenhui Ding, Guilian Chen et al.
DisCo: Towards Distinct and Coherent Visual Encapsulation in Video MLLMs
Jiahe Zhao, Rongkun Zheng, Yi Wang et al.
Exploring Probabilistic Modeling Beyond Domain Generalization for Semantic Segmentation
I-Hsiang Chen, Hua-En Chang, Wei-Ting Chen et al.
Auto-Controlled Image Perception in MLLMs via Visual Perception Tokens
Runpeng Yu, Xinyin Ma, Xinchao Wang
Bridging the Gap between Brain and Machine in Interpreting Visual Semantics: Towards Self-adaptive Brain-to-Text Decoding
Jiaxuan Chen, Yu Qi, Yueming Wang et al.
Spatial Alignment and Temporal Matching Adapter for Video-Radar Remote Physiological Measurement
Qian Liang, Ruixu Geng, Jinbo Chen et al.
WeaveSeg: Iterative Contrast-weaving and Spectral Feature-refining for Nuclei Instance Segmentation
Jiajia Li, Huisi Wu, Jing Qin
CARIM: Caption-Based Autonomous Driving Scene Retrieval via Inclusive Text Matching
Minjoo Ki, Dae Jung Kim, Kisung Kim et al.
PS-Mamba: Spatial-Temporal Graph Mamba for Pose Sequence Refinement
Haoye Dong, Gim Hee Lee
Modeling Saliency Dataset Bias
Matthias Kümmerer, Harneet Singh Khanuja, Matthias Bethge
Controllable Latent Space Augmentation for Digital Pathology
Sofiène Boutaj, Marin Scalbert, Pierre Marza et al.
Event-guided Unified Framework for Low-light Video Enhancement, Frame Interpolation, and Deblurring
Taewoo Kim, Kuk-Jin Yoon
Interpretable point cloud classification using multiple instance learning
Matt De Vries, Reed Naidoo, Olga Fourkioti et al.
Learning Beyond Still Frames: Scaling Vision-Language Models with Video
Yiyuan Zhang, Handong Li, Jing Liu et al.
Beyond Pixel Uncertainty: Bounding the OoD Objects in Road Scenes
Huachao Zhu, Zelong Liu, Zhichao Sun et al.
Temporal-aware Query Routing for Real-time Video Instance Segmentation
Zesen Cheng, Kehan Li, Yian Zhao et al.
Learnable Retrieval Enhanced Visual-Text Alignment and Fusion for Radiology Report Generation
Qin Zhou, Guoyan Liang, Xindi Li et al.
Anomaly Detection of Integrated Circuits Package Substrates Using the Large Vision Model SAIC: Dataset Construction, Methodology, and Application
Ruiyun Yu, Bingyang Guo, Haoyuan Li
Prompt-driven Transferable Adversarial Attack on Person Re-Identification with Attribute-aware Textual Inversion
Yuan Bian, Min Liu, Yunqi Yi et al.
Medical World Model
Yijun Yang, Zhao-Yang Wang, Qiuping Liu et al.
DIH-CLIP: Unleashing the Diversity of Multi-Head Self-Attention for Training-Free Open-Vocabulary Semantic Segmentation
Songsong Duan, Xi Yang, Nannan Wang
Partially Matching Submap Helps: Uncertainty Modeling and Propagation for Text to Point Cloud Localization
Mingtao Feng, Longlong Mei, Zijie Wu et al.
TopicGeo: An Efficient Unified Framework for Geolocation
Xin Wang, Xinlin Wang, Shuiping Gou
HRScene: How Far Are VLMs from Effective High-Resolution Image Understanding?
Yusen Zhang, Wenliang Zheng, Aashrith Madasu et al.
CObL: Toward Zero-Shot Ordinal Layering without User Prompting
Aneel Damaraju, Dean Hazineh, Todd Zickler
Debiasing Trace Guidance: Top-down Trace Distillation and Bottom-up Velocity Alignment for Unsupervised Anomaly Detection
Xingjian Wang, Li Chai, Jiming Chen
Similarity Memory Prior is All You Need for Medical Image Segmentation
Hao Tang, Zhiqing Guo, Liejun Wang et al.
CalliReader: Contextualizing Chinese Calligraphy via an Embedding-Aligned Vision-Language Model
Yuxuan Luo, Jiaqi Tang, Chenyi Huang et al.
Accelerate 3D Object Detection Models via Zero-Shot Attention Key Pruning
Lizhen Xu, Xiuxiu Bai, Xiaojun Jia et al.
Adapting Vehicle Detectors for Aerial Imagery to Unseen Domains with Weak Supervision
Xiao Fang, Minhyek Jeon, Zheyang Qin et al.
DC-TTA: Divide-and-Conquer Framework for Test-Time Adaptation of Interactive Segmentation
Jihun Kim, Hoyong Kwon, Hyeokjun Kweon et al.
FIND: Few-Shot Anomaly Inspection with Normal-Only Multi-Modal Data
YITING LI, Fayao Liu, Jingyi Liao et al.
VISO: Accelerating In-orbit Object Detection with Language-Guided Mask Learning and Sparse Inference
Meiqi Wang, Han Qiu
Unsupervised Histopathological Image Semantic Segmentation with Overlapping Patches Consistency Constraint
Wentian Cai, Weizhao Weng, Zihao Huang et al.
UINavBench: A Framework for Comprehensive Evaluation of Interactive Digital Agents
Harsh Agrawal, Eldon Schoop, Xinlei Pan et al.
VIPerson: Flexibly Generating Virtual Identity for Person Re-Identification
Xiao-Wen Zhang, Delong Zhang, Yi-Xing Peng et al.
Towards Robustness of Person Search against Corruptions
Woojung Son, Yoonki Cho, Guoyuan An et al.
SALoM: Structure Aware Temporal Graph Networks with Long-Short Memory Updater
Hanwen Liu, Longjiao Zhang, Rui Wang et al.
One Look is Enough: Seamless Patchwise Refinement for Zero-Shot Monocular Depth Estimation on High-Resolution Images
Byeongjun Kwon, Munchurl Kim
Flow-MIL: Constructing Highly-expressive Latent Feature Space For Whole Slide Image Classification Using Normalizing Flow
Yingfan MA, Bohan An, Ao Shen et al.
Background Invariance Testing According to Semantic Proximity
Zukang Liao, Min Chen
Vision-Language Neural Graph Featurization for Extracting Retinal Lesions
Taimur Hassan, Anabia Sohail, Muzammal Naseer et al.
EquiCaps: Predictor-Free Pose-Aware Pre-Trained Capsule Networks
Athinoulla Konstantinou, Georgios Leontidis, Mamatha Thota et al.
AJAHR: Amputated Joint Aware 3D Human Mesh Recovery
Hyunjin Cho, Giyun Choi, Jongwon Choi
SpikeDiff: Zero-shot High-Quality Video Reconstruction from Chromatic Spike Camera and Sub-millisecond Spike Streams
Siqi Yang, Jinxiu Liang, Zhaojun Huang et al.
Global Motion Corresponder for 3D Point-Based Scene Interpolation under Large Motion
Junru Lin, Chirag Vashist, Mikaela Uy et al.
Teaching AI the Anatomy Behind the Scan: Addressing Anatomical Flaws in Medical Image Segmentation with Learnable Prior
Young Seok Jeon, Hongfei Yang, Huazhu Fu et al.
TESPEC: Temporally-Enhanced Self-Supervised Pretraining for Event Cameras
Mohammad Mohammadi, Ziyi Wu, Igor Gilitschenski
Token-Efficient VLM: High-Resolution Image Understanding via Dynamic Region Proposal
Yitong Jiang, Jinwei Gu, Tianfan Xue et al.
Region-aware Anchoring Mechanism for Efficient Referring Visual Grounding
Shuyi Ouyang, Ziwei Niu, Hongyi Wang et al.
Enrich and Detect: Video Temporal Grounding with Multimodal LLMs
Shraman Pramanick, Effrosyni Mavroudi, Yale Song et al.
From Abyssal Darkness to Blinding Glare: A Benchmark on Extreme Exposure Correction in Real World
Bo Wang, Huiyuan Fu, Zhiye Huang et al.
Breaking Grid Constraints: Dynamic Graph Reconstruction Network for Multi-organ Segmentation
Junhao Xiao, Yang Wei, Jingyu Wang et al.
MaskSAM: Auto-prompt SAM with Mask Classification for Volumetric Medical Image Segmentation
Bin Xie, Hao Tang, Bin Duan et al.
H3R: Hybrid Multi-view Correspondence for Generalizable 3D Reconstruction
Heng Jia, Na Zhao, Linchao Zhu
MEH: A Multi-Style Dataset and Toolkit for Advancing Egyptian Hieroglyph Recognition
Maksim Golyadkin, Rubanova Alexandrovna, Aleksandr Utkov et al.
Hybrid-Tower: Fine-grained Pseudo-query Interaction and Generation for Text-to-Video Retrieval
Bangxiang Lan, Ruobing Xie, Ruixiang Zhao et al.
Unbiased Missing-modality Multimodal Learning
Ruiting Dai, Chenxi Li, Yandong Yan et al.
ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba
Juncan Deng, Shuaiting Li, Zeyu Wang et al.
DM-EFS: Dynamically Multiplexed Expanded Features Set Form for Robust and Efficient Small Object Detection
Aashish Sharma
DiffTell: A High-Quality Dataset for Describing Image Manipulation Changes
Zonglin Di, Jing Shi, Yifei Fan et al.
Estimating 2D Camera Motion with Hybrid Motion Basis
Haipeng Li, Tianhao Zhou, Zhanglei Yang et al.
Mixture-of-Scores: Robust Image-Text Data Valuation via Three Lines of Code
WU Sitong, Haoru Tan, Yukang Chen et al.
Inverse Image-Based Rendering for Light Field Generation from Single Images
Hyunjun Jung, Hae-Gon Jeon
MonoSOWA: Scalable monocular 3D Object detector Without human Annotations
Jan Skvrna, Lukas Neumann
ScoreHOI: Physically Plausible Reconstruction of Human-Object Interaction via Score-Guided Diffusion
AO LI, Jinpeng Liu, Yixuan Zhu et al.
Axis-level Symmetry Detection with Group-Equivariant Representation
Wongyun Yu, Ahyun Seo, Minsu Cho
PossLoss: A Reliable and Sensitive Facial Landmark Detection Loss Function
Qikui Zhu
U-ViLAR: Uncertainty-Aware Visual Localization for Autonomous Driving via Differentiable Association and Registration
Xiaofan Li, Zhihao Xu, Chenming Wu et al.
Group Inertial Poser: Multi-Person Pose and Global Translation from Sparse Inertial Sensors and Ultra-Wideband Ranging
Ying Xue, Jiaxi Jiang, Rayan Armani et al.
PointGAC: Geometric-Aware Codebook for Masked Point Modeling
Abiao Li, Chenlei Lv, Guofeng Mei et al.
Statistical Confidence Rescoring for Robust 3D Scene Graph Generation from Multi-View Images
Qi Xun Yeo, Yanyan Li, Gim Hee Lee
PHD: Personalized 3D Human Body Fitting with Point Diffusion
Hsuan-I Ho, Chen Guo, Po-Chen Wu et al.
Image-Guided Shape-from-Template Using Mesh Inextensibility Constraints
Dinh-Vinh-Thuy Tran, Ruochen Chen, Shaifali Parashar
Dual-S3D: Hierarchical Dual-Path Selective SSM-CNN for High-Fidelity Implicit Reconstruction
Luoxi Zhang, Pragyan Shrestha, Yu Zhou et al.
FastPoint: Accelerating 3D Point Cloud Model Inference via Sample Point Distance Prediction
Donghyun Lee, Dawoon Jeong, Jae W. Lee et al.
RobuSTereo: Robust Zero-Shot Stereo Matching under Adverse Weather
Yuran Wang, Yingping Liang, Yutao Hu et al.
AutoComPose: Automatic Generation of Pose Transition Descriptions for Composed Pose Retrieval Using Multimodal LLMs
Yi-Ting Shen, Sungmin Eum, Doheon Lee et al.
MMGeo: Multimodal Compositional Geo-Localization for UAVs
Yuxiang Ji, Boyong He, Zhuoyue Tan et al.
Large Scene Generation with Cube-Absorb Discrete Diffusion
Qianjiang Hu, Wei Hu
SynAD: Enhancing Real-World End-to-End Autonomous Driving Models through Synthetic Data Integration
Jongsuk Kim, Jae Young Lee, Gyojin Han et al.
SpiderSolver: A Geometry-Aware Transformer for Solving PDEs on Complex Geometries
KAI QI, Fan Wang, Zhewen Dong et al.
Gaussian-based World Model: Gaussian Priors for Voxel-Based Occupancy Prediction and Future Motion Prediction
Tuo Feng, Wenguan Wang, Yi Yang
Tracking Tiny Drones against Clutter: Large-Scale Infrared Benchmark with Motion-Centric Adaptive Algorithm
Jiahao Zhang, Zongli Jiang, Gang Wang et al.
DAA*: Deep Angular A Star for Image-based Path Planning
Zhiwei Xu
RCTDistill: Cross-Modal Knowledge Distillation Framework for Radar-Camera 3D Object Detection with Temporal Fusion
Geonho Bang, Minjae Seong, Jisong Kim et al.
GSRecon: Efficient Generalizable Gaussian Splatting for Surface Reconstruction from Sparse Views
Hang Yang, Le Hui, Jianjun Qian et al.
VOccl3D: A Video Benchmark Dataset for 3D Human Pose and Shape Estimation under real Occlusions
Yash Garg, Saketh Bachu, Arindam Dutta et al.
Towards Safer and Understandable Driver Intention Prediction
Mukilan Karuppasamy, Shankar Gangisetty, Shyam Nandan Rai et al.
Exploring View Consistency for Scene-Adaptive Low-Light Light Field Image Enhancement
Shuo Zhang, Chen Gao, Youfang Lin
InstaDrive: Instance-Aware Driving World Models for Realistic and Consistent Video Generation
Zhuoran Yang, Xi Guo, Chenjing Ding et al.
NormalLoc: Visual Localization on Textureless 3D Models using Surface Normals
Jiro Abe, Gaku Nakano, Kazumine Ogura
NGD: Neural Gradient Based Deformation for Monocular Garment Reconstruction
Soham Dasgupta, Shanthika Naik, Preet Savalia et al.
HccePose (BF): Predicting Front & Back Surfaces to Construct Ultra-Dense 2D-3D Correspondences for Pose Estimation
Yulin Wang, Mengting Hu, Hongli Li et al.
Lifting the Structural Morphing for Wide-Angle Images Rectification: Unified Content and Boundary Modeling
Wenting Luan, Siqi Lu, Yongbin Zheng et al.
4DSegStreamer: Streaming 4D Panoptic Segmentation via Dual Threads
Ling Liu, Jun Tian, Li Yi
StableDepth: Scene-Consistent and Scale-Invariant Monocular Depth
Zheng Zhang, Lihe Yang, Tianyu Yang et al.
RayletDF: Raylet Distance Fields for Generalizable 3D Surface Reconstruction from Point Clouds or Gaussians
Shenxing Wei, Jinxi Li, Yafei YANG et al.
Semantic-guided Camera Ray Regression for Visual Localization
Yesheng Zhang, Xu Zhao
Polarimetric Neural Field via Unified Complex-Valued Wave Representation
Chu Zhou, Yixin Yang, Junda Liao et al.
High-Precision 3D Measurement of Complex Textured Surfaces Using Multiple Filtering Approach
Yuchong Chen, Jian Yu, Shaoyan Gai et al.
From Gallery to Wrist: Realistic 3D Bracelet Insertion in Videos
Chenjian Gao, Lihe Ding, Rui Han et al.
HiNeuS: High-fidelity Neural Surface Mitigating Low-texture and Reflective Ambiguity
Yida Wang, Xueyang Zhang, Kun Zhan et al.
I2-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting
Zhimin Liao, Ping Wei, Ruijie Zhang et al.
InsideOut: Integrated RGB-Radiative Gaussian Splatting for Comprehensive 3D Object Representation
Jungmin Lee, Seonghyuk Hong, Juyong Lee et al.
RIOcc: Efficient Cross-Modal Fusion Transformer with Collaborative Feature Refinement for 3D Semantic Occupancy Prediction
Baojie Fan, Xiaotian Li, Yuhan Zhou et al.
MetaScope: Optics-Driven Neural Network for Ultra-Micro Metalens Endoscopy
Wuyang Li, Wentao Pan, Xiaoyuan Liu et al.
Mitigating Geometric Degradation in Fast DownSampling via FastAdapter for Point Cloud Segmentation
Shuofeng Sun, Haibin Yan
SEHDR: Single-Exposure HDR Novel View Synthesis via 3D Gaussian Bracketing
Yiyu Li, Haoyuan Wang, Ke Xu et al.
TARS: Traffic-Aware Radar Scene Flow Estimation
Jialong Wu, Marco Braun, Dominic Spata et al.
Leaps and Bounds: An Improved Point Cloud Winding Number Formulation for Fast Normal Estimation and Surface Reconstruction
Chamin Hewa Koneputugodage, Dylan Campbell, Stephen Gould
Harnessing Text-to-Image Diffusion Models for Point Cloud Self-Supervised Learning
Yiyang Chen, Shanshan Zhao, Lunhao Duan et al.
OD-RASE: Ontology-Driven Risk Assessment and Safety Enhancement for Autonomous Driving
Kota Shimomura, Masaki Nambata, Atsuya Ishikawa et al.
MDP-Omni: Parameter-free Multimodal Depth Prior-based Sampling for Omnidirectional Stereo Matching
Eunjin Son, HyungGi Jo, Wookyong Kwon et al.
Projective Equivariant Networks via Second-order Fundamental Differential Invariants
Yikang Li, Yeqing Qiu, Yuxuan Chen et al.
EDM: Efficient Deep Feature Matching
Xi Li, Tong Rao, Cihui Pan
TOTP: Transferable Online Pedestrian Trajectory Prediction with Temporal-Adaptive Mamba Latent Diffusion
Ziyang Ren, Ping Wei, Shangqi Deng et al.
Layer-wise Vision Injection with Disentangled Attention for Efficient LVLMs
Xuange Zhang, Dengjie Li, Bo Liu et al.
Weakly-Supervised Learning of Dense Functional Correspondences
Stefan Stojanov, Linan Zhao, Yunzhi Zhang et al.
PlaneRAS: Learning Planar Primitives for 3D Plane Recovery
Fang Zhang, Wenzhao Zheng, Linqing Zhao et al.
Unleashing the Temporal Potential of Stereo Event Cameras for Continuous-Time 3D Object Detection
Jae Young Kang, Hoonhee Cho, Kuk-Jin Yoon
AR-VRM: Imitating Human Motions for Visual Robot Manipulation with Analogical Reasoning
Dejie Yang, Zijing Zhao, Yang Liu
LightCity: An Urban Dataset for Outdoor Inverse Rendering and Reconstruction under Multi-illumination Conditions
Jingjing Wang, Qirui Hu, Chong Bao et al.
Skill-Driven Neurosymbolic State Abstractions
Alper Ahmetoglu, Steven James, Cameron Allen et al.
Feature Extraction and Representation of Pre-training Point Cloud Based on Diffusion Models
Chang Qiu, Feipeng Da, Zilei Zhang
Temporal Overlapping Prediction: A Self-supervised Pre-training Method for LiDAR Moving Object Segmentation
Ziliang Miao, Runjian Chen, Yixi Cai et al.
A Simple yet Mighty Hartley Diffusion Versatilist for Generalizable Dense Vision Tasks
Qi Bi, Jingjun Yi, Huimin Huang et al.
S²M²: Scalable Stereo Matching Model for Reliable Depth Estimation
JUNHONG MIN, YOUNGPIL JEON, Jimin Kim et al.
SpatialTrackerV2: Advancing 3D Point Tracking with Explicit Camera Motion
Yuxi Xiao, Jianyuan Wang, Nan Xue et al.
Towards Visual Localization Interoperability: Cross-Feature for Collaborative Visual Localization and Mapping
Alberto Jaenal, Paula Carbó Cubero, Jose Araujo et al.
MiDSummer: Multi-Guidance Diffusion for Controllable Zero-Shot Immersive Gaussian Splatting Scene Generation
Anjun Hu, Richard Tomsett, Valentin Gourmet et al.
Spatio-Spectral Pattern Illumination for Direct and Indirect Separation from a Single Hyperspectral Image
Shin Ishihara, Imari Sato
GeoFormer: Geometry Point Encoder for 3D Object Detection with Graph-based Transformer
Xin Jin, Haisheng Su, Cong Ma et al.
Tile-wise vs. Image-wise: Random-Tile Loss and Training Paradigm for Gaussian Splatting
Xiaoyu Zhang, Weihong Pan, Xiaojun Xiang et al.
Explaining Human Preferences via Metrics for Structured 3D Reconstruction
Jack Langerman, Denis Rozumny, Yuzhong Huang et al.
RoCo-Sim: Enhancing Roadside Collaborative Perception through Foreground Simulation
Yuwen Du, Anning Hu, Zichen Chao et al.
Sample-Adaptivity Tradeoff in On-Demand Sampling
Nika Haghtalab, Omar Montasser, Mingda Qiao
UniVerse: Unleashing the Scene Prior of Video Diffusion Models for Robust Radiance Field Reconstruction
Jin Cao, Hongrui Wu, Ziyong Feng et al.
ExploreGS: Explorable 3D Scene Reconstruction with Virtual Camera Samplings and Diffusion Priors
Minsu Kim, Subin Jeon, In Cho et al.
LaneDiffusion: Improving Centerline Graph Learning via Prior Injected BEV Feature Generation
Zijie Wang, Weiming Zhang, Wei Zhang et al.
Planar Affine Rectification from Local Change of Scale and Orientation
Yuval Nissan, Marc Pollefeys, Daniel Barath
ERNet: Efficient Non-Rigid Registration Network for Point Sequences
Guangzhao He, Yuxi Xiao, Zhen Xu et al.
Doppler-Aware LiDAR-RADAR Fusion for Weather-Robust 3D Detection
Yujeong Chae, Heejun Park, Hyeonseong Kim et al.
Focal Plane Visual Feature Generation and Matching on a Pixel Processor Array
Hongyi Zhang, Laurie Bose, Jianing Chen et al.
GloPER: Unsupervised Animal Pattern Extraction from Local Reconstruction
Bowen Chen, Yun Sing Koh, Gillian Dobbie
ArgMatch: Adaptive Refinement Gathering for Efficient Dense Matching
Yuxin Deng, Kaining Zhang, Linfeng Tang et al.
Thermal Polarimetric Multi-view Stereo
Takahiro Kushida, Kenichiro Tanaka
Epipolar Consistent Attention Aggregation Network for Unsupervised Light Field Disparity Estimation
Chen Gao, Shuo Zhang, Youfang Lin
GenFlow3D: Generative Scene Flow Estimation and Prediction on Point Cloud Sequences
Hanlin Li, Wenming Weng, Yueyi Zhang et al.
Prior-aware Dynamic Temporal Modeling Framework for Sequential 3D Hand Pose Estimation
Pengfei Ren, Jingyu Wang, Haifeng Sun et al.
DRaM-LHM: A Quaternion Framework for Iterative Camera Pose Estimation
Chen Lin, Weizhi Du, Zhixiang Min et al.
Tree Skeletonization from 3D Point Clouds by Denoising Diffusion
Elias Marks, Lucas Nunes, Federico Magistri et al.
SAFT: Shape and Appearance of Fabrics from Template via Differentiable Physical Simulations from Monocular Video
David Stotko, Reinhard Klein
TANDEM: Bi-Level Data Mixture Optimization with Twin Networks
Jiaxing Wang, Deping Xiang, Jin Xu et al.
Neural Inverse Rendering for High-Accuracy 3D Measurement of Moving Objects with Fewer Phase-Shifting Patterns
Yuki Urakawa, Yoshihiro Watanabe
Scaling 3D Compositional Models for Robust Classification and Pose Estimation
Xiaoding Yuan, Prakhar Kaushik, Guofeng Zhang et al.
OVA-Fields: Weakly Supervised Open-Vocabulary Affordance Fields for Robot Operational Part Detection
Heng Su, Mengying Xie, Nieqing Cao et al.
Mitigating Instability in High Residual Adaptive Sampling for PINNs via Langevin Dynamics
Minseok Jeong, Giup Seo, Euiseok Hwang
GeoDiffusion: A Training-Free Framework for Accurate 3D Geometric Conditioning in Image Generation
Phillip Mueller, Talip Ünlü, Sebastian Schmidt et al.
Recover Biological Structure from Sparse-View Diffraction Images with Neural Volumetric Prior
Renzhi He, Haowen Zhou, Yubei Chen et al.
NavQ: Learning a Q-Model for Foresighted Vision-and-Language Navigation
Peiran Xu, Xicheng Gong, Yadong Mu
ProtoPairNet: Interpretable Regression through Prototypical Pair Reasoning
Rose Gurung, Ronilo Ragodos, Chiyu Ma et al.
Online Bilateral Trade With Minimal Feedback: Don’t Waste Seller’s Time
Francesco Bacchiocchi, Matteo Castiglioni, Roberto Colomboni et al.
When Anchors Meet Cold Diffusion: A Multi-Stage Approach to Lane Detection
Bo-Lun Huang, Tzu-Hsiang Ni, Feng-Kai Huang et al.
Motal: Unsupervised 3D Object Detection by Modality and Task-specific Knowledge Transfer
Hai Wu, Hongwei Lin, Xusheng Guo et al.
NeuFrameQ: Neural Frame Fields for Scalable and Generalizable Anisotropic Quadrangulation
Ying-Tian Liu, Jiajun Li, Yu-Tao Liu et al.
Fairness-aware Anomaly Detection via Fair Projection
Feng Xiao, Xiaoying Tang, Jicong Fan
PolGS: Polarimetric Gaussian Splatting for Fast Reflective Surface Reconstruction
Yufei Han, Bowen Tie, Heng Guo et al.
Stochastic Gradient Estimation for Higher-Order Differentiable Rendering
Zican Wang, Michael Fischer, Tobias Ritschel
Zero-shot Inexact CAD Model Alignment from a Single Image
Pattaramanee Arsomngern, Sasikarn Khwanmuang, Matthias Nießner et al.
MGSfM: Multi-Camera Geometry Driven Global Structure-from-Motion
Peilin Tao, Hainan Cui, Diantao Tu et al.
Uncertainty-Aware Diffusion-Guided Refinement of 3D Scenes
Sarosij Bose, Arindam Dutta, Sayak Nag et al.
MAESTRO: Task-Relevant Optimization via Adaptive Feature Enhancement and Suppression for Multi-task 3D Perception
ChangWon Kang, Jisong Kim, Hongjae Shin et al.
Joint Semantic and Rendering Enhancements in 3D Gaussian Modeling with Anisotropic Local Encoding
Jingming He, Chongyi Li, Shiqi Wang et al.
Learning Large Motion Estimation from Intermediate Representations with a High-Resolution Optical Flow Dataset Featuring Long-Range Dynamic Motion
Hoonhee Cho, Yuhwan Jeong, Kuk-Jin Yoon
V2XScenes: A Multiple Challenging Traffic Conditions Dataset for Large-Range Vehicle-Infrastructure Collaborative Perception
Bowen Wang, Yafei Wang, Wei Gong et al.
Scalable Evaluation and Neural Models for Compositional Generalization
Giacomo Camposampiero, Pietro Barbiero, Michael Hersche et al.
GeoExplorer: Active Geo-localization with Curiosity-Driven Exploration
Li Mi, Manon Béchaz, Zeming Chen et al.
PatchDEMUX: A Certifiably Robust Framework for Multi-label Classifiers Against Adversarial Patches
Dennis Jacob, Chong Xiang, Prateek Mittal
Communication-Efficient Multi-Vehicle Collaborative Semantic Segmentation via Sparse 3D Gaussian Sharing
Tianyu Hong, Xiaobo Zhou, Wenkai Hu et al.
DATA: Domain-And-Time Alignment for High-Quality Feature Fusion in Collaborative Perception
Chengchang Tian, Jianwei Ma, Yan Huang et al.
Hi-Gaussian: Hierarchical Gaussians under Normalized Spherical Projection for Single-View 3D Reconstruction
Binjian Xie, Pengju Zhang, Hao Wei et al.
Heatmap Regression without Soft-Argmax for Facial Landmark Detection
Chiao-An Yang, Raymond A. Yeh
Exploiting Vision Language Model for Training-Free 3D Point Cloud OOD Detection via Graph Score Propagation
Tiankai Chen, Yushu Li, Adam Goodge et al.
Is Tracking really more challenging in First Person Egocentric Vision?
Matteo Dunnhofer, Zaira Manigrasso, Christian Micheloni
Puzzle Similarity: A Perceptually-guided Cross-Reference Metric for Artifact Detection in 3D Scene Reconstructions
Nicolai Hermann, Jorge Condor, Piotr Didyk