Most Cited CVPR "3d all-atom models" Papers
Learning to Normalize on the SPD Manifold under Bures-Wasserstein Geometry
Rui Wang, Shaocheng Jin, Ziheng Chen et al.
Insightful Instance Features for 3D Instance Segmentation
Wonseok Roh, Hwanhee Jung, Giljoo Nam et al.
LOCORE: Image Re-ranking with Long-Context Sequence Modeling
Zilin Xiao, Pavel Suma, Ayush Sachdeva et al.
Discontinuity-preserving Normal Integration with Auxiliary Edges
Hyomin Kim, Yucheol Jung, Seungyong Lee
Category-Agnostic Neural Object Rigging
Guangzhao He, Chen Geng, Shangzhe Wu et al.
DiffCAM: Data-Driven Saliency Maps by Capturing Feature Differences
Xingjian Li, Qiming Zhao, Neelesh Bisht et al.
EntitySAM: Segment Everything in Video
Mingqiao Ye, Seoung Wug Oh, Lei Ke et al.
Mitigating Ambiguities in 3D Classification with Gaussian Splatting
Ruiqi Zhang, Hao Zhu, Jingyi Zhao et al.
Sample- and Parameter-Efficient Auto-Regressive Image Models
Elad Amrani, Leonid Karlinsky, Alex M. Bronstein
Potential Field Based Deep Metric Learning
Shubhang Bhatnagar, Narendra Ahuja
Solving Instance Detection from an Open-World Perspective
Qianqian Shen, Yunhan Zhao, Nahyun Kwon et al.
Boosting the Dual-Stream Architecture in Ultra-High Resolution Segmentation with Resolution-Biased Uncertainty Estimation
Rong Qin, Xingyu Liu, Jinglei Shi et al.
SP3D: Boosting Sparsely-Supervised 3D Object Detection via Accurate Cross-Modal Semantic Prompts
Shijia Zhao, Qiming Xia, Xusheng Guo et al.
Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness
Beier Zhu, Jiequan Cui, Hanwang Zhang et al.
BLADE: Single-view Body Mesh Estimation through Accurate Depth Estimation
Shengze Wang, Jiefeng Li, Tianye Li et al.
Flexible Group Count Enables Hassle-Free Structured Pruning
Jiamu Zhang, Shaochen Zhong, Andrew Ye et al.
Learning to Count without Annotations
Lukas Knobel, Tengda Han, Yuki Asano
Split Adaptation for Pre-trained Vision Transformers
Lixu Wang, Bingqi Shang, Yi Li et al.
Coherent 3D Portrait Video Reconstruction via Triplane Fusion
Shengze Wang, Xueting Li, Chao Liu et al.
Rethinking Correspondence-based Category-Level Object Pose Estimation
Huan Ren, Wenfei Yang, Shifeng Zhang et al.
Galaxy Walker: Geometry-aware VLMs For Galaxy-scale Understanding
Tianyu Chen, Xingcheng Fu, Yisen Gao et al.
Do ImageNet-trained Models Learn Shortcuts? The Impact of Frequency Shortcuts on Generalization
Shunxin Wang, Raymond Veldhuis, Nicola Strisciuglio
UA-Pose: Uncertainty-Aware 6D Object Pose Estimation and Online Object Completion with Partial References
Ming-Feng Li, Xin Yang, Fu-En Wang et al.
Seek Common Ground While Reserving Differences: Semi-Supervised Image-Text Sentiment Recognition
Wuyou Xia, Guoli Jia, Sicheng Zhao et al.
Diffusion-based Event Generation for High-Quality Image Deblurring
Xinan Xie, Qing Zhang, Wei-Shi Zheng
ToonerGAN: Reinforcing GANs for Obfuscating Automated Facial Indexing
Kartik Thakral, Shashikant Prasad, Stuti Aswani et al.
Embracing Collaboration Over Competition: Condensing Multiple Prompts for Visual In-Context Learning
Jinpeng Wang, Tianci Luo, Yaohua Zha et al.
Few-shot Personalized Scanpath Prediction
Ruoyu Xue, Jingyi Xu, Sounak Mondal et al.
COSMIC: Clique-Oriented Semantic Multi-space Integration for Robust CLIP Test-Time Adaptation
Fanding Huang, Jingyan Jiang, Qinting Jiang et al.
Adaptive Softassign via Hadamard-Equipped Sinkhorn
Binrui Shen, Qiang Niu, Shengxin Zhu
Boltzmann Attention Sampling for Image Analysis with Small Objects
Theodore Zhao, Sid Kiblawi, Mu Wei et al.
JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data
Runjian Chen, Wenqi Shao, Bo Zhang et al.
EasyCraft: A Robust and Efficient Framework for Automatic Avatar Crafting
Suzhen Wang, Weijie Chen, Wei Zhang et al.
SynTab-LLaVA: Enhancing Multimodal Table Understanding with Decoupled Synthesis
Bangbang Zhou, Zuan Gao, Zixiao Wang et al.
FlexDrive: Toward Trajectory Flexibility in Driving Scene Gaussian Splatting Reconstruction and Rendering
Jingqiu Zhou, Lue Fan, Linjiang Huang et al.
v-CLR: View-Consistent Learning for Open-World Instance Segmentation
Chang-Bin Zhang, Jinhong Ni, Yujie Zhong et al.
Consistency-aware Self-Training for Iterative-based Stereo Matching
Jingyi Zhou, Peng Ye, Haoyu Zhang et al.
Synthetic Visual Genome
Jae Sung Park, Zixian Ma, Linjie Li et al.
Observation-Guided Diffusion Probabilistic Models
Junoh Kang, Jinyoung Choi, Sungik Choi et al.
VITED: Video Temporal Evidence Distillation
Yujie Lu, Yale Song, Lorenzo Torresani et al.
RC-AutoCalib: An End-to-End Radar-Camera Automatic Calibration Network
Van-Tin Luu, Yong-Lin Cai, Vu-Hoang Tran et al.
Multi-Modal Synergistic Implicit Image Enhancement for Efficient Optical Flow Estimation
Weichen Dai, Wu Hexing, Xiaoyang Weng et al.
Adapting Text-to-Image Generation with Feature Difference Instruction for Generic Image Restoration
Chao Wang, Hehe Fan, Huichen Yang et al.
BOE-ViT: Boosting Orientation Estimation with Equivariance in Self-Supervised 3D Subtomogram Alignment
Runmin Jiang, Jackson Daggett, Shriya Pingulkar et al.
DirectTriGS: Triplane-based Gaussian Splatting Field Representation for 3D Generation
Xiaoliang Ju, Hongsheng Li
PURA: Parameter Update-Recovery Test-Time Adaption for RGB-T Tracking
Zekai Shao, Yufan Hu, Bin Fan et al.
HSI: A Holistic Style Injector for Arbitrary Style Transfer
Shuhao Zhang, Hui Kang, Yang Liu et al.
CroCoDL: Cross-device Collaborative Dataset for Localization
Hermann Blum, Alessandro Mercurio, Joshua O'Reilly et al.
AdaptCMVC: Robust Adaption to Incremental Views in Continual Multi-view Clustering
Jing Wang, Songhe Feng, Kristoffer Knutsen Wickstrøm et al.
Blind Bitstream-corrupted Video Recovery via Metadata-guided Diffusion Model
Shuyun Wang, Hu Zhang, Xin Shen et al.
3D Prior Is All You Need: Cross-Task Few-shot 2D Gaze Estimation
Yihua Cheng, Hengfei Wang, Zhongqun Zhang et al.
Minimal Interaction Separated Tuning: A New Paradigm for Visual Adaptation
Ningyuan Tang, Minghao Fu, Jianxin Wu
Improving Personalized Search with Regularized Low-Rank Parameter Updates
Fiona Ryan, Josef Sivic, Fabian Caba Heilbron et al.
CaMuViD: Calibration-Free Multi-View Detection
Amir Etefaghi Daryani, M. Usman Maqbool Bhutta, Byron Hernandez et al.
Data Distributional Properties As Inductive Bias for Systematic Generalization
Felipe del Rio, Alain Raymond, Daniel Florea et al.
HyperPose: Hypernetwork-Infused Camera Pose Localization and an Extended Cambridge Landmarks Dataset
Ron Ferens, Yosi Keller
SyncSDE: A Probabilistic Framework for Diffusion Synchronization
Hyunjun Lee, Hyunsoo Lee, Sookwan Han
VRetouchEr: Learning Cross-frame Feature Interdependence with Imperfection Flow for Face Retouching in Videos
Wen Xue, Le Jiang, Lianxin Xie et al.
Composing Parts for Expressive Object Generation
Harsh Rangwani, Aishwarya Agarwal, Kuldeep Kulkarni et al.
ShapeShifter: 3D Variations Using Multiscale and Sparse Point-Voxel Diffusion
Nissim Maruani, Wang Yifan, Matthew Fisher et al.
Difference Inversion: Interpolate and Isolate the Difference with Token Consistency for Image Analogy Generation
Hyunsoo Kim, Donghyun Kim, Suhyun Kim
TAROT: Towards Essentially Domain-Invariant Robustness with Theoretical Justification
Dongyoon Yang, Jihu Lee, Yongdai Kim
Real-time Acquisition and Reconstruction of Dynamic Volumes with Neural Structured Illumination
Yixin Zeng, Zoubin Bi, Yin Mingrui et al.
Revisiting Fairness in Multitask Learning: A Performance-Driven Approach for Variance Reduction
Xiaohan Qin, Xiaoxing Wang, Junchi Yan
ADU: Adaptive Detection of Unknown Categories in Black-Box Domain Adaptation
Yushan Lai, Guowen Li, Haoyuan Liang et al.
DiSRT-In-Bed: Diffusion-Based Sim-to-Real Transfer Framework for In-Bed Human Mesh Recovery
Jing Gao, Ce Zheng, Laszlo Jeni et al.
ODHSR: Online Dense 3D Reconstruction of Humans and Scenes from Monocular Videos
Zetong Zhang, Manuel Kaufmann, Lixin Xue et al.
DynPose: Largely Improving the Efficiency of Human Pose Estimation by a Simple Dynamic Framework
Yalong Xu, Lin Zhao, Chen Gong et al.
Self-Supervised Learning for Color Spike Camera Reconstruction
Yanchen Dong, Ruiqin Xiong, Xiaopeng Fan et al.
Perturb-and-Revise: Flexible 3D Editing with Generative Trajectories
Susung Hong, Johanna Suvi Karras, Ricardo Martin et al.
Sketchtopia: A Dataset and Foundational Agents for Benchmarking Asynchronous Multimodal Communication with Iconic Feedback
Mohd Hozaifa Khan, Ravi Kiran Sarvadevabhatla
COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts
Jiansheng Li, Xingxuan Zhang, Hao Zou et al.
Explicit Depth-Aware Blurry Video Frame Interpolation Guided by Differential Curves
Yan Zaoming, Pengcheng Lei, Tingting Wang et al.
DART: Disease-aware Image-Text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation
Sang-Jun Park, Keun-Soo Heo, Dong-Hee Shin et al.
HeatFormer: A Neural Optimizer for Multiview Human Mesh Recovery
Yuto Matsubara, Ko Nishino
Classic Video Denoising in a Machine Learning World: Robust, Fast, and Controllable
Xin Jin, Simon Niklaus, Zhoutong Zhang et al.
Argus: A Compact and Versatile Foundation Model for Vision
Weiming Zhuang, Chen Chen, Zhizhong Li et al.
SCSA: A Plug-and-Play Semantic Continuous-Sparse Attention for Arbitrary Semantic Style Transfer
Chunnan Shang, Zhizhong Wang, Hongwei Wang et al.
TIDE: Training Locally Interpretable Domain Generalization Models Enables Test-time Correction
Aishwarya Agarwal, Srikrishna Karanam, Vineet Gandhi
Boosting Point-Supervised Temporal Action Localization through Integrating Query Reformation and Optimal Transport
Mengnan Liu, Le Wang, Sanping Zhou et al.
Advancing Manga Analysis: Comprehensive Segmentation Annotations for the Manga109 Dataset
Minshan Xie, Jian Lin, Hanyuan Liu et al.
R2C: Mapping Room to Chessboard to Unlock LLM As Low-Level Action Planner
Ziyi Bai, Hanxuan Li, Bin Fu et al.
Mamba-Adaptor: State Space Model Adaptor for Visual Recognition
Fei Xie, Jiahao Nie, Yujin Tang et al.
Percept, Memory, and Imagine: World Feature Simulating for Open-Domain Unknown Object Detection
Aming Wu, Cheng Deng
Graph Neural Network Combining Event Stream and Periodic Aggregation for Low-Latency Event-based Vision
Manon Dampfhoffer, Thomas Mesquida, Damien Joubert et al.
A Semantic Knowledge Complementarity based Decoupling Framework for Semi-supervised Class-imbalanced Medical Image Segmentation
Zheng Zhang, Guanchun Yin, Bo Zhang et al.
High-quality Point Cloud Oriented Normal Estimation via Hybrid Angular and Euclidean Distance Encoding
Yuanqi Li, Jingcheng Huang, Hongshen Wang et al.
Object Dynamics Modeling with Hierarchical Point Cloud-based Representations
Chanho Kim, Li Fuxin
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models
Jinhui Yi, Syed Talal Wasim, Yanan Luo et al.
GeoDepth: From Point-to-Depth to Plane-to-Depth Modeling for Self-Supervised Monocular Depth Estimation
Haifeng Wu, Shuhang Gu, Lixin Duan et al.
SemiDAViL: Semi-supervised Domain Adaptation with Vision-Language Guidance for Semantic Segmentation
Hritam Basak, Zhaozheng Yin
Spk2SRImgNet: Super-Resolve Dynamic Scene from Spike Stream via Motion Aligned Collaborative Filtering
Yuanlin Wang, Yiyang Zhang, Ruiqin Xiong et al.
Foveated Instance Segmentation
Hongyi Zeng, Wenxuan Liu, Tianhua Xia et al.
MetricGrids: Arbitrary Nonlinear Approximation with Elementary Metric Grids based Implicit Neural Representation
Shu Wang, Yanbo Gao, Shuai Li et al.
De^2Gaze: Deformable and Decoupled Representation Learning for 3D Gaze Estimation
Yunfeng Xiao, Xiaowei Bai, Baojun Chen et al.
GlyphMastero: A Glyph Encoder for High-Fidelity Scene Text Editing
Tong Wang, Ting Liu, Xiaochao Qu et al.
PIDLoc: Cross-View Pose Optimization Network Inspired by PID Controllers
Wooju Lee, Juhye Park, Dasol Hong et al.
Soft Self-labeling and Potts Relaxations for Weakly-supervised Segmentation
Zhongwen Zhang, Yuri Boykov
Latent Space Imaging
Matheus Souza, Yidan Zheng, Kaizhang Kang et al.
Integral Fast Fourier Color Constancy
Wenjun Wei, Yanlin Qian, Huaian Chen et al.
HELVIPAD: A Real-World Dataset for Omnidirectional Stereo Depth Estimation
Mehdi Zayene, Albias Havolli, Jannik Endres et al.
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
Yoojin Jung, Byung Cheol Song
Balancing Two Classifiers via A Simplex ETF Structure for Model Calibration
Jiani Ni, He Zhao, Jintong Gao et al.
An Image-like Diffusion Method for Human-Object Interaction Detection
Xiaofei Hui, Haoxuan Qu, Hossein Rahmani et al.
Type-R: Automatically Retouching Typos for Text-to-Image Generation
Wataru Shimoda, Naoto Inoue, Daichi Haraguchi et al.
STEPS: Sequential Probability Tensor Estimation for Text-to-Image Hard Prompt Search
Yuning Qiu, Andong Wang, Chao Li et al.
Polarized Color Screen Matting
Kenji Enomoto, Scott Cohen, Brian Price et al.
AirRoom: Objects Matter in Room Reidentification
Runmao Yao, Yi Du, Zhuoqun Chen et al.
Dual Energy-Based Model with Open-World Uncertainty Estimation for Out-of-distribution Detection
Qi Chen, Hu Ding
MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification
Jianwei Zhao, Xin Li, Fan Yang et al.
Unlocking Generalization Power in LiDAR Point Cloud Registration
Zhenxuan Zeng, Qiao Wu, Xiyu Zhang et al.
Medusa: A Multi-Scale High-order Contrastive Dual-Diffusion Approach for Multi-View Clustering
Liang Chen, Zhe Xue, Yawen Li et al.
Meta-Learning Hyperparameters for Parameter Efficient Fine-Tuning
Zichen Tian, Yaoyao Liu, Qianru Sun
Homogeneous Dynamics Space for Heterogeneous Humans
Xinpeng Liu, Junxuan Liang, Chenshuo Zhang et al.
Black Hole-Driven Identity Absorbing in Diffusion Models
Muhammad Shaheryar, Jong Taek Lee, Soon Ki Jung
Creating Your Editable 3D Photorealistic Avatar with Tetrahedron-constrained Gaussian Splatting
Hanxi Liu, Yifang Men, Zhouhui Lian
Understanding Multi-layered Transmission Matrices
Marina Alterman, Anat Levin
WildAvatar: Learning In-the-wild 3D Avatars from the Web
Zihao Huang, Shoukang Hu, Guangcong Wang et al.
OFER: Occluded Face Expression Reconstruction
Pratheba Selvaraju, Victoria Abrevaya, Timo Bolkart et al.
Pseudo Visible Feature Fine-Grained Fusion for Thermal Object Detection
Ting Li, Mao Ye, Tianwen Wu et al.
Leveraging Global Stereo Consistency for Category-Level Shape and 6D Pose Estimation from Stereo Images
Junning Qiu, Minglei Lu, Fei Wang et al.
Style Evolving along Chain-of-Thought for Unknown-Domain Object Detection
Zihao Zhang, Aming Wu, Yahong Han
LoKi: Low-dimensional KAN for Efficient Fine-tuning Image Models
Xuan Cai, Renjie Pan, Hua Yang
Align-A-Video: Deterministic Reward Tuning of Image Diffusion Models for Consistent Video Editing
Shengzhi Wang, Yingkang Zhong, Jiangchuan Mu et al.
Transferable and Principled Efficiency for Open-Vocabulary Segmentation
Jingxuan Xu, Wuyang Chen, Yao Zhao et al.
ReCon: Enhancing True Correspondence Discrimination through Relation Consistency for Robust Noisy Correspondence Learning
Quanxing Zha, Xin Liu, Shu-Juan Peng et al.
Link to the Past: Temporal Propagation for Fast 3D Human Reconstruction from Monocular Video
Marchellus Matthew, Nadhira Noor, In Kyu Park
Relation-Rich Visual Document Generator for Visual Information Extraction
Zi-Han Jiang, Chien-Wei Lin, WeiHua Li et al.
PAVE: Patching and Adapting Video Large Language Models
Zhuoming Liu, Yiquan Li, Khoi D Nguyen et al.
VIRES: Video Instance Repainting via Sketch and Text Guided Generation
Shuchen Weng, Haojie Zheng, Peixuan Zhang et al.
Touch2Shape: Touch-Conditioned 3D Diffusion for Shape Exploration and Reconstruction
Yuanbo Wang, Zhaoxuan Zhang, Jiajin Qiu et al.
F^3OCUS - Federated Finetuning of Vision-Language Foundation Models with Optimal Client Layer Updating Strategy via Multi-objective Meta-Heuristics
Pramit Saha, Felix Wagner, Divyanshu Mishra et al.
VEU-Bench: Towards Comprehensive Understanding of Video Editing
Bozheng Li, Yongliang Wu, Yi Lu et al.
Dense Dispersed Structured Light for Hyperspectral 3D Imaging of Dynamic Scenes
Suhyun Shin, Seungwoo Yoon, Ryota Maeda et al.
Depth-Guided Bundle Sampling for Efficient Generalizable Neural Radiance Field Reconstruction
Li Fang, Hao Zhu, Longlong Chen et al.
PIAD: Pose and Illumination agnostic Anomaly Detection
Kaichen Yang, Junjie Cao, Zeyu Bai et al.
Dynamic Group Normalization: Spatio-Temporal Adaptation to Evolving Data Statistics
Yair Smadar, Assaf Hoogi
Instance-wise Supervision-level Optimization in Active Learning
Shinnosuke Matsuo, Riku Togashi, Ryoma Bise et al.
Boost the Inference with Co-training: A Depth-guided Mutual Learning Framework for Semi-supervised Medical Polyp Segmentation
Yuxin Li, Zihao Zhu, Yuxiang Zhang et al.
Video Language Model Pretraining with Spatio-temporal Masking
Yue Wu, Zhaobo Qi, Junshu Sun et al.
Incorporating Dense Knowledge Alignment into Unified Multimodal Representation Models
Yuhao Cui, Xinxing Zu, Wenhua Zhang et al.
Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning
Huajie Jiang, Zhengxian Li, Xiaohan Yu et al.
Multi-modal Topology-embedded Graph Learning for Spatially Resolved Genes Prediction from Pathology Images with Prior Gene Similarity Information
Hang Shi, Chi Changxi, Peng Wan et al.
Language-Assisted Debiasing and Smoothing for Foundation Model-Based Semi-Supervised Learning
Na Zheng, Xuemeng Song, Xue Dong et al.
Point-Cache: Test-time Dynamic and Hierarchical Cache for Robust and Generalizable Point Cloud Analysis
Hongyu Sun, Qiuhong Ke, Ming Cheng et al.
Towards Cost-Effective Learning: A Synergy of Semi-Supervised and Active Learning
Tianxiang Yin, Ningzhong Liu, Han Sun
SinGS: Animatable Single-Image Human Gaussian Splats with Kinematic Priors
Yufan Wu, Xuanhong Chen, Wen Li et al.
GaPT-DAR: Category-level Garments Pose Tracking via Integrated 2D Deformation and 3D Reconstruction
Li Zhang, Mingliang Xu, Jianan Wang et al.
Keep the Balance: A Parameter-Efficient Symmetrical Framework for RGB+X Semantic Segmentation
Jiaxin Cai, Jingze Su, Qi Li et al.
Feature Spectrum Learning for Remote Sensing Change Detection
Qi Zang, Dong Zhao, Shuang Wang et al.
Take the Bull by the Horns: Learning to Segment Hard Samples
Yuan Guo, Jingyu Kong, Yu Wang et al.
ONDA-Pose: Occlusion-Aware Neural Domain Adaptation for Self-Supervised 6D Object Pose Estimation
Tao Tan, Qiulei Dong
Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling
Nannan Li, Kevin Shih, Bryan A. Plummer
LiSu: A Dataset and Method for LiDAR Surface Normal Estimation
Dušan Malić, Christian Fruhwirth-Reisinger, Samuel Schulter et al.
Explaining Domain Shifts in Language: Concept Erasing for Interpretable Image Classification
Zequn Zeng, Yudi Su, Jianqiao Sun et al.
ETAP: Event-based Tracking of Any Point
Friedhelm Hamann, Daniel Gehrig, Filbert Febryanto et al.
UMFN: Unified Multi-Domain Face Normalization for Joint Cross-domain Prototype Learning and Heterogeneous Face Recognition
Meng Pang, Wenjun Zhang, Nanrun Zhou et al.
Unbiasing through Textual Descriptions: Mitigating Representation Bias in Video Benchmarks
Nina Shvetsova, Arsha Nagrani, Bernt Schiele et al.
The Photographer's Eye: Teaching Multimodal Large Language Models to See, and Critique Like Photographers
Daiqing Qi, Handong Zhao, Jing Shi et al.
Seeing A 3D World in A Grain of Sand
Yufan Zhang, Yu Ji, Yu Guo et al.
EnliveningGS: Active Locomotion of 3DGS
Siyuan Shen, Tianjia Shao, Kun Zhou et al.
DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer
Ho-Joong Kim, Yearang Lee, Jung-Ho Hong et al.
Targeted Forgetting of Image Subgroups in CLIP Models
Zeliang Zhang, Gaowen Liu, Charles Fleming et al.
Self-Supervised Large Scale Point Cloud Completion for Archaeological Site Restoration
Aocheng Li, James R. Zimmer-Dauphinee, Rajesh Kalyanam et al.
Twinner: Shining Light on Digital Twins in a Few Snaps
Jesus Zarzar, Tom Monnier, Roman Shapovalov et al.
SPARC: Score Prompting and Adaptive Fusion for Zero-Shot Multi-Label Recognition in Vision-Language Models
Kevin Miller, Aditya Gangrade, Samarth Mishra et al.
Unified Reconstruction of Static and Dynamic Scenes from Events
Qiyao Gao, Peiqi Duan, Hanyue Lou et al.
MAC-Ego3D: Multi-Agent Gaussian Consensus for Real-Time Collaborative Ego-Motion and Photorealistic 3D Reconstruction
Xiaohao Xu, Feng Xue, Shibo Zhao et al.
FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video
Andrea Boscolo Camiletto, Jian Wang, Eduardo Alvarado et al.
Adapting to Observation Length of Trajectory Prediction via Contrastive Learning
Ruiqi Qiu, Jun Gong, Xinyu Zhang et al.
Fitted Neural Lossless Image Compression
Zhe Zhang, Zhenzhong Chen, Shan Liu
NTClick: Achieving Precise Interactive Segmentation With Noise-tolerant Clicks
Chenyi Zhang, Ting Liu, Xiaochao Qu et al.
Pose-Guided Temporal Enhancement for Robust Low-Resolution Hand Reconstruction
Kaixin Fan, Pengfei Ren, Jingyu Wang et al.
Automatic Spectral Calibration of Hyperspectral Images: Method, Dataset and Benchmark
Zhuoran Du, Shaodi You, Cheng Cheng et al.
Customized Condition Controllable Generation for Video Soundtrack
Fan Qi, KunSheng Ma, Changsheng Xu
Acc3D: Accelerating Single Image to 3D Diffusion Models via Edge Consistency Guided Score Distillation
Kendong Liu, Zhiyu Zhu, Hui Liu et al.
SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation
Hao Du, Bo Wu, Yan Lu et al.
Remote Photoplethysmography in Real-World and Extreme Lighting Scenarios
Hang Shao, Lei Luo, Jianjun Qian et al.
Data-free Universal Adversarial Perturbation with Pseudo-semantic Prior
Chanhui Lee, Yeonghwan Song, Jeany Son
Attribute-Missing Multi-view Graph Clustering
Bowen Zhao, Qianqian Wang, Zhengming Ding et al.
EvOcc: Accurate Semantic Occupancy for Automated Driving Using Evidence Theory
Jonas Kälble, Sascha Wirges, Maxim Tatarchenko et al.
Provoking Multi-modal Few-Shot LVLM via Exploration-Exploitation In-Context Learning
Cheng Chen, Yunpeng Zhai, Yifan Zhao et al.
HiLoTs: High-Low Temporal Sensitive Representation Learning for Semi-Supervised LiDAR Segmentation in Autonomous Driving
R.D. Lin, Pengcheng Weng, Yinqiao Wang et al.
CamPoint: Boosting Point Cloud Segmentation with Virtual Camera
Jianhui Zhang, Luo Yizhi, Zicheng Zhang et al.
Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering
Zhen Yang, Zhuo Tao, Qi Chen et al.
Symbolic Representation for Any-to-Any Generative Tasks
Jiaqi Chen, Xiaoye Zhu, Yue Wang et al.
ESC: Erasing Space Concept for Knowledge Deletion
Tae-Young Lee, Sundong Park, Minwoo Jeon et al.
Deep Video Inverse Tone Mapping Based on Temporal Clues
Yuyao Ye, Ning Zhang, Yang Zhao et al.
ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams
Chris Dongjoo Kim, Jihwan Moon, Sangwoo Moon et al.
Active Event-based Stereo Vision
Jianing Li, Yunjian Zhang, Haiqian Han et al.
NN-Former: Rethinking Graph Structure in Neural Architecture Representation
Ruihan Xu, Haokui Zhang, Yaowei Wang et al.
POMP: Physics-constrainable Motion Generative Model through Phase Manifolds
Bin Ji, Ye Pan, Zhimeng Liu et al.
Let Samples Speak: Mitigating Spurious Correlation by Exploiting the Clusterness of Samples
Weiwei Li, Junzhuo Liu, Yuanyuan Ren et al.
PersonaHOI: Effortlessly Improving Face Personalization in Human-Object Interaction Generation
Xinting Hu, Haoran Wang, Jan Lenssen et al.
Uncertainty-Instructed Structure Injection for Generalizable HD Map Construction
Xiaolu Liu, Ruizi Yang, Song Wang et al.
CURSOR: Scalable Mixed-Order Hypergraph Matching with CUR Decomposition
Qixuan Zheng, Ming Zhang, Hong Yan
Sampling Innovation-Based Adaptive Compressive Sensing
Zhifu Tian, Tao Hu, Chaoyang Niu et al.
Odd-One-Out: Anomaly Detection by Comparing with Neighbors
Ankan Kumar Bhunia, Changjian Li, Hakan Bilen
Nested Diffusion Models Using Hierarchical Latent Priors
Xiao Zhang, Ruoxi Jiang, Rebecca Willett et al.
Temporal Action Detection Model Compression by Progressive Block Drop
Xiaoyong Chen, Yong Guo, Jiaming Liang et al.