Most Cited CVPR "causal graph traversal" Papers
Self-Supervised Class-Agnostic Motion Prediction with Spatial and Temporal Consistency Regularizations
Kewei Wang, Yizheng Wu, Jun Cen et al.
Revisiting Sampson Approximations for Geometric Estimation Problems
Felix Rydell, Angelica Torres, Viktor Larsson
EarthLoc: Astronaut Photography Localization by Indexing Earth from Space
Gabriele Berton, Alex Stoken, Barbara Caputo et al.
Geometry-aware Reconstruction and Fusion-refined Rendering for Generalizable Neural Radiance Fields
Tianqi Liu, Xinyi Ye, Min Shi et al.
A Unified Framework for Human-centric Point Cloud Video Understanding
Yiteng Xu, Kecheng Ye, Xiao Han et al.
Prompt Augmentation for Self-supervised Text-guided Image Manipulation
Rumeysa Bodur, Binod Bhattarai, Tae-Kyun Kim
Coupled Laplacian Eigenmaps for Locally-Aware 3D Rigid Point Cloud Matching
Matteo Bastico, Etienne Decencière, Laurent Corté et al.
ExtraNeRF: Visibility-Aware View Extrapolation of Neural Radiance Fields with Diffusion Models
Meng-Li Shih, Wei-Chiu Ma, Lorenzo Boyice et al.
Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences
Seungwook Kim, Kejie Li, Xueqing Deng et al.
Unbiased Estimator for Distorted Conics in Camera Calibration
Chaehyeon Song, Jaeho Shin, Myung-Hwan Jeon et al.
Accurate Training Data for Occupancy Map Prediction in Automated Driving Using Evidence Theory
Jonas Kälble, Sascha Wirges, Maxim Tatarchenko et al.
A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network
Ruichen Ma, Guanchao Qiao, Yian Liu et al.
A Bayesian Approach to OOD Robustness in Image Classification
Prakhar Kaushik, Adam Kortylewski, Alan L. Yuille
Unveiling the Unknown: Unleashing the Power of Unknown to Known in Open-Set Source-Free Domain Adaptation
Fuli Wan, Han Zhao, Xu Yang et al.
Density-guided Translator Boosts Synthetic-to-Real Unsupervised Domain Adaptive Segmentation of 3D Point Clouds
Zhimin Yuan, Wankang Zeng, Yanfei Su et al.
Normalizing Flows on the Product Space of SO(3) Manifolds for Probabilistic Human Pose Modeling
Olaf Dünkel, Tim Salzmann, Florian Pfaff
Efficient Multitask Dense Predictor via Binarization
Yuzhang Shang, Dan Xu, Gaowen Liu et al.
RCL: Reliable Continual Learning for Unified Failure Detection
Fei Zhu, Zhen Cheng, Xu-Yao Zhang et al.
G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images
Zixiong Huang, Qi Chen, Libo Sun et al.
Combining Frame and GOP Embeddings for Neural Video Representation
Jens Eirik Saethre, Roberto Azevedo, Christopher Schroers
Deep Single Image Camera Calibration by Heatmap Regression to Recover Fisheye Images Under Manhattan World Assumption
Nobuhiko Wakai, Satoshi Sato, Yasunori Ishii et al.
Language-conditioned Detection Transformer
Jang Hyun Cho, Philipp Krähenbühl
Semantic-Aware Multi-Label Adversarial Attacks
Hassan Mahmood, Ehsan Elhamifar
Laplacian-guided Entropy Model in Neural Codec with Blur-dissipated Synthesis
Atefeh Khoshkhahtinat, Ali Zafari, Piyush Mehta et al.
CAM Back Again: Large Kernel CNNs from a Weakly Supervised Object Localization Perspective
Shunsuke Yasuki, Masato Taki
Dual-Enhanced Coreset Selection with Class-wise Collaboration for Online Blurry Class Incremental Learning
Yutian Luo, Shiqi Zhao, Haoran Wu et al.
CHAIN: Enhancing Generalization in Data-Efficient GANs via lipsCHitz continuity constrAIned Normalization
Yao Ni, Piotr Koniusz
TetraSphere: A Neural Descriptor for O(3)-Invariant Point Cloud Analysis
Pavlo Melnyk, Andreas Robinson, Michael Felsberg et al.
Fixed Point Diffusion Models
Luke Melas-Kyriazi, Xingjian Bai
Task-aligned Part-aware Panoptic Segmentation through Joint Object-Part Representations
Daan de Geus, Gijs Dubbelman
Incorporating Geo-Diverse Knowledge into Prompting for Increased Geographical Robustness in Object Recognition
Kyle Buettner, Sina Malakouti, Xiang Li et al.
DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance
Zixuan Wang, Jia Jia, Shikun Sun et al.
Probabilistic Sampling of Balanced K-Means using Adiabatic Quantum Computing
Jan-Nico Zaech, Martin Danelljan, Tolga Birdal et al.
FreeMan: Towards Benchmarking 3D Human Pose Estimation under Real-World Conditions
Jiong Wang, Fengyu Yang, Bingliang Li et al.
Multi-Session SLAM with Differentiable Wide-Baseline Pose Optimization
Lahav Lipson, Jia Deng
Sparse Views Near Light: A Practical Paradigm for Uncalibrated Point-light Photometric Stereo
Mohammed Brahimi, Bjoern Haefner, Zhenzhang Ye et al.
Joint-Task Regularization for Partially Labeled Multi-Task Learning
Kento Nishi, Junsik Kim, Wanhua Li et al.
HyperSDFusion: Bridging Hierarchical Structures in Language and Geometry for Enhanced 3D Text2Shape Generation
Zhiying Leng, Tolga Birdal, Xiaohui Liang et al.
CPR-Coach: Recognizing Composite Error Actions based on Single-class Training
Shunli Wang, Shuaibing Wang, Dingkang Yang et al.
Dispersed Structured Light for Hyperspectral 3D Imaging
Suhyun Shin, Seokjun Choi, Felix Heide et al.
Focus on Hiders: Exploring Hidden Threats for Enhancing Adversarial Training
Qian Li, Yuxiao Hu, Yinpeng Dong et al.
Learning to Produce Semi-dense Correspondences for Visual Localization
Khang Truong Giang, Soohwan Song, Sungho Jo
Weak-to-Strong 3D Object Detection with X-Ray Distillation
Alexander Gambashidze, Aleksandr Dadukin, Maksim Golyadkin et al.
Mind Artist: Creating Artistic Snapshots with Human Thought
Jiaxuan Chen, Yu Qi, Yueming Wang et al.
PEGASUS: Personalized Generative 3D Avatars with Composable Attributes
Hyunsoo Cha, Byungjun Kim, Hanbyul Joo
LASA: Instance Reconstruction from Real Scans using A Large-scale Aligned Shape Annotation Dataset
Haolin Liu, Chongjie Ye, Yinyu Nie et al.
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao, Shiqian Su, Xizhou Zhu et al.
Detect Any Mirrors: Boosting Learning Reliability on Large-Scale Unlabeled Data with an Iterative Data Engine
Zhaohu Xing, Lihao Liu, Yijun Yang et al.
Learning Heterogeneous Tissues with Mixture of Experts for Gigapixel Whole Slide Images
Junxian Wu, Minheng Chen, Xinyi Ke et al.
Insights from the Use of Previously Unseen Neural Architecture Search Datasets
Rob Geada, David Towers, Matthew Forshaw et al.
PerLA: Perceptive 3D Language Assistant
Guofeng Mei, Wei Lin, Luigi Riz et al.
Hardware-Rasterized Ray-Based Gaussian Splatting
Samuel Rota Bulò, Lorenzo Porzi, Nemanja Bartolovic et al.
Locality-Aware Zero-Shot Human-Object Interaction Detection
Sanghyun Kim, Deunsol Jung, Minsu Cho
Stationary Representations: Optimally Approximating Compatibility and Implications for Improved Model Replacements
Niccolò Biondi, Federico Pernici, Simone Ricci et al.
Interpretable Generative Models through Post-hoc Concept Bottlenecks
Akshay R. Kulkarni, Ge Yan, Chung-En Sun et al.
Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation
Hao Zhu, Yan Zhu, Jiayu Xiao et al.
Birth and Death of a Rose
Chen Geng, Yunzhi Zhang, Shangzhe Wu et al.
Free360: Layered Gaussian Splatting for Unbounded 360-Degree View Synthesis from Extremely Sparse and Unposed Views
Chong Bao, Xiyu Zhang, Zehao Yu et al.
Bringing CLIP to the Clinic: Dynamic Soft Labels and Negation-Aware Learning for Medical Analysis
Hanbin Ko, Chang Min Park
Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models
Sangwon Jang, June Suk Choi, Jaehyeong Jo et al.
Learning from Neighbors: Category Extrapolation for Long-Tail Learning
Shizhen Zhao, Xin Wen, Jiahui Liu et al.
Conformal Prediction for Zero-Shot Models
Julio Silva-Rodríguez, Ismail Ben Ayed, Jose Dolz
Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems
Alejandro Castañeda Garcia, Jan Warchocki, Jan van Gemert et al.
NECA: Neural Customizable Human Avatar
Junjin Xiao, Qing Zhang, Zhan Xu et al.
MITracker: Multi-View Integration for Visual Object Tracking
Mengjie Xu, Yitao Zhu, Haotian Jiang et al.
HyperLoRA: Parameter-Efficient Adaptive Generation for Portrait Synthesis
Mengtian Li, Jinshu Chen, Wanquan Feng et al.
Adversarial Domain Prompt Tuning and Generation for Single Domain Generalization
Zhipeng Xu, De Cheng, Xinyang Jiang et al.
Hyperspherical Classification with Dynamic Label-to-Prototype Assignment
Mohammad Saadabadi Saadabadi, Ali Dabouei, Sahar Rahimi Malakshan et al.
Robust Message Embedding via Attention Flow-Based Steganography
Huayuan Ye, Shenzhuo Zhang, Shiqi Jiang et al.
Object-aware Sound Source Localization via Audio-Visual Scene Understanding
Sung Jin Um, Dongjin Kim, Sangmin Lee et al.
Mind the Trojan Horse: Image Prompt Adapter Enabling Scalable and Deceptive Jailbreaking
Junxi Chen, Junhao Dong, Xiaohua Xie
ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping
Shun Iwase, Muhammad Zubair Irshad, Katherine Liu et al.
Multi-Resolution Pathology-Language Pre-training Model with Text-Guided Visual Representation
Shahad Albastaki, Anabia Sohail, Iyyakutti Iyappan Ganapathi et al.
OVFoodSeg: Elevating Open-Vocabulary Food Image Segmentation via Image-Informed Textual Representation
Xiongwei Wu, Sicheng Yu, Ee-Peng Lim et al.
Pos3R: 6D Pose Estimation for Unseen Objects Made Easy
Weijian Deng, Dylan Campbell, Chunyi Sun et al.
PTDiffusion: Free Lunch for Generating Optical Illusion Hidden Pictures with Phase-Transferred Diffusion Model
Xiang Gao, Shuai Yang, Jiaying Liu
Jailbreaking the Non-Transferable Barrier via Test-Time Data Disguising
Yongli Xiang, Ziming Hong, Lina Yao et al.
DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations
Krishna Sri Ipsit Mantri, Carola-Bibiane Schönlieb, Bruno Ribeiro et al.
AniMo: Species-Aware Model for Text-Driven Animal Motion Generation
Xuan Wang, Kai Ruan, Xing Zhang et al.
Adaptive Non-Uniform Timestep Sampling for Accelerating Diffusion Model Training
Myunsoo Kim, Donghyeon Ki, Seong-Woong Shim et al.
Neural Motion Simulator Pushing the Limit of World Models in Reinforcement Learning
Chenjie Hao, Weyl Lu, Yifan Xu et al.
Cropper: Vision-Language Model for Image Cropping through In-Context Learning
Seung Hyun Lee, Jijun Jiang, Yiran Xu et al.
Single Domain Generalization for Few-Shot Counting via Universal Representation Matching
Xianing Chen, Si Huo, Borui Jiang et al.
Continuous Pose for Monocular Cameras in Neural Implicit Representation
Qi Ma, Danda Paudel, Ajad Chhatkuli et al.
DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval
Leqi Shen, Guoqiang Gong, Tianxiang Hao et al.
NoPain: No-box Point Cloud Attack via Optimal Transport Singular Boundary
Zezeng Li, Xiaoyu Du, Na Lei et al.
SocialGesture: Delving into Multi-person Gesture Understanding
Xu Cao, Pranav Virupaksha, Wenqi Jia et al.
MARBLE: Material Recomposition and Blending in CLIP-Space
Ta-Ying Cheng, Prafull Sharma, Mark Boss et al.
Masked AutoDecoder is Effective Multi-Task Vision Generalist
Han Qiu, Jiaxing Huang, Peng Gao et al.
EnvPoser: Environment-aware Realistic Human Motion Estimation from Sparse Observations with Uncertainty Modeling
Songpengcheng Xia, Yu Zhang, Zhuo Su et al.
From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport
Quentin Bouniot, Ievgen Redko, Anton Mallasto et al.
Towards Robust 3D Pose Transfer with Adversarial Learning
Haoyu Chen, Hao Tang, Ehsan Adeli et al.
LongDiff: Training-Free Long Video Generation in One Go
Zhuoling Li, Hossein Rahmani, Qiuhong Ke et al.
Fractal Calibration for Long-tailed Object Detection
Konstantinos Alexandridis, Ismail Elezi, Jiankang Deng et al.
Text Embedding is Not All You Need: Attention Control for Text-to-Image Semantic Alignment with Text Self-Attention Maps
Jeeyung Kim, Erfan Esmaeili Fakhabi, Qiang Qiu
CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion
Kai He, Chin-Hsuan Wu, Igor Gilitschenski
Unraveling Normal Anatomy via Fluid-Driven Anomaly Randomization
Peirong Liu, Ana Lawry Aguila, Juan Iglesias
Omnidirectional Multi-Object Tracking
Kai Luo, Hao Shi, Sheng Wu et al.
MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views
Antoine Guédon, Tomoki Ichikawa, Kohei Yamashita et al.
Multi-party Collaborative Attention Control for Image Customization
Han Yang, Chuanguang Yang, Qiuli Wang et al.
From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech
Jihoon Kim, Jeongsoo Choi, Jaehun Kim et al.
Detection-Friendly Nonuniformity Correction: A Union Framework for Infrared UAV Target Detection
Houzhang Fang, Xiaolin Wang, Zengyang Li et al.
Binarized Neural Network for Multi-spectral Image Fusion
Junming Hou, Xiaoyu Chen, Ran Ran et al.
Learned Scanpaths Aid Blind Panoramic Video Quality Assessment
Kanglong Fan, Wen Wen, Mu Li et al.
RELOCATE: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations
Savya Khosla, Sethuraman T V, Alexander G. Schwing et al.
VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models
Dahun Kim, AJ Piergiovanni, Ganesh Satish Mallya et al.
Removing Reflections from RAW Photos
Eric Kee, Adam Pikielny, Kevin Blackburn-Matzen et al.
Anomize: Better Open Vocabulary Video Anomaly Detection
Fei Li, Wenxuan Liu, Jingjing Chen et al.
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis, Gregor Bachmann, Yeongmin Kim et al.
X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization
Anna Kukleva, Fadime Sener, Edoardo Remelli et al.
Heterogeneous Skeleton-Based Action Representation Learning
Xiaoyan Ma, Jidong Kuang, Hongsong Wang et al.
MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection
Boyang Peng, Sanqing Qu, Yong Wu et al.
Visual Objectification in Films: Towards a New AI Task for Video Interpretation
Julie Tores, Lucile Sassatelli, Hui-Yin Wu et al.
Semantic and Expressive Variations in Image Captions Across Languages
Andre Ye, Sebastin Santy, Jena D. Hwang et al.
Revisiting Source-Free Domain Adaptation: Insights into Representativeness, Generalization, and Variety
Ronghang Zhu, Mengxuan Hu, Weiming Zhuang et al.
FutureHuman3D: Forecasting Complex Long-Term 3D Human Behavior from Video Observations
Christian Diller, Thomas Funkhouser, Angela Dai
Comprehensive Information Bottleneck for Unveiling Universal Attribution to Interpret Vision Transformers
Jung-Ho Hong, Ho-Joong Kim, Kyu-Sung Jeon et al.
LUCAS: Layered Universal Codec Avatars
Di Liu, Teng Deng, Giljoo Nam et al.
TFCustom: Customized Image Generation with Time-Aware Frequency Feature Guidance
Mushui Liu, Dong She, Qihan Huang et al.
VINECS: Video-based Neural Character Skinning
Zhouyingcheng Liao, Vladislav Golyanik, Marc Habermann et al.
OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging
Yijie Tang, Jiazhao Zhang, Yuqing Lan et al.
URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration
Rui Xu, Yuzhen Niu, Yuezhou Li et al.
Robust Self-calibration of Focal Lengths from the Fundamental Matrix
Viktor Kocur, Daniel Kyselica, Zuzana Kukelova
BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence
Xuewu Lin, Tianwei Lin, Alan Huang et al.
VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification
Xianwei Zhuang, Zhihong Zhu, Yuxin Xie et al.
TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion
Yiran Wang, Jiaqi Li, Chaoyi Hong et al.
Dynamic Integration of Task-Specific Adapters for Class Incremental Learning
Jiashuo Li, Shaokun Wang, Bo Qian et al.
SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception
Yaniv Benny, Lior Wolf
On the Zero-shot Adversarial Robustness of Vision-Language Models: A Truly Zero-shot and Training-free Approach
Baoshun Tong, Hanjiang Lai, Yan Pan et al.
NLPrompt: Noise-Label Prompt Learning for Vision-Language Models
Bikang Pan, Qun Li, Xiaoying Tang et al.
Minority-Focused Text-to-Image Generation via Prompt Optimization
Soobin Um, Jong Chul Ye
Spin-UP: Spin Light for Natural Light Uncalibrated Photometric Stereo
Zongrui Li, Zhan Lu, Haojie Yan et al.
Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning
Huabin Liu, Filip Ilievski, Cees G. M. Snoek
UHD-processer: Unified UHD Image Restoration with Progressive Frequency Learning and Degradation-aware Prompts
Yidi Liu, Dong Li, Xueyang Fu et al.
Pursuing Temporal-Consistent Video Virtual Try-On via Dynamic Pose Interaction
Dong Li, Wenqi Zhong, Wei Yu et al.
Learning to Navigate Efficiently and Precisely in Real Environments
Guillaume Bono, Hervé Poirier, Leonid Antsfeld et al.
Flash-Split: 2D Reflection Removal with Flash Cues and Latent Diffusion Separation
Tianfu Wang, Mingyang Xie, Haoming Cai et al.
Novel View Synthesis with Pixel-Space Diffusion Models
Noam Elata, Bahjat Kawar, Yaron Ostrovsky-Berman et al.
WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion
Yang Wu, Yun Zhu, Kaihua Zhang et al.
Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction
Teng Hu, Jiangning Zhang, Ran Yi et al.
NightAdapter: Learning a Frequency Adapter for Generalizable Night-time Scene Segmentation
Qi Bi, Jingjun Yi, Huimin Huang et al.
FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity
Jinxi Li, Ziyang Song, Siyuan Zhou et al.
MoEdit: On Learning Quantity Perception for Multi-object Image Editing
Yanfeng Li, Ka-Hou Chan, Yue Sun et al.
On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation
Agneet Chatterjee, Tejas Gokhale, Chitta Baral et al.
Audio-Visual Semantic Graph Network for Audio-Visual Event Localization
Liang Liu, Shuaiyong Li, Yongqiang Zhu
ABC-Former: Auxiliary Bimodal Cross-domain Transformer with Interactive Channel Attention for White Balance
Yu-Cheng Chiu, Guan-Rong Chen, Zihao Chen et al.
Vid2Sim: Generalizable, Video-based Reconstruction of Appearance, Geometry and Physics for Mesh-free Simulation
Chuhao Chen, Zhiyang Dou, Chen Wang et al.
RoGSplat: Learning Robust Generalizable Human Gaussian Splatting from Sparse Multi-View Images
Junjin Xiao, Qing Zhang, Yongwei Nie et al.
Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery
Yuqi Zhang, Guanying Chen, Jiaxing Chen et al.
4Deform: Neural Surface Deformation for Robust Shape Interpolation
Lu Sang, Zehranaz Canfes, Dongliang Cao et al.
Efficient Hyperparameter Optimization with Adaptive Fidelity Identification
Jiantong Jiang, Zeyi Wen, Atif Mansoor et al.
Deterministic Image-to-Image Translation via Denoising Brownian Bridge Models with Dual Approximators
Bohan Xiao, Peiyong Wang, Qisheng He et al.
FDS: Frequency-Aware Denoising Score for Text-Guided Latent Diffusion Image Editing
Yufan Ren, Zicong Jiang, Tong Zhang et al.
Edit One for All: Interactive Batch Image Editing
Thao Nguyen, Utkarsh Ojha, Yuheng Li et al.
MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models
Yanting Wang, Hongye Fu, Wei Zou et al.
Modeling Thousands of Human Annotators for Generalizable Text-to-Image Person Re-identification
Jiayu Jiang, Changxing Ding, Wentao Tan et al.
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents
Jun Chen, Dannong Xu, Junjie Fei et al.
Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection
Gensheng Pei, Tao Chen, Yujia Wang et al.
MambaVO: Deep Visual Odometry Based on Sequential Matching Refinement and Training Smoothing
Shuo Wang, Wanting Li, Yongcai Wang et al.
Dynamic Stereotype Theory Induced Micro-expression Recognition with Oriented Deformation
Bohao Zhang, Xuejiao Wang, Changbo Wang et al.
Enhancing Testing-Time Robustness for Trusted Multi-View Classification in the Wild
Wei Liu, Yufei Chen, Xiaodong Yue
Structured Gradient-based Interpretations via Norm-Regularized Adversarial Training
Shizhan Gong, Qi Dou, Farzan Farnia
Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation
Gianni Franchi, Nacim Belkhir, Dat Nguyen et al.
ProtoDepth: Unsupervised Continual Depth Completion with Prototypes
Patrick Rim, Hyoungseob Park, Suchisrit Gangopadhyay et al.
Boosting Self-Supervision for Single-View Scene Completion via Knowledge Distillation
Keonhee Han, Dominik Muhle, Felix Wimbauer et al.
Low-Latency Neural Stereo Streaming
Qiqi Hou, Farzad Farhadzadeh, Amir Said et al.
Proximal Algorithm Unrolling: Flexible and Efficient Reconstruction Networks for Single-Pixel Imaging
Ping Wang, Lishun Wang, Gang Qu et al.
IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments
Can Zhang, Gim Hee Lee
CoMBO: Conflict Mitigation via Branched Optimization for Class Incremental Segmentation
Kai Fang, Anqi Zhang, Guangyu Gao et al.
VideoHandles: Editing 3D Object Compositions in Videos Using Video Generative Priors
Juil Koo, Paul Guerrero, Chun-Hao P. Huang et al.
EchoONE: Segmenting Multiple Echocardiography Planes in One Model
Jiongtong Hu, Wei Zhuo, Jun Cheng et al.
HyperNet Fields: Efficiently Training Hypernetworks without Ground Truth by Learning Weight Trajectories
Eric Hedlin, Munawar Hayat, Fatih Porikli et al.
CGMatch: A Different Perspective of Semi-supervised Learning
Bo Cheng, Jueqing Lu, Yuan Tian et al.
A Subspace-Constrained Tyler's Estimator and its Applications to Structure from Motion
Feng Yu, Teng Zhang, Gilad Lerman
SCE-MAE: Selective Correspondence Enhancement with Masked Autoencoder for Self-Supervised Landmark Estimation
Kejia Yin, Varshanth Rao, Ruowei Jiang et al.
Projecting Trackable Thermal Patterns for Dynamic Computer Vision
Mark Sheinin, Aswin C. Sankaranarayanan, Srinivasa G. Narasimhan
Rate-In: Information-Driven Adaptive Dropout Rates for Improved Inference-Time Uncertainty Estimation
Tal Zeevi, Ravid Shwartz-Ziv, Yann LeCun et al.
A Polarization-Aided Transformer for Image Deblurring via Motion Vector Decomposition
Duosheng Chen, Shihao Zhou, Jinshan Pan et al.
Learnable Earth Parser: Discovering 3D Prototypes in Aerial Scans
Romain Loiseau, Elliot Vincent, Mathieu Aubry et al.
High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model
Mingtao Guo, Guanyu Xing, Yanli Liu
Universal Robustness via Median Randomized Smoothing for Real-World Super-Resolution
Zakariya Chaouai, Mohamed Tamaazousti
Zero-Shot 4D Lidar Panoptic Segmentation
Yushan Zhang, Aljoša Ošep, Laura Leal-Taixe et al.
DeSplat: Decomposed Gaussian Splatting for Distractor-Free Rendering
Yihao Wang, Marcus Klasson, Matias Turkulainen et al.
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
Rick Akkerman, Haiwen Feng, Michael J. Black et al.
Enhancing Facial Privacy Protection via Weakening Diffusion Purification
Ali Salar, Qing Liu, Yingli Tian et al.
PikeLPN: Mitigating Overlooked Inefficiencies of Low-Precision Neural Networks
Marina Neseem, Conor McCullough, Randy Hsin et al.
On the Consistency of Video Large Language Models in Temporal Comprehension
Minjoon Jung, Junbin Xiao, Byoung-Tak Zhang et al.
MotionPRO: Exploring the Role of Pressure in Human MoCap and Beyond
Shenghao Ren, Yi Lu, Jiayi Huang et al.
Uni-Renderer: Unifying Rendering and Inverse Rendering Via Dual Stream Diffusion
ZhiFei Chen, Tianshuo Xu, Wenhang Ge et al.
Learning to Highlight Audio by Watching Movies
Chao Huang, Ruohan Gao, J. M. F. Tsang et al.
Intensity-Robust Autofocus for Spike Camera
Changqing Su, Zhiyuan Ye, Yongsheng Xiao et al.
D^3-Human: Dynamic Disentangled Digital Human from Monocular Video
Honghu Chen, Bo Peng, Yunfan Tao et al.
AesthetiQ: Enhancing Graphic Layout Design via Aesthetic-Aware Preference Alignment of Multi-modal Large Language Models
Sohan Patnaik, Rishabh Jain, Balaji Krishnamurthy et al.
XScale-NVS: Cross-Scale Novel View Synthesis with Hash Featurized Manifold
Guangyu Wang, Jinzhi Zhang, Fan Wang et al.
Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation
Xiaoqi Li, Lingyun Xu, Mingxu Zhang et al.
Point-VOS: Pointing Up Video Object Segmentation
Sabarinath Mahadevan, Idil Esen Zulfikar, Paul Voigtlaender et al.
Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis
Yousef Yeganeh, Ioannis Charisiadis, Marta Hasny et al.
DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis
Yuming Gu, Phong Tran, Yujian Zheng et al.
iSegMan: Interactive Segment-and-Manipulate 3D Gaussians
Yian Zhao, Wanshi Xu, Ruochong Zheng et al.
Complementary Advantages: Exploiting Cross-Field Frequency Correlation for NIR-Assisted Image Denoising
Yuchen Wang, Hongyuan Wang, Lizhi Wang et al.