Most Cited 2024 "posterior inference" Papers
A Versatile Framework for Continual Test-Time Domain Adaptation: Balancing Discriminability and Generalizability
Xu Yang, Xuan Chen, Moqi Li et al.
Efficient Solution of Point-Line Absolute Pose
Petr Hruby, Timothy Duff, Marc Pollefeys
SPIN: Simultaneous Perception Interaction and Navigation
Shagun Uppal, Ananye Agarwal, Haoyu Xiong et al.
CAMixerSR: Only Details Need More "Attention"
Yan Wang, Yi Liu, Shijie Zhao et al.
FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures
Lisa Mais, Peter Hirsch, Claire Managan et al.
POPDG: Popular 3D Dance Generation with PopDanceSet
Zhenye Luo, Min Ren, Xuecai Hu et al.
RankMatch: Exploring the Better Consistency Regularization for Semi-supervised Semantic Segmentation
Huayu Mai, Rui Sun, Tianzhu Zhang et al.
CoDe: An Explicit Content Decoupling Framework for Image Restoration
Enxuan Gu, Hongwei Ge, Yong Guo
Masked Spatial Propagation Network for Sparsity-Adaptive Depth Refinement
Jinyoung Jun, Jae-Han Lee, Chang-Su Kim
D^4: Dataset Distillation via Disentangled Diffusion Model
Duo Su, Junjie Hou, Weizhi Gao et al.
An Empirical Study of the Generalization Ability of Lidar 3D Object Detectors to Unseen Domains
George Eskandar
MarkovGen: Structured Prediction for Efficient Text-to-Image Generation
Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam et al.
Rethinking the Representation in Federated Unsupervised Learning with Non-IID Data
Xinting Liao, Weiming Liu, Chaochao Chen et al.
ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification
Jiangbo Shi, Chen Li, Tieliang Gong et al.
Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance
Phuc Nguyen, Tuan Duc Ngo, Evangelos Kalogerakis et al.
Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs
Kanchana Ranasinghe, Satya Narayan Shukla, Omid Poursaeed et al.
CaDeT: a Causal Disentanglement Approach for Robust Trajectory Prediction in Autonomous Driving
Mozhgan Pourkeshavarz, Junrui Zhang, Amir Rasouli
Continual Forgetting for Pre-trained Vision Models
Hongbo Zhao, Bolin Ni, Junsong Fan et al.
Boosting Neural Representations for Videos with a Conditional Decoder
Xinjie Zhang, Ren Yang, Dailan He et al.
Unsupervised Feature Learning with Emergent Data-Driven Prototypicality
Yunhui Guo, Youren Zhang, Yubei Chen et al.
Text-Guided 3D Face Synthesis - From Generation to Editing
Yunjie Wu, Yapeng Meng, Zhipeng Hu et al.
Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models
Huan Ling, Seung Wook Kim, Antonio Torralba et al.
IReNe: Instant Recoloring of Neural Radiance Fields
Alessio Mazzucchelli, Adrian Garcia-Garcia, Elena Garces et al.
Constrained Layout Generation with Factor Graphs
Mohammed Haroon Dupty, Yanfei Dong, Sicong Leng et al.
URHand: Universal Relightable Hands
Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo et al.
Neural Implicit Morphing of Face Images
Guilherme Schardong, Tiago Novello, Hallison Paz et al.
Distilling CLIP with Dual Guidance for Learning Discriminative Human Body Shape Representation
Feng Liu, Minchul Kim, Zhiyuan Ren et al.
Snapshot Lidar: Fourier Embedding of Amplitude and Phase for Single-Image Depth Reconstruction
Sarah Friday, Yunzi Shi, Yaswanth Kumar Cherivirala et al.
CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification
Haoran Lai, Qingsong Yao, Zihang Jiang et al.
MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant
Chenlu Zhan, Gaoang Wang, Yu Lin et al.
GLID: Pre-training a Generalist Encoder-Decoder Vision Model
Jihao Liu, Jinliang Zheng, Yu Liu et al.
Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion
Sofia Casarin, Cynthia Ugwu, Sergio Escalera et al.
Kernel Adaptive Convolution for Scene Text Detection via Distance Map Prediction
Jinzhi Zheng, Heng Fan, Libo Zhang
NeRFCodec: Neural Feature Compression Meets Neural Radiance Fields for Memory-Efficient Scene Representation
Sicheng Li, Hao Li, Yiyi Liao et al.
Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing
Hyelin Nam, Gihyun Kwon, Geon Yeong Park et al.
PlatoNeRF: 3D Reconstruction in Plato's Cave via Single-View Two-Bounce Lidar
Tzofi Klinghoffer, Xiaoyu Xiang, Siddharth Somasundaram et al.
DiffLoc: Diffusion Model for Outdoor LiDAR Localization
Wen Li, Yuyang Yang, Shangshu Yu et al.
Text-Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation
Mingyu Lee, Jongwon Choi
Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models
Pengze Zhang, Hubery Yin, Chen Li et al.
Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning
Siteng Huang, Biao Gong, Yutong Feng et al.
Soften to Defend: Towards Adversarial Robustness via Self-Guided Label Refinement
Daiwei Yu, Zhuorong Li, Lina Wei et al.
LoCoNet: Long-Short Context Network for Active Speaker Detection
Xizi Wang, Feng Cheng, Gedas Bertasius
WinSyn: A High Resolution Testbed for Synthetic Data
Tom Kelly, John Femiani, Peter Wonka
Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation
Daichi Horita, Naoto Inoue, Kotaro Kikuchi et al.
Wired Perspectives: Multi-View Wire Art Embraces Generative AI
Zhiyu Qu, Lan Yang, Honggang Zhang et al.
Small Scale Data-Free Knowledge Distillation
He Liu, Yikai Wang, Huaping Liu et al.
Transfer CLIP for Generalizable Image Denoising
Jun Cheng, Dong Liang, Shan Tan
Validating Privacy-Preserving Face Recognition under a Minimum Assumption
Hui Zhang, Xingbo Dong, YenLung Lai et al.
CLiC: Concept Learning in Context
Mehdi Safaee, Aryan Mikaeili, Or Patashnik et al.
IDGuard: Robust General Identity-centric POI Proactive Defense Against Face Editing Abuse
Yunshu Dai, Jianwei Fei, Fangjun Huang
Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology
Wenhao Tang, Fengtao Zhou, Sheng Huang et al.
SpatialTracker: Tracking Any 2D Pixels in 3D Space
Yuxi Xiao, Qianqian Wang, Shangzhan Zhang et al.
TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models
Zhongwei Zhang, Fuchen Long, Yingwei Pan et al.
Perceptual Assessment and Optimization of HDR Image Rendering
Peibei Cao, Rafal Mantiuk, Kede Ma
Pose-Transformed Equivariant Network for 3D Point Trajectory Prediction
Ruixuan Yu, Jian Sun
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications
Yuwen Xiong, Zhiqi Li, Yuntao Chen et al.
Multimodal Representation Learning by Alternating Unimodal Adaptation
Xiaohui Zhang, Jaehong Yoon, Mohit Bansal et al.
Compositional Video Understanding with Spatiotemporal Structure-based Transformers
Hoyeoung Yun, Jinwoo Ahn, Minseo Kim et al.
Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text
Junshu Tang, Yanhong Zeng, Ke Fan et al.
Coherent Temporal Synthesis for Incremental Action Segmentation
Guodong Ding, Hans Golong, Angela Yao
Person in Place: Generating Associative Skeleton-Guidance Maps for Human-Object Interaction Image Editing
ChangHee Yang, ChanHee Kang, Kyeongbo Kong et al.
Estimating Extreme 3D Image Rotations using Cascaded Attention
Shay Dekel, Yosi Keller, Martin Čadík
Towards Real-World HDR Video Reconstruction: A Large-Scale Benchmark Dataset and A Two-Stage Alignment Network
Yong Shu, Liquan Shen, Xiangyu Hu et al.
Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular, Stereo, and RGB-D Cameras
Huajian Huang, Longwei Li, Hui Cheng et al.
Attention Calibration for Disentangled Text-to-Image Personalization
Yanbing Zhang, Mengping Yang, Qin Zhou et al.
SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control
Jaskirat Singh, Jianming Zhang, Qing Liu et al.
GraCo: Granularity-Controllable Interactive Segmentation
Yian Zhao, Kehan Li, Zesen Cheng et al.
Segment Every Out-of-Distribution Object
Wenjie Zhao, Jia Li, Xin Dong et al.
Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution
Shangchen Zhou, Peiqing Yang, Jianyi Wang et al.
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
Fanghua Yu, Jinjin Gu, Zheyuan Li et al.
Masked and Shuffled Blind Spot Denoising for Real-World Images
Hamadi Chihaoui, Paolo Favaro
Open-Vocabulary Object 6D Pose Estimation
Jaime Corsetti, Davide Boscaini, Changjae Oh et al.
Generative Region-Language Pretraining for Open-Ended Object Detection
Chuang Lin, Yi Jiang, Lizhen Qu et al.
Boosting Diffusion Models with Moving Average Sampling in Frequency Domain
Yurui Qian, Qi Cai, Yingwei Pan et al.
Discovering Syntactic Interaction Clues for Human-Object Interaction Detection
Jinguo Luo, Weihong Ren, Weibo Jiang et al.
Quantifying Uncertainty in Motion Prediction with Variational Bayesian Mixture
Juanwu Lu, Can Cui, Yunsheng Ma et al.
Generative Latent Coding for Ultra-Low Bitrate Image Compression
Zhaoyang Jia, Jiahao Li, Bin Li et al.
Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization
Jimyeong Kim, Jungwon Park, Wonjong Rhee
SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks
Yaxu Xie, Alain Pagani, Didier Stricker
Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features
Thomas Wimmer, Peter Wonka, Maks Ovsjanikov
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Zeyi Sun, Ye Fang, Tong Wu et al.
DemoFusion: Democratising High-Resolution Image Generation With No $$$
Ruoyi Du, Dongliang Chang, Timothy Hospedales et al.
Activity-Biometrics: Person Identification from Daily Activities
Shehreen Azad, Yogesh S. Rawat
Holoported Characters: Real-time Free-viewpoint Rendering of Humans from Sparse RGB Cameras
Ashwath Shetty, Marc Habermann, Guoxing Sun et al.
Neighbor Relations Matter in Video Scene Detection
Jiawei Tan, Hongxing Wang, Jiaxin Li et al.
Fast ODE-based Sampling for Diffusion Models in Around 5 Steps
Zhenyu Zhou, Defang Chen, Can Wang et al.
Referring Image Editing: Object-level Image Editing via Referring Expressions
Chang Liu, Xiangtai Li, Henghui Ding
InNeRF360: Text-Guided 3D-Consistent Object Inpainting on 360-degree Neural Radiance Fields
Dongqing Wang, Tong Zhang, Alaa Abboud et al.
From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior
Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon et al.
Unsupervised Blind Image Deblurring Based on Self-Enhancement
Lufei Chen, Xiangpeng Tian, Shuhua Xiong et al.
Mask Grounding for Referring Image Segmentation
Yong Xien Chng, Henry Zheng, Yizeng Han et al.
SignGraph: A Sign Sequence is Worth Graphs of Nodes
Shiwei Gan, Yafeng Yin, Zhiwei Jiang et al.
Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion
Zixian Gao, Xun Jiang, Xing Xu et al.
DGC-GNN: Leveraging Geometry and Color Cues for Visual Descriptor-Free 2D-3D Matching
Shuzhe Wang, Juho Kannala, Daniel Barath
FreeDrag: Feature Dragging for Reliable Point-based Image Editing
Pengyang Ling, Lin Chen, Pan Zhang et al.
TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
Yushi Huang, Ruihao Gong, Jing Liu et al.
GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians
Shenhan Qian, Tobias Kirschstein, Liam Schoneveld et al.
Explaining CLIP's Performance Disparities on Data from Blind/Low Vision Users
Daniela Massiceti, Camilla Longden, Agnieszka Słowik et al.
MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models
Yanting Wang, Hongye Fu, Wei Zou et al.
DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data
Hanrong Ye, Dan Xu
Revisiting Spatial-Frequency Information Integration from a Hierarchical Perspective for Panchromatic and Multi-Spectral Image Fusion
Jiangtong Tan, Jie Huang, Naishan Zheng et al.
FineSports: A Multi-person Hierarchical Sports Video Dataset for Fine-grained Action Understanding
Jinglin Xu, Guohao Zhao, Sibo Yin et al.
MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning
Matteo Farina, Massimiliano Mancini, Elia Cunegatti et al.
Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization
Guopeng Li, Ming Qian, Gui-Song Xia
CURSOR: Scalable Mixed-Order Hypergraph Matching with CUR Decomposition
Qixuan Zheng, Ming Zhang, Hong Yan
FCS: Feature Calibration and Separation for Non-Exemplar Class Incremental Learning
Qiwei Li, Yuxin Peng, Jiahuan Zhou
GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs
Mustafa Munir, William Avery, Md Mostafijur Rahman et al.
Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation
Yi Zhang, Meng-Hao Guo, Miao Wang et al.
GALA: Generating Animatable Layered Assets from a Single Scan
Taeksoo Kim, Byungjun Kim, Shunsuke Saito et al.
Improving Graph Contrastive Learning via Adaptive Positive Sampling
Jiaming Zhuo, Feiyang Qin, Can Cui et al.
Hearing Anything Anywhere
Mason Wang, Ryosuke Sawata, Samuel Clarke et al.
Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning
Chen Zhao, Shuming Liu, Karttikeya Mangalam et al.
Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation
Hyunwoo Ryu, Jiwoo Kim, Hyunseok An et al.
BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics
Wenqian Zhang, Molin Huang, Yuxuan Zhou et al.
Bayesian Exploration of Pre-trained Models for Low-shot Image Classification
Yibo Miao, Yu Lei, Feng Zhou et al.
Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions
Runhao Zeng, Xiaoyong Chen, Jiaming Liang et al.
RepKPU: Point Cloud Upsampling with Kernel Point Representation and Deformation
Yi Rong, Haoran Zhou, Kang Xia et al.
4K4D: Real-Time 4D View Synthesis at 4K Resolution
Zhen Xu, Sida Peng, Haotong Lin et al.
Context-Guided Spatio-Temporal Video Grounding
Xin Gu, Heng Fan, Yan Huang et al.
TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation
Sai Kumar Dwivedi, Yu Sun, Priyanka Patel et al.
Re-thinking Data Availability Attacks Against Deep Neural Networks
Bin Fang, Bo Li, Shuang Wu et al.
Logit Standardization in Knowledge Distillation
Shangquan Sun, Wenqi Ren, Jingzhi Li et al.
A Unified Approach for Text- and Image-guided 4D Scene Generation
Yufeng Zheng, Xueting Li, Koki Nagano et al.
CONFORM: Contrast is All You Need for High-Fidelity Text-to-Image Diffusion Models
Tuna Han Salih Meral, Enis Simsar, Federico Tombari et al.
SPECAT: SPatial-spEctral Cumulative-Attention Transformer for High-Resolution Hyperspectral Image Reconstruction
Zhiyang Yao, Shuyang Liu, Xiaoyun Yuan et al.
Video-Based Human Pose Regression via Decoupled Space-Time Aggregation
Jijie He, Wenwu Yang
Neural Refinement for Absolute Pose Regression with Feature Synthesis
Shuai Chen, Yash Bhalgat, Xinghui Li et al.
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Jianyuan Wang, Nikita Karaev, Christian Rupprecht et al.
Boosting Image Restoration via Priors from Pre-trained Models
Xiaogang Xu, Shu Kong, Tao Hu et al.
CPP-Net: Embracing Multi-Scale Feature Fusion into Deep Unfolding CP-PPA Network for Compressive Sensing
Zhen Guo, Hongping Gan
GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects
Sungphill Moon, Hyeontae Son, Dongcheol Hur et al.
PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling
Xiaoyun Zheng, Liwei Liao, Xufeng Li et al.
DiffCast: A Unified Framework via Residual Diffusion for Precipitation Nowcasting
Demin Yu, Xutao Li, Yunming Ye et al.
MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers
Yawar Siddiqui, Antonio Alliegro, Alexey Artemov et al.
RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features
Geonho Bang, Kwangjin Choi, Jisong Kim et al.
Task-Conditioned Adaptation of Visual Features in Multi-Task Policy Learning
Pierre Marza, Laetitia Matignon, Olivier Simonin et al.
EasyDrag: Efficient Point-based Manipulation on Diffusion Models
Xingzhong Hou, Boxiao Liu, Yi Zhang et al.
Learned Lossless Image Compression based on Bit Plane Slicing
Zhe Zhang, Huairui Wang, Zhenzhong Chen et al.
BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning
Hongwei Zheng, Linyuan Zhou, Han Li et al.
Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement
Ziyu Wang, Yue Xu, Cewu Lu et al.
SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models
Tongtian Yue, Jie Cheng, Longteng Guo et al.
Frequency-Adaptive Dilated Convolution for Semantic Segmentation
Linwei Chen, Lin Gu, Dezhi Zheng et al.
TexTile: A Differentiable Metric for Texture Tileability
Carlos Rodriguez-Pardo, Dan Casas, Elena Garces et al.
MatSynth: A Modern PBR Materials Dataset
Giuseppe Vecchio, Valentin Deschaintre
Image Processing GNN: Breaking Rigidity in Super-Resolution
Yuchuan Tian, Hanting Chen, Chao Xu et al.
ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation
Suraj Patni, Aradhye Agarwal, Chetan Arora
Bi-Causal: Group Activity Recognition via Bidirectional Causality
Youliang Zhang, Wenxuan Liu, Danni Xu et al.
Riemannian Multinomial Logistics Regression for SPD Neural Networks
Ziheng Chen, Yue Song, Gaowen Liu et al.
LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising
Yuxing Duan
NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging
Takahiro Shirakawa, Seiichi Uchida
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM
Yutao Hu, Tianbin, Quanfeng Lu et al.
Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models
Daniel Geng, Inbum Park, Andrew Owens
Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
Zhihao Yuan, Jinke Ren, Chun-Mei Feng et al.
Towards HDR and HFR Video from Rolling-Mixed-Bit Spikings
Yakun Chang, Yeliduosi Xiaokaiti, Yujia Liu et al.
Learn from View Correlation: An Anchor Enhancement Strategy for Multi-view Clustering
Suyuan Liu, Ke Liang, Zhibin Dong et al.
Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging
Bhargav Ghanekar, Salman Siddique Khan, Pranav Sharma et al.
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving
Honghui Yang, Sha Zhang, Di Huang et al.
ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations
Maitreya Patel, Changhoon Kim, Sheng Cheng et al.
Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning
Haoyu Chen, Wenbo Li, Jinjin Gu et al.
Neural Video Compression with Feature Modulation
Jiahao Li, Bin Li, Yan Lu
Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks
Boheng Li, Yishuo Cai, Haowei Li et al.
Dual DETRs for Multi-Label Temporal Action Detection
Yuhan Zhu, Guozhen Zhang, Jing Tan et al.
Discriminative Probing and Tuning for Text-to-Image Generation
Leigang Qu, Wenjie Wang, Yongqi Li et al.
GigaTraj: Predicting Long-term Trajectories of Hundreds of Pedestrians in Gigapixel Complex Scenes
Haozhe Lin, Chunyu Wei, Li He et al.
Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata
Dongsu Zhang, Francis Williams, Žan Gojčič et al.
Comparing the Decision-Making Mechanisms by Transformers and CNNs via Explanation Methods
Mingqi Jiang, Saeed Khorram, Li Fuxin
Continual Segmentation with Disentangled Objectness Learning and Class Recognition
Yizheng Gong, Siyue Yu, Xiaoyang Wang et al.
Image Sculpting: Precise Object Editing with 3D Geometry Control
Jiraphon Yenphraphai, Xichen Pan, Sainan Liu et al.
Attribute-Guided Pedestrian Retrieval: Bridging Person Re-ID with Internal Attribute Variability
Yan Huang, Zhang Zhang, Qiang Wu et al.
Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection
Chen Chen, Jiahao Qi, Xingyue Liu et al.
Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization
Deng Li, Aming Wu, Yaowei Wang et al.
EscherNet: A Generative Model for Scalable View Synthesis
Xin Kong, Shikun Liu, Xiaoyang Lyu et al.
MVCPS-NeuS: Multi-view Constrained Photometric Stereo for Neural Surface Reconstruction
Hiroaki Santo, Fumio Okura, Yasuyuki Matsushita
OHTA: One-shot Hand Avatar via Data-driven Implicit Priors
Xiaozheng Zheng, Chao Wen, Zhuo Su et al.
E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator
Wenjun Wu, Lingling Zhang, Jun Liu et al.
MultiPhys: Multi-Person Physics-aware 3D Motion Estimation
Nicolás Ugrinovic, Boxiao Pan, Georgios Pavlakos et al.
LMDrive: Closed-Loop End-to-End Driving with Large Language Models
Hao Shao, Yuxuan Hu, Letian Wang et al.
ID-Blau: Image Deblurring by Implicit Diffusion-based reBLurring AUgmentation
Jia-Hao Wu, Fu-Jen Tsai, Yan-Tsung Peng et al.
GauHuman: Articulated Gaussian Splatting from Monocular Human Videos
Shoukang Hu, Tao Hu, Ziwei Liu
BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection
Zhenxin Li, Shiyi Lan, Jose M. Alvarez et al.
AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents
Jieming Cui, Tengyu Liu, Nian Liu et al.
HumanNeRF-SE: A Simple yet Effective Approach to Animate HumanNeRF with Diverse Poses
Caoyuan Ma, Yu-Lun Liu, Zhixiang Wang et al.
SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation
Jiehong Lin, Lihua Liu, Dekun Lu et al.
SurMo: Surface-based 4D Motion Modeling for Dynamic Human Rendering
Tao Hu, Fangzhou Hong, Ziwei Liu
LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model
Chenjie Cao, Yunuo Cai, Qiaole Dong et al.
ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation
Xiaoqi Li, Mingxu Zhang, Yiran Geng et al.
Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation
Wenxuan Wang, Tongtian Yue, Yisi Zhang et al.
PanoPose: Self-supervised Relative Pose Estimation for Panoramic Images
Diantao Tu, Hainan Cui, Xianwei Zheng et al.
Mask4Align: Aligned Entity Prompting with Color Masks for Multi-Entity Localization Problems
Haoquan Zhang, Ronggang Huang, Yi Xie et al.
Global and Local Prompts Cooperation via Optimal Transport for Federated Learning
Hongxia Li, Wei Huang, Jingya Wang et al.
VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning
Ziyang Luo, Nian Liu, Wangbo Zhao et al.
Dense Optical Tracking: Connecting the Dots
Guillaume Le Moing, Jean Ponce, Cordelia Schmid
Multi-agent Collaborative Perception via Motion-aware Robust Communication Network
Shixin Hong, Yu Liu, Zhi Li et al.
Ungeneralizable Examples
Jingwen Ye, Xinchao Wang
Language-only Training of Zero-shot Composed Image Retrieval
Geonmo Gu, Sanghyuk Chun, Wonjae Kim et al.
Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models
Shitian Zhao, Zhuowan Li, Yadong Lu et al.
Rapid Motor Adaptation for Robotic Manipulator Arms
Yichao Liang, Kevin Ellis, João F. Henriques
Instruct-Imagen: Image Generation with Multi-modal Instruction
Hexiang Hu, Kelvin C.K. Chan, Yu-Chuan Su et al.
Diffeomorphic Template Registration for Atmospheric Turbulence Mitigation
Dong Lao, Congli Wang, Alex Wong et al.
Adapting to Length Shift: FlexiLength Network for Trajectory Prediction
Yi Xu, Yun Fu