Most Cited CVPR "message passing networks" Papers
5,589 papers found
See Further When Clear: Curriculum Consistency Model
Yunpeng Liu, Boxiao Liu, Yi Zhang et al.
Residual Learning in Diffusion Models
Junyu Zhang, Daochang Liu, Eunbyung Park et al.
Blurry-Edges: Photon-Limited Depth Estimation from Defocused Boundaries
Wei Xu, Charlie Wagner, Junjie Luo et al.
BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions
Wonyong Seo, Jihyong Oh, Munchurl Kim
Few-shot Personalized Scanpath Prediction
Ruoyu Xue, Jingyi Xu, Sounak Mondal et al.
Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation
Yuan Gan, Jiaxu Miao, Yunze Wang et al.
Towards Source-Free Machine Unlearning
Sk Miraj Ahmed, Umit Basaran, Dripta S. Raychaudhuri et al.
Adaptive Softassign via Hadamard-Equipped Sinkhorn
Binrui Shen, Qiang Niu, Shengxin Zhu
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment
Edson Araujo, Andrew Rouditchenko, Yuan Gong et al.
Dense Match Summarization for Faster Two-view Estimation
Jonathan Astermark, Anders Heyden, Viktor Larsson
GenVDM: Generating Vector Displacement Maps From a Single Image
Yuezhi Yang, Qimin Chen, Vladimir G. Kim et al.
SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations
Krispin Wandel, Hesheng Wang
A2XP: Towards Private Domain Generalization
Geunhyeok Yu, Hyoseok Hwang
Rethinking Correspondence-based Category-Level Object Pose Estimation
Huan Ren, Wenfei Yang, Shifeng Zhang et al.
Boosting the Dual-Stream Architecture in Ultra-High Resolution Segmentation with Resolution-Biased Uncertainty Estimation
Rong Qin, Xingyu Liu, Jinglei Shi et al.
SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs
Guibiao Liao, Qing Li, Zhenyu Bao et al.
NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval
Zengrong Lin, Zheng Wang, Tianwen Qian et al.
SEC-Prompt: SEmantic Complementary Prompting for Few-Shot Class-Incremental Learning
Ye Liu, Meng Yang
Learning to Count without Annotations
Lukas Knobel, Tengda Han, Yuki Asano
Reasoning Mamba: Hypergraph-Guided Region Relation Calculating for Weakly Supervised Affordance Grounding
Yuxuan Wang, Aming Wu, Muli Yang et al.
DiffCAM: Data-Driven Saliency Maps by Capturing Feature Differences
Xingjian Li, Qiming Zhao, Neelesh Bisht et al.
SAM-REF: Introducing Image-Prompt Synergy during Interaction for Detail Enhancement in the Segment Anything Model
Chongkai Yu, Ting Liu, Li Anqi et al.
Preconditioners for the Stochastic Training of Neural Fields
Shin-Fang Chng, Hemanth Saratchandran, Simon Lucey
MICAS: Multi-grained In-Context Adaptive Sampling for 3D Point Cloud Processing
Feifei Shao, Ping Liu, Zhao Wang et al.
Decision SpikeFormer: Spike-Driven Transformer for Decision Making
Wei Huang, Qinying Gu, Nanyang Ye
PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation
Uyoung Jeong, Jonathan Freer, Seungryul Baek et al.
LOCORE: Image Re-ranking with Long-Context Sequence Modeling
Zilin Xiao, Pavel Suma, Ayush Sachdeva et al.
FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting
Hengyu Liu, Yuehao Wang, Chenxin Li et al.
Segment This Thing: Foveated Tokenization for Efficient Point-Prompted Segmentation
Tanner Schmidt, Richard Newcombe
ImViD: Immersive Volumetric Videos for Enhanced VR Engagement
Zhengxian Yang, Shi Pan, Shengqi Wang et al.
Potential Field Based Deep Metric Learning
Shubhang Bhatnagar, Narendra Ahuja
Deep Fair Multi-View Clustering with Attention KAN
HaiMing Xu, Qianqian Wang, Boyue Wang et al.
DeepCompress-ViT: Rethinking Model Compression to Enhance Efficiency of Vision Transformers at the Edge
Sabbir Ahmed, Abdullah Al Arafat, Deniz Najafi et al.
Exploring Semantic Feature Discrimination for Perceptual Image Super-Resolution and Opinion-Unaware No-Reference Image Quality Assessment
Guanglu Dong, Xiangyu Liao, Mingyang Li et al.
Color Alignment in Diffusion
Ka Chun Shum, Binh-Son Hua, Thanh Nguyen et al.
PhyS-EdiT: Physics-aware Semantic Image Editing with Text Description
Ziqi Cai, Shuchen Weng, Yifei Xia et al.
CheckManual: A New Challenge and Benchmark for Manual-based Appliance Manipulation
Yuxing Long, Jiyao Zhang, Mingjie Pan et al.
Learning Dynamic Collaborative Network for Semi-supervised 3D Vessel Segmentation
Jiao Xu, Xin Chen, Lihe Zhang
NTR-Gaussian: Nighttime Dynamic Thermal Reconstruction with 4D Gaussian Splatting Based on Thermodynamics
Kun Yang, Yuxiang Liu, Zeyu Cui et al.
v-CLR: View-Consistent Learning for Open-World Instance Segmentation
Chang-Bin Zhang, Jinhong Ni, Yujie Zhong et al.
Mesh Mamba: A Unified State Space Model for Saliency Prediction in Non-Textured and Textured Meshes
Kaiwei Zhang, Dandan Zhu, Xiongkuo Min et al.
Early-Bird Diffusion: Investigating and Leveraging Timestep-Aware Early-Bird Tickets in Diffusion Models for Efficient Training
Lexington Whalen, Zhenbang Du, Haoran You et al.
Auto-Encoded Supervision for Perceptual Image Super-Resolution
MinKyu Lee, Sangeek Hyun, Woojin Jun et al.
Adapting Text-to-Image Generation with Feature Difference Instruction for Generic Image Restoration
Chao Wang, Hehe Fan, Huichen Yang et al.
Towards Understanding How Knowledge Evolves in Large Vision-Language Models
Sudong Wang, Yunjian Zhang, Yao Zhu et al.
ViUniT: Visual Unit Tests for More Robust Visual Programming
Artemis Panagopoulou, Honglu Zhou, Silvio Savarese et al.
Not All Parameters Matter: Masking Diffusion Models for Enhancing Generation Ability
Lei Wang, Senmao Li, Fei Yang et al.
L-SWAG: Layer-Sample Wise Activation with Gradients Information for Zero-Shot NAS on Vision Transformers
Sofia Casarin, Sergio Escalera, Oswald Lanz
EigenGS Representation: From Eigenspace to Gaussian Image Space
Lo-Wei Tai, Ching-En Li et al.
Enhancing Dance-to-Music Generation via Negative Conditioning Latent Diffusion Model
Changchang Sun, Gaowen Liu, Charles Fleming et al.
JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data
Runjian Chen, Wenqi Shao, Bo Zhang et al.
Hand-held Object Reconstruction from RGB Video with Dynamic Interaction
Shijian Jiang, Qi Ye, Rengan Xie et al.
FSBench: A Figure Skating Benchmark for Advancing Artistic Sports Understanding
Rong Gao, Xin Liu, Zhuozhao Hu et al.
DTOS: Dynamic Time Object Sensing with Large Multimodal Model
Jirui Tian, Jinrong Zhang, Shenglan Liu et al.
Incomplete Multi-modal Brain Tumor Segmentation via Learnable Sorting State Space Model
Zheyu Zhang, Yayuan Lu, Feipeng Ma et al.
RC-AutoCalib: An End-to-End Radar-Camera Automatic Calibration Network
Van-Tin Luu, Yong-Lin Cai, Vu-Hoang Tran et al.
A Unified, Resilient, and Explainable Adversarial Patch Detector
Vishesh Kumar, Akshay Agarwal
Deterministic-to-Stochastic Diverse Latent Feature Mapping for Human Motion Synthesis
Hua Yu, Weiming Liu, Gui Xu et al.
T-CIL: Temperature Scaling using Adversarial Perturbation for Calibration in Class-Incremental Learning
Seong-Hyeon Hwang, Minsu Kim, Steven Euijong Whang
Implicit Correspondence Learning for Image-to-Point Cloud Registration
Xinjun Li, Wenfei Yang, Jiacheng Deng et al.
Argus: A Compact and Versatile Foundation Model for Vision
Weiming Zhuang, Chen Chen, Zhizhong Li et al.
POMP: Physics-constrainable Motion Generative Model through Phase Manifolds
Bin Ji, Ye Pan, Zhimeng Liu et al.
Homogeneous Dynamics Space for Heterogeneous Humans
Xinpeng Liu, Junxuan Liang, Chenshuo Zhang et al.
Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering
Zhen Yang, Zhuo Tao, Qi Chen et al.
Gaussian Splatting Feature Fields for (Privacy-Preserving) Visual Localization
Maxime Pietrantoni, Gabriela Csurka, Torsten Sattler
PIDLoc: Cross-View Pose Optimization Network Inspired by PID Controllers
Wooju Lee, Juhye Park, Dasol Hong et al.
SinGS: Animatable Single-Image Human Gaussian Splats with Kinematic Priors
Yufan Wu, Xuanhong Chen, Wen Li et al.
Ferret: An Efficient Online Continual Learning Framework under Varying Memory Constraints
Yuhao Zhou, Yuxin Tian, Jindi Lv et al.
Explaining Domain Shifts in Language: Concept Erasing for Interpretable Image Classification
Zequn Zeng, Yudi Su, Jianqiao Sun et al.
A New Statistical Model of Star Speckles for Learning to Detect and Characterize Exoplanets in Direct Imaging Observations
Theo Bodrito, Olivier Flasseur, Julien Mairal et al.
Improving Personalized Search with Regularized Low-Rank Parameter Updates
Fiona Ryan, Josef Sivic, Fabian Caba Heilbron et al.
LiSu: A Dataset and Method for LiDAR Surface Normal Estimation
Dušan Malić, Christian Fruhwirth-Reisinger, Samuel Schulter et al.
UMotion: Uncertainty-driven Human Motion Estimation from Inertial and Ultra-wideband Units
Huakun Liu, Hiroki Ota, Xin Wei et al.
HSI: A Holistic Style Injector for Arbitrary Style Transfer
Shuhao Zhang, Hui Kang, Yang Liu et al.
Composing Parts for Expressive Object Generation
Harsh Rangwani, Aishwarya Agarwal, Kuldeep Kulkarni et al.
Perturb-and-Revise: Flexible 3D Editing with Generative Trajectories
Susung Hong, Johanna Suvi Karras, Ricardo Martin et al.
Soft Self-labeling and Potts Relaxations for Weakly-supervised Segmentation
Zhongwen Zhang, Yuri Boykov
The Photographer's Eye: Teaching Multimodal Large Language Models to See, and Critique Like Photographers
Daiqing Qi, Handong Zhao, Jing Shi et al.
VSNet: Focusing on the Linguistic Characteristics of Sign Language
Yuhao Li, Xinyue Chen, Hongkai Li et al.
Twinner: Shining Light on Digital Twins in a Few Snaps
Jesus Zarzar, Tom Monnier, Roman Shapovalov et al.
Incorporating Dense Knowledge Alignment into Unified Multimodal Representation Models
Yuhao Cui, Xinxing Zu, Wenhua Zhang et al.
Towards Scalable Human-aligned Benchmark for Text-guided Image Editing
Suho Ryu, Kihyun Kim, Eugene Baek et al.
Temporal Action Detection Model Compression by Progressive Block Drop
Xiaoyong Chen, Yong Guo, Jiaming Liang et al.
Type-R: Automatically Retouching Typos for Text-to-Image Generation
Wataru Shimoda, Naoto Inoue, Daichi Haraguchi et al.
Instance-wise Supervision-level Optimization in Active Learning
Shinnosuke Matsuo, Riku Togashi, Ryoma Bise et al.
SGCR: Spherical Gaussians for Efficient 3D Curve Reconstruction
Xinran Yang, Donghao Ji, Yuanqi Li et al.
Language-Assisted Debiasing and Smoothing for Foundation Model-Based Semi-Supervised Learning
Na Zheng, Xuemeng Song, Xue Dong et al.
MaDCoW: Marginal Distortion Correction for Wide-Angle Photography with Arbitrary Objects
Kevin Zhang, Jia-Bin Huang, Jose Echevarria et al.
RCP-Bench: Benchmarking Robustness for Collaborative Perception Under Diverse Corruptions
Shihang Du, Sanqing Qu, Tianhang Wang et al.
MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection
Rishubh Parihar, Srinjay Sarkar, Sarthak Vora et al.
Fitted Neural Lossless Image Compression
Zhe Zhang, Zhenzhong Chen, Shan Liu
Minimal Interaction Separated Tuning: A New Paradigm for Visual Adaptation
Ningyuan Tang, Minghao Fu, Jianxin Wu
DAMM-Diffusion: Learning Divergence-Aware Multi-Modal Diffusion Model for Nanoparticles Distribution Prediction
Junjie Zhou, Shouju Wang, Yuxia Tang et al.
Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels
Tianming Liang, Chaolei Tan, Beihao Xia et al.
Graph Neural Network Combining Event Stream and Periodic Aggregation for Low-Latency Event-based Vision
Manon Dampfhoffer, Thomas Mesquida, Damien Joubert et al.
SemiDAViL: Semi-supervised Domain Adaptation with Vision-Language Guidance for Semantic Segmentation
Hritam Basak, Zhaozheng Yin
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models
Jinhui Yi, Syed Talal Wasim, Yanan Luo et al.
A Semantic Knowledge Complementarity based Decoupling Framework for Semi-supervised Class-imbalanced Medical Image Segmentation
Zheng Zhang, Guanchun Yin, Bo Zhang et al.
ONDA-Pose: Occlusion-Aware Neural Domain Adaptation for Self-Supervised 6D Object Pose Estimation
Tao Tan, Qiulei Dong
An Image-like Diffusion Method for Human-Object Interaction Detection
Xiaofei Hui, Haoxuan Qu, Hossein Rahmani et al.
DirectTriGS: Triplane-based Gaussian Splatting Field Representation for 3D Generation
Xiaoliang Ju, Hongsheng Li
DART: Disease-aware Image-Text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation
Sang-Jun Park, Keun-Soo Heo, Dong-Hee Shin et al.
Can Text-to-Video Generation help Video-Language Alignment?
Luca Zanella, Massimiliano Mancini, Willi Menapace et al.
Multi-modal Topology-embedded Graph Learning for Spatially Resolved Genes Prediction from Pathology Images with Prior Gene Similarity Information
Hang Shi, Chi Changxi, Peng Wan et al.
SCSA: A Plug-and-Play Semantic Continuous-Sparse Attention for Arbitrary Semantic Style Transfer
Chunnan Shang, Zhizhong Wang, Hongwei Wang et al.
Odd-One-Out: Anomaly Detection by Comparing with Neighbors
Ankan Kumar Bhunia, Changjian Li, Hakan Bilen
Named Entity Driven Zero-Shot Image Manipulation
Zhida Feng, Li Chen, Jing Tian et al.
NTClick: Achieving Precise Interactive Segmentation With Noise-tolerant Clicks
Chenyi Zhang, Ting Liu, Xiaochao Qu et al.
Transferable and Principled Efficiency for Open-Vocabulary Segmentation
Jingxuan Xu, Wuyang Chen, Yao Zhao et al.
Adapting to Observation Length of Trajectory Prediction via Contrastive Learning
Ruiqi Qiu, Jun Gong, Xinyu Zhang et al.
PersonaHOI: Effortlessly Improving Face Personalization in Human-Object Interaction Generation
Xinting Hu, Haoran Wang, Jan Lenssen et al.
Beyond Human Perception: Understanding Multi-Object World from Monocular View
Keyu Guo, Yongle Huang, Shijie Sun et al.
High-quality Point Cloud Oriented Normal Estimation via Hybrid Angular and Euclidean Distance Encoding
Yuanqi Li, Jingcheng Huang, Hongshen Wang et al.
HeatFormer: A Neural Optimizer for Multiview Human Mesh Recovery
Yuto Matsubara, Ko Nishino
Active Event-based Stereo Vision
Jianing Li, Yunjian Zhang, Haiqian Han et al.
EASEMVC: Efficient Dual Selection Mechanism for Deep Multi-View Clustering
Baili Xiao, Zhibin Dong, Ke Liang et al.
Hyperdimensional Uncertainty Quantification for Multimodal Uncertainty Fusion in Autonomous Vehicles Perception
Luke Chen, Junyao Wang, Trier Mortlock et al.
GlyphMastero: A Glyph Encoder for High-Fidelity Scene Text Editing
Tong Wang, Ting Liu, Xiaochao Qu et al.
Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression
Dohyun Kim, Sehwan Park, GeonHee Han et al.
SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation
Hao Du, Bo Wu, Yan Lu et al.
Seeing A 3D World in A Grain of Sand
Yufan Zhang, Yu Ji, Yu Guo et al.
ShapeShifter: 3D Variations Using Multiscale and Sparse Point-Voxel Diffusion
Nissim Maruani, Wang Yifan, Matthew Fisher et al.
Balancing Two Classifiers via A Simplex ETF Structure for Model Calibration
Jiani Ni, He Zhao, Jintong Gao et al.
Touch2Shape: Touch-Conditioned 3D Diffusion for Shape Exploration and Reconstruction
Yuanbo Wang, Zhaoxuan Zhang, Jiajin Qiu et al.
Attraction Diminishing and Distributing for Few-Shot Class-Incremental Learning
Li-Jun Zhao, Zhen-Duo Chen, Yongxin Wang et al.
Pose-Guided Temporal Enhancement for Robust Low-Resolution Hand Reconstruction
Kaixin Fan, Pengfei Ren, Jingyu Wang et al.
CSC-PA: Cross-image Semantic Correlation via Prototype Attentions for Single-network Semi-supervised Breast Tumor Segmentation
Zhenhui Ding, Guilian Chen, Qin Zhang et al.
Attribute-Missing Multi-view Graph Clustering
Bowen Zhao, Qianqian Wang, Zhengming Ding et al.
Meta-Learning Hyperparameters for Parameter Efficient Fine-Tuning
Zichen Tian, Yaoyao Liu, Qianru Sun
Anchor-Aware Similarity Cohesion in Target Frames Enables Predicting Temporal Moment Boundaries in 2D
Jiawei Tan, Hongxing Wang, Junwu Weng et al.
GA3CE: Unconstrained 3D Gaze Estimation with Gaze-Aware 3D Context Encoding
Yuki Kawana, Shintaro Shiba, Quan Kong et al.
Link-based Contrastive Learning for One-Shot Unsupervised Domain Adaptation
Yue Zhang, Mingyue Bin, Yuyang Zhang et al.
Blind Bitstream-corrupted Video Recovery via Metadata-guided Diffusion Model
Shuyun Wang, Hu Zhang, Xin Shen et al.
Directional Label Diffusion Model for Learning from Noisy Labels
Senyu Hou, Gaoxia Jiang, Jia Zhang et al.
IM-Portrait: Learning 3D-aware Video Diffusion for Photorealistic Talking Heads from Monocular Videos
Yuan Li, Ziqian Bai, Feitong Tan et al.
Flexible Group Count Enables Hassle-Free Structured Pruning
Jiamu Zhang, Shaochen Zhong, Andrew Ye et al.
Align-A-Video: Deterministic Reward Tuning of Image Diffusion Models for Consistent Video Editing
Shengzhi Wang, Yingkang Zhong, Jiangchuan Mu et al.
EvOcc: Accurate Semantic Occupancy for Automated Driving Using Evidence Theory
Jonas Kälble, Sascha Wirges, Maxim Tatarchenko et al.
PIAD: Pose and Illumination agnostic Anomaly Detection
Kaichen Yang, Junjie Cao, Zeyu Bai et al.
Concept Lancet: Image Editing with Compositional Representation Transplant
Jinqi Luo, Tianjiao Ding, Kwan Ho Ryan Chan et al.
3D Prior Is All You Need: Cross-Task Few-shot 2D Gaze Estimation
Yihua Cheng, Hengfei Wang, Zhongqun Zhang et al.
Dual-Granularity Semantic Guided Sparse Routing Diffusion Model for General Pansharpening
Yinghui Xing, Qu Li Tao, Shizhou Zhang et al.
MetricGrids: Arbitrary Nonlinear Approximation with Elementary Metric Grids based Implicit Neural Representation
Shu Wang, Yanbo Gao, Shuai Li et al.
Nested Diffusion Models Using Hierarchical Latent Priors
Xiao Zhang, Ruoxi Jiang, Rebecca Willett et al.
MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World
Ankit Dhiman, Manan Shah, R. Venkatesh Babu
Adapting Dense Matching for Homography Estimation with Grid-based Acceleration
Kaining Zhang, Yuxin Deng, Jiayi Ma et al.
Multi-modal Contrastive Learning with Negative Sampling Calibration for Phenotypic Drug Discovery
Jiahua Rao, Hanjing Lin, Leyu Chen et al.
FFaceNeRF: Few-shot Face Editing in Neural Radiance Fields
Kwan Yun, Chaelin Kim, Hangyeul Shin et al.
Style Evolving along Chain-of-Thought for Unknown-Domain Object Detection
Zihao Zhang, Aming Wu, Yahong Han
VEU-Bench: Towards Comprehensive Understanding of Video Editing
Bozheng Li, Yongliang Wu, Yi Lu et al.
Latent Space Imaging
Matheus Souza, Yidan Zheng, Kaizhang Kang et al.
WISNet: Pseudo Label Generation on Unbalanced and Patch Annotated Waste Images
Shifan Zhang, Hongzi Zhu, Yinan He et al.
AdaptCMVC: Robust Adaption to Incremental Views in Continual Multi-view Clustering
Jing Wang, Songhe Feng, Kristoffer Knutsen Wickstrøm et al.
DynPose: Largely Improving the Efficiency of Human Pose Estimation by a Simple Dynamic Framework
Yalong Xu, Lin Zhao, Chen Gong et al.
ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams
Chris Dongjoo Kim, Jihwan Moon, Sangwoo Moon et al.
Deep Video Inverse Tone Mapping Based on Temporal Clues
Yuyao Ye, Ning Zhang, Yang Zhao et al.
Data Distributional Properties As Inductive Bias for Systematic Generalization
Felipe del Rio, Alain Raymond, Daniel Florea et al.
ETAP: Event-based Tracking of Any Point
Friedhelm Hamann, Daniel Gehrig, Filbert Febryanto et al.
VIRES: Video Instance Repainting via Sketch and Text Guided Generation
Shuchen Weng, Haojie Zheng, Peixuan Zhang et al.
SocialMOIF: Multi-Order Intention Fusion for Pedestrian Trajectory Prediction
Kai Chen, Xiaodong Zhao, Yujie Huang et al.
Automatic Spectral Calibration of Hyperspectral Images: Method, Dataset and Benchmark
Zhuoran Du, Shaodi You, Cheng Cheng et al.
VRetouchEr: Learning Cross-frame Feature Interdependence with Imperfection Flow for Face Retouching in Videos
Wen Xue, Le Jiang, Lianxin Xie et al.
Video Language Model Pretraining with Spatio-temporal Masking
Yue Wu, Zhaobo Qi, Junshu Sun et al.
Take the Bull by the Horns: Learning to Segment Hard Samples
Yuan Guo, Jingyu Kong, Yu Wang et al.
Provoking Multi-modal Few-Shot LVLM via Exploration-Exploitation In-Context Learning
Cheng Chen, Yunpeng Zhai, Yifan Zhao et al.
Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling
Nannan Li, Kevin Shih, Bryan A. Plummer
ArtiFade: Learning to Generate High-quality Subject from Blemished Images
Shuya Yang, Shaozhe Hao, Yukang Cao et al.
GroundingFace: Fine-grained Face Understanding via Pixel Grounding Multimodal Large Language Model
Yue Han, Jiangning Zhang, Junwei Zhu et al.
LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping
Pascal Chang, Sergio Sancho, Jingwei Tang et al.
Symbolic Representation for Any-to-Any Generative Tasks
Jiaqi Chen, Xiaoye Zhu, Yue Wang et al.
Leveraging Global Stereo Consistency for Category-Level Shape and 6D Pose Estimation from Stereo Images
Junning Qiu, Minglei Lu, Fei Wang et al.
Self-Supervised Cross-View Correspondence with Predictive Cycle Consistency
Alan Baade, Changan Chen
Sampling Innovation-Based Adaptive Compressive Sensing
Zhifu Tian, Tao Hu, Chaoyang Niu et al.
Zero-Shot Head Swapping in Real-World Scenarios
Sohyun Jeong, Taewoong Kang, Hyojin Jang et al.
End-to-End HOI Reconstruction Transformer with Graph-based Encoding
Zhenrong Wang, Qi Zheng, Sihan Ma et al.
Mamba-Adaptor: State Space Model Adaptor for Visual Recognition
Fei Xie, Jiahao Nie, Yujin Tang et al.
GeoDepth: From Point-to-Depth to Plane-to-Depth Modeling for Self-Supervised Monocular Depth Estimation
Haifeng Wu, Shuhang Gu, Lixin Duan et al.
Vision-Guided Action: Enhancing 3D Human Motion Prediction with Gaze-informed Affordance in 3D Scenes
Ting Yu, Yi Lin, Jun Yu et al.
Advancing Manga Analysis: Comprehensive Segmentation Annotations for the Manga109 Dataset
Minshan Xie, Jian Lin, Hanyuan Liu et al.
Customized Condition Controllable Generation for Video Soundtrack
Fan Qi, KunSheng Ma, Changsheng Xu
FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video
Andrea Boscolo Camiletto, Jian Wang, Eduardo Alvarado et al.
SAMBLE: Shape-Specific Point Cloud Sampling for an Optimal Trade-Off Between Local Detail and Global Uniformity
Chengzhi Wu, Yuxin Wan, Hao Fu et al.
GaPT-DAR: Category-level Garments Pose Tracking via Integrated 2D Deformation and 3D Reconstruction
Li Zhang, Mingliang Xu, Jianan Wang et al.
Revisiting Fairness in Multitask Learning: A Performance-Driven Approach for Variance Reduction
Xiaohan Qin, Xiaoxing Wang, Junchi Yan
EnliveningGS: Active Locomotion of 3DGS
Siyuan Shen, Tianjia Shao, Kun Zhou et al.
ESC: Erasing Space Concept for Knowledge Deletion
Tae-Young Lee, Sundong Park, Minwoo Jeon et al.
Unbiasing through Textual Descriptions: Mitigating Representation Bias in Video Benchmarks
Nina Shvetsova, Arsha Nagrani, Bernt Schiele et al.
CroCoDL: Cross-device Collaborative Dataset for Localization
Hermann Blum, Alessandro Mercurio, Joshua O'Reilly et al.
Empowering Vector Graphics with Consistently Arbitrary Viewing and View-dependent Visibility
Yidi Li, Jun Xiao, Zhengda Lu et al.
Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation
Yiftach Edelstein, Or Patashnik, Dana Cohen-Bar et al.
Learning Class Prototypes for Unified Sparse-Supervised 3D Object Detection
Yun Zhu, Le Hui, Hang Yang et al.
F^3OCUS - Federated Finetuning of Vision-Language Foundation Models with Optimal Client Layer Updating Strategy via Multi-objective Meta-Heuristics
Pramit Saha, Felix Wagner, Divyanshu Mishra et al.
WildAvatar: Learning In-the-wild 3D Avatars from the Web
Zihao Huang, Shoukang Hu, Guangcong Wang et al.
ToonerGAN: Reinforcing GANs for Obfuscating Automated Facial Indexing
Kartik Thakral, Shashikant Prasad, Stuti Aswani et al.
MAC-Ego3D: Multi-Agent Gaussian Consensus for Real-Time Collaborative Ego-Motion and Photorealistic 3D Reconstruction
Xiaohao Xu, Feng Xue, Shibo Zhao et al.
Black Hole-Driven Identity Absorbing in Diffusion Models
Muhammad Shaheryar, Jong Taek Lee, Soon Ki Jung
Deep Change Monitoring: A Hyperbolic Representative Learning Framework and a Dataset for Long-term Fine-grained Tree Change Detection
Yante Li, Hanwen Qi, Haoyu Chen et al.
PAVE: Patching and Adapting Video Large Language Models
Zhuoming Liu, Yiquan Li, Khoi D Nguyen et al.
Polarized Color Screen Matting
Kenji Enomoto, Scott Cohen, Brian Price et al.
PhysicsGen: Can Generative Models Learn from Images to Predict Complex Physical Relations?
Martin Spitznagel, Jan Vaillant, Janis Keuper