Most Cited 2024 "path space measure" Papers
12,324 papers found • Page 10 of 62
ParamISP: Learned Forward and Inverse ISPs using Camera Parameters
Woohyeok Kim, Geonu Kim, Junyong Lee et al.
VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning
Kang Chen, Xiangqian Wu
Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation
Xinyao Li, Yuke Li, Zhekai Du et al.
Text-Enhanced Data-free Approach for Federated Class-Incremental Learning
Minh-Tuan Tran, Trung Le, Xuan-May Le et al.
Lifting by Image – Leveraging Image Cues for Accurate 3D Human Pose Estimation
Feng Zhou, Jianqin Yin, Peiyang Li
FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio
Chao Xu, Yang Liu, Jiazheng Xing et al.
PSC-CPI: Multi-Scale Protein Sequence-Structure Contrasting for Efficient and Generalizable Compound-Protein Interaction Prediction
Lirong Wu, Yufei Huang, Cheng Tan et al.
Project-Fair and Truthful Mechanisms for Budget Aggregation
Rupert Freeman, Ulrike Schmidt-Kraepelin
Towards Modern Image Manipulation Localization: A Large-Scale Dataset and Novel Methods
Chenfan Qu, Yiwu Zhong, Chongyu Liu et al.
Shadow Generation for Composite Image Using Diffusion Model
Qingyang Liu, Junqi You, Jian-Ting Wang et al.
AssistGUI: Task-Oriented PC Graphical User Interface Automation
Difei Gao, Lei Ji, Zechen Bai et al.
A New Mechanism for Eliminating Implicit Conflict in Graph Contrastive Learning
Dongxiao He, Jitao Zhao, Cuiying Huo et al.
FreePoint: Unsupervised Point Cloud Instance Segmentation
Zhikai Zhang, Jian Ding, Li Jiang et al.
MESA: Matching Everything by Segmenting Anything
Yesheng Zhang, Xu Zhao
SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis
Teng Hu, Ran Yi, Baihong Qian et al.
SemiReward: A General Reward Model for Semi-supervised Learning
Siyuan Li, Weiyang Jin, Zedong Wang et al.
PromptCCD: Learning Gaussian Mixture Prompt Pool for Continual Category Discovery
Fernando Julio Cendra, Bingchen Zhao, Kai Han
Generating Novel Leads for Drug Discovery Using LLMs with Logical Feedback
Shreyas Bhat Brahmavar, Ashwin Srinivasan, Tirtharaj Dash et al.
Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation
Duy Tho Le, Hengcan Shi, Jianfei Cai et al.
Image Captioning with Multi-Context Synthetic Data
Feipeng Ma, Y. Zhou, Fengyun Rao et al.
MoST: Motion Style Transformer Between Diverse Action Contents
Boeun Kim, Jungho Kim, Hyung Jin Chang et al.
PILoRA: Prototype Guided Incremental LoRA for Federated Class-Incremental Learning
Haiyang Guo, Fei Zhu, Wenzhuo Liu et al.
Source Prompt Disentangled Inversion for Boosting Image Editability with Diffusion Models
Ruibin Li, Ruihuang Li, Song Guo et al.
A Simple and Effective Point-based Network for Event Camera 6-DOFs Pose Relocalization
Hongwei Ren, Jiadong Zhu, Yue Zhou et al.
InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping
Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang Zhao
Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction
Alexander Timans, Christoph-Nikolas Straehle, Kaspar Sakmann et al.
Improving Virtual Try-On with Garment-focused Diffusion Models
Siqi Wan, Yehao Li, Jingwen Chen et al.
Rotation-Agnostic Image Representation Learning for Digital Pathology
Saghir Alfasly, Abubakr Shafique, Peyman Nejat et al.
DVSAI: Diverse View-Shared Anchors Based Incomplete Multi-View Clustering
Shengju Yu, Siwei Wang, Pei Zhang et al.
Towards Faithful XAI Evaluation via Generalization-Limited Backdoor Watermark
Mengxi Ya, Yiming Li, Tao Dai et al.
FedMef: Towards Memory-efficient Federated Dynamic Pruning
Hong Huang, Weiming Zhuang, Chen Chen et al.
Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring
Huicong Zhang, Haozhe Xie, Hongxun Yao
Open-Set Domain Adaptation for Semantic Segmentation
Seun-An Choe, Ah-Hyung Shin, Keon Hee Park et al.
Traffic Flow Optimisation for Lifelong Multi-Agent Path Finding
Zhe Chen, Daniel Harabor, Jiaoyang Li et al.
Code-Style In-Context Learning for Knowledge-Based Question Answering
Zhijie Nie, Richong Zhang, Zhongyuan Wang et al.
Text Image Inpainting via Global Structure-Guided Diffusion Models
Shipeng Zhu, Pengfei Fang, Chenjie Zhu et al.
Dataset Enhancement with Instance-Level Augmentations
Orest Kupyn, Christian Rupprecht
GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction
Yuxuan Mu, Xinxin Zuo, Chuan Guo et al.
HiPose: Hierarchical Binary Surface Encoding and Correspondence Pruning for RGB-D 6DoF Object Pose Estimation
Yongliang Lin, Yongzhi Su, Praveen Nathan et al.
Multiagent Multitraversal Multimodal Self-Driving: Open MARS Dataset
Yiming Li, Zhiheng Li, Nuo Chen et al.
CC-SAM: Enhancing SAM with Cross-feature Attention and Context for Ultrasound Image Segmentation
Shreyank Narayana Gowda, David A Clifton
Class Incremental Learning via Likelihood Ratio Based Task Prediction
Haowei Lin, Yijia Shao, Weinan Qian et al.
CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning
Ziyang Gong, FuHao Li, Yupeng Deng et al.
STDiff: Spatio-Temporal Diffusion for Continuous Stochastic Video Prediction
Xi Ye, Guillaume-Alexandre Bilodeau
Harnessing Holistic Discourse Features and Triadic Interaction for Sentiment Quadruple Extraction in Dialogues
Bobo Li, Hao Fei, Lizi Liao et al.
TAMM: TriAdapter Multi-Modal Learning for 3D Shape Understanding
Zhihao Zhang, Shengcao Cao, Yu-Xiong Wang
Cloud-Device Collaborative Learning for Multimodal Large Language Models
Guanqun Wang, Jiaming Liu, Chenxuan Li et al.
Implicit Concept Removal of Diffusion Models
Zhili Liu, Kai Chen, Yifan Zhang et al.
Temporally and Distributionally Robust Optimization for Cold-Start Recommendation
Xinyu Lin, Wenjie Wang, Jujia Zhao et al.
T4P: Test-Time Training of Trajectory Prediction via Masked Autoencoder and Actor-specific Token Memory
Daehee Park, Jaeseok Jeong, Sung-Hoon Yoon et al.
Good Teachers Explain: Explanation-Enhanced Knowledge Distillation
Amin Parchami, Moritz Böhle, Sukrut Rao et al.
Diverse Person: Customize Your Own Dataset for Text-Based Person Search
Zifan Song, Guosheng Hu, Cairong Zhao
MM-Point: Multi-View Information-Enhanced Multi-Modal Self-Supervised 3D Point Cloud Understanding
HaiTao Yu, Mofei Song
Exploring the Transferability of Visual Prompting for Multimodal Large Language Models
Yichi Zhang, Yinpeng Dong, Siyuan Zhang et al.
Synchronization is All You Need: Exocentric-to-Egocentric Transfer for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs
Camillo Quattrocchi, Antonino Furnari, Daniele Di Mauro et al.
MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation
Min Zhang, Haoxuan Li, Fei Wu et al.
Relightable and Animatable Neural Avatars from Videos
Wenbin Lin, Chengwei Zheng, Jun-hai Yong et al.
A Dual-Way Enhanced Framework from Text Matching Point of View for Multimodal Entity Linking
Shezheng Song, Shan Zhao, ChengYu Wang et al.
Unprocessing Seven Years of Algorithmic Fairness
André F. Cruz, Moritz Hardt
Composing Object Relations and Attributes for Image-Text Matching
Khoi Pham, Chuong Huynh, Ser-Nam Lim et al.
RMem: Restricted Memory Banks Improve Video Object Segmentation
Junbao Zhou, Ziqi Pang, Yu-Xiong Wang
Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt
Bin-Bin Gao
Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes
Zhi Cai, Yingjie Gao, Yaoyan Zheng et al.
PhysPT: Physics-aware Pretrained Transformer for Estimating Human Dynamics from Monocular Videos
Yufei Zhang, Jeffrey Kephart, Zijun Cui et al.
OVOR: OnePrompt with Virtual Outlier Regularization for Rehearsal-Free Class-Incremental Learning
Wei-Cheng Huang, Chun-Fu Chen, Hsiang Hsu
Spectral-Based Graph Neural Networks for Complementary Item Recommendation
Haitong Luo, Xuying Meng, Suhang Wang et al.
BCLNet: Bilateral Consensus Learning for Two-View Correspondence Pruning
Xiangyang Miao, Guobao Xiao, Shiping Wang et al.
Geometric-Facilitated Denoising Diffusion Model for 3D Molecule Generation
Can Xu, Haosen Wang, Weigang Wang et al.
Connecting Consistency Distillation to Score Distillation for Text-to-3D Generation
Zongrui Li, Minghui Hu, Qian Zheng et al.
Beta-Tuned Timestep Diffusion Model
Tianyi Zheng, Peng-Tao Jiang, Ben Wan et al.
Locality Sensitive Sparse Encoding for Learning World Models Online
Zichen Liu, Chao Du, Wee Sun Lee et al.
Fair-VPT: Fair Visual Prompt Tuning for Image Classification
Sungho Park, Hyeran Byun
Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation
Zikai Huang, Xuemiao Xu, Cheng Xu et al.
EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval
Thomas Hummel, Shyamgopal Karthik, Mariana-Iuliana Georgescu et al.
Enhancing High-Resolution 3D Generation through Pixel-wise Gradient Clipping
Zijie Pan, Jiachen Lu, Xiatian Zhu et al.
Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis
Mingyang Zhao, Jingen Jiang, Lei Ma et al.
COMBAT: Alternated Training for Effective Clean-Label Backdoor Attacks
Tran Huynh, Dang Nguyen, Tung Pham et al.
InfMAE: A Foundation Model in The Infrared Modality
Fangcen Liu, Chenqiang Gao, Yaming Zhang et al.
Dense Projection for Anomaly Detection
Dazhi Fu, Zhao Zhang, Jicong Fan
MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing
Haoyu Zhao, Tianyi Lu, Jiaxi Gu et al.
Deep Quantum Error Correction
Yoni Choukroun, Lior Wolf
Fine-Tuning Graph Neural Networks by Preserving Graph Generative Patterns
Yifei Sun, Qi Zhu, Yang Yang et al.
ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining
Dezhi Peng, Chongyu Liu, Yuliang Liu et al.
Rethinking the Evaluation Protocol of Domain Generalization
Han Yu, Xingxuan Zhang, Renzhe Xu et al.
UNIC: Universal Classification Models via Multi-teacher Distillation
Yannis Kalantidis, Diane Larlus, Mert Bulent Sariyildiz et al.
FAR: Flexible Accurate and Robust 6DoF Relative Camera Pose Estimation
Chris Rockwell, Nilesh Kulkarni, Linyi Jin et al.
PartSTAD: 2D-to-3D Part Segmentation Task Adaptation
Hyunjin Kim, Minhyuk Sung
Benchmarking Algorithms for Federated Domain Generalization
Ruqi Bai, Saurabh Bagchi, David Inouye
Understanding Video Transformers via Universal Concept Discovery
Matthew Kowal, Achal Dave, Rares Andrei Ambrus et al.
Diffusion Model is a Good Pose Estimator from 3D RF-Vision
Junqiao Fan, Jianfei Yang, Yuecong Xu et al.
Towards Understanding Factual Knowledge of Large Language Models
Xuming Hu, Junzhe Chen, Xiaochuan Li et al.
Revisiting Document-Level Relation Extraction with Context-Guided Link Prediction
Monika Jain, Raghava Mutharaju, Ramakanth Kavuluru et al.
Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition
Mingfang Zhang, Yifei Huang, Ruicong Liu et al.
Weakly Supervised Semantic Segmentation for Driving Scenes
Dongseob Kim, Seungho Lee, Junsuk Choe et al.
HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud
Wencan Cheng, Hao Tang, Luc Van Gool et al.
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data
Qihao Liu, Yi Zhang, Song Bai et al.
SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration
Kezheng Xiong, Maoji Zheng, Qingshan Xu et al.
Visual Alignment Pre-training for Sign Language Translation
Peiqi Jiao, Yuecong Min, Xilin Chen
LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation
Yuchen Su, Zhineng Chen, Zhiwen Shao et al.
De-confounded Data-free Knowledge Distillation for Handling Distribution Shifts
Yuzheng Wang, Dingkang Yang, Zhaoyu Chen et al.
Every Node Is Different: Dynamically Fusing Self-Supervised Tasks for Attributed Graph Clustering
Pengfei Zhu, Qian Wang, Yu Wang et al.
Deep Diffusion Image Prior for Efficient OOD Adaptation in 3D Inverse Problems
Hyungjin Chung, Jong Chul Ye
Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression
Runtian Zhai, Bingbin Liu, Andrej Risteski et al.
COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL
Xiyao Wang, Ruijie Zheng, Yanchao Sun et al.
SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds
Yanbo Wang, Wentao Zhao, Chuan Cao et al.
Data Valuation and Detections in Federated Learning
Wenqian Li, Shuran Fu, Fengrui Zhang et al.
CLOSER: Towards Better Representation Learning for Few-Shot Class-Incremental Learning
Junghun Oh, Sungyong Baik, Kyoung Mu Lee
Decomposing Semantic Shifts for Composed Image Retrieval
Xingyu Yang, Daqing Liu, Heng Zhang et al.
Neural Visibility Field for Uncertainty-Driven Active Mapping
Shangjie Xue, Jesse Dill, Pranay Mathur et al.
LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors
Saksham Suri, Matthew Walmer, Kamal Gupta et al.
Emergent Visual-Semantic Hierarchies in Image-Text Representations
Morris Alper, Hadar Averbuch-Elor
Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors
Haoxuanye Ji, Pengpeng Liang, Erkang Cheng
Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding
Tatsunori Taniai, Ryo Igarashi, Yuta Suzuki et al.
PreRoutGNN for Timing Prediction with Order Preserving Partition: Global Circuit Pre-training, Local Delay Learning and Attentional Cell Modeling
Ruizhe Zhong, Junjie Ye, Zhentao Tang et al.
Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World
Rujie Wu, Xiaojian Ma, Zhenliang Zhang et al.
Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts
Andong Tan, Fengtao Zhou, Hao Chen
RGMComm: Return Gap Minimization via Discrete Communications in Multi-Agent Reinforcement Learning
Jingdi Chen, Tian Lan, Carlee Joe-Wong
PetFace: A Large-Scale Dataset and Benchmark for Animal Identification
Risa Shinoda, Kaede Shiohara
City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web
Kaiwen Song, Xiaoyi Zeng, Chenqu Ren et al.
Music Style Transfer with Time-Varying Inversion of Diffusion Models
Sifei Li, Yuxin Zhang, Fan Tang et al.
Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation
Junyan Wang, Zhenhong Sun, Stewart Tan et al.
RAW-Adapter: Adapting Pretrained Visual Model to Camera RAW Images
Ziteng Cui, Tatsuya Harada
Raindrop Clarity: A Dual-Focused Dataset for Day and Night Raindrop Removal
Yeying Jin, Xin Li, Jiadong Wang et al.
PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor
Vidit Goel, Elia Peruzzo, Yifan Jiang et al.
Unlocking the Potential of Federated Learning: The Symphony of Dataset Distillation via Deep Generative Latents
Yuqi Jia, Saeed Vahidian, Jingwei Sun et al.
Differentiable Information Bottleneck for Deterministic Multi-view Clustering
Xiaoqiang Yan, Zhixiang Jin, Fengshou Han et al.
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation
Xuelu Feng, Dongdong Chen, Junsong Yuan et al.
PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer
Tongkun Guan, Chengyu Lin, Wei Shen et al.
Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation
Tao Chen, Xiruo Jiang, Gensheng Pei et al.
Self-Supervised Video Desmoking for Laparoscopic Surgery
Renlong Wu, Zhilu Zhang, Shuohao Zhang et al.
Keypoint Promptable Re-Identification
Vladimir Somers, Alexandre Alahi, Christophe De Vleeschouwer
One-stage Prompt-based Continual Learning
Youngeun Kim, Yuhang Li, Priyadarshini Panda
DeiT-LT: Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets
Harsh Rangwani, Pradipto Mondal, Mayank Mishra et al.
What Makes a Good Prune? Maximal Unstructured Pruning for Maximal Cosine Similarity
Gabryel Mason-Williams, Fredrik Dahlqvist
SfmCAD: Unsupervised CAD Reconstruction by Learning Sketch-based Feature Modeling Operations
Pu Li, Jianwei Guo, Huibin Li et al.
SuperNormal: Neural Surface Reconstruction via Multi-View Normal Integration
Xu Cao, Takafumi Taketomi
Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation
Xiyi Chen, Marko Mihajlovic, Shaofei Wang et al.
CycleINR: Cycle Implicit Neural Representation for Arbitrary-Scale Volumetric Super-Resolution of Medical Data
Wei Fang, Yuxing Tang, Heng Guo et al.
Adaptive VIO: Deep Visual-Inertial Odometry with Online Continual Learning
Youqi Pan, Wugen Zhou, Yingdian Cao et al.
Revisiting Adversarial Training Under Long-Tailed Distributions
Xinli Yue, Ningping Mou, Qian Wang et al.
Motion-prior Contrast Maximization for Dense Continuous-Time Motion Estimation
Friedhelm Hamann, Ziyun Wang, Ioannis Asmanis et al.
Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance
Tien Toan Nguyen, Minh Nhat Vu, Baoru Huang et al.
ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference
Ziqian Zeng, Yihuai Hong, Hongliang Dai et al.
Scaling and Masking: A New Paradigm of Data Sampling for Image and Video Quality Assessment
Yongxu Liu, Yinghui Quan, Guoyao Xiao et al.
UniM2AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving
Jian Zou, Tianyu Huang, Guanglei Yang et al.
Label-Agnostic Forgetting: A Supervision-Free Unlearning in Deep Models
Shaofei Shen, Chenhao Zhang, Yawen Zhao et al.
What Effects the Generalization in Visual Reinforcement Learning: Policy Consistency with Truncated Return Prediction
Shuo Wang, Zhihao Wu, X. Hu et al.
Stitching Sub-trajectories with Conditional Diffusion Model for Goal-Conditioned Offline RL
Sungyoon Kim, Yunseon Choi, Daiki Matsunaga et al.
Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation
Kai Huang, Hanyun Yin, Heng Huang et al.
CoReS: Orchestrating the Dance of Reasoning and Segmentation
Xiaoyi Bao, Siyang Sun, Shuailei Ma et al.
Unsupervised Layer-Wise Score Aggregation for Textual OOD Detection
Maxime Darrin, Guillaume Staerman, Eduardo Dadalto Camara Gomes et al.
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation
Bolin Lai, Fiona Ryan, Wenqi Jia et al.
M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models
Seunggeun Chi, Hyung-gun Chi, Hengbo Ma et al.
Image Inpainting via Iteratively Decoupled Probabilistic Modeling
Wenbo Li, Xin Yu, Kun Zhou et al.
Align Before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition
Yifei Chen, Dapeng Chen, Ruijin Liu et al.
Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision
Hao Dong, Eleni Chatzi, Olga Fink
HAVE-FUN: Human Avatar Reconstruction from Few-Shot Unconstrained Images
Xihe Yang, Xingyu Chen, Daiheng Gao et al.
Condition-Aware Neural Network for Controlled Image Generation
Han Cai, Muyang Li, Qinsheng Zhang et al.
Latent Diffusion Prior Enhanced Deep Unfolding for Snapshot Spectral Compressive Imaging
Zongliang Wu, Ruiying Lu, Ying Fu et al.
PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts
Zewen Chen, Haina Qin, Juan Wang et al.
Decompose-and-Compose: A Compositional Approach to Mitigating Spurious Correlation
Fahimeh Hosseini Noohdani, Parsa Hosseini, Aryan Yazdan Parast et al.
LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units
Zeyu Liu, Gourav Datta, Anni Li et al.
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
Brian Gordon, Yonatan Bitton, Yonatan Shafir et al.
De-Diffusion Makes Text a Strong Cross-Modal Interface
Chen Wei, Chenxi Liu, Siyuan Qiao et al.
Adapting Short-Term Transformers for Action Detection in Untrimmed Videos
Min Yang, Huan Gao, Ping Guo et al.
FRIH: Fine-Grained Region-Aware Image Harmonization
Jinlong Peng, Zekun Luo, Liang Liu et al.
Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning
Yibing Wei, Abhinav Gupta, Pedro Morgado
Taming Latent Diffusion Model for Neural Radiance Field Inpainting
Chieh Lin, Changil Kim, Jia-Bin Huang et al.
SURER: Structure-Adaptive Unified Graph Neural Network for Multi-View Clustering
Jing Wang, Songhe Feng, Gengyu Lyu et al.
Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models
Matthew Kowal, Richard P. Wildes, Kosta Derpanis
PARE-Net: Position-Aware Rotation-Equivariant Networks for Robust Point Cloud Registration
Runzhao Yao, Shaoyi Du, Wenting Cui et al.
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
Andreas Engelhardt, Amit Raj, Mark Boss et al.
Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation
Yixiao Wang, Chen Tang, Lingfeng Sun et al.
Combating Data Imbalances in Federated Semi-supervised Learning with Dual Regulators
Sikai Bai, Shuaicheng Li, Weiming Zhuang et al.
VividDreamer: Invariant Score Distillation for Hyper-Realistic Text-to-3D Generation
Wenjie Zhuo, Fan Ma, Hehe Fan et al.
SuperGaussian: Repurposing Video Models for 3D Super Resolution
Yuan Shen, Duygu Ceylan, Paul Guerrero et al.
Progressive Poisoned Data Isolation for Training-Time Backdoor Defense
Yiming Chen, Haiwei Wu, Jiantao Zhou
ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open Vocabulary Object Detection
Joonhyun Jeong, Geondo Park, Jayeon Yoo et al.
Controllable Navigation Instruction Generation with Chain of Thought Prompting
Xianghao Kong, Jinyu Chen, Wenguan Wang et al.
CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding
Eslam Abdelrahman, Mohamed Ayman Mohamed, Mahmoud Ahmed et al.
Self-Supervised Multi-Modal Knowledge Graph Contrastive Hashing for Cross-Modal Search
Meiyu Liang, Junping Du, Zhengyang Liang et al.
Iterated Learning Improves Compositionality in Large Vision-Language Models
Chenhao Zheng, Jieyu Zhang, Aniruddha Kembhavi et al.
Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables
Haisong Gong, Weizhi Xu, Shu Wu et al.
Object Pose Estimation via the Aggregation of Diffusion Features
Tianfu Wang, Guosheng Hu, Hongguang Wang
R-MAE: Regions Meet Masked Autoencoders
Duy-Kien Nguyen, Yanghao Li, Vaibhav Aggarwal et al.
C^2RV: Cross-Regional and Cross-View Learning for Sparse-View CBCT Reconstruction
Yiqun Lin, Jiewen Yang, Hualiang Wang et al.
Breaking Physical and Linguistic Borders: Multilingual Federated Prompt Tuning for Low-Resource Languages
Wanru Zhao, Yihong Chen, Royson Lee et al.
A Comprehensive Augmentation Framework for Anomaly Detection
Lin Jiang, Yaping Yan
Transformer-Based Selective Super-resolution for Efficient Image Refinement
Tianyi Zhang, Kishore Kasichainula, Yaoxin Zhuo et al.
Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis
Qian Chen, Shihao Shu, Xiangzhi Bai
IS-DARTS: Stabilizing DARTS through Precise Measurement on Candidate Importance
Hongyi He, Longjun Liu, Haonan Zhang et al.
MaGGIe: Masked Guided Gradual Human Instance Matting
Chuong Huynh, Seoung Wug Oh, Abhinav Shrivastava et al.
Beyond MOT: Semantic Multi-Object Tracking
Yunhao Li, Qin Li, Hao Wang et al.
Programmable Motion Generation for Open-Set Motion Control Tasks
Hanchao Liu, Xiaohang Zhan, Shaoli Huang et al.
Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models
Yu-Chu Yu, Chi-Pin Huang, Jr-Jen Chen et al.
The Hard Positive Truth about Vision-Language Compositionality
Amita Kamath, Cheng-Yu Hsieh, Kai-Wei Chang et al.
Semi-supervised Active Learning for Video Action Detection
Ayush Singh, Aayush J Rana, Akash Kumar et al.
ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting
Chen Duan, Pei Fu, Shan Guo et al.
Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes
Gaurav Shrivastava, Abhinav Shrivastava
Stable Unlearnable Example: Enhancing the Robustness of Unlearnable Examples via Stable Error-Minimizing Noise
Yixin Liu, Kaidi Xu, Xun Chen et al.