Most Cited 2024 "active vision" Papers
InterpretARA: Enhancing Hybrid Automatic Readability Assessment with Linguistic Feature Interpreter and Contrastive Learning
Jinshan Zeng, Xianchao Tong, Xianglong Yu et al.
Learning Multi-Modal Cross-Scale Deformable Transformer Network for Unregistered Hyperspectral Image Super-resolution
Wenqian Dong, Yang Xu, Jiahui Qu et al.
ScanERU: Interactive 3D Visual Grounding Based on Embodied Reference Understanding
Ziyang Lu, Yunqiang Pei, Guoqing Wang et al.
Model-Driven Deep Neural Network for Enhanced AoA Estimation Using 5G gNB
Shengheng Liu, Xingkang Li, Zihuan Mao et al.
Response Enhanced Semi-supervised Dialogue Query Generation
Jianheng Huang, Ante Wang, Linfeng Gao et al.
READ-PVLA: Recurrent Adapter with Partial Video-Language Alignment for Parameter-Efficient Transfer Learning in Low-Resource Video-Language Modeling
Thong Nguyen, Xiaobao Wu, Xinshuai Dong et al.
Winnie: Task-Oriented Dialog System with Structure-Aware Contrastive Learning and Enhanced Policy Planning
Kaizhi Gao, Tianyu Wang, Zhongjing Ma et al.
Dual-Prior Augmented Decoding Network for Long Tail Distribution in HOI Detection
Jiayi Gao, Kongming Liang, Tao Wei et al.
Low-Light Face Super-resolution via Illumination, Structure, and Texture Associated Representation
Chenyang Wang, Junjun Jiang, Kui Jiang et al.
One Self-Configurable Model to Solve Many Abstract Visual Reasoning Problems
Mikołaj Małkiński, Jacek Mańdziuk
A Diffusion Model with State Estimation for Degradation-Blind Inverse Imaging
Liya Ji, ZheFan Rao, Sinno Jialin Pan et al.
Self-Supervised 3D Human Mesh Recovery from a Single Image with Uncertainty-Aware Learning
Guoli Yan, Zichun Zhong, Jing Hua
Descanning: From Scanned to the Original Images with a Color Correction Diffusion Model
Junghun Cha, Ali Haider, Seoyun Yang et al.
SparseGNV: Generating Novel Views of Indoor Scenes with Sparse RGB-D Images
Weihao Cheng, Yan-Pei Cao, Ying Shan
Collaborative Tooth Motion Diffusion Model in Digital Orthodontics
Yeying Fan, Guangshun Wei, Chen Wang et al.
An Information-Flow Perspective on Algorithmic Fairness
Samuel Teuber, Bernhard Beckert
KeDuSR: Real-World Dual-Lens Super-resolution via Kernel-Free Matching
Huanjing Yue, Zifan Cui, Kun Li et al.
Robustly Train Normalizing Flows via KL Divergence Regularization
Kun Song, Ruben Solozabal Ochoa de Retana, Hao Li et al.
CoVR: Learning Composed Video Retrieval from Web Video Captions
Lucas Ventura, Antoine Yang, Cordelia Schmid et al.
Unknown-Aware Graph Regularization for Robust Semi-supervised Learning from Uncurated Data
Heejo Kong, Suneung Kim, Ho-Joong Kim et al.
Learning Encodings for Constructive Neural Combinatorial Optimization Needs to Regret
Rui Sun, Zhi Zheng, Zhenkun Wang
Taming the Sigmoid Bottleneck: Provably Argmaxable Sparse Multi-Label Classification
Andreas Grivas, Antonio Vergari, Adam Lopez
DP-AdamBC: Your DP-Adam Is Actually DP-SGD (Unless You Apply Bias Correction)
Qiaoyue Tang, Frederick Shpilevskiy, Mathias Lécuyer
Fine-Tuning Graph Neural Networks by Preserving Graph Generative Patterns
Yifei Sun, Qi Zhu, Yang Yang et al.
MEPSI: An MDL-Based Ensemble Pruning Approach with Structural Information
Xiao-Dong Bi, Shao-Qun Zhang, Yuan Jiang
Semi-supervised Learning of Dynamical Systems with Neural Ordinary Differential Equations: A Teacher-Student Model Approach
Yu Wang, Yuxuan Yin, Karthik Somayaji NS et al.
New Classes of the Greedy-Applicable Arm Feature Distributions in the Sparse Linear Bandit Problem
Koji Ichikawa, Shinji Ito, Daisuke Hatano et al.
Universal Weak Coreset
Ragesh Jaiswal, Amit Kumar
Contrastive Balancing Representation Learning for Heterogeneous Dose-Response Curves Estimation
Minqin Zhu, Anpeng Wu, Haoxuan Li et al.
RetroOOD: Understanding Out-of-Distribution Generalization in Retrosynthesis Prediction
Yemin Yu, Luotian Yuan, Ying Wei et al.
MemoryBank: Enhancing Large Language Models with Long-Term Memory
Wanjun Zhong, Lianghong Guo, Qiqi Gao et al.
REGLO: Provable Neural Network Repair for Global Robustness Properties
Feisi Fu, Zhilu Wang, Weichao Zhou et al.
CaMIL: Causal Multiple Instance Learning for Whole Slide Image Classification
Kaitao Chen, Shiliang Sun, Jing Zhao
Approximation Scheme for Weighted Metric Clustering via Sherali-Adams
Dmitrii Avdiukhin, Vaggos Chatziafratis, Konstantin Makarychev et al.
Inducing Point Operator Transformer: A Flexible and Scalable Architecture for Solving PDEs
Seungjun Lee, TaeIL Oh
Contextual Pandora’s Box
Alexia Atsidakou, Constantine Caramanis, Evangelia Gergatsouli et al.
Robust Distributed Gradient Aggregation Using Projections onto Gradient Manifolds
Kwang In Kim
Generative Model Perception Rectification Algorithm for Trade-Off between Diversity and Quality
Guipeng Lan, Shuai Xiao, Jiachen Yang et al.
Faithful Model Explanations through Energy-Constrained Conformal Counterfactuals
Patrick Altmeyer, Mojtaba Farmanbar, Arie Van Deursen et al.
Enhancing the Robustness of Spiking Neural Networks with Stochastic Gating Mechanisms
Jianhao Ding, Zhaofei Yu, Tiejun Huang et al.
A Closer Look at Curriculum Adversarial Training: From an Online Perspective
Lianghe Shi, Weiwei Liu
Provably Convergent Federated Trilevel Learning
Yang Jiao, Kai Yang, Tiancheng Wu et al.
Dynamic Knowledge Injection for AIXI Agents
Samuel Yang-Zhao, Kee Siong Ng, Marcus Hutter
Finite-Time Frequentist Regret Bounds of Multi-Agent Thompson Sampling on Sparse Hypergraphs
Tianyuan Jin, Hao-Lun Hsu, William Chang et al.
Feature Distribution Matching by Optimal Transport for Effective and Robust Coreset Selection
A Unified Self-Distillation Framework for Multimodal Sentiment Analysis with Uncertain Missing Modalities
Guiding a Harsh-Environments Robust Detector via RAW Data Characteristic Mining
Resisting Backdoor Attacks in Federated Learning via Bidirectional Elections and Individual Perspective
Zhen Qin, Feiyi Chen, Chen Zhi et al.
Transportable Representations for Domain Generalization
Kasra Jalaldoust, Elias Bareinboim
Exponential Hardness of Optimization from the Locality in Quantum Neural Networks
Hao-Kai Zhang, Chengkai Zhu, Geng Liu et al.
MFOS: Model-Free & One-Shot Object Pose Estimation
JongMin Lee, Yohann Cabon, Romain Brégier et al.
Hierarchical Topology Isomorphism Expertise Embedded Graph Contrastive Learning
Jiangmeng Li, Yifan Jin, Hang Gao et al.
PDE+: Enhancing Generalization via PDE with Adaptive Distributional Diffusion
Yige Yuan, Bingbing Xu, Bo Lin et al.
Learning Representations on the Unit Sphere: Investigating Angular Gaussian and Von Mises-Fisher Distributions for Online Continual Learning
Nicolas Michel, Giovanni Chierchia, Romain Negrel et al.
Towards Real-World Test-Time Adaptation: Tri-net Self-Training with Balanced Normalization
Yongyi Su, Xun Xu, Kui Jia
Probabilistic Offline Policy Ranking with Approximate Bayesian Computation
Longchao Da, Porter Jenkins, Trevor Schwantes et al.
Revisiting Disentanglement in Downstream Tasks: A Study on Its Necessity for Abstract Visual Reasoning
Ruiqian Nai, Zixin Wen, Ji Li et al.
Out of Thin Air: Exploring Data-Free Adversarial Robustness Distillation
Yuzheng Wang, Zhaoyu Chen, Dingkang Yang et al.
HAGO-Net: Hierarchical Geometric Message Passing for Molecular Representation Learning
Hongbin Pei, Taile Chen, Chen A et al.
Unsupervised Template-assisted Point Cloud Shape Correspondence Network
Jiacheng Deng, Jiahao Lu, Tianzhu Zhang
X-3D: Explicit 3D Structure Modeling for Point Cloud Recognition
Shuofeng Sun, Yongming Rao, Jiwen Lu et al.
Efficient Model Stealing Defense with Noise Transition Matrix
Dong-Dong Wu, Chilin Fu, Weichang Wu et al.
MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval
Bowen Zhang, Xiaojie Jin, Weibo Gong et al.
FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning
Junyuan Zhang, Shuang Zeng, Miao Zhang et al.
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition
Jianqiang Wan, Sibo Song, Wenwen Yu et al.
CroSel: Cross Selection of Confident Pseudo Labels for Partial-Label Learning
Shiyu Tian, Hongxin Wei, Yiqun Wang et al.
PTM-VQA: Efficient Video Quality Assessment Leveraging Diverse PreTrained Models from the Wild
Kun Yuan, Hongbo Liu, Mading Li et al.
Improved Self-Training for Test-Time Adaptation
Jing Ma
Mudslide: A Universal Nuclear Instance Segmentation Method
Jun Wang
Rewrite the Stars
Xu Ma, Xiyang Dai, Yue Bai et al.
Virtual Immunohistochemistry Staining for Histological Images Assisted by Weakly-supervised Learning
Jiahan Li, Jiuyang Dong, Shenjin Huang et al.
3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features
Chenfeng Xu, Huan Ling, Sanja Fidler et al.
Model Adaptation for Time Constrained Embodied Control
Jaehyun Song, Minjong Yoo, Honguk Woo
SCE-MAE: Selective Correspondence Enhancement with Masked Autoencoder for Self-Supervised Landmark Estimation
Kejia Yin, Varshanth Rao, Ruowei Jiang et al.
Residual Denoising Diffusion Models
Jiawei Liu, Qiang Wang, Huijie Fan et al.
Weakly Supervised Point Cloud Semantic Segmentation via Artificial Oracle
Hyeokjun Kweon, Jihun Kim, Kuk-Jin Yoon
Generating Content for HDR Deghosting from Frequency View
Tao Hu, Qingsen Yan, Yuankai Qi et al.
Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers
Sheng Yang, Jiawang Bai, Kuofeng Gao et al.
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
Sijin Chen, Xin Chen, Chi Zhang et al.
GenTron: Diffusion Transformers for Image and Video Generation
Shoufa Chen, Mengmeng Xu, Jiawei Ren et al.
Backpropagation-free Network for 3D Test-time Adaptation
Yanshuo Wang, Ali Cheraghian, Zeeshan Hayder et al.
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Dai Shi
Cross-Dimension Affinity Distillation for 3D EM Neuron Segmentation
Xiaoyu Liu, Miaomiao Cai, Yinda Chen et al.
EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Priors
Zhipeng Hu, Minda Zhao, Chaoyi Zhao et al.
RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D
Lingteng Qiu, Guanying Chen, Xiaodong Gu et al.
Robust Synthetic-to-Real Transfer for Stereo Matching
Jiawei Zhang, Jiahe Li, Lei Huang et al.
Understanding and Improving Source-free Domain Adaptation from a Theoretical Perspective
Yu Mitsuzumi, Akisato Kimura, Hisashi Kashima
CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
Hao Ouyang, Qiuyu Wang, Yuxi Xiao et al.
BT-Adapter: Video Conversation is Feasible Without Video Instruction Tuning
Ruyang Liu, Chen Li, Yixiao Ge et al.
Video Frame Interpolation via Direct Synthesis with the Event-based Reference
Yuhan Liu, Yongjian Deng, Hao Chen et al.
CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation
Bo-Yuan Sun, Yuqi Yang, Le Zhang et al.
Rethinking Boundary Discontinuity Problem for Oriented Object Detection
Hang Xu, Xinyuan Liu, Haonan Xu et al.
Dual Prior Unfolding for Snapshot Compressive Imaging
Jiancheng Zhang, Haijin Zeng, Jiezhang Cao et al.
MCNet: Rethinking the Core Ingredients for Accurate and Efficient Homography Estimation
Haokai Zhu, Si-Yuan Cao, Jianxin Hu et al.
Uncertainty-Guided Never-Ending Learning to Drive
Lei Lai, Eshed Ohn-Bar, Sanjay Arora et al.
Feedback-Guided Autonomous Driving
Jimuyang Zhang, Zanming Huang, Arijit Ray et al.
Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology
Oren Kraus, Kian Kenyon-Dean, Saber Saberian et al.
3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos
Jiakai Sun, Han Jiao, Guangyuan Li et al.
TextCraftor: Your Text Encoder Can be Image Quality Controller
Yanyu Li, Xian Liu, Anil Kag et al.
QN-Mixer: A Quasi-Newton MLP-Mixer Model for Sparse-View CT Reconstruction
Ishak Ayad, Nicolas Larue, Mai K. Nguyen
Prompt3D: Random Prompt Assisted Weakly-Supervised 3D Object Detection
Xiaohong Zhang, Huisheng Ye, Jingwen Li et al.
Efficient Meshflow and Optical Flow Estimation from Event Cameras
Xinglong Luo, Ao Luo, Zhengning Wang et al.
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters
Jiazuo Yu, Yunzhi Zhuge, Lu Zhang et al.
CORES: Convolutional Response-based Score for Out-of-distribution Detection
Keke Tang, Chao Hou, Weilong Peng et al.
Equivariant Multi-Modality Image Fusion
Zixiang Zhao, Haowen Bai, Jiangshe Zhang et al.
PromptKD: Unsupervised Prompt Distillation for Vision-Language Models
Zheng Li, Xiang Li, Xinyi Fu et al.
Domain Gap Embeddings for Generative Dataset Augmentation
Yinong Oliver Wang, Younjoon Chung, Chen Henry Wu et al.
Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation
Jingyun Wang, Guoliang Kang
Leveraging Vision-Language Models for Improving Domain Generalization in Image Classification
Sravanti Addepalli, Ashish Asokan, Lakshay Sharma et al.
Draw Step by Step: Reconstructing CAD Construction Sequences from Point Clouds via Multimodal Diffusion
Weijian Ma, Shuaiqi Chen, Yunzhong Lou et al.
Open-Vocabulary 3D Semantic Segmentation with Foundation Models
Li Jiang, Shaoshuai Shi, Bernt Schiele
Class Tokens Infusion for Weakly Supervised Semantic Segmentation
Sung-Hoon Yoon, Hoyong Kwon, Hyeonseong Kim et al.
GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs
Gege Gao, Weiyang Liu, Anpei Chen et al.
SeD: Semantic-Aware Discriminator for Image Super-Resolution
Bingchen Li, Xin Li, Hanxin Zhu et al.
JoAPR: Cleaning the Lens of Prompt Learning for Vision-Language Models
Yuncheng Guo, Xiaodong Gu
View From Above: Orthogonal-View aware Cross-view Localization
Shan Wang, Chuong Nguyen, Jiawei Liu et al.
WorDepth: Variational Language Prior for Monocular Depth Estimation
Ziyao Zeng, Hyoungseob Park, Fengyu Yang et al.
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Tai Wang, Xiaohan Mao, Chenming Zhu et al.
DIOD: Self-Distillation Meets Object Discovery
Sandra Kara, Hejer Ammar, Julien Denize et al.
SNED: Superposition Network Architecture Search for Efficient Video Diffusion Model
Zhengang Li, Yan Kang, Yuchen Liu et al.
Deep Generative Model based Rate-Distortion for Image Downscaling Assessment
Yuanbang Liang, Bhavesh Garg, Paul L. Rosin et al.
VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models
Xiang Li, Qianli Shen, Kenji Kawaguchi
SNI-SLAM: Semantic Neural Implicit SLAM
Siting Zhu, Guangming Wang, Hermann Blum et al.
TextureDreamer: Image-Guided Texture Synthesis Through Geometry-Aware Diffusion
Yu-Ying Yeh, Jia-Bin Huang, Changil Kim et al.
Blur2Blur: Blur Conversion for Unsupervised Image Deblurring on Unknown Domains
Bang-Dang Pham, Phong Tran, Anh Tran et al.
In-distribution Public Data Synthesis with Diffusion Models for Differentially Private Image Classification
Jinseong Park, Yujin Choi, Jaewook Lee
ArtAdapter: Text-to-Image Style Transfer using Multi-Level Style Encoder and Explicit Adaptation
Dar-Yen Chen, Hamish Tennent, Ching-Wen Hsu
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Bingxin Ke, Anton Obukhov, Shengyu Huang et al.
GS-IR: 3D Gaussian Splatting for Inverse Rendering
Zhihao Liang, Qi Zhang, Ying Feng et al.
SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis
Ziqiao Peng, Wentao Hu, Yue Shi et al.
D3still: Decoupled Differential Distillation for Asymmetric Image Retrieval
Yi Xie, Yihong Lin, Wenjie Cai et al.
MTLoRA: Low-Rank Adaptation Approach for Efficient Multi-Task Learning
Ahmed Agiza, Marina Neseem, Sherief Reda
SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction
Yuanhui Huang, Wenzhao Zheng, Borui Zhang et al.
Analyzing and Improving the Training Dynamics of Diffusion Models
Tero Karras, Miika Aittala, Jaakko Lehtinen et al.
DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaptation by Combining 3D GANs and Diffusion Priors
Biwen Lei, Kai Yu, Mengyang Feng et al.
Groupwise Query Specialization and Quality-Aware Multi-Assignment for Transformer-based Visual Relationship Detection
Jongha Kim, Jihwan Park, Jinyoung Park et al.
SD2Event: Self-supervised Learning of Dynamic Detectors and Contextual Descriptors for Event Cameras
Yuan Gao, Yuqing Zhu, Xinjun Li et al.
DiSR-NeRF: Diffusion-Guided View-Consistent Super-Resolution NeRF
Jie Long Lee, Chen Li, Gim Hee Lee
PaReNeRF: Toward Fast Large-scale Dynamic NeRF with Patch-based Reference
Xiao Tang, Min Yang, Penghui Sun et al.
Effective Video Mirror Detection with Inconsistent Motion Cues
Alex Warren, Ke Xu, Jiaying Lin et al.
Desigen: A Pipeline for Controllable Design Template Generation
Haohan Weng, Danqing Huang, Yu Qiao et al.
Rich Human Feedback for Text-to-Image Generation
Youwei Liang, Junfeng He, Gang Li et al.
Dr. Bokeh: DiffeRentiable Occlusion-aware Bokeh Rendering
Yichen Sheng, Zixun Yu, Lu Ling et al.
Learning from Observer Gaze: Zero-Shot Attention Prediction Oriented by Human-Object Interaction Recognition
Yuchen Zhou, Linkai Liu, Chao Gou
Super-Resolution Reconstruction from Bayer-Pattern Spike Streams
Yanchen Dong, Ruiqin Xiong, Jian Zhang et al.
Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D
Karran Pandey, Paul Guerrero, Matheus Gadelha et al.
Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly
Hang Du, Sicheng Zhang, Binzhu Xie et al.
DifFlow3D: Toward Robust Uncertainty-Aware Scene Flow Estimation with Iterative Diffusion-Based Refinement
Jiuming Liu, Guangming Wang, Weicai Ye et al.
Shadows Don't Lie and Lines Can't Bend! Generative Models don't know Projective Geometry...for now
Ayush Sarkar, Hanlin Mai, Amitabh Mahapatra et al.
Aligning Logits Generatively for Principled Black-Box Knowledge Distillation
Jing Ma, Xiang Xiang, Ke Wang et al.
Permutation Equivariance of Transformers and Its Applications
Hengyuan Xu, Liyao Xiang, Hangyu Ye et al.
HomoFormer: Homogenized Transformer for Image Shadow Removal
Jie Xiao, Xueyang Fu, Yurui Zhu et al.
HardMo: A Large-Scale Hardcase Dataset for Motion Capture
Jiaqi Liao, Chuanchen Luo, Yinuo Du et al.
SLICE: Stabilized LIME for Consistent Explanations for Image Classification
Revoti Prasad Bora, Kiran Raja, Philipp Terhörst et al.
EFHQ: Multi-purpose ExtremePose-Face-HQ dataset
Trung Dao, Duc H Vu, Cuong Pham et al.
Logarithmic Lenses: Exploring Log RGB Data for Image Classification
Bruce Maxwell, Sumegha Singhania, Avnish Patel et al.
TokenCompose: Text-to-Image Diffusion with Token-level Supervision
Zirui Wang, Zhizhou Sha, Zheng Ding et al.
Seeing the World through Your Eyes
Hadi Alzayer, Kevin Zhang, Brandon Y. Feng et al.
Learning Vision from Models Rivals Learning Vision from Data
Yonglong Tian, Lijie Fan, Kaifeng Chen et al.
JointSQ: Joint Sparsification-Quantization for Distributed Learning
Weiying Xie, Haowei Li, Ma Jitao et al.
Holo-Relighting: Controllable Volumetric Portrait Relighting from a Single Image
Yiqun Mei, Yu Zeng, He Zhang et al.
Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences
Axel Barroso-Laguna, Sowmya Munukutla, Victor Adrian Prisacariu et al.
MorpheuS: Neural Dynamic 360° Surface Reconstruction from Monocular RGB-D Video
Hengyi Wang, Jingwen Wang, Lourdes Agapito
Capturing Closely Interacted Two-Person Motions with Reaction Priors
Qi Fang, Yinghui Fan, Yanjun Li et al.
DiVa-360: The Dynamic Visual Dataset for Immersive Neural Fields
Cheng-You Lu, Peisen Zhou, Angela Xing et al.
Learning Visual Prompt for Gait Recognition
Kang Ma, Ying Fu, Chunshui Cao et al.
PolarRec: Improving Radio Interferometric Data Reconstruction Using Polar Coordinates
Ruoqi Wang, Zhuoyang Chen, Jiayi Zhu et al.
StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN
Jongwoo Choi, Kwanggyoon Seo, Amirsaman Ashtari et al.
Temporally Consistent Unbalanced Optimal Transport for Unsupervised Action Segmentation
Ming Xu, Stephen Gould
Learning for Transductive Threshold Calibration in Open-World Recognition
Qin Zhang, Dongsheng An, Tianjun Xiao et al.
SonicVisionLM: Playing Sound with Vision Language Models
Zhifeng Xie, Shengye Yu, Qile He et al.
Real-Time Exposure Correction via Collaborative Transformations and Adaptive Sampling
Ziwen Li, Feng Zhang, Meng Cao et al.
NeLF-Pro: Neural Light Field Probes for Multi-Scale Novel View Synthesis
Zinuo You, Andreas Geiger, Anpei Chen
OpenEQA: Embodied Question Answering in the Era of Foundation Models
Arjun Majumdar, Anurag Ajay, Xiaohan Zhang et al.
Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction
Guillaume Jaume, Anurag Vaidya, Richard J. Chen et al.
Practical Measurements of Translucent Materials with Inter-Pixel Translucency Prior
Zhenyu Chen, Jie Guo, Shuichang Lai et al.
View-Category Interactive Sharing Transformer for Incomplete Multi-View Multi-Label Learning
Shilong Ou, Zhe Xue, Yawen Li et al.
Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models
Yabin Zhang, Wenjie Zhu, Hui Tang et al.
FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures
Lisa Mais, Peter Hirsch, Claire Managan et al.
RankMatch: Exploring the Better Consistency Regularization for Semi-supervised Semantic Segmentation
Huayu Mai, Rui Sun, Tianzhu Zhang et al.
CoDe: An Explicit Content Decoupling Framework for Image Restoration
Enxuan Gu, Hongwei Ge, Yong Guo
Masked Spatial Propagation Network for Sparsity-Adaptive Depth Refinement
Jinyoung Jun, Jae-Han Lee, Chang-Su Kim
D^4: Dataset Distillation via Disentangled Diffusion Model
Duo Su, Junjie Hou, Weizhi Gao et al.
An Empirical Study of the Generalization Ability of Lidar 3D Object Detectors to Unseen Domains
George Eskandar
ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification
Jiangbo Shi, Chen Li, Tieliang Gong et al.
CaDeT: a Causal Disentanglement Approach for Robust Trajectory Prediction in Autonomous Driving
Mozhgan Pourkeshavarz, Junrui Zhang, Amir Rasouli
Boosting Neural Representations for Videos with a Conditional Decoder
Xinjie Zhang, Ren Yang, Dailan He et al.
Text-Guided 3D Face Synthesis - From Generation to Editing
Yunjie Wu, Yapeng Meng, Zhipeng Hu et al.
IReNe: Instant Recoloring of Neural Radiance Fields
Alessio Mazzucchelli, Adrian Garcia-Garcia, Elena Garces et al.
Distilling CLIP with Dual Guidance for Learning Discriminative Human Body Shape Representation
Feng Liu, Minchul Kim, Zhiyuan Ren et al.
CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification
Haoran Lai, Qingsong Yao, Zihang Jiang et al.
MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant
Chenlu Zhan, Gaoang Wang, Yu Lin et al.
Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion
Sofia Casarin, Cynthia Ugwu, Sergio Escalera et al.
Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing
Hyelin Nam, Gihyun Kwon, Geon Yeong Park et al.
DiffLoc: Diffusion Model for Outdoor LiDAR Localization
Wen Li, Yuyang Yang, Shangshu Yu et al.
Portrait4D: Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data
Yu Deng, Duomin Wang, Xiaohang Ren et al.
Soften to Defend: Towards Adversarial Robustness via Self-Guided Label Refinement
Daiwei Yu, Zhuorong Li, Lina Wei et al.
Wired Perspectives: Multi-View Wire Art Embraces Generative AI
Zhiyu Qu, Lan Yang, Honggang Zhang et al.
Small Scale Data-Free Knowledge Distillation
He Liu, Yikai Wang, Huaping Liu et al.
Transfer CLIP for Generalizable Image Denoising
Jun Cheng, Dong Liang, Shan Tan