Most Cited AAAI "instance-level optimization" Papers
Do Not DeepFake Me: Privacy-Preserving Neural 3D Head Reconstruction Without Sensitive Images
Jiayi Kong, Xurui Song, Shuo Huai et al.
Real-Time Neural Denoising with Render-Aware Knowledge Distillation
Mengxun Kong, Jie Guo, Chen Wang et al.
Stable Mean Teacher for Semi-supervised Video Action Detection
Akash Kumar, Sirshapan Mitra, Yogesh Singh Rawat
A Unified Degradation-Robust Approach to SSL and UDA for 3D Medical Images
Suruchi Kumari, Pravendra Singh
SAFIRE: Segment Any Forged Image Region
Myung-Joon Kwon, Wonjun Lee, Seung-Hun Nam et al.
Exploiting Diffusion Prior for Real-World Image Dehazing with Unpaired Training
Yunwei Lan, Zhigao Cui, Chang Liu et al.
Color Transfer with Modulated Flows
Maria Larchenko, Alexander Lobashev, Dmitry Guskov et al.
Rethinking Open-Vocabulary Segmentation of Radiance Fields in 3D Space
Hyunjee Lee, Youngsik Yun, Jeongmin Bae et al.
NBA3D: Neighbor-Based Confidence Adjustment for 3D Rare Object Detection Using LiDAR
Jooyoung Lee, Jaeyoon Lee, Jongwon Choi
MAMS: Model-Agnostic Module Selection Framework for Video Captioning
Sangho Lee, Il Yong Chun, Hogun Park
Enabling Region-Specific Control via Lassos in Point-Based Colorization
Sanghyeon Lee, Jooyeol Yun, Jaegul Choo
Concept Matching with Agent for Out-of-Distribution Detection
Yuxiao Lee, Xiaofeng Cao, Jingcai Guo et al.
FNIN: A Fourier Neural Operator-based Numerical Integration Network for Surface-from-gradients
Jiaqi Leng, Yakun Ju, Yuanxu Duan et al.
Disentangled Motion Modeling for Video Frame Interpolation
Jaihyun Lew, Jooyoung Choi, Chaehun Shin et al.
StyO: Stylize Your Face in Only One-Shot
Bonan Li, Zicheng Zhang, Xuecheng Nie et al.
FEAST-Mamba: FEAture and SpaTial Aware Mamba Network with Bidirectional Orthogonal Fusion for Cross-Modal Point Cloud Segmentation
Chade Li, Pengju Zhang, Bo Liu et al.
RemDet: Rethinking Efficient Model Design for UAV Object Detection
Chen Li, Rui Zhao, Zeyu Wang et al.
U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation
Chenxin Li, Xinyu Liu, Wuyang Li et al.
Consistency of Compositional Generalization Across Multiple Levels
Chuanhao Li, Zhen Li, Chenchen Jing et al.
An Efficient Framework for Enhancing Discriminative Models via Diffusion Techniques
Chunxiao Li, Xiaoxiao Wang, Boming Miao et al.
Cascaded Diffusion Models for Virtual Try-On: Improving Control and Resolution
Guangyuan Li, Yongkang Wang, Junsheng Luan et al.
MaskViM: Domain Generalized Semantic Segmentation with State Space Models
Jiahao Li, Yang Lu, Yuan Xie et al.
Know Where You Are From: Event-Based Segmentation via Spatio-Temporal Propagation
Ke Li, Gengyu Lyu, Hao Chen et al.
Similar Modality Enhancement and Action Consistency Learning for Weakly Supervised Temporal Action Localization
Maodong Li, Chao Zheng, Jian Wang et al.
REGNav: Room Expert Guided Image-Goal Navigation
Pengna Li, Kangyi Wu, Jingwen Fu et al.
Region-aware Difference Distilling with Attribute-guided Contrastive Regularization for Change Captioning
Rong Li, Liang Li, Jiehua Zhang et al.
Enhancing Generalizability via Utilization of Unlabeled Data for Occupancy Perception
Ruihang Li, Tao Li, Shanding Ye et al.
A Compact Implicit Neural Representation for Efficient Storage of Massive 4D Functional Magnetic Resonance Imaging
Ruoran Li, Runzhao Yang, Wenxin Xiang et al.
DigitalLLaVA: Incorporating Digital Cognition Capability for Physical World Comprehension in Multimodal LLMs
Shiyu Li, Pengxu Wei, Pengchong Qiao et al.
Transferable Adversarial Face Attack with Text Controlled Attribute
Wenyun Li, Zheng Zhang, Xiangyuan Lan et al.
MambaLCT: Boosting Tracking via Long-term Context State Space Model
Xiaohai Li, Bineng Zhong, Qihua Liang et al.
PersonaMagic: Stage-Regulated High-Fidelity Face Customization with Tandem Equilibrium
Xinzhe Li, Jiahui Zhan, Shengfeng He et al.
Mamba-CAD: State Space Model for 3D Computer-Aided Design Generative Modeling
Xueyang Li, Yunzhong Lou, Yu Song et al.
StructSR: Refuse Spurious Details in Real-World Image Super-Resolution
Yachao Li, Dong Liang, Tianyu Ding et al.
Sparse Transfer Learning Accelerates and Enhances Certified Robustness: A Comprehensive Study
Zhangheng Li, Tianlong Chen, Linyi Li et al.
ProsodyTalker: 3D Visual Speech Animation via Prosody Decomposition
Zonglin Li, Xiaoqian Lv, Qinglin Liu et al.
Exploring the Potential of Large Vision-Language Models for Unsupervised Text-Based Person Retrieval
Zongyi Li, Li Jianbo, Yuxuan Shi et al.
Semantic-guided Masked Mutual Learning for Multi-modal Brain Tumor Segmentation with Arbitrary Missing Modalities
Guoyan Liang, Qin Zhou, Zhe Wang et al.
Skip Mamba Diffusion for Monocular 3D Semantic Scene Completion
Li Liang, Naveed Akhtar, Jordan Vice et al.
S-INF: Towards Realistic Indoor Scene Synthesis via Scene Implicit Neural Field
Zixi Liang, Guowei Xu, Haifeng Wu et al.
Progressive Distribution Matching for Federated Semi-Supervised Learning
Dongping Liao, Xitong Gao, Yabo Xu et al.
Multi-Granularity Video Object Segmentation
Sangbeom Lim, Seongchan Kim, Seungjun An et al.
DreamFit: Garment-Centric Human Generation via a Lightweight Anything-Dressing Encoder
Ente Lin, Xujie Zhang, Fuwei Zhao et al.
Decoupling Appearance Variations with 3D Consistent Features in Gaussian Splatting
Jiaqi Lin, Zhihao Li, Binxiao Huang et al.
InvSeg: Test-Time Prompt Inversion for Semantic Segmentation
Jiayi Lin, Jiabo Huang, Jian Hu et al.
Memory Efficient Matting with Adaptive Token Routing
Yiheng Lin, Yihan Hu, Chenyi Zhang et al.
AGLLDiff: Guiding Diffusion Models Towards Unsupervised Training-free Real-world Low-light Image Enhancement
Yunlong Lin, Tian Ye, Sixiang Chen et al.
Deep Hierarchies and Invariant Disease-Indicative Feature Learning for Computer Aided Diagnosis of Multiple Fundus Diseases
Yuxin Lin, Wei Wang, Xiaoling Luo et al.
Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference
Zhihang Lin, Mingbao Lin, Luxi Lin et al.
SOVGaussian: Sparse-View 3D Gaussian Splatting for Open-Vocabulary Scene Understanding
Peng Ling, Tiao Tan, Jiaqi Lin et al.
Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations
Decheng Liu, Zongqi Wang, Chunlei Peng et al.
UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer
Delong Liu, Zhaohui Hou, Mingjie Zhan et al.
Zero-Shot Noise2Mean: Gap Minimization for Efficient Denoising from a Single Noisy Image
Duo Liu, Yiqi Shi, Guoyin Zhang et al.
SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation
Hongjian Liu, Qingsong Xie, Tianxiang Ye et al.
PEIE: Physics Embedded Illumination Estimation for Adaptive Dehazing
Huaizhuo Liu, Hai-Miao Hu, Yonglong Jiang et al.
TCPFormer: Learning Temporal Correlation with Implicit Pose Proxy for 3D Human Pose Estimation
Jiajie Liu, Mengyuan Liu, Hong Liu et al.
Union Is Strength! Unite the Power of LLMs and MLLMs for Chart Question Answering
Jiapeng Liu, Liang Li, Shihao Rao et al.
UP-Restorer: When Unrolling Meets Prompts for Unified Image Restoration
Minghao Liu, Wenhan Yang, Jinyi Luo et al.
Path-Adaptive Matting for Efficient Inference Under Various Computational Cost Constraints
Qinglin Liu, Zonglin Li, Xiaoqian Lv et al.
DeRainGS: Gaussian Splatting for Enhanced Scene Reconstruction in Rainy Environments
Shuhong Liu, Xiang Chen, Hongming Chen et al.
VQTalker: Towards Multilingual Talking Avatars Through Facial Motion Tokenization
Tao Liu, Ziyang Ma, Qi Chen et al.
Multi-view Consistent 3D Panoptic Scene Understanding
Xianzhu Liu, Xin Sun, Haozhe Xie et al.
Unlocking the Potential of Reverse Distillation for Anomaly Detection
Xinyue Liu, Jianyuan Wang, Biao Leng et al.
Unveiling the Knowledge of CLIP for Training-Free Open-Vocabulary Semantic Segmentation
Yajie Liu, Guodong Wang, Jinjin Zhang et al.
DoGA: Enhancing Grounded Object Detection via Grouped Pre-Training with Attributes
Yang Liu, Feng Hou, Yunjie Peng et al.
Towards Robust Visual Question Answering via Prompt-Driven Geometric Harmonization
Yishu Liu, Jiawei Zhu, Congcong Wen et al.
See Through Their Minds: Learning Transferable Brain Decoding Models from Cross-Subject fMRI
Yulong Liu, Yongqiang Ma, Guibo Zhu et al.
SCOPE: Sign Language Contextual Processing with Embedding from LLMs
Yuqi Liu, Wenqian Zhang, Sihan Ren et al.
Advancing Comprehensive Aesthetic Insight with Multi-Scale Text-Guided Self-Supervised Learning
Yuti Liu, Shice Liu, Junyuan Gao et al.
Training Verification-Friendly Neural Networks via Neuron Behavior Consistency
Zongxin Liu, Zhe Zhao, Fu Song et al.
Robust SAM: On the Adversarial Robustness of Vision Foundation Models
Jiahuan Long, Zhengqin Xu, Tingsong Jiang et al.
RGBT Tracking via All-layer Multimodal Interactions with Progressive Fusion Mamba
Andong Lu, Wanyu Wang, Chenglong Li et al.
Privacy-Preserving V2X Collaborative Perception Integrating Unknown Collaborators
Bin Lu, Xinyu Xiao, Changzhou Zhang et al.
DeMo: Deep Motion Field Consensus with Learnable Kernels for Two-view Correspondence Learning
Yifan Lu, Jiajun Le, Zizhuo Li et al.
Generative Video Diffusion for Unseen Novel Semantic Video Moment Retrieval
Dezhao Luo, Shaogang Gong, Jiabo Huang et al.
Beyond Pixel and Object: Part Feature as Reference for Few-Shot Video Object Segmentation
Naisong Luo, Guoxin Xiong, Tianzhu Zhang
Privacy-Preserving Low-Rank Adaptation Against Membership Inference Attacks for Latent Diffusion Models
Zihao Luo, Xilie Xu, Feng Liu et al.
Revisiting Change Captioning from Self-supervised Global-Part Alignment
Feixiao Lv, Rui Wang, Lihua Jing
ScaleMatch: Multi-scale Consistency Enhancement for Semi-supervised Semantic Segmentation
Liang Lv, Lefei Zhang
Step-Calibrated Diffusion for Biomedical Optical Image Restoration
Yiwei Lyu, Sung Jik Cha, Cheng Jiang et al.
Aligning and Prompting Anything for Zero-Shot Generalized Anomaly Detection
Jitao Ma, Weiying Xie, Hangyu Ye et al.
Does VLM Classification Benefit from LLM Description Semantics?
Pingchuan Ma, Lennart Rietdorf, Dmytro Kotovenko et al.
Instruct Where the Model Fails: Generative Data Augmentation via Guided Self-contrastive Fine-tuning
Weijian Ma, Ruoxin Chen, Keyue Zhang et al.
A Trusted Lesion-assessment Network for Interpretable Diagnosis of Coronary Artery Disease in Coronary CT Angiography
Xinghua Ma, Xinyan Fang, Mingye Zou et al.
Follow-Your-Click: Open-domain Regional Image Animation via Motion Prompts
Yue Ma, Yingqing He, Hongfa Wang et al.
Few-Shot Fine-Grained Image Classification with Progressively Feature Refinement and Continuous Relationship Modeling
Zhen-Xiang Ma, Zhen-Duo Chen, Tai Zheng et al.
OUS: Bridging Scene Context and Facial Features to Overcome the Rigid Cognitive Problem
Xinji Mai, Haoran Wang, Zeng Tao et al.
DMF-Net: Image-Guided Point Cloud Completion with Dual-Channel Modality Fusion and Shape-Aware Upsampling Transformer
Aihua Mao, Yuxuan Tang, Jiangtao Huang et al.
Black-Box Test-Time Prompt Tuning for Vision-Language Models
Fan'an Meng, Chaoran Cui, Hongjun Dai et al.
Sp3ctralMamba: Physics-Driven Joint State Space Model for Hyperspectral Image Reconstruction
Ge Meng, Jingyan Tu, Jingjia Huang et al.
Qua2SeDiMo: Quantifiable Quantization Sensitivity of Diffusion Models
Keith G. Mills, Mohammad Salameh, Ruichen Chen et al.
Energy vs. Noise: Towards Robust Temporal Action Localization in Open-World
Chenyu Mu, Jiahua Li, Kun Wei et al.
SegFace: Face Segmentation of Long-Tail Classes
Kartik Narayan, Vibashan Vs, Vishal M. Patel
HiGDA: Hierarchical Graph of Nodes to Learn Local-to-Global Topology for Semi-Supervised Domain Adaptation
Ba Hung Ngo, Doanh C. Bui, Nhat-Tuong Do-Tran et al.
iMoT: Inertial Motion Transformer for Inertial Navigation
Son Minh Nguyen, Duc Viet Le, Paul Havinga
SPU-IMR: Self-supervised Arbitrary-scale Point Cloud Upsampling via Iterative Mask-recovery Network
Ziming Nie, Qiao Wu, Chenlei Lv et al.
Exploring Semantic Consistency and Style Diversity for Domain Generalized Semantic Segmentation
Hongwei Niu, Linhuang Xie, Jianghang Lin et al.
Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community
Jiancheng Pan, Yanxing Liu, Yuqian Fu et al.
Learning with Open-world Noisy Data via Class-independent Margin in Dual Representation Space
Linchao Pan, Can Gao, Jie Zhou et al.
DuSSS: Dual Semantic Similarity-Supervised Vision-Language Model for Semi-Supervised Medical Image Segmentation
Qingtao Pan, Wenhao Qiao, Jingjiao Lou et al.
Fair Training with Zero Inputs
Wenjie Pan, Jianqing Zhu, Huanqiang Zeng
Procedure Knowledge Decoupled Distillation Strategy for Procedure Planning in Instructional Videos
Xiaotian Pan, Zhaobo Qi, Xin Sun et al.
S2S2: Semantic Stacking for Robust Semantic Segmentation in Medical Imaging
Yimu Pan, Sitao Zhang, Alison D. Gernand et al.
Point Cloud Semantic Segmentation with Sparse and Inhomogeneous Annotations
Zhiyi Pan, Nan Zhang, Wei Gao et al.
Modular-Cam: Modular Dynamic Camera-view Video Generation with LLM
Zirui Pan, Xin Wang, Yipeng Zhang et al.
Partially Blinded Unlearning: Class Unlearning for Deep Networks from Bayesian Perspective
Subhodip Panda, Shashwat Sourav, Prathosh A.P.
Beyond Text: Fine-Grained Multi-Modal Fact Verification with Hypergraph Transformers
Hui Pang, Chaozhuo Li, Litian Zhang et al.
SeeDiff: Off-the-Shelf Seeded Mask Generation from Diffusion Models
Joon Hyun Park, Kumju Jo, Sungyong Baik
EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba
Xiaohuan Pei, Tao Huang, Chang Xu
CDE-Learning: Camera Deviation Elimination Learning for Unsupervised Person Re-identification
Jinjia Peng, Songyu Zhang, Huibing Wang
Adaptive Dual-domain Learning for Underwater Image Enhancement
Lintao Peng, Liheng Bian
Boosting Image De-Raining via Central-Surrounding Synergistic Convolution
Long Peng, Yang Wang, Xin Di et al.
3D-aware Select, Expand, and Squeeze Token for Aerial Action Recognition
Luying Peng, Xiangbo Shu, Yazhou Yao et al.
OAMaskFlow: Occlusion-Aware Motion Mask for Scene Flow
Xiongfeng Peng, Zhihua Liu, Weiming Li et al.
HVDualformer: Histogram-Vision Dual Transformer for White Balance
Yan-Tsung Peng, Guan-Rong Chen
Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance
Duc-Hai Pham, Duc-Dung Nguyen, Anh Pham et al.
Leveraging Anatomical Consistency for Multi-Object Detection in Ultrasound Images via Source-free Unsupervised Domain Adaptation
Bin Pu, Xingguo Lv, Jiewen Yang et al.
Dive into Aerial Remote Sensing Underwater Depth Estimation with Hyperspectral Imagery
Jiahao Qi, Xingyue Liu, Chen Chen et al.
Unsupervised Domain Adaptive Person Search via Dual Self-Calibration
Linfeng Qi, Huibing Wang, Jiqing Zhang et al.
PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement
Wei Qian, Gaoji Su, Dan Guo et al.
Holistic Correction with Object Prototype for Video Object Segmentation
Shengye Qiao, Changqun Xia, Yanjie Liang et al.
Integrating Low-Level Visual Cues for Enhanced Unsupervised Semantic Segmentation
Yuhao Qing, Dan Zeng, Shaorong Xie et al.
PC-BEV: An Efficient Polar-Cartesian BEV Fusion Framework for LiDAR Semantic Segmentation
Shoumeng Qiu, Xinrun Li, Xiangyang Xue et al.
High-Fidelity Polarimetric Implicit 3D Reconstruction with View-Dependent Physical Representation
Yu Qiu, Sijia Wen, Hainan Zhang et al.
HSOD-BIT-V2: A Challenging Benchmark for Hyperspectral Salient Object Detection
Yuhao Qiu, Shuyan Bai, Tingfa Xu et al.
Universal Features Guided Zero-Shot Category-Level Object Pose Estimation
Wentian Qu, Chenyu Meng, Heng Li et al.
GHOST: Gaussian Hypothesis Open-Set Technique
Ryan Rabinowitz, Steve Cruz, Manuel Günther et al.
CDTR: Semantic Alignment for Video Moment Retrieval Using Concept Decomposition Transformer
Ran Ran, Jiwei Wei, Xiangyi Cai et al.
Improving Integrated Gradient-based Transferable Adversarial Examples by Refining the Integration Path
Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.
GenHMR: Generative Human Mesh Recovery
Muhammad Usama Saleem, Ekkasit Pinyoanuntapong, Pu Wang et al.
FunEditor: Achieving Complex Image Edits via Function Aggregation with Diffusion Models
Mohammadreza Samadi, Fred X. Han, Mohammad Salameh et al.
PVTree: Realistic and Controllable Palm Vein Generation for Recognition Tasks
Sheng Shang, Chenglong Zhao, Ruixin Zhang et al.
Video Summarization Using Denoising Diffusion Probabilistic Model
Zirui Shang, Yubo Zhu, Hongxi Li et al.
IMAGDressing-v1: Customizable Virtual Dressing
Fei Shen, Xin Jiang, Xin He et al.
In2NeCT: Inter-class and Intra-class Neural Collapse Tuning for Semantic Segmentation of Imbalanced Remote Sensing Images
Junao Shen, Qiyun Hu, Tian Feng et al.
Topology-Aware 3D Gaussian Splatting: Leveraging Persistent Homology for Optimized Structural Integrity
Tianqi Shen, Shaohua Liu, Jiaqi Feng et al.
Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera
Haixin Shi, Yinlin Hu, Daniel Koguciuk et al.
Normal-NeRF: Ambiguity-Robust Normal Estimation for Highly Reflective Scenes
Ji Shi, Xianghua Ying, Ruohao Guo et al.
Neural Block Compression: Variable Bitrates Feature Blocks for Texture Representation
Rui Shi, Yishun Dou, Zhong Zheng et al.
HS-FPN: High Frequency and Spatial Perception FPN for Tiny Object Detection
Zican Shi, Jing Hu, Jie Ren et al.
SdalsNet: Self-Distilled Attention Localization and Shift Network for Unsupervised Camouflaged Object Detection
Peiyao Shou, Yixiu Liu, Wei Wang et al.
OGP-Net: Optical Guidance Meets Pixel-Level Contrastive Distillation for Robust Multi-Modal and Missing Modality Segmentation
Aniruddh Sikdar, Jayant Teotia, Suresh Sundaram
Fine-Grained Perception in Panoramic Scenes: A Novel Task, Dataset, and Method for Object Importance Ranking
Jia Song, Chenglizhao Chen, Xu Yu et al.
CtrlAvatar: Controllable Avatars Generation via Disentangled Invertible Networks
Wenfeng Song, Yang Ding, Fei Hou et al.
ERL-MPP: Evolutionary Reinforcement Learning with Multi-head Puzzle Perception for Solving Large-scale Jigsaw Puzzles of Eroded Gaps
Xingke Song, Xiaoying Yang, Chenglin Yao et al.
Temporal Coherent Object Flow for Multi-Object Tracking
Zikai Song, Run Luo, Lintao Ma et al.
Toward Improving Robustness and Accuracy in Unsupervised Domain Adaptation
Aishwarya Soni, Tanima Dutta
Hierarchical Vector Quantization for Unsupervised Action Segmentation
Federico Spurio, Emad Bahrami, Gianpiero Francesca et al.
Can We Get Rid of Handcrafted Feature Extractors? SparseViT: Nonsemantics-Centered, Parameter-Efficient Image Manipulation Localization Through Spare-Coding Transformer
Lei Su, Xiaochen Ma, Xuekang Zhu et al.
EigenSR: Eigenimage-Bridged Pre-Trained RGB Learners for Single Hyperspectral Image Super-Resolution
Xi Su, Xiangfei Shen, Mingyang Wan et al.
Dual-branch Graph Feature Learning for NLOS Imaging
Xiongfei Su, Tianyi Zhu, Lina Liu et al.
Explicit Relational Reasoning Network for Scene Text Detection
Yuchen Su, Zhineng Chen, Yongkun Du et al.
3D Annotation-Free Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving
Boyi Sun, Yuhang Liu, Xingxia Wang et al.
NeuralFlix: A Simple While Effective Framework for Semantic Decoding of Videos from Non-invasive Brain Recordings
Jingyuan Sun, Mingxiao Li, Marie-Francine Moens
Guided and Variance-Corrected Fusion with One-shot Style Alignment for Large-Content Image Generation
Shoukun Sun, Min Xian, Tiankai Yao et al.
M2Flow: A Motion Information Fusion Framework for Enhanced Unsupervised Optical Flow Estimation in Autonomous Driving
Xunpei Sun, Gang Chen, Zuoxun Hou
Leveraging Large Vision-Language Model as User Intent-Aware Encoder for Composed Image Retrieval
Zelong Sun, Dong Jing, Guoxing Yang et al.
C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection
Chuangchuang Tan, Renshuai Tao, Huan Liu et al.
Neighbor Does Matter: Density-Aware Contrastive Learning for Medical Semi-supervised Segmentation
Feilong Tang, Zhongxing Xu, Ming Hu et al.
MUSE: Mamba Is Efficient Multi-scale Learner for Text-video Retrieval
Haoran Tang, Meng Cao, Jinfa Huang et al.
BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving
Tao Tang, Dafeng Wei, Zhengyu Jia et al.
More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding
Yuan Tang, Xu Han, Xianzhi Li et al.
RAGG: Retrieval-Augmented Grasp Generation Model
Zhenhua Tang, Bin Zhu, Yanbin Hao et al.
From Representation Space to Prognostic Insights: Whole Slide Image Generation with Hierarchical Diffusion Model for Survival Prediction
Zhihao Tang, Xi Zhang, Chaozhuo Li
3D²-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling
Zichen Tang, Hongyu Yang, Hanchen Zhang et al.
Stitch, Contrast, and Segment: Learning a Human Action Segmentation Model Using Trimmed Skeleton Videos
Haitao Tian, Pierre Payeur
Unsupervised Self-Prior Embedding Neural Representation for Iterative Sparse-View CT Reconstruction
Xuanyu Tian, Lixuan Chen, Qing Wu et al.
AI-generated Image Quality Assessment in Visual Communication
Yu Tian, Yixuan Li, Baoliang Chen et al.
G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o
Tony Cheng Tong, Sirui He, Zhiwen Shao et al.
Memory-Augmented Re-Completion for 3D Semantic Scene Completion
Yu-Wen Tseng, Sheng-Ping Yang, Jhih-Ciang Wu et al.
TextToucher: Fine-Grained Text-to-Touch Generation
Jiahang Tu, Hao Fu, Fengyu Yang et al.
Watch Video, Catch Keyword: Context-aware Keyword Attention for Moment Retrieval and Highlight Detection
Sung Jin Um, Dongjin Kim, Sangmin Lee et al.
VOILA: Complexity-Aware Universal Segmentation of CT Images by Voxel Interacting with Language
Zishuo Wan, Yu Gao, Wanyuan Pang et al.
ParGo: Bridging Vision-Language with Partial and Global Views
An-Lan Wang, Bin Shan, Wei Shi et al.
RA-GAR: A Richly Annotated Benchmark for Gait Attribute Recognition
Chenye Wang, Saihui Hou, Aoqi Li et al.
Towards Efficient Object Re-Identification with a Novel Cloud-Edge Collaborative Framework
Chuanming Wang, Yuxin Yang, Mengshi Qi et al.
Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance
Cunzheng Wang, Ziyuan Guo, Yuxuan Duan et al.
A Black-Box Evaluation Framework for Semantic Robustness in Bird’s Eye View Detection
Fu Wang, Yanghao Zhang, Xiangyu Yin et al.
Scene Graph-Grounded Image Generation
Fuyun Wang, Tong Zhang, Yuanzhi Wang et al.
S³-Mamba: Small-Size-Sensitive Mamba for Lesion Segmentation
Gui Wang, Yuexiang Li, Wenting Chen et al.
BLS-GAN: A Deep Layer Separation Framework for Eliminating Bone Overlap in Conventional Radiographs
Haolin Wang, Yafei Ou, Prasoon Ambalathankandy et al.
EMControl: Adding Conditional Control to Text-to-Image Diffusion Models via Expectation-Maximization
He Wang, Longquan Dai, Jinhui Tang
M2OST: Many-to-one Regression for Predicting Spatial Transcriptomics from Digital Pathology Images
Hongyi Wang, Xiuju Du, Jing Liu et al.
RAP-SR: RestorAtion Prior Enhancement in Diffusion Models for Realistic Image Super-Resolution
Jiangang Wang, Qingnan Fan, Jinwei Chen et al.
MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding
Jiaze Wang, Yi Wang, Ziyu Guo et al.
OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision
Junjie Wang, Bin Chen, Bin Kang et al.
InpDiffusion: Image Inpainting Localization via Conditional Diffusion Models
Kai Wang, Shaozhang Niu, Qixian Hao et al.
Tracking Everything Everywhere across Multiple Cameras
Li-Heng Wang, YuJu Cheng, Tyng-Luh Liu
VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion
Meng Wang, Huilong Pi, Ruihui Li et al.
Deep Multi-modal Graph Clustering via Graph Transformer Network
Qianqian Wang, Haiming Xu, Zihao Zhang et al.
The Parables of the Mustard Seed and the Yeast: Extremely Low-Budget, High-Performance Nighttime Semantic Segmentation
Shiqin Wang, Xin Xu, Haoyang Chen et al.
GFlow: Recovering 4D World from Monocular Video
Shizun Wang, Xingyi Yang, Qiuhong Shen et al.
Imagine: Image-Guided 3D Part Assembly with Structure Knowledge Graph
Weihao Wang, Yu Lan, Mingyu You et al.
MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences
Weitao Wang, Haoran Xu, Yuxiao Yang et al.
FreeGen: Bridging Visual-Linguistic Discrepancies Towards Diffusion-based Pixel-level Data Synthesis
Wenzhuang Wang, Mingcan Ma, Yong Chen et al.
DCTMamba: Advancing JPEG Image Restoration Through Long-Sequence Modeling and Adaptive Frequency Strategy
Xi Wang, Xueyang Fu, Liang Li et al.
From 2D CAD Drawings to 3D Parametric Models: A Vision-Language Approach
Xilin Wang, Jia Zheng, Yuanchao Hu et al.
Lifting Scheme-Based Implicit Disentanglement of Emotion-Related Facial Dynamics in the Wild
Xingjian Wang, Li Chai
MIMTrack: In-Context Tracking via Masked Image Modeling
Xingmei Wang, Guohao Nie, Jiaxiang Meng et al.
From Coarse to Fine: A Matching and Alignment Framework for Unsupervised Cross-View Geo-Localization
Xueyi Wang, Lele Zhang, Zheng Fan et al.