Most Cited 2025 "non-linear vaes" Papers

22,274 papers found

#19801

DMF-Net: Image-Guided Point Cloud Completion with Dual-Channel Modality Fusion and Shape-Aware Upsampling Transformer

Aihua Mao, Yuxuan Tang, Jiangtao Huang et al.

AAAI 2025 • arXiv:2406.17319
#19802

Black-Box Test-Time Prompt Tuning for Vision-Language Models

Fan'an Meng, Chaoran Cui, Hongjun Dai et al.

AAAI 2025
#19803

Sp3ctralMamba: Physics-Driven Joint State Space Model for Hyperspectral Image Reconstruction

Ge Meng, Jingyan Tu, Jingjia Huang et al.

AAAI 2025
#19804

Qua2SeDiMo: Quantifiable Quantization Sensitivity of Diffusion Models

Keith G. Mills, Mohammad Salameh, Ruichen Chen et al.

AAAI 2025
#19805

Energy vs. Noise: Towards Robust Temporal Action Localization in Open-World

Chenyu Mu, Jiahua Li, Kun Wei et al.

AAAI 2025
#19806

SegFace: Face Segmentation of Long-Tail Classes

Kartik Narayan, Vibashan Vs, Vishal M. Patel

AAAI 2025 • arXiv:2412.08647
#19807

HiGDA: Hierarchical Graph of Nodes to Learn Local-to-Global Topology for Semi-Supervised Domain Adaptation

Ba Hung Ngo, Doanh C. Bui, Nhat-Tuong Do-Tran et al.

AAAI 2025 • arXiv:2412.11819
#19808

iMoT: Inertial Motion Transformer for Inertial Navigation

Son Minh Nguyen, Duc Viet Le, Paul Havinga

AAAI 2025 • arXiv:2412.12190
#19809

SPU-IMR: Self-supervised Arbitrary-scale Point Cloud Upsampling via Iterative Mask-recovery Network

Ziming Nie, Qiao Wu, Chenlei Lv et al.

AAAI 2025 • arXiv:2502.19452
#19810

Exploring Semantic Consistency and Style Diversity for Domain Generalized Semantic Segmentation

Hongwei Niu, Linhuang Xie, Jianghang Lin et al.

AAAI 2025 • arXiv:2412.12050
#19811

Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community

Jiancheng Pan, Yanxing Liu, Yuqian Fu et al.

AAAI 2025 • arXiv:2408.09110
#19812

Learning with Open-world Noisy Data via Class-independent Margin in Dual Representation Space

Linchao Pan, Can Gao, Jie Zhou et al.

AAAI 2025 • arXiv:2501.11053
#19813

DuSSS: Dual Semantic Similarity-Supervised Vision-Language Model for Semi-Supervised Medical Image Segmentation

Qingtao Pan, Wenhao Qiao, Jingjiao Lou et al.

AAAI 2025 • arXiv:2412.12492
#19814

Fair Training with Zero Inputs

Wenjie Pan, Jianqing Zhu, Huanqiang Zeng

AAAI 2025
#19815

Procedure Knowledge Decoupled Distillation Strategy for Procedure Planning in Instructional Videos

Xiaotian Pan, Zhaobo Qi, Xin Sun et al.

AAAI 2025
#19816

S2S2: Semantic Stacking for Robust Semantic Segmentation in Medical Imaging

Yimu Pan, Sitao Zhang, Alison D. Gernand et al.

AAAI 2025 • arXiv:2412.13156
#19817

Point Cloud Semantic Segmentation with Sparse and Inhomogeneous Annotations

Zhiyi Pan, Nan Zhang, Wei Gao et al.

AAAI 2025 • arXiv:2312.06259
#19818

Modular-Cam: Modular Dynamic Camera-view Video Generation with LLM

Zirui Pan, Xin Wang, Yipeng Zhang et al.

AAAI 2025 • arXiv:2504.12048
#19819

Partially Blinded Unlearning: Class Unlearning for Deep Networks from Bayesian Perspective

Subhodip Panda, Shashwat Sourav, Prathosh A.P.

AAAI 2025
#19820

Beyond Text: Fine-Grained Multi-Modal Fact Verification with Hypergraph Transformers

Hui Pang, Chaozhuo Li, Litian Zhang et al.

AAAI 2025
#19821

SeeDiff: Off-the-Shelf Seeded Mask Generation from Diffusion Models

Joon Hyun Park, Kumju Jo, Sungyong Baik

AAAI 2025 • arXiv:2507.19808
#19822

EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba

Xiaohuan Pei, Tao Huang, Chang Xu

AAAI 2025 • arXiv:2403.09977
#19823

CDE-Learning: Camera Deviation Elimination Learning for Unsupervised Person Re-identification

Jinjia Peng, Songyu Zhang, Huibing Wang

AAAI 2025
#19824

Adaptive Dual-domain Learning for Underwater Image Enhancement

Lintao Peng, Liheng Bian

AAAI 2025 • arXiv:2504.19198
#19825

Boosting Image De-Raining via Central-Surrounding Synergistic Convolution

Long Peng, Yang Wang, Xin Di et al.

AAAI 2025
#19826

3D-aware Select, Expand, and Squeeze Token for Aerial Action Recognition

Luying Peng, Xiangbo Shu, Yazhou Yao et al.

AAAI 2025
#19827

OAMaskFlow: Occlusion-Aware Motion Mask for Scene Flow

Xiongfeng Peng, Zhihua Liu, Weiming Li et al.

AAAI 2025
#19828

HVDualformer: Histogram-Vision Dual Transformer for White Balance

Yan-Tsung Peng, Guan-Rong Chen

AAAI 2025
#19829

Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance

Duc-Hai Pham, Duc-Dung Nguyen, Anh Pham et al.

AAAI 2025 • arXiv:2408.11559
#19830

Leveraging Anatomical Consistency for Multi-Object Detection in Ultrasound Images via Source-free Unsupervised Domain Adaptation

Bin Pu, Xingguo Lv, Jiewen Yang et al.

AAAI 2025
#19831

Dive into Aerial Remote Sensing Underwater Depth Estimation with Hyperspectral Imagery

Jiahao Qi, Xingyue Liu, Chen Chen et al.

AAAI 2025
#19832

Unsupervised Domain Adaptive Person Search via Dual Self-Calibration

Linfeng Qi, Huibing Wang, Jiqing Zhang et al.

AAAI 2025 • arXiv:2412.16506
#19833

PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement

Wei Qian, Gaoji Su, Dan Guo et al.

AAAI 2025
#19834

Holistic Correction with Object Prototype for Video Object Segmentation

Shengye Qiao, Changqun Xia, Yanjie Liang et al.

AAAI 2025
#19835

Integrating Low-Level Visual Cues for Enhanced Unsupervised Semantic Segmentation

Yuhao Qing, Dan Zeng, Shaorong Xie et al.

AAAI 2025
#19836

PC-BEV: An Efficient Polar-Cartesian BEV Fusion Framework for LiDAR Semantic Segmentation

Shoumeng Qiu, Xinrun Li, Xiangyang Xue et al.

AAAI 2025 • arXiv:2412.14821
#19837

High-Fidelity Polarimetric Implicit 3D Reconstruction with View-Dependent Physical Representation

Yu Qiu, Sijia Wen, Hainan Zhang et al.

AAAI 2025
#19838

HSOD-BIT-V2: A Challenging Benchmark for Hyperspectral Salient Object Detection

Yuhao Qiu, Shuyan Bai, Tingfa Xu et al.

AAAI 2025
#19839

Universal Features Guided Zero-Shot Category-Level Object Pose Estimation

Wentian Qu, Chenyu Meng, Heng Li et al.

AAAI 2025 • arXiv:2501.02831
#19840

GHOST: Gaussian Hypothesis Open-Set Technique

Ryan Rabinowitz, Steve Cruz, Manuel Günther et al.

AAAI 2025 • arXiv:2502.03359
#19841

CDTR: Semantic Alignment for Video Moment Retrieval Using Concept Decomposition Transformer

Ran Ran, Jiwei Wei, Xiangyi Cai et al.

AAAI 2025
#19842

Improving Integrated Gradient-based Transferable Adversarial Examples by Refining the Integration Path

Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.

AAAI 2025 • arXiv:2412.18844
#19843

GenHMR: Generative Human Mesh Recovery

Muhammad Usama Saleem, Ekkasit Pinyoanuntapong, Pu Wang et al.

AAAI 2025 • arXiv:2412.14444
#19844

FunEditor: Achieving Complex Image Edits via Function Aggregation with Diffusion Models

Mohammadreza Samadi, Fred X. Han, Mohammad Salameh et al.

AAAI 2025 • arXiv:2408.08495
#19845

PVTree: Realistic and Controllable Palm Vein Generation for Recognition Tasks

Sheng Shang, Chenglong Zhao, Ruixin Zhang et al.

AAAI 2025 • arXiv:2503.02547
#19846

Video Summarization Using Denoising Diffusion Probabilistic Model

Zirui Shang, Yubo Zhu, Hongxi Li et al.

AAAI 2025 • arXiv:2412.08357
#19847

IMAGDressing-v1: Customizable Virtual Dressing

Fei Shen, Xin Jiang, Xin He et al.

AAAI 2025 • arXiv:2407.12705
#19848

In2NeCT: Inter-class and Intra-class Neural Collapse Tuning for Semantic Segmentation of Imbalanced Remote Sensing Images

Junao Shen, Qiyun Hu, Tian Feng et al.

AAAI 2025
#19849

Topology-Aware 3D Gaussian Splatting: Leveraging Persistent Homology for Optimized Structural Integrity

Tianqi Shen, Shaohua Liu, Jiaqi Feng et al.

AAAI 2025 • arXiv:2412.16619
#19850

Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera

Haixin Shi, Yinlin Hu, Daniel Koguciuk et al.

AAAI 2025 • arXiv:2405.05858
#19851

Normal-NeRF: Ambiguity-Robust Normal Estimation for Highly Reflective Scenes

Ji Shi, Xianghua Ying, Ruohao Guo et al.

AAAI 2025 • arXiv:2501.09460
#19852

Neural Block Compression: Variable Bitrates Feature Blocks for Texture Representation

Rui Shi, Yishun Dou, Zhong Zheng et al.

AAAI 2025
#19853

HS-FPN: High Frequency and Spatial Perception FPN for Tiny Object Detection

Zican Shi, Jing Hu, Jie Ren et al.

AAAI 2025 • arXiv:2412.10116
#19854

SdalsNet: Self-Distilled Attention Localization and Shift Network for Unsupervised Camouflaged Object Detection

Peiyao Shou, Yixiu Liu, Wei Wang et al.

AAAI 2025
#19855

OGP-Net: Optical Guidance Meets Pixel-Level Contrastive Distillation for Robust Multi-Modal and Missing Modality Segmentation

Aniruddh Sikdar, Jayant Teotia, Suresh Sundaram

AAAI 2025
#19856

Fine-Grained Perception in Panoramic Scenes: A Novel Task, Dataset, and Method for Object Importance Ranking

Jia Song, Chenglizhao Chen, Xu Yu et al.

AAAI 2025
#19857

CtrlAvatar: Controllable Avatars Generation via Disentangled Invertible Networks

Wenfeng Song, Yang Ding, Fei Hou et al.

AAAI 2025
#19858

ERL-MPP: Evolutionary Reinforcement Learning with Multi-head Puzzle Perception for Solving Large-scale Jigsaw Puzzles of Eroded Gaps

Xingke Song, Xiaoying Yang, Chenglin Yao et al.

AAAI 2025 • arXiv:2504.09608
#19859

Temporal Coherent Object Flow for Multi-Object Tracking

Zikai Song, Run Luo, Lintao Ma et al.

AAAI 2025
#19860

Toward Improving Robustness and Accuracy in Unsupervised Domain Adaptation

Aishwarya Soni, Tanima Dutta

AAAI 2025
#19861

Hierarchical Vector Quantization for Unsupervised Action Segmentation

Federico Spurio, Emad Bahrami, Gianpiero Francesca et al.

AAAI 2025 • arXiv:2412.17640
#19862

Can We Get Rid of Handcrafted Feature Extractors? SparseViT: Nonsemantics-Centered, Parameter-Efficient Image Manipulation Localization Through Spare-Coding Transformer

Lei Su, Xiaochen Ma, Xuekang Zhu et al.

AAAI 2025 • arXiv:2412.14598
#19863

EigenSR: Eigenimage-Bridged Pre-Trained RGB Learners for Single Hyperspectral Image Super-Resolution

Xi Su, Xiangfei Shen, Mingyang Wan et al.

AAAI 2025 • arXiv:2409.04050
#19864

Dual-branch Graph Feature Learning for NLOS Imaging

Xiongfei Su, Tianyi Zhu, Lina Liu et al.

AAAI 2025 • arXiv:2502.19683
#19865

Explicit Relational Reasoning Network for Scene Text Detection

Yuchen Su, Zhineng Chen, Yongkun Du et al.

AAAI 2025 • arXiv:2412.14692
#19866

3D Annotation-Free Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving

Boyi Sun, Yuhang Liu, Xingxia Wang et al.

AAAI 2025 • arXiv:2405.15286
#19867

NeuralFlix: A Simple While Effective Framework for Semantic Decoding of Videos from Non-invasive Brain Recordings

Jingyuan Sun, Mingxiao Li, Marie-Francine Moens

AAAI 2025
#19868

Guided and Variance-Corrected Fusion with One-shot Style Alignment for Large-Content Image Generation

Shoukun Sun, Min Xian, Tiankai Yao et al.

AAAI 2025 • arXiv:2412.12771
#19869

M2Flow: A Motion Information Fusion Framework for Enhanced Unsupervised Optical Flow Estimation in Autonomous Driving

Xunpei Sun, Gang Chen, Zuoxun Hou

AAAI 2025
#19870

Leveraging Large Vision-Language Model as User Intent-Aware Encoder for Composed Image Retrieval

Zelong Sun, Dong Jing, Guoxing Yang et al.

AAAI 2025 • arXiv:2412.11087
#19871

C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection

Chuangchuang Tan, Renshuai Tao, Huan Liu et al.

AAAI 2025 • arXiv:2408.09647
#19872

Neighbor Does Matter: Density-Aware Contrastive Learning for Medical Semi-supervised Segmentation

Feilong Tang, Zhongxing Xu, Ming Hu et al.

AAAI 2025 • arXiv:2412.19871
#19873

MUSE: Mamba Is Efficient Multi-scale Learner for Text-video Retrieval

Haoran Tang, Meng Cao, Jinfa Huang et al.

AAAI 2025 • arXiv:2408.10575
#19874

BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving

Tao Tang, Dafeng Wei, Zhengyu Jia et al.

AAAI 2025 • arXiv:2401.01065
#19875

More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding

Yuan Tang, Xu Han, Xianzhi Li et al.

AAAI 2025 • arXiv:2408.15966
#19876

RAGG: Retrieval-Augmented Grasp Generation Model

Zhenhua Tang, Bin Zhu, Yanbin Hao et al.

AAAI 2025
#19877

From Representation Space to Prognostic Insights: Whole Slide Image Generation with Hierarchical Diffusion Model for Survival Prediction

Zhihao Tang, Xi Zhang, Chaozhuo Li

AAAI 2025
#19878

3D²-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling

Zichen Tang, Hongyu Yang, Hanchen Zhang et al.

AAAI 2025
#19879

Stitch, Contrast, and Segment: Learning a Human Action Segmentation Model Using Trimmed Skeleton Videos

Haitao Tian, Pierre Payeur

AAAI 2025
#19880

Unsupervised Self-Prior Embedding Neural Representation for Iterative Sparse-View CT Reconstruction

Xuanyu Tian, Lixuan Chen, Qing Wu et al.

AAAI 2025 • arXiv:2502.05445
#19881

AI-generated Image Quality Assessment in Visual Communication

Yu Tian, Yixuan Li, Baoliang Chen et al.

AAAI 2025 • arXiv:2412.15677
#19882

G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o

Tony Cheng Tong, Sirui He, Zhiwen Shao et al.

AAAI 2025 • arXiv:2412.13647
#19883

Memory-Augmented Re-Completion for 3D Semantic Scene Completion

Yu-Wen Tseng, Sheng-Ping Yang, Jhih-Ciang Wu et al.

AAAI 2025
#19884

TextToucher: Fine-Grained Text-to-Touch Generation

Jiahang Tu, Hao Fu, Fengyu Yang et al.

AAAI 2025 • arXiv:2409.05427
#19885

Watch Video, Catch Keyword: Context-aware Keyword Attention for Moment Retrieval and Highlight Detection

Sung Jin Um, Dongjin Kim, Sangmin Lee et al.

AAAI 2025 • arXiv:2501.02504
#19886

VOILA: Complexity-Aware Universal Segmentation of CT Images by Voxel Interacting with Language

Zishuo Wan, Yu Gao, Wanyuan Pang et al.

AAAI 2025 • arXiv:2501.03482
#19887

ParGo: Bridging Vision-Language with Partial and Global Views

An-Lan Wang, Bin Shan, Wei Shi et al.

AAAI 2025 • arXiv:2408.12928
#19888

RA-GAR: A Richly Annotated Benchmark for Gait Attribute Recognition

Chenye Wang, Saihui Hou, Aoqi Li et al.

AAAI 2025
#19889

Towards Efficient Object Re-Identification with a Novel Cloud-Edge Collaborative Framework

Chuanming Wang, Yuxin Yang, Mengshi Qi et al.

AAAI 2025 • arXiv:2401.02041
#19890

Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance

Cunzheng Wang, Ziyuan Guo, Yuxuan Duan et al.

AAAI 2025 • arXiv:2409.01347
#19891

A Black-Box Evaluation Framework for Semantic Robustness in Bird’s Eye View Detection

Fu Wang, Yanghao Zhang, Xiangyu Yin et al.

AAAI 2025 • arXiv:2412.13913
#19892

Scene Graph-Grounded Image Generation

Fuyun Wang, Tong Zhang, Yuanzhi Wang et al.

AAAI 2025
#19893

S³-Mamba: Small-Size-Sensitive Mamba for Lesion Segmentation

Gui Wang, Yuexiang Li, Wenting Chen et al.

AAAI 2025
#19894

BLS-GAN: A Deep Layer Separation Framework for Eliminating Bone Overlap in Conventional Radiographs

Haolin Wang, Yafei Ou, Prasoon Ambalathankandy et al.

AAAI 2025 • arXiv:2409.07304
#19895

EMControl: Adding Conditional Control to Text-to-Image Diffusion Models via Expectation-Maximization

He Wang, Longquan Dai, Jinhui Tang

AAAI 2025
#19896

M2OST: Many-to-one Regression for Predicting Spatial Transcriptomics from Digital Pathology Images

Hongyi Wang, Xiuju Du, Jing Liu et al.

AAAI 2025 • arXiv:2409.15092
#19897

RAP-SR: RestorAtion Prior Enhancement in Diffusion Models for Realistic Image Super-Resolution

Jiangang Wang, Qingnan Fan, Jinwei Chen et al.

AAAI 2025 • arXiv:2412.07149
#19898

MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding

Jiaze Wang, Yi Wang, Ziyu Guo et al.

AAAI 2025 • arXiv:2405.18523
#19899

OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision

Junjie Wang, Bin Chen, Bin Kang et al.

AAAI 2025 • arXiv:2405.17913
#19900

InpDiffusion: Image Inpainting Localization via Conditional Diffusion Models

Kai Wang, Shaozhang Niu, Qixian Hao et al.

AAAI 2025 • arXiv:2501.02816
#19901

Tracking Everything Everywhere across Multiple Cameras

Li-Heng Wang, YuJu Cheng, Tyng-Luh Liu

AAAI 2025
#19902

VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion

Meng Wang, Huilong Pi, Ruihui Li et al.

AAAI 2025 • arXiv:2503.06219
#19903

Deep Multi-modal Graph Clustering via Graph Transformer Network

Qianqian Wang, Haiming Xu, Zihao Zhang et al.

AAAI 2025
#19904

The Parables of the Mustard Seed and the Yeast: Extremely Low-Budget, High-Performance Nighttime Semantic Segmentation

Shiqin Wang, Xin Xu, Haoyang Chen et al.

AAAI 2025
#19905

GFlow: Recovering 4D World from Monocular Video

Shizun Wang, Xingyi Yang, Qiuhong Shen et al.

AAAI 2025 • arXiv:2405.18426
#19906

Imagine: Image-Guided 3D Part Assembly with Structure Knowledge Graph

Weihao Wang, Yu Lan, Mingyu You et al.

AAAI 2025
#19907

MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences

Weitao Wang, Haoran Xu, Yuxiao Yang et al.

AAAI 2025 • arXiv:2412.06614
#19908

FreeGen: Bridging Visual-Linguistic Discrepancies Towards Diffusion-based Pixel-level Data Synthesis

Wenzhuang Wang, Mingcan Ma, Yong Chen et al.

AAAI 2025
#19909

DCTMamba: Advancing JPEG Image Restoration Through Long-Sequence Modeling and Adaptive Frequency Strategy

Xi Wang, Xueyang Fu, Liang Li et al.

AAAI 2025
#19910

From 2D CAD Drawings to 3D Parametric Models: A Vision-Language Approach

Xilin Wang, Jia Zheng, Yuanchao Hu et al.

AAAI 2025 • arXiv:2412.11892
#19911

Lifting Scheme-Based Implicit Disentanglement of Emotion-Related Facial Dynamics in the Wild

Xingjian Wang, Li Chai

AAAI 2025 • arXiv:2412.13168
#19912

MIMTrack: In-Context Tracking via Masked Image Modeling

Xingmei Wang, Guohao Nie, Jiaxiang Meng et al.

AAAI 2025
#19913

From Coarse to Fine: A Matching and Alignment Framework for Unsupervised Cross-View Geo-Localization

Xueyi Wang, Lele Zhang, Zheng Fan et al.

AAAI 2025
#19914

RefDetector: A Simple Yet Effective Matching-based Method for Referring Expression Comprehension

Yabing Wang, Zhuotao Tian, Zheng Qin et al.

AAAI 2025
#19915

Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension

Yaxian Wang, Henghui Ding, Shuting He et al.

AAAI 2025 • arXiv:2501.01416
#19916

Breaking Barriers in Physical-World Adversarial Examples: Improving Robustness and Transferability via Robust Feature

Yichen Wang, Yuxuan Chou, Ziqi Zhou et al.

AAAI 2025 • arXiv:2412.16958
#19917

Capturing the Unseen: Vision-Free Facial Motion Capture Using Inertial Measurement Units

Youjia Wang, Yiwen Wu, Hengan Zhou et al.

AAAI 2025 • arXiv:2402.03944
#19918

Re-Attentional Controllable Video Diffusion Editing

Yuanzhi Wang, Yong Li, Mengyi Liu et al.

AAAI 2025 • arXiv:2412.11710
#19919

MambaPro: Multi-Modal Object Re-identification with Mamba Aggregation and Synergistic Prompt

Yuhao Wang, Xuehu Liu, Tianyu Yan et al.

AAAI 2025 • arXiv:2412.10707
#19920

IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis

Yuji Wang, Jingchen Ni, Yong Liu et al.

AAAI 2025 • arXiv:2503.00936
#19921

Target Scanpath-Guided 360-Degree Image Enhancement

Yujia Wang, Fang-Lue Zhang, Neil A. Dodgson

AAAI 2025
#19922

DualNet: Robust Self-Supervised Stereo Matching with Pseudo-Label Supervision

Yun Wang, Jiahao Zheng, Chenghao Zhang et al.

AAAI 2025
#19923

Mamba YOLO: A Simple Baseline for Object Detection with State Space Model

Zeyu Wang, Chen Li, Huiying Xu et al.

AAAI 2025 • arXiv:2406.05835
#19924

Style Nursing with Spatial and Semantic Guidance for Zero-Shot Traffic Scene Style Transfer

Zhen Wang, Zihang Lin, Meng Yuan et al.

AAAI 2025
#19925

Thermal-Aware Low-Light Image Enhancement: A Real-World Benchmark and a New Light-Weight Model

Zhen Wang, Yaozu Wu, Dongyuan Li et al.

AAAI 2025
#19926

Attention-Imperceptible Backdoor Attacks on Vision Transformers

Zhishen Wang, Rui Wang, Lihua Jing

AAAI 2025
#19927

LLM-RG4: Flexible and Factual Radiology Report Generation Across Diverse Input Contexts

Zhuhao Wang, Yihua Sun, Zihan Li et al.

AAAI 2025 • arXiv:2412.12001
#19928

MSV-PCT: Multi-Sparse-View Enhanced Transformer Framework for Salient Object Detection in Point Clouds

Zihao Wang, Yiming Huang, Gengyu Lyu et al.

AAAI 2025
#19929

GlyphSR: A Simple Glyph-Aware Framework for Scene Text Image Super-Resolution

Baole Wei, Yuxuan Zhou, Liangcai Gao et al.

AAAI 2025
#19930

Power of Diversity: Enhancing Data-Free Black-Box Attack with Domain-Augmented Learning

Yang Wei, Jingyu Tan, Guowen Xu et al.

AAAI 2025
#19931

Achieving Lightweight Super-Resolution for Real-Time Computer Graphics

Yu Wen, Chen Zhang, Chenhao Xie et al.

AAAI 2025
#19932

Multi-axis Prompt and Multi-dimension Fusion Network for All-in-one Weather-degraded Image Restoration

Yuanbo Wen, Tao Gao, Jing Zhang et al.

AAAI 2025
#19933

USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation

Wanjiang Weng, Hongsong Wang, Junbo Wang et al.

AAAI 2025 • arXiv:2412.09220
#19934

Spin: Diffusion-based Semantic Image Painting Through Independent Information Injection

Dantong Wu, Zhiqiang Chen, Tianjiao Du et al.

AAAI 2025
#19935

Structural Pruning via Spatial-aware Information Redundancy for Semantic Segmentation

Dongyue Wu, Zilin Guo, Li Yu et al.

AAAI 2025 • arXiv:2412.12672
#19936

SVRMamba: Slice-to-Volume Reconstruction from Multiple MRI Stacks with Slice Sequence Guided Mamba

Jiangjie Wu, Hongjiang Wei, Yuyao Zhang

AAAI 2025
#19937

VarCMP: Adapting Cross-Modal Pre-Training Models for Video Anomaly Retrieval

Peng Wu, Wanshun Su, Xiangteng He et al.

AAAI 2025
#19938

Realistic Noise Synthesis with Diffusion Models

Qi Wu, Mingyan Han, Ting Jiang et al.

AAAI 2025 • arXiv:2305.14022
#19939

PanAdapter: Two-Stage Fine-Tuning with Spatial-Spectral Priors Injecting for Pansharpening

RuoCheng Wu, Zien Zhang, Shangqi Deng et al.

AAAI 2025 • arXiv:2409.06980
#19940

Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning

Shengqiong Wu, Hao Fei, Liangming Pan et al.

AAAI 2025 • arXiv:2412.11124
#19941

CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities

Tao Wu, Yong Zhang, Xintao Wang et al.

AAAI 2025 • arXiv:2408.13239
#19942

Deconfound Semantic Shift and Incompleteness in Incremental Few-shot Semantic Segmentation

Yirui Wu, Yuhang Xia, Hao Li et al.

AAAI 2025
#19943

Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark

Yongliang Wu, Wenbo Zhu, Jiawang Cao et al.

AAAI 2025 • arXiv:2412.08879
#19944

MUCD: Unsupervised Point Cloud Change Detection via Masked Consistency

Yue Wu, Zhipeng Wang, Yongzhe Yuan et al.

AAAI 2025
#19945

Unified Knowledge Maintenance Pruning and Progressive Recovery with Weight Recalling for Large Vision-Language Models

Zimeng Wu, Jiaxin Chen, Yunhong Wang

AAAI 2025
#19946

RETRACTED: GEONet: Global Enhancement and Optimization Network for Lane Detection

Suyang Xi, Yunhao Liu, Hong Ding et al.

AAAI 2025
#19947

PlaNet: Learning to Mitigate Atmospheric Turbulence in Planetary Images

Yifei Xia, Chu Zhou, Chengxuan Zhu et al.

AAAI 2025
#19948

CA-Edit: Causality-Aware Condition Adapter for High-Fidelity Local Facial Attribute Editing

Xiaole Xian, Xilin He, Zenghao Niu et al.

AAAI 2025 • arXiv:2412.13565
#19949

ReMask-Animate: Refined Character Image Animation Using Mask-Guided Adapters

Xunzhi Xiang, Haiwei Xue, Zonghong Dai et al.

AAAI 2025
#19950

SMR-Net: Semantic-Guided Mutually Reinforcing Network for Cross-Modal Image Fusion and Salient Object Detection

Guobao Xiao, Xinyu Liu, Zebin Lin et al.

AAAI 2025
#19951

Boosting Vision State Space Model with Fractal Scanning

Haoke Xiao, Lv Tang, Peng-tao Jiang et al.

AAAI 2025
#19952

Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval

Jian Xiao, Zhenzhen Hu, Jia Li et al.

AAAI 2025 • arXiv:2410.06618
#19953

Cross-modulated Attention Transformer for RGBT Tracking

Yun Xiao, Jiacong Zhao, Andong Lu et al.

AAAI 2025 • arXiv:2408.02222
#19954

Omni-Query Active Learning for Source-Free Domain Adaptive Cross-Modality 3D Semantic Segmentation

Jianxiang Xie, Yao Wu, Yachao Zhang et al.

AAAI 2025
#19955

TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning

Jingjing Xie, Yuxin Zhang, Jun Peng et al.

AAAI 2025 • arXiv:2412.08176
#19956

Discrete Prior-Based Temporal-Coherent Content Prediction for Blind Face Video Restoration

Lianxin Xie, Bingbing Zheng, Wen Xue et al.

AAAI 2025 • arXiv:2501.09960
#19957

Expand VSR Benchmark for VLLM to Expertize in Spatial Rules

Peijin Xie, Lin Sun, Bingquan Liu et al.

AAAI 2025 • arXiv:2412.18224
#19958

PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis

Yifan Xie, Tao Feng, Xin Zhang et al.

AAAI 2025 • arXiv:2412.08504
#19959

HieraFashDiff: Hierarchical Fashion Design with Multi-stage Diffusion Models

Zhifeng Xie, Hao Li, Huiming Ding et al.

AAAI 2025 • arXiv:2401.07450
#19960

Few-Shot Incremental Learning via Foreground Aggregation and Knowledge Transfer for Audio-Visual Semantic Segmentation

Jingqiao Xiu, Mengze Li, Zongxin Yang et al.

AAAI 2025
#19961

DiffScene: Diffusion-Based Safety-Critical Scenario Generation for Autonomous Vehicles

Chejian Xu, Aleksandr Petiushko, Ding Zhao et al.

AAAI 2025
#19962

FR²Seg: Continual Segmentation Across Multiple Sites via Fourier Style Replay and Adaptive Consistency Regularization

Cheng Xu, Weiwen Zhang, Hongrui Zhang et al.

AAAI 2025
#19963

Less Is More: Token Context-Aware Learning for Object Tracking

Chenlong Xu, Bineng Zhong, Qihua Liang et al.

AAAI 2025 • arXiv:2501.00758
#19964

3DHumanEdit: Multi-modal Body Part-aware Conditioning Information Integration for 3D Human Manipulation

FeiFan Xu, Tianyi Chen, Fan Yang et al.

AAAI 2025
#19965

Motion Artifact Removal in Pixel-Frequency Domain via Alternate Masks and Diffusion Model

Jiahua Xu, Dawei Zhou, Lei Hu et al.

AAAI 2025 • arXiv:2412.07590
#19966

OmniSR: Shadow Removal Under Direct and Indirect Lighting

Jiamin Xu, Zelong Li, Yuxin Zheng et al.

AAAI 2025 • arXiv:2410.01719
#19967

Multiple Feature Refining Network for Visual Emotion Distribution Learning

Qinfu Xu, Shaozu Yuan, Yiwei Wei et al.

AAAI 2025
#19968

SCKD: Semi-Supervised Cross-Modality Knowledge Distillation for 4D Radar Object Detection

Ruoyu Xu, Zhiyu Xiang, Chenwei Zhang et al.

AAAI 2025 • arXiv:2412.14571
#19969

LiON: Learning Point-Wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data

Shaocong Xu, Pengfei Li, Qianpu Sun et al.

AAAI 2025 • arXiv:2309.10230
#19970

Zero-shot Video Moment Retrieval via Off-the-shelf Multimodal Large Language Models

Yifang Xu, Yunzhuo Sun, Benxiang Zhai et al.

AAAI 2025 • arXiv:2501.07972
#19971

HOIMamba: Efficient Mamba-based Disentangled Progressive Learning for HOI Detection

Yongchao Xu, Jiawei Liu, Sen Tao et al.

AAAI 2025
#19972

OOTDiffusion: Outfitting Fusion Based Latent Diffusion for Controllable Virtual Try-On

Yuhao Xu, Tao Gu, Weifeng Chen et al.

AAAI 2025 • arXiv:2403.01779
#19973

FLAME: Learning to Navigate with Multimodal LLM in Urban Environments

Yunzhe Xu, Yiyuan Pan, Zhe Liu et al.

AAAI 2025 • arXiv:2408.11051
#19974

FATE: Feature-Adapted Parameter Tuning for Vision-Language Models

Zhengqin Xu, Zelin Peng, Xiaokang Yang et al.

AAAI 2025
#19975

Toward Modality Gap: Vision Prototype Learning for Weakly-supervised Semantic Segmentation with CLIP

Zhongxing Xu, Feilong Tang, Zhe Chen et al.

AAAI 2025 • arXiv:2412.19650
#19976

RetouchGPT: LLM-based Interactive High-Fidelity Face Retouching via Imperfection Prompting

Wen Xue, Chun Ding, Ruotao Xu et al.

AAAI 2025
#19977

Physical Marker: Revealing Invisible Hyperlinks Hidden in Printed Trademarks

Yuliang Xue, Lei Tan, Guobiao Li et al.

AAAI 2025
#19978

Towards Universal Rainy Image Restoration: Benchmark and Baseline

Hujie Yan

AAAI 2025
#19979

SGTC: Semantic-Guided Triplet Co-training for Sparsely Annotated Semi-Supervised Medical Image Segmentation

Ke Yan, Qing Cai, Fan Zhang et al.

AAAI 2025 • arXiv:2412.15526
#19980

Data-Free Universal Attack by Exploiting the Intrinsic Vulnerability of Deep Models

YangTian Yan, Jinyu Tian

AAAI 2025 • arXiv:2503.22205
#19981

Robust Image Hashing Based on Contrastive Masked Autoencoder with Weak-Strong Augmentation Alignment

Cundian Yang, Guibo Luo, Yuesheng Zhu et al.

AAAI 2025
#19982

PlanLLM: Video Procedure Planning with Refinable Large Language Models

Dejie Yang, Zijing Zhao, Yang Liu

AAAI 2025 • arXiv:2412.19139
#19983

3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly Detection

Enquan Yang, Peng Xing, Hanyang Sun et al.

AAAI 2025
#19984

Diffusion Prior Interpolation for Flexibility Real-World Face Super-Resolution

Jiarui Yang, Tao Dai, Yufei Zhu et al.

AAAI 2025 • arXiv:2412.16552
#19985

SMamba: Sparse Mamba for Event-based Object Detection

Nan Yang, Yang Wang, Zhanwen Liu et al.

AAAI 2025 • arXiv:2501.11971
#19986

One-Shot Reference-based Structure-Aware Image to Sketch Synthesis

Rui Yang, Honghong Yang, Li Zhao et al.

AAAI 2025
#19987

LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding

Senqiao Yang, Jiaming Liu, Renrui Zhang et al.

AAAI 2025 • arXiv:2312.14074
#19988

Asymmetric Hierarchical Difference-aware Interaction Network for Event-guided Motion Deblurring

Wen Yang, Jinjian Wu, Leida Li et al.

AAAI 2025
#19989

Dual Information Purification for Lightweight SAR Object Detection

Xi Yang, Jiachen Sun, Songsong Duan et al.

AAAI 2025
#19990

DriveGazen: Event-Based Driving Status Recognition Using Conventional Camera

Xiaoyin Yang, Xin Yang

AAAI 2025 • arXiv:2412.11753
#19991

Semantic Segmentation on Raindrop Degraded Images Using Two-Stage Dual Teacher-Student Learning

Xin Yang, Wending Yan, Yuan Yuan et al.

AAAI 2025
#19992

ERF: A Benchmark Dataset for Robust Semantic Segmentation Under Extreme Rainfall Conditions

Xin Yang, Xin Zhang, Xinchao Wang

AAAI 2025
#19993

FreqTS: Frequency-Aware Token Selection for Accelerating Diffusion Models

Xinye Yang, Yuxin Yang, Haoran Pang et al.

AAAI 2025
#19994

Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving

Yu Yang, Jianbiao Mei, Yukai Ma et al.

AAAI 2025 • arXiv:2408.14197
#19995

UAWTrack: Universal 3D Single Object Tracking in Adverse Weather

Yuxiang Yang, Hongjie Gu, Yingqi Deng et al.

AAAI 2025
#19996

RealPortrait: Realistic Portrait Animation with Diffusion Transformers

Zejun Yang, Huawei Wei, Zhisheng Wang

AAAI 2025
#19997

Single Image Rolling Shutter Removal with Diffusion Models

Zhanglei Yang, Haipeng Li, Mingbo Hong et al.

AAAI 2025 • arXiv:2407.02906
#19998

MMGDreamer: Mixed-Modality Graph for Geometry-Controllable 3D Indoor Scene Generation

Zhifei Yang, Keyang Lu, Chao Zhang et al.

AAAI 2025 • arXiv:2502.05874
#19999

MoRe: Class Patch Attention Needs Regularization for Weakly Supervised Semantic Segmentation

Zhiwei Yang, Yucong Meng, Kexue Fu et al.

AAAI 2025 • arXiv:2412.11076
#20000

MM-Tracker: Motion Mamba for UAV-platform Multiple Object Tracking

Mufeng Yao, Jinlong Peng, Qingdong He et al.

AAAI 2025