Most Cited AAAI "fixed-viewpoint video" Papers

5,317 papers found • Page 17 of 27


#3201

HVDualformer: Histogram-Vision Dual Transformer for White Balance

Yan-Tsung Peng, Guan-Rong Chen

AAAI 2025 • paper

#3202

Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance

Duc-Hai Pham, Duc-Dung Nguyen, Anh Pham et al.

AAAI 2025 • paper • arXiv:2408.11559

#3203

Leveraging Anatomical Consistency for Multi-Object Detection in Ultrasound Images via Source-free Unsupervised Domain Adaptation

Bin Pu, Xingguo Lv, Jiewen Yang et al.

AAAI 2025 • paper

#3204

Dive into Aerial Remote Sensing Underwater Depth Estimation with Hyperspectral Imagery

Jiahao Qi, Xingyue Liu, Chen Chen et al.

AAAI 2025 • paper

#3205

PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement

Wei Qian, Gaoji Su, Dan Guo et al.

AAAI 2025 • paper

#3206

Holistic Correction with Object Prototype for Video Object Segmentation

Shengye Qiao, Changqun Xia, Yanjie Liang et al.

AAAI 2025 • paper

#3207

Integrating Low-Level Visual Cues for Enhanced Unsupervised Semantic Segmentation

Yuhao Qing, Dan Zeng, Shaorong Xie et al.

AAAI 2025 • paper

#3208

PC-BEV: An Efficient Polar-Cartesian BEV Fusion Framework for LiDAR Semantic Segmentation

Shoumeng Qiu, Xinrun Li, Xiangyang Xue et al.

AAAI 2025 • paper • arXiv:2412.14821

#3209

High-Fidelity Polarimetric Implicit 3D Reconstruction with View-Dependent Physical Representation

Yu Qiu, Sijia Wen, Hainan Zhang et al.

AAAI 2025 • paper

#3210

HSOD-BIT-V2: A Challenging Benchmark for Hyperspectral Salient Object Detection

Yuhao Qiu, Shuyan Bai, Tingfa Xu et al.

AAAI 2025 • paper

#3211

Universal Features Guided Zero-Shot Category-Level Object Pose Estimation

Wentian Qu, Chenyu Meng, Heng Li et al.

AAAI 2025 • paper • arXiv:2501.02831

#3212

GHOST: Gaussian Hypothesis Open-Set Technique

Ryan Rabinowitz, Steve Cruz, Manuel Günther et al.

AAAI 2025 • paper • arXiv:2502.03359

#3213

CDTR: Semantic Alignment for Video Moment Retrieval Using Concept Decomposition Transformer

Ran Ran, Jiwei Wei, Xiangyi Cai et al.

AAAI 2025 • paper

#3214

Improving Integrated Gradient-based Transferable Adversarial Examples by Refining the Integration Path

Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.

AAAI 2025 • paper • arXiv:2412.18844

#3215

GenHMR: Generative Human Mesh Recovery

Muhammad Usama Saleem, Ekkasit Pinyoanuntapong, Pu Wang et al.

AAAI 2025 • paper • arXiv:2412.14444

#3216

FunEditor: Achieving Complex Image Edits via Function Aggregation with Diffusion Models

Mohammadreza Samadi, Fred X. Han, Mohammad Salameh et al.

AAAI 2025 • paper • arXiv:2408.08495

#3217

PVTree: Realistic and Controllable Palm Vein Generation for Recognition Tasks

Sheng Shang, Chenglong Zhao, Ruixin Zhang et al.

AAAI 2025 • paper • arXiv:2503.02547

#3218

Video Summarization Using Denoising Diffusion Probabilistic Model

Zirui Shang, Yubo Zhu, Hongxi Li et al.

AAAI 2025 • paper • arXiv:2412.08357

#3219

IMAGDressing-v1: Customizable Virtual Dressing

Fei Shen, Xin Jiang, Xin He et al.

AAAI 2025 • paper • arXiv:2407.12705

#3220

In2NeCT: Inter-class and Intra-class Neural Collapse Tuning for Semantic Segmentation of Imbalanced Remote Sensing Images

Junao Shen, Qiyun Hu, Tian Feng et al.

AAAI 2025 • paper

#3221

Topology-Aware 3D Gaussian Splatting: Leveraging Persistent Homology for Optimized Structural Integrity

Tianqi Shen, Shaohua Liu, Jiaqi Feng et al.

AAAI 2025 • paper • arXiv:2412.16619

#3222

Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera

Haixin Shi, Yinlin Hu, Daniel Koguciuk et al.

AAAI 2025 • paper • arXiv:2405.05858

#3223

Normal-NeRF: Ambiguity-Robust Normal Estimation for Highly Reflective Scenes

Ji Shi, Xianghua Ying, Ruohao Guo et al.

AAAI 2025 • paper • arXiv:2501.09460

#3224

Neural Block Compression: Variable Bitrates Feature Blocks for Texture Representation

Rui Shi, Yishun Dou, Zhong Zheng et al.

AAAI 2025 • paper

#3225

HS-FPN: High Frequency and Spatial Perception FPN for Tiny Object Detection

Zican Shi, Jing Hu, Jie Ren et al.

AAAI 2025 • paper • arXiv:2412.10116

#3226

SdalsNet: Self-Distilled Attention Localization and Shift Network for Unsupervised Camouflaged Object Detection

Peiyao Shou, Yixiu Liu, Wei Wang et al.

AAAI 2025 • paper

#3227

OGP-Net: Optical Guidance Meets Pixel-Level Contrastive Distillation for Robust Multi-Modal and Missing Modality Segmentation

Aniruddh Sikdar, Jayant Teotia, Suresh Sundaram

AAAI 2025 • paper

#3228

Fine-Grained Perception in Panoramic Scenes: A Novel Task, Dataset, and Method for Object Importance Ranking

Jia Song, Chenglizhao Chen, Xu Yu et al.

AAAI 2025 • paper

#3229

CtrlAvatar: Controllable Avatars Generation via Disentangled Invertible Networks

Wenfeng Song, Yang Ding, Fei Hou et al.

AAAI 2025 • paper

#3230

ERL-MPP: Evolutionary Reinforcement Learning with Multi-head Puzzle Perception for Solving Large-scale Jigsaw Puzzles of Eroded Gaps

Xingke Song, Xiaoying Yang, Chenglin Yao et al.

AAAI 2025 • paper • arXiv:2504.09608

#3231

Temporal Coherent Object Flow for Multi-Object Tracking

Zikai Song, Run Luo, Lintao Ma et al.

AAAI 2025 • paper

#3232

Toward Improving Robustness and Accuracy in Unsupervised Domain Adaptation

Aishwarya Soni, Tanima Dutta

AAAI 2025 • paper

#3233

Hierarchical Vector Quantization for Unsupervised Action Segmentation

Federico Spurio, Emad Bahrami, Gianpiero Francesca et al.

AAAI 2025 • paper • arXiv:2412.17640

#3234

Can We Get Rid of Handcrafted Feature Extractors? SparseViT: Nonsemantics-Centered, Parameter-Efficient Image Manipulation Localization Through Spare-Coding Transformer

Lei Su, Xiaochen Ma, Xuekang Zhu et al.

AAAI 2025 • paper • arXiv:2412.14598

#3235

EigenSR: Eigenimage-Bridged Pre-Trained RGB Learners for Single Hyperspectral Image Super-Resolution

Xi Su, Xiangfei Shen, Mingyang Wan et al.

AAAI 2025 • paper • arXiv:2409.04050

#3236

Dual-branch Graph Feature Learning for NLOS Imaging

Xiongfei Su, Tianyi Zhu, Lina Liu et al.

AAAI 2025 • paper • arXiv:2502.19683

#3237

Explicit Relational Reasoning Network for Scene Text Detection

Yuchen Su, Zhineng Chen, Yongkun Du et al.

AAAI 2025 • paper • arXiv:2412.14692

#3238

3D Annotation-Free Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving

Boyi Sun, Yuhang Liu, Xingxia Wang et al.

AAAI 2025 • paper • arXiv:2405.15286

#3239

NeuralFlix: A Simple While Effective Framework for Semantic Decoding of Videos from Non-invasive Brain Recordings

Jingyuan Sun, Mingxiao Li, Marie-Francine Moens

AAAI 2025 • paper

#3240

Guided and Variance-Corrected Fusion with One-shot Style Alignment for Large-Content Image Generation

Shoukun Sun, Min Xian, Tiankai Yao et al.

AAAI 2025 • paper • arXiv:2412.12771

#3241

M2Flow: A Motion Information Fusion Framework for Enhanced Unsupervised Optical Flow Estimation in Autonomous Driving

Xunpei Sun, Gang Chen, Zuoxun Hou

AAAI 2025 • paper

#3242

Leveraging Large Vision-Language Model as User Intent-Aware Encoder for Composed Image Retrieval

Zelong Sun, Dong Jing, Guoxing Yang et al.

AAAI 2025 • paper • arXiv:2412.11087

#3243

C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection

Chuangchuang Tan, Renshuai Tao, Huan Liu et al.

AAAI 2025 • paper • arXiv:2408.09647

#3244

Neighbor Does Matter: Density-Aware Contrastive Learning for Medical Semi-supervised Segmentation

Feilong Tang, Zhongxing Xu, Ming Hu et al.

AAAI 2025 • paper • arXiv:2412.19871

#3245

MUSE: Mamba Is Efficient Multi-scale Learner for Text-video Retrieval

Haoran Tang, Meng Cao, Jinfa Huang et al.

AAAI 2025 • paper • arXiv:2408.10575

#3246

BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving

Tao Tang, Dafeng Wei, Zhengyu Jia et al.

AAAI 2025 • paper • arXiv:2401.01065

#3247

More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding

Yuan Tang, Xu Han, Xianzhi Li et al.

AAAI 2025 • paper • arXiv:2408.15966

#3248

RAGG: Retrieval-Augmented Grasp Generation Model

Zhenhua Tang, Bin Zhu, Yanbin Hao et al.

AAAI 2025 • paper

#3249

From Representation Space to Prognostic Insights: Whole Slide Image Generation with Hierarchical Diffusion Model for Survival Prediction

Zhihao Tang, Xi Zhang, Chaozhuo Li

AAAI 2025 • paper

#3250

3D²-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling

Zichen Tang, Hongyu Yang, Hanchen Zhang et al.

AAAI 2025 • paper

#3251

Stitch, Contrast, and Segment: Learning a Human Action Segmentation Model Using Trimmed Skeleton Videos

Haitao Tian, Pierre Payeur

AAAI 2025 • paper

#3252

Unsupervised Self-Prior Embedding Neural Representation for Iterative Sparse-View CT Reconstruction

Xuanyu Tian, Lixuan Chen, Qing Wu et al.

AAAI 2025 • paper • arXiv:2502.05445

#3253

AI-generated Image Quality Assessment in Visual Communication

Yu Tian, Yixuan Li, Baoliang Chen et al.

AAAI 2025 • paper • arXiv:2412.15677

#3254

G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o

Tony Cheng Tong, Sirui He, Zhiwen Shao et al.

AAAI 2025 • paper • arXiv:2412.13647

#3255

Memory-Augmented Re-Completion for 3D Semantic Scene Completion

Yu-Wen Tseng, Sheng-Ping Yang, Jhih-Ciang Wu et al.

AAAI 2025 • paper

#3256

VOILA: Complexity-Aware Universal Segmentation of CT Images by Voxel Interacting with Language

Zishuo Wan, Yu Gao, Wanyuan Pang et al.

AAAI 2025 • paper • arXiv:2501.03482

#3257

ParGo: Bridging Vision-Language with Partial and Global Views

An-Lan Wang, Bin Shan, Wei Shi et al.

AAAI 2025 • paper • arXiv:2408.12928

#3258

RA-GAR: A Richly Annotated Benchmark for Gait Attribute Recognition

Chenye Wang, Saihui Hou, Aoqi Li et al.

AAAI 2025 • paper

#3259

Towards Efficient Object Re-Identification with a Novel Cloud-Edge Collaborative Framework

Chuanming Wang, Yuxin Yang, Mengshi Qi et al.

AAAI 2025 • paper • arXiv:2401.02041

#3260

Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance

Cunzheng Wang, Ziyuan Guo, Yuxuan Duan et al.

AAAI 2025 • paper • arXiv:2409.01347

#3261

A Black-Box Evaluation Framework for Semantic Robustness in Bird’s Eye View Detection

Fu Wang, Yanghao Zhang, Xiangyu Yin et al.

AAAI 2025 • paper • arXiv:2412.13913

#3262

Scene Graph-Grounded Image Generation

Fuyun Wang, Tong Zhang, Yuanzhi Wang et al.

AAAI 2025 • paper

#3263

S³-Mamba: Small-Size-Sensitive Mamba for Lesion Segmentation

Gui Wang, Yuexiang Li, Wenting Chen et al.

AAAI 2025 • paper

#3264

BLS-GAN: A Deep Layer Separation Framework for Eliminating Bone Overlap in Conventional Radiographs

Haolin Wang, Yafei Ou, Prasoon Ambalathankandy et al.

AAAI 2025 • paper • arXiv:2409.07304

#3265

EMControl: Adding Conditional Control to Text-to-Image Diffusion Models via Expectation-Maximization

He Wang, Longquan Dai, Jinhui Tang

AAAI 2025 • paper

#3266

M2OST: Many-to-one Regression for Predicting Spatial Transcriptomics from Digital Pathology Images

Hongyi Wang, Xiuju Du, Jing Liu et al.

AAAI 2025 • paper • arXiv:2409.15092

#3267

RAP-SR: RestorAtion Prior Enhancement in Diffusion Models for Realistic Image Super-Resolution

Jiangang Wang, Qingnan Fan, Jinwei Chen et al.

AAAI 2025 • paper • arXiv:2412.07149

#3268

MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding

Jiaze Wang, Yi Wang, Ziyu Guo et al.

AAAI 2025 • paper • arXiv:2405.18523

#3269

OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision

Junjie Wang, Bin Chen, Bin Kang et al.

AAAI 2025 • paper • arXiv:2405.17913

#3270

InpDiffusion: Image Inpainting Localization via Conditional Diffusion Models

Kai Wang, Shaozhang Niu, Qixian Hao et al.

AAAI 2025 • paper • arXiv:2501.02816

#3271

Tracking Everything Everywhere across Multiple Cameras

Li-Heng Wang, YuJu Cheng, Tyng-Luh Liu

AAAI 2025 • paper

#3272

VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion

Meng Wang, Huilong Pi, Ruihui Li et al.

AAAI 2025 • paper • arXiv:2503.06219

#3273

Deep Multi-modal Graph Clustering via Graph Transformer Network

Qianqian Wang, Haiming Xu, Zihao Zhang et al.

AAAI 2025 • paper

#3274

The Parables of the Mustard Seed and the Yeast: Extremely Low-Budget, High-Performance Nighttime Semantic Segmentation

Shiqin Wang, Xin Xu, Haoyang Chen et al.

AAAI 2025 • paper

#3275

GFlow: Recovering 4D World from Monocular Video

Shizun Wang, Xingyi Yang, Qiuhong Shen et al.

AAAI 2025 • paper • arXiv:2405.18426

#3276

Imagine: Image-Guided 3D Part Assembly with Structure Knowledge Graph

Weihao Wang, Yu Lan, Mingyu You et al.

AAAI 2025 • paper

#3277

MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences

Weitao Wang, Haoran Xu, Yuxiao Yang et al.

AAAI 2025 • paper • arXiv:2412.06614

#3278

FreeGen: Bridging Visual-Linguistic Discrepancies Towards Diffusion-based Pixel-level Data Synthesis

Wenzhuang Wang, Mingcan Ma, Yong Chen et al.

AAAI 2025 • paper

#3279

DCTMamba: Advancing JPEG Image Restoration Through Long-Sequence Modeling and Adaptive Frequency Strategy

Xi Wang, Xueyang Fu, Liang Li et al.

AAAI 2025 • paper

#3280

From 2D CAD Drawings to 3D Parametric Models: A Vision-Language Approach

Xilin Wang, Jia Zheng, Yuanchao Hu et al.

AAAI 2025 • paper • arXiv:2412.11892

#3281

Lifting Scheme-Based Implicit Disentanglement of Emotion-Related Facial Dynamics in the Wild

Xingjian Wang, Li Chai

AAAI 2025 • paper • arXiv:2412.13168

#3282

MIMTrack: In-Context Tracking via Masked Image Modeling

Xingmei Wang, Guohao Nie, Jiaxiang Meng et al.

AAAI 2025 • paper

#3283

From Coarse to Fine: A Matching and Alignment Framework for Unsupervised Cross-View Geo-Localization

Xueyi Wang, Lele Zhang, Zheng Fan et al.

AAAI 2025 • paper

#3284

RefDetector: A Simple Yet Effective Matching-based Method for Referring Expression Comprehension

Yabing Wang, Zhuotao Tian, Zheng Qin et al.

AAAI 2025 • paper

#3285

Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension

Yaxian Wang, Henghui Ding, Shuting He et al.

AAAI 2025 • paper • arXiv:2501.01416

#3286

Breaking Barriers in Physical-World Adversarial Examples: Improving Robustness and Transferability via Robust Feature

Yichen Wang, Yuxuan Chou, Ziqi Zhou et al.

AAAI 2025 • paper • arXiv:2412.16958

#3287

Capturing the Unseen: Vision-Free Facial Motion Capture Using Inertial Measurement Units

Youjia Wang, Yiwen Wu, Hengan Zhou et al.

AAAI 2025 • paper • arXiv:2402.03944

#3288

Re-Attentional Controllable Video Diffusion Editing

Yuanzhi Wang, Yong Li, Mengyi Liu et al.

AAAI 2025 • paper • arXiv:2412.11710

#3289

MambaPro: Multi-Modal Object Re-identification with Mamba Aggregation and Synergistic Prompt

Yuhao Wang, Xuehu Liu, Tianyu Yan et al.

AAAI 2025 • paper • arXiv:2412.10707

#3290

IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis

Yuji Wang, Jingchen Ni, Yong Liu et al.

AAAI 2025 • paper • arXiv:2503.00936

#3291

Target Scanpath-Guided 360-Degree Image Enhancement

Yujia Wang, Fang-Lue Zhang, Neil A. Dodgson

AAAI 2025 • paper

#3292

DualNet: Robust Self-Supervised Stereo Matching with Pseudo-Label Supervision

Yun Wang, Jiahao Zheng, Chenghao Zhang et al.

AAAI 2025 • paper

#3293

Mamba YOLO: A Simple Baseline for Object Detection with State Space Model

Zeyu Wang, Chen Li, Huiying Xu et al.

AAAI 2025 • paper • arXiv:2406.05835

#3294

Style Nursing with Spatial and Semantic Guidance for Zero-Shot Traffic Scene Style Transfer

Zhen Wang, Zihang Lin, Meng Yuan et al.

AAAI 2025 • paper

#3295

Thermal-Aware Low-Light Image Enhancement: A Real-World Benchmark and a New Light-Weight Model

Zhen Wang, Yaozu Wu, Dongyuan Li et al.

AAAI 2025 • paper

#3296

Attention-Imperceptible Backdoor Attacks on Vision Transformers

Zhishen Wang, Rui Wang, Lihua Jing

AAAI 2025 • paper

#3297

LLM-RG4: Flexible and Factual Radiology Report Generation Across Diverse Input Contexts

Zhuhao Wang, Yihua Sun, Zihan Li et al.

AAAI 2025 • paper • arXiv:2412.12001

#3298

MSV-PCT: Multi-Sparse-View Enhanced Transformer Framework for Salient Object Detection in Point Clouds

Zihao Wang, Yiming Huang, Gengyu Lyu et al.

AAAI 2025 • paper

#3299

GlyphSR: A Simple Glyph-Aware Framework for Scene Text Image Super-Resolution

Baole Wei, Yuxuan Zhou, Liangcai Gao et al.

AAAI 2025 • paper

#3300

Power of Diversity: Enhancing Data-Free Black-Box Attack with Domain-Augmented Learning

Yang Wei, Jingyu Tan, Guowen Xu et al.

AAAI 2025 • paper

#3301

Achieving Lightweight Super-Resolution for Real-Time Computer Graphics

Yu Wen, Chen Zhang, Chenhao Xie et al.

AAAI 2025 • paper

#3302

Multi-axis Prompt and Multi-dimension Fusion Network for All-in-one Weather-degraded Image Restoration

Yuanbo Wen, Tao Gao, Jing Zhang et al.

AAAI 2025 • paper

#3303

USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation

Wanjiang Weng, Hongsong Wang, Junbo Wang et al.

AAAI 2025 • paper • arXiv:2412.09220

#3304

Spin: Diffusion-based Semantic Image Painting Through Independent Information Injection

Dantong Wu, Zhiqiang Chen, Tianjiao Du et al.

AAAI 2025 • paper

#3305

Structural Pruning via Spatial-aware Information Redundancy for Semantic Segmentation

Dongyue Wu, Zilin Guo, Li Yu et al.

AAAI 2025 • paper • arXiv:2412.12672

#3306

SVRMamba: Slice-to-Volume Reconstruction from Multiple MRI Stacks with Slice Sequence Guided Mamba

Jiangjie Wu, Hongjiang Wei, Yuyao Zhang

AAAI 2025 • paper

#3307

VarCMP: Adapting Cross-Modal Pre-Training Models for Video Anomaly Retrieval

Peng Wu, Wanshun Su, Xiangteng He et al.

AAAI 2025 • paper

#3308

Realistic Noise Synthesis with Diffusion Models

Qi Wu, Mingyan Han, Ting Jiang et al.

AAAI 2025 • paper • arXiv:2305.14022

#3309

PanAdapter: Two-Stage Fine-Tuning with Spatial-Spectral Priors Injecting for Pansharpening

RuoCheng Wu, Zien Zhang, Shangqi Deng et al.

AAAI 2025 • paper • arXiv:2409.06980

#3310

CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities

Tao Wu, Yong Zhang, Xintao Wang et al.

AAAI 2025 • paper • arXiv:2408.13239

#3311

Deconfound Semantic Shift and Incompleteness in Incremental Few-shot Semantic Segmentation

Yirui Wu, Yuhang Xia, Hao Li et al.

AAAI 2025 • paper

#3312

Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark

Yongliang Wu, Wenbo Zhu, Jiawang Cao et al.

AAAI 2025 • paper • arXiv:2412.08879

#3313

MUCD: Unsupervised Point Cloud Change Detection via Masked Consistency

Yue Wu, Zhipeng Wang, Yongzhe Yuan et al.

AAAI 2025 • paper

#3314

Unified Knowledge Maintenance Pruning and Progressive Recovery with Weight Recalling for Large Vision-Language Models

Zimeng Wu, Jiaxin Chen, Yunhong Wang

AAAI 2025 • paper

#3315

RETRACTED: GEONet: Global Enhancement and Optimization Network for Lane Detection

Suyang Xi, Yunhao Liu, Hong Ding et al.

AAAI 2025 • paper

#3316

PlaNet: Learning to Mitigate Atmospheric Turbulence in Planetary Images

Yifei Xia, Chu Zhou, Chengxuan Zhu et al.

AAAI 2025 • paper

#3317

CA-Edit: Causality-Aware Condition Adapter for High-Fidelity Local Facial Attribute Editing

Xiaole Xian, Xilin He, Zenghao Niu et al.

AAAI 2025 • paper • arXiv:2412.13565

#3318

SMR-Net: Semantic-Guided Mutually Reinforcing Network for Cross-Modal Image Fusion and Salient Object Detection

Guobao Xiao, Xinyu Liu, Zebin Lin et al.

AAAI 2025 • paper

#3319

Boosting Vision State Space Model with Fractal Scanning

Haoke Xiao, Lv Tang, Peng-tao Jiang et al.

AAAI 2025 • paper

#3320

Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval

Jian Xiao, Zhenzhen Hu, Jia Li et al.

AAAI 2025 • paper • arXiv:2410.06618

#3321

Cross-modulated Attention Transformer for RGBT Tracking

Yun Xiao, Jiacong Zhao, Andong Lu et al.

AAAI 2025 • paper • arXiv:2408.02222

#3322

Omni-Query Active Learning for Source-Free Domain Adaptive Cross-Modality 3D Semantic Segmentation

Jianxiang Xie, Yao Wu, Yachao Zhang et al.

AAAI 2025 • paper

#3323

TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning

Jingjing Xie, Yuxin Zhang, Jun Peng et al.

AAAI 2025 • paper • arXiv:2412.08176

#3324

Discrete Prior-Based Temporal-Coherent Content Prediction for Blind Face Video Restoration

Lianxin Xie, Bingbing Zheng, Wen Xue et al.

AAAI 2025 • paper • arXiv:2501.09960

#3325

Expand VSR Benchmark for VLLM to Expertize in Spatial Rules

Peijin Xie, Lin Sun, Bingquan Liu et al.

AAAI 2025 • paper • arXiv:2412.18224

#3326

PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis

Yifan Xie, Tao Feng, Xin Zhang et al.

AAAI 2025 • paper • arXiv:2412.08504

#3327

HieraFashDiff: Hierarchical Fashion Design with Multi-stage Diffusion Models

Zhifeng Xie, Hao Li, Huiming Ding et al.

AAAI 2025 • paper • arXiv:2401.07450

#3328

Few-Shot Incremental Learning via Foreground Aggregation and Knowledge Transfer for Audio-Visual Semantic Segmentation

Jingqiao Xiu, Mengze Li, Zongxin Yang et al.

AAAI 2025 • paper

#3329

DiffScene: Diffusion-Based Safety-Critical Scenario Generation for Autonomous Vehicles

Chejian Xu, Aleksandr Petiushko, Ding Zhao et al.

AAAI 2025 • paper

#3330

FR²Seg: Continual Segmentation Across Multiple Sites via Fourier Style Replay and Adaptive Consistency Regularization

Cheng Xu, Weiwen Zhang, Hongrui Zhang et al.

AAAI 2025 • paper

#3331

Less Is More: Token Context-Aware Learning for Object Tracking

Chenlong Xu, Bineng Zhong, Qihua Liang et al.

AAAI 2025 • paper • arXiv:2501.00758

#3332

3DHumanEdit: Multi-modal Body Part-aware Conditioning Information Integration for 3D Human Manipulation

FeiFan Xu, Tianyi Chen, Fan Yang et al.

AAAI 2025 • paper

#3333

Motion Artifact Removal in Pixel-Frequency Domain via Alternate Masks and Diffusion Model

Jiahua Xu, Dawei Zhou, Lei Hu et al.

AAAI 2025 • paper • arXiv:2412.07590

#3334

OmniSR: Shadow Removal Under Direct and Indirect Lighting

Jiamin Xu, Zelong Li, Yuxin Zheng et al.

AAAI 2025 • paper • arXiv:2410.01719

#3335

Multiple Feature Refining Network for Visual Emotion Distribution Learning

Qinfu Xu, Shaozu Yuan, Yiwei Wei et al.

AAAI 2025 • paper

#3336

SCKD: Semi-Supervised Cross-Modality Knowledge Distillation for 4D Radar Object Detection

Ruoyu Xu, Zhiyu Xiang, Chenwei Zhang et al.

AAAI 2025 • paper • arXiv:2412.14571

#3337

LiON: Learning Point-Wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data

Shaocong Xu, Pengfei Li, Qianpu Sun et al.

AAAI 2025 • paper • arXiv:2309.10230

#3338

Zero-shot Video Moment Retrieval via Off-the-shelf Multimodal Large Language Models

Yifang Xu, Yunzhuo Sun, Benxiang Zhai et al.

AAAI 2025 • paper • arXiv:2501.07972

#3339

HOIMamba: Efficient Mamba-based Disentangled Progressive Learning for HOI Detection

Yongchao Xu, Jiawei Liu, Sen Tao et al.

AAAI 2025 • paper

#3340

OOTDiffusion: Outfitting Fusion Based Latent Diffusion for Controllable Virtual Try-On

Yuhao Xu, Tao Gu, Weifeng Chen et al.

AAAI 2025 • paper • arXiv:2403.01779

#3341

FLAME: Learning to Navigate with Multimodal LLM in Urban Environments

Yunzhe Xu, Yiyuan Pan, Zhe Liu et al.

AAAI 2025 • paper • arXiv:2408.11051

#3342

FATE: Feature-Adapted Parameter Tuning for Vision-Language Models

Zhengqin Xu, Zelin Peng, Xiaokang Yang et al.

AAAI 2025 • paper

#3343

Toward Modality Gap: Vision Prototype Learning for Weakly-supervised Semantic Segmentation with CLIP

Zhongxing Xu, Feilong Tang, Zhe Chen et al.

AAAI 2025 • paper • arXiv:2412.19650

#3344

RetouchGPT: LLM-based Interactive High-Fidelity Face Retouching via Imperfection Prompting

Wen Xue, Chun Ding, Ruotao Xu et al.

AAAI 2025 • paper

#3345

Physical Marker: Revealing Invisible Hyperlinks Hidden in Printed Trademarks

Yuliang Xue, Lei Tan, Guobiao Li et al.

AAAI 2025 • paper

#3346

Towards Universal Rainy Image Restoration: Benchmark and Baseline

Hujie Yan

AAAI 2025 • paper

#3347

SGTC: Semantic-Guided Triplet Co-training for Sparsely Annotated Semi-Supervised Medical Image Segmentation

Ke Yan, Qing Cai, Fan Zhang et al.

AAAI 2025 • paper • arXiv:2412.15526

#3348

Data-Free Universal Attack by Exploiting the Intrinsic Vulnerability of Deep Models

YangTian Yan, Jinyu Tian

AAAI 2025 • paper • arXiv:2503.22205

#3349

Robust Image Hashing Based on Contrastive Masked Autoencoder with Weak-Strong Augmentation Alignment

Cundian Yang, Guibo Luo, Yuesheng Zhu et al.

AAAI 2025 • paper

#3350

PlanLLM: Video Procedure Planning with Refinable Large Language Models

Dejie Yang, Zijing Zhao, Yang Liu

AAAI 2025 • paper • arXiv:2412.19139

#3351

3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly Detection

Enquan Yang, Peng Xing, Hanyang Sun et al.

AAAI 2025 • paper

#3352

Diffusion Prior Interpolation for Flexibility Real-World Face Super-Resolution

Jiarui Yang, Tao Dai, Yufei Zhu et al.

AAAI 2025 • paper • arXiv:2412.16552

#3353

SMamba: Sparse Mamba for Event-based Object Detection

Nan Yang, Yang Wang, Zhanwen Liu et al.

AAAI 2025 • paper • arXiv:2501.11971

#3354

One-Shot Reference-based Structure-Aware Image to Sketch Synthesis

Rui Yang, Honghong Yang, Li Zhao et al.

AAAI 2025 • paper

#3355

LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding

Senqiao Yang, Jiaming Liu, Renrui Zhang et al.

AAAI 2025 • paper • arXiv:2312.14074

#3356

Asymmetric Hierarchical Difference-aware Interaction Network for Event-guided Motion Deblurring

Wen Yang, Jinjian Wu, Leida Li et al.

AAAI 2025 • paper

#3357

Dual Information Purification for Lightweight SAR Object Detection

Xi Yang, Jiachen Sun, Songsong Duan et al.

AAAI 2025 • paper

#3358

DriveGazen: Event-Based Driving Status Recognition Using Conventional Camera

Xiaoyin Yang, Xin Yang

AAAI 2025 • paper • arXiv:2412.11753

#3359

Semantic Segmentation on Raindrop Degraded Images Using Two-Stage Dual Teacher-Student Learning

Xin Yang, Wending Yan, Yuan Yuan et al.

AAAI 2025 • paper

#3360

ERF: A Benchmark Dataset for Robust Semantic Segmentation Under Extreme Rainfall Conditions

Xin Yang, Xin Zhang, Xinchao Wang

AAAI 2025 • paper

#3361

FreqTS: Frequency-Aware Token Selection for Accelerating Diffusion Models

Xinye Yang, Yuxin Yang, Haoran Pang et al.

AAAI 2025 • paper

#3362

Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving

Yu Yang, Jianbiao Mei, Yukai Ma et al.

AAAI 2025 • paper • arXiv:2408.14197

#3363

UAWTrack: Universal 3D Single Object Tracking in Adverse Weather

Yuxiang Yang, Hongjie Gu, Yingqi Deng et al.

AAAI 2025 • paper

#3364

RealPortrait: Realistic Portrait Animation with Diffusion Transformers

Zejun Yang, Huawei Wei, Zhisheng Wang

AAAI 2025 • paper

#3365

Single Image Rolling Shutter Removal with Diffusion Models

Zhanglei Yang, Haipeng Li, Mingbo Hong et al.

AAAI 2025 • paper • arXiv:2407.02906

#3366

MMGDreamer: Mixed-Modality Graph for Geometry-Controllable 3D Indoor Scene Generation

Zhifei Yang, Keyang Lu, Chao Zhang et al.

AAAI 2025 • paper • arXiv:2502.05874

#3367

MoRe: Class Patch Attention Needs Regularization for Weakly Supervised Semantic Segmentation

Zhiwei Yang, Yucong Meng, Kexue Fu et al.

AAAI 2025 • paper • arXiv:2412.11076

#3368

MM-Tracker: Motion Mamba for UAV-platform Multiple Object Tracking

Mufeng Yao, Jinlong Peng, Qingdong He et al.

AAAI 2025 • paper

#3369

As Pseudo-Label Free as Possible: Leveraging Adaptive Feature Generation for Sparsely Annotated Object Detection

Shuilian Yao, Yu Liu, Qi Jia et al.

AAAI 2025 • paper

#3370

Towards Open-Vocabulary Remote Sensing Image Semantic Segmentation

Chengyang Ye, Yunzhi Zhuge, Pingping Zhang

AAAI 2025 • paper • arXiv:2412.19492

#3371

VersaFusion: A Versatile Diffusion-Based Framework for Fine-Grained Image Editing and Enhancement

Haocun Ye, Xinlong Jiang, Chenlong Gao et al.

AAAI 2025 • paper

#3372

PromptHaze: Prompting Real-world Dehazing via Depth Anything Model

Tian Ye, Sixiang Chen, Haoyu Chen et al.

AAAI 2025 • paper

#3373

Optimized Gradient Clipping for Noisy Label Learning

Xichen Ye, Yifan Wu, Weizhong Zhang et al.

AAAI 2025 • paper • arXiv:2412.08941

#3374

Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language

Jeong Hun Yeo, Chae Won Kim, Hyunjun Kim et al.

AAAI 2025 • paper • arXiv:2409.00986

#3375

FlexDataset: Crafting Annotated Dataset Generation for Diverse Applications

Ellen Yi-Ge, Leo Shawn

AAAI 2025 • paper

#3376

ImagePiece: Content-aware Re-tokenization for Efficient Image Recognition

Seungdong Yoa, Seungjun Lee, Hye-Seung Cho et al.

AAAI 2025 • paper • arXiv:2412.16491

#3377

FOCUS: Towards Universal Foreground Segmentation

Zuyao You, Lingyu Kong, Lingchen Meng et al.

AAAI 2025 • paper • arXiv:2501.05238

#3378

SGFormer: Semantic-Geometry Fusion Transformer for Multi-modal 3D Panoptic Segmentation

Hongqi Yu, Sixian Chan, Xiaolong Zhou et al.

AAAI 2025 • paper

#3379

Separating the Wheat from the Chaff: Spatio-Temporal Transformer with View-interweaved Attention for Photon-Efficient Depth Sensing

Letian Yu, Jiaxi Yang, Bo Dong et al.

AAAI 2025 • paper

#3380

ReMoGPT: Part-Level Retrieval-Augmented Motion-Language Models

Qing Yu, Mikihiro Tanaka, Kent Fujiwara

AAAI 2025 • paper

#3381

STGC-NeRF: Spatial-Temporal Geometric Consistency for LiDAR Neural Radiance Fields in Dynamic Scenes

Shangshu Yu, Xiaotian Sun, Wen Li et al.

AAAI 2025 • paper

#3382

Fine-grained Adaptive Visual Prompt for Generative Medical Visual Question Answering

Ting Yu, Zixuan Tong, Jun Yu et al.

AAAI 2025 • paper

#3383

OTPNet: ODE-inspired Tuning-free Proximal Network for Remote Sensing Image Fusion

Wei Yu, Zonglin Li, Qinglin Liu et al.

AAAI 2025 • paper

#3384

Cross-Lingual Text-Rich Visual Comprehension: An Information Theory Perspective

Xinmiao Yu, Xiaocheng Feng, Yun Li et al.

AAAI 2025 • paper • arXiv:2412.17787

#3385

Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP

Yating Yu, Congqi Cao, Yueran Zhang et al.

AAAI 2025 • paper • arXiv:2412.09895

#3386

OLMD: Orientation-aware Long-term Motion Decoupling for Continuous Sign Language Recognition

Yiheng Yu, Sheng Liu, Yuan Feng et al.

AAAI 2025 • paper • arXiv:2503.08205

#3387

Where Precision Meets Efficiency: Transformation Diffusion Model for Point Cloud Registration

Yongzhe Yuan, Yue Wu, Xiaolong Fan et al.

AAAI 2025 • paper

#3388

Efficient Neural Network Encoding for 3D Color Lookup Tables

Vahid Zehtab, David B. Lindell, Marcus A. Brubaker et al.

AAAI 2025 • paper • arXiv:2412.15438

#3389

Gaze Label Alignment: Alleviating Domain Shift for Gaze Estimation

Guanzhong Zeng, Jingjing Wang, Zefu Xu et al.

AAAI 2025 • paper • arXiv:2412.15601

#3390

TGFormer: Transformer with Track Query Group for Multi-Object Tracking

Rui Zeng, Yuanzhou Huang, Songwei Pei

AAAI 2025 • paper

#3391

World Knowledge-Enhanced Reasoning Using Instruction-Guided Interactor in Autonomous Driving

Mingliang Zhai, Cheng Li, Zengyuan Guo et al.

AAAI 2025 • paper • arXiv:2412.06324

#3392

DetRF: Detachable Novel Views Synthesis of Dynamic Scenes Using Backdrop-Driven Neural Radiance Fields

Boyu Zhang, Zheng Zhu, Wenbo Xu

AAAI 2025 • paper

#3393

Training-Free and Hardware-Friendly Acceleration for Diffusion Models via Similarity-based Token Pruning

Evelyn Zhang, Jiayi Tang, Xuefei Ning et al.

AAAI 2025 • paper

#3394

When Open-Vocabulary Visual Question Answering Meets Causal Adapter: Benchmark and Approach

Feifei Zhang, Zhaoyi Zhang, Xi Zhang et al.

AAAI 2025 • paper

#3395

DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming

Jiaxin Zhang, Wentao Yang, Songxuan Lai et al.

AAAI 2025 • paper • arXiv:2406.19101

#3396

Just a Few Glances: Open-Set Visual Perception with Image Prompt Paradigm

Jinrong Zhang, Penghui Wang, Chunxiao Liu et al.

AAAI 2025 • paper • arXiv:2412.10719

#3397

R^2-Art: Category-Level Articulation Pose Estimation from Single RGB Image via Cascade Render Strategy

Li Zhang, Haonan Jiang, Yukang Huo et al.

AAAI 2025 • paper

#3398

Common Sense Bias Modeling for Classification Tasks

Miao Zhang, Zee Fryer, Ben Colman et al.

AAAI 2025 • paper • arXiv:2401.13213

#3399

IRMamba: Pixel Difference Mamba with Layer Restoration for Infrared Small Target Detection

Mingjin Zhang, Xiaolong Li, Fei Gao et al.

AAAI 2025 • paper

#3400

MOCID: Motion Context and Displacement Information Learning for Moving Infrared Small Target Detection

Mingjin Zhang, Yuanjun Ouyang, Fei Gao et al.

AAAI 2025 • paper
