Most Cited AAAI "modal integration" Papers
Enhancing the Robustness of Spiking Neural Networks with Stochastic Gating Mechanisms
Jianhao Ding, Zhaofei Yu, Tiejun Huang et al.
A Closer Look at Curriculum Adversarial Training: From an Online Perspective
Lianghe Shi, Weiwei Liu
TrojanDec: Data-free Detection of Trojan Inputs in Self-supervised Learning
Yupei Liu, Yanting Wang, Jinyuan Jia
DRF: Improving Certified Robustness via Distributional Robustness Framework
Zekai Wang, Zhengyu Zhou, Weiwei Liu
Provably Convergent Federated Trilevel Learning
Yang Jiao, Kai Yang, Tiancheng Wu et al.
Recoverable Facial Identity Protection via Adaptive Makeup Transfer Adversarial Attacks
Xiyao Liu, Junxing Ma, Xinda Wang et al.
Dynamic Knowledge Injection for AIXI Agents
Samuel Yang-Zhao, Kee Siong Ng, Marcus Hutter
Finite-Time Frequentist Regret Bounds of Multi-Agent Thompson Sampling on Sparse Hypergraphs
Tianyuan Jin, Hao-Lun Hsu, William Chang et al.
Whole Genome Transformer for Gene Interaction Effects in Microbiome Habitat Specificity
Zhufeng Li, Sandeep Suresh Cranganore, Nicholas Youngblut et al.
AI-Powered Algorithm-Centric Quantum Processor Topology Design
Tian Li, Xiao-Yue Xu, Chen Ding et al.
Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron Subtomogram Segmentation with Denoised Pseudo-Labeling
Haoran Li, Xingjian Li, Jiahua Shi et al.
IWRN: A Robust Blind Watermarking Method for Artwork Image Copyright Protection Against Noise Attack
Feifei Kou, Yuhan Yao, Siyuan Yao et al.
Learning Generalized Residual Exchange-Correlation-Uncertain Functional for Density Functional Theory
Sizhuo Jin, Shuo Chen, Jianjun Qian et al.
Feature Distribution Matching by Optimal Transport for Effective and Robust Coreset Selection
A Unified Self-Distillation Framework for Multimodal Sentiment Analysis with Uncertain Missing Modalities
Guiding a Harsh-Environments Robust Detector via RAW Data Characteristic Mining
Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation
Zhouhong Gu, Xiaoxuan Zhu, Haoning Ye et al.
Resisting Backdoor Attacks in Federated Learning via Bidirectional Elections and Individual Perspective
Zhen Qin, Feiyi Chen, Chen Zhi et al.
Transportable Representations for Domain Generalization
Kasra Jalaldoust, Elias Bareinboim
Exponential Hardness of Optimization from the Locality in Quantum Neural Networks
Hao-Kai Zhang, Chengkai Zhu, Geng Liu et al.
Social Recommendation via Graph-Level Counterfactual Augmentation
Yinxuan Huang, Ke Liang, Yanyi Huang et al.
MFOS: Model-Free & One-Shot Object Pose Estimation
JongMin Lee, Yohann Cabon, Romain Brégier et al.
Hierarchical Topology Isomorphism Expertise Embedded Graph Contrastive Learning
Jiangmeng Li, Yifan Jin, Hang Gao et al.
ViFactCheck: A New Benchmark Dataset and Methods for Multi-Domain News Fact-Checking In Vietnamese
Tran Thai Hoa, Tran Quang Duy, Khanh Quoc Tran et al.
PDE+: Enhancing Generalization via PDE with Adaptive Distributional Diffusion
Yige Yuan, Bingbing Xu, Bo Lin et al.
HHAN: Comprehensive Infectious Disease Source Tracing via Heterogeneous Hypergraph Neural Network
Qiang He, Yunting Bao, Hui Fang et al.
Learning Representations on the Unit Sphere: Investigating Angular Gaussian and Von Mises-Fisher Distributions for Online Continual Learning
Nicolas Michel, Giovanni Chierchia, Romain Negrel et al.
Towards Real-World Test-Time Adaptation: Tri-net Self-Training with Balanced Normalization
Yongyi Su, Xun Xu, Kui Jia
A Theoretical Framework for an Efficient Normalizing Flow-Based Solution to the Electronic Schrödinger Equation
Daniel Freedman, Eyal Rozenberg, Alex Bronstein
Probabilistic Offline Policy Ranking with Approximate Bayesian Computation
Longchao Da, Porter Jenkins, Trevor Schwantes et al.
Revisiting Disentanglement in Downstream Tasks: A Study on Its Necessity for Abstract Visual Reasoning
Ruiqian Nai, Zixin Wen, Ji Li et al.
How to Re-enable PDE Loss for Physical Systems Modeling Under Partial Observation
Haodong Feng, Yue Wang, Dixia Fan
Knowledge Is Power: Harnessing Large Language Models for Enhanced Cognitive Diagnosis
Zhiang Dong, Jingyuan Chen, Fei Wu
Out of Thin Air: Exploring Data-Free Adversarial Robustness Distillation
Yuzheng Wang, Zhaoyu Chen, Dingkang Yang et al.
Improving Cancer Gene Prediction by Enhancing Common Information Between the PPI Network and Gene Functional Association
Chao Deng, Hongdong Li, Jianxin Wang
HAGO-Net: Hierarchical Geometric Massage Passing for Molecular Representation Learning
Hongbin Pei, Taile Chen, Chen A et al.
Path-Adaptive Matting for Efficient Inference Under Various Computational Cost Constraints
Qinglin Liu, Zonglin Li, Xiaoqian Lv et al.
Robust SAM: On the Adversarial Robustness of Vision Foundation Models
Jiahuan Long, Zhengqin Xu, Tingsong Jiang et al.
Generative Video Diffusion for Unseen Novel Semantic Video Moment Retrieval
Dezhao Luo, Shaogang Gong, Jiabo Huang et al.
VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting
Muhammet Furkan Ilaslan, Ali Köksal, Kevin Qinghong Lin et al.
Game4Loc: A UAV Geo-Localization Benchmark from Game Data
Yuxiang Ji, Boyong He, Zhuoyue Tan et al.
Orchestrating the Symphony of Prompt Distribution Learning for Human-Object Interaction Detection
Mingda Jia, Liming Zhao, Ge Li et al.
DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation
Jisoo Kim, Jungbin Cho, Joonho Park et al.
Pedestrian Attribute Recognition: A New Benchmark Dataset and a Large Language Model Augmented Framework
Jiandong Jin, Xiao Wang, Qian Zhu et al.
ViPCap: Retrieval Text-Based Visual Prompts for Lightweight Image Captioning
Taewhan Kim, Soeun Lee, Si-Woo Kim et al.
U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation
Chenxin Li, Xinyu Liu, Wuyang Li et al.
UniDet3D: Multi-dataset Indoor 3D Object Detection
Maksim Kolodiazhnyi, Anna Vorontsova, Matvey Skripkin et al.
Do Not DeepFake Me: Privacy-Preserving Neural 3D Head Reconstruction Without Sensitive Images
Jiayi Kong, Xurui Song, Shuo Huai et al.
Exploiting Diffusion Prior for Real-World Image Dehazing with Unpaired Training
Yunwei Lan, Zhigao Cui, Chang Liu et al.
Rethinking Open-Vocabulary Segmentation of Radiance Fields in 3D Space
Hyunjee Lee, Youngsik Yun, Jeongmin Bae et al.
MaskViM: Domain Generalized Semantic Segmentation with State Space Models
Jiahao Li, Yang Lu, Yuan Xie et al.
A Compact Implicit Neural Representation for Efficient Storage of Massive 4D Functional Magnetic Resonance Imaging
Ruoran Li, Runzhao Yang, Wenxin Xiang et al.
Transferable Adversarial Face Attack with Text Controlled Attribute
Wenyun Li, Zheng Zhang, Xiangyuan Lan et al.
ProsodyTalker: 3D Visual Speech Animation via Prosody Decomposition
Zonglin Li, Xiaoqian Lv, Qinglin Liu et al.
Decoupling Appearance Variations with 3D Consistent Features in Gaussian Splatting
Jiaqi Lin, Zhihao Li, Binxiao Huang et al.
Disentangled Motion Modeling for Video Frame Interpolation
Jaihyun Lew, Jooyoung Choi, Chaehun Shin et al.
AGLLDiff: Guiding Diffusion Models Towards Unsupervised Training-free Real-world Low-light Image Enhancement
Yunlong Lin, Tian Ye, Sixiang Chen et al.
RemDet: Rethinking Efficient Model Design for UAV Object Detection
Chen Li, Rui Zhao, Zeyu Wang et al.
4D Diffusion for Dynamic Protein Structure Prediction with Reference and Motion Guidance
Kaihui Cheng, Ce Liu, Qingkun Su et al.
G2LDetect: A Global-to-Local Approach for Hallucination Detection
Xiaoxia Cheng, Zeqi Tan, Zhe Zheng et al.
RingFormer: A Ring-Enhanced Graph Transformer for Organic Solar Cell Property Prediction
Zhihao Ding, Ting Zhang, Yiran Li et al.
HeMeNet: Heterogeneous Multichannel Equivariant Network for Protein Multi-task Learning
Rong Han, Wenbing Huang, Lingxiao Luo et al.
Controllable Protein Sequence Generation with LLM Preference Optimization
Xiangyu Liu, Yi Liu, Silei Chen et al.
DAMMFND: Domain-Aware Multimodal Multi-view Fake News Detection
Weihai Lu, Yu Tong, Zhiqiu Ye
M²N: A Progressive Macro-to-Micro 3D Modeling Scheme for Unveiling Drug-Target Affinity
Tianxu Lv, Jie Zhu, Jinyi Liu et al.
Multi-modal Deepfake Detection via Multi-task Audio-Visual Prompt Learning
Hui Miao, Yuanfang Guo, Zeming Liu et al.
SpeHeaTal: A Cluster-Enhanced Segmentation Method for Sperm Morphology Analysis
Yi Shi, Yun-Kai Wang, Xu-Peng Tian et al.
Generalized Implicit Neural Representations for Dynamic Molecular Surface Modeling
Fang Wu, Bozhen Hu, Stan Z. Li
MultiSFL: Towards Accurate Split Federated Learning via Multi-Model Aggregation and Knowledge Replay
Zeke Xia, Ming Hu, Dengke Yan et al.
Uncovering LLM-Generated Code: A Zero-Shot Synthetic Code Detector via Code Rewriting
Tong Ye, Yangkai Du, Tengfei Ma et al.
Efficient Traffic Prediction Through Spatio-Temporal Distillation
Qianru Zhang, Xinyi Gao, Haixin Wang et al.
Multi-Perspective Consolidation Enhanced Cognitive Diagnosis via Conditional Diffusion Model
Guanhao Zhao, Zhenya Huang, Cheng Cheng et al.
Multi-View Incremental Learning with Structured Hebbian Plasticity for Enhanced Fusion Efficiency
Yuhong Chen, Ailin Song, Huifeng Yin et al.
Symbolic Functional Decomposition: A Reconfiguration Approach
Mateus de Oliveira Oliveira, Wim Van Den Broeck
Towards More Discriminative Feature Learning in SNNs with Temporal-Self-Erasing Supervision
Wei Liu, Li Yang, Mingxuan Zhao et al.
Multi-to-Single: Reducing Multimodal Dependency in Emotion Recognition Through Contrastive Learning
Yan-Kai Liu, Jinyu Cai, Bao-Liang Lu et al.
ToMATO: Verbalizing the Mental States of Role-Playing LLMs for Benchmarking Theory of Mind
Kazutoshi Shinoda, Nobukatsu Hojo, Kyosuke Nishida et al.
SalM²: An Extremely Lightweight Saliency Mamba Model for Real-Time Cognitive Awareness of Driver Attention
Chunyu Zhao, Wentao Mu, Xian Zhou et al.
Progressive Self-Learning for Domain Adaptation on Symbolic Regression of Integer Sequences
Yaohui Zhu, Kaiming Sun, Zhengdong Luo et al.
HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models
Kazi Hasan Ibn Arif, JinYi Yoon, Dimitrios S. Nikolopoulos et al.
Can Generative Models Improve Self-Supervised Representation Learning?
Sana Ayromlou, Vahid Reza Khazaie, Fereshteh Forghani et al.
The Master Key Filters Hypothesis: Deep Filters Are General
Zahra Babaiee, Peyman M. Kiasari, Daniela Rus et al.
FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing
Lingling Cai, Kang Zhao, Hangjie Yuan et al.
Deep Graph Online Hashing for Multi-Label Image Retrieval
Yuan Cao, Xiangru Chen, Zifan Liu et al.
Segment Any 3D Gaussians
Jiazhong Cen, Jiemin Fang, Chen Yang et al.
Infinite-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation
Qihua Chen, Yue Ma, Hongfa Wang et al.
Cross-View Referring Multi-Object Tracking
Sijia Chen, En Yu, Wenbing Tao
M3Net: Multimodal Multi-task Learning for 3D Detection, Segmentation, and Occupancy Prediction in Autonomous Driving
Xuesong Chen, Shaoshuai Shi, Tao Ma et al.
3D Measurement of Complex Textured Objects Based on Bidirectional Fringe Projection
Yuchong Chen, Jian Yu, Shaoyan Gai et al.
EvHDR-GS: Event-guided HDR Video Reconstruction with 3D Gaussian Splatting
Zehao Chen, Zhan Lu, De Ma et al.
3DPGS: 3D Probabilistic Graph Search for Archaeological Piece Grouping
Junfeng Cheng, Yingkai Yang, Tania Stathaki
Bridge 2D-3D: Uncertainty-aware Hierarchical Registration Network with Domain Alignment
Zhixin Cheng, Jiacheng Deng, Xinjun Li et al.
Distribution-Level Feature Distancing for Machine Unlearning: Towards a Better Trade-off Between Model Utility and Forgetting
Dasol Choi, Dongbin Na
SIDL: A Real-World Dataset for Restoring Smartphone Images with Dirty Lenses
Sooyoung Choi, Sungyong Park, Heewon Kim
AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples
Antonio Emanuele Cinà, Jérôme Rony, Maura Pintor et al.
PIXELS: Progressive Image Xemplar-based Editing with Latent Surgery
Shristi Das Biswas, Matthew Shreve, Xuelu Li et al.
Boundary-Aware Temporal Dynamic Pseudo-Supervision Pairs Generation for Zero-Shot Natural Language Video Localization
Xiongwen Deng, Haoyu Tang, Han Jiang et al.
AS-Det: Active Sampling for Adaptive 3D Object Detection in Point Clouds
Ziheng Ding, Xiaze Zhang, Qi Jing et al.
Latent Diffusion-Enhanced Virtual Try-On via Optimized Pseudo-Label Generation
Chenghu Du, Junyin Wang, Feng Yu et al.
SSUN-Net: Spatial-Spectral Prior-Aware Unfolding Network for Pan-Sharpening
Shijie Fang, Hongping Gan
PNVC: Towards Practical INR-based Video Compression
Ge Gao, Ho Man Kwan, Fan Zhang et al.
EventMamba: Enhancing Spatio-Temporal Locality with State Space Models for Event-Based Video Reconstruction
Chengjie Ge, Xueyang Fu, Peng He et al.
Implicit Location-Caption Alignment via Complementary Masking for Weakly-Supervised Dense Video Captioning
Shiping Ge, Qiang Chen, Zhiwei Jiang et al.
Surgical Workflow Recognition and Blocking Effectiveness Detection in Laparoscopic Liver Resection with Pringle Maneuver
Diandian Guo, Weixin Si, Zhixi Li et al.
PromptDet: A Lightweight 3D Object Detection Framework with LiDAR Prompts
Kun Guo, Qiang Ling
VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
Yongxin Guo, Jingyu Liu, Mingda Li et al.
LLaVA Needs More Knowledge: Retrieval Augmented Natural Language Generation with Knowledge Graph for Explaining Thoracic Pathologies
Ameer Hamza, Abdullah, Yong Hyun Ahn et al.
MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement
Xu He, Zhiyong Wu, Xiaoyu Li et al.
Prompt Tuning In a Compact Attribute Space
Shiyu Hou, Tianfei Zhou, Shuai Zhang et al.
Resolving Multi-Condition Confusion for Finetuning-Free Personalized Image Generation
Qihan Huang, Siming Fu, Jinlong Liu et al.
PSReg: Prior-guided Sparse Mixture of Experts for Point Cloud Registration
Xiaoshui Huang, Zhou Huang, Yifan Zuo et al.
EGSRAL: An Enhanced 3D Gaussian Splatting Based Renderer with Automated Labeling for Large-Scale Driving Scene
Yixiong Huo, Guangfeng Jiang, Hongyang Wei et al.
High-Resolution Frame Interpolation with Patch-based Cascaded Diffusion
Junhwa Hur, Charles Herrmann, Saurabh Saxena et al.
Few-Shot Fine-Grained Image Classification with Progressively Feature Refinement and Continuous Relationship Modeling
Zhen-Xiang Ma, Zhen-Duo Chen, Tai Zheng et al.
SegFace: Face Segmentation of Long-Tail Classes
Kartik Narayan, Vibashan Vs, Vishal M. Patel
HiGDA: Hierarchical Graph of Nodes to Learn Local-to-Global Topology for Semi-Supervised Domain Adaptation
Ba Hung Ngo, Doanh C. Bui, Nhat-Tuong Do-Tran et al.
Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community
Jiancheng Pan, Yanxing Liu, Yuqian Fu et al.
Beyond Text: Fine-Grained Multi-Modal Fact Verification with Hypergraph Transformers
Hui Pang, Chaozhuo Li, Litian Zhang et al.
EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba
Xiaohuan Pei, Tao Huang, Chang Xu
IMAGDressing-v1: Customizable Virtual Dressing
Fei Shen, Xin Jiang, Xin He et al.
Normal-NeRF: Ambiguity-Robust Normal Estimation for Highly Reflective Scenes
Ji Shi, Xianghua Ying, Ruohao Guo et al.
OGP-Net: Optical Guidance Meets Pixel-Level Contrastive Distillation for Robust Multi-Modal and Missing Modality Segmentation
Aniruddh Sikdar, Jayant Teotia, Suresh Sundaram
Temporal Coherent Object Flow for Multi-Object Tracking
Zikai Song, Run Luo, Lintao Ma et al.
Toward Improving Robustness and Accuracy in Unsupervised Domain Adaptation
Aishwarya Soni, Tanima Dutta
Explicit Relational Reasoning Network for Scene Text Detection
Yuchen Su, Zhineng Chen, Yongkun Du et al.
3D Annotation-Free Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving
Boyi Sun, Yuhang Liu, Xingxia Wang et al.
C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection
Chuangchuang Tan, Renshuai Tao, Huan Liu et al.
From Representation Space to Prognostic Insights: Whole Slide Image Generation with Hierarchical Diffusion Model for Survival Prediction
Zhihao Tang, Xi Zhang, Chaozhuo Li
Unsupervised Self-Prior Embedding Neural Representation for Iterative Sparse-View CT Reconstruction
Xuanyu Tian, Lixuan Chen, Qing Wu et al.
AI-generated Image Quality Assessment in Visual Communication
Yu Tian, Yixuan Li, Baoliang Chen et al.
G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o
Tony Cheng Tong, Sirui He, Zhiwen Shao et al.
Towards Efficient Object Re-Identification with a Novel Cloud-Edge Collaborative Framework
Chuanming Wang, Yuxin Yang, Mengshi Qi et al.
EMControl: Adding Conditional Control to Text-to-Image Diffusion Models via Expectation-Maximization
He Wang, Longquan Dai, Jinhui Tang
MIMTrack: In-Context Tracking via Masked Image Modeling
Xingmei Wang, Guohao Nie, Jiaxiang Meng et al.
Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension
Yaxian Wang, Henghui Ding, Shuting He et al.
Capturing the Unseen: Vision-Free Facial Motion Capture Using Inertial Measurement Units
Youjia Wang, Yiwen Wu, Hengan Zhou et al.
MambaPro: Multi-Modal Object Re-identification with Mamba Aggregation and Synergistic Prompt
Yuhao Wang, Xuehu Liu, Tianyu Yan et al.
IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis
Yuji Wang, Jingchen Ni, Yong Liu et al.
Thermal-Aware Low-Light Image Enhancement: A Real-World Benchmark and a New Light-Weight Model
Zhen Wang, Yaozu Wu, Dongyuan Li et al.
Realistic Noise Synthesis with Diffusion Models
Qi Wu, Mingyan Han, Ting Jiang et al.
Deconfound Semantic Shift and Incompleteness in Incremental Few-shot Semantic Segmentation
Yirui Wu, Yuhang Xia, Hao Li et al.
Boosting Vision State Space Model with Fractal Scanning
Haoke Xiao, Lv Tang, Peng-tao Jiang et al.
Cross-modulated Attention Transformer for RGBT Tracking
Yun Xiao, Jiacong Zhao, Andong Lu et al.
PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis
Yifan Xie, Tao Feng, Xin Zhang et al.
HieraFashDiff: Hierarchical Fashion Design with Multi-stage Diffusion Models
Zhifeng Xie, Hao Li, Huiming Ding et al.
FLAME: Learning to Navigate with Multimodal LLM in Urban Environments
Yunzhe Xu, Yiyuan Pan, Zhe Liu et al.
Diffusion Prior Interpolation for Flexibility Real-World Face Super-Resolution
Jiarui Yang, Tao Dai, Yufei Zhu et al.
Dual Information Purification for Lightweight SAR Object Detection
Xi Yang, Jiachen Sun, Songsong Duan et al.
MMGDreamer: Mixed-Modality Graph for Geometry-Controllable 3D Indoor Scene Generation
Zhifei Yang, Keyang Lu, Chao Zhang et al.
MM-Tracker: Motion Mamba for UAV-platform Multiple Object Tracking
Mufeng Yao, Jinlong Peng, Qingdong He et al.
FlexDataset: Crafting Annotated Dataset Generation for Diverse Applications
Ellen Yi-Ge, Leo Shawn
ImagePiece: Content-aware Re-tokenization for Efficient Image Recognition
Seungdong Yoa, Seungjun Lee, Hye-Seung Cho et al.
FOCUS: Towards Universal Foreground Segmentation
Zuyao You, Lingyu Kong, Lingchen Meng et al.
Fine-grained Adaptive Visual Prompt for Generative Medical Visual Question Answering
Ting Yu, Zixuan Tong, Jun Yu et al.
OTPNet: ODE-inspired Tuning-free Proximal Network for Remote Sensing Image Fusion
Wei Yu, Zonglin Li, Qinglin Liu et al.
Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP
Yating Yu, Congqi Cao, Yueran Zhang et al.
OLMD: Orientation-aware Long-term Motion Decoupling for Continuous Sign Language Recognition
Yiheng Yu, Sheng Liu, Yuan Feng et al.
Gaze Label Alignment: Alleviating Domain Shift for Gaze Estimation
Guanzhong Zeng, Jingjing Wang, Zefu Xu et al.
TGFormer: Transformer with Track Query Group for Multi-Object Tracking
Rui Zeng, Yuanzhou Huang, Songwei Pei
Training-Free and Hardware-Friendly Acceleration for Diffusion Models via Similarity-based Token Pruning
Evelyn Zhang, Jiayi Tang, Xuefei Ning et al.
Decoupling Scattering: Pseudo-Label Guided NeRF for Scenes with Scattering Media
Mingyang Zhang, Junkang Zhang, Faming Fang et al.
Visual Perturbation for Text-Based Person Search
Pengcheng Zhang, Xiaohan Yu, Xiao Bai et al.
CAMSIC: Content-aware Masked Image Modeling Transformer for Stereo Image Compression
Xinjie Zhang, Shenyuan Gao, Zhening Liu et al.
Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues
Yan Zhang, Gangyan Zeng, Huawen Shen et al.
InstantSticker: Realistic Decal Blending via Disentangled Object Reconstruction
Yi Zhang, Xiaoyang Huang, Yishun Dou et al.
Leveraging Consistent Spatio-Temporal Correspondence for Robust Visual Odometry
Zhaoxing Zhang, Junda Cheng, Gangwei Xu et al.
Adaptive Wavelet-Positional Encoding for High-Frequency Information Learning in Implicit Neural Representation
Hongxu Zhao, Zelin Gao, Yue Wang et al.
NightReID: A Large-Scale Nighttime Person Re-Identification Benchmark
Yuxuan Zhao, Weijian Ruan, He Li et al.
Universal Domain Adaptive Object Detection via Dual Probabilistic Alignment
Yuanfan Zheng, Jinlin Wu, Wuyang Li et al.
MMPF: Multi-Modal Perception Framework for Abnormal Medical Condition Detection
Chuyi Zhong, Dingkang Yang, Peng Zhai et al.
Core-to-Global Reasoning for Compositional Visual Question Answering
Hao Zhou, Tingjin Luo, Zhangqi Jiang
Mitigating Feature Gap for Adversarial Robustness by Feature Disentanglement
Nuoyan Zhou, Dawei Zhou, Decheng Liu et al.
GoHD: Gaze-oriented and Highly Disentangled Portrait Animation with Rhythmic Poses and Realistic Expressions
Ziqi Zhou, Weize Quan, Hailin Shi et al.
A Lottery Ticket Hypothesis Approach with Sparse Fine-tuning and MAE for Image Forgery Detection and Localization
Jiaying Zhu, Dong Li, Xueyang Fu et al.
Less Is More: Adaptive Program Repair with Bug Localization and Preference Learning
Zhenlong Dai, Bingrui Chen, Zhuoluo Zhao et al.
Optimal Classification Trees for Continuous Feature Data Using Dynamic Programming with Branch-and-Bound
Cătălin E. Brița, Jacobus G. M. van der Linden, Emir Demirović
Decentralized Projected Riemannian Stochastic Recursive Momentum Method for Nonconvex Optimization
Kangkang Deng, Jiang Hu
Parameterized Complexity of Caching in Networks
Robert Ganian, Fionn Mc Inerney, Dimitra Tsigkari
DCC: Differentiable Cardinality Constraints for Partial Index Tracking
Wooyeon Jo, Hyunsouk Cho
Designing Specialized Two-Dimensional Graph Spectral Filters for Spatial-Temporal Graph Modeling
Yuxin Chen, Fangru Lin, Jingyi Huo et al.
POI-Enhancer: An LLM-based Semantic Enhancement Framework for POI Representation Learning
Jiawei Cheng, Jingyuan Wang, Yichuan Zhang et al.
Descriptive and Discriminative Document Identifiers for Generative Retrieval
Jiehan Cheng, Zhicheng Dou, Yutao Zhu et al.
Entire-Space Variational Information Exploitation for Post-Click Conversion Rate Prediction
Ke Fei, Xinyue Zhang, Jingjing Li
Mixed-Curvature Multi-Modal Knowledge Graph Completion
Yuxiao Gao, Fuwei Zhang, Zhao Zhang et al.
Multiple Purchase Chains with Negative Transfer Elimination for Multi-Behavior Recommendation
Shuwei Gong, Yuting Liu, Yizhou Dang et al.
K-ON: Stacking Knowledge on the Head Layer of Large Language Model
Lingbing Guo, Yichi Zhang, Zhongpu Bo et al.
Decomposed Spatio-Temporal Mamba for Long-Term Traffic Prediction
Sicheng He, Junzhong Ji, Minglong Lei
ST-FiT: Inductive Spatial-Temporal Forecasting with Limited Training Data
Zhenyu Lei, Yushun Dong, Jundong Li et al.
Public Opinion Field Effect and Hawkes Process Join Hands for Information Popularity Prediction
Junliang Li, Yajun Yang, Yujia Zhang et al.
Self-Explainable Graph Transformer for Link Sign Prediction
Lu Li, Jiale Liu, Xingyu Ji et al.
Context-aware Inductive Knowledge Graph Completion with Latent Type Constraints and Subgraph Reasoning
Muzhi Li, Cehao Yang, Chengjin Xu et al.
Structure Balance and Gradient Matching-Based Signed Graph Condensation
Rong Li, Long Xu, Songbai Liu et al.
LLMEmb: Large Language Model Can Be a Good Embedding Generator for Sequential Recommendation
Qidong Liu, Xian Wu, Wanyu Wang et al.
EPERM: An Evidence Path Enhanced Reasoning Model for Knowledge Graph Question and Answering
Xiao Long, Liansheng Zhuang, Aodi Li et al.
FairGP: A Scalable and Fair Graph Transformer Using Graph Partitioning
Renqiang Luo, Huafei Huang, Ivan Lee et al.
Sub-Interest-Aware Representation Uniformity for Recommender System
Ruijia Ma, Yahong Lian, Chunyao Song
GenAuction: A Generative Auction for Online Advertising
Yuchao Ma, Ruohan Qian, Bingzhe Wang et al.
Seeing Beyond Noise: Joint Graph Structure Evaluation and Denoising for Multimodal Recommendation
Yuxin Qi, Quan Zhang, Xi Lin et al.
Domain-Level Disentanglement Framework Based on Information Enhancement for Cross-Domain Cold-Start Recommendation
Nian Rong, Fei Xiong, Shirui Pan et al.
Language Pre-training Guided Masking Representation Learning for Time Series Classification
Liaoyuan Tang, Zheng Wang, Jie Wang et al.