Most Cited 2025 "long context" Papers
22,274 papers found • Page 87 of 112
Conference
GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis
You Wang, Li Fang, Hao Zhu et al.
SCFlow2: Plug-and-Play Object Pose Refiner with Shape-Constraint Scene Flow
Qingyuan Wang, Rui Song, Jiaojiao Li et al.
Domain Adaptive Diabetic Retinopathy Grading with Model Absence and Flowing Data
Wenxin Su, Song Tang, Xiaofeng Liu et al.
Explanations of GNN on Evolving Graphs via Axiomatic Layer edges
Yazheng Liu, Sihong Xie
Bridging Local Inductive Bias and Long-Range Dependencies with Pixel-Mamba for End-to-end Whole Slide Image Analysis
Zhongwei Qiu, Hanqing Chao, Tiancheng Lin et al.
Dynamic Modeling of Patients, Modalities and Tasks via Multi-modal Multi-task Mixture of Experts
Chenwei Wu, Zitao Shuai, Zhengxu Tang et al.
Rethinking Personalized Aesthetics Assessment: Employing Physique Aesthetics Assessment as An Exemplification
Haobin Zhong, Shuai He, Anlong Ming et al.
Frequency-Biased Synergistic Design for Image Compression and Compensation
Jiaming Liu, Qi Zheng, Zihao Liu et al.
ASTrA: Adversarial Self-supervised Training with Adaptive-Attacks
Prakash Chandra Chhipa, Gautam Vashishtha, Jithamanyu Settur et al.
MambaQuant: Quantizing the Mamba Family with Variance Aligned Rotation Methods
Dawei Yang, Yuxuan Yue, Xing Hu et al.
GOttack: Universal Adversarial Attacks on Graph Neural Networks via Graph Orbits Learning
Zulfikar Alom, Tran Gia Bao Ngo, Murat Kantarcioglu et al.
Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error
Ally Du, Lin Yang, Ruosong Wang
Cycle-Consistent Learning for Joint Layout-to-Image Generation and Object Detection
Xinhao Cai, Qiuxia Lai, Gensheng Pei et al.
Regret-Optimal List Replicable Bandit Learning: Matching Upper and Lower Bounds
Michael Chen, A. Pavan, N. V. Vinodchandran et al.
Hierarchical 3D Scene Graphs Construction Outdoors
Jon Nyffeler, Federico Tombari, Daniel Barath
Neural Architecture Search Driven by Locally Guided Diffusion for Personalized Federated Learning
PENG LIAO, Xilu Wang, Yaochu Jin et al.
WISH: Weakly Supervised Instance Segmentation using Heterogeneous Labels
Hyeokjun Kweon, Kuk-Jin Yoon
Convex Combination Star Shape Prior for Data-driven Image Semantic Segmentation
Xinyu Zhao, Jun Xie, Shengzhe Chen et al.
FreeCG: Free the Design Space of Clebsch-Gordan Transform for Machine Learning Force Fields
Shihao Shao, Haoran Geng, Zun Wang et al.
Learning Conditional Space-Time Prompt Distributions for Video Class-Incremental Learning
Xiaohan Zou, Wenchao Ma, Shu Zhao
Gain-MLP: Improving HDR Gain Map Encoding via a Lightweight MLP
Trevor Canham, SaiKiran Tedla, Michael Murdoch et al.
Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration
Baoyou Chen, Ce Liu, Weihao Yuan et al.
χ: Symmetry Understanding of 3D Shapes via Chirality Disentanglement
Weikang Wang, Tobias Weißberg, Nafie El Amrani et al.
Trajectory-LLM: A Language-based Data Generator for Trajectory Prediction in Autonomous Driving
Kairui Yang, Zihao Guo, Gengjie Lin et al.
GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion
Gwanghyun Kim, Xueting Li, Ye Yuan et al.
LightCity: An Urban Dataset for Outdoor Inverse Rendering and Reconstruction under Multi-illumination Conditions
Jingjing Wang, Qirui Hu, Chong Bao et al.
Disentangled Pose and Appearance Guidance for Multi-Pose Generation
Tengfei Xiao, Yue Wu, Yuelong Li et al.
Overcoming Dual Drift for Continual Long-Tailed Visual Question Answering
Feifei Zhang, Zhihao Wang, Xi Zhang et al.
Humans as Checkerboards: Calibrating Camera Motion Scale for World-Coordinate Human Mesh Recovery
Fengyuan Yang, Kerui Gu, Ha Linh Nguyen et al.
VI^3NR: Variance Informed Initialization for Implicit Neural Representations
Chamin Hewa Koneputugodage, Yizhak Ben-Shabat, Sameera Ramasinghe et al.
Unsupervised Identification of Protein Compositions and Conformations via Implicit Content-Transformation Disentanglement
Mostofa Rafid Uddin, Jana Armouti, Min Xu
GliaNet: Adaptive Neural Network Structure Learning with Glia-Driven
Mengqiao Han, Liyuan Pan, Xiabi Liu
Weakly Supervised Semantic Segmentation via Progressive Confidence Region Expansion
Xiangfeng Xu, Pinyi Zhang, Wenxuan Huang et al.
VidSeg: Training-free Video Semantic Segmentation based on Diffusion Models
Qian Wang, Abdelrahman Eldesokey, Mohit Mendiratta et al.
MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework
Ping Guo, Cheng Gong, Fei Liu et al.
SuperLightNet: Lightweight Parameter Aggregation Network for Multimodal Brain Tumor Segmentation
Feng Yu, Jiacheng Cao, Li Liu et al.
Robust-PIFu: Robust Pixel-aligned Implicit Function for 3D Human Digitalization from a Single Image
Kennard Chan, Fayao Liu, Guosheng Lin et al.
Learning Hierarchical Line Buffer for Image Processing
Jiacheng Li, Feiran Li, Daisuke Iso
Feature Extraction and Representation of Pre-training Point Cloud Based on Diffusion Models
Chang Qiu, Feipeng Da, Zilei Zhang
FE-CLIP: Frequency Enhanced CLIP Model for Zero-Shot Anomaly Detection and Segmentation
Tao Gong, Qi Chu, Bin Liu et al.
Backdoor Defense via Enhanced Splitting and Trap Isolation
Hongrui Yu, Lu Qi, Wanyu Lin et al.
MANTA: Diffusion Mamba for Efficient and Effective Stochastic Long-Term Dense Action Anticipation
Olga Zatsarynna, Emad Bahrami, Yazan Abu Farha et al.
ConceptSplit: Decoupled Multi-Concept Personalization of Diffusion Models via Token-wise Adaptation and Attention Disentanglement
Habin Lim, Youngseob Won, Juwon Seo et al.
ShapeWords: Guiding Text-to-Image Synthesis with 3D Shape-Aware Prompts
Dmitrii M Petrov, Pradyumn Goyal, Divyansh Shivashok et al.
SC-Lane: Slope-aware and Consistent Road Height Estimation Framework for 3D Lane Detection
Chaesong Park, Eunbin Seo, JihyeonHwang JihyeonHwang et al.
FullDiT: Video Generative Foundation Models with Multimodal Control via Full Attention
Xuan Ju, Weicai Ye, Quande Liu et al.
GaussianIP: Identity-Preserving Realistic 3D Human Generation via Human-Centric Diffusion Prior
Zichen Tang, Yuan Yao, Miaomiao Cui et al.
Re-Evaluating the Impact of Unseen-Class Unlabeled Data on Semi-Supervised Learning Model
Rundong He, Yicong Dong, Lan-Zhe Guo et al.
Bridging Gait Recognition and Large Language Models Sequence Modeling
Shaopeng Yang, Jilong Wang, Saihui Hou et al.
Cross-Rejective Open-Set SAR Image Registration
Shasha Mao, Shiming Lu, Zhaolong Du et al.
Controllable Latent Space Augmentation for Digital Pathology
Sofiène Boutaj, Marin Scalbert, Pierre Marza et al.
Multi-scenario Overlapping Text Segmentation with Depth Awareness
Yang Liu, Xudong Xie, Yuliang Liu et al.
SOAP: Vision-Centric 3D Semantic Scene Completion with Scene-Adaptive Decoder and Occluded Region-Aware View Projection
Hyo-Jun Lee, Yeong Jun Koh, Hanul Kim et al.
MagicCity: Geometry-Aware 3D City Generation from Satellite Imagery with Multi-View Consistency
Xingbo YAO, xuanmin Wang, Hao WU et al.
VehicleMAE: View-asymmetry Mutual Learning for Vehicle Re-identification Pre-training via Masked AutoEncoders
Qi Wang, Zeyu Zhang, Dong Wang et al.
Temporal Overlapping Prediction: A Self-supervised Pre-training Method for LiDAR Moving Object Segmentation
Ziliang Miao, Runjian Chen, Yixi Cai et al.
FedCS: Coreset Selection for Federated Learning
Chenhe Hao, Weiying Xie, Daixun Li et al.
GraphI2P: Image-to-Point Cloud Registration with Exploring Pattern of Correspondence via Graph Learning
Lin Bie, Shouan Pan, Siqi Li et al.
ArchiSet: Benchmarking Editable and Consistent Single-View 3D Reconstruction of Buildings with Specific Window-to-Wall Ratios
Jun Yin, Pengyu Zeng, Licheng Shen et al.
S²M²: Scalable Stereo Matching Model for Reliable Depth Estimation
JUNHONG MIN, YOUNGPIL JEON, Jimin Kim et al.
What's Making That Sound Right Now? Video-centric Audio-Visual Localization
hahyeon choi, Junhoo Lee, Nojun Kwak
Photolithography Overlay Map Generation with Implicit Knowledge Distillation Diffusion Transformer
YuanFu Yang, Hsiu-Hui Hsiao
FlexUOD: The Answer to Real-world Unsupervised Image Outlier Detection
Zhonghang Liu, Kun Zhou, Changshuo Wang et al.
Federated Granger Causality Learning For Interdependent Clients With State Space Representation
Ayush Mohanty, Nazal Mohamed, Paritosh Ramanan et al.
Samba: A Unified Mamba-based Framework for General Salient Object Detection
Jiahao He, Keren Fu, Xiaohong Liu et al.
TAD-E2E: A Large-scale End-to-end Autonomous Driving Dataset
Chang Liu, mingxuzhu mingxuzhu, Zheyuan Zhang et al.
Collaborative Tree Search for Enhancing Embodied Multi-Agent Collaboration
Lizheng Zu, Lin Lin, Song Fu et al.
Dual Exposure Stereo for Extended Dynamic Range 3D Imaging
Juhyung Choi, Jinneyong Kim, Seokjun Choi et al.
DeFSS: Image-to-Mask Denoising Learning for Few-shot Segmentation
Zishu Qin, Junhao Xu, Weifeng Ge
Towards Robustness of Person Search against Corruptions
Woojung Son, Yoonki Cho, Guoyuan An et al.
Improved Monocular Depth Prediction Using Distance Transform Over Pre-semantic Contours with Self-supervised Neural Networks
Marwane Hariat, Antoine Manzanera, David Filliat
Teleportraits: Training-Free People Insertion into Any Scene
Jialu Gao, Joseph K J, Fernando De la Torre
Towards Visual Localization Interoperability: Cross-Feature for Collaborative Visual Localization and Mapping
Alberto Jaenal, Paula Carbó Cubero, Jose Araujo et al.
GenVP: Generating Visual Puzzles with Contrastive Hierarchical VAEs
Kalliopi Basioti, Pritish Sahu, Qingze Liu et al.
ERUPT: Efficient Rendering with Unposed Patch Transformer
Maxim Shugaev, Vincent Chen, Maxim Karrenbach et al.
MiDSummer: Multi-Guidance Diffusion for Controllable Zero-Shot Immersive Gaussian Splatting Scene Generation
Anjun Hu, Richard Tomsett, Valentin Gourmet et al.
Spatio-Spectral Pattern Illumination for Direct and Indirect Separation from a Single Hyperspectral Image
Shin Ishihara, Imari Sato
TopoTTA: Topology-Enhanced Test-Time Adaptation for Tubular Structure Segmentation
Jiale Zhou, Wenhan Wang, Shikun Li et al.
Redefining <Creative> in Dictionary: Towards an Enhanced Semantic Understanding of Creative Generation
Fu Feng, Yucheng Xie, Xu Yang et al.
Structuring Benchmark into Knowledge Graphs to Assist Large Language Models in Retrieving and Designing Models
Hanmo Liu, Shimin Di, Jialiang Wang et al.
Variance-Based Membership Inference Attacks Against Large-Scale Image Captioning Models
Daniel Samira, Edan Habler, Yuval Elovici et al.
Generalized Zero-Shot Classification via Semantics-Free Inter-Class Feature Generation
Libiao Chen, Dong Nie, Junjun Pan et al.
SDFormer: Vision-based 3D Semantic Scene Completion via SAM-assisted Dual-channel Voxel Transformer
Yujie Xue, Huilong Pi, Jiapeng Zhang et al.
Camera Resection from Known Line Pencils and a Radially Distorted Scanline
Juan Carlos Dibene Simental, Enrique Dunn
SKDream: Controllable Multi-view and 3D Generation with Arbitrary Skeletons
Yuanyou Xu, Zongxin Yang, Yi Yang
Closest Neighbors are Harmful for Lightweight Masked Auto-encoders
Jian Meng, Ahmed Hasssan, Li Yang et al.
GeoFormer: Geometry Point Encoder for 3D Object Detection with Graph-based Transformer
Xin Jin, Haisheng Su, Cong Ma et al.
Consistent Normal Orientation for 3D Point Clouds via Least Squares on Delaunay Graph
Rao Fu, Jianmin Zheng, Liang Yu
BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology
Amaya Gallagher-Syed, Henry Senior, Omnia Alwazzan et al.
Allowing Oscillation Quantization: Overcoming Solution Space Limitation in Low Bit-Width Quantization
Weiying Xie, Zihan Meng, Jitao Ma et al.
Tile-wise vs. Image-wise: Random-Tile Loss and Training Paradigm for Gaussian Splatting
Xiaoyu Zhang, Weihong Pan, Xiaojun Xiang et al.
VIPerson: Flexibly Generating Virtual Identity for Person Re-Identification
Xiao-Wen Zhang, Delong Zhang, Yi-Xing Peng et al.
Hybrid Reciprocal Transformer with Triplet Feature Alignment for Scene Graph Generation
Jiawei Fu, ZHANG Tiantian, Kai Chen et al.
Learning Person-Specific Animatable Face Models from In-the-Wild Images via a Shared Base Model
Yuxiang Mao, Zhenfeng Fan, Zhijie Zhang et al.
EditCLIP: Representation Learning for Image Editing
Qian Wang, Aleksandar Cvejic, Abdelrahman Eldesokey et al.
QK-Edit: Revisiting Attention-based Injection in MM-DiT for Image and Video Editing
Tiancheng SHEN, Jun Hao Liew, Zilong Huang et al.
High-dimension Prototype is a Better Incremental Object Detection Learner
Yanjie Wang, Liqun Chen, Tianming Zhao et al.
Let's Chorus: Partner-aware Hybrid Song-Driven 3D Head Animation
Xiumei Xie, Zikai Huang, Wenhao Xu et al.
Think Twice: Test-Time Reasoning for Robust CLIP Zero-Shot Classification
Shenyu Lu, Zhaoying Pan, Xiaoqian Wang
A Simple yet Effective $\Delta\Delta G$ Predictor is An Unsupervised Antibody Optimizer and Explainer
Lirong Wu, Yunfan Liu, Haitao Lin et al.
Adapt Foundational Segmentation Models with Heterogeneous Searching Space
Li Yi, Jie Hu, Songan Zhang et al.
CMB-ML: A Cosmic Microwave Background Dataset for the Oldest Possible Computer Vision Task
James Amato, Yunan Xie, Leonel Medina-Varela et al.
Can Machines Understand Composition? Dataset and Benchmark for Photographic Image Composition Embedding and Understanding
Zhaoran Zhao, Peng Lu, Anran Zhang et al.
Explaining Human Preferences via Metrics for Structured 3D Reconstruction
Jack Langerman, Denis Rozumny, Yuzhong Huang et al.
Optimizing Neural Network Representations of Boolean Networks
Joshua Russell, Ignacio Gavier, Devdhar Patel et al.
Two Losses, One Goal: Balancing Conflict Gradients for Semi-supervised Semantic Segmentation
Rui Sun, Huayu Mai, Wangkai Li et al.
SMSTracker: Tri-path Score Mask Sigma Fusion for Multi-Modal Tracking
Sixian Chan, Zedong Li, Xiaoqin Zhang et al.
SDBF: Steep-Decision-Boundary Fingerprinting for Hard-Label Tampering Detection of DNN Models
Xiaofan Bai, Shixin Li, Xiaojing Ma et al.
Gromov–Wasserstein Problem with Cyclic Symmetry
Shoichiro Takeda, Yasunori Akagi
UINavBench: A Framework for Comprehensive Evaluation of Interactive Digital Agents
Harsh Agrawal, Eldon Schoop, Xinlei Pan et al.
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
Quanquan Gu, Jinghui Chen, Yuan Cao et al.
Time After Time: Deep-Q Effect Estimation for Interventions on When and What to do
Yoav Wald, Mark Goldstein, Yonathan Efroni et al.
RoCo-Sim: Enhancing Roadside Collaborative Perception through Foreground Simulation
Yuwen Du, Anning Hu, Zichen Chao et al.
Height-Fidelity Dense Global Fusion for Multi-modal 3D Object Detection
Hanshi Wang, Jin Gao, Weiming Hu et al.
Diversity-Enhanced Distribution Alignment for Dataset Distillation
Hongcheng Li, Yucan Zhou, Xiaoyan Gu et al.
Separation for Better Integration: Disentangling Edge and Motion in Event-based Deblurring
Yufei Zhu, Hao Chen, Yongjian Deng et al.
CASP: Consistency-aware Audio-induced Saliency Prediction Model for Omnidirectional Video
Zhaolin Wan, Han Qin, Zhiyang Li et al.
A Universal Scale-Adaptive Deformable Transformer for Image Restoration across Diverse Artifacts
Xuyi He, Yuhui Quan, Ruotao Xu et al.
HFD-Teacher: High-Frequency Depth Distillation from Depth Foundation Models for Enhanced Depth Completion
Zhiyuan Yang, Anqi Cheng, Haiyue Zhu et al.
A4A: Adapter for Adapter Transfer via All-for-All Mapping for Cross-Architecture Models
Keyu Tu, Mengqi Huang, Zhuowei Chen et al.
Towards Precise Embodied Dialogue Localization via Causality Guided Diffusion
Haoyu Wang, Le Wang, Sanping Zhou et al.
LR0.FM: LOW-RESOLUTION ZERO-SHOT CLASSIFICATION BENCHMARK FOR FOUNDATION MODELS
Priyank Pathak, Shyam Marjit, Shruti Vyas et al.
Democratizing High-Fidelity Co-Speech Gesture Video Generation
Xu Yang, Shaoli Huang, Shenbo Xie et al.
Backdoor Mitigation by Distance-Driven Detoxification
Shaokui Wei, Jiayin Liu, Hongyuan Zha
Hierarchy UGP: Hierarchy Unified Gaussian Primitive for Large-Scale Dynamic Scene Reconstruction
Hongyang Sun, Qinglin Yang, Jiawei Wang et al.
Temporal Difference Learning: Why It Can Be Fast and How It Will Be Faster
Patrick Schnell, Luca Guastoni, Nils Thuerey
High-quality Text-to-3D Character Generation with SparseCubes and Sparse Transformers.
Jiachen Qian, Hongye Yang, Shuang Wu et al.
Disentangling Safe and Unsafe Image Corruptions via Anisotropy and Locality
Ramchandran Muthukumar, Ambar Pal, Jeremias Sulam et al.
Temporal Rate Reduction Clustering for Human Motion Segmentation
Xianghan Meng, Zhengyu Tong, Zhiyuan Huang et al.
RareCLIP: Rarity-aware Online Zero-shot Industrial Anomaly Detection
Jianfang He, Min Cao, Silong Peng et al.
GMMamba: Group Masking Mamba for Whole Slide Image Classification
Tingting Zheng, Hongxun Yao, Kui Jiang et al.
ReMP-AD: Retrieval-enhanced Multi-modal Prompt Fusion for Few-Shot Industrial Visual Anomaly Detection
Hongchi Ma, Guanglei Yang, Debin Zhao et al.
EVA: Geometric Inverse Design for Fast Protein Motif-Scaffolding with Coupled Flow
Yufei Huang, Yunshu Liu, Lirong Wu et al.
Scene Graph Guided Generation: Enable Accurate Relations Generation in Text-to-Image Models via Textural Rectification
Guibao SHEN, Luozhou Wang, Jiantao Lin et al.
Tracing Copied Pixels and Regularizing Patch Affinity in Copy Detection
Yichen Lu, Siwei Nie, Minlong Lu et al.
Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis
Xinyu Hou, Zongsheng Yue, Xiaoming Li et al.
Balancing Bias in Two-sided Markets for Fair Stable Matchings
Siyuan Wu, Leong Hou U, Panagiotis Karras
SD2Actor: Continuous State Decomposition via Diffusion Embeddings for Robotic Manipulation
lijiayi jiayi
Descriptor-In-Pixel : Point-Feature Tracking For Pixel Processor Arrays
Laurie Bose, Piotr Dudek, Jianing Chen
Doppelgängers and Adversarial Vulnerability
George Kamberov
CoSMIC: Continual Self-supervised Learning for Multi-Domain Medical Imaging via Conditional Mutual Information Maximization
Yihang Liu, Ying Wen, Longzhen Yang et al.
UniVerse: Unleashing the Scene Prior of Video Diffusion Models for Robust Radiance Field Reconstruction
Jin Cao, Hongrui Wu, Ziyong Feng et al.
ExploreGS: Explorable 3D Scene Reconstruction with Virtual Camera Samplings and Diffusion Priors
Minsu Kim, Subin Jeon, In Cho et al.
Beyond Brain Decoding: Visual-Semantic Reconstructions to Mental Creation Extension Based on fMRI
Haodong Jing, Dongyao Jiang, Yongqiang Ma et al.
CodePlan: Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning
Jiaxin Wen, Jian Guan, Hongning Wang et al.
LaneDiffusion: Improving Centerline Graph Learning via Prior Injected BEV Feature Generation
Zijie Wang, Weiming Zhang, Wei Zhang et al.
Matrix-Free Shared Intrinsics Bundle Adjustment
Daniel Safari
Seeing More with Less: Human-like Representations in Vision Models
Andrey Gizdov, Shimon Ullman, Daniel Harari
Chain of Semantics Programming in 3D Gaussian Splatting Representation for 3D Vision Grounding
Jiaxin Shi, Mingyue Xiang, Hao Sun et al.
Transformer-based Tooth Alignment Prediction with Occlusion and Collision Constraints
DongZhenXing DongZhenXing, Jiazhou Chen
Fuzzy Multimodal Learning for Trusted Cross-modal Retrieval
Siyuan Duan, Yuan Sun, Dezhong Peng et al.
PixTalk: Controlling Photorealistic Image Processing and Editing with Language
Marcos Conde, Zihao Lu, Radu Timofte
ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement
KA WONG, Jicheng Zhou, Haiwei Wu et al.
Unsupervised Histopathological Image Semantic Segmentation with Overlapping Patches Consistency Constraint
Wentian Cai, Weizhao Weng, Zihao Huang et al.
Planar Affine Rectification from Local Change of Scale and Orientation
Yuval Nissan, Marc Pollefeys, Daniel Barath
LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs
Haoran Lou, Chunxiao Fan, Ziyan Liu et al.
Towards Robust Defense against Customization via Protective Perturbation Resistant to Diffusion-based Purification
Wenkui Yang, Jie Cao, Junxian Duan et al.
Visual Textualization for Image Prompted Object Detection
Yongjian Wu, Yang Zhou, Jiya Saiyin et al.
ERNet: Efficient Non-Rigid Registration Network for Point Sequences
Guangzhao He, Yuxi Xiao, Zhen Xu et al.
FedXDS: Leveraging Model Attribution Methods to counteract Data Heterogeneity in Federated Learning
Maximilian Hoefler, Karsten Mueller, Wojciech Samek
VISO: Accelerating In-orbit Object Detection with Language-Guided Mask Learning and Sparse Inference
Meiqi Wang, Han Qiu
Doppler-Aware LiDAR-RADAR Fusion for Weather-Robust 3D Detection
Yujeong Chae, Heejun Park, Hyeonseong Kim et al.
Learning Partonomic 3D Reconstruction from Image Collections
Xiaoqian Ruan, Pei Yu, Dian Jia et al.
LOGICZSL: Exploring Logic-induced Representation for Compositional Zero-shot Learning
Peng Wu, Xiankai Lu, Hao Hu et al.
A Unified Framework for Industrial Cel-Animation Colorization with Temporal-Structural Awareness
Xiaoyi Feng, Tao Huang, Peng Wang et al.
Leveraging Spatial Invariance to Boost Adversarial Transferability
Zihan Zhou, LI LI, Yanli Ren et al.
FIND: Few-Shot Anomaly Inspection with Normal-Only Multi-Modal Data
YITING LI, Fayao Liu, Jingyi Liao et al.
PLAN: Proactive Low-Rank Allocation for Continual Learning
XIEQUN WANG, Zhan Zhuang, Yu Zhang
DC-TTA: Divide-and-Conquer Framework for Test-Time Adaptation of Interactive Segmentation
Jihun Kim, Hoyong Kwon, Hyeokjun Kweon et al.
Hybrid Layout Control for Diffusion Transformer: Fewer Annotations, Superior Aesthetics
Keming Wu, Junwen Chen, Zhanhao Liang et al.
SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding
chenkai zhang, Yiming Lei, Zeming Liu et al.
UniDxMD: Towards Unified Representation for Cross-Modal Unsupervised Domain Adaptation in 3D Semantic Segmentation
Zhengyin Liang, Hui Yin, Min Liang et al.
MS3D: High-Quality 3D Generation via Multi-Scale Representation Modeling
Guan Luo, Jianfeng Zhang
NATRA: Noise-Agnostic Framework for Trajectory Prediction with Noisy Observations
Rongqing Li, Changsheng Li, Ruilin Lv et al.
Uncertainty Meets Diversity: A Comprehensive Active Learning Framework for Indoor 3D Object Detection
Jiangyi Wang, Na Zhao
Customizing Domain Adapters for Domain Generalization
Yuyang Ji, Zeyi Huang, Haohan Wang et al.
Classic but Everlasting: Traditional Gradient-Based Algorithms Converge Fast Even in Time-Varying Multi-Player Games
Yanzheng Chen, Jun Yu
Text-Driven Fashion Image Editing with Compositional Concept Learning and Counterfactual Abduction
Shanshan Huang, Haoxuan Li, Chunyuan Zheng et al.
Autoregressive Sequential Pretraining for Visual Tracking
Shiyi Liang, Yifan Bai, Yihong Gong et al.
COME: Dual Structure-Semantic Learning with Collaborative MoE for Universal Lesion Detection Across Heterogeneous Ultrasound Datasets
Lingyu Chen, Yawen Zeng, Yue Wang et al.
A Selective Re-learning Mechanism for Hyperspectral Fusion Imaging
Yuanye Liu, jinyang liu, Renwei Dian et al.
Efficient Visual Place Recognition Through Multimodal Semantic Knowledge Integration
Sitao Zhang, Hongda Mao, Qingshuang Chen et al.
Generative Video Bi-flow
Chen Liu, Tobias Ritschel
TPG-INR: Target Prior-Guided Implicit 3D CT Reconstruction for Enhanced Sparse-view Imaging
QingleiCao QingleiCao, Ziyao Tang, Xiaoqin Tang
Context-Aware Academic Emotion Dataset and Benchmark
Luming Zhao, Jingwen Xuan, Jiamin Lou et al.
Event-aided Dense and Continuous Point Tracking: Everywhere and Anytime
Zhexiong Wan, Jianqin Luo, Yuchao Dai et al.
$\text{I}^2\text{AM}$: Interpreting Image-to-Image Latent Diffusion Models via Bi-Attribution Maps
Junseo Park, Hyeryung Jang
ArgoTweak: Towards Self-Updating HD Maps through Structured Priors
Lena Wild, Rafael Valencia, Patric Jensfelt
TAET: Two-Stage Adversarial Equalization Training on Long-Tailed Distributions
Wang Yu-Hang, Junkang Guo, Aolei Liu et al.
Soft Separation and Distillation: Toward Global Uniformity in Federated Unsupervised Learning
Hung-Chieh Fang, Hsuan-Tien Lin, Irwin King et al.
Mamba-Reg: Vision Mamba Also Needs Registers
Feng Wang, Jiahao Wang, Sucheng Ren et al.
ArgMatch: Adaptive Refinement Gathering for Efficient Dense Matching
Yuxin Deng, Kaining Zhang, Linfeng Tang et al.
GaussianReg: Rapid 2D/3D Registration for Emergency Surgery via Explicit 3D Modeling with Gaussian Primitives
Weihao Yu, Xiaoqing Guo, Xinyu Liu et al.
MambaML: Exploring State Space Models for Multi-Label Image Classification
Xuelin Zhu, Jian liu, Jiuxin Cao et al.
Thermal Polarimetric Multi-view Stereo
Takahiro Kushida, Kenichiro Tanaka
Auto-Controlled Image Perception in MLLMs via Visual Perception Tokens
Runpeng Yu, Xinyin Ma, Xinchao Wang
Domain-aware Category-level Geometry Learning Segmentation for 3D Point Clouds
Pei He, Lingling Li, Licheng Jiao et al.
RAEncoder: A Label-Free Reversible Adversarial Examples Encoder for Dataset Intellectual Property Protection
Fan Xing, Zhuo Tian, Xuefeng Fan et al.
Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination
Rakshit Trivedi, Kartik Sharma, David Parkes
Information-Theoretic Discrete Diffusion
Moongyu Jeon, Sangwoo Shin, Dongjae Jeon et al.