Most Cited 2024 "linear fidelity differential equations" Papers
12,324 papers found
Efficient Spiking Neural Networks with Sparse Selective Activation for Continual Learning
Jiangrong Shen, Wenyao Ni, Qi Xu et al.
Exploiting Label Skews in Federated Learning with Model Concatenation
Yiqun Diao, Qinbin Li, Bingsheng He
Interactive Continual Learning: Fast and Slow Thinking
Biqing Qi, Xinquan Chen, Junqi Gao et al.
Alchemist: Parametric Control of Material Properties with Diffusion Models
Prafull Sharma, Varun Jampani, Yuanzhen Li et al.
V-IRL: Grounding Virtual Intelligence in Real Life
Jihan YANG, Runyu Ding, Ellis L Brown et al.
UGG: Unified Generative Grasping
Jiaxin Lu, Hao Kang, Haoxiang Li et al.
Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models
Shuang Li, Jiangjie Chen, Siyu Yuan et al.
NeuSurf: On-Surface Priors for Neural Surface Reconstruction from Sparse Input Views
Han Huang, Yulun Wu, Junsheng Zhou et al.
RobustSAM: Segment Anything Robustly on Degraded Images
Wei-Ting Chen, Yu Jiet Vong, Sy-Yen Kuo et al.
Towards Multimodal Sentiment Analysis Debiasing via Bias Purification
Dingkang Yang, Mingcheng Li, Dongling Xiao et al.
Frequency Decoupling for Motion Magnification via Multi-Level Isomorphic Architecture
Fei Wang, Dan Guo, Kun Li et al.
MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices
Yang Zhao, Zhisheng Xiao, Yanwu Xu et al.
ICP-Flow: LiDAR Scene Flow Estimation with ICP
Yancong Lin, Holger Caesar
Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
Yiwen Tang, Ray Zhang, Zoey Guo et al.
SemGrasp: Semantic Grasp Generation via Language Aligned Discretization
Kailin Li, Jingbo Wang, Lixin Yang et al.
Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection
Yuzhen Lin, Wentang Song, Bin Li et al.
MonoNPHM: Dynamic Head Reconstruction from Monocular Videos
Simon Giebenhain, Tobias Kirschstein, Markos Georgopoulos et al.
Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models
Peifei Zhu, Tsubasa Takahashi, Hirokatsu Kataoka
Disentangled Clothed Avatar Generation from Text Descriptions
Jionghao Wang, Yuan Liu, Zhiyang Dou et al.
Prompting Language-Informed Distribution for Compositional Zero-Shot Learning
Wentao Bao, Lichang Chen, Heng Huang et al.
SafeDreamer: Safe Reinforcement Learning with World Models
Weidong Huang, Jiaming Ji, Chunhe Xia et al.
Generalizable Human Gaussians for Sparse View Synthesis
Youngjoong Kwon, Baole Fang, Yixing Lu et al.
GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction
Xiao Chen, Quanyi Li, Tai Wang et al.
Adversarial Prompt Tuning for Vision-Language Models
Jiaming Zhang, Xingjun Ma, Xin Wang et al.
Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model
Decheng Liu, Xijun Wang, Chunlei Peng et al.
LingoQA: Video Question Answering for Autonomous Driving
Ana-Maria Marcu, Long Chen, Jan Hünermann et al.
Active Generalized Category Discovery
Shijie Ma, Fei Zhu, Zhun Zhong et al.
DGCLUSTER: A Neural Framework for Attributed Graph Clustering via Modularity Maximization
Aritra Bhowmick, Mert Kosan, Zexi Huang et al.
HiFi-123: Towards High-fidelity One Image to 3D Content Generation
Wangbo Yu, Li Yuan, Yanpei Cao et al.
Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention
Xingyu Zhou, Leheng Zhang, Xiaorui Zhao et al.
The Consensus Game: Language Model Generation via Equilibrium Search
Athul Jacob, Yikang Shen, Gabriele Farina et al.
Neural Redshift: Random Networks are not Random Functions
Damien Teney, Armand Nicolicioiu, Valentin Hartmann et al.
Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm
Yi Wu, Ziqiang Li, Heliang Zheng et al.
SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving
Qingwen Zhang, Yi Yang, Peizheng Li et al.
ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association
Shuxiao Ding, Lukas Schneider, Marius Cordts et al.
High-fidelity Person-centric Subject-to-Image Synthesis
Yibin Wang, Weizhong Zhang, Jianwei Zheng et al.
AGILE3D: Attention Guided Interactive Multi-object 3D Segmentation
Yuanwen Yue, Sabarinath Mahadevan, Jonas Schult et al.
Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection
Junjie Huang, Yun Ye, Zhujin Liang et al.
Progressive Semantic-Guided Vision Transformer for Zero-Shot Learning
Shiming Chen, Wenjin Hou, Salman Khan et al.
ExtDM: Distribution Extrapolation Diffusion Model for Video Prediction
Zhicheng Zhang, Junyao Hu, Wentao Cheng et al.
Transformer-Based No-Reference Image Quality Assessment via Supervised Contrastive Learning
Jinsong Shi, Pan Gao, Jie Qin
Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection
Feng Liu, Tengteng Huang, Qianjing Zhang et al.
Unraveling Instance Associations: A Closer Look for Audio-Visual Segmentation
Yuanhong Chen, Yuyuan Liu, Hu Wang et al.
NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation
Jingyang Huo, Yikai Wang, Yanwei Fu et al.
Concept-Guided Prompt Learning for Generalization in Vision-Language Models
Yi Zhang, Ce Zhang, Ke Yu et al.
Revisiting Link Prediction: a data perspective
Haitao Mao, Juanhui Li, Harry Shomer et al.
Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal
YUXIN WANG, Qianyi Wu, Guofeng Zhang et al.
AutoAD III: The Prequel – Back to the Pixels
Tengda Han, Max Bain, Arsha Nagrani et al.
Explaining Generalization Power of a DNN Using Interactive Concepts
Huilin Zhou, Hao Zhang, Huiqi Deng et al.
AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors
Kaishen Yuan, Zitong Yu, Xin Liu et al.
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
Trung Dao, Thuan Nguyen, Thanh Van Le et al.
CoGS: Controllable Gaussian Splatting
Heng Yu, Joel Julin, Zoltán Á. Milacski et al.
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Byung-Kwan Lee, Beomchan Park, Chae Won Kim et al.
AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation
Xinzhou Wang, Yikai Wang, junliang ye et al.
The Hidden Language of Diffusion Models
Hila Chefer, Oran Lang, Mor Geva et al.
Multi-Prompts Learning with Cross-Modal Alignment for Attribute-Based Person Re-identification
Yajing Zhai, Yawen Zeng, Zhiyong Huang et al.
Provably Powerful Graph Neural Networks for Directed Multigraphs
Beni Egressy, Luc von Niederhäusern, Jovan Blanuša et al.
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation
Yunhao Ge, Xiaohui Zeng, Jacob Huffman et al.
Low Rank Matrix Completion via Robust Alternating Minimization in Nearly Linear Time
Yuzhou Gu, Zhao Song, Junze Yin et al.
LiDAR4D: Dynamic Neural Fields for Novel Space-time View LiDAR Synthesis
Zehan Zheng, Fan Lu, Weiyi Xue et al.
Token-Level Contrastive Learning with Modality-Aware Prompting for Multimodal Intent Recognition
Qianrui Zhou, Hua Xu, Hao Li et al.
Audio-Synchronized Visual Animation
Lin Zhang, Shentong Mo, Yijing Zhang et al.
XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution
Yunpeng Qu, Kun Yuan, Kai Zhao et al.
Time Weaver: A Conditional Time Series Generation Model
Sai Shankar Narasimhan, Shubhankar Agarwal, Oguzhan Akcin et al.
Synergistic Multiscale Detail Refinement via Intrinsic Supervision for Underwater Image Enhancement
Dehuan Zhang, Jingchun Zhou, Chunle Guo et al.
FairSIN: Achieving Fairness in Graph Neural Networks through Sensitive Information Neutralization
Cheng Yang, Jixi Liu, Yunhe Yan et al.
Simple Semantic-Aided Few-Shot Learning
Hai Zhang, Junzhe Xu, Shanlin Jiang et al.
Learning Object State Changes in Videos: An Open-World Perspective
Zihui Xue, Kumar Ashutosh, Kristen Grauman
Asynchronous Large Language Model Enhanced Planner for Autonomous Driving
Yuan Chen, Zi-han Ding, Ziqin Wang et al.
Spurious Feature Diversification Improves Out-of-distribution Generalization
LIN Yong, Lu Tan, Yifan HAO et al.
Don't Play Favorites: Minority Guidance for Diffusion Models
Soobin Um, Suhyeon Lee, Jong Chul YE
GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation
Chenxin Li, Xinyu Liu, Cheng Wang et al.
Exploring Sparse Visual Prompt for Domain Adaptive Dense Prediction
Senqiao Yang, Jiarui Wu, Jiaming Liu et al.
Rethinking Generalizable Face Anti-spoofing via Hierarchical Prototype-guided Distribution Refinement in Hyperbolic Space
Chengyang Hu, Ke-Yue Zhang, Taiping Yao et al.
Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network
ye junyan, Zhutao Lv, Li Weijia et al.
Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval
Young Kyun Jang, Dat B Huynh, Ashish Shah et al.
Three Pillars Improving Vision Foundation Model Distillation for Lidar
Gilles Puy, Spyros Gidaris, Alexandre Boulch et al.
Inversion-Free Image Editing with Language-Guided Diffusion Models
Sihan Xu, Yidong Huang, Jiayi Pan et al.
Jointly Training Large Autoregressive Multimodal Models
Emanuele Aiello, Lili Yu, Yixin Nie et al.
MAS: Multi-view Ancestral Sampling for 3D Motion Generation Using 2D Diffusion
Roy Kapon, Guy Tevet, Daniel Cohen-Or et al.
Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction
Hao Li, Ying Chen, Yifei Chen et al.
Test-Time Domain Adaptation by Learning Domain-Aware Batch Normalization
Yanan Wu, Zhixiang Chi, Yang Wang et al.
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection
Sifan Zhou, Liang Li, Xinyu Zhang et al.
SAM-guided Graph Cut for 3D Instance Segmentation
Haoyu Guo, He Zhu, Sida Peng et al.
Graph Invariant Learning with Subgraph Co-mixup for Out-of-Distribution Generalization
Tianrui Jia, Haoyang Li, Cheng Yang et al.
Collaborating Foundation Models for Domain Generalized Semantic Segmentation
Yasser Benigmim, Subhankar Roy, Slim Essid et al.
Rethinking Graph Masked Autoencoders through Alignment and Uniformity
Liang Wang, Xiang Tao, Qiang Liu et al.
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition
CHEN CHEN, Ruizhe Li, Yuchen Hu et al.
OpenStreetView-5M: The Many Roads to Global Visual Geolocation
Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis et al.
Relevant Intrinsic Feature Enhancement Network for Few-Shot Semantic Segmentation
Xiaoyi Bao, Jie Qin, Siyang Sun et al.
Fantastic Animals and Where to Find Them: Segment Any Marine Animal with Dual SAM
Pingping Zhang, Tianyu Yan, Yang Liu et al.
Towards Generalizable Multi-Object Tracking
Zheng Qin, Le Wang, Sanping Zhou et al.
A Dynamic Kernel Prior Model for Unsupervised Blind Image Super-Resolution
Zhixiong Yang, Jingyuan Xia, Shengxi Li et al.
Frequency-Adaptive Pan-Sharpening with Mixture of Experts
Xuanhua He, Keyu Yan, Rui Li et al.
REACTO: Reconstructing Articulated Objects from a Single Video
Chaoyue Song, Jiacheng Wei, Chuan-Sheng Foo et al.
How Far Can We Compress Instant-NGP-Based NeRF?
Yihang Chen, Qianyi Wu, Mehrtash Harandi et al.
CFR-ICL: Cascade-Forward Refinement with Iterative Click Loss for Interactive Image Segmentation
Shoukun Sun, Min Xian, Fei Xu et al.
Transductive Zero-Shot and Few-Shot CLIP
Ségolène Martin, Yunshi HUANG, Fereshteh Shakeri et al.
HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D
Sangmin Woo, byeongjun park, Hyojun Go et al.
Audio Generation with Multiple Conditional Diffusion Model
Zhifang Guo, Jianguo Mao, Tao Rui et al.
Training Unbiased Diffusion Models From Biased Dataset
Yeongmin Kim, Byeonghu Na, Minsang Park et al.
G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model
Pan Xie, Qipeng Zhang, Peng Taiying et al.
Random Feature Amplification: Feature Learning and Generalization in Neural Networks
Spencer Frei, Niladri Chatterji, Peter L. Bartlett
CPPO: Continual Learning for Reinforcement Learning with Human Feedback
Han Zhang, Yu Lei, Lin Gui et al.
View-decoupled Transformer for Person Re-identification under Aerial-ground Camera Network
Quan Zhang, Lei Wang, Vishal M. Patel et al.
Towards a Simultaneous and Granular Identity-Expression Control in Personalized Face Generation
Renshuai Liu, Bowen Ma, Wei Zhang et al.
Graph-Aware Contrasting for Multivariate Time-Series Classification
Yucheng Wang, Yuecong Xu, Jianfei Yang et al.
Exact Diffusion Inversion via Bidirectional Integration Approximation
Guoqiang Zhang, j.p. lewis, W. Bastiaan Kleijn
Lossy Image Compression with Foundation Diffusion Models
Lucas Relic, Roberto Azevedo, Markus Gross et al.
SpikePoint: An Efficient Point-based Spiking Neural Network for Event Cameras Action Recognition
Hongwei Ren, Yue ZHOU, Xiaopeng LIN et al.
HowToCaption: Prompting LLMs to Transform Video Annotations at Scale
Nina Shvetsova, Anna Kukleva, Xudong Hong et al.
Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion
Bohan Li, Jiajun Deng, Wenyao Zhang et al.
Atlantis: Enabling Underwater Depth Estimation with Stable Diffusion
Fan Zhang, Shaodi You, Yu Li et al.
PREGO: Online Mistake Detection in PRocedural EGOcentric Videos
Alessandro Flaborea, Guido M. D' et al.
GTA: A Geometry-Aware Attention Mechanism for Multi-View Transformers
Takeru Miyato, Bernhard Jaeger, Max Welling et al.
Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning
XINYUAN GAO, Songlin Dong, Yuhang He et al.
SpecNeRF: Gaussian Directional Encoding for Specular Reflections
Li Ma, Vasu Agrawal, Haithem Turki et al.
Attention Guided CAM: Visual Explanations of Vision Transformer Guided by Self-Attention
Saebom Leem, Hyunseok Seo
G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks
Anchun Gui, Jinqiang Ye, Han Xiao
Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling
Shentong Mo, Pedro Morgado
Contextrast: Contextual Contrastive Learning for Semantic Segmentation
Changki Sung, Wanhee Kim, Jungho An et al.
FlowIE: Efficient Image Enhancement via Rectified Flow
Yixuan Zhu, Wenliang Zhao, Ao Li et al.
ExACT: Language-guided Conceptual Reasoning and Uncertainty Estimation for Event-based Action Recognition and More
Jiazhou Zhou, Xu Zheng, Yuanhuiyi Lyu et al.
Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures
Yannick Kirchhoff, Maximilian Rokuss, Saikat Roy et al.
Fair and Efficient Contribution Valuation for Vertical Federated Learning
Zhenan Fan, Huang Fang, Xinglu Wang et al.
Open-World Human-Object Interaction Detection via Multi-modal Prompts
Jie Yang, Bingliang Li, Ailing Zeng et al.
LaWa: Using Latent Space for In-Generation Image Watermarking
Ahmad Rezaei, Mohammad Akbari, Saeed Ranjbar Alvar et al.
R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding
Ye Liu, Jixuan He, Wanhua Li et al.
TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small Target Detection
Tianxiang Chen, Zhentao Tan, Qi Chu et al.
Learning Generalized Medical Image Segmentation from Decoupled Feature Queries
Qi Bi, Jingjun Yi, Hao Zheng et al.
Physical Property Understanding from Language-Embedded Feature Fields
Albert J. Zhai, Yuan Shen, Emily Y. Chen et al.
RegionDrag: Fast Region-Based Image Editing with Diffusion Models
Jingyi Lu, Xinghui Li, Kai Han
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity
Santiago Pascual, Chunghsin YEH, Ioannis Tsiamas et al.
Distilling Autoregressive Models to Obtain High-Performance Non-autoregressive Solvers for Vehicle Routing Problems with Faster Inference Speed
Yubin Xiao, Di Wang, Boyang Li et al.
Root Cause Analysis in Microservice Using Neural Granger Causal Discovery
Cheng-Ming Lin, Ching Chang, Wei-Yao Wang et al.
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
Lorenzo Baraldi, Federico Cocchi, Marcella Cornia et al.
CAT: Exploiting Inter-Class Dynamics for Domain Adaptive Object Detection
Mikhail Kennerley, Jian-Gang Wang, Bharadwaj Veeravalli et al.
Hierarchical Gaussian Mixture Normalizing Flow Modeling for Unified Anomaly Detection
Xincheng Yao, Ruoqi Li, Zefeng Qian et al.
Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory
Sensen Gao, Xiaojun Jia, Xuhong Ren et al.
Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation
Yuan Yuan, Chenyang Shao, Jingtao Ding et al.
RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation
Haiming Zhang, Xu Yan, Dongfeng Bai et al.
Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning
Chenyu Zhang, Han Wang, Aritra Mitra et al.
Domain-Controlled Prompt Learning
Qinglong Cao, Zhengqin Xu, Yuntian Chen et al.
Material Palette: Extraction of Materials from a Single Image
Ivan Lopes, Fabio Pizzati, Raoul de Charette
Localization Is All You Evaluate: Data Leakage in Online Mapping Datasets and How to Fix It
Adam Lilja, Junsheng Fu, Erik Stenborg et al.
Towards the Uncharted: Density-Descending Feature Perturbation for Semi-supervised Semantic Segmentation
Xiaoyang Wang, Huihui Bai, Limin Yu et al.
Denoising Vision Transformers
Jiawei Yang, Katie Luo, Jiefeng Li et al.
Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions for Tree Ensembles
Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer et al.
InterHandGen: Two-Hand Interaction Generation via Cascaded Reverse Diffusion
Jihyun Lee, Shunsuke Saito, Giljoo Nam et al.
LEOD: Label-Efficient Object Detection for Event Cameras
Ziyi Wu, Mathias Gehrig, Qing Lyu et al.
Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion
Otto Seiskari, Jerry Ylilammi, Valtteri Kaatrasalo et al.
VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation
Yang Chen, Yingwei Pan, haibo yang et al.
Modular Blind Video Quality Assessment
Wen Wen, Mu Li, Yabin ZHANG et al.
NECO: NEural Collapse Based Out-of-distribution detection
Mouïn Ben Ammar, Nacim Belkhir, Sebastian Popescu et al.
Spanning Training Progress: Temporal Dual-Depth Scoring (TDDS) for Enhanced Dataset Pruning
xin zhang, Jiawei Du, Weiying Xie et al.
PosterLlama: Bridging Design Ability of Langauge Model to Content-Aware Layout Generation
Jaejung Seol, Seojun Kim, Jaejun Yoo
WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model
Haisheng Fu, Jie Liang, Zhenman Fang et al.
Multi-Space Alignments Towards Universal LiDAR Segmentation
Youquan Liu, Lingdong Kong, Xiaoyang Wu et al.
Deep Contrastive Graph Learning with Clustering-Oriented Guidance
Mulin Chen, Bocheng Wang, Xuelong Li
MVPGS: Excavating Multi-view Priors for Gaussian Splatting from Sparse Input Views
Wangze Xu, Huachen Gao, Shihe Shen et al.
Soft Prompt Generation for Domain Generalization
Shuanghao Bai, Yuedi Zhang, Wanqi Zhou et al.
TopoGCL: Topological Graph Contrastive Learning
Yuzhou Chen, Jose Frias, Yulia Gel
Universal Segmentation at Arbitrary Granularity with Language Instruction
Yong Liu, Cairong Zhang, Yitong Wang et al.
EAGLE: Eigen Aggregation Learning for Object-Centric Unsupervised Semantic Segmentation
Chanyoung Kim, Woojung Han, Dayun Ju et al.
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs
Muhammad Jehanzeb Mirza, Leonid Karlinsky, Wei Lin et al.
Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models
Yufei Zhan, Yousong Zhu, Zhiyang Chen et al.
WiMANS: A Benchmark Dataset for WiFi-based Multi-user Activity Sensing
Shuokang Huang, Kaihan Li, Di You et al.
Image Inpainting via Tractable Steering of Diffusion Models
Anji Liu, Mathias Niepert, Guy Van den Broeck
Style Blind Domain Generalized Semantic Segmentation via Covariance Alignment and Semantic Consistence Contrastive Learning
Woo-Jin Ahn, Geun-Yeong Yang, Hyunduck Choi et al.
Unifying Multi-Modal Uncertainty Modeling and Semantic Alignment for Text-to-Image Person Re-identification
Zhiwei Zhao, Bin Liu, Yan Lu et al.
Towards Language-Driven Video Inpainting via Multimodal Large Language Models
Jianzong Wu, Xiangtai Li, Chenyang Si et al.
Seeing Motion at Nighttime with an Event Camera
Haoyue Liu, Shihan Peng, Lin Zhu et al.
Region-Disentangled Diffusion Model for High-Fidelity PPG-to-ECG Translation
Debaditya Shome, Pritam Sarkar, Ali Etemad
SHAP-EDITOR: Instruction-Guided Latent 3D Editing in Seconds
Minghao Chen, Junyu Xie, Iro Laina et al.
Training Like a Medical Resident: Context-Prior Learning Toward Universal Medical Image Segmentation
Yunhe Gao
N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields
Yash Bhalgat, Iro Laina, Joao F Henriques et al.
Prompt-Enhanced Multiple Instance Learning for Weakly Supervised Video Anomaly Detection
Junxi Chen, Liang Li, Li Su et al.
Insect-Foundation: A Foundation Model and Large-scale 1M Dataset for Visual Insect Understanding
Hoang-Quan Nguyen, Thanh-Dat Truong, Xuan-Bac Nguyen et al.
Revisiting the Domain Shift and Sample Uncertainty in Multi-source Active Domain Transfer
Wenqiao Zhang, Zheqi Lv
UniGarmentManip: A Unified Framework for Category-Level Garment Manipulation via Dense Visual Correspondence
Ruihai Wu, Haoran Lu, Yiyan Wang et al.
Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting
Zhicheng Wang, Liwen Xiao, Zhiguo Cao et al.
Frozen CLIP: A Strong Backbone for Weakly Supervised Semantic Segmentation
Bingfeng Zhang, Siyue Yu, Yunchao Wei et al.
Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework
Ziyao Huang, Fan Tang, Yong Zhang et al.
DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks
Jiaxin Zhang, Dezhi Peng, Chongyu Liu et al.
Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training
Cheng Tan, Jingxuan Wei, Zhangyang Gao et al.
Zero-1-to-3: Domain-Level Zero-Shot Cognitive Diagnosis via One Batch of Early-Bird Students towards Three Diagnostic Objectives
Weibo Gao, Qi Liu, Hao Wang et al.
OmniViD: A Generative Framework for Universal Video Understanding
Junke Wang, Dongdong Chen, Chong Luo et al.
Personalized Federated Domain-Incremental Learning based on Adaptive Knowledge Matching
Yichen Li, Wenchao Xu, Haozhao Wang et al.
Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions
Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi
Four Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding
Ozan Unal, Christos Sakaridis, Suman Saha et al.
Single Domain Generalization for Crowd Counting
Zhuoxuan Peng, S.-H. Gary Chan
Entropic Open-Set Active Learning
Bardia Safaei, Vibashan VS, Celso de Melo et al.
Hunting Attributes: Context Prototype-Aware Learning for Weakly Supervised Semantic Segmentation
feilong tang, Zhongxing Xu, Zhaojun QU et al.
Nuvo: Neural UV Mapping for Unruly 3D Representations
Pratul Srinivasan, Stephan J Garbin, Dor Verbin et al.
UMBRAE: Unified Multimodal Brain Decoding
Weihao Xia, Raoul de Charette, Cengiz Oztireli et al.
VOODOO 3D: Volumetric Portrait Disentanglement For One-Shot 3D Head Reenactment
Phong Tran, Egor Zakharov, Long Nhat Ho et al.
Dataset Distillation by Automatic Training Trajectories
Dai Liu, Jindong Gu, Hu Cao et al.
MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders
Baijiong Lin, Weisen Jiang, Pengguang Chen et al.
Biased Temporal Convolution Graph Network for Time Series Forecasting with Missing Values
Xiaodan Chen, Xiucheng Li, Bo Liu et al.
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling
Rui Liu, Yifan Hu, Yi Ren et al.