Most Cited 2024 "image stylization" Papers
12,324 papers found • Page 12 of 62
Conference
Multi-Objective Bayesian Optimization with Active Preference Learning
Ryota Ozaki, Kazuki Ishikawa, Youhei Kanzaki et al.
3426 Regulating Intermediate 3D Features for Vision-Centric Autonomous Driving
Junkai Xu, Liang Peng, Haoran Cheng et al.
Hyperbolic Learning with Synthetic Captions for Open-World Detection
Fanjie Kong, Yanbei Chen, Jiarui Cai et al.
Cycle-Consistency Learning for Captioning and Grounding
Ning Wang, Jiajun Deng, Mingbo Jia
ProCC: Progressive Cross-Primitive Compatibility for Open-World Compositional Zero-Shot Learning
Fushuo Huo, Wenchao Xu, Song Guo et al.
Online GNN Evaluation Under Test-time Graph Distribution Shifts
Xin Zheng, Dongjin Song, Qingsong Wen et al.
DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition
Qi Wang, Zhou Xu, Yuming Lin et al.
Can We Evaluate Domain Adaptation Models Without Target-Domain Labels?
JIANFEI YANG, Hanjie Qian, Yuecong Xu et al.
What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs
Alex Trevithick, Matthew Chan, Towaki Takikawa et al.
Sequential Fusion Based Multi-Granularity Consistency for Space-Time Transformer Tracking
Kun Hu, Wenjing Yang, Wanrong Huang et al.
CAMIL: Context-Aware Multiple Instance Learning for Cancer Detection and Subtyping in Whole Slide Images
olga fourkioti, Matt De Vries, Chris Bakal
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
Shihao Zhao, Shaozhe Hao, Bojia Zi et al.
X-Pose: Detecting Any Keypoints
Jie Yang, AILING ZENG, Ruimao Zhang et al.
UniHuman: A Unified Model For Editing Human Images in the Wild
Nannan Li, Qing Liu, Krishna Kumar Singh et al.
Free-Editor: Zero-shot Text-driven 3D Scene Editing
Md Nazmul Karim, Hasan Iqbal, Umar Khalid et al.
AttnZero: Efficient Attention Discovery for Vision Transformers
Lujun Li, Zimian Wei, Peijie Dong et al.
Referring Atomic Video Action Recognition
Kunyu Peng, Jia Fu, Kailun Yang et al.
MoVideo: Motion-Aware Video Generation with Diffusion Models
Jingyun Liang, Yuchen Fan, Kai Zhang et al.
Towards Fair Graph Federated Learning via Incentive Mechanisms
12794 Chenglu Pan, Jiarong Xu, Yue Yu et al.
Event Camera Data Dense Pre-training
Yan Yang, Liyuan Pan, Liu liu
Finding Visual Task Vectors
Alberto Hojel, Yutong Bai, Trevor Darrell et al.
M2Doc: A Multi-Modal Fusion Approach for Document Layout Analysis
Ning Zhang, Hiuyi Cheng, Jiayu Chen et al.
Vision-Language Action Knowledge Learning for Semantic-Aware Action Quality Assessment
Huangbiao Xu, Xiao Ke, Yuezhou Li et al.
RoDUS: Robust Decomposition of Static and Dynamic Elements in Urban Scenes
Thang-Anh-Quan Nguyen, Luis G Roldao Jimenez, Nathan Piasco et al.
AGS: Affordable and Generalizable Substitute Training for Transferable Adversarial Attack
Ruikui Wang, Yuanfang Guo, Yunhong Wang
Retrieval-Guided Reinforcement Learning for Boolean Circuit Minimization
Animesh Basak Chowdhury, Marco Romanelli, Benjamin Tan et al.
TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation
Xiaopei Wu, Yuenan Hou, Xiaoshui Huang et al.
PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs
Michael Dorkenwald, Nimrod Barazani, Cees G. M. Snoek et al.
Generative 3D Part Assembly via Part-Whole-Hierarchy Message Passing
Bi'an Du, Xiang Gao, Wei Hu et al.
A Restoration Network as an Implicit Prior
Yuyang Hu, Mauricio Delbracio, Peyman Milanfar et al.
SyFormer: Structure-Guided Synergism Transformer for Large-Portion Image Inpainting
Jie Wu, Yuchao Feng, Honghui Xu et al.
Pre-training with Random Orthogonal Projection Image Modeling
Maryam Haghighat, Peyman Moghadam, Shaheer Mohamed et al.
Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments
Djamahl Etchegaray, Zi Helen Huang, Tatsuya Harada et al.
CNN Kernels Can Be the Best Shapelets
Eric Qu, Yansen Wang, Xufang Luo et al.
EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification
Suorong Yang, Furao Shen, Jian Zhao
Dynamic Feature Pruning and Consolidation for Occluded Person Re-identification
YuTeng Ye, Hang Zhou, Jiale Cai et al.
AMD: Automatic Multi-step Distillation of Large-scale Vision Models
Cheng Han, Qifan Wang, Sohail A Dianat et al.
CMG-Net: Robust Normal Estimation for Point Clouds via Chamfer Normal Distance and Multi-Scale Geometry
Yingrui Wu, Mingyang Zhao, Keqiang Li et al.
DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction
Junwen Xiong, Peng Zhang, Tao You et al.
LEIA: Latent View-invariant Embeddings for Implicit 3D Articulation
Archana Swaminathan, Anubhav Anubhav, Kamal Gupta et al.
Neural-Symbolic Recursive Machine for Systematic Generalization
Qing Li, Yixin Zhu, Yitao Liang et al.
DexFuncGrasp: A Robotic Dexterous Functional Grasp Dataset Constructed from a Cost-Effective Real-Simulation Annotation System
Jinglue Hang, Xiangbo Lin, Tianqiang Zhu et al.
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer
Ning Yu, Chia-Chih Chen, Zeyuan Chen et al.
Exploiting Auxiliary Caption for Video Grounding
Hongxiang Li, Meng Cao, Xuxin Cheng et al.
HONGAT: Graph Attention Networks in the Presence of High-Order Neighbors
Heng-Kai Zhang, Yi-Ge Zhang, Zhi Zhou et al.
UnionFormer: Unified-Learning Transformer with Multi-View Representation for Image Manipulation Detection and Localization
Shuaibo Li, Wei Ma, Jianwei Guo et al.
Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations
Renzhe Zhou, Chen-Xiao Gao, Zongzhang Zhang et al.
HUMOS: Human Motion Model Conditioned on Body Shape
Shashank Tripathi, Omid Taheri, Christoph Lassner et al.
Adversarial Backdoor Attack by Naturalistic Data Poisoning on Trajectory Prediction in Autonomous Driving
Mozhgan Pourkeshavarz, Mohammad Sabokrou, Amir Rasouli
Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching
Rui Gong, Weide Liu, ZAIWANG GU et al.
Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning
Rui Zhao, Bin Shi, Jianfei Ruan et al.
Learning to Learn Better Visual Prompts
Fengxiang Wang, Wanrong Huang, Shaowu Yang et al.
UniCode : Learning a Unified Codebook for Multimodal Large Language Models
Sipeng Zheng, Bohan Zhou, Yicheng Feng et al.
Temporal Event Stereo via Joint Learning with Stereoscopic Flow
Hoonhee Cho, Jae-young Kang, Kuk-Jin Yoon
HandBooster: Boosting 3D Hand-Mesh Reconstruction by Conditional Synthesis and Sampling of Hand-Object Interactions
Hao Xu, Li Haipeng, Yinqiao Wang et al.
Accelerating Data Generation for Neural Operators via Krylov Subspace Recycling
Hong Wang, Zhongkai Hao, Jie Wang et al.
Free-VSC: Free Semantics from Visual Foundation Models for Unsupervised Video Semantic Compression
Yuan Tian, Guo Lu, Guangtao Zhai
Learning Instance-Aware Correspondences for Robust Multi-Instance Point Cloud Registration in Cluttered Scenes
Zhiyuan Yu, Zheng Qin, lintao zheng et al.
DiffSCI: Zero-Shot Snapshot Compressive Imaging via Iterative Spectral Diffusion Model
Zhenghao Pan, Haijin Zeng, Jiezhang Cao et al.
Spiking NeRF: Representing the Real-World Geometry by a Discontinuous Representation
Zhanfeng Liao, Yan Liu, Qian Zheng et al.
Reinforcement Learning Meets Visual Odometry
Nico Messikommer, Giovanni Cioffi, Mathias Gehrig et al.
PointNeRF++: A multi-scale, point-based Neural Radiance Field
Weiwei Sun, Eduard Trulls, Yang-Che Tseng et al.
ProTeCt: Prompt Tuning for Taxonomic Open Set Classification
Tz-Ying Wu, Chih-Hui Ho, Nuno Vasconcelos
Iterative Ensemble Training with Anti-Gradient Control for Mitigating Memorization in Diffusion Models
Xiao Liu, Xiaoliu Guan, Yu Wu et al.
Deep Structural Knowledge Exploitation and Synergy for Estimating Node Importance Value on Heterogeneous Information Networks
Yankai Chen, Yixiang Fang, Qiongyan Wang et al.
Inspecting Prediction Confidence for Detecting Black-Box Backdoor Attacks
Tong Wang, Yuan Yao, Feng Xu et al.
Norface: Improving Facial Expression Analysis by Identity Normalization
Hanwei Liu, Rudong An, Zhimeng Zhang et al.
NeRF Analogies: Example-Based Visual Attribute Transfer for NeRFs
Michael Fischer, Zhengqin Li, Thu Nguyen-Phuoc et al.
Slice3D: Multi-Slice Occlusion-Revealing Single View 3D Reconstruction
Yizhi Wang, Wallace Lira, Wenqi Wang et al.
Long-term Temporal Context Gathering for Neural Video Compression
Linfeng Qi, Zhaoyang Jia, Jiahao Li et al.
Robust Distillation via Untargeted and Targeted Intermediate Adversarial Samples
Junhao Dong, Piotr Koniusz, Junxi Chen et al.
Rating-Based Reinforcement Learning
Devin White, Mingkang Wu, Ellen Novoseller et al.
Dual-Window Multiscale Transformer for Hyperspectral Snapshot Compressive Imaging
Fulin Luo, Xi Chen, Xiuwen Gong et al.
CoBIT: A Contrastive Bi-directional Image-Text Generation Model
Haoxuan You, Xiaoyue Guo, Zhecan Wang et al.
Neural Volumetric World Models for Autonomous Driving
Zanming Huang, Jimuyang Zhang, Eshed Ohn-Bar
MoEAD: A Parameter-efficient Model for Multi-class Anomaly Detection
Shiyuan Meng, Wenchao Meng, Qihang Zhou et al.
FoSp: Focus and Separation Network for Early Smoke Segmentation
Lujian Yao, Haitao Zhao, Jingchao Peng et al.
Foster Adaptivity and Balance in Learning with Noisy Labels
Mengmeng Sheng, Zeren Sun, Tao Chen et al.
Region-Based Representations Revisited
Michal Shlapentokh-Rothman, Ansel Blume, Yao Xiao et al.
In2SET: Intra-Inter Similarity Exploiting Transformer for Dual-Camera Compressive Hyperspectral Imaging
Xin Wang, Lizhi Wang, Xiangtian Ma et al.
ReCoRe: Regularized Contrastive Representation Learning of World Model
Rudra P, K. Poudel, Harit Pandya et al.
Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks
Zhiying Jiang, Xingyuan Li, Jinyuan Liu et al.
Regroup Median Loss for Combating Label Noise
Authors: Fengpeng Li, Kemou Li, Jinyu Tian et al.
Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection
QIJIE MO, Yipeng Gao, Shenghao Fu et al.
FineMatch: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction
Hang Hua, Jing Shi, Kushal Kafle et al.
CDPNet: Cross-Modal Dual Phases Network for Point Cloud Completion
Zhenjiang Du, Jiale Dou, Zhitao Liu et al.
EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head
Qianyun He, Xinya Ji, Yicheng Gong et al.
SURE: SUrvey REcipes for building reliable and robust deep networks
Yuting Li, Yingyi Chen, Xuanlong Yu et al.
Grounding Language Models for Visual Entity Recognition
Zilin Xiao, Ming Gong, Paola Cascante-Bonilla et al.
Full Bayesian Significance Testing via Neural Networks
Zehua Liu, Zimeng Li, Jingyuan Wang et al.
PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects
Junyi Li, Junfeng Wu, Weizhi Zhao et al.
DreamMover: Leveraging the Prior of Diffusion Models for Image Interpolation with Large Motion
Liao Shen, Tianqi Liu, Huiqiang Sun et al.
InstaStyle: Inversion Noise of a Stylized Image is Secretly a Style Adviser
Xing Cui, Zekun Li, Peipei Li et al.
ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer
Jiazhi Guan, Zhiliang Xu, Hang Zhou et al.
Federated Causality Learning with Explainable Adaptive Optimization
Dezhi Yang, Xintong He, Jun Wang et al.
ScanTalk: 3D Talking Heads from Unregistered Scans
Federico Nocentini, Thomas Besnier, Claudio Ferrari et al.
Learning Representations on the Unit Sphere: Investigating Angular Gaussian and Von Mises-Fisher Distributions for Online Continual Learning
Nicolas Michel, Giovanni Chierchia, Romain Negrel et al.
Learning Representations of Satellite Images From Metadata Supervision
Jules Bourcier, Gohar Dashyan, Karteek Alahari et al.
Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation
Xinliang Zhang, Lei Zhu, Hangzhou He et al.
Rethinking the Paradigm of Content Constraints in Unpaired Image-to-Image Translation
Xiuding Cai, Yaoyao Zhu, Dong Miao et al.
FashionERN: Enhance-and-Refine Network for Composed Fashion Image Retrieval
Yanzhe Chen, Huasong Zhong, Xiangteng He et al.
MagicEraser: Erasing Any Objects via Semantics-Aware Control
FAN LI, Zixiao Zhang, Yi Huang et al.
Boosting Adversarial Training via Fisher-Rao Norm-based Regularization
Xiangyu Yin, Wenjie Ruan
MCPNet: An Interpretable Classifier via Multi-Level Concept Prototypes
Bor Shiun Wang, Chien-Yi Wang, Wei-Chen Chiu
On the Utility of 3D Hand Poses for Action Recognition
Md Salman Shamil, Dibyadip Chatterjee, Fadime Sener et al.
Differentiable Euler Characteristic Transforms for Shape Classification
Ernst Roell, Bastian Rieck
Brain-ID: Learning Contrast-agnostic Anatomical Representations for Brain Imaging
Peirong Liu, Oula Puonti, Xiaoling Hu et al.
Multi-Scene Generalized Trajectory Global Graph Solver with Composite Nodes for Multiple Object Tracking
Yan Gao, Haojun Xu, Jie Li et al.
SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models
Tongtian Yue, Jie Cheng, Longteng Guo et al.
Beyond Prototypes: Semantic Anchor Regularization for Better Representation Learning
Yanqi Ge, Qiang Nie, Ye Huang et al.
Parameterized Approximation Algorithms for Sum of Radii Clustering and Variants
Xianrun Chen, Dachuan Xu, Yicheng Xu et al.
Learning Temporal Resolution in Spectrogram for Audio Classification
Haohe Liu, Xubo Liu, Qiuqiang Kong et al.
UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding
Bowen Shi, Peisen Zhao, Zichen Wang et al.
3iGS: Factorised Tensorial Illumination for 3D Gaussian Splatting
Zhe Jun Tang, Tat-Jen Cham
DV-3DLane: End-to-end Multi-modal 3D Lane Detection with Dual-view Representation
Yueru Luo, Shuguang Cui, Zhen Li
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention
Chenhang He, Ruihuang Li, Guowen Zhang et al.
LQMFormer: Language-aware Query Mask Transformer for Referring Image Segmentation
Nisarg Shah, Vibashan VS, Vishal M. Patel
Chronic Poisoning: Backdoor Attack against Split Learning
Fangchao Yu, Bo Zeng, Kai Zhao et al.
Kalman-Inspired Feature Propagation for Video Face Super-Resolution
Ruicheng Feng, Chongyi Li, Chen Change Loy
Boosting 3D Single Object Tracking with 2D Matching Distillation and 3D Pre-training
qiangqiang wu, Yan Xia, Jia Wan et al.
BKDSNN: Enhancing the Performance of Learning-based Spiking Neural Networks Training with Blurred Knowledge Distillation
Zekai Xu, Kang You, Qinghai Guo et al.
InstructGIE: Towards Generalizable Image Editing
Zichong Meng, Changdi Yang, Jun Liu et al.
Single-View Scene Point Cloud Human Grasp Generation
Yan-Kang Wang, Chengyi Xing, Yi-Lin Wei et al.
Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
Peng Jin, Hao Li, Zesen Cheng et al.
Effective Video Mirror Detection with Inconsistent Motion Cues
Alex Warren, Ke Xu, Jiaying Lin et al.
Task-Driven Causal Feature Distillation: Towards Trustworthy Risk Prediction
Zhixuan Chu, Mengxuan Hu, Qing Cui et al.
Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image
Pengkun Jiao, Na Zhao, Jingjing Chen et al.
PH-Net: Semi-Supervised Breast Lesion Segmentation via Patch-wise Hardness
Siyao Jiang, Huisi Wu, Junyang Chen et al.
Light Schrödinger Bridge
Alexander Korotin, Nikita Gushchin, Evgeny Burnaev
WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language
Zhenxiang Lin, Xidong Peng, peishan cong et al.
Move Anything with Layered Scene Diffusion
Jiawei Ren, Mengmeng Xu, Jui-Chieh Wu et al.
STAMP: Outlier-Aware Test-Time Adaptation with Stable Memory Replay
Yu Yongcan, Lijun Sheng, Ran He et al.
Harnessing Text-to-Image Diffusion Models for Category-Agnostic Pose Estimation
Duo Peng, Zhengbo Zhang, Ping Hu et al.
Trust Region Methods for Nonconvex Stochastic Optimization beyond Lipschitz Smoothness
Chenghan Xie, Chenxi Li, Chuwen Zhang et al.
SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation
Jiaben Chen, Huaizu Jiang
Compound Text-Guided Prompt Tuning via Image-Adaptive Cues
Hao Tan, Jun Li, Yizhuang Zhou et al.
Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks
Mingqing Xiao, Qingyan Meng, Zongpeng Zhang et al.
Adaptive Compressed Sensing with Diffusion-Based Posterior Sampling
Noam Elata, Tomer Michaeli, Michael Elad
Semi-supervised Segmentation of Histopathology Images with Noise-Aware Topological Consistency
Meilong Xu, Xiaoling Hu, Saumya Gupta et al.
GDA: Generalized Diffusion for Robust Test-time Adaptation
Yun-Yun Tsai, Fu-Chen Chen, Albert Chen et al.
OTSeg: Multi-prompt Sinkhorn Attention for Zero-Shot Semantic Segmentation
Kwanyoung Kim, Yujin Oh, Jong Chul Ye
F3Loc: Fusion and Filtering for Floorplan Localization
Changan Chen, Rui Wang, Christoph Vogel et al.
Real-World Mobile Image Denoising Dataset with Efficient Baselines
Roman Flepp, Andrey Ignatov, Radu Timofte et al.
Where am I? Scene Retrieval with Language
Jiaqi Chen, Daniel Barath, Iro Armeni et al.
Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL
Fangwei Zhong, Kui Wu, Hai Ci et al.
USE: Universal Segment Embeddings for Open-Vocabulary Image Segmentation
Xiaoqi Wang, Wenbin He, Xiwei Xuan et al.
3DGazeNet: Generalizing Gaze Estimation with Weak Supervision from Synthetic Views
Evangelos Ververas, Polydefkis Gkagkos, Jiankang Deng et al.
Robust Multimodal Learning via Representation Decoupling
Shicai Wei, Yang Luo, Yuji Wang et al.
Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather
Junsung Park, Kyungmin Kim, Hyunjung Shim
SpectralNeRF: Physically Based Spectral Rendering with Neural Radiance Field
Ru Li, Jia Liu, Guanghui Liu et al.
Partial-to-Partial Shape Matching with Geometric Consistency
Viktoria Ehm, Maolin Gao, Paul Roetzer et al.
MultiDelete for Multimodal Machine Unlearning
Jiali Cheng, Hadi Amiri
Self-Guided Generation of Minority Samples Using Diffusion Models
Soobin Um, Jong Chul Ye
DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation
Xiaobin Hu, Xu Peng, Donghao Luo et al.
OVD-Explorer: Optimism Should Not Be the Sole Pursuit of Exploration in Noisy Environments
Jinyi Liu, Zhi Wang, Yan Zheng et al.
3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation
Zihao Xiao, Longlong Jing, Shangxuan Wu et al.
ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environments
Taewoong Kim, Cheolhong Min, Byeonghwi Kim et al.
IMMA: Immunizing text-to-image Models against Malicious Adaptation
Amber Yijia Zheng, Raymond Yeh
Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance
Reyhane Askari Hemmat, Melissa Hall, Alicia Yi Sun et al.
OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models
Zijian Zhou, Zheng Zhu, Holger Caesar et al.
MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models
Nithin Gopalakrishnan Nair, Jeya Maria Jose Valanarasu, Vishal Patel
Fed-QSSL: A Framework for Personalized Federated Learning under Bitwidth and Data Heterogeneity
Yiyue Chen, Haris Vikalo, Chianing Wang
AdapEdit: Spatio-Temporal Guided Adaptive Editing Algorithm for Text-Based Continuity-Sensitive Image Editing
Zhiyuan Ma, Guoli Jia, Bowen Zhou
Lift3D: Zero-Shot Lifting of Any 2D Vision Model to 3D
Mukund Varma T, Peihao Wang, Zhiwen Fan et al.
Label-Efficient Few-Shot Semantic Segmentation with Unsupervised Meta-Training
Jianwu Li, Kaiyue Shi, Guo-Sen Xie et al.
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models
Chen Ju, Haicheng Wang, Haozhe Cheng et al.
3D Neural Edge Reconstruction
Lei Li, Songyou Peng, Zehao Yu et al.
FedLoGe: Joint Local and Generic Federated Learning under Long-tailed Data
Zikai Xiao, Zihan Chen, Liyinglan Liu et al.
CALICO: Self-Supervised Camera-LiDAR Contrastive Pre-training for BEV Perception
Jiachen Sun, Haizhong Zheng, Qingzhao Zhang et al.
Distilling Reliable Knowledge for Instance-Dependent Partial Label Learning
Dong-Dong Wu, Deng-Bao Wang, Min-Ling Zhang
Enhancing RAW-to-sRGB with Decoupled Style Structure in Fourier Domain
Xuanhua He, Tao Hu, Guoli Wang et al.
Editable Image Elements for Controllable Synthesis
Jiteng Mu, Michael Gharbi, Richard Zhang et al.
ISP-Teacher: Image Signal Process with Disentanglement Regularization for Unsupervised Domain Adaptive Dark Object Detection
Yin Zhang, Yongqiang Zhang, Zian Zhang et al.
CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs
Yingji Zhong, Lanqing Hong, Zhenguo Li et al.
3D Multi-frame Fusion for Video Stabilization
Zhan Peng, Xinyi Ye, Weiyue Zhao et al.
TriSampler: A Better Negative Sampling Principle for Dense Retrieval
Zhen Yang, Zhou Shao, Yuxiao Dong et al.
Representing Part-Whole Hierarchies in Foundation Models by Learning Localizability Composability and Decomposability from Anatomy via Self Supervision
Mohammad Reza Hosseinzadeh Taher, Michael Gotway, Jianming Liang
Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems
Juno Kim, Kakei Yamamoto, Kazusato Oko et al.
An Empirical Study of the Generalization Ability of Lidar 3D Object Detectors to Unseen Domains
George Eskandar
Every Pixel Has its Moments: Ultra-High-Resolution Unpaired Image-to-Image Translation via Dense Normalization
Ming-Yang Ho, Che-Ming Wu, Min-Sheng Wu et al.
Cell Graph Transformer for Nuclei Classification
Wei Lou, Guanbin Li, Xiang Wan et al.
Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching
Ruonan Yu, Songhua Liu, Jingwen Ye et al.
Representing Topological Self-Similarity Using Fractal Feature Maps for Accurate Segmentation of Tubular Structures
Jiaxing Huang, Yanfeng Zhou, Yaoru Luo et al.
BENO: Boundary-embedded Neural Operators for Elliptic PDEs
Haixin Wang, Jiaxin Li, Anubhav Dwivedi et al.
Novel Class Discovery for Ultra-Fine-Grained Visual Categorization
Qi Jia, Yaqi Cai, Qi Jia et al.
Fine-Grained Knowledge Selection and Restoration for Non-exemplar Class Incremental Learning
Authors: Jiang-Tian Zhai, Xialei Liu, Lu Yu et al.
How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval?
Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain et al.
SRFormer: Text Detection Transformer with Incorporated Segmentation and Regression
Qingwen Bu, Sungrae Park, Minsoo Khang et al.
PAC Prediction Sets Under Label Shift
Wenwen Si, Sangdon Park, Insup Lee et al.
Just a Hint: Point-Supervised Camouflaged Object Detection
Huafeng Chen, Dian SHAO, Guangqian Guo et al.
SPD-DDPM: Denoising Diffusion Probabilistic Models in the Symmetric Positive Definite Space
Yunchen Li, Zhou Yu, Gaoqi He et al.
SketchINR: A First Look into Sketches as Implicit Neural Representations
Hmrishav Bandyopadhyay, Ayan Kumar Bhunia, Pinaki Nath Chowdhury et al.
RoDLA: Benchmarking the Robustness of Document Layout Analysis Models
Yufan Chen, Jiaming Zhang, Kunyu Peng et al.
Identifiability of Direct Effects from Summary Causal Graphs
Simon Ferreira, Charles Assaad
Rephrase, Augment, Reason: Visual Grounding of Questions for Vision-Language Models
Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal
Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction
Xiaoyang Lyu, Chirui Chang, Peng Dai et al.
Multi-Sentence Grounding for Long-term Instructional Video
Zeqian Li, QIRUI CHEN, Tengda Han et al.
CF-NeRF: Camera Parameter Free Neural Radiance Fields with Incremental Learning
Qingsong Yan, Qiang Wang, Kaiyong Zhao et al.
Adapting Fine-Grained Cross-View Localization to Areas without Fine Ground Truth
Zimin Xia, Yujiao Shi, HONGDONG LI et al.
Generalized Planning for the Abstraction and Reasoning Corpus
Chao Lei, Nir Lipovetzky, Krista A. Ehinger