Most Cited 2024 "image stylization" Papers
SNIDA: Unlocking Few-Shot Object Detection with Non-linear Semantic Decoupling Augmentation
Yanjie Wang, Xu Zou, Luxin Yan et al.
Adaptive Multi-Modal Cross-Entropy Loss for Stereo Matching
Peng Xu, Zhiyu Xiang, Chengyu Qiao et al.
Driving-Video Dehazing with Non-Aligned Regularization for Safety Assistance
Junkai Fan, Jiangwei Weng, Kun Wang et al.
Exploring Region-Word Alignment in Built-in Detector for Open-Vocabulary Object Detection
Heng Zhang, Qiuyu Zhao, Linyu Zheng et al.
L0-Sampler: An L0 Model Guided Volume Sampling for NeRF
Liangchen Li, Juyong Zhang
Diffusion 3D Features (Diff3F): Decorating Untextured Shapes with Distilled Semantic Features
Niladri Shekhar Dutt, Sanjeev Muralikrishnan, Niloy J. Mitra
Unsupervised Occupancy Learning from Sparse Point Cloud
Amine Ouasfi, Adnane Boukhayma
GLOW: Global Layout Aware Attacks on Object Detection
Jun Bao, Buyu Liu, Kui Ren et al.
Context-based and Diversity-driven Specificity in Compositional Zero-Shot Learning
Yun Li, Zhe Liu, Hang Chen et al.
Neural Underwater Scene Representation
Yunkai Tang, Chengxuan Zhu, Renjie Wan et al.
Scaled Decoupled Distillation
Shicai Wei, Chunbo Luo, Yang Luo
VISTA-LLAMA: Reducing Hallucination in Video Language Models via Equal Distance to Visual Tokens
Fan Ma, Xiaojie Jin, Heng Wang et al.
Hierarchical Intra-modal Correlation Learning for Label-free 3D Semantic Segmentation
Xin Kang, Lei Chu, Jiahao Li et al.
PARA-Drive: Parallelized Architecture for Real-time Autonomous Driving
Xinshuo Weng, Boris Ivanovic, Yan Wang et al.
Towards Generalizable Tumor Synthesis
Qi Chen, Xiaoxi Chen, Haorui Song et al.
Adaptive Hyper-graph Aggregation for Modality-Agnostic Federated Learning
Fan Qi, Shuai Li
Bi-SSC: Geometric-Semantic Bidirectional Fusion for Camera-based 3D Semantic Scene Completion
Yujie Xue, Ruihui Li, Fan Wu et al.
Efficient and Effective Weakly-Supervised Action Segmentation via Action-Transition-Aware Boundary Alignment
Angchi Xu, Wei-Shi Zheng
Depth-Aware Concealed Crop Detection in Dense Agricultural Scenes
Liqiong Wang, Jinyu Yang, Yanfu Zhang et al.
FC-GNN: Recovering Reliable and Accurate Correspondences from Interferences
Haobo Xu, Jun Zhou, Hua Yang et al.
MoMask: Generative Masked Modeling of 3D Human Motions
Chuan Guo, Yuxuan Mu, Muhammad Gohar Javed et al.
CapsFusion: Rethinking Image-Text Data at Scale
Qiying Yu, Quan Sun, Xiaosong Zhang et al.
A General and Efficient Training for Transformer via Token Expansion
Wenxuan Huang, Yunhang Shen, Jiao Xie et al.
BigGait: Learning Gait Representation You Want by Large Vision Models
Dingqiang Ye, Chao Fan, Jingzhe Ma et al.
Event-based Visible and Infrared Fusion via Multi-task Collaboration
Mengyue Geng, Lin Zhu, Lizhi Wang et al.
Breathing Life Into Sketches Using Text-to-Video Priors
Rinon Gal, Yael Vinker, Yuval Alaluf et al.
Gaussian Shell Maps for Efficient 3D Human Generation
Rameen Abdal, Wang Yifan, Zifan Shi et al.
Byzantine-robust Decentralized Federated Learning via Dual-domain Clustering and Trust Bootstrapping
Peng Sun, Xinyang Liu, Zhibo Wang et al.
MotionEditor: Editing Video Motion via Content-Aware Diffusion
Shuyuan Tu, Qi Dai, Zhi-Qi Cheng et al.
State Space Models for Event Cameras
Nikola Zubic, Mathias Gehrig, Davide Scaramuzza
DiffInDScene: Diffusion-based High-Quality 3D Indoor Scene Generation
Xiaoliang Ju, Zhaoyang Huang, Yijin Li et al.
Towards Calibrated Multi-label Deep Neural Networks
Jiacheng Cheng, Nuno Vasconcelos
TIM: A Time Interval Machine for Audio-Visual Action Recognition
Jacob Chalk, Jaesung Huh, Evangelos Kazakos et al.
Test-Time Linear Out-of-Distribution Detection
Ke Fan, Tong Liu, Xingyu Qiu et al.
Exploiting Style Latent Flows for Generalizing Deepfake Video Detection
Jongwook Choi, Taehoon Kim, Yonghyun Jeong et al.
LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example
Soyeon Yoon, Kwan Yun, Kwanggyoon Seo et al.
Leveraging Predicate and Triplet Learning for Scene Graph Generation
Jiankai Li, Yunhong Wang, Xiefan Guo et al.
Unsupervised Semantic Segmentation Through Depth-Guided Feature Correlation and Sampling
Leon Sick, Dominik Engel, Pedro Hermosilla et al.
HHMR: Holistic Hand Mesh Recovery by Enhancing the Multimodal Controllability of Graph Diffusion Models
Mengcheng Li, Hongwen Zhang, Yuxiang Zhang et al.
Enhancing Visual Continual Learning with Language-Guided Supervision
Bolin Ni, Hongbo Zhao, Chenghao Zhang et al.
PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI
Yandan Yang, Baoxiong Jia, Peiyuan Zhi et al.
Uncertainty-Aware Source-Free Adaptive Image Super-Resolution with Wavelet Augmentation Transformer
Yuang Ai, Xiaoqiang Zhou, Huaibo Huang et al.
Generalizing 6-DoF Grasp Detection via Domain Prior Knowledge
Haoxiang Ma, Modi Shi, Boyang Gao et al.
Making Vision Transformers Truly Shift-Equivariant
Renan A. Rojas-Gomez, Teck-Yian Lim, Minh Do et al.
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
Yujie Wei, Shiwei Zhang, Zhiwu Qing et al.
RankED: Addressing Imbalance and Uncertainty in Edge Detection Using Ranking-based Losses
Bedrettin Cetinkaya, Sinan Kalkan, Emre Akbas
Fine-Grained Bipartite Concept Factorization for Clustering
Chong Peng, Pengfei Zhang, Yongyong Chen et al.
Generalized Event Cameras
Varun Sundar, Matthew Dutson, Andrei Ardelean et al.
Multimodal Prompt Perceiver: Empower Adaptiveness, Generalizability and Fidelity for All-in-One Image Restoration
Yuang Ai, Huaibo Huang, Xiaoqiang Zhou et al.
BEVSpread: Spread Voxel Pooling for Bird’s-Eye-View Representation in Vision-based Roadside 3D Object Detection
Wenjie Wang, Yehao Lu, Guangcong Zheng et al.
Dual Pose-invariant Embeddings: Learning Category and Object-specific Discriminative Representations for Recognition and Retrieval
Rohan Sarkar, Avinash Kak
Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer
Zhen Zhao, Jingqun Tang, Chunhui Lin et al.
NIVeL: Neural Implicit Vector Layers for Text-to-Vector Generation
Vikas Thamizharasan, Difan Liu, Matthew Fisher et al.
Hyperbolic Anomaly Detection
Huimin Li, Zhentao Chen, Yunhao Xu et al.
Selective Nonlinearities Removal from Digital Signals
Krzysztof Maliszewski, Magdalena Urbanska, Varvara Vetrova et al.
Backdoor Defense via Test-Time Detecting and Repairing
Jiyang Guan, Jian Liang, Ran He
Towards a Perceptual Evaluation Framework for Lighting Estimation
Justine Giroux, Mohammad Reza Karimi Dastjerdi, Yannick Hold-Geoffroy et al.
DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance
Zixuan Wang, Jia Jia, Shikun Sun et al.
HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative
Cong Ma, Qiao Lei, Chengkai Zhu et al.
What Sketch Explainability Really Means for Downstream Tasks?
Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Ayan Kumar Bhunia et al.
Leveraging Frame Affinity for sRGB-to-RAW Video De-rendering
Chen Zhang, Wencheng Han, Yang Zhou et al.
Learning CNN on ViT: A Hybrid Model to Explicitly Class-specific Boundaries for Domain Adaptation
Ba Hung Ngo, Nhat-Tuong Do-Tran, Tuan-Ngoc Nguyen et al.
GoMVS: Geometrically Consistent Cost Aggregation for Multi-View Stereo
Jiang Wu, Rui Li, Haofei Xu et al.
From Correspondences to Pose: Non-minimal Certifiably Optimal Relative Pose without Disambiguation
Javier Tirado-Garín, Javier Civera
CommonCanvas: Open Diffusion Models Trained on Creative-Commons Images
Aaron Gokaslan, A. Feder Cooper, Jasmine Collins et al.
Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing
Boqiang Zhang, Hongtao Xie, Zuan Gao et al.
Memory-based Adapters for Online 3D Scene Perception
Xiuwei Xu, Chong Xia, Ziwei Wang et al.
Cross-spectral Gated-RGB Stereo Depth Estimation
Samuel Brucker, Stefanie Walz, Mario Bijelic et al.
EASE-DETR: Easing the Competition among Object Queries
Yulu Gao, Yifan Sun, Xudong Ding et al.
GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding
Zi-Ting Chou, Sheng-Yu Huang, I-Jieh Liu et al.
CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model
Jianhao Zeng, Dan Song, Weizhi Nie et al.
Readout Guidance: Learning Control from Diffusion Features
Grace Luo, Trevor Darrell, Oliver Wang et al.
Action Detection via an Image Diffusion Process
Lin Geng Foo, Tianjiao Li, Hossein Rahmani et al.
Transcriptomics-guided Slide Representation Learning in Computational Pathology
Guillaume Jaume, Lukas Oldenburg, Anurag Vaidya et al.
SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field
Lizhe Liu, Bohua Wang, Hongwei Xie et al.
MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning
Zhe Li, Laurence Yang, Bocheng Ren et al.
Active Open-Vocabulary Recognition: Let Intelligent Moving Mitigate CLIP Limitations
Lei Fan, Jianxiong Zhou, Xiaoying Xing et al.
DyBluRF: Dynamic Neural Radiance Fields from Blurry Monocular Video
Huiqiang Sun, Xingyi Li, Liao Shen et al.
SAOR: Single-View Articulated Object Reconstruction
Mehmet Aygun, Oisin Mac Aodha
GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos
Tomas Soucek, Dima Damen, Michael Wray et al.
Density-Adaptive Model Based on Motif Matrix for Multi-Agent Trajectory Prediction
Di Wen, Haoran Xu, Zhaocheng He et al.
Towards Accurate Post-training Quantization for Diffusion Models
Changyuan Wang, Ziwei Wang, Xiuwei Xu et al.
MoST: Multi-Modality Scene Tokenization for Motion Prediction
Norman Mu, Jingwei Ji, Zhenpei Yang et al.
Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-aware Spatio-Temporal Sampling
Xinhang Liu, Yu-Wing Tai, Chi-Keung Tang et al.
MultiDiff: Consistent Novel View Synthesis from a Single Image
Norman Müller, Katja Schwarz, Barbara Roessle et al.
Multi-Scale Video Anomaly Detection by Multi-Grained Spatio-Temporal Representation Learning
Menghao Zhang, Jingyu Wang, Qi Qi et al.
Uncertainty-aware Action Decoupling Transformer for Action Anticipation
Hongji Guo, Nakul Agarwal, Shao-Yuan Lo et al.
PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection
Kuan-Chih Huang, Weijie Lyu, Ming-Hsuan Yang et al.
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Haoxin Chen, Yong Zhang, Xiaodong Cun et al.
TextNeRF: A Novel Scene-Text Image Synthesis Method based on Neural Radiance Fields
Jialei Cui, Jianwei Du, Wenzhuo Liu et al.
An Asymmetric Augmented Self-Supervised Learning Method for Unsupervised Fine-Grained Image Hashing
Feiran Hu, Chenlin Zhang, Jiangliang Guo et al.
MimicDiffusion: Purifying Adversarial Perturbation via Mimicking Clean Diffusion Model
Kaiyu Song, Hanjiang Lai, Yan Pan et al.
Action Scene Graphs for Long-Form Understanding of Egocentric Videos
Ivan Rodin, Antonino Furnari, Kyle Min et al.
DiffusionTrack: Point Set Diffusion Model for Visual Object Tracking
Fei Xie, Zhongdao Wang, Chao Ma
EmoGen: Emotional Image Content Generation with Text-to-Image Diffusion Models
Jingyuan Yang, Jiawei Feng, Hui Huang
SpiderMatch: 3D Shape Matching with Global Optimality and Geometric Consistency
Paul Roetzer, Florian Bernard
Realigning Confidence with Temporal Saliency Information for Point-Level Weakly-Supervised Temporal Action Localization
Ziying Xia, Jian Cheng, Siyu Liu et al.
3D Facial Expressions through Analysis-by-Neural-Synthesis
George Retsinas, Panagiotis Filntisis, Radek Danecek et al.
Segment and Caption Anything
Xiaoke Huang, Jianfeng Wang, Yansong Tang et al.
Brush2Prompt: Contextual Prompt Generator for Object Inpainting
Mang Tik Chiu, Yuqian Zhou, Lingzhi Zhang et al.
G^3-LQ: Marrying Hyperbolic Alignment with Explicit Semantic-Geometric Modeling for 3D Visual Grounding
Yuan Wang, Yali Li, Shengjin Wang
Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images
Chaoqin Huang, Aofan Jiang, Jinghao Feng et al.
NightCC: Nighttime Color Constancy via Adaptive Channel Masking
Shuwei Li, Robby T. Tan
Sparse Views Near Light: A Practical Paradigm for Uncalibrated Point-light Photometric Stereo
Mohammed Brahimi, Bjoern Haefner, Zhenzhang Ye et al.
Total Selfie: Generating Full-Body Selfies
Bowei Chen, Brian Curless, Ira Kemelmacher-Shlizerman et al.
LayoutFormer: Hierarchical Text Detection Towards Scene Text Understanding
Min Liang, Jia-Wei Ma, Xiaobin Zhu et al.
On the Diversity and Realism of Distilled Dataset: An Efficient Dataset Distillation Paradigm
Peng Sun, Bei Shi, Daiwei Yu et al.
HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
Xian Liu, Xiaohang Zhan, Jiaxiang Tang et al.
Depth Prompting for Sensor-Agnostic Depth Estimation
Jin-Hwi Park, Chanhwi Jeong, Junoh Lee et al.
Modality-Collaborative Test-Time Adaptation for Action Recognition
Baochen Xiong, Xiaoshan Yang, Yaguang Song et al.
DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback
Yangyi Chen, Karan Sikka, Michael Cogswell et al.
Rethinking Inductive Biases for Surface Normal Estimation
Gwangbin Bae, Andrew J. Davison
Visual Layout Composer: Image-Vector Dual Diffusion Model for Design Layout Generation
Mohammad Amin Shabani, Zhaowen Wang, Difan Liu et al.
Pose-Guided Self-Training with Two-Stage Clustering for Unsupervised Landmark Discovery
Siddharth Tourani, Ahmed Alwheibi, Arif Mahmood et al.
OVMR: Open-Vocabulary Recognition with Multi-Modal References
Zehong Ma, Shiliang Zhang, Longhui Wei et al.
AETTA: Label-Free Accuracy Estimation for Test-Time Adaptation
Taeckyung Lee, Sorn Chottananurak, Taesik Gong et al.
A Simple Recipe for Language-guided Domain Generalized Segmentation
Mohammad Fahes, Tuan-Hung Vu, Andrei Bursuc et al.
An Edit Friendly DDPM Noise Space: Inversion and Manipulations
Inbar Huberman-Spiegelglas, Vladimir Kulikov, Tomer Michaeli
AdaShift: Learning Discriminative Self-Gated Neural Feature Activation With an Adaptive Shift Factor
Sudong Cai
PredToken: Predicting Unknown Tokens and Beyond with Coarse-to-Fine Iterative Decoding
Xuesong Nie, Haoyuan Jin, Yunfeng Yan et al.
Holistic Features are almost Sufficient for Text-to-Video Retrieval
Kaibin Tian, Ruixiang Zhao, Zijie Xin et al.
Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection
Taeheon Kim, Sebin Shin, Youngjoon Yu et al.
Seeing the Unseen: Visual Common Sense for Semantic Placement
Ram Ramrakhya, Aniruddha Kembhavi, Dhruv Batra et al.
Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion
Junjiao Tian, Lavisha Aggarwal, Andrea Colaco et al.
GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation
Mukul Khanna, Ram Ramrakhya, Gunjan Chhablani et al.
WonderJourney: Going from Anywhere to Everywhere
Hong-Xing Yu, Haoyi Duan, Junhwa Hur et al.
CLIP-Driven Open-Vocabulary 3D Scene Graph Generation via Cross-Modality Contrastive Learning
Lianggangxu Chen, Xuejiao Wang, Jiale Lu et al.
Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation
Luca Barsellotti, Roberto Amoroso, Marcella Cornia et al.
HRVDA: High-Resolution Visual Document Assistant
Chaohu Liu, Kun Yin, Haoyu Cao et al.
A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation
Qucheng Peng, Ce Zheng, Chen Chen
Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion Model
Runmin Dong, Shuai Yuan, Bin Luo et al.
Gaussian Shading: Provable Performance-Lossless Image Watermarking for Diffusion Models
Zijin Yang, Kai Zeng, Kejiang Chen et al.
Multimodal Sense-Informed Forecasting of 3D Human Motions
Zhenyu Lou, Qiongjie Cui, Haofan Wang et al.
Resolution Limit of Single-Photon LiDAR
Stanley H. Chan, Hashan K Weerasooriya, Weijian Zhang et al.
Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration
Mingyuan Meng, Dagan Feng, Lei Bi et al.
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
Kristen Grauman, Andrew Westbury, Lorenzo Torresani et al.
CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention
Mohammad Sadil Khan, Elona Dupont, Sk Aziz Ali et al.
LAA-Net: Localized Artifact Attention Network for Quality-Agnostic and Generalizable Deepfake Detection
Dat Nguyen, Nesryne Mejri, Inder Pal Singh et al.
The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing
Denis Bobkov, Vadim Titov, Aibek Alanov et al.
Adaptive Random Feature Regularization on Fine-tuning Deep Neural Networks
Shin'ya Yamaguchi, Sekitoshi Kanai et al.
Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement
Zaid Khan, Vijay Kumar BG, Samuel Schulter et al.
Generating Enhanced Negatives for Training Language-Based Object Detectors
Shiyu Zhao, Long Zhao, Vijay Kumar BG et al.
Joint-Task Regularization for Partially Labeled Multi-Task Learning
Kento Nishi, Junsik Kim, Wanhua Li et al.
MRFP: Learning Generalizable Semantic Segmentation from Sim-2-Real with Multi-Resolution Feature Perturbation
Sumanth Udupa, Prajwal Gurunath, Aniruddh Sikdar et al.
Object Recognition as Next Token Prediction
Kaiyu Yue, Bor-Chun Chen, Jonas Geiping et al.
MuGE: Multiple Granularity Edge Detection
Caixia Zhou, Yaping Huang, Mengyang Pu et al.
Local-consistent Transformation Learning for Rotation-invariant Point Cloud Analysis
Yiyang Chen, Lunhao Duan, Shanshan Zhao et al.
Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis
Zhan Li, Zhang Chen, Zhong Li et al.
LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning
Siyuan Cheng, Guanhong Tao, Yingqi Liu et al.
The More You See in 2D, the More You Perceive in 3D
Xinyang Han, Zelin Gao, Angjoo Kanazawa et al.
What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
Brian Chen, Nina Shvetsova, Andrew Rouditchenko et al.
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
Jack Urbanek, Florian Bordes, Pietro Astolfi et al.
Correlation-Decoupled Knowledge Distillation for Multimodal Sentiment Analysis with Incomplete Modalities
Mingcheng Li, Dingkang Yang, Xiao Zhao et al.
ES³: Evolving Self-Supervised Learning of Robust Audio-Visual Speech Representations
Yuanhang Zhang, Shuang Yang, Shiguang Shan et al.
Depth-aware Test-Time Training for Zero-shot Video Object Segmentation
Weihuang Liu, Xi Shen, Haolun Li et al.
MSU-4S - The Michigan State University Four Seasons Dataset
Daniel Kent, Mohammed Alyaqoub, Xiaohu Lu et al.
An Interactive Navigation Method with Effect-oriented Affordance
Xiaohan Wang, Yuehu Liu, Xinhang Song et al.
Rapid 3D Model Generation with Intuitive 3D Input
Tianrun Chen, Chaotao Ding, Shangzhan Zhang et al.
Unsupervised Salient Instance Detection
Xin Tian, Ke Xu, Rynson W.H. Lau
Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception
Junwen He, Yifan Wang, Lijun Wang et al.
CoDi-2: In-Context Interleaved and Interactive Any-to-Any Generation
Zineng Tang, Ziyi Yang, Mahmoud Khademi et al.
PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation
Yuqi Wang, Yuntao Chen, Xingyu Liao et al.
AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution
Cheeun Hong, Kyoung Mu Lee
MULDE: Multiscale Log-Density Estimation via Denoising Score Matching for Video Anomaly Detection
Jakub Micorek, Horst Possegger, Dominik Narnhofer et al.
Instance-level Expert Knowledge and Aggregate Discriminative Attention for Radiology Report Generation
Shenshen Bu, Taiji Li, Zhiming Dai et al.
HyperSDFusion: Bridging Hierarchical Structures in Language and Geometry for Enhanced 3D Text2Shape Generation
Zhiying Leng, Tolga Birdal, Xiaohui Liang et al.
Just Add π! Pose Induced Video Transformers for Understanding Activities of Daily Living
Dominick Reilly, Srijan Das
PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics
Tianyi Xie, Zeshun Zong, Yuxing Qiu et al.
Viewpoint-Aware Visual Grounding in 3D Scenes
Xiangxi Shi, Zhonghua Wu, Stefan Lee
Long-Tail Class Incremental Learning via Independent Sub-prototype Construction
Xi Wang, Xu Yang, Jie Yin et al.
An Aggregation-Free Federated Learning for Tackling Data Heterogeneity
Yuan Wang, Huazhu Fu, Renuga Kanagavelu et al.
Infrared Adversarial Car Stickers
Xiaopei Zhu, Yuqiu Liu, Zhanhao Hu et al.
XFibrosis: Explicit Vessel-Fiber Modeling for Fibrosis Staging from Liver Pathology Images
Chong Yin, Siqi Liu, Fei Lyu et al.
Advancing Saliency Ranking with Human Fixations: Dataset, Models and Benchmarks
Bowen Deng, Siyang Song, Andrew French et al.
Implicit Event-RGBD Neural SLAM
Delin Qu, Chi Yan, Dong Wang et al.
Retraining-Free Model Quantization via One-Shot Weight-Coupling Learning
Chen Tang, Yuan Meng, Jiacheng Jiang et al.
DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior
Tianyu Huang, Yihan Zeng, Zhilu Zhang et al.
From Coarse to Fine-Grained Open-Set Recognition
Nico Lang, Vésteinn Snæbjarnarson, Elijah Cole et al.
Exploiting Inter-sample and Inter-feature Relations in Dataset Distillation
Wenxiao Deng, Wenbin Li, Tianyu Ding et al.
Discriminative Pattern Calibration Mechanism for Source-Free Domain Adaptation
Haifeng Xia, Siyu Xia, Zhengming Ding
RAM-Avatar: Real-time Photo-Realistic Avatar from Monocular Videos with Full-body Control
Xiang Deng, Zerong Zheng, Yuxiang Zhang et al.
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
Zhen Li, Mingdeng Cao, Xintao Wang et al.
Privacy-Preserving Face Recognition Using Trainable Feature Subtraction
Yuxi Mi, Zhizhou Zhong, Yuge Huang et al.
Arbitrary Motion Style Transfer with Multi-condition Motion Latent Diffusion Model
Wenfeng Song, Xingliang Jin, Shuai Li et al.
3DSFLabelling: Boosting 3D Scene Flow Estimation by Pseudo Auto-labelling
Chaokang Jiang, Guangming Wang, Jiuming Liu et al.
CPR-Coach: Recognizing Composite Error Actions based on Single-class Training
Shunli Wang, Shuaibing Wang, Dingkang Yang et al.
Restoration by Generation with Constrained Priors
Zheng Ding, Xuaner Zhang, Zhuowen Tu et al.
Unified Entropy Optimization for Open-Set Test-Time Adaptation
Zhengqing Gao, Xu-Yao Zhang, Cheng-Lin Liu
Poly Kernel Inception Network for Remote Sensing Detection
Xinhao Cai, Qiuxia Lai, Yuwei Wang et al.
Distraction is All You Need: Memory-Efficient Image Immunization against Diffusion-Based Image Editing
Ling Lo, Cheng Yeo, Hong-Han Shuai et al.
Parameter Efficient Fine-tuning via Cross Block Orchestration for Segment Anything Model
Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.
MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction
Xiaolu Liu, Song Wang, Wentong Li et al.
ViT-Lens: Towards Omni-modal Representations
Stan Weixian Lei, Yixiao Ge, Kun Yi et al.
Prompt-Driven Referring Image Segmentation with Instance Contrasting
Chao Shang, Zichen Song, Heqian Qiu et al.
CosmicMan: A Text-to-Image Foundation Model for Humans
Shikai Li, Jianglin Fu, Kaiyuan Liu et al.
MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark
Sanghyun Woo, Kwanyong Park, Inkyu Shin et al.
Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering
Zhaohe Liao, Jiangtong Li, Li Niu et al.
Overload: Latency Attacks on Object Detection for Edge Devices
Erh-Chung Chen, Pin-Yu Chen, I-Hsin Chung et al.
Neural Exposure Fusion for High-Dynamic Range Object Detection
Emmanuel Onzon, Maximilian Bömer, Fahim Mannan et al.
Semantics, Distortion, and Style Matter: Towards Source-free UDA for Panoramic Segmentation
Xu Zheng, Pengyuan Zhou, Athanasios et al.