Most Cited 2024 "ventral stream selectivity" Papers
12,324 papers found • Page 41 of 62
Conference
Restoring Images in Adverse Weather Conditions via Histogram Transformer
Shangquan Sun, Wenqi Ren, Xinwei Gao et al.
G2fR: Frequency Regularization in Grid-based Feature Encoding Neural Radiance Fields
Shuxiang Xie, Shuyi Zhou, Ken Sakurada et al.
The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation
Yi Yao, Chan-Feng Hsu, Jhe-Hao Lin et al.
Eliminating Feature Ambiguity for Few-Shot Segmentation
Qianxiong Xu, Guosheng Lin, Chen Change Loy et al.
GENIXER: Empowering Multimodal Large Language Models as a Powerful Data Generator
Hengyuan Zhao, Pan Zhou, Mike Zheng Shou
PreLAR: World Model Pre-training with Learnable Action Representation
Lixuan Zhang, Meina Kan, Shiguang Shan et al.
FreestyleRet: Retrieving Images from Style-Diversified Queries
Hao Li, Yanhao Jia, Peng Jin et al.
Elevating All Zero-Shot Sketch-Based Image Retrieval Through Multimodal Prompt Learning
Mainak Singha, Ankit Jha, Divyam Gupta et al.
3DFG-PIFu: 3D Feature Grids for Human Digitization from Sparse Views
Kennard Yanting Chan, Fayao Liu, Guosheng Lin et al.
SIGMA: Sinkhorn-Guided Masked Video Modeling
Mohammadreza Salehi, Michael Dorkenwald, Fida Mohammad Thoker et al.
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis
Basile Van Hoorick, Rundi Wu, Ege Ozguroglu et al.
Distribution Alignment for Fully Test-Time Adaptation with Dynamic Online Data Streams
Ziqiang Wang, Zhixiang Chi, Yanan Wu et al.
SemTrack: A Large-scale Dataset for Semantic Tracking in the Wild
Pengfei Wang, Xiaofei Hui, Jing Wu et al.
Text to Layer-wise 3D Clothed Human Generation
Junting Dong, Qi Fang, Zehuan Huang et al.
Fully Sparse 3D Occupancy Prediction
Haisong Liu, Yang Chen, Haiguang Wang et al.
High-Fidelity 3D Textured Shapes Generation by Sparse Encoding and Adversarial Decoding
Qi Zuo, Xiaodong Gu, Yuan Dong et al.
Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation
Mengchen Zhang, Tong Wu, Tai Wang et al.
Forest2Seq: Revitalizing Order Prior for Sequential Indoor Scene Synthesis
Qi Sun, Hang Zhou, Wengang Zhou et al.
Learning Exhaustive Correlation for Spectral Super-Resolution: Where Spatial-Spectral Attention Meets Linear Dependence
Hongyuan Wang, Lizhi Wang, Jiang Xu et al.
SUMix: Mixup with Semantic and Uncertain Information
Huafeng Qin, Xin Jin, Hongyu Zhu et al.
EAFormer: Scene Text Segmentation with Edge-Aware Transformers
Haiyang Yu, Teng Fu, Bin Li et al.
Zero-Shot Detection of AI-Generated Images
Davide Cozzolino, GIovanni Poggi, Matthias Niessner et al.
TCC-Det: Temporarily consistent cues for weakly-supervised 3D detection
Jan Skvrna, Lukas Neumann
Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis
Yuanhao Cai, Yixun Liang, Jiahao Wang et al.
Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models
Nishad Singhi, Jae Myung Kim, Karsten Roth et al.
Gaze Target Detection Based on Head-Local-Global Coordination
Yaokun Yang, Feng Lu
3DSA:Multi-View 3D Human Pose Estimation With 3D Space Attention Mechanisms
Po Han Chen, Chia-Chi Tsai
An Economic Framework for 6-DoF Grasp Detection
Xiao-Ming Wu, Jia-Feng Cai, Jian-Jian Jiang et al.
CLIP-Guided Generative Networks for Transferable Targeted Adversarial Attacks
Hao Fang, Jiawei Kong, Bin Chen et al.
Progressive Classifier and Feature Extractor Adaptation for Unsupervised Domain Adaptation on Point Clouds
Zicheng Wang, Zhen Zhao, Yiming Wu et al.
RISurConv: Rotation Invariant Surface Attention-Augmented Convolutions for 3D Point Cloud Classification and Segmentation
Zhiyuan Zhang, Licheng Yang, Zhiyu Xiang
Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation
Zeyang Zhao, Qilong Xue, Yifan Bai et al.
SparseSSP: 3D Subcellular Structure Prediction from Sparse-View Transmitted Light Images
Jintu Zheng, Yi Ding, Qizhe Liu et al.
MemBN: Robust Test-Time Adaptation via Batch Norm with Statistics Memory
Juwon Kang, Nayeong Kim, Jungseul Ok et al.
Synchronous Diffusion for Unsupervised Smooth Non-Rigid 3D Shape Matching
Dongliang Cao, Zorah Laehner, Florian Bernard
EgoPoser: Robust Real-Time Egocentric Pose Estimation from Sparse and Intermittent Observations Everywhere
Jiaxi Jiang, Paul Streli, Manuel Meier et al.
Decoupling Common and Unique Representations for Multimodal Self-supervised Learning
Yi Wang, Conrad M Albrecht, Nassim Ait Ali Braham et al.
D-SCo: Dual-Stream Conditional Diffusion for Monocular Hand-Held Object Reconstruction
Bowen Fu, Gu Wang, Chenyangguang Zhang et al.
Linearly Controllable GAN: Unsupervised Feature Categorization and Decomposition for Image Generation and Manipulation
Sehyung Lee, Mijung Kim, Yeongnam Chae et al.
CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians
Avinash Paliwal, Wei Ye, Jinhui Xiong et al.
Unleashing the Power of Prompt-driven Nucleus Instance Segmentation
Zhongyi Shui, Yunlong Zhang, Kai Yao et al.
3DEgo: 3D Editing on the Go!
Umar Khalid, Hasan Iqbal, Azib Farooq et al.
Domain-adaptive Video Deblurring via Test-time Blurring
Jin-Ting He, Fu-Jen Tsai, Jia-Hao Wu et al.
DreamMotion: Space-Time Self-Similar Score Distillation for Zero-Shot Video Editing
Hyeonho Jeong, Jinho Chang, GEON YEONG PARK et al.
Cross-Domain Learning for Video Anomaly Detection with Limited Supervision
Yashika Jain, Ali Dabouei, Min Xu
OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models
Kong Zhe, Yong Zhang, Tianyu Yang et al.
Enhancing Optimization Robustness in 1-bit Neural Networks through Stochastic Sign Descent
NianHui Guo, Hong Guo, Christoph Meinel et al.
BeNeRF:Neural Radiance Fields from a Single Blurry Image and Event Stream
Wenpu Li, Pian Wan, Peng Wang et al.
SCP-Diff: Spatial-Categorical Joint Prior for Diffusion Based Semantic Image Synthesis
Huan-ang Gao, Mingju Gao, Jiaju Li et al.
PoseAugment: Generative Human Pose Data Augmentation with Physical Plausibility for IMU-based Motion Capture
Zhuojun Li, Chun Yu, Chen Liang et al.
GiT: Towards Generalist Vision Transformer through Universal Language Interface
Haiyang Wang, Hao Tang, Li Jiang et al.
Improving Unsupervised Domain Adaptation: A Pseudo-Candidate Set Approach
Aveen Dayal, Rishabh Lalla, Linga Reddy Cenkeramaddi et al.
Integer-Valued Training and Spike-driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection
Xinhao Luo, Man Yao, Yuhong Chou et al.
S-JEPA: A Joint Embedding Predictive Architecture for Skeletal Action Recognition
Mohamed Abdelfattah, Alexandre ALahi
ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation
Yi Zhang, Yun Tang, Wenjie Ruan et al.
Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos
Akshay Paruchuri, Samuel Ehrenstein, Shuxian Wang et al.
6DoF Head Pose Estimation through Explicit Bidirectional Interaction with Face Geometry
Sungho Chun, Ju Yong Chang
Masked Angle-Aware Autoencoder for Remote Sensing Images
Zhihao Li, Biao Hou, Siteng Ma et al.
Multi-modal Relation Distillation for Unified 3D Representation Learning
Huiqun Wang, Yiping Bao, Panwang Pan et al.
Diff3DETR: Agent-based Diffusion Model for Semi-supervised 3D Object Detection
Jiacheng Deng, Jiahao Lu, Tianzhu Zhang
Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels
Rui Huang, Songyou Peng, Ayca Takmaz et al.
Efficient Training of Spiking Neural Networks with Multi-Parallel Implicit Stream Architecture
Zhigao Cao, Meng Li, Xiashuang Wang et al.
Deep Patch Visual SLAM
Lahav Lipson, Zachary Teed, Jia Deng
LiteSAM is Actually what you Need for segment Everything
Jianhai Fu, Yuanjie Yu, Ningchuan Li et al.
Visual Prompting via Partial Optimal Transport
MENGYU ZHENG, Zhiwei Hao, Yehui Tang et al.
AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection
Yunkang Cao, Jiangning Zhang, Luca Frittoli et al.
Pathformer3D: A 3D Scanpath Transformer for 360° Images
Rong Quan, yantao Lai, Mengyu Qiu et al.
Asymmetric Mask Scheme for Self-Supervised Real Image Denoising
Xiangyu Liao, Tianheng Zheng, Jiayu Zhong et al.
FlexAttention for Efficient High-Resolution Vision-Language Models
Junyan Li, Delin Chen, Tianle Cai et al.
EGIC: Enhanced Low-Bit-Rate Generative Image Compression Guided by Semantic Segmentation
Nikolai Körber, Eduard Kromer, Andreas Siebert et al.
Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection
Xingyu Peng, Yan Bai, Chen Gao et al.
E3V-K5: An Authentic Benchmark for Redefining Video-Based Energy Expenditure Estimation
Shengxuming Zhang, Lei Jin, Yifan Wang et al.
Robust Incremental Structure-from-Motion with Hybrid Features
Shaohui Liu, Yidan Gao, Tianyi Zhang et al.
Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models
Longxiang Tang, Zhuotao Tian, Kai Li et al.
Trajectory-aligned Space-time Tokens for Few-shot Action Recognition
Pulkit Kumar, Namitha Padmanabhan, Luke Luo et al.
U-COPE: Taking a Further Step to Universal 9D Category-level Object Pose Estimation
li zhang, Weiqing Meng, Yan Zhong et al.
Neural graphics texture compression supporting random access
Farzad Farhadzadeh, Qiqi Hou, Hoang Le et al.
ControlCap: Controllable Region-level Captioning
Yuzhong Zhao, Liu Yue, Zonghao Guo et al.
Watch Your Steps: Local Image and Scene Editing by Text Instructions
Ashkan Mirzaei, Tristan T Aumentado-Armstrong, Marcus A Brubaker et al.
A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties
Junfei Xiao, Ziqi Zhou, Wenxuan Li et al.
Fast View Synthesis of Casual Videos with Soup-of-Planes
Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen et al.
Confidence Self-Calibration for Multi-Label Class-Incremental Learning
Kaile Du, Yifan Zhou, Fan Lyu et al.
SlotLifter: Slot-guided Feature Lifting for Learning Object-Centric Radiance Fields
Yu Liu, Baoxiong Jia, Yixin Chen et al.
Towards High-Quality 3D Motion Transfer with Realistic Apparel Animation
Rong Wang, Wei Mao, Changsheng Lu et al.
AnyHome: Open-Vocabulary Large-Scale Indoor Scene Generation with First-Person View Exploration
Rao Fu, Zehao Wen, Zichen Liu et al.
iMatching: Imperative Correspondence Learning
Chen Wang, Dasong Gao, Yun-Jou Lin et al.
Appearance-based Refinement for Object-Centric Motion Segmentation
Junyu Xie, Weidi Xie, Andrew ZISSERMAN
MTKD: Multi-Teacher Knowledge Distillation for Image Super-Resolution
Yuxuan Jiang, Chen Feng, Fan Zhang et al.
DEAL: Disentangle and Localize Concept-level Explanations for VLMs
Tang Li, Mengmeng Ma, Xi Peng
ReMatching: Low-Resolution Representations for Scalable Shape Correspondence
Filippo Maggioli, Daniele Baieri, Emanuele Rodola et al.
Global Structure-from-Motion Revisited
Linfei Pan, Daniel Barath, Marc Pollefeys et al.
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
KUNPENG SONG, Yizhe Zhu, Bingchen Liu et al.
HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts
Wonjae Kim, Sanghyuk Chun, Taekyung Kim et al.
Receler: Reliable Concept Erasing of Text-to-Image Diffusion Models via Lightweight Erasers
Chi-Pin Huang, Kai-Po Chang, Chung-Ting Tsai et al.
Bridging the Gap: Studio-like Avatar Creation from a Monocular Phone Capture
ShahRukh Athar, Shunsuke Saito, Stanislav Pidhorskyi et al.
Expressive Whole-Body 3D Gaussian Avatar
Gyeongsik Moon, Takaaki Shiratori, Shunsuke Saito
Strike a Balance in Continual Panoptic Segmentation
Jinpeng Chen, Runmin Cong, Yuxuan Luo et al.
TrajPrompt: Aligning Color Trajectory with Vision-Language Representations
Li-Wu Tsao, Hao-Tang Tsui, Yu-Rou Tuan et al.
Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects
Zicong Fan, Takehiko Ohkawa, Linlin Yang et al.
Tree-D Fusion: Simulation-Ready Tree Dataset from Single Images with Diffusion Priors
Jae Joong Lee, Bosheng Li, Sara Beery et al.
DomainFusion: Generalizing To Unseen Domains with Latent Diffusion Models
Yuyang Huang, Yabo Chen, Yuchen Liu et al.
Unsqueeze [CLS] Bottleneck to Learn Rich Representations
Qing Su, Shihao Ji
Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models
Yasi Zhang, Peiyu Yu, Ying Nian Wu
Three Things We Need to Know About Transferring Stable Diffusion to Visual Dense Prediciton Tasks
Manyuan Zhang, Guanglu Song, Xiaoyu Shi et al.
A Direct Approach to Viewing Graph Solvability
Federica Arrigoni, Andrea Fusiello, Tomas Pajdla
Parrot Captions Teach CLIP to Spot Text
Yiqi Lin, Conghui He, Alex Jinpeng Wang et al.
Versatile Incremental Learning: Towards Class and Domain-Agnostic Incremental Learning
Minyeong Park, Jae-Ho Lee, Gyeong-Moon Park
Solving Motion Planning Tasks with a Scalable Generative Model
Yihan Hu, Siqi Chai, Zhening Yang et al.
PLOT: Text-based Person Search with Part Slot Attention for Corresponding Part Discovery
Jicheol Park, Dongwon Kim, Boseung Jeong et al.
Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks
Hunmin Yang, Jongoh Jeong, Kuk-Jin Yoon
Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning
Yunbin Tu, Liang Li, Li Su et al.
Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation
Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang et al.
Physically Plausible Color Correction for Neural Radiance Fields
Qi Zhang, Ying Feng, HONGDONG LI
LLM as Copilot for Coarse-grained Vision-and-Language Navigation
Yanyuan Qiao, Qianyi Liu, Jiajun Liu et al.
Motion-Guided Latent Diffusion for Temporally Consistent Real-world Video Super-resolution
Xi Yang, Chenhang He, Jianqi Ma et al.
MAD-DR: Map Compression for Visual Localization with Matchness Aware Descriptor Dimension Reduction
Qiang Wang
SAFARI: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation
Sayan Nag, Koustava Goswami, Srikrishna Karanam
BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering
Xinmin Qiu, Congying Han, Zicheng Zhang et al.
A Closer Look at GAN Priors: Exploiting Intermediate Features for Enhanced Model Inversion Attacks
Yixiang Qiu, Hao Fang, Hongyao Yu et al.
Relightable 3D Gaussians: Realistic Point Cloud Relighting with BRDF Decomposition and Ray Tracing
Jian Gao, chun gu, Youtian Lin et al.
Text2Place: Affordance-aware Text Guided Human Placement
Rishubh Parihar, Harsh Gupta, Sachidanand VS et al.
TAPTR: Tracking Any Point with Transformers as Detection
Hongyang Li, Hao Zhang, Shilong Liu et al.
Textual Grounding for Open-vocabulary Visual Information Extraction in Layout-diversified Documents
MENGJUN CHENG, Chengquan Zhang, Chang Liu et al.
Textual Knowledge Matters: Cross-Modality Co-Teaching for Generalized Visual Class Discovery
Haiyang Zheng, Pu Nan, Wenjing Li et al.
Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition
Masashi Hatano, Ryo Hachiuma, Ryo Fujii et al.
AdvDiff: Generating Unrestricted Adversarial Examples using Diffusion Models
Xuelong Dai, Kaisheng Liang, Bin Xiao
ST-LDM: A Universal Framework for Text-Grounded Object Generation in Real Images
Xiangtian Xue, Jiasong Wu, Youyong Kong et al.
Category Adaptation Meets Projected Distillation in Generalized Continual Category Discovery
Grzegorz Rypesc, Daniel Marczak, Sebastian Cygert et al.
Uncertainty Calibration with Energy Based Instance-wise Scaling in the Wild Dataset
Mijoo Kim, Junseok Kwon
OLAF: A Plug-and-Play Framework for Enhanced Multi-object Multi-part Scene Parsing
Pranav Gupta, Rishubh Singh, Pradeep Shenoy et al.
DiffPMAE: Diffusion Masked Autoencoders for Point Cloud Reconstruction
YANLONG LI, Chamara Madarasingha, Kanchana Thilakarathna
Motion Aware Event Representation-driven Image Deblurring
Zhijing Sun, Xueyang Fu, Longzhuo Huang et al.
GroupDiff: Diffusion-based Group Portrait Editing
Yuming Jiang, Nanxuan Zhao, Qing Liu et al.
Privacy-Preserving Adaptive Re-Identification without Image Transfer
Hamza Rami, Jhony H. Giraldo, Nicolas Winckler et al.
TexDreamer: Towards Zero-Shot High-Fidelity 3D Human Texture Generation
Yufei Liu, Junwei Zhu, Junshu Tang et al.
Geospecific View Generation - Geometry-Context Aware High-resolution Ground View Inference from Satellite Views
Ningli Xu, Rongjun Qin
Co-Student: Collaborating Strong and Weak Students for Sparsely Annotated Object Detection
Lianjun Wu, Jiangxiao Han, Zengqiang Zheng et al.
Revisiting Feature Disentanglement Strategy in Diffusion Training and Breaking Conditional Independence Assumption in Sampling
Wonwoong Cho, Hareesh Ravi, Midhun Harikumar et al.
ProMerge: Prompt and Merge for Unsupervised Instance Segmentation
Dylan Li, Gyungin Shin
Open-Vocabulary Camouflaged Object Segmentation
Youwei Pang, Xiaoqi Zhao, JiaMing Zuo et al.
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval
Han Zhou, Wei Dong, Xiaohong Liu et al.
Fully Authentic Visual Question Answering Dataset from Online Communities
Chongyan Chen, Mengchen Liu, Noel C Codella et al.
Panel-Specific Degradation Representation for Raw Under-Display Camera Image Restoration
Youngjin Oh, Keuntek Lee, Jooyoung Lee et al.
Diffusion-Guided Weakly Supervised Semantic Segmentation
Sung-Hoon Yoon, Hoyong Kwon, Jaeseok Jeong et al.
Online Vectorized HD Map Construction using Geometry
Zhixin Zhang, Yiyuan Zhang, Xiaohan Ding et al.
Click-Gaussian: Interactive Segmentation to Any 3D Gaussians
Seokhun Choi, Hyeonseop Song, Jaechul Kim et al.
MANIKIN: Biomechanically Accurate Neural Inverse Kinematics for Human Motion Estimation
Jiaxi Jiang, Paul Streli, Xuejing Luo et al.
Disentangled Generation and Aggregation for Robust Radiance Fields
Shihe Shen, Huachen Gao, Wangze Xu et al.
Online Temporal Action Localization with Memory-Augmented Transformer
Youngkil Song, Dongkeun Kim, Minsu Cho et al.
JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation
ChenHan Jiang, Yihan Zeng, Tianyang Hu et al.
FLAT: Flux-aware Imperceptible Adversarial Attacks on 3D Point Clouds
Keke Tang, Lujie Huang, Weilong Peng et al.
Multi-branch Collaborative Learning Network for 3D Visual Grounding
Zhipeng Qian, Yiwei Ma, Zhekai Lin et al.
Progressive Proxy Anchor Propagation for Unsupervised Semantic Segmentation
Hyun Seok Seong, WonJun Moon, SuBeen Lee et al.
Revisit Human-Scene Interaction via Space Occupancy
Xinpeng Liu, Haowen Hou, Yanchao Yang et al.
Towards Stable 3D Object Detection
Jiabao Wang, Qiang Meng, Guochao Liu et al.
FYI: Flip Your Images for Dataset Distillation
Byunggwan Son, Youngmin Oh, Donghyeon Baek et al.
Attention Decomposition for Cross-Domain Semantic Segmentation
Liqiang He, Sinisa Todorovic
Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights
Shunqi Mao, Chaoyi Zhang, Hang Su et al.
Diffusion Models as Optimizers for Efficient Planning in Offline RL
Renming Huang, Yunqiang Pei, Guoqing Wang et al.
HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models
Shen Zhang, Zhaowei CHEN, Zhenyu Zhao et al.
Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning
Meixuan Li, Tianyu Li, Guoqing Wang et al.
MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation
Yuxiang WEI, Zhilong Ji, Jinfeng Bai et al.
PointRegGPT: Boosting 3D Point Cloud Registration using Generative Point-Cloud Pairs for Training
SUYI CHEN, Hao Xu, Haipeng Li et al.
DIFFender: Diffusion-Based Adversarial Defense against Patch Attacks
Caixin Kang, Yinpeng Dong, Zhengyi Wang et al.
DriveLM: Driving with Graph Visual Question Answering
Chonghao Sima, Katrin Renz, Kashyap Chitta et al.
Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge
Haibo Wang, Weifeng Ge
CatchBackdoor: Backdoor Detection via Critical Trojan Neural Path Fuzzing
Haibo Jin, Ruoxi Chen, Jinyin Chen et al.
HARIVO: Harnessing Text-to-Image Models for Video Generation
Mingi Kwon, Seoung Wug Oh, Yang Zhou et al.
WRIM-Net: Wide-Ranging Information Mining Network for Visible-Infrared Person Re-Identification
Yonggan Wu, Ling-Chao Meng, Yuan Zichao et al.
Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models
Chao Gong, Kai Chen, Zhipeng Wei et al.
Length-Aware Motion Synthesis via Latent Diffusion
Alessio Sampieri, Alessio Palma, Indro Spinelli et al.
Improving image synthesis with diffusion-negative sampling
Alakh Desai, Nuno Vasconcelos
SignGen: End-to-End Sign Language Video Generation with Latent Diffusion
Fan Qi, Yu Duan, Changsheng Xu et al.
Idling Neurons, Appropriately Lenient Workload During Fine-tuning Leads to Better Generalization
Hongjing Niu, Hanting Li, Bin Li et al.
GRA: Detecting Oriented Objects through Group-wise Rotating and Attention
Jiangshan Wang, Yifan Pu, Yizeng Han et al.
Track Everything Everywhere Fast and Robustly
Yunzhou Song, Jiahui Lei, Ziyun Wang et al.
Label-free Neural Semantic Image Synthesis
Jiayi Wang, Kevin Alexander Laube, Yumeng Li et al.
Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning
Fanyue Wei, Wei Zeng, Zhenyang Li et al.
HiEI: A Universal Framework for Generating High-quality Emerging Images from Natural Images
Jingmeng Li, Lukang Fu, Surun Yang et al.
Nonverbal Interaction Detection
Jianan Wei, Tianfei Zhou, Yi Yang et al.
The Sky's the Limit: Relightable Outdoor Scenes via a Sky-pixel Constrained Illumination Prior and Outside-In Visibility
James Gardner, Evgenii Kashin, Bernhard Egger et al.
Neural Spectral Decomposition for Dataset Distillation
Yang Shaolei, Shen Cheng, Mingbo Hong et al.
Causality-inspired Discriminative Feature Learning in Triple Domains for Gait Recognition
Haijun Xiong, Bin Feng, Xinggang Wang et al.
Understanding and Mitigating Human-Labelling Errors in Supervised Contrastive Learning
Zijun Long, Lipeng Zhuang, George W Killick et al.
Mew: Multiplexed Immunofluorescence Image Analysis through an Efficient Multiplex Network
Sukwon Yun, Jie Peng, Alexandro E Trevino et al.
HERGen: Elevating Radiology Report Generation with Longitudinal Data
Fuying Wang, Shenghui Du, Lequan Yu
Labeled Data Selection for Category Discovery
Bingchen Zhao, Nico Lang, Serge Belongie et al.
Hierarchical Unsupervised Relation Distillation for Source Free Domain Adaptation
Bowei Xing, Xianghua Ying, Ruibin Wang et al.
GMT: Enhancing Generalizable Neural Rendering via Geometry-Driven Multi-Reference Texture Transfer
Youngho Yoon, Hyun-Kurl Jang, Kuk-Jin Yoon
SNeRV: Spectra-preserving Neural Representation for Video
Jina Kim, Jihoo Lee, Jewon Kang
WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation Models
xinjian wu, Ruisong Zhang, Jie Qin et al.
Analysis-by-Synthesis Transformer for Single-View 3D Reconstruction
Dian Jia, Xiaoqian Ruan, Kun Xia et al.
DMiT: Deformable Mipmapped Tri-Plane Representation for Dynamic Scenes
Jing-Wen Yang, Jia-Mu Sun, Yong-Liang Yang et al.
Bayesian Self-Training for Semi-Supervised 3D Segmentation
Ozan Unal, Christos Sakaridis, Luc Van Gool
Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling
Jaehyeok Kim, Dongyoon Wee, Dan Xu
Tiny Models are the Computational Saver for Large Models
Qingyuan Wang, Barry Cardiff, Antoine Frappé et al.
AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization
Shixiong Xu, Chenghao Zhang, Lubin Fan et al.
Unveiling Privacy Risks in Stochastic Neural Networks Training: Effective Image Reconstruction from Gradients
Yiming Chen, Xiangyu Yang, Nikos Deligiannis
Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360°
Yuxiao He, Yiyu Zhuang, Yanwen Wang et al.
KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding
Zhihao Xu, Shengjie Gong, Jiapeng Tang et al.