Most Cited 2024 "oversensitivity benchmark" Papers
APISR: Anime Production Inspired Real-World Anime Super-Resolution
Boyang Wang, Fengyu Yang, Xihang Yu et al.
Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions
Weizhen He, Yiheng Deng, Shixiang Tang et al.
Device-Wise Federated Network Pruning
Shangqian Gao, Junyi Li, Zeyu Zhang et al.
SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers
Ioannis Kakogeorgiou, Spyros Gidaris, Konstantinos Karantzalos et al.
MRC-Net: 6-DoF Pose Estimation with MultiScale Residual Correlation
Yuelong Li, Yafei Mao, Raja Bala et al.
Progress-Aware Online Action Segmentation for Egocentric Procedural Task Videos
Yuhan Shen, Ehsan Elhamifar
MOHO: Learning Single-view Hand-held Object Reconstruction with Multi-view Occlusion-Aware Supervision
Chenyangguang Zhang, Guanlong Jiao, Yan Di et al.
Faces that Speak: Jointly Synthesising Talking Face and Speech from Text
Youngjoon Jang, Jihoon Kim, Junseok Ahn et al.
Learning to Segment Referred Objects from Narrated Egocentric Videos
Yuhan Shen, Huiyu Wang, Xitong Yang et al.
EGTR: Extracting Graph from Transformer for Scene Graph Generation
Jinbae Im, JeongYeon Nam, Nokyung Park et al.
Distributionally Generative Augmentation for Fair Facial Attribute Classification
Fengda Zhang, Qianpei He, Kun Kuang et al.
PikeLPN: Mitigating Overlooked Inefficiencies of Low-Precision Neural Networks
Marina Neseem, Conor McCullough, Randy Hsin et al.
THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models
Prannay Kaul, Zhizhong Li, Hao Yang et al.
UniVS: Unified and Universal Video Segmentation with Prompts as Queries
Minghan Li, Shuai Li, Xindong Zhang et al.
Inlier Confidence Calibration for Point Cloud Registration
Yongzhe Yuan, Yue Wu, Xiaolong Fan et al.
CLIP-BEVFormer: Enhancing Multi-View Image-Based BEV Detector with Ground Truth Flow
Chenbin Pan, Burhan Yaman, Senem Velipasalar et al.
ADFactory: An Effective Framework for Generalizing Optical Flow with NeRF
Han Ling, Quansen Sun, Yinghui Sun et al.
3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions
Weijia Li, Haote Yang, Zhenghao Hu et al.
In Search of a Data Transformation That Accelerates Neural Field Training
Junwon Seo, Sangyoon Lee, Kwang In Kim et al.
FastMAC: Stochastic Spectral Sampling of Correspondence Graph
Yifei Zhang, Hao Zhao, Hongyang Li et al.
PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution
Honghao Chen, Xiangxiang Chu, Yongjian Ren et al.
Towards Generalizing to Unseen Domains with Few Labels
Chamuditha Jayanga Galappaththige, Sanoojan Baliah, Malitha Gunawardhana et al.
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts
Jialin Wu, Xia Hu, Yaqing Wang et al.
Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps
Octave Mariotti, Oisin Mac Aodha, Hakan Bilen
Learning Degradation-Independent Representations for Camera ISP Pipelines
Yanhui Guo, Fangzhou Luo, Xiaolin Wu
A Subspace-Constrained Tyler's Estimator and its Applications to Structure from Motion
Feng Yu, Teng Zhang, Gilad Lerman
Low-Resource Vision Challenges for Foundation Models
Yunhua Zhang, Hazel Doughty, Cees G. M. Snoek
Low-Latency Neural Stereo Streaming
Qiqi Hou, Farzad Farhadzadeh, Amir Said et al.
Your Transferability Barrier is Fragile: Free-Lunch for Transferring the Non-Transferable Learning
Ziming Hong, Li Shen, Tongliang Liu
ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe
Yifan Bai, Zeyang Zhao, Yihong Gong et al.
DPHMs: Diffusion Parametric Head Models for Depth-based Tracking
Jiapeng Tang, Angela Dai, Yinyu Nie et al.
MaxQ: Multi-Axis Query for N:M Sparsity Network
Jingyang Xiang, Siqi Li, Junhao Chen et al.
The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes
Myeongseob Ko, Feiyang Kang, Weiyan Shi et al.
Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models
Haoning Wu, Zicheng Zhang, Erli Zhang et al.
Efficient Scene Recovery Using Luminous Flux Prior
ZhongYu Li, Lei Zhang
Structured Gradient-based Interpretations via Norm-Regularized Adversarial Training
Shizhan Gong, Qi Dou, Farzan Farnia
Revisiting Global Translation Estimation with Feature Tracks
Peilin Tao, Hainan Cui, Mengqi Rong et al.
Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection
Huan Liu, Zichang Tan, Chuangchuang Tan et al.
MeaCap: Memory-Augmented Zero-shot Image Captioning
Zequn Zeng, Yan Xie, Hao Zhang et al.
MuseChat: A Conversational Music Recommendation System for Videos
Zhikang Dong, Bin Chen, Xiulong Liu et al.
Novel View Synthesis with View-Dependent Effects from a Single Image
Juan Luis Gonzalez Bello, Munchurl Kim
Orchestrate Latent Expertise: Advancing Online Continual Learning with Multi-Level Supervision and Reverse Self-Distillation
Hongwei Yan, Liyuan Wang, Kaisheng Ma et al.
DisCo: Disentangled Control for Realistic Human Dance Generation
Tan Wang, Linjie Li, Kevin Lin et al.
Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing
Xun Lin, Shuai Wang, Rizhao Cai et al.
Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation
Qinghe Ma, Jian Zhang, Lei Qi et al.
LAMP: Learn A Motion Pattern for Few-Shot Video Generation
Rui-Qi Wu, Liangyu Chen, Tong Yang et al.
PixelLM: Pixel Reasoning with Large Multimodal Model
Zhongwei Ren, Zhicheng Huang, Yunchao Wei et al.
Towards CLIP-driven Language-free 3D Visual Grounding via 2D-3D Relational Enhancement and Consistency
Yuqi Zhang, Han Luo, Yinjie Lei
iKUN: Speak to Trackers without Retraining
Yunhao Du, Cheng Lei, Zhicheng Zhao et al.
Neural Fields as Distributions: Signal Processing Beyond Euclidean Space
Daniel Rebain, Soroosh Yazdani, Kwang Moo Yi et al.
Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation
Xingqun Qi, Jiahao Pan, Peng Li et al.
LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection
Yunpeng Luo, Junlong Du, Ke Yan et al.
Stratified Avatar Generation from Sparse Observations
Han Feng, Wenchao Ma, Quankai Gao et al.
Few-shot Learner Parameterization by Diffusion Time-steps
Zhongqi Yue, Pan Zhou, Richang Hong et al.
Global and Hierarchical Geometry Consistency Priors for Few-shot NeRFs in Indoor Scenes
Xiaotian Sun, Qingshan Xu, Xinjie Yang et al.
Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation
Zihan Wang, Xiangyang Li, Jiahao Yang et al.
Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis
Simon Niedermayr, Josef Stumpfegger, Rüdiger Westermann
The STVchrono Dataset: Towards Continuous Change Recognition in Time
Yanjun Sun, Yue Qiu, Mariia Khan et al.
Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection
Ke Li, Di Wang, Zhangyuan Hu et al.
Motion Blur Decomposition with Cross-shutter Guidance
Xiang Ji, Haiyang Jiang, Yinqiang Zheng
LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge
Gongwei Chen, Leyang Shen, Rui Shao et al.
Pixel-Aligned Language Model
Jiarui Xu, Xingyi Zhou, Shen Yan et al.
Eclipse: Disambiguating Illumination and Materials using Unintended Shadows
Dor Verbin, Ben Mildenhall, Peter Hedman et al.
ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis
Muhammad Hamza Mughal, Rishabh Dabral, Ikhsanul Habibie et al.
2S-UDF: A Novel Two-stage UDF Learning Method for Robust Non-watertight Model Reconstruction from Multi-view Images
Junkai Deng, Fei Hou, Xuhui Chen et al.
ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models
Lukas Höllein, Aljaž Božič, Norman Müller et al.
Taming Stable Diffusion for Text to 360 Panorama Image Generation
Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella et al.
CAMEL: CAusal Motion Enhancement Tailored for Lifting Text-driven Video Editing
Guiwei Zhang, Tianyu Zhang, Guanglin Niu et al.
DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation
Yuanchen Wu, Xichen Ye, Kequan Yang et al.
A Physics-informed Low-rank Deep Neural Network for Blind and Universal Lens Aberration Correction
Jin Gong, Runzhao Yang, Weihang Zhang et al.
NAPGuard: Towards Detecting Naturalistic Adversarial Patches
Siyang Wu, Jiakai Wang, Jiejie Zhao et al.
Descriptor and Word Soups: Overcoming the Parameter Efficiency Accuracy Tradeoff for Out-of-Distribution Few-shot Learning
Christopher Liao, Theodoros Tsiligkaridis, Brian Kulis
A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning
Xiaoyang Xu, Mengda Yang, Wenzhe Yi et al.
Bootstrapping SparseFormers from Vision Foundation Models
Ziteng Gao, Zhan Tong, Kevin Qinghong Lin et al.
Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity
Ruijie Quan, Wenguan Wang, Zhibo Tian et al.
G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images
Zixiong Huang, Qi Chen, Libo Sun et al.
Active Prompt Learning in Vision Language Models
Jihwan Bang, Sumyeong Ahn, Jae-Gil Lee
Generating Handwritten Mathematical Expressions From Symbol Graphs: An End-to-End Pipeline
Yu Chen, Fei Gao, Yanguang Zhang et al.
On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation
Agneet Chatterjee, Tejas Gokhale, Chitta Baral et al.
SingularTrajectory: Universal Trajectory Predictor Using Diffusion Model
Inhwan Bae, Young-Jae Park, Hae-Gon Jeon
Domain Separation Graph Neural Networks for Saliency Object Ranking
Zijian Wu, Jun Lu, Jing Han et al.
Solving the Catastrophic Forgetting Problem in Generalized Category Discovery
Xinzi Cao, Xiawu Zheng, Guanhong Wang et al.
Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving
Yuqi Wang, Jiawei He, Lue Fan et al.
WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models
Changhoon Kim, Kyle Min, Maitreya Patel et al.
MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images
Junwen Huang, Hao Yu, Kuan-Ting Yu et al.
Resource-Efficient Transformer Pruning for Finetuning of Large Models
Fatih Ilhan, Gong Su, Selim Tekin et al.
Link-Context Learning for Multimodal LLMs
Yan Tai, Weichen Fan, Zhao Zhang et al.
The Manga Whisperer: Automatically Generating Transcriptions for Comics
Ragav Sachdeva, Andrew Zisserman
Deep-TROJ: An Inference Stage Trojan Insertion Algorithm through Efficient Weight Replacement Attack
Sabbir Ahmed, Ranyang Zhou, Shaahin Angizi et al.
Dynamic LiDAR Re-simulation using Compositional Neural Fields
Hanfeng Wu, Xingxing Zuo, Stefan Leutenegger et al.
Language-aware Visual Semantic Distillation for Video Question Answering
Bo Zou, Chao Yang, Yu Qiao et al.
3DInAction: Understanding Human Actions in 3D Point Clouds
Yizhak Ben-Shabat, Oren Shrout, Stephen Gould
DiLiGenRT: A Photometric Stereo Dataset with Quantified Roughness and Translucency
Heng Guo, Jieji Ren, Feishi Wang et al.
StyLitGAN: Image-Based Relighting via Latent Control
Anand Bhattad, James Soole, David Forsyth
Label-Efficient Group Robustness via Out-of-Distribution Concept Curation
Yiwei Yang, Anthony Liu, Robert Wolfe et al.
Unsupervised Universal Image Segmentation
XuDong Wang, Dantong Niu, Xinyang Han et al.
Batch Normalization Alleviates the Spectral Bias in Coordinate Networks
Zhicheng Cai, Hao Zhu, Qiu Shen et al.
Not All Classes Stand on Same Embeddings: Calibrating a Semantic Distance with Metric Tensor
Jae Hyeon Park, Gyoomin Lee, Seunggi Park et al.
CodedEvents: Optimal Point-Spread-Function Engineering for 3D-Tracking with Event Cameras
Sachin Shah, Matthew Chan, Haoming Cai et al.
Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling
Zhe Li, Zerong Zheng, Lizhen Wang et al.
Retrieval-Augmented Open-Vocabulary Object Detection
Jooyeon Kim, Eulrang Cho, Sehyung Kim et al.
NB-GTR: Narrow-Band Guided Turbulence Removal
Yifei Xia, Chu Zhou, Chengxuan Zhu et al.
LangSplat: 3D Language Gaussian Splatting
Minghan Qin, Wanhua Li, Jiawei Zhou et al.
Positive-Unlabeled Learning by Latent Group-Aware Meta Disambiguation
Lin Long, Haobo Wang, Zhijie Jiang et al.
Text-conditional Attribute Alignment across Latent Spaces for 3D Controllable Face Image Synthesis
FeiFan Xu, Rui Li, Si Wu et al.
SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering
Antoine Guédon, Vincent Lepetit
DiffusionPoser: Real-time Human Motion Reconstruction From Arbitrary Sparse Sensors Using Autoregressive Diffusion
Tom Van Wouwe, Seunghwan Lee, Antoine Falisse et al.
HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion
Jingbo Zhang, Xiaoyu Li, Qi Zhang et al.
CurveCloudNet: Processing Point Clouds with 1D Structure
Colton Stearns, Alex Fu, Jiateng Liu et al.
Harnessing Meta-Learning for Improving Full-Frame Video Stabilization
Muhammad Kashif Ali, Eun Woo Im, Dongjin Kim et al.
Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving
Junhao Zheng, Chenhao Lin, Jiahao Sun et al.
SeaBird: Segmentation in Bird’s View with Dice Loss Improves Monocular 3D Detection of Large Objects
Abhinav Kumar, Yuliang Guo, Xinyu Huang et al.
MoML: Online Meta Adaptation for 3D Human Motion Prediction
Xiaoning Sun, Huaijiang Sun, Bin Li et al.
Learning with Structural Labels for Learning with Noisy Labels
Noo-ri Kim, Jin-Seop Lee, Jee-Hyong Lee
What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models
Letian Zhang, Xiaotong Zhai, Zhongkai Zhao et al.
Incremental Nuclei Segmentation from Histopathological Images via Future-class Awareness and Compatibility-inspired Distillation
Huyong Wang, Huisi Wu, Jing Qin
Model Inversion Robustness: Can Transfer Learning Help?
Sy-Tuyen Ho, Koh Jun Hao, Keshigeyan Chandrasegaran et al.
Scene-adaptive and Region-aware Multi-modal Prompt for Open Vocabulary Object Detection
Xiaowei Zhao, Xianglong Liu, Duorui Wang et al.
InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models
Jiun Tian Hoe, Xudong Jiang, Chee Seng Chan et al.
MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection
Boyang Peng, Sanqing Qu, Yong Wu et al.
Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis
Xin Zhou, Dingkang Liang, Wei Xu et al.
EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation
Md Mostafijur Rahman, Mustafa Munir, Radu Marculescu
On Exact Inversion of DPM-Solvers
Seongmin Hong, Kyeonghyun Lee, Suh Yoon Jeon et al.
Generate Like Experts: Multi-Stage Font Generation by Incorporating Font Transfer Process into Diffusion Models
Bin Fu, Fanghua Yu, Anran Liu et al.
A Unified Diffusion Framework for Scene-aware Human Motion Estimation from Sparse Signals
Jiangnan Tang, Jingya Wang, Kaiyang Ji et al.
MaskCLR: Attention-Guided Contrastive Learning for Robust Action Representation Learning
Mohamed Abdelfattah, Mariam Hassan, Alex Alahi
D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection
Dinh Phat Do, Taehoon Kim, Jaemin Na et al.
MAGICK: A Large-scale Captioned Dataset from Matting Generated Images using Chroma Keying
Ryan Burgert, Brian Price, Jason Kuen et al.
Intrinsic Image Diffusion for Indoor Single-view Material Estimation
Peter Kocsis, Vincent Sitzmann, Matthias Nießner
Prompt Highlighter: Interactive Control for Multi-Modal LLMs
Yuechen Zhang, Shengju Qian, Bohao Peng et al.
Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion?
Zhengyue Zhao, Jinhao Duan, Kaidi Xu et al.
NetTrack: Tracking Highly Dynamic Objects with a Net
Guangze Zheng, Shijie Lin, Haobo Zuo et al.
Scaling Up Video Summarization Pretraining with Large Language Models
Dawit Argaw Argaw, Seunghyun Yoon, Fabian Caba Heilbron et al.
Online Task-Free Continual Generative and Discriminative Learning via Dynamic Cluster Memory
Fei Ye, Adrian Bors
FADES: Fair Disentanglement with Sensitive Relevance
Taeuk Jang, Xiaoqian Wang
Versatile Navigation Under Partial Observability via Value-guided Diffusion Policy
Gengyu Zhang, Hao Tang, Yan Yan
Improving Depth Completion via Depth Feature Upsampling
Yufei Wang, Ge Zhang, Shaoqian Wang et al.
Deep Single Image Camera Calibration by Heatmap Regression to Recover Fisheye Images Under Manhattan World Assumption
Nobuhiko Wakai, Satoshi Sato, Yasunori Ishii et al.
MRFS: Mutually Reinforcing Image Fusion and Segmentation
Hao Zhang, Xuhui Zuo, Jie Jiang et al.
Multi-agent Long-term 3D Human Pose Forecasting via Interaction-aware Trajectory Conditioning
Jaewoo Jeong, Daehee Park, Kuk-Jin Yoon
OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning
Noor Ahmed, Anna Kukleva, Bernt Schiele
3D-LFM: Lifting Foundation Model
Mosam Dabhi, László A. Jeni, Simon Lucey
LASIL: Learner-Aware Supervised Imitation Learning For Long-term Microscopic Traffic Simulation
Ke Guo, Zhenwei Miao, Wei Jing et al.
HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces
Haithem Turki, Vasu Agrawal, Samuel Rota Bulò et al.
IIRP-Net: Iterative Inference Residual Pyramid Network for Enhanced Image Registration
Tai Ma, Suwei Zhang, Jiafeng Li et al.
SEED-Bench: Benchmarking Multimodal Large Language Models
Bohao Li, Yuying Ge, Yixiao Ge et al.
Style Aligned Image Generation via Shared Attention
Amir Hertz, Andrey Voynov, Shlomi Fruchter et al.
NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows
Zhenggang Tang, Jason Ren, Xiaoming Zhao et al.
BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models
Fengyuan Shi, Jiaxi Gu, Hang Xu et al.
Active Domain Adaptation with False Negative Prediction for Object Detection
Yuzuru Nakamura, Yasunori Ishii, Takayoshi Yamashita
LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content
Qihao Zhao, Yalun Dai, Hao Li et al.
How to Train Neural Field Representations: A Comprehensive Study and Benchmark
Samuele Papa, Riccardo Valperga, David Knigge et al.
Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring
Chengxu Liu, Xuan Wang, Xiangyu Xu et al.
SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation
Zhixuan Liu, Peter Schaldenbrand, Beverley-Claire Okogwu et al.
Reg-PTQ: Regression-specialized Post-training Quantization for Fully Quantized Object Detector
Yifu Ding, Weilun Feng, Chuyan Chen et al.
FREE: Faster and Better Data-Free Meta-Learning
Yongxian Wei, Zixuan Hu, Zhenyi Wang et al.
Open Vocabulary Semantic Scene Sketch Understanding
Ahmed Bourouis, Judith Fan, Yulia Gryaditskaya
You Only Need Less Attention at Each Stage in Vision Transformers
Shuoxi Zhang, Hanpeng Liu, Stephen Lin et al.
Hierarchical Patch Diffusion Models for High-Resolution Video Generation
Ivan Skorokhodov, Willi Menapace, Aliaksandr Siarohin et al.
Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection
Chuangchuang Tan, Huan Liu, Yao Zhao et al.
Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception
Haoming Chen, Zhizhong Zhang, Yanyun Qu et al.
BoQ: A Place is Worth a Bag of Learnable Queries
Amar Ali-bey, Brahim Chaib-draa, Philippe Giguère
UFC-Net: Unrolling Fixed-point Continuous Network for Deep Compressive Sensing
Xiaoyang Wang, Hongping Gan
Symphonize 3D Semantic Scene Completion with Contextual Instance Queries
Haoyi Jiang, Tianheng Cheng, Naiyu Gao et al.
CPLIP: Zero-Shot Learning for Histopathology with Comprehensive Vision-Language Alignment
Sajid Javed, Arif Mahmood, Iyyakutti Iyappan Ganapathi et al.
MaskPLAN: Masked Generative Layout Planning from Partial Input
Hang Zhang, Anton Savov, Benjamin Dillenburger
Solving Masked Jigsaw Puzzles with Diffusion Vision Transformers
Jinyang Liu, Wondmgezahu Teshome, Sandesh Ghimire et al.
Towards Memorization-Free Diffusion Models
Chen Chen, Daochang Liu, Chang Xu
AV-RIR: Audio-Visual Room Impulse Response Estimation
Anton Ratnarajah, Sreyan Ghosh, Sonal Kumar et al.
Entangled View-Epipolar Information Aggregation for Generalizable Neural Radiance Fields
Zhiyuan Min, Yawei Luo, Wei Yang et al.
A-Teacher: Asymmetric Network for 3D Semi-Supervised Object Detection
Hanshi Wang, Zhipeng Zhang, Jin Gao et al.
HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances
Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen et al.
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao, Haiping Wu, Weijian Xu et al.
DMR: Decomposed Multi-Modality Representations for Frames and Events Fusion in Visual Reinforcement Learning
Haoran Xu, Peixi Peng, Guang Tan et al.
3D Feature Tracking via Event Camera
Siqi Li, Zhou Zhikuan, Zhou Xue et al.
Frequency-aware Event-based Video Deblurring for Real-World Motion Blur
Taewoo Kim, Hoonhee Cho, Kuk-Jin Yoon
FedHCA2: Towards Hetero-Client Federated Multi-Task Learning
Yuxiang Lu, Suizhi Huang, Yuwen Yang et al.
Improving Unsupervised Hierarchical Representation with Reinforcement Learning
Ruyi An, Yewen Li, Xu He et al.
Global Latent Neural Rendering
Thomas Tanay, Matteo Maggioni
Data Poisoning based Backdoor Attacks to Contrastive Learning
Jinghuai Zhang, Hongbin Liu, Jinyuan Jia et al.
RoHM: Robust Human Motion Reconstruction via Diffusion
Siwei Zhang, Bharat Lal Bhatnagar, Yuanlu Xu et al.
SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos
Tao Wu, Runyu He, Gangshan Wu et al.
Classes Are Not Equal: An Empirical Study on Image Recognition Fairness
Jiequan Cui, Beier Zhu, Xin Wen et al.
ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object
Chenshuang Zhang, Fei Pan, Junmo Kim et al.
BlockGCN: Redefine Topology Awareness for Skeleton-Based Action Recognition
Yuxuan Zhou, Xudong Yan, Zhi-Qi Cheng et al.
Dynamic Inertial Poser (DynaIP): Part-Based Motion Dynamics Learning for Enhanced Human Pose Estimation with Sparse Inertial Sensors
Yu Zhang, Songpengcheng Xia, Lei Chu et al.
Person-in-WiFi 3D: End-to-End Multi-Person 3D Pose Estimation with Wi-Fi
Kangwei Yan, Fei Wang, Bo Qian et al.
ERMVP: Communication-Efficient and Collaboration-Robust Multi-Vehicle Perception in Challenging Environments
Jingyu Zhang, Kun Yang, Yilei Wang et al.
GRAM: Global Reasoning for Multi-Page VQA
Itshak Blau, Sharon Fogel, Roi Ronen et al.
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data
Qifan Yu, Juncheng Li, Longhui Wei et al.
Tri-Perspective View Decomposition for Geometry-Aware Depth Completion
Zhiqiang Yan, Yuankai Lin, Kun Wang et al.
MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric
Haokun Lin, Haoli Bai, Zhili Liu et al.
DiffusionRegPose: Enhancing Multi-Person Pose Estimation using a Diffusion-Based End-to-End Regression Approach
Dayi Tan, Hansheng Chen, Wei Tian et al.
Tumor Micro-environment Interactions Guided Graph Learning for Survival Analysis of Human Cancers from Whole-slide Pathological Images
Wei Shao, YangYang Shi, Daoqiang Zhang et al.
Perception-Oriented Video Frame Interpolation via Asymmetric Blending
Guangyang Wu, Xin Tao, Changlin Li et al.
Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation
Qi Yang, Xing Nie, Tong Li et al.
Exact Fusion via Feature Distribution Matching for Few-shot Image Generation
Yingbo Zhou, Yutong Ye, Pengyu Zhang et al.
Fooling Polarization-Based Vision using Locally Controllable Polarizing Projection
Zhuoxiao Li, Zhihang Zhong, Shohei Nobuhara et al.
Affine Equivariant Networks Based on Differential Invariants
Yikang Li, Yeqing Qiu, Yuxuan Chen et al.
Diffusion-based Blind Text Image Super-Resolution
Yuzhe Zhang, Jiawei Zhang, Hao Li et al.