Most Cited 2024 "llm adaptation" Papers
12,324 papers found • Page 10 of 62
Conference
Adapting Short-Term Transformers for Action Detection in Untrimmed Videos
Min Yang, gaohuan, Ping Guo et al.
Exploring the Transferability of Visual Prompting for Multimodal Large Language Models
Yichi Zhang, Yinpeng Dong, Siyuan Zhang et al.
Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation
Kai Huang, Hanyun Yin, Heng Huang et al.
Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding
Tatsunori Taniai, Ryo Igarashi, Yuta Suzuki et al.
DeiT-LT: Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets
Harsh Rangwani, Pradipto Mondal, Mayank Mishra et al.
Understanding Video Transformers via Universal Concept Discovery
Matthew Kowal, Achal Dave, Rares Andrei Ambrus et al.
Decompose-and-Compose: A Compositional Approach to Mitigating Spurious Correlation
Fahimeh Hosseini Noohdani, Parsa Hosseini, Aryan Yazdan Parast et al.
Keypoint Promptable Re-Identification
Vladimir Somers, Alexandre ALahi, Christophe De Vleeschouwer
Decomposing Semantic Shifts for Composed Image Retrieval
Xingyu Yang, Daqing Liu, Heng Zhang et al.
HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud
WENCAN CHENG, Hao Tang, Luc Van Gool et al.
UniM2AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving
Jian Zou, Tianyu Huang, Guanglei Yang et al.
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data
Qihao Liu, Yi Zhang, Song Bai et al.
CycleINR: Cycle Implicit Neural Representation for Arbitrary-Scale Volumetric Super-Resolution of Medical Data
Wei Fang, Yuxing Tang, Heng Guo et al.
InfMAE: A Foundation Model in The Infrared Modality
Fangcen liu, Chenqiang Gao, Yaming Zhang et al.
Raindrop Clarity: A Dual-Focused Dataset for Day and Night Raindrop Removal
Yeying Jin, Xin Li, Jiadong Wang et al.
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation
Bolin Lai, Fiona Ryan, Wenqi Jia et al.
HAVE-FUN: Human Avatar Reconstruction from Few-Shot Unconstrained Images
Xihe Yang, Xingyu Chen, Daiheng Gao et al.
Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision
Hao Dong, Eleni Chatzi, Olga Fink
Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation
Tao Chen, Xiruo Jiang, Gensheng Pei et al.
LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units
Zeyu Liu, Gourav Datta, Anni Li et al.
One-stage Prompt-based Continual Learning
Youngeun Kim, YUHANG LI, Priyadarshini Panda
Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation
Junyan Wang, Zhenhong Sun, Stewart Tan et al.
M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models
Seunggeun Chi, Hyung-gun Chi, Hengbo Ma et al.
Unsupervised Layer-Wise Score Aggregation for Textual OOD Detection
Maxime Darrin, Guillaume Staerman, Eduardo Dadalto Camara Gomes et al.
Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance
Tien Toan Nguyen, Minh Nhat Nhat Vu, Baoru Huang et al.
MESA: Matching Everything by Segmenting Anything
Yesheng Zhang, Xu Zhao
EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval
Thomas Hummel, Shyamgopal Karthik, Mariana-Iuliana Georgescu et al.
Latent Diffusion Prior Enhanced Deep Unfolding for Snapshot Spectral Compressive Imaging
Zongliang Wu, Ruiying Lu, Ying Fu et al.
Multiagent Multitraversal Multimodal Self-Driving: Open MARS Dataset
Yiming Li, Zhiheng Li, Nuo Chen et al.
Revisiting Adversarial Training Under Long-Tailed Distributions
Xinli Yue, Ningping Mou, Qian Wang et al.
Label-Agnostic Forgetting: A Supervision-Free Unlearning in Deep Models
Shaofei Shen, Chenhao Zhang, Yawen Zhao et al.
Motion-prior Contrast Maximization for Dense Continuous-Time Motion Estimation
Friedhelm Hamann, Ziyun Wang, Ioannis Asmanis et al.
Weakly Supervised Semantic Segmentation for Driving Scenes
Dongseob Kim, Seungho Lee, Junsuk Choe et al.
CLOSER: Towards Better Representation Learning for Few-Shot Class-Incremental Learning
Junghun Oh, Sungyong Baik, Kyoung Mu Lee
Diffusion Model is a Good Pose Estimator from 3D RF-Vision
Junqiao Fan, Jianfei Yang, Yuecong Xu et al.
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation
Xuelu Feng, Dongdong Chen, Junsong Yuan et al.
SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration
Kezheng Xiong, Maoji Zheng, Qingshan Xu et al.
LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation
Yuchen Su, Zhineng Chen, Zhiwen Shao et al.
Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition
Mingfang Zhang, Yifei Huang, Ruicong Liu et al.
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
Brian Gordon, Yonatan Bitton, Yonatan Shafir et al.
PreRoutGNN for Timing Prediction with Order Preserving Partition: Global Circuit Pre-training, Local Delay Learning and Attentional Cell Modeling
Ruizhe Zhong, Junjie Ye, Zhentao Tang et al.
Adaptive VIO: Deep Visual-Inertial Odometry with Online Continual Learning
Youqi Pan, Wugen Zhou, Yingdian Cao et al.
Towards Understanding Factual Knowledge of Large Language Models
Xuming Hu, Junzhe Chen, Xiaochuan Li et al.
What Makes a Good Prune? Maximal Unstructured Pruning for Maximal Cosine Similarity
Gabryel Mason-Williams, Fredrik Dahlqvist
Condition-Aware Neural Network for Controlled Image Generation
Han Cai, Muyang Li, Qinsheng Zhang et al.
A Comprehensive Augmentation Framework for Anomaly Detection
Lin Jiang, Yaping Yan
Programmable Motion Generation for Open-Set Motion Control Tasks
Hanchao Liu, Xiaohang Zhan, Shaoli Huang et al.
Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models
Matthew Kowal, Richard P. Wildes, Kosta Derpanis
Object Pose Estimation via the Aggregation of Diffusion Features
Tianfu Wang, Guosheng Hu, Hongguang Wang
IS-DARTS: Stabilizing DARTS through Precise Measurement on Candidate Importance
Hongyi He, Longjun Liu, Haonan Zhang et al.
Lazy Diffusion Transformer for Interactive Image Editing
Yotam Nitzan, Zongze Wu, Richard Zhang et al.
Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models
Yu-Chu Yu, Chi-Pin Huang, Jr-Jen Chen et al.
R-MAE: Regions Meet Masked Autoencoders
Duy-Kien Nguyen, Yanghao Li, Vaibhav Aggarwal et al.
Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes
Gaurav Shrivastava, Abhinav Shrivastava
PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer
Tongkun Guan, Chengyu Lin, Wei Shen et al.
Taming Latent Diffusion Model for Neural Radiance Field Inpainting
Chieh Lin, Changil Kim, Jia-Bin Huang et al.
ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting
Chen Duan, Pei Fu, Shan Guo et al.
Diversified and Personalized Multi-rater Medical Image Segmentation
Yicheng Wu, Xiangde Luo, Zhe Xu et al.
LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors
Saksham Suri, Matthew Walmer, Kamal Gupta et al.
City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web
Kaiwen Song, Xiaoyi Zeng, Chenqu Ren et al.
Day-Night Cross-domain Vehicle Re-identification
Hongchao Li, Jingong Chen, AIHUA ZHENG et al.
Stable Unlearnable Example: Enhancing the Robustness of Unlearnable Examples via Stable Error-Minimizing Noise
Yixin Liu, Kaidi Xu, Xun Chen et al.
C^2RV: Cross-Regional and Cross-View Learning for Sparse-View CBCT Reconstruction
Yiqun Lin, Jiewen Yang, hualiang wang et al.
Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders
Alexandre Eymaël, Renaud Vandeghen, Anthony Cioppa et al.
An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding
Wei Chen, Long Chen, Yu Wu
Learning to Optimize Permutation Flow Shop Scheduling via Graph-Based Imitation Learning
Longkang Li, Siyuan Liang, Zihao Zhu et al.
TP2O: Creative Text Pair-to-Object Generation using Balance Swap-Sampling
Jun Li, Zedong Zhang, Jian Yang
Three Heads Are Better than One: Complementary Experts for Long-Tailed Semi-supervised Learning
Chengcheng Ma, Ismail Elezi, Jiankang Deng et al.
Auto-GAS: Automated Proxy Discovery for Training-free Generative Architecture Search
Lujun Li, Haosen SUN, Shiwen Li et al.
Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning
Yibing Wei, Abhinav Gupta, Pedro Morgado
Osmosis: RGBD Diffusion Prior for Underwater Image Restoration
Opher Bar Nathan, Deborah Steinberger-Levy, Tali Treibitz et al.
Efficiently Assemble Normalization Layers and Regularization for Federated Domain Generalization
Khiem Le, Tuan Long Ho, Cuong Do et al.
TransCAD: A Hierarchical Transformer for CAD Sequence Inference from Point Clouds
Dupont Elona, Kseniya Cherenkova, Dimitrios Mallis et al.
CADTalk: An Algorithm and Benchmark for Semantic Commenting of CAD Programs
Haocheng Yuan, Jing Xu, Hao Pan et al.
Joint Demosaicing and Denoising for Spike Camera
Yanchen Dong, Ruiqin Xiong, Jing Zhao et al.
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
Jeongsoo Choi, Se Jin Park, Minsu Kim et al.
Review-Enhanced Hierarchical Contrastive Learning for Recommendation
Ke Wang, Yanmin Zhu, Tianzi Zang et al.
Progressive Poisoned Data Isolation for Training-Time Backdoor Defense
Yiming Chen, Haiwei Wu, Jiantao Zhou
Context Diffusion: In-Context Aware Image Generation
Ivona Najdenkoska, Animesh Sinha, Abhimanyu Dubey et al.
Transformer-Based Selective Super-resolution for Efficient Image Refinement
Tianyi Zhang, Kishore Kasichainula, Yaoxin Zhuo et al.
AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking
Yuheng Li, Tianyu Luan, Yizhou Wu et al.
Revealing the Proximate Long-Tail Distribution in Compositional Zero-Shot Learning
Chenyi Jiang, Haofeng Zhang
Semi-supervised Active Learning for Video Action Detection
Ayush Singh, Aayush J Rana, Akash Kumar et al.
Learning Hierarchical Image Segmentation For Recognition and By Recognition
Tsung-Wei Ke, Sangwoo Mo, Stella Yu
Interactive3D: Create What You Want by Interactive 3D Generation
Shaocong Dong, Lihe Ding, Zhanpeng Huang et al.
Commonsense Prototype for Outdoor Unsupervised 3D Object Detection
Hai Wu, Shijia Zhao, Xun Huang et al.
Breaking Physical and Linguistic Borders: Multilingual Federated Prompt Tuning for Low-Resource Languages
Wanru Zhao, Yihong Chen, Royson Lee et al.
Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer
Junyi Wu, Bin Duan, Weitai Kang et al.
CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis
Xiaoxiao Sun, Xingjian Leng, Zijian Wang et al.
Inverse Rendering of Glossy Objects via the Neural Plenoptic Function and Radiance Fields
Haoyuan Wang, Wenbo Hu, Lei Zhu et al.
Align Before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition
Yifei Chen, Dapeng Chen, Ruijin Liu et al.
DiffusionNAG: Predictor-guided Neural Architecture Generation with Diffusion Models
Sohyun An, Hayeon Lee, Jaehyeong Jo et al.
PatchRefiner: Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation
Zhenyu Li, Shariq Farooq Bhat, Peter Wonka
Iterated Learning Improves Compositionality in Large Vision-Language Models
Chenhao Zheng, Jieyu Zhang, Aniruddha Kembhavi et al.
Music Style Transfer with Time-Varying Inversion of Diffusion Models
Sifei Li, Yuxin Zhang, Fan Tang et al.
CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding
eslam Abdelrahman, Mohamed Ayman Mohamed, Mahmoud Ahmed et al.
Controllable Navigation Instruction Generation with Chain of Thought Prompting
Xianghao Kong, Jinyu Chen, Wenguan Wang et al.
Frozen Feature Augmentation for Few-Shot Image Classification
Andreas Bär, Neil Houlsby, Mostafa Dehghani et al.
Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis
Qian Chen, Shihao Shu, Xiangzhi Bai
TCLC-GS: Tightly Coupled LiDAR-Camera Gaussian Splatting for Autonomous Driving
Cheng Zhao, su sun, Ruoyu Wang et al.
Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation
Yixiao Wang, Chen Tang, Lingfeng Sun et al.
Weakly Supervised Open-Vocabulary Object Detection
Jianghang Lin, Yunhang Shen, Bingquan Wang et al.
Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception
Lei Fan, Mingfu Liang, Yunxuan Li et al.
Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold
Jun Chen, Haishan Ye, Mengmeng Wang et al.
Quadratic models for understanding catapult dynamics of neural networks
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan et al.
Comparing the Robustness of Modern No-Reference Image- and Video-Quality Metrics to Adversarial Attacks
Anastasia Antsiferova, Khaled Abud, Aleksandr Gushchin et al.
FRIH: Fine-Grained Region-Aware Image Harmonization
Jinlong Peng, Zekun Luo, Liang Liu et al.
PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts
Zewen Chen, Haina Qin, Juan Wang et al.
Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses
Inhee Lee, Byungjun Kim, Hanbyul Joo
DART: Implicit Doppler Tomography for Radar Novel View Synthesis
Tianshu Huang, John Miller, Akarsh Prabhakara et al.
ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open Vocabulary Object Detection
Joonhyun Jeong, Geondo Park, Jayeon Yoo et al.
SURER: Structure-Adaptive Unified Graph Neural Network for Multi-View Clustering
Jing Wang, Songhe Feng, Gengyu Lyu et al.
LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model
Dongkai Wang, shiyu xuan, Shiliang Zhang
Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective
Ming Zhong, Chenxin An, Weizhu Chen et al.
MSD: A Benchmark Dataset for Floor Plan Generation of Building Complexes
Casper van Engelenburg, Fatemeh Mostafavi, Emanuel Kuhn et al.
CORN: Contact-based Object Representation for Nonprehensile Manipulation of General Unseen Objects
Yoonyoung Cho, Junhyek Han, Yoontae Cho et al.
Mirage: Model-agnostic Graph Distillation for Graph Classification
Mridul Gupta, Sahil Manchanda, HARIPRASAD KODAMANA et al.
MaGGIe: Masked Guided Gradual Human Instance Matting
Chuong Huynh, Seoung Wug Oh, Abhinav Shrivastava et al.
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
Andreas Engelhardt, Amit Raj, Mark Boss et al.
SuperGaussian: Repurposing Video Models for 3D Super Resolution
Yuan Shen, Duygu Ceylan, Paul Guerrero et al.
GaussReg: Fast 3D Registration with Gaussian Splatting
Jiahao Chang, Yinglin Xu, Yihao Li et al.
DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery
Yixuan Zhu, Ao Li, Yansong Tang et al.
KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
Yu Wang, Xin Li, Shengzhao Wen et al.
SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS
Yameng Peng, Andy Song, Haytham Fayek et al.
SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis
Hanrong Ye, Jason Wen Yong Kuen, Qing Liu et al.
Every Node Is Different: Dynamically Fusing Self-Supervised Tasks for Attributed Graph Clustering
Pengfei Zhu, Qian Wang, Yu Wang et al.
VividDreamer: Invariant Score Distillation for Hyper-Realistic Text-to-3D Generation
Wenjie Zhuo, Fan Ma, Hehe Fan et al.
Grounded Object-Centric Learning
Avinash Kori, Francesco Locatello, Fabio De Sousa Ribeiro et al.
Self-Supervised Multi-Modal Knowledge Graph Contrastive Hashing for Cross-Modal Search
Meiyu Liang, Junping Du, Zhengyang Liang et al.
Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables
Haisong Gong, Weizhi Xu, Shu Wu et al.
PARE-Net: Position-Aware Rotation-Equivariant Networks for Robust Point Cloud Registration
Runzhao Yao, Shaoyi Du, Wenting Cui et al.
Unleashing the Potential of Fractional Calculus in Graph Neural Networks with FROND
Qiyu Kang, Kai Zhao, Qinxu Ding et al.
Deep Diffusion Image Prior for Efficient OOD Adaptation in 3D Inverse Problems
Hyungjin Chung, Jong Chul Ye
Get an A in Math: Progressive Rectification Prompting
Zhenyu Wu, Meng Jiang, Chao Shen
Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation
Xiaoyang Chen, Hao Zheng, Yuemeng LI et al.
Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation
Qiushi Zhu, Jie Zhang, Yu Gu et al.
MeshSegmenter: Zero-Shot Mesh Segmentation via Texture Synthesis
ziming zhong, Yanyu Xu, Jing Li et al.
What How and When Should Object Detectors Update in Continually Changing Test Domains?
Jayeon Yoo, Dongkwan Lee, Inseop Chung et al.
One-Shot Structure-Aware Stylized Image Synthesis
Hansam Cho, Jonghyun Lee, Seunggyu Chang et al.
DG-PIC: Domain Generalized Point-In-Context Learning for Point Cloud Understanding
Jincen Jiang, Qianyu Zhou, Yuhang Li et al.
Compositional Generative Inverse Design
Tailin Wu, Takashi Maruyama, Long Wei et al.
LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation
Ruida Zhang, Ziqin Huang, Gu Wang et al.
Codebook Transfer with Part-of-Speech for Vector-Quantized Image Modeling
Baoquan Zhang, Huaibin Wang, Luo Chuyao et al.
Gaussian Shadow Casting for Neural Characters
Luis Bolanos, Shih-Yang Su, Helge Rhodin
Signed Graph Neural Ordinary Differential Equation for Modeling Continuous-Time Dynamics
Lanlan Chen, Kai Wu, Jian Lou et al.
Real-Time Simulated Avatar from Head-Mounted Sensors
Zhengyi Luo, Jinkun Cao, Rawal Khirodkar et al.
Enhancing Source-Free Domain Adaptive Object Detection with Low-confidence Pseudo Label Distillation
Ilhoon Yoon, Hyeongjun Kwon, Jin Kim et al.
Enhancing Vision-Language Pre-training with Rich Supervisions
Yuan Gao, Kunyu Shi, Pengkai Zhu et al.
The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?
Qinyu Zhao, Ming Xu, Kartik Gupta et al.
Diffusion Bridges for 3D Point Cloud Denoising
Mathias Vogel, Keisuke Tateno, Marc Pollefeys et al.
Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation
Xiyi Chen, Marko Mihajlovic, Shaofei Wang et al.
Adapters Strike Back
Jan-Martin Steitz, Stefan Roth
LookupViT: Compressing visual information to a limited number of tokens
Rajat Koner, Gagan Jain, Sujoy Paul et al.
Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment
Alireza Ganjdanesh, Shangqian Gao, Heng Huang
Tensorized Label Learning on Anchor Graph
Jing Li, Quanxue Gao, Qianqian Wang et al.
Learning MDL Logic Programs from Noisy Data
Céline Hocquette, Andreas Niskanen, Matti Järvisalo et al.
Learning Optimal Advantage from Preferences and Mistaking It for Reward
W Bradley Knox, Stephane Hatgis-Kessell, Sigurdur Orn Adalgeirsson et al.
The Hard Positive Truth about Vision-Language Compositionality
Amita Kamath, Cheng-Yu Hsieh, Kai-Wei Chang et al.
HiLo: Detailed and Robust 3D Clothed Human Reconstruction with High-and Low-Frequency Information of Parametric Models
Yifan Yang, Dong Liu, Shuhai Zhang et al.
ConsistDreamer: 3D-Consistent 2D Diffusion for High-Fidelity Scene Editing
Jun-Kun Chen, Samuel Rota Bulò, Norman Müller et al.
Instance-Aware Group Quantization for Vision Transformers
Jaehyeon Moon, Dohyung Kim, Jun Yong Cheon et al.
OVER-NAV: Elevating Iterative Vision-and-Language Navigation with Open-Vocabulary Detection and StructurEd Representation
Ganlong Zhao, Guanbin Li, Weikai Chen et al.
Adversarial Score Distillation: When score distillation meets GAN
Min Wei, Jingkai Zhou, Junyao Sun et al.
Cyclic Learning for Binaural Audio Generation and Localization
Zhaojian Li, Bin Zhao, Yuan Yuan
AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation
Yangchao Wu, Tian Yu Liu, Hyoungseob Park et al.
Improving Spectral Snapshot Reconstruction with Spectral-Spatial Rectification
Jiancheng Zhang, Haijin Zeng, Yongyong Chen et al.
Progressive Divide-and-Conquer via Subsampling Decomposition for Accelerated MRI
Chong Wang, Lanqing Guo, Yufei Wang et al.
OmniMotionGPT: Animal Motion Generation with Limited Data
Zhangsihao Yang, Mingyuan Zhou, Mengyi Shan et al.
BaCon: Boosting Imbalanced Semi-supervised Learning via Balanced Feature-Level Contrastive Learning
Qianhan Feng, Lujing Xie, Shijie Fang et al.
Bridging the Synthetic-to-Authentic Gap: Distortion-Guided Unsupervised Domain Adaptation for Blind Image Quality Assessment
Aobo Li, Jinjian Wu, Yongxu Liu et al.
Tackling Structural Hallucination in Image Translation with Local Diffusion
Seunghoi Kim, Chen Jin, Tom Diethe et al.
Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty from Pre-trained Models
Gianni Franchi, Olivier Laurent, Maxence Leguéry et al.
Animal Avatars: Reconstructing Animatable 3D Animals from Casual Videos
Remy Sabathier, David Novotny, Niloy Mitra
TeMO: Towards Text-Driven 3D Stylization for Multi-Object Meshes
Xuying Zhang, Bo-Wen Yin, yuming chen et al.
Semi-supervised Open-World Object Detection
Sahal Shaji Mullappilly, Abhishek Singh Gehlot, Rao Muhammad Anwer et al.
ColNeRF: Collaboration for Generalizable Sparse Input Neural Radiance Field
Zhangkai Ni, Peiqi Yang, Wenhan Yang et al.
Living Scenes: Multi-object Relocalization and Reconstruction in Changing 3D Environments
Liyuan Zhu, Shengyu Huang, Konrad Schindler et al.
A Noisy Elephant in the Room: Is Your Out-of-Distribution Detector Robust to Label Noise?
Galadrielle Humblot-Renaux, Sergio Escalera, Thomas B. Moeslund
Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches
Qing Yu, Mikihiro Tanaka, Kent Fujiwara
Instant 3D Human Avatar Generation using Image Diffusion Models
Nikos Kolotouros, Thiemo Alldieck, Enric Corona et al.
Open Panoramic Segmentation
Junwei Zheng, Ruiping Liu, Yufan Chen et al.
Unveiling Typographic Deceptions: Insights of the Typographic Vulnerability in Large Vision-Language Models
Hao Cheng, Erjia Xiao, Jindong Gu et al.
CREAD: A Classification-Restoration Framework with Error Adaptive Discretization for Watch Time Prediction in Video Recommender Systems
Jie Sun, Zhao Ying Ding, Xiaoshuang Chen et al.
Kill Two Birds with One Stone: Rethinking Data Augmentation for Deep Long-tailed Learning
Binwu Wang, Pengkun Wang, Wei Xu et al.
FRDiff : Feature Reuse for Universal Training-free Acceleration of Diffusion Models
Junhyuk So, Jungwon Lee, Eunhyeok Park
Self-Supervised Video Desmoking for Laparoscopic Surgery
Renlong Wu, Zhilu Zhang, Shuohao Zhang et al.
Learning to Visually Localize Sound Sources from Mixtures without Prior Source Knowledge
Dongjin Kim, Sung Jin Um, Sangmin Lee et al.
Adversarial Training Should Be Cast as a Non-Zero-Sum Game
Alex Robey, Fabian Latorre, George Pappas et al.
ScanFormer: Referring Expression Comprehension by Iteratively Scanning
Wei Su, Peihan Miao, Huanzhang Dou et al.
Quad Bayer Joint Demosaicing and Denoising Based on Dual Encoder Network with Joint Residual Learning
Bolun Zheng, Li Haoran, Quan Chen et al.
History Matters: Temporal Knowledge Editing in Large Language Model
Xunjian Yin, Jin Jiang, Liming Yang et al.
Kandinsky Conformal Prediction: Efficient Calibration of Image Segmentation Algorithms
Joren Brunekreef, Eric Marcus, Ray Sheombarsing et al.
Accelerating Image Generation with Sub-path Linear Approximation Model
Chen Xu, Tianhui Song, Weixin Feng et al.
VEON: Vocabulary-Enhanced Occupancy Prediction
Jilai Zheng, Pin Tang, Zhongdao Wang et al.
Multimarginal Generative Modeling with Stochastic Interpolants
Michael Albergo, Nicholas Boffi, Michael Lindsey et al.
Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI
Nabeel Seedat, Fergus Imrie, Mihaela van der Schaar
Bidirectional Autoregessive Diffusion Model for Dance Generation
Canyu Zhang, Youbao Tang, NING Zhang et al.
Predicated Diffusion: Predicate Logic-Based Attention Guidance for Text-to-Image Diffusion Models
Kota Sueyoshi, Takashi Matsubara
GenesisTex: Adapting Image Denoising Diffusion to Texture Space
Chenjian Gao, Boyan Jiang, Xinghui Li et al.
Dynamic Retraining-Updating Mean Teacher for Source-Free Object Detection
BA KHANH TRINH LE, Huy-Hung Nguyen, Long Hoang Pham et al.