Most Cited 2024 "gpu kernel optimization" Papers
Neural Spline Fields for Burst Image Fusion and Layer Separation
Ilya Chugunov, David Shustin, Ruyu Yan et al.
AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis
Dongze Li, Kang Zhao, Wei Wang et al.
ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-Order Optimization
Shuoran Jiang, Qingcai Chen, Yang Xiang et al.
IMPUS: Image Morphing with Perceptually-Uniform Sampling Using Diffusion Models
Zhaoyuan Yang, Zhengyang Yu, Zhiwei Xu et al.
Few-Shot Anomaly-Driven Generation for Anomaly Classification and Segmentation
Guan Gui, Bin-Bin Gao, Jun Liu et al.
Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions
Zeyu Han, Fangrui Zhu, Qianru Lao et al.
GCNext: Towards the Unity of Graph Convolutions for Human Motion Prediction
Xinshun Wang, Qiongjie Cui, Chen Chen et al.
UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement
Yaofeng Xie, Lingwei Kong, Kai Chen et al.
MOFDiff: Coarse-grained Diffusion for Metal-Organic Framework Design
Xiang Fu, Tian Xie, Andrew Rosen et al.
Online Zero-Shot Classification with CLIP
Qi Qian, Juhua Hu
PreSight: Enhancing Autonomous Vehicle Perception with City-Scale NeRF Priors
Tianyuan Yuan, Mao Yucheng, Jiawei Yang et al.
Unmixing Diffusion for Self-Supervised Hyperspectral Image Denoising
Haijin Zeng, Jiezhang Cao, Yongyong Chen et al.
WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding
Quan Kong, Yuki Kawana, Rajat Saini et al.
Visible and Clear: Finding Tiny Objects in Difference Map
Bing Cao, Haiyu Yao, Pengfei Zhu et al.
Learning to Adapt SAM for Segmenting Cross-domain Point Clouds
Xidong Peng, Runnan Chen, Feng Qiao et al.
Real-time 3D-aware Portrait Video Relighting
Ziqi Cai, Kaiwen Jiang, Shu-Yu Chen et al.
VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions
Seokha Moon, Hyun Woo, Hongbeen Park et al.
Pathologies of Predictive Diversity in Deep Ensembles
Geoff Pleiss, Taiga Abe, E. Kelly Buchanan et al.
Boosting Neural Cognitive Diagnosis with Student’s Affective State Modeling
Shanshan Wang, Zhen Zeng, Xun Yang et al.
DreamPropeller: Supercharge Text-to-3D Generation with Parallel Sampling
Linqi Zhou, Andy Shih, Chenlin Meng et al.
Rethinking Few-shot 3D Point Cloud Semantic Segmentation
Zhaochong An, Guolei Sun, Yun Liu et al.
Unlocking the Potential of Prompt-Tuning in Bridging Generalized and Personalized Federated Learning
Wenlong Deng, Christos Thrampoulidis, Xiaoxiao Li
Lipschitz Singularities in Diffusion Models
Zhantao Yang, Ruili Feng, Han Zhang et al.
SEED: A Simple and Effective 3D DETR in Point Clouds
Zhe Liu, Jinghua Hou, Xiaoqing Ye et al.
Surface Reconstruction for 3D Gaussian Splatting via Local Structural Hints
Qianyi Wu, Jianmin Zheng, Jianfei Cai
Question Calibration and Multi-Hop Modeling for Temporal Question Answering
Chao Xue, Di Liang, Pengfei Wang et al.
Conditional Information Bottleneck Approach for Time Series Imputation
MinGyu Choi, Changhee Lee
Clustering Propagation for Universal Medical Image Segmentation
Yuhang Ding, Liulei Li, Wenguan Wang et al.
NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors
Yannan He, Garvita Tiwari, Tolga Birdal et al.
Language-Driven Anchors for Zero-Shot Adversarial Robustness
Xiao Li, Wei Zhang, Yining Liu et al.
PracticalDG: Perturbation Distillation on Vision-Language Models for Hybrid Domain Generalization
Zining Chen, Weiqiu Wang, Zhicheng Zhao et al.
3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation
Songchun Zhang, Yibo Zhang, Quan Zheng et al.
SAGS: Structure-Aware 3D Gaussian Splatting
Evangelos Ververas, Rolandos Alexandros Potamias, Song Jifei et al.
RealViformer: Investigating Attention for Real-World Video Super-Resolution
Yuehan Zhang, Angela Yao
Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach
Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal et al.
TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models
Haomiao Ni, Bernhard Egger, Suhas Lohit et al.
One-Shot Diffusion Mimicker for Handwritten Text Generation
Gang Dai, Yifan Zhang, Quhui Ke et al.
Factorized Diffusion: Perceptual Illusions by Noise Decomposition
Daniel Geng, Inbum Park, Andrew Owens
Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning
Desai Xie, Jiahao Li, Hao Tan et al.
VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams
Liao Wang, Kaixin Yao, Chengcheng Guo et al.
A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars
Ronglai Zuo, Fangyun Wei, Zenggui Chen et al.
HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations
Peng Dai, Yang Zhang, Tao Liu et al.
Isomorphic Pruning for Vision Models
Gongfan Fang, Xinyin Ma, Michael Bi Mi et al.
PoRF: Pose Residual Field for Accurate Neural Surface Reconstruction
Jia-Wang Bian, Wenjing Bian, Victor Prisacariu et al.
Learning to Predict Activity Progress by Self-Supervised Video Alignment
Gerard Donahue, Ehsan Elhamifar
Exploring Diverse Representations for Open Set Recognition
Yu Wang, Junxian Mu, Pengfei Zhu et al.
AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis
Tao Tang, Guangrun Wang, Yixing Lao et al.
Conformal Autoregressive Generation: Beam Search with Coverage Guarantees
Nicolas Deutschmann, Marvin Alberts, María Rodríguez Martínez
TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression
Ho-Joong Kim, Jung-Ho Hong, Heejo Kong et al.
Exploring the Promise and Limits of Real-Time Recurrent Learning
Kazuki Irie, Anand Gopalakrishnan, Jürgen Schmidhuber
Simplifying Source-Free Domain Adaptation for Object Detection: Effective Self-Training Strategies and Performance Insights
Yan Hao, Florent Forest, Olga Fink
Steerers: A Framework for Rotation Equivariant Keypoint Descriptors
Georg Bökman, Johan Edstedt, Michael Felsberg et al.
Multi-Level Neural Scene Graphs for Dynamic Urban Environments
Tobias Fischer, Lorenzo Porzi, Samuel Rota Bulò et al.
ColorPeel: Color Prompt Learning with Diffusion Models via Color and Shape Disentanglement
Muhammad Atif Butt, Kai Wang, Javier Vazquez-Corral et al.
STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models
Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung et al.
Customizing Language Model Responses with Contrastive In-Context Learning
Xiang Gao, Kamalika Das
360Loc: A Dataset and Benchmark for Omnidirectional Visual Localization with Cross-device Queries
Huajian Huang, Changkun Liu, Yipeng Zhu et al.
NeRF-HuGS: Improved Neural Radiance Fields in Non-static Scenes Using Heuristics-Guided Segmentation
Jiahao Chen, Yipeng Qin, Lingjie Liu et al.
A Call to Reflect on Evaluation Practices for Age Estimation: Comparative Analysis of the State-of-the-Art and a Unified Benchmark
Jakub Paplham, Vojtech Franc
On the Variance of Neural Network Training with respect to Test Sets and Distributions
Keller Jordan
WordRobe: Text-Guided Generation of Textured 3D Garments
Astitva Srivastava, Pranav Manu, Amit Raj et al.
Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use
Imad Eddine Toubal, Aditya Avinash, Neil Alldrin et al.
Underwater Organism Color Fine-Tuning via Decomposition and Guidance
Xiaofeng Cong, Jie Gui, Junming Hou
Instrumental Variable Estimation for Causal Inference in Longitudinal Data with Time-Dependent Latent Confounders
Debo Cheng, Ziqi Xu, Jiuyong Li et al.
Improving Cross-Modal Alignment with Synthetic Pairs for Text-Only Image Captioning
Zhiyue Liu, Jinyuan Liu, Fanrong Ma
Improving Video Segmentation via Dynamic Anchor Queries
Yikang Zhou, Tao Zhang, Xiangtai Li et al.
Distilling Vision-Language Models on Millions of Videos
Yue Zhao, Long Zhao, Xingyi Zhou et al.
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation
Marco Mistretta, Alberto Baldrati, Marco Bertini et al.
An Upload-Efficient Scheme for Transferring Knowledge From a Server-Side Pre-trained Generator to Clients in Heterogeneous Federated Learning
Jianqing Zhang, Yang Liu, Yang Hua et al.
Training-free Video Temporal Grounding using Large-scale Pre-trained Models
Minghang Zheng, Xinhao Cai, Qingchao Chen et al.
Improved Graph Contrastive Learning for Short Text Classification
Yonghao Liu, Lan Huang, Fausto Giunchiglia et al.
Leaving the Nest: Going beyond Local Loss Functions for Predict-Then-Optimize
Sanket Shah, Bryan Wilder, Andrew Perrault et al.
Structure-Guided Adversarial Training of Diffusion Models
Ling Yang, Haotian Qian, Zhilong Zhang et al.
MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception
Mohammad Mahbubur Rahman, Ryoma Yataka, Sorachi Kato et al.
ASAM: Boosting Segment Anything Model with Adversarial Tuning
Bo Li, Haoke Xiao, Lv Tang
AMEGO: Active Memory from long EGOcentric videos
Gabriele Goletto, Tushar Nagarajan, Giuseppe Averta et al.
Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration
Chu Jie Qin, Ruiqi Wu, Zikun Liu et al.
TeTriRF: Temporal Tri-Plane Radiance Fields for Efficient Free-Viewpoint Video
Minye Wu, Zehao Wang, Georgios Kouros et al.
Embarrassingly Simple Dataset Distillation
Yunzhen Feng, Shanmukha Ramakrishna Vedantam, Julia Kempe
Navigation Instruction Generation with BEV Perception and Large Language Models
Sheng Fan, Rui Liu, Wenguan Wang et al.
Long-Tailed Anomaly Detection with Learnable Class Names
Chih-Hui Ho, Kuan-Chuan Peng, Nuno Vasconcelos
Improving Plasticity in Online Continual Learning via Collaborative Learning
Maorong Wang, Nicolas Michel, Ling Xiao et al.
FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients
Shangchao Su, Bin Li, Xiangyang Xue
Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning
Xiongye Xiao, Gengshuo Liu, Gaurav Gupta et al.
MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections
Jiayue Liu, Tang Xiao, Freeman Cheng et al.
Pre-training Sequence, Structure, and Surface Features for Comprehensive Protein Representation Learning
Youhan Lee, Hasun Yu, Jaemyung Lee et al.
ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models
Yi-Lin Sung, Jaehong Yoon, Mohit Bansal
GOODAT: Towards Test-Time Graph Out-of-Distribution Detection
Luzhi Wang, Di Jin, He Zhang et al.
SelEx: Self-Expertise in Fine-Grained Generalized Category Discovery
Sarah Rastegar, Mohammadreza Salehi, Yuki M Asano et al.
Self-Distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach
Ziyin Zhang, Ning Lu, Minghui Liao et al.
VideoMAC: Video Masked Autoencoders Meet ConvNets
Gensheng Pei, Tao Chen, Xiruo Jiang et al.
PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees
Chulin Xie, De-An Huang, Wenda Chu et al.
Sketch and Refine: Towards Fast and Accurate Lane Detection
Chao Chen, Jie Liu, Chang Zhou et al.
Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering
Antoine Guedon, Vincent Lepetit
Diffusion for Natural Image Matting
Yihan Hu, Yiheng Lin, Wei Wang et al.
One-Class Face Anti-spoofing via Spoof Cue Map-Guided Feature Learning
Pei-Kai Huang, Cheng-Hsuan Chiang, Tzu-Hsien Chen et al.
Grid Diffusion Models for Text-to-Video Generation
Taegyeong Lee, Soyeong Kwon, Taehwan Kim
Towards Open Domain Text-Driven Synthesis of Multi-Person Motions
Shan Mengyi, Lu Dong, Yutao Han et al.
MLP Can Be A Good Transformer Learner
Sihao Lin, Pumeng Lyu, Dongrui Liu et al.
Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection
Songmin Dai, Yifan Wu, Xiaoqiang Li et al.
A Graph-Based Approach for Category-Agnostic Pose Estimation
Or Hirschorn, Shai Avidan
EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models
Ruoxi Chen, Haibo Jin, Yixin Liu et al.
FlowTrack: Revisiting Optical Flow for Long-Range Dense Tracking
Seokju Cho, Gabriel Huang, Seungryong Kim et al.
Domain Randomization via Entropy Maximization
Gabriele Tiboni, Pascal Klink, Jan Peters et al.
Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation
Ji-Jia Wu, Andy Chia-Hao Chang, Chieh-Yu Chuang et al.
ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis
Kensen Shi, Joey Hong, Yinlin Deng et al.
ConR: Contrastive Regularizer for Deep Imbalanced Regression
Mahsa Keramati, Lili Meng, R. Evans
Upper Bounding Barlow Twins: A Novel Filter for Multi-Relational Clustering
Xiaowei Qian, Bingheng Li, Zhao Kang
Zero-Shot Aerial Object Detection with Visual Description Regularization
Chenyu Lin, Zhengqing Zang, Chenwei Tang et al.
Federated Learning with Extremely Noisy Clients via Negative Distillation
Yang Lu, Lin Chen, Yonggang Zhang et al.
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
Junyi Chen, Longteng Guo, Jia Sun et al.
Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks
Siyu Zou, Jiji Tang, Yiyi Zhou et al.
You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval
Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain et al.
DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement
Hao Wu, Huabin Liu, Yu Qiao et al.
Improved Self-Training for Test-Time Adaptation
Jing Ma
Dual Prior Unfolding for Snapshot Compressive Imaging
Jiancheng Zhang, Haijin Zeng, Jiezhang Cao et al.
WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models
Zijian He, Peixin Chen, Guangrun Wang et al.
Loose Inertial Poser: Motion Capture with IMU-attached Loose-Wear Jacket
Chengxu Zuo, Yiming Wang, Lishuang Zhan et al.
Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation
Yunheng Li, Zhong-Yu Li, Quan-Sheng Zeng et al.
Joint Reconstruction of 3D Human and Object via Contact-Based Refinement Transformer
Hyeongjin Nam, Daniel Jung, Gyeongsik Moon et al.
Minimum Coverage Sets for Training Robust Ad Hoc Teamwork Agents
Arrasy Rahman, Jiaxun Cui, Peter Stone
HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras
Zhongyu Xia, ZhiWei Lin, Xinhao Wang et al.
Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models
Tianzhe Chu, Shengbang Tong, Tianjiao Ding et al.
HR-Pro: Point-Supervised Temporal Action Localization via Hierarchical Reliability Propagation
Huaxin Zhang, Xiang Wang, Xiaohao Xu et al.
RPSC: Robust Pseudo-Labeling for Semantic Clustering
Sihang Liu, Wenming Cao, Ruigang Fu et al.
SmartControl: Enhancing ControlNet for Handling Rough Visual Conditions
Xiaoyu Liu, Yuxiang Wei, Ming Liu et al.
Diffusion Language-Shapelets for Semi-supervised Time-Series Classification
Zhen Liu, Wenbin Pei, Disen Lan et al.
Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection
Jiacheng Zhang, Jiaming Li, Xiangru Lin et al.
Exploring Efficient Asymmetric Blind-Spots for Self-Supervised Denoising in Real-World Scenarios
Shiyan Chen, Jiyuan Zhang, Zhaofei Yu et al.
DriveTrack: A Benchmark for Long-Range Point Tracking in Real-World Videos
Arjun Balasingam, Joseph Chandler, Chenning Li et al.
Repeated Fair Allocation of Indivisible Items
Ayumi Igarashi, Martin Lackner, Oliviero Nardi et al.
Leveraging Cross-Modal Neighbor Representation for Improved CLIP Classification
Chao Yi, Lu Ren, De-Chuan Zhan et al.
Exemplar-free Continual Representation Learning via Learnable Drift Compensation
Alex Gomez-Villa, Dipam Goswami, Kai Wang et al.
LAN: Learning to Adapt Noise for Image Denoising
Changjin Kim, Tae Hyun Kim, Sungyong Baik
FedFixer: Mitigating Heterogeneous Label Noise in Federated Learning
Xinyuan Ji, Zhaowei Zhu, Wei Xi et al.
Any2Point: Empowering Any-modality Transformers for Efficient 3D Understanding
Yiwen Tang, Renrui Zhang, Jiaming Liu et al.
Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation
Xinyao Li, Yuke Li, Zhekai Du et al.
iHuman: Instant Animatable Digital Humans From Monocular Videos
Pramish Paudel, Anubhav Khanal, Danda Pani Paudel et al.
Continuous Memory Representation for Anomaly Detection
Joo Chan Lee, Taejune Kim, Eunbyung Park et al.
RecDiffusion: Rectangling for Image Stitching with Diffusion Models
Tianhao Zhou, Li Haipeng, Ziyi Wang et al.
LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion
Pancheng Zhao, Peng Xu, Pengda Qin et al.
FusionFormer: A Concise Unified Feature Fusion Transformer for 3D Pose Estimation
Yanlu Cai, Weizhong Zhang, Yuan Wu et al.
ParamISP: Learned Forward and Inverse ISPs using Camera Parameters
Woohyeok Kim, Geonu Kim, Junyong Lee et al.
Efficient Subgraph GNNs by Learning Effective Selection Policies
Beatrice Bevilacqua, Moshe Eliasof, Eli Meirom et al.
SPIRE: Semantic Prompt-Driven Image Restoration
Chenyang Qi, Zhengzhong Tu, Keren Ye et al.
Federated Q-Learning: Linear Regret Speedup with Low Communication Cost
Zhong Zheng, Fengyu Gao, Lingzhou Xue et al.
LiDAR-based Person Re-identification
Wenxuan Guo, Zhiyu Pan, Yingping Liang et al.
TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling
Dong Huo, Zixin Guo, Xinxin Zuo et al.
Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models
Yang Zhang, Tze Tzun Teoh, Wei Hern Lim et al.
InterFusion: Text-Driven Generation of 3D Human-Object Interaction
Sisi Dai, Wenhao Li, Haowen Sun et al.
EAT: Towards Long-Tailed Out-of-Distribution Detection
Tong Wei, Bo-Lin Wang, Min-Ling Zhang
ZeroI2V: Zero-Cost Adaptation of Pre-Trained Transformers from Image to Video
Xinhao Li, Yuhan Zhu, Limin Wang
VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning
Kang Chen, Xiangqian Wu
To Grok or not to Grok: Disentangling Generalization and Memorization on Corrupted Algorithmic Datasets
Darshil Doshi, Aritra Das, Tianyu He et al.
Deep Incomplete Multi-View Learning Network with Insufficient Label Information
Zhangqi Jiang, Tingjin Luo, Xinyan Liang
Deep Orthogonal Hypersphere Compression for Anomaly Detection
Yunhe Zhang, Yan Sun, Jinyu Cai et al.
eTraM: Event-based Traffic Monitoring Dataset
Aayush Atul Verma, Bharatesh Chakravarthi, Arpitsinh Vaghela et al.
WebVLN: Vision-and-Language Navigation on Websites
Qi Chen, Dileepa Pitawela, Chongyang Zhao et al.
CA-Jaccard: Camera-aware Jaccard Distance for Person Re-identification
Yiyu Chen, Zheyi Fan, Zhaoru Chen et al.
DreamComposer: Controllable 3D Object Generation via Multi-View Conditions
Yunhan Yang, Yukun Huang, Xiaoyang Wu et al.
Is Retain Set All You Need in Machine Unlearning? Restoring Performance of Unlearned Models with Out-Of-Distribution Images
Jacopo Bonato, Marco Cotogni, Luigi Sabetta
Improving Transferable Targeted Adversarial Attacks with Model Self-Enhancement
Han Wu, Guanyan Ou, Weibin Wu et al.
NViST: In the Wild New View Synthesis from a Single Image with Transformers
Wonbong Jang, Lourdes Agapito
Constrained Bayesian Optimization under Partial Observations: Balanced Improvements and Provable Convergence
Shengbo Wang, Ke Li
Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data
Antonis Antoniades, Yiyi Yu, Joe Canzano et al.
OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects
Akshay Krishnan, Abhijit Kundu, Kevis Maninis et al.
Towards Real-world Event-guided Low-light Video Enhancement and Deblurring
Taewoo Kim, Jaeseok Jeong, Hoonhee Cho et al.
Theoretically Achieving Continuous Representation of Oriented Bounding Boxes
Zikai Xiao, Guo-Ye Yang, Xue Yang et al.
CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction
Zhangchen Ye, Tao Jiang, Chenfeng Xu et al.
SparseCraft: Few-Shot Neural Reconstruction through Stereopsis Guided Geometric Linearization
Mae Younes, Amine Ouasfi, Adnane Boukhayma
As-Plausible-As-Possible: Plausibility-Aware Mesh Deformation Using 2D Diffusion Priors
Seungwoo Yoo, Kunho Kim, Vladimir G. Kim et al.
Distinguished In Uniform: Self-Attention Vs. Virtual Nodes
Eran Rosenbluth, Jan Tönshoff, Martin Ritzert et al.
Discriminability-Driven Channel Selection for Out-of-Distribution Detection
Yue Yuan, Rundong He, Yicong Dong et al.
ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval
Fang Kaipeng, Jingkuan Song, Lianli Gao et al.
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
Changan Chen, Puyuan Peng, Ami Baid et al.
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation
Hao Fang, Peng Wu, Yawei Li et al.
You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation
Mehdi Noroozi, Isma Hadji, Brais Martinez et al.
Mixed-Precision Quantization for Federated Learning on Resource-Constrained Heterogeneous Devices
Huancheng Chen, Haris Vikalo
Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach
Wei Dong, Xing Zhang, Bihui Chen et al.
Centering the Value of Every Modality: Towards Efficient and Resilient Modality-agnostic Semantic Segmentation
Xu Zheng, Yuanhuiyi Lyu, Jiazhou Zhou et al.
Scalable 3D Registration via Truncated Entry-wise Absolute Residuals
Tianyu Huang, Liangzu Peng, Rene Vidal et al.
Towards Neuro-Symbolic Video Understanding
Minkyu Choi, Harsh Goel, Mohammad Omama et al.
Contourlet Residual for Prompt Learning Enhanced Infrared Image Super-Resolution
Xingyuan Li, Jinyuan Liu, Zhixin Chen et al.
Robust-Wide: Robust Watermarking against Instruction-driven Image Editing
Runyi Hu, Jie Zhang, Ting Xu et al.
Dexterous Grasp Transformer
Guo-Hao Xu, Yi-Lin Wei, Dian Zheng et al.
ConGeo: Robust Cross-view Geo-localization across Ground View Variations
Li Mi, Chang Xu, Javiera Castillo Navarro et al.
Robust Image Denoising through Adversarial Frequency Mixup
Donghun Ryou, Inju Ha, Hyewon Yoo et al.
Adversarial Adaptive Sampling: Unify PINN and Optimal Transport for the Approximation of PDEs
Kejun Tang, Jiayu Zhai, Xiaoliang Wan et al.
Unsupervised Keypoints from Pretrained Diffusion Models
Eric Hedlin, Gopal Sharma, Shweta Mahajan et al.
Boosting of Thoughts: Trial-and-Error Problem Solving with Large Language Models
Sijia Chen, Baochun Li, Di Niu
LatentEditor: Text Driven Local Editing of 3D Scenes
Umar Khalid, Hasan Iqbal, Muhammad Tayyab et al.
ReGCL: Rethinking Message Passing in Graph Contrastive Learning
Cheng Ji, Zixuan Huang, Qingyun Sun et al.
ODIN: A Single Model for 2D and 3D Segmentation
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios et al.
Bounds on Representation-Induced Confounding Bias for Treatment Effect Estimation
Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel
LayoutFlow: Flow Matching for Layout Generation
Julian Jorge Andrade Guerreiro, Naoto Inoue, Kento Masui et al.
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
Weitai Kang, Gaowen Liu, Shah Mubarak et al.
HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields
Haozhe Qi, Chen Zhao, Mathieu Salzmann et al.
OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation
Zhenyu Wang, Ya-Li Li, Taichi Liu et al.
GarmentCodeData: A Dataset of 3D Made-to-Measure Garments With Sewing Patterns
Maria Korosteleva, Timur Levent Kesdogan, Fabian Kemper et al.
The Unreasonable Effectiveness of Pre-Trained Features for Camera Pose Refinement
Gabriele Trivigno, Carlo Masone, Barbara Caputo et al.