Most Cited 2024 "oblivious soft decision tree" Papers
12,324 papers found • Page 9 of 62
Conference
Rethinking Few-shot 3D Point Cloud Semantic Segmentation
Zhaochong An, Guolei Sun, Yun Liu et al.
SAGS: Structure-Aware 3D Gaussian Splatting
Evangelos Ververas, Rolandos Alexandros Potamias, Song Jifei et al.
RealViformer: Investigating Attention for Real-World Video Super-Resolution
Yuehan Zhang, Angela Yao
Real-time 3D-aware Portrait Video Relighting
Ziqi Cai, Kaiwen Jiang, Shu-Yu Chen et al.
Question Calibration and Multi-Hop Modeling for Temporal Question Answering
Chao Xue, Di Liang, Pengfei Wang et al.
MOFDiff: Coarse-grained Diffusion for Metal-Organic Framework Design
Xiang Fu, Tian Xie, Andrew Rosen et al.
One-Shot Diffusion Mimicker for Handwritten Text Generation
Gang Dai, Yifan Zhang, Quhui Ke et al.
Factorized Diffusion: Perceptual Illusions by Noise Decomposition
Daniel Geng, Inbum Park, Andrew Owens
PrPSeg: Universal Proposition Learning for Panoramic Renal Pathology Segmentation
Ruining Deng, Quan Liu, Can Cui et al.
A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars
Ronglai Zuo, Fangyun Wei, Zenggui Chen et al.
HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations
Peng Dai, Yang Zhang, Tao Liu et al.
Isomorphic Pruning for Vision Models
Gongfan Fang, Xinyin Ma, Michael Bi Mi et al.
One-Class Face Anti-spoofing via Spoof Cue Map-Guided Feature Learning
Pei-Kai Huang, Cheng-Hsuan Chiang, Tzu-Hsien Chen et al.
Federated Learning with Extremely Noisy Clients via Negative Distillation
Yang Lu, Lin Chen, Yonggang Zhang et al.
Simplifying Source-Free Domain Adaptation for Object Detection: Effective Self-Training Strategies and Performance Insights
Yan Hao, Florent Forest, Olga Fink
Upper Bounding Barlow Twins: A Novel Filter for Multi-Relational Clustering
Xiaowei Qian, Bingheng Li, Zhao Kang
ColorPeel: Color Prompt Learning with Diffusion Models via Color and Shape Disentanglement
Muhammad Atif Butt, Kai Wang, Javier Vazquez-Corral et al.
FlowTrack: Revisiting Optical Flow for Long-Range Dense Tracking
Seokju Cho, Gabriel Huang, Seungryong Kim et al.
DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement
Hao Wu, Huabin Liu, Yu Qiao et al.
WordRobe: Text-Guided Generation of Textured 3D Garments
Astitva Srivastava, Pranav Manu, Amit Raj et al.
Learning to Predict Activity Progress by Self-Supervised Video Alignment
Gerard Donahue, Ehsan Elhamifar
Improving Video Segmentation via Dynamic Anchor Queries
Yikang Zhou, Tao Zhang, Xiangtai Li et al.
Instrumental Variable Estimation for Causal Inference in Longitudinal Data with Time-Dependent Latent Confounders
Debo Cheng, Ziqi Xu, Jiuyong Li et al.
ConR: Contrastive Regularizer for Deep Imbalanced Regression
Mahsa Keramati, Lili Meng, R. Evans
Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation
Yunheng Li, Zhong-Yu Li, Quan-Sheng Zeng et al.
ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis
Kensen Shi, Joey Hong, Yinlin Deng et al.
Exploring Diverse Representations for Open Set Recognition
Yu Wang, Junxian Mu, Pengfei Zhu et al.
Domain Randomization via Entropy Maximization
Gabriele Tiboni, Pascal Klink, Jan Peters et al.
AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis
Tao Tang, Guangrun Wang, Yixing Lao et al.
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation
Marco Mistretta, Alberto Baldrati, Marco Bertini et al.
Underwater Organism Color Fine-Tuning via Decomposition and Guidance
Xiaofeng Cong, Jie Gui, Junming Hou
Multi-Level Neural Scene Graphs for Dynamic Urban Environments
Tobias Fischer, Lorenzo Porzi, Samuel Rota Bulò et al.
Conformal Autoregressive Generation: Beam Search with Coverage Guarantees
Nicolas Deutschmann, Marvin Alberts, María Rodríguez Martínez
NeRF-HuGS: Improved Neural Radiance Fields in Non-static Scenes Using Heuristics-Guided Segmentation
Jiahao Chen, Yipeng Qin, Lingjie Liu et al.
Improving Cross-Modal Alignment with Synthetic Pairs for Text-Only Image Captioning
Zhiyue Liu, Jinyuan Liu, Fanrong Ma
Aligning Geometric Spatial Layout in Cross-View Geo-Localization via Feature Recombination
Qingwang Zhang, Yingying Zhu
STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models
Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung et al.
Loose Inertial Poser: Motion Capture with IMU-attached Loose-Wear Jacket
Chengxu Zuo, Yiming Wang, Lishuang Zhan et al.
Training-free Video Temporal Grounding using Large-scale Pre-trained Models
Minghang Zheng, Xinhao Cai, Qingchao Chen et al.
360Loc: A Dataset and Benchmark for Omnidirectional Visual Localization with Cross-device Queries
Huajian Huang, Changkun Liu, Yipeng Zhu et al.
On the Variance of Neural Network Training with respect to Test Sets and Distributions
Keller Jordan
ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models
Yi-Lin Sung, Jaehong Yoon, Mohit Bansal
MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception
Mohammad Mahbubur Rahman, Ryoma Yataka, Sorachi Kato et al.
AMEGO: Active Memory from long EGOcentric videos
Gabriele Goletto, Tushar Nagarajan, Giuseppe Averta et al.
Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration
Chu Jie Qin, Ruiqi Wu, Zikun Liu et al.
You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval
Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain et al.
Navigation Instruction Generation with BEV Perception and Large Language Models
Sheng Fan, Rui Liu, Wenguan Wang et al.
ZePT: Zero-Shot Pan-Tumor Segmentation via Query-Disentangling and Self-Prompting
Yankai Jiang, Zhongzhen Huang, Rongzhao Zhang et al.
Leaving the Nest: Going beyond Local Loss Functions for Predict-Then-Optimize
Sanket Shah, Bryan Wilder, Andrew Perrault et al.
Structure-Guided Adversarial Training of Diffusion Models
Ling Yang, Haotian Qian, Zhilong Zhang et al.
PORF: POSE RESIDUAL FIELD FOR ACCURATE NEURAL SURFACE RECONSTRUCTION
Jia-Wang Bian, Wenjing Bian, Victor Prisacariu et al.
FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients
Shangchao Su, Bin Li, Xiangyang Xue
Improved Graph Contrastive Learning for Short Text Classification
Yonghao Liu, Lan Huang, Fausto Giunchiglia et al.
MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections
Jiayue Liu, Tang Xiao, Freeman Cheng et al.
Steerers: A Framework for Rotation Equivariant Keypoint Descriptors
Georg Bökman, Johan Edstedt, Michael Felsberg et al.
Exploring the Promise and Limits of Real-Time Recurrent Learning
Kazuki Irie, Anand Gopalakrishnan, Jürgen Schmidhuber
SelEx: Self-Expertise in Fine-Grained Generalized Category Discovery
Sarah Rastegar, Mohammadreza Salehi, Yuki M Asano et al.
A Call to Reflect on Evaluation Practices for Age Estimation: Comparative Analysis of the State-of-the-Art and a Unified Benchmark
Jakub Paplham, Vojtech Franc
An Upload-Efficient Scheme for Transferring Knowledge From a Server-Side Pre-trained Generator to Clients in Heterogeneous Federated Learning
Jianqing Zhang, Yang Liu, Yang Hua et al.
Distilling Vision-Language Models on Millions of Videos
Yue Zhao, Long Zhao, Xingyi Zhou et al.
ASAM: Boosting Segment Anything Model with Adversarial Tuning
Bo Li, Haoke Xiao, Lv Tang
Sketch and Refine: Towards Fast and Accurate Lane Detection
Chao Chen, Jie Liu, Chang Zhou et al.
Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering
Antoine Guedon, Vincent Lepetit
Diffusion for Natural Image Matting
Yihan Hu, Yiheng Lin, Wei Wang et al.
GOODAT: Towards Test-Time Graph Out-of-Distribution Detection
Luzhi Wang, Di Jin, He Zhang et al.
Towards Open Domain Text-Driven Synthesis of Multi-Person Motions
Shan Mengyi, Lu Dong, Yutao Han et al.
A Graph-Based Approach for Category-Agnostic Pose Estimation
Or Hirschorn, Shai Avidan
EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models
Ruoxi Chen, Haibo Jin, Yixin Liu et al.
Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection
Songmin Dai, Yifan Wu, Xiaoqiang Li et al.
TeTriRF: Temporal Tri-Plane Radiance Fields for Efficient Free-Viewpoint Video
Minye Wu, Zehao Wang, Georgios Kouros et al.
Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions
Zeyu Han, Fangrui Zhu, Qianru Lao et al.
Long-Tailed Anomaly Detection with Learnable Class Names
Chih-Hui Ho, Kuan-Chuan Peng, Nuno Vasconcelos
Self-Distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach
Ziyin Zhang, Ning Lu, Minghui Liao et al.
MLP Can Be A Good Transformer Learner
Sihao Lin, Pumeng Lyu, Dongrui Liu et al.
Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning
Xiongye Xiao, Gengshuo Liu, Gaurav Gupta et al.
Improving Plasticity in Online Continual Learning via Collaborative Learning
Maorong Wang, Nicolas Michel, Ling Xiao et al.
Grid Diffusion Models for Text-to-Video Generation
Taegyeong Lee, Soyeong Kwon, Taehwan Kim
VideoMAC: Video Masked Autoencoders Meet ConvNets
Gensheng Pei, Tao Chen, Xiruo Jiang et al.
Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use
Imad Eddine Toubal, Aditya Avinash, Neil Alldrin et al.
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
Junyi Chen, Longteng Guo, Jia Sun et al.
PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees
Chulin Xie, De-An Huang, Wenda Chu et al.
WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models
Zijian He, Peixin Chen, Guangrun Wang et al.
Embarrassingly Simple Dataset Distillation
Yunzhen Feng, Shanmukha Ramakrishna Vedantam, Julia Kempe
Pre-training Sequence, Structure, and Surface Features for Comprehensive Protein Representation Learning
Youhan Lee, Hasun Yu, Jaemyung Lee et al.
Dual Prior Unfolding for Snapshot Compressive Imaging
Jiancheng Zhang, Haijin Zeng, Jiezhang Cao et al.
HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras
Zhongyu Xia, ZhiWei Lin, Xinhao Wang et al.
Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive
Yumeng Li, Margret Keuper, Dan Zhang et al.
Minimum Coverage Sets for Training Robust Ad Hoc Teamwork Agents
Arrasy Rahman, Jiaxun Cui, Peter Stone
FedFixer: Mitigating Heterogeneous Label Noise in Federated Learning
Xinyuan Ji, Zhaowei Zhu, Wei Xi et al.
Weighted Envy-Freeness for Submodular Valuations
Luisa Montanari, Ulrike Schmidt-Kraepelin, Warut Suksompong et al.
HR-Pro: Point-Supervised Temporal Action Localization via Hierarchical Reliability Propagation
Huaxin Zhang, Xiang Wang, Xiaohao Xu et al.
SmartControl: Enhancing ControlNet for Handling Rough Visual Conditions
XIAOYU LIU, Yuxiang WEI, Ming LIU et al.
Mixed-Precision Quantization for Federated Learning on Resource-Constrained Heterogeneous Devices
Huancheng Chen, Haris Vikalo
Diffusion Language-Shapelets for Semi-supervised Time-Series Classification
Zhen Liu, Wenbin Pei, Disen Lan et al.
Robust Image Denoising through Adversarial Frequency Mixup
Donghun Ryou, Inju Ha, Hyewon Yoo et al.
Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach
Wei Dong, Xing Zhang, Bihui Chen et al.
Exemplar-free Continual Representation Learning via Learnable Drift Compensation
Alex Gomez-Villa, Dipam Goswami, Kai Wang et al.
Any2Point: Empowering Any-modality Transformers for Efficient 3D Understanding
YIWEN TANG, Renrui Zhang, Jiaming Liu et al.
iHuman: Instant Animatable Digital Humans From Monocular Videos
Pramish Paudel, Anubhav Khanal, Danda Pani Paudel et al.
ReGCL: Rethinking Message Passing in Graph Contrastive Learning
Cheng Ji, Zixuan Huang, Qingyun Sun et al.
Continuous Memory Representation for Anomaly Detection
Joo Chan Lee, Taejune Kim, Eunbyung Park et al.
The Unreasonable Effectiveness of Pre-Trained Features for Camera Pose Refinement
Gabriele Trivigno, Carlo Masone, Barbara Caputo et al.
Adversarial Adaptive Sampling: Unify PINN and Optimal Transport for the Approximation of PDEs
Kejun Tang, Jiayu Zhai, Xiaoliang Wan et al.
HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields
Haozhe Qi, Chen Zhao, Mathieu Salzmann et al.
FusionFormer: A Concise Unified Feature Fusion Transformer for 3D Pose Estimation
Yanlu Cai, Weizhong Zhang, Yuan Wu et al.
RPSC: Robust Pseudo-Labeling for Semantic Clustering
Sihang Liu, Wenming Cao, Ruigang Fu et al.
ODIN: A Single Model for 2D and 3D Segmentation
Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios et al.
SPIRE: Semantic Prompt-Driven Image Restoration
Chenyang Qi, Zhengzhong Tu, Keren Ye et al.
TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling
Dong Huo, Zixin Guo, Xinxin Zuo et al.
Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models
Yang Zhang, Tze Tzun Teoh, Wei Hern Lim et al.
DSL-FIQA: Assessing Facial Image Quality via Dual-Set Degradation Learning and Landmark-Guided Transformer
Wei-Ting Chen, Gurunandan Krishnan, Qiang Gao et al.
LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels
Tuo Feng, Wenguan Wang, Fan Ma et al.
Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation
Ji-Jia Wu, Andy Chia-Hao Chang, Chieh-Yu Chuang et al.
Repeated Fair Allocation of Indivisible Items
Ayumi Igarashi, Martin Lackner, Oliviero Nardi et al.
Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models
Tianzhe Chu, Shengbang Tong, Tianjiao Ding et al.
Exploring Efficient Asymmetric Blind-Spots for Self-Supervised Denoising in Real-World Scenarios
Shiyan Chen, Jiyuan Zhang, Zhaofei Yu et al.
InterFusion: Text-Driven Generation of 3D Human-Object Interaction
Sisi Dai, Wenhao Li, Haowen Sun et al.
LAN: Learning to Adapt Noise for Image Denoising
Changjin Kim, Tae Hyun Kim, Sungyong Baik
ZeroI2V: Zero-Cost Adaptation of Pre-Trained Transformers from Image to Video
Xinhao Li, Yuhan Zhu, Limin Wang
Bounds on Representation-Induced Confounding Bias for Treatment Effect Estimation
Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel
GlitchBench: Can Large Multimodal Models Detect Video Game Glitches?
Mohammad Reza Taesiri, Tianjun Feng, Cor-Paul Bezemer et al.
Constrained Bayesian Optimization under Partial Observations: Balanced Improvements and Provable Convergence
Shengbo Wang, Ke Li
Is Retain Set All You Need in Machine Unlearning? Restoring Performance of Unlearned Models with Out-Of-Distribution Images
Jacopo Bonato, Marco Cotogni, Luigi Sabetta
Customizing Language Model Responses with Contrastive In-Context Learning
Xiang Gao, Kamalika Das
Joint Reconstruction of 3D Human and Object via Contact-Based Refinement Transformer
Hyeongjin Nam, Daniel Jung, Gyeongsik Moon et al.
OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects
Akshay Krishnan, Abhijit Kundu, Kevis Maninis et al.
Towards Real-world Event-guided Low-light Video Enhancement and Deblurring
Taewoo Kim, Jaeseok Jeong, Hoonhee Cho et al.
Deep Incomplete Multi-View Learning Network with Insufficient Label Information
Zhangqi Jiang, Tingjin Luo, Xinyan Liang
CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction
Zhangchen Ye, Tao Jiang, Chenfeng Xu et al.
SparseCraft: Few-Shot Neural Reconstruction through Stereopsis Guided Geometric Linearization
Mae Younes, Amine Ouasfi, Adnane Boukhayma
Leveraging Cross-Modal Neighbor Representation for Improved CLIP Classification
Chao Yi, Lu Ren, De-Chuan Zhan et al.
Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection
Jiacheng Zhang, Jiaming Li, Xiangru Lin et al.
LiDAR-based Person Re-identification
Wenxuan Guo, Zhiyu Pan, Yingping Liang et al.
RecDiffusion: Rectangling for Image Stitching with Diffusion Models
Tianhao Zhou, Li Haipeng, Ziyi Wang et al.
Efficient Subgraph GNNs by Learning Effective Selection Policies
Beatrice Bevilacqua, Moshe Eliasof, Eli Meirom et al.
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
Changan Chen, Puyuan Peng, Ami Baid et al.
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation
Hao Fang, Peng Wu, Yawei Li et al.
LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion
Pancheng Zhao, Peng Xu, Pengda Qin et al.
You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation
Mehdi Noroozi, Isma Hadji, Brais Martinez et al.
Centering the Value of Every Modality: Towards Efficient and Resilient Modality-agnostic Semantic Segmentation
Xu Zheng, Yuanhuiyi Lyu, jiazhou zhou et al.
Towards Neuro-Symbolic Video Understanding
Minkyu Choi, Harsh Goel, Mohammad Omama et al.
ParamISP: Learned Forward and Inverse ISPs using Camera Parameters
Woohyeok Kim, Geonu Kim, Junyong Lee et al.
Contourlet Residual for Prompt Learning Enhanced Infrared Image Super-Resolution
Xingyuan Li, Jinyuan Liu, ZHIXIN CHEN et al.
Robust-Wide: Robust Watermarking against Instruction-driven Image Editing
Runyi Hu, Jie Zhang, Ting Xu et al.
Federated Q-Learning: Linear Regret Speedup with Low Communication Cost
Zhong Zheng, Fengyu Gao, Lingzhou Xue et al.
ConGeo: Robust Cross-view Geo-localization across Ground View Variations
Li Mi, Chang Xu, Javiera Castillo Navarro et al.
eTraM: Event-based Traffic Monitoring Dataset
Aayush Atul Verma, Bharatesh Chakravarthi, Arpitsinh Vaghela et al.
Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data
Antonis Antoniades, Yiyi Yu, Joe Canzano et al.
CA-Jaccard: Camera-aware Jaccard Distance for Person Re-identification
Yiyu Chen, Zheyi Fan, Zhaoru Chen et al.
Improving Transferable Targeted Adversarial Attacks with Model Self-Enhancement
Han Wu, Guanyan Ou, Weibin Wu et al.
To Grok or not to Grok: Disentangling Generalization and Memorization on Corrupted Algorithmic Datasets
Darshil Doshi, Aritra Das, Tianyu He et al.
LatentEditor: Text Driven Local Editing of 3D Scenes
Umar Khalid, Hasan Iqbal, Muhammad Tayyab et al.
Deep Orthogonal Hypersphere Compression for Anomaly Detection
Yunhe Zhang, Yan Sun, Jinyu Cai et al.
VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning
Kang Chen, Xiangqian Wu
Unsupervised Keypoints from Pretrained Diffusion Models
Eric Hedlin, Gopal Sharma, Shweta Mahajan et al.
Theoretically Achieving Continuous Representation of Oriented Bounding Boxes
Zikai Xiao, Guo-Ye Yang, Xue Yang et al.
LayoutFlow: Flow Matching for Layout Generation
Julian Jorge Andrade Guerreiro, Naoto Inoue, Kento Masui et al.
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
Weitai Kang, Gaowen Liu, Shah Mubarak et al.
OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation
Zhenyu Wang, Ya-Li Li, TAICHI LIU et al.
GarmentCodeData: A Dataset of 3D Made-to-Measure Garments With Sewing Patterns
Maria Korosteleva, Timur Levent Kesdogan, Fabian Kemper et al.
NViST: In the Wild New View Synthesis from a Single Image with Transformers
Wonbong Jang, Lourdes Agapito
ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval
Fang Kaipeng, Jingkuan Song, Lianli Gao et al.
DreamComposer: Controllable 3D Object Generation via Multi-View Conditions
Yunhan Yang, Yukun Huang, Xiaoyang Wu et al.
Distinguished In Uniform: Self-Attention Vs. Virtual Nodes
Eran Rosenbluth, Jan Tönshoff, Martin Ritzert et al.
NICP: Neural ICP for 3D Human Registration at Scale
Riccardo Marin, Enric Corona, Gerard Pons-Moll
Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks
Siyu Zou, Jiji Tang, Yiyi Zhou et al.
As-Plausible-As-Possible: Plausibility-Aware Mesh Deformation Using 2D Diffusion Priors
Seungwoo Yoo, Kunho Kim, Vladimir G. Kim et al.
Scalable 3D Registration via Truncated Entry-wise Absolute Residuals
Tianyu Huang, Liangzu Peng, Rene Vidal et al.
Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation
Xinyao Li, Yuke Li, Zhekai Du et al.
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset
Yi Zhang, Wang Zeng, Sheng Jin et al.
Boosting of Thoughts: Trial-and-Error Problem Solving with Large Language Models
Sijia Chen, Baochun Li, Di Niu
Diffusion-Driven Data Replay: A Novel Approach to Combat Forgetting in Federated Class Continual Learning
Jinglin Liang, Jin Zhong, Hanlin Gu et al.
Dexterous Grasp Transformer
Guo-Hao Xu, Yi-Lin Wei, Dian Zheng et al.
WebVLN: Vision-and-Language Navigation on Websites
Qi Chen, Dileepa Pitawela, Chongyang Zhao et al.
StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion
Ming Tao, BINGKUN BAO, Hao Tang et al.
Discriminability-Driven Channel Selection for Out-of-Distribution Detection
Yue Yuan, Rundong He, Yicong Dong et al.
Fair-VPT: Fair Visual Prompt Tuning for Image Classification
Sungho Park, Hyeran Byun
A Simple and Effective Point-based Network for Event Camera 6-DOFs Pose Relocalization
Hongwei Ren, Jiadong Zhu, Yue Zhou et al.
FedMef: Towards Memory-efficient Federated Dynamic Pruning
Hong Huang, Weiming Zhuang, Chen Chen et al.
OVOR: OnePrompt with Virtual Outlier Regularization for Rehearsal-Free Class-Incremental Learning
Wei-Cheng Huang, Chun-Fu Chen, Hsiang Hsu
MoST: Motion Style Transformer Between Diverse Action Contents
Boeun Kim, Jungho Kim, Hyung Jin Chang et al.
Geometric-Facilitated Denoising Diffusion Model for 3D Molecule Generation
6428 Can Xu, Haosen Wang, Weigang Wang et al.
Enhancing High-Resolution 3D Generation through Pixel-wise Gradient Clipping
Zijie Pan, Jiachen Lu, Xiatian Zhu et al.
T4P: Test-Time Training of Trajectory Prediction via Masked Autoencoder and Actor-specific Token Memory
Daehee Park, Jaeseok Jeong, Sung-Hoon Yoon et al.
PromptCCD: Learning Gaussian Mixture Prompt Pool for Continual Category Discovery
Fernando Julio Cendra, Bingchen Zhao, Kai Han
Benchmarking Algorithms for Federated Domain Generalization
Ruqi Bai, Saurabh Bagchi, David Inouye
DVSAI: Diverse View-Shared Anchors Based Incomplete Multi-View Clustering
Shengju Yu, Siwei Wang, Pei Zhang et al.
Multiagent Multitraversal Multimodal Self-Driving: Open MARS Dataset
Yiming Li, Zhiheng Li, Nuo Chen et al.
PILoRA: Prototype Guided Incremental LoRA for Federated Class-Incremental Learning
Haiyang Guo, Fei Zhu, Wenzhuo Liu et al.
Source Prompt Disentangled Inversion for Boosting Image Editability with Diffusion Models
Ruibin Li, Ruihuang Li, Song Guo et al.
Text Image Inpainting via Global Structure-Guided Diffusion Models
Shipeng Zhu, Pengfei Fang, Chenjie Zhu et al.
Open-Set Domain Adaptation for Semantic Segmentation
Seun-An Choe, Ah-Hyung Shin, Keon Hee Park et al.
Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring
Huicong Zhang, Haozhe Xie, Hongxun Yao
InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping
Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang ZHAO
FAR: Flexible Accurate and Robust 6DoF Relative Camera Pose Estimation
Chris Rockwell, Nilesh Kulkarni, Linyi Jin et al.
Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction
Alexander Timans, Christoph-Nikolas Straehle, Kaspar Sakmann et al.
HiPose: Hierarchical Binary Surface Encoding and Correspondence Pruning for RGB-D 6DoF Object Pose Estimation
Yongliang Lin, Yongzhi Su, Praveen Nathan et al.
STDiff: Spatio-Temporal Diffusion for Continuous Stochastic Video Prediction
Xi Ye, Guillaume-Alexandre Bilodeau
Improving Virtual Try-On with Garment-focused Diffusion Models
Siqi Wan, Yehao Li, Jingwen Chen et al.
Traffic Flow Optimisation for Lifelong Multi-Agent Path Finding
Zhe Chen, Daniel Harabor, Jiaoyang Li et al.