Most Cited 2024 "time series augmentation" Papers
CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis
Xiaoxiao Sun, Xingjian Leng, Zijian Wang et al.
Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold
Jun Chen, Haishan Ye, Mengmeng Wang et al.
MSD: A Benchmark Dataset for Floor Plan Generation of Building Complexes
Casper van Engelenburg, Fatemeh Mostafavi, Emanuel Kuhn et al.
LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model
Dongkai Wang, Shiyu Xuan, Shiliang Zhang
Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception
Lei Fan, Mingfu Liang, Yunxuan Li et al.
Unleashing the Potential of Fractional Calculus in Graph Neural Networks with FROND
Qiyu Kang, Kai Zhao, Qinxu Ding et al.
Transformer-Based Selective Super-resolution for Efficient Image Refinement
Tianyi Zhang, Kishore Kasichainula, Yaoxin Zhuo et al.
Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models
Matthew Kowal, Richard P. Wildes, Kosta Derpanis
Mirage: Model-agnostic Graph Distillation for Graph Classification
Mridul Gupta, Sahil Manchanda, Hariprasad Kodamana et al.
Semi-supervised Active Learning for Video Action Detection
Ayush Singh, Aayush J Rana, Akash Kumar et al.
CORN: Contact-based Object Representation for Nonprehensile Manipulation of General Unseen Objects
Yoonyoung Cho, Junhyek Han, Yoontae Cho et al.
SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis
Hanrong Ye, Jason Wen Yong Kuen, Qing Liu et al.
Reinforcement Learning Friendly Vision-Language Model for Minecraft
Haobin Jiang, Junpeng Yue, Hao Luo et al.
Tri^{2}-plane: Thinking Head Avatar via Feature Pyramid
Luchuan Song, Pinxin Liu, Lele Chen et al.
ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization
Yixin Yang, Jiangxin Dong, Jinhui Tang et al.
ConsistDreamer: 3D-Consistent 2D Diffusion for High-Fidelity Scene Editing
Jun-Kun Chen, Samuel Rota Bulò, Norman Müller et al.
SAFNet: Selective Alignment Fusion Network for Efficient HDR Imaging
Lingtong Kong, Bo Li, Yike Xiong et al.
Multiple View Geometry Transformers for 3D Human Pose Estimation
Ziwei Liao, Jialiang Zhu, Chunyu Wang et al.
Adapters Strike Back
Jan-Martin Steitz, Stefan Roth
A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing
Li Maomao, Yu Li, Tianyu Yang et al.
Learning Camouflaged Object Detection from Noisy Pseudo Label
Jin Zhang, Ruiheng Zhang, Yanjiao Shi et al.
Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI
Nabeel Seedat, Fergus Imrie, Mihaela van der Schaar
Codebook Transfer with Part-of-Speech for Vector-Quantized Image Modeling
Baoquan Zhang, Huaibin Wang, Luo Chuyao et al.
History Matters: Temporal Knowledge Editing in Large Language Model
Xunjian Yin, Jin Jiang, Liming Yang et al.
VEON: Vocabulary-Enhanced Occupancy Prediction
Jilai Zheng, Pin Tang, Zhongdao Wang et al.
Adversarial Score Distillation: When score distillation meets GAN
Min Wei, Jingkai Zhou, Junyao Sun et al.
Signed Graph Neural Ordinary Differential Equation for Modeling Continuous-Time Dynamics
Lanlan Chen, Kai Wu, Jian Lou et al.
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
Wouter Van Gansbeke, Bert De Brabandere
Cyclic Learning for Binaural Audio Generation and Localization
Zhaojian Li, Bin Zhao, Yuan Yuan
Accelerating Image Generation with Sub-path Linear Approximation Model
Chen Xu, Tianhui Song, Weixin Feng et al.
CO2: Efficient Distributed Training with Full Communication-Computation Overlap
Weigao Sun, Qin Zhen, Weixuan Sun et al.
MeshSegmenter: Zero-Shot Mesh Segmentation via Texture Synthesis
Ziming Zhong, Yanyu Xu, Jing Li et al.
What, How, and When Should Object Detectors Update in Continually Changing Test Domains?
Jayeon Yoo, Dongkwan Lee, Inseop Chung et al.
CSL: Class-Agnostic Structure-Constrained Learning for Segmentation including the Unseen
Hao Zhang, Fang Li, Lu Qi et al.
Gaussian Shadow Casting for Neural Characters
Luis Bolanos, Shih-Yang Su, Helge Rhodin
PCE-Palm: Palm Crease Energy Based Two-Stage Realistic Pseudo-Palmprint Generation
Lei Shen, Jianlong Jin, Ruixin Zhang et al.
Binarized Low-light Raw Video Enhancement
Gengchen Zhang, Yulun Zhang, Xin Yuan et al.
VSFormer: Visual-Spatial Fusion Transformer for Correspondence Pruning
Tangfei Liao, Xiaoqin Zhang, Li Zhao et al.
The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?
Qinyu Zhao, Ming Xu, Kartik Gupta et al.
Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns
Brian DuSell, David Chiang
LookupViT: Compressing visual information to a limited number of tokens
Rajat Koner, Gagan Jain, Sujoy Paul et al.
Enhancing Vision-Language Pre-training with Rich Supervisions
Yuan Gao, Kunyu Shi, Pengkai Zhu et al.
LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation
Ruida Zhang, Ziqin Huang, Gu Wang et al.
BaCon: Boosting Imbalanced Semi-supervised Learning via Balanced Feature-Level Contrastive Learning
Qianhan Feng, Lujing Xie, Shijie Fang et al.
OmniMotionGPT: Animal Motion Generation with Limited Data
Zhangsihao Yang, Mingyuan Zhou, Mengyi Shan et al.
Compositional Generative Inverse Design
Tailin Wu, Takashi Maruyama, Long Wei et al.
ConTex-Human: Free-View Rendering of Human from a Single Image with Texture-Consistent Synthesis
Xiangjun Gao, Xiaoyu Li, Chaopeng Zhang et al.
Towards Robust Fidelity for Evaluating Explainability of Graph Neural Networks
Xu Zheng, Farhad Shirani, Tianchun Wang et al.
Progressive Divide-and-Conquer via Subsampling Decomposition for Accelerated MRI
Chong Wang, Lanqing Guo, Yufei Wang et al.
Enhancing Source-Free Domain Adaptive Object Detection with Low-confidence Pseudo Label Distillation
Ilhoon Yoon, Hyeongjun Kwon, Jin Kim et al.
Bridging the Synthetic-to-Authentic Gap: Distortion-Guided Unsupervised Domain Adaptation for Blind Image Quality Assessment
Aobo Li, Jinjian Wu, Yongxu Liu et al.
CREAD: A Classification-Restoration Framework with Error Adaptive Discretization for Watch Time Prediction in Video Recommender Systems
Jie Sun, Zhao Ying Ding, Xiaoshuang Chen et al.
ScanFormer: Referring Expression Comprehension by Iteratively Scanning
Wei Su, Peihan Miao, Huanzhang Dou et al.
DG-PIC: Domain Generalized Point-In-Context Learning for Point Cloud Understanding
Jincen Jiang, Qianyu Zhou, Yuhang Li et al.
Adversarial Training Should Be Cast as a Non-Zero-Sum Game
Alex Robey, Fabian Latorre, George Pappas et al.
UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection
Yingsen Zeng, Yujie Zhong, Chengjian Feng et al.
DiffHuman: Probabilistic Photorealistic 3D Reconstruction of Humans
Akash Sengupta, Thiemo Alldieck, Nikos Kolotouros et al.
Tensorized Label Learning on Anchor Graph
Jing Li, Quanxue Gao, Qianqian Wang et al.
CardiacNet: Learning to Reconstruct Abnormalities for Cardiac Disease Assessment from Echocardiogram Videos
Jiewen Yang, Yiqun Lin, Bin Pu et al.
Animal Avatars: Reconstructing Animatable 3D Animals from Casual Videos
Remy Sabathier, David Novotny, Niloy Mitra
GenesisTex: Adapting Image Denoising Diffusion to Texture Space
Chenjian Gao, Boyan Jiang, Xinghui Li et al.
Aligner$^2$: Enhancing Joint Multiple Intent Detection and Slot Filling via Adjustive and Forced Cross-Task Alignment
Zhihong Zhu, Xuxin Cheng, Yaowei Li et al.
Get an A in Math: Progressive Rectification Prompting
Zhenyu Wu, Meng Jiang, Chao Shen
Learning Optimal Advantage from Preferences and Mistaking It for Reward
W Bradley Knox, Stephane Hatgis-Kessell, Sigurdur Orn Adalgeirsson et al.
Living Scenes: Multi-object Relocalization and Reconstruction in Changing 3D Environments
Liyuan Zhu, Shengyu Huang, Konrad Schindler et al.
Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation
Qiushi Zhu, Jie Zhang, Yu Gu et al.
Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty from Pre-trained Models
Gianni Franchi, Olivier Laurent, Maxence Leguéry et al.
Diffusion Bridges for 3D Point Cloud Denoising
Mathias Vogel, Keisuke Tateno, Marc Pollefeys et al.
One-Shot Structure-Aware Stylized Image Synthesis
Hansam Cho, Jonghyun Lee, Seunggyu Chang et al.
FRDiff: Feature Reuse for Universal Training-free Acceleration of Diffusion Models
Junhyuk So, Jungwon Lee, Eunhyeok Park
Table of Contents
Pengfei Hu, Zhenrong Zhang, Jianshu Zhang et al.
NeRF Director: Revisiting View Selection in Neural Volume Rendering
Wenhui Xiao, Rodrigo Santa Cruz, David Ahmedt-Aristizabal et al.
Negative Pre-aware for Noisy Cross-Modal Matching
Xu Zhang, Hao Li, Mang Ye
TeMO: Towards Text-Driven 3D Stylization for Multi-Object Meshes
Xuying Zhang, Bo-Wen Yin, Yuming Chen et al.
Learning MDL Logic Programs from Noisy Data
Céline Hocquette, Andreas Niskanen, Matti Järvisalo et al.
Kandinsky Conformal Prediction: Efficient Calibration of Image Segmentation Algorithms
Joren Brunekreef, Eric Marcus, Ray Sheombarsing et al.
Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches
Qing Yu, Mikihiro Tanaka, Kent Fujiwara
Single-to-Dual-View Adaptation for Egocentric 3D Hand Pose Estimation
Ruicong Liu, Takehiko Ohkawa, Mingfang Zhang et al.
Hybrid Proposal Refiner: Revisiting DETR Series from the Faster R-CNN Perspective
Jinjing Zhao, Fangyun Wei, Chang Xu
Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation
Xiyi Chen, Marko Mihajlovic, Shaofei Wang et al.
A Simple Background Augmentation Method for Object Detection with Diffusion Model
Yuhang Li, Xin Dong, Chen Chen et al.
Quad Bayer Joint Demosaicing and Denoising Based on Dual Encoder Network with Joint Residual Learning
Bolun Zheng, Li Haoran, Quan Chen et al.
OVER-NAV: Elevating Iterative Vision-and-Language Navigation with Open-Vocabulary Detection and StructurEd Representation
Ganlong Zhao, Guanbin Li, Weikai Chen et al.
Kill Two Birds with One Stone: Rethinking Data Augmentation for Deep Long-tailed Learning
Binwu Wang, Pengkun Wang, Wei Xu et al.
Open Panoramic Segmentation
Junwei Zheng, Ruiping Liu, Yufan Chen et al.
Semi-supervised Open-World Object Detection
Sahal Shaji Mullappilly, Abhishek Singh Gehlot, Rao Muhammad Anwer et al.
AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation
Yangchao Wu, Tian Yu Liu, Hyoungseob Park et al.
Improving Spectral Snapshot Reconstruction with Spectral-Spatial Rectification
Jiancheng Zhang, Haijin Zeng, Yongyong Chen et al.
Tri-Modal Motion Retrieval by Learning a Joint Embedding Space
Kangning Yin, Shihao Zou, Yuxuan Ge et al.
Learning to Visually Localize Sound Sources from Mixtures without Prior Source Knowledge
Dongjin Kim, Sung Jin Um, Sangmin Lee et al.
A Noisy Elephant in the Room: Is Your Out-of-Distribution Detector Robust to Label Noise?
Galadrielle Humblot-Renaux, Sergio Escalera, Thomas B. Moeslund
ColNeRF: Collaboration for Generalizable Sparse Input Neural Radiance Field
Zhangkai Ni, Peiqi Yang, Wenhan Yang et al.
HiLo: Detailed and Robust 3D Clothed Human Reconstruction with High- and Low-Frequency Information of Parametric Models
Yifan Yang, Dong Liu, Shuhai Zhang et al.
Multimarginal Generative Modeling with Stochastic Interpolants
Michael Albergo, Nicholas Boffi, Michael Lindsey et al.
Instant 3D Human Avatar Generation using Image Diffusion Models
Nikos Kolotouros, Thiemo Alldieck, Enric Corona et al.
Efficient Vision-Language Pre-training by Cluster Masking
Zihao Wei, Zixuan Pan, Andrew Owens
Diffusion in Diffusion: Cyclic One-Way Diffusion for Text-Vision-Conditioned Generation
Ruoyu Wang, Yongqi Yang, Zhihao Qian et al.
Hypergraph Joint Representation Learning for Hypervertices and Hyperedges via Cross Expansion
Yuguang Yan, Yuanlin Chen, Shibo Wang et al.
Predicated Diffusion: Predicate Logic-Based Attention Guidance for Text-to-Image Diffusion Models
Kota Sueyoshi, Takashi Matsubara
PixOOD: Pixel-Level Out-of-Distribution Detection
Tomas Vojir, Jan Sochman, Jiri Matas
Unveiling Typographic Deceptions: Insights of the Typographic Vulnerability in Large Vision-Language Models
Hao Cheng, Erjia Xiao, Jindong Gu et al.
Bidirectional Autoregressive Diffusion Model for Dance Generation
Canyu Zhang, Youbao Tang, Ning Zhang et al.
Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability
Ivan Lee, Nan Jiang, Taylor Berg-Kirkpatrick
Instance-Aware Group Quantization for Vision Transformers
Jaehyeon Moon, Dohyung Kim, Jun Yong Cheon et al.
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni, Yulin Wang, Renping Zhou et al.
Dynamic Retraining-Updating Mean Teacher for Source-Free Object Detection
Ba Khanh Trinh Le, Huy-Hung Nguyen, Long Hoang Pham et al.
The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective
Wenqi Jia, Miao Liu, Hao Jiang et al.
Large-Scale Multi-Hypotheses Cell Tracking Using Ultrametric Contours Maps
Jordao Bragantini, Merlin Lange, Loïc A Royer
Faceptor: A Generalist Model for Face Perception
Lixiong Qin, Mei Wang, Xuannan Liu et al.
SOAC: Spatio-Temporal Overlap-Aware Multi-Sensor Calibration using Neural Radiance Fields
Quentin Herau, Nathan Piasco, Moussab Bennehar et al.
Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction
Jianping Jiang, Xinyu Zhou, Bingxuan Wang et al.
Are Human-generated Demonstrations Necessary for In-context Learning?
Rui Li, Guoyin Wang, Jiwei Li
Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation
Xiaoyang Chen, Hao Zheng, Yuemeng Li et al.
FoX: Formation-Aware Exploration in Multi-Agent Reinforcement Learning
Yonghyeon Jo, Sunwoo Lee, Junghyuk Yum et al.
Rate-Distortion-Cognition Controllable Versatile Neural Image Compression
Jinming Liu, Ruoyu Feng, Yunpeng Qi et al.
NeRF Analogies: Example-Based Visual Attribute Transfer for NeRFs
Michael Fischer, Zhengqin Li, Thu Nguyen-Phuoc et al.
Region-Based Representations Revisited
Michal Shlapentokh-Rothman, Ansel Blume, Yao Xiao et al.
Unsupervised Cross-Domain Image Retrieval via Prototypical Optimal Transport
Bin Li, Ye Shi, Qian Yu et al.
AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval
Qi Yan, Raihan Seraj, Jiawei He et al.
Protein Multimer Structure Prediction via Prompt Learning
Ziqi Gao, Xiangguo Sun, Zijing Liu et al.
A Good Learner can Teach Better: Teacher-Student Collaborative Knowledge Distillation
Ayan Sengupta, Shantanu Dixit, Md Shad Akhtar et al.
MERGE: Fast Private Text Generation
Zi Liang, Pinghui Wang, Ruofei Zhang et al.
Event-Adapted Video Super-Resolution
Zeyu Xiao, Dachun Kai, Yueyi Zhang et al.
ReCoRe: Regularized Contrastive Representation Learning of World Model
Rudra P. K. Poudel, Harit Pandya et al.
SURE: SUrvey REcipes for building reliable and robust deep networks
Yuting Li, Yingyi Chen, Xuanlong Yu et al.
Online GNN Evaluation Under Test-time Graph Distribution Shifts
Xin Zheng, Dongjin Song, Qingsong Wen et al.
CLIFF: Continual Latent Diffusion for Open-Vocabulary Object Detection
Wuyang Li, Xinyu Liu, Jiayi Ma et al.
CNN Kernels Can Be the Best Shapelets
Eric Qu, Yansen Wang, Xufang Luo et al.
EvSign: Sign Language Recognition and Translation with Streaming Events
Pengyu Zhang, Hao Yin, Zeren Wang et al.
Multi-Objective Bayesian Optimization with Active Preference Learning
Ryota Ozaki, Kazuki Ishikawa, Youhei Kanzaki et al.
One Forward is Enough for Neural Network Training via Likelihood Ratio Method
Jinyang Jiang, Zeliang Zhang, Chenliang Xu et al.
Accelerating Data Generation for Neural Operators via Krylov Subspace Recycling
Hong Wang, Zhongkai Hao, Jie Wang et al.
ProCC: Progressive Cross-Primitive Compatibility for Open-World Compositional Zero-Shot Learning
Fushuo Huo, Wenchao Xu, Song Guo et al.
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction
Penghui Du, Yu Wang, Yifan Sun et al.
In2SET: Intra-Inter Similarity Exploiting Transformer for Dual-Camera Compressive Hyperspectral Imaging
Xin Wang, Lizhi Wang, Xiangtian Ma et al.
SPADE: Semi-supervised Anomaly Detection under Distribution Mismatch
Chun-Liang Li, Tomas Pfister, Kihyuk Sohn et al.
NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation
Ruikai Cui, Weizhe Liu, Weixuan Sun et al.
TetraDiffusion: Tetrahedral Diffusion Models for 3D Shape Generation
Nikolai Kalischek, Torben Peters, Jan Dirk Wegner et al.
Sequential Fusion Based Multi-Granularity Consistency for Space-Time Transformer Tracking
Kun Hu, Wenjing Yang, Wanrong Huang et al.
Non-Exemplar Domain Incremental Learning via Cross-Domain Concept Integration
Qiang Wang, Yuhang He, Songlin Dong et al.
The Effect of Intrinsic Dataset Properties on Generalization: Unraveling Learning Differences Between Natural and Medical Images
Nicholas Konz, Maciej Mazurowski
MFABA: A More Faithful and Accelerated Boundary-Based Attribution Method for Deep Neural Networks
Zhiyu Zhu, Huaming Chen, Jiayu Zhang et al.
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation
Yunhao Ge, Yihe Tang, Jiashu Xu et al.
Bootstrapping Chest CT Image Understanding by Distilling Knowledge from X-ray Expert Models
Weiwei Cao, Jianpeng Zhang, Yingda Xia et al.
En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data
Yifang Men, Biwen Lei, Yuan Yao et al.
Towards Fair Graph Federated Learning via Incentive Mechanisms
Chenglu Pan, Jiarong Xu, Yue Yu et al.
Learning Explicit Contact for Implicit Reconstruction of Hand-Held Objects from Monocular Images
Junxing Hu, Hongwen Zhang, Zerui Chen et al.
GRIDS: Grouped Multiple-Degradation Restoration with Image Degradation Similarity
Shuo Cao, Yihao Liu, Wenlong Zhang et al.
Improving Text-guided Object Inpainting with Semantic Pre-inpainting
Yifu Chen, Jingwen Chen, Yingwei Pan et al.
GenN2N: Generative NeRF2NeRF Translation
Xiangyue Liu, Han Xue, Kunming Luo et al.
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
Guian Fang, Wenbiao Yan, Yuanfan Guo et al.
HEAL-SWIN: A Vision Transformer On The Sphere
Oscar Carlsson, Jan E. Gerken, Hampus Linander et al.
Unifying Automatic and Interactive Matting with Pretrained ViTs
Zixuan Ye, Wenze Liu, He Guo et al.
Hyperbolic Learning with Synthetic Captions for Open-World Detection
Fanjie Kong, Yanbei Chen, Jiarui Cai et al.
DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition
Qi Wang, Zhou Xu, Yuming Lin et al.
CAMIL: Context-Aware Multiple Instance Learning for Cancer Detection and Subtyping in Whole Slide Images
Olga Fourkioti, Matt De Vries, Chris Bakal
On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving
Kaituo Feng, Changsheng Li, Dongchun Ren et al.
Dynamic Feature Pruning and Consolidation for Occluded Person Re-identification
YuTeng Ye, Hang Zhou, Jiale Cai et al.
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
Shihao Zhao, Shaozhe Hao, Bojia Zi et al.
X-Pose: Detecting Any Keypoints
Jie Yang, Ailing Zeng, Ruimao Zhang et al.
Free-Editor: Zero-shot Text-driven 3D Scene Editing
Md Nazmul Karim, Hasan Iqbal, Umar Khalid et al.
Pre-training with Random Orthogonal Projection Image Modeling
Maryam Haghighat, Peyman Moghadam, Shaheer Mohamed et al.
Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning
Fan-Ming Luo, Tian Xu, Xingchen Cao et al.
Exploiting Auxiliary Caption for Video Grounding
Hongxiang Li, Meng Cao, Xuxin Cheng et al.
HONGAT: Graph Attention Networks in the Presence of High-Order Neighbors
Heng-Kai Zhang, Yi-Ge Zhang, Zhi Zhou et al.
AttnZero: Efficient Attention Discovery for Vision Transformers
Lujun Li, Zimian Wei, Peijie Dong et al.
Customization Assistant for Text-to-Image Generation
Yufan Zhou, Ruiyi Zhang, Jiuxiang Gu et al.
Referring Atomic Video Action Recognition
Kunyu Peng, Jia Fu, Kailun Yang et al.
MoVideo: Motion-Aware Video Generation with Diffusion Models
Jingyun Liang, Yuchen Fan, Kai Zhang et al.
Event Camera Data Dense Pre-training
Yan Yang, Liyuan Pan, Liu Liu
What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs
Alex Trevithick, Matthew Chan, Towaki Takikawa et al.
Finding Visual Task Vectors
Alberto Hojel, Yutong Bai, Trevor Darrell et al.
M2Doc: A Multi-Modal Fusion Approach for Document Layout Analysis
Ning Zhang, Hiuyi Cheng, Jiayu Chen et al.
Vision-Language Action Knowledge Learning for Semantic-Aware Action Quality Assessment
Huangbiao Xu, Xiao Ke, Yuezhou Li et al.
RoDUS: Robust Decomposition of Static and Dynamic Elements in Urban Scenes
Thang-Anh-Quan Nguyen, Luis G Roldao Jimenez, Nathan Piasco et al.
Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations
Renzhe Zhou, Chen-Xiao Gao, Zongzhang Zhang et al.
Learning to Learn Better Visual Prompts
Fengxiang Wang, Wanrong Huang, Shaowu Yang et al.
TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation
Xiaopei Wu, Yuenan Hou, Xiaoshui Huang et al.
Generative 3D Part Assembly via Part-Whole-Hierarchy Message Passing
Bi'an Du, Xiang Gao, Wei Hu et al.
CoBIT: A Contrastive Bi-directional Image-Text Generation Model
Haoxuan You, Xiaoyue Guo, Zhecan Wang et al.
Spiking NeRF: Representing the Real-World Geometry by a Discontinuous Representation
Zhanfeng Liao, Yan Liu, Qian Zheng et al.
Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments
Djamahl Etchegaray, Zi Helen Huang, Tatsuya Harada et al.
DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction
Junwen Xiong, Peng Zhang, Tao You et al.
PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs
Michael Dorkenwald, Nimrod Barazani, Cees G. M. Snoek et al.
EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification
Suorong Yang, Furao Shen, Jian Zhao
AMD: Automatic Multi-step Distillation of Large-scale Vision Models
Cheng Han, Qifan Wang, Sohail A Dianat et al.
Deep Structural Knowledge Exploitation and Synergy for Estimating Node Importance Value on Heterogeneous Information Networks
Yankai Chen, Yixiang Fang, Qiongyan Wang et al.
LEIA: Latent View-invariant Embeddings for Implicit 3D Articulation
Archana Swaminathan, Anubhav Anubhav, Kamal Gupta et al.
Retrieval-Guided Reinforcement Learning for Boolean Circuit Minimization
Animesh Basak Chowdhury, Marco Romanelli, Benjamin Tan et al.
FoSp: Focus and Separation Network for Early Smoke Segmentation
Lujian Yao, Haitao Zhao, Jingchao Peng et al.
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer
Ning Yu, Chia-Chih Chen, Zeyuan Chen et al.
A Restoration Network as an Implicit Prior
Yuyang Hu, Mauricio Delbracio, Peyman Milanfar et al.
Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks
Zhiying Jiang, Xingyuan Li, Jinyuan Liu et al.
HUMOS: Human Motion Model Conditioned on Body Shape
Shashank Tripathi, Omid Taheri, Christoph Lassner et al.
Regroup Median Loss for Combating Label Noise
Fengpeng Li, Kemou Li, Jinyu Tian et al.
CDPNet: Cross-Modal Dual Phases Network for Point Cloud Completion
Zhenjiang Du, Jiale Dou, Zhitao Liu et al.
Temporal Event Stereo via Joint Learning with Stereoscopic Flow
Hoonhee Cho, Jae-young Kang, Kuk-Jin Yoon
Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning
Kaibin Tian, Yanhua Cheng, Yi Liu et al.
Adversarial Backdoor Attack by Naturalistic Data Poisoning on Trajectory Prediction in Autonomous Driving
Mozhgan Pourkeshavarz, Mohammad Sabokrou, Amir Rasouli
Rating-Based Reinforcement Learning
Devin White, Mingkang Wu, Ellen Novoseller et al.