Most Cited 2024 "memory tokens" Papers

12,324 papers found • Page 6 of 62

#1001

Generalizable Human Gaussians for Sparse View Synthesis

Youngjoong Kwon, Baole Fang, Yixing Lu et al.

ECCV 2024 • poster • arXiv:2407.12777
34 citations
#1002

Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection

Junjie Huang, Yun Ye, Zhujin Liang et al.

ECCV 2024 • poster • arXiv:2311.07152
34 citations
#1003

Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection

Feng Liu, Tengteng Huang, Qianjing Zhang et al.

ECCV 2024 • poster • arXiv:2402.03634
34 citations
#1004

Multi-Modal Latent Space Learning for Chain-of-Thought Reasoning in Language Models

Liqi He, Zuchao Li, Xiantao Cai et al.

AAAI 2024 • paper • arXiv:2312.08762
34 citations
#1005

Don't Play Favorites: Minority Guidance for Diffusion Models

Soobin Um, Suhyeon Lee, Jong Chul Ye

ICLR 2024 • poster • arXiv:2301.12334
33 citations
#1006

AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation

Xinzhou Wang, Yikai Wang, Junliang Ye et al.

ECCV 2024 • poster • arXiv:2312.03795
33 citations
#1007

Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal

Yuxin Wang, Qianyi Wu, Guofeng Zhang et al.

ECCV 2024 • poster • arXiv:2404.13679
33 citations
#1008

Revisiting Link Prediction: a data perspective

Haitao Mao, Juanhui Li, Harry Shomer et al.

ICLR 2024 • poster • arXiv:2310.00793
33 citations
#1009

Explaining Generalization Power of a DNN Using Interactive Concepts

Huilin Zhou, Hao Zhang, Huiqi Deng et al.

AAAI 2024 • paper • arXiv:2302.13091
33 citations
#1010

Learning Object State Changes in Videos: An Open-World Perspective

Zihui Xue, Kumar Ashutosh, Kristen Grauman

CVPR 2024 • poster • arXiv:2312.11782
33 citations
#1011

Low Rank Matrix Completion via Robust Alternating Minimization in Nearly Linear Time

Yuzhou Gu, Zhao Song, Junze Yin et al.

ICLR 2024 • poster • arXiv:2302.11068
33 citations
#1012

Simple Semantic-Aided Few-Shot Learning

Hai Zhang, Junzhe Xu, Shanlin Jiang et al.

CVPR 2024 • poster • arXiv:2311.18649
33 citations
#1013

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Trung Dao, Thuan Nguyen, Thanh Van Le et al.

ECCV 2024 • poster • arXiv:2408.14176
33 citations
#1014

AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors

Kaishen Yuan, Zitong Yu, Xin Liu et al.

ECCV 2024 • poster • arXiv:2403.04697
33 citations
#1015

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

Yunhao Ge, Xiaohui Zeng, Jacob Huffman et al.

CVPR 2024 • poster • arXiv:2404.19752
33 citations
#1016

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Byung-Kwan Lee, Beomchan Park, Chae Won Kim et al.

ECCV 2024 • poster • arXiv:2403.07508
33 citations
#1017

LiDAR4D: Dynamic Neural Fields for Novel Space-time View LiDAR Synthesis

Zehan Zheng, Fan Lu, Weiyi Xue et al.

CVPR 2024 • poster • arXiv:2404.02742
33 citations
#1018

Token-Level Contrastive Learning with Modality-Aware Prompting for Multimodal Intent Recognition

Qianrui Zhou, Hua Xu, Hao Li et al.

AAAI 2024 • paper • arXiv:2312.14667
33 citations
#1019

Asynchronous Large Language Model Enhanced Planner for Autonomous Driving

Yuan Chen, Zi-han Ding, Ziqin Wang et al.

ECCV 2024 • poster • arXiv:2406.14556
33 citations
#1020

XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution

Yunpeng Qu, Kun Yuan, Kai Zhao et al.

ECCV 2024 • poster • arXiv:2403.05049
33 citations
#1021

Provably Powerful Graph Neural Networks for Directed Multigraphs

Beni Egressy, Luc von Niederhäusern, Jovan Blanuša et al.

AAAI 2024 • paper • arXiv:2306.11586
33 citations
#1022

Audio-Synchronized Visual Animation

Lin Zhang, Shentong Mo, Yijing Zhang et al.

ECCV 2024 • poster • arXiv:2403.05659
33 citations
#1023

AutoAD III: The Prequel – Back to the Pixels

Tengda Han, Max Bain, Arsha Nagrani et al.

CVPR 2024 • poster • arXiv:2404.14412
33 citations
#1024

Time Weaver: A Conditional Time Series Generation Model

Sai Shankar Narasimhan, Shubhankar Agarwal, Oguzhan Akcin et al.

ICML 2024 • spotlight • arXiv:2403.02682
33 citations
#1025

Multi-Prompts Learning with Cross-Modal Alignment for Attribute-Based Person Re-identification

Yajing Zhai, Yawen Zeng, Zhiyong Huang et al.

AAAI 2024 • paper • arXiv:2312.16797
33 citations
#1026

CoGS: Controllable Gaussian Splatting

Heng Yu, Joel Julin, Zoltán Á. Milacski et al.

CVPR 2024 • poster • arXiv:2312.05664
33 citations
#1027

GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation

Chenxin Li, Xinyu Liu, Cheng Wang et al.

ECCV 2024 • poster • arXiv:2407.05540
33 citations
#1028

Concept-Guided Prompt Learning for Generalization in Vision-Language Models

Yi Zhang, Ce Zhang, Ke Yu et al.

AAAI 2024 • paper • arXiv:2401.07457
33 citations
#1029

Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models

Yiwen Tang, Ray Zhang, Zoey Guo et al.

AAAI 2024 • paper • arXiv:2310.03059
33 citations
#1030

The Hidden Language of Diffusion Models

Hila Chefer, Oran Lang, Mor Geva et al.

ICLR 2024 • poster • arXiv:2306.00966
33 citations
#1031

NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation

Jingyang Huo, Yikai Wang, Yanwei Fu et al.

ECCV 2024 • poster • arXiv:2403.18211
33 citations
#1032

FairSIN: Achieving Fairness in Graph Neural Networks through Sensitive Information Neutralization

Cheng Yang, Jixi Liu, Yunhe Yan et al.

AAAI 2024 • paper • arXiv:2403.12474
33 citations
#1033

Synergistic Multiscale Detail Refinement via Intrinsic Supervision for Underwater Image Enhancement

Dehuan Zhang, Jingchun Zhou, Chunle Guo et al.

AAAI 2024 • paper • arXiv:2308.11932
33 citations
#1034

Spurious Feature Diversification Improves Out-of-distribution Generalization

Lin Yong, Lu Tan, Yifan Hao et al.

ICLR 2024 • poster • arXiv:2309.17230
33 citations
#1035

Training Unbiased Diffusion Models From Biased Dataset

Yeongmin Kim, Byeonghu Na, Minsang Park et al.

ICLR 2024 • poster • arXiv:2403.01189
32 citations
#1036

CPPO: Continual Learning for Reinforcement Learning with Human Feedback

Han Zhang, Yu Lei, Lin Gui et al.

ICLR 2024 • poster
32 citations
#1037

Graph-Aware Contrasting for Multivariate Time-Series Classification

Yucheng Wang, Yuecong Xu, Jianfei Yang et al.

AAAI 2024 • paper • arXiv:2309.05202
32 citations
#1038

SAM-guided Graph Cut for 3D Instance Segmentation

Haoyu Guo, He Zhu, Sida Peng et al.

ECCV 2024 • poster • arXiv:2312.08372
32 citations
#1039

REACTO: Reconstructing Articulated Objects from a Single Video

Chaoyue Song, Jiacheng Wei, Chuan-Sheng Foo et al.

CVPR 2024 • poster • arXiv:2404.11151
32 citations
#1040

Towards Generalizable Multi-Object Tracking

Zheng Qin, Le Wang, Sanping Zhou et al.

CVPR 2024 • poster • arXiv:2406.00429
32 citations
#1041

Jointly Training Large Autoregressive Multimodal Models

Emanuele Aiello, Lili Yu, Yixin Nie et al.

ICLR 2024 • poster • arXiv:2309.15564
32 citations
#1042

Graph Invariant Learning with Subgraph Co-mixup for Out-of-Distribution Generalization

Tianrui Jia, Haoyang Li, Cheng Yang et al.

AAAI 2024 • paper • arXiv:2312.10988
32 citations
#1043

Towards a Simultaneous and Granular Identity-Expression Control in Personalized Face Generation

Renshuai Liu, Bowen Ma, Wei Zhang et al.

CVPR 2024 • highlight • arXiv:2401.01207
32 citations
#1044

Three Pillars Improving Vision Foundation Model Distillation for Lidar

Gilles Puy, Spyros Gidaris, Alexandre Boulch et al.

CVPR 2024 • poster • arXiv:2310.17504
32 citations
#1045

OpenStreetView-5M: The Many Roads to Global Visual Geolocation

Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis et al.

CVPR 2024 • poster • arXiv:2404.18873
32 citations
#1046

Rethinking Generalizable Face Anti-spoofing via Hierarchical Prototype-guided Distribution Refinement in Hyperbolic Space

Chengyang Hu, Ke-Yue Zhang, Taiping Yao et al.

CVPR 2024 • highlight
32 citations
#1047

Collaborating Foundation Models for Domain Generalized Semantic Segmentation

Yasser Benigmim, Subhankar Roy, Slim Essid et al.

CVPR 2024 • poster • arXiv:2312.09788
32 citations
#1048

Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation

Siyu Jiao, Hongguang Zhu, Yunchao Wei et al.

ECCV 2024 • poster • arXiv:2408.00744
32 citations
#1049

High-fidelity Person-centric Subject-to-Image Synthesis

Yibin Wang, Weizhong Zhang, Jianwei Zheng et al.

CVPR 2024 • poster • arXiv:2311.10329
32 citations
#1050

Random Feature Amplification: Feature Learning and Generalization in Neural Networks

Spencer Frei, Niladri Chatterji, Peter L. Bartlett

ICLR 2024 • poster • arXiv:2202.07626
32 citations
#1051

MAS: Multi-view Ancestral Sampling for 3D Motion Generation Using 2D Diffusion

Roy Kapon, Guy Tevet, Daniel Cohen-Or et al.

CVPR 2024 • poster • arXiv:2310.14729
32 citations
#1052

View-decoupled Transformer for Person Re-identification under Aerial-ground Camera Network

Quan Zhang, Lei Wang, Vishal M. Patel et al.

CVPR 2024 • poster • arXiv:2403.14513
32 citations
#1053

Inversion-Free Image Editing with Language-Guided Diffusion Models

Sihan Xu, Yidong Huang, Jiayi Pan et al.

CVPR 2024 • poster
32 citations
#1054

Frequency-Adaptive Pan-Sharpening with Mixture of Experts

Xuanhua He, Keyu Yan, Rui Li et al.

AAAI 2024 • paper • arXiv:2401.02151
32 citations
#1055

Exploring Sparse Visual Prompt for Domain Adaptive Dense Prediction

Senqiao Yang, Jiarui Wu, Jiaming Liu et al.

AAAI 2024 • paper • arXiv:2303.09792
32 citations
#1056

CFR-ICL: Cascade-Forward Refinement with Iterative Click Loss for Interactive Image Segmentation

Shoukun Sun, Min Xian, Fei Xu et al.

AAAI 2024 • paper • arXiv:2303.05620
32 citations
#1057

LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection

Sifan Zhou, Liang Li, Xinyu Zhang et al.

ICLR 2024 • poster • arXiv:2401.15865
32 citations
#1058

Transductive Zero-Shot and Few-Shot CLIP

Ségolène Martin, Yunshi Huang, Fereshteh Shakeri et al.

CVPR 2024 • highlight • arXiv:2405.18437
32 citations
#1059

Rethinking Graph Masked Autoencoders through Alignment and Uniformity

Liang Wang, Xiang Tao, Qiang Liu et al.

AAAI 2024 • paper • arXiv:2402.07225
32 citations
#1060

Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval

Young Kyun Jang, Dat B Huynh, Ashish Shah et al.

ECCV 2024 • poster • arXiv:2405.00571
32 citations
#1061

SpikePoint: An Efficient Point-based Spiking Neural Network for Event Cameras Action Recognition

Hongwei Ren, Yue Zhou, Xiaopeng Lin et al.

ICLR 2024 • spotlight • arXiv:2310.07189
32 citations
#1062

Fantastic Animals and Where to Find Them: Segment Any Marine Animal with Dual SAM

Pingping Zhang, Tianyu Yan, Yang Liu et al.

CVPR 2024 • highlight • arXiv:2404.04996
32 citations
#1063

Exact Diffusion Inversion via Bidirectional Integration Approximation

Guoqiang Zhang, J.P. Lewis, W. Bastiaan Kleijn

ECCV 2024 • poster
32 citations
#1064

Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction

Hao Li, Ying Chen, Yifei Chen et al.

CVPR 2024 • poster • arXiv:2402.19326
32 citations
#1065

It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition

Chen Chen, Ruizhe Li, Yuchen Hu et al.

ICLR 2024 • poster • arXiv:2402.05457
32 citations
#1066

How Far Can We Compress Instant-NGP-Based NeRF?

Yihang Chen, Qianyi Wu, Mehrtash Harandi et al.

CVPR 2024 • poster • arXiv:2406.04101
32 citations
#1067

A Dynamic Kernel Prior Model for Unsupervised Blind Image Super-Resolution

Zhixiong Yang, Jingyuan Xia, Shengxi Li et al.

CVPR 2024 • poster • arXiv:2404.15620
32 citations
#1068

HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D

Sangmin Woo, Byeongjun Park, Hyojun Go et al.

CVPR 2024 • poster • arXiv:2312.15980
32 citations
#1069

Distilling Autoregressive Models to Obtain High-Performance Non-autoregressive Solvers for Vehicle Routing Problems with Faster Inference Speed

Yubin Xiao, Di Wang, Boyang Li et al.

AAAI 2024 • paper • arXiv:2312.12469
31 citations
#1070

Audio Generation with Multiple Conditional Diffusion Model

Zhifang Guo, Jianguo Mao, Tao Rui et al.

AAAI 2024 • paper • arXiv:2308.11940
31 citations
#1071

Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network

Ye Junyan, Zhutao Lv, Li Weijia et al.

ECCV 2024 • poster • arXiv:2408.05475
31 citations
#1072

SpecNeRF: Gaussian Directional Encoding for Specular Reflections

Li Ma, Vasu Agrawal, Haithem Turki et al.

CVPR 2024 • highlight • arXiv:2312.13102
31 citations
#1073

CAT: Exploiting Inter-Class Dynamics for Domain Adaptive Object Detection

Mikhail Kennerley, Jian-Gang Wang, Bharadwaj Veeravalli et al.

CVPR 2024 • poster • arXiv:2403.19278
31 citations
#1074

Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory

Sensen Gao, Xiaojun Jia, Xuhong Ren et al.

ECCV 2024 • poster • arXiv:2403.12445
31 citations
#1075

HowToCaption: Prompting LLMs to Transform Video Annotations at Scale

Nina Shvetsova, Anna Kukleva, Xudong Hong et al.

ECCV 2024 • poster • arXiv:2310.04900
31 citations
#1076

G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks

Anchun Gui, Jinqiang Ye, Han Xiao

AAAI 2024 • paper • arXiv:2305.10329
31 citations
#1077

Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures

Yannick Kirchhoff, Maximilian Rokuss, Saikat Roy et al.

ECCV 2024 • poster • arXiv:2404.03010
31 citations
#1078

RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation

Haiming Zhang, Xu Yan, Dongfeng Bai et al.

AAAI 2024 • paper • arXiv:2312.11829
31 citations
#1079

LaWa: Using Latent Space for In-Generation Image Watermarking

Ahmad Rezaei, Mohammad Akbari, Saeed Ranjbar Alvar et al.

ECCV 2024 • poster • arXiv:2408.05868
31 citations
#1080

Material Palette: Extraction of Materials from a Single Image

Ivan Lopes, Fabio Pizzati, Raoul de Charette

CVPR 2024 • poster • arXiv:2311.17060
31 citations
#1081

Attention Guided CAM: Visual Explanations of Vision Transformer Guided by Self-Attention

Saebom Leem, Hyunseok Seo

AAAI 2024 • paper • arXiv:2402.04563
31 citations
#1082

Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation

Yuan Yuan, Chenyang Shao, Jingtao Ding et al.

ICLR 2024 • oral • arXiv:2402.11922
31 citations
#1083

GTA: A Geometry-Aware Attention Mechanism for Multi-View Transformers

Takeru Miyato, Bernhard Jaeger, Max Welling et al.

ICLR 2024 • poster • arXiv:2310.10375
31 citations
#1084

Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion

Bohan Li, Jiajun Deng, Wenyao Zhang et al.

ECCV 2024 • poster • arXiv:2407.02077
31 citations
#1085

R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

Ye Liu, Jixuan He, Wanhua Li et al.

ECCV 2024 • poster • arXiv:2404.00801
31 citations
#1086

ExACT: Language-guided Conceptual Reasoning and Uncertainty Estimation for Event-based Action Recognition and More

Jiazhou Zhou, Xu Zheng, Yuanhuiyi Lyu et al.

CVPR 2024 • highlight • arXiv:2403.12534
31 citations
#1087

Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling

Shentong Mo, Pedro Morgado

CVPR 2024 • poster • arXiv:2312.01017
31 citations
#1088

G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model

Pan Xie, Qipeng Zhang, Peng Taiying et al.

AAAI 2024 • paper • arXiv:2208.09141
31 citations
#1089

Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity

Santiago Pascual, Chunghsin Yeh, Ioannis Tsiamas et al.

ECCV 2024 • poster • arXiv:2407.10387
31 citations
#1090

FlowIE: Efficient Image Enhancement via Rectified Flow

Yixuan Zhu, Wenliang Zhao, Ao Li et al.

CVPR 2024 • poster • arXiv:2406.00508
31 citations
#1091

Atlantis: Enabling Underwater Depth Estimation with Stable Diffusion

Fan Zhang, Shaodi You, Yu Li et al.

CVPR 2024 • highlight • arXiv:2312.12471
31 citations
#1092

Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities

Lorenzo Baraldi, Federico Cocchi, Marcella Cornia et al.

ECCV 2024 • poster • arXiv:2407.20337
31 citations
#1093

Physical Property Understanding from Language-Embedded Feature Fields

Albert J. Zhai, Yuan Shen, Emily Y. Chen et al.

CVPR 2024 • poster • arXiv:2404.04242
31 citations
#1094

Open-World Human-Object Interaction Detection via Multi-modal Prompts

Jie Yang, Bingliang Li, Ailing Zeng et al.

CVPR 2024 • poster • arXiv:2406.07221
31 citations
#1095

Fair and Efficient Contribution Valuation for Vertical Federated Learning

Zhenan Fan, Huang Fang, Xinglu Wang et al.

ICLR 2024 • poster • arXiv:2201.02658
31 citations
#1096

TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small Target Detection

Tianxiang Chen, Zhentao Tan, Qi Chu et al.

AAAI 2024 • paper • arXiv:2402.02046
31 citations
#1097

Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning

Chenyu Zhang, Han Wang, Aritra Mitra et al.

ICLR 2024 • poster • arXiv:2401.15273
31 citations
#1098

PREGO: Online Mistake Detection in PRocedural EGOcentric Videos

Alessandro Flaborea, Guido M. D' et al.

CVPR 2024 • poster • arXiv:2404.01933
30 citations
#1099

Modular Blind Video Quality Assessment

Wen Wen, Mu Li, Yabin Zhang et al.

CVPR 2024 • poster • arXiv:2402.19276
30 citations
#1100

Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion

Otto Seiskari, Jerry Ylilammi, Valtteri Kaatrasalo et al.

ECCV 2024 • poster • arXiv:2403.13327
30 citations
#1101

NECO: NEural Collapse Based Out-of-distribution detection

Mouïn Ben Ammar, Nacim Belkhir, Sebastian Popescu et al.

ICLR 2024 • poster • arXiv:2310.06823
30 citations
#1102

InterHandGen: Two-Hand Interaction Generation via Cascaded Reverse Diffusion

Jihyun Lee, Shunsuke Saito, Giljoo Nam et al.

CVPR 2024 • poster • arXiv:2403.17422
30 citations
#1103

WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model

Haisheng Fu, Jie Liang, Zhenman Fang et al.

ECCV 2024 • poster • arXiv:2407.09983
30 citations
#1104

Denoising Vision Transformers

Jiawei Yang, Katie Luo, Jiefeng Li et al.

ECCV 2024 • poster • arXiv:2401.02957
30 citations
#1105

MVPGS: Excavating Multi-view Priors for Gaussian Splatting from Sparse Input Views

Wangze Xu, Huachen Gao, Shihe Shen et al.

ECCV 2024 • poster • arXiv:2409.14316
30 citations
#1106

Soft Prompt Generation for Domain Generalization

Shuanghao Bai, Yuedi Zhang, Wanqi Zhou et al.

ECCV 2024 • poster • arXiv:2404.19286
30 citations
#1107

VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation

Yang Chen, Yingwei Pan, Haibo Yang et al.

CVPR 2024 • poster • arXiv:2403.17001
30 citations
#1108

Universal Segmentation at Arbitrary Granularity with Language Instruction

Yong Liu, Cairong Zhang, Yitong Wang et al.

CVPR 2024 • poster • arXiv:2312.01623
30 citations
#1109

Towards Language-Driven Video Inpainting via Multimodal Large Language Models

Jianzong Wu, Xiangtai Li, Chenyang Si et al.

CVPR 2024 • poster • arXiv:2401.10226
30 citations
#1110

Multi-Space Alignments Towards Universal LiDAR Segmentation

Youquan Liu, Lingdong Kong, Xiaoyang Wu et al.

CVPR 2024 • poster • arXiv:2405.01538
30 citations
#1111

Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models

Yufei Zhan, Yousong Zhu, Zhiyang Chen et al.

ECCV 2024 • poster • arXiv:2311.14552
30 citations
#1112

Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs

Muhammad Jehanzeb Mirza, Leonid Karlinsky, Wei Lin et al.

ECCV 2024 • poster • arXiv:2403.11755
30 citations
#1113

TopoGCL: Topological Graph Contrastive Learning

Yuzhou Chen, Jose Frias, Yulia Gel

AAAI 2024 • paper • arXiv:2406.17251
30 citations
#1114

PosterLlama: Bridging Design Ability of Language Model to Content-Aware Layout Generation

Jaejung Seol, Seojun Kim, Jaejun Yoo

ECCV 2024 • poster • arXiv:2404.00995
30 citations
#1115

Seeing Motion at Nighttime with an Event Camera

Haoyue Liu, Shihan Peng, Lin Zhu et al.

CVPR 2024 • poster • arXiv:2404.11884
30 citations
#1116

Deep Contrastive Graph Learning with Clustering-Oriented Guidance

Mulin Chen, Bocheng Wang, Xuelong Li

AAAI 2024 • paper • arXiv:2402.16012
30 citations
#1117

Relevant Intrinsic Feature Enhancement Network for Few-Shot Semantic Segmentation

Xiaoyi Bao, Jie Qin, Siyang Sun et al.

AAAI 2024 • paper • arXiv:2312.06474
30 citations
#1118

Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning

Xinyuan Gao, Songlin Dong, Yuhang He et al.

ECCV 2024 • poster • arXiv:2407.10281
30 citations
#1119

Localization Is All You Evaluate: Data Leakage in Online Mapping Datasets and How to Fix It

Adam Lilja, Junsheng Fu, Erik Stenborg et al.

CVPR 2024 • poster • arXiv:2312.06420
30 citations
#1120

Lossy Image Compression with Foundation Diffusion Models

Lucas Relic, Roberto Azevedo, Markus Gross et al.

ECCV 2024 • poster • arXiv:2404.08580
30 citations
#1121

Towards the Uncharted: Density-Descending Feature Perturbation for Semi-supervised Semantic Segmentation

Xiaoyang Wang, Huihui Bai, Limin Yu et al.

CVPR 2024 • poster • arXiv:2403.06462
30 citations
#1122

SHAP-EDITOR: Instruction-Guided Latent 3D Editing in Seconds

Minghao Chen, Junyu Xie, Iro Laina et al.

CVPR 2024 • poster • arXiv:2312.09246
30 citations
#1123

LEOD: Label-Efficient Object Detection for Event Cameras

Ziyi Wu, Mathias Gehrig, Qing Lyu et al.

CVPR 2024 • poster • arXiv:2311.17286
30 citations
#1124

Spanning Training Progress: Temporal Dual-Depth Scoring (TDDS) for Enhanced Dataset Pruning

Xin Zhang, Jiawei Du, Weiying Xie et al.

CVPR 2024 • poster • arXiv:2311.13613
30 citations
#1125

Domain-Controlled Prompt Learning

Qinglong Cao, Zhengqin Xu, Yuntian Chen et al.

AAAI 2024 • paper • arXiv:2310.07730
30 citations
#1126

EAGLE: Eigen Aggregation Learning for Object-Centric Unsupervised Semantic Segmentation

Chanyoung Kim, Woojung Han, Dayun Ju et al.

CVPR 2024 • highlight • arXiv:2403.01482
30 citations
#1127

Image Inpainting via Tractable Steering of Diffusion Models

Anji Liu, Mathias Niepert, Guy Van den Broeck

ICLR 2024 • poster • arXiv:2401.03349
30 citations
#1128

N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields

Yash Bhalgat, Iro Laina, Joao F Henriques et al.

ECCV 2024 • poster • arXiv:2403.10997
30 citations
#1129

RegionDrag: Fast Region-Based Image Editing with Diffusion Models

Jingyi Lu, Xinghui Li, Kai Han

ECCV 2024 • poster • arXiv:2407.18247
30 citations
#1130

Style Blind Domain Generalized Semantic Segmentation via Covariance Alignment and Semantic Consistence Contrastive Learning

Woo-Jin Ahn, Geun-Yeong Yang, Hyunduck Choi et al.

CVPR 2024 • poster • arXiv:2403.06122
30 citations
#1131

Hierarchical Gaussian Mixture Normalizing Flow Modeling for Unified Anomaly Detection

Xincheng Yao, Ruoqi Li, Zefeng Qian et al.

ECCV 2024 • poster • arXiv:2403.13349
30 citations
#1132

Hunting Attributes: Context Prototype-Aware Learning for Weakly Supervised Semantic Segmentation

Feilong Tang, Zhongxing Xu, Zhaojun Qu et al.

CVPR 2024 • poster • arXiv:2403.07630
29 citations
#1133

Root Cause Analysis in Microservice Using Neural Granger Causal Discovery

Cheng-Ming Lin, Ching Chang, Wei-Yao Wang et al.

AAAI 2024 • paper • arXiv:2402.01140
29 citations
#1134

Understanding In-Context Learning from Repetitions

Jianhao (Elliott) Yan, Jin Xu, Chiyu Song et al.

ICLR 2024 • poster • arXiv:2310.00297
29 citations
#1135

Self-Supervised Facial Representation Learning with Facial Region Awareness

Zheng Gao, Ioannis Patras

CVPR 2024 • poster • arXiv:2403.02138
29 citations
#1136

Chinese Spelling Correction as Rephrasing Language Model

Linfeng Liu, Hongqiu Wu, Hai Zhao

AAAI 2024 • paper • arXiv:2308.08796
29 citations
#1137

PolyVoice: Language Models for Speech to Speech Translation

Qianqian Dong, Zhiying Huang, Qiao Tian et al.

ICLR 2024 • poster • arXiv:2306.02982
29 citations
#1138

Frozen CLIP: A Strong Backbone for Weakly Supervised Semantic Segmentation

Bingfeng Zhang, Siyue Yu, Yunchao Wei et al.

CVPR 2024 • highlight • arXiv:2406.11189
29 citations
#1139

Nuvo: Neural UV Mapping for Unruly 3D Representations

Pratul Srinivasan, Stephan J Garbin, Dor Verbin et al.

ECCV 2024 • poster • arXiv:2312.05283
29 citations
#1140

Copula Conformal prediction for multi-step time series prediction

Sophia Sun, Rose Yu

ICLR 2024 • oral
29 citations
#1141

Training Like a Medical Resident: Context-Prior Learning Toward Universal Medical Image Segmentation

Yunhe Gao

CVPR 2024 • poster • arXiv:2306.02416
29 citations
#1142

Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework

Ziyao Huang, Fan Tang, Yong Zhang et al.

CVPR 2024 • poster • arXiv:2403.16510
29 citations
#1143

UMIE: Unified Multimodal Information Extraction with Instruction Tuning

Lin Sun, Kai Zhang, Qingyuan Li et al.

AAAI 2024 • paper • arXiv:2401.03082
29 citations
#1144

Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training

Cheng Tan, Jingxuan Wei, Zhangyang Gao et al.

ECCV 2024 • poster • arXiv:2311.14109
29 citations
#1145

Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions

Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi

ECCV 2024 • poster • arXiv:2407.16698
29 citations
#1146

Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling

Rui Liu, Yifan Hu, Yi Ren et al.

AAAI 2024 • paper • arXiv:2312.11947
29 citations
#1147

Logical Languages Accepted by Transformer Encoders with Hard Attention

Pablo Barcelo, Alexander Kozachinskiy, Anthony W. Lin et al.

ICLR 2024 • poster • arXiv:2310.03817
29 citations
#1148

Dataset Distillation by Automatic Training Trajectories

Dai Liu, Jindong Gu, Hu Cao et al.

ECCV 2024 • poster • arXiv:2407.14245
29 citations
#1149

Entropic Open-Set Active Learning

Bardia Safaei, Vibashan VS, Celso de Melo et al.

AAAI 2024 • paper • arXiv:2312.14126
29 citations
#1150

Ghost on the Shell: An Expressive Representation of General 3D Shapes

Zhen Liu, Yao Feng, Yuliang Xiu et al.

ICLR 2024 • poster • arXiv:2310.15168
29 citations
#1151

Four Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding

Ozan Unal, Christos Sakaridis, Suman Saha et al.

ECCV 2024 • poster • arXiv:2309.04561
29 citations
#1152

Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions for Tree Ensembles

Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer et al.

AAAI 2024 • paper • arXiv:2401.12069
29 citations
#1153

UniGarmentManip: A Unified Framework for Category-Level Garment Manipulation via Dense Visual Correspondence

Ruihai Wu, Haoran Lu, Yiyan Wang et al.

CVPR 2024 • poster • arXiv:2405.06903
29 citations
#1154

View Selection for 3D Captioning via Diffusion Ranking

Tiange Luo, Justin Johnson, Honglak Lee

ECCV 2024 • poster • arXiv:2404.07984
29 citations
#1155

Unified Language-driven Zero-shot Domain Adaptation

Senqiao Yang, Zhuotao Tian, Li Jiang et al.

CVPR 2024 • poster • arXiv:2404.07155
29 citations
#1156

Unifying Multi-Modal Uncertainty Modeling and Semantic Alignment for Text-to-Image Person Re-identification

Zhiwei Zhao, Bin Liu, Yan Lu et al.

AAAI 2024 • paper
29 citations
#1157

VOODOO 3D: Volumetric Portrait Disentanglement For One-Shot 3D Head Reenactment

Phong Tran, Egor Zakharov, Long Nhat Ho et al.

CVPR 2024 • poster • arXiv:2312.04651
29 citations
#1158

Zero-1-to-3: Domain-Level Zero-Shot Cognitive Diagnosis via One Batch of Early-Bird Students towards Three Diagnostic Objectives

Weibo Gao, Qi Liu, Hao Wang et al.

AAAI 2024 • paper • arXiv:2312.13434
29 citations
#1159

OmniViD: A Generative Framework for Universal Video Understanding

Junke Wang, Dongdong Chen, Chong Luo et al.

CVPR 2024 • poster • arXiv:2403.17935
29 citations
#1160

Insect-Foundation: A Foundation Model and Large-scale 1M Dataset for Visual Insect Understanding

Hoang-Quan Nguyen, Thanh-Dat Truong, Xuan-Bac Nguyen et al.

CVPR 2024 • highlight • arXiv:2311.15206
29 citations
#1161

Revisiting the Domain Shift and Sample Uncertainty in Multi-source Active Domain Transfer

Wenqiao Zhang, Zheqi Lv

CVPR 2024 • poster • arXiv:2311.12905
29 citations
#1162

DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

Jiaxin Zhang, Dezhi Peng, Chongyu Liu et al.

CVPR 2024 • poster • arXiv:2405.04408
29 citations
#1163

Biased Temporal Convolution Graph Network for Time Series Forecasting with Missing Values

Xiaodan Chen, Xiucheng Li, Bo Liu et al.

ICLR 2024 • oral
29 citations
#1164

Retrieval-Augmented Embodied Agents

Yichen Zhu, Zhicai Ou, Xiaofeng Mou et al.

CVPR 2024 • poster • arXiv:2404.11699
28 citations
#1165

VideoCon: Robust Video-Language Alignment via Contrast Captions

Hritik Bansal, Yonatan Bitton, Idan Szpektor et al.

CVPR 2024 • poster • arXiv:2311.10111
28 citations
#1166

Parallelizing non-linear sequential models over the sequence length

Yi Heng Lim, Qi Zhu, Joshua Selfridge et al.

ICLR 2024 • poster • arXiv:2309.12252
28 citations
#1167

Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection

Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker et al.

CVPR 2024 • poster • arXiv:2404.01819
28 citations
#1168

Region-Disentangled Diffusion Model for High-Fidelity PPG-to-ECG Translation

Debaditya Shome, Pritam Sarkar, Ali Etemad

AAAI 2024 • paper • arXiv:2308.13568
28 citations
#1169

DreamFlow: High-quality text-to-3D generation by Approximating Probability Flow

Kyungmin Lee, Kihyuk Sohn, Jinwoo Shin

ICLR 2024 • spotlight • arXiv:2403.14966
28 citations
#1170

Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder

Jinseok Kim, Tae-Kyun Kim

CVPR 2024 • poster • arXiv:2403.10255
28 citations
#1171

VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors

Sungwon Hwang, Min-Jung Kim, Taewoong Kang et al.

ECCV 2024 • poster • arXiv:2407.02945
28 citations
#1172

LAMM: Label Alignment for Multi-Modal Prompt Learning

Jingsheng Gao, Jiacheng Ruan, Suncheng Xiang et al.

AAAI 2024 • paper • arXiv:2312.08212
28 citations
#1173

Single Domain Generalization for Crowd Counting

Zhuoxuan Peng, S.-H. Gary Chan

CVPR 2024 • poster • arXiv:2403.09124
28 citations
#1174

Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models

Hongjie Wang, Difan Liu, Yan Kang et al.

CVPR 2024 • poster • arXiv:2405.05252
28 citations
#1175

AccDiffusion: An Accurate Method for Higher-Resolution Image Generation

Zhihang Lin, Mingbao Lin, Meng Zhao et al.

ECCV 2024 • poster • arXiv:2407.10738
28 citations
#1176

Unified Generative Modeling of 3D Molecules with Bayesian Flow Networks

Yuxuan Song, Jingjing Gong, Hao Zhou et al.

ICLR 2024 • poster • arXiv:2403.15441
28 citations
#1177

Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting

Zhicheng Wang, Liwen Xiao, Zhiguo Cao et al.

AAAI 2024 • paper • arXiv:2305.04440
28 citations
#1178

A Simple Baseline for Efficient Hand Mesh Reconstruction

Zhishan Zhou, Shihao Zhou, Zhi Lv et al.

CVPR 2024 • poster • arXiv:2403.01813
28 citations
#1179

Unifying Correspondence Pose and NeRF for Generalized Pose-Free Novel View Synthesis

Sunghwan Hong, Jaewoo Jung, Heeseong Shin et al.

CVPR 2024 • highlight
28 citations
#1180

HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning

Zhecan Wang, Garrett Bingham, Adams Wei Yu et al.

ECCV 2024 • poster • arXiv:2407.15680
28 citations
#1181

GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding

Hao Li, Dingwen Zhang, Yalun Dai et al.

CVPR 2024 • highlight • arXiv:2311.11863
28 citations
#1182

Open-Vocabulary Semantic Segmentation with Image Embedding Balancing

Xiangheng Shan, Dongyue Wu, Guilin Zhu et al.

CVPR 2024 • poster • arXiv:2406.09829
28 citations
#1183

Personalized Federated Domain-Incremental Learning based on Adaptive Knowledge Matching

Yichen Li, Wenchao Xu, Haozhao Wang et al.

ECCV 2024 • poster • arXiv:2407.05005
28 citations
#1184

Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Maps

Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi et al.

ICLR 2024 • spotlight • arXiv:2302.00456
28 citations
#1185

Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis

Zanlin Ni, Yulin Wang, Renping Zhou et al.

CVPR 2024 • poster • arXiv:2406.05478
28 citations
#1186

CAT-SAM: Conditional Tuning for Few-Shot Adaptation of Segment Anything Model

Aoran Xiao, Weihao Xuan, Heli Qi et al.

ECCV 2024 • poster • arXiv:2402.03631
28 citations
#1187

Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors

Lihe Ding, Shaocong Dong, Zhanpeng Huang et al.

CVPR 2024 • poster • arXiv:2312.04963
28 citations
#1188

DREAM: Dual Structured Exploration with Mixup for Open-set Graph Domain Adaption

Nan Yin, Mengzhu Wang et al.

ICLR 2024 • poster
28 citations
#1189

Video Editing via Factorized Diffusion Distillation

Uriel Singer, Amit Zohar, Yuval Kirstain et al.

ECCV 2024 • poster • arXiv:2403.09334
28 citations
#1190

Auto-Prox: Training-Free Vision Transformer Architecture Search via Automatic Proxy Discovery

Zimian Wei, Peijie Dong, Zheng Hui et al.

AAAI 2024 • paper • arXiv:2312.09059
28 citations
#1191

WHAC: World-grounded Humans and Cameras

Wanqi Yin, Zhongang Cai, Chen Wei et al.

ECCV 2024 • poster • arXiv:2403.12959
28 citations
#1192

LaneCPP: Continuous 3D Lane Detection using Physical Priors

Maximilian Pittner, Joel Janai, Alexandru Paul Condurache

CVPR 2024 • poster • arXiv:2406.08381
28 citations
#1193

I'M HOI: Inertia-aware Monocular Capture of 3D Human-Object Interactions

Chengfeng Zhao, Juze Zhang, Jiashen Du et al.

CVPR 2024 • poster • arXiv:2312.08869
28 citations
#1194

Contextrast: Contextual Contrastive Learning for Semantic Segmentation

Changki Sung, Wanhee Kim, Jungho An et al.

CVPR 2024 • poster • arXiv:2404.10633
28 citations
#1195

DiffAssemble: A Unified Graph-Diffusion Model for 2D and 3D Reassembly

Gianluca Scarpellini, Stefano Fiorini, Francesco Giuliari et al.

CVPR 2024 • poster • arXiv:2402.19302
28 citations
#1196

MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos

Yushuo Chen, Zerong Zheng, Zhe Li et al.

ECCV 2024 • poster • arXiv:2407.08414
28 citations
#1197

It's All About Your Sketch: Democratising Sketch Control in Diffusion Models

Subhadeep Koley, Ayan Kumar Bhunia, Deeptanshu Sekhri et al.

CVPR 2024 • poster • arXiv:2403.07234
28 citations
#1198

PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving

Zhili Chen, Maosheng Ye, Shuangjie Xu et al.

ECCV 2024 • poster • arXiv:2311.08100
28 citations
#1199

DC-NAS: Divide-and-Conquer Neural Architecture Search for Multi-Modal Classification

Xinyan Liang, Pinhan Fu, Qian Guo et al.

AAAI 2024 • paper
28 citations
#1200

Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views

Yabo Chen, Jiemin Fang, Yuyang Huang et al.

ECCV 2024 • poster • arXiv:2312.04424
28 citations