Most Cited 2024 "multi-stage competition" Papers

12,324 papers found • Page 8 of 62

#1401

GeoCalib: Learning Single-image Calibration with Geometric Optimization

Alexander Veicht, Paul-Edouard Sarlin, Philipp Lindenberger et al.

ECCV 2024posterarXiv:2409.06704
23
citations
#1402

ValUES: A Framework for Systematic Validation of Uncertainty Estimation in Semantic Segmentation

Kim-Celine Kahl, Carsten Lüth, Maximilian Zenk et al.

ICLR 2024posterarXiv:2401.08501
23
citations
#1403

FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning

Chenhao Li, Elijah Stanger-Jones, Steve Heim et al.

ICLR 2024oralarXiv:2402.13820
23
citations
#1404

Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes

Yaoting Wang, Peiwen Sun, Dongzhan Zhou et al.

ECCV 2024posterarXiv:2407.10957
23
citations
#1405

PanoContext-Former: Panoramic Total Scene Understanding with a Transformer

Yuan Dong, Chuan Fang, Liefeng Bo et al.

CVPR 2024posterarXiv:2305.12497
23
citations
#1406

WeditGAN: Few-Shot Image Generation via Latent Space Relocation

Yuxuan Duan, Li Niu, Yan Hong et al.

AAAI 2024paperarXiv:2305.06671
23
citations
#1407

SGFormer: Semantic Graph Transformer for Point Cloud-Based 3D Scene Graph Generation

Changsheng Lv, Mengshi Qi, Xia Li et al.

AAAI 2024paperarXiv:2303.11048
23
citations
#1408

Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion

Xiang Fan, Anand Bhattad, Ranjay Krishna

ECCV 2024posterarXiv:2403.14617
23
citations
#1409

Non-exemplar Online Class-Incremental Continual Learning via Dual-Prototype Self-Augment and Refinement

Fushuo Huo, Wenchao Xu, Jingcai Guo et al.

AAAI 2024paperarXiv:2303.10891
23
citations
#1410

L2MAC: Large Language Model Automatic Computer for Extensive Code Generation

Samuel Holt, Max Ruiz Luyten, Mihaela van der Schaar

ICLR 2024posterarXiv:2310.02003
23
citations
#1411

Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts

Byeongjun Park, Hyojun Go, Jin-Young Kim et al.

ECCV 2024posterarXiv:2403.09176
23
citations
#1412

Some Fundamental Aspects about Lipschitz Continuity of Neural Networks

Grigory Khromov, Sidak Pal Singh

ICLR 2024posterarXiv:2302.10886
23
citations
#1413

Revisit Anything: Visual Place Recognition via Image Segment Retrieval

Kartik Garg, Sai Shubodh Puligilla, Shishir N Y Kolathaya et al.

ECCV 2024posterarXiv:2409.18049
23
citations
#1414

Unknown Prompt the only Lacuna: Unveiling CLIP's Potential for Open Domain Generalization

Mainak Singha, Ankit Jha, Shirsha Bose et al.

CVPR 2024posterarXiv:2404.00710
23
citations
#1415

Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models

Gihyun Kwon, Simon Jenni, Ding Li et al.

CVPR 2024posterarXiv:2404.03913
23
citations
#1416

PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis

Zhengyao Lv, Yuxiang Wei, Wangmeng Zuo et al.

CVPR 2024highlightarXiv:2403.01852
23
citations
#1417

Bayesian Diffusion Models for 3D Shape Reconstruction

Haiyang Xu, Yu Lei, Zeyuan Chen et al.

CVPR 2024posterarXiv:2403.06973
23
citations
#1418

6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation

Li Xu, Haoxuan Qu, Yujun Cai et al.

CVPR 2024posterarXiv:2401.00029
23
citations
#1419

TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data

Siyi Du, Shaoming Zheng, Yinsong Wang et al.

ECCV 2024posterarXiv:2407.07582
23
citations
#1420

milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing

Fangqiang Ding, Zhen Luo, Peijun Zhao et al.

ECCV 2024posterarXiv:2306.17010
23
citations
#1421

Benchmarking Object Detectors with COCO: A New Path Forward

Shweta Singh, Aayan Yadav, Jitesh Jain et al.

ECCV 2024posterarXiv:2403.18819
23
citations
#1422

HybridGait: A Benchmark for Spatial-Temporal Cloth-Changing Gait Recognition with Hybrid Explorations

Yilan Dong, Chunlin Yu, Ruiyang Ha et al.

AAAI 2024paperarXiv:2401.00271
23
citations
#1423

SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant

Guohao Sun, Can Qin, Jiaminan Wang et al.

ECCV 2024posterarXiv:2403.11299
23
citations
#1424

StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models

Wen Li, Muyuan Fang, Cheng Zou et al.

ECCV 2024posterarXiv:2409.02543
23
citations
#1425

Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving

Zhenghao Peng, Wenjie Luo, Yiren Lu et al.

ECCV 2024posterarXiv:2409.18343
23
citations
#1426

Text2LiDAR: Text-guided LiDAR Point Clouds Generation via Equirectangular Transformer

Yang Wu, Kaihua Zhang, Jianjun Qian et al.

ECCV 2024posterarXiv:2407.19628
22
citations
#1427

Flatten Long-Range Loss Landscapes for Cross-Domain Few-Shot Learning

Yixiong Zou, Yicong Liu, Yiman Hu et al.

CVPR 2024posterarXiv:2403.00567
22
citations
#1428

Semantic-aware SAM for Point-Prompted Instance Segmentation

Zhaoyang Wei, Pengfei Chen, Xuehui Yu et al.

CVPR 2024highlightarXiv:2312.15895
22
citations
#1429

Prioritized Semantic Learning for Zero-shot Instance Navigation

Xinyu Sun, Lizhao Liu, Hongyan Zhi et al.

ECCV 2024posterarXiv:2403.11650
22
citations
#1430

NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields

Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini et al.

ECCV 2024posterarXiv:2404.01300
22
citations
#1431

Meaning Representations from Trajectories in Autoregressive Models

Tian Yu Liu, Matthew Trager, Alessandro Achille et al.

ICLR 2024posterarXiv:2310.18348
22
citations
#1432

ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems

Denis Zavadski, Johann-Friedrich Feiden, Carsten Rother

ECCV 2024posterarXiv:2312.06573
22
citations
#1433

Category-Level Multi-Part Multi-Joint 3D Shape Assembly

Yichen Li, Kaichun Mo, Yueqi Duan et al.

CVPR 2024posterarXiv:2303.06163
22
citations
#1434

Reliability in Semantic Segmentation: Can We Use Synthetic Data?

Thibaut Loiseau, Tuan Hung Vu, Mickael Chen et al.

ECCV 2024posterarXiv:2312.09231
22
citations
#1435

G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection

Fan Wu, Jinling Gao, Lanqing Hong et al.

AAAI 2024paperarXiv:2402.04672
22
citations
#1436

PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios

Jingbo Wang, Zhengyi Luo, Ye Yuan et al.

CVPR 2024posterarXiv:2404.19722
22
citations
#1437

RadEdit: stress-testing biomedical vision models via diffusion image editing

Fernando Pérez-García, Sam Bond-Taylor, Pedro Sanchez et al.

ECCV 2024posterarXiv:2312.12865
22
citations
#1438

Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning

Yan Li, Weiwei Guo, Xue Yang et al.

ECCV 2024posterarXiv:2311.11646
22
citations
#1439

OGNI-DC: Robust Depth Completion with Optimization-Guided Neural Iterations

Yiming Zuo, Jia Deng

ECCV 2024posterarXiv:2406.11711
22
citations
#1440

Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation

Razvan Pasca, Alexey Gavryushin, Muhammad Hamza et al.

CVPR 2024posterarXiv:2301.09209
22
citations
#1441

PALM: Predicting Actions through Language Models

Sanghwan Kim, Daoji Huang, Yongqin Xian et al.

ECCV 2024posterarXiv:2311.17944
22
citations
#1442

Domain Prompt Learning with Quaternion Networks

Qinglong Cao, Zhengqin Xu, Yuntian Chen et al.

CVPR 2024highlightarXiv:2312.08878
22
citations
#1443

PNeRFLoc: Visual Localization with Point-Based Neural Radiance Fields

Boming Zhao, Luwei Yang, Mao Mao et al.

AAAI 2024paper
22
citations
#1444

On the Provable Advantage of Unsupervised Pretraining

Jiawei Ge, Shange Tang, Jianqing Fan et al.

ICLR 2024spotlightarXiv:2303.01566
22
citations
#1445

MotionChain: Conversational Motion Controllers via Multimodal Prompts

Biao Jiang, Xin Chen, Chi Zhang et al.

ECCV 2024posterarXiv:2404.01700
22
citations
#1446

Simple Image-Level Classification Improves Open-Vocabulary Object Detection

Ruohuan Fang, Guansong Pang, Xiao Bai

AAAI 2024paperarXiv:2312.10439
22
citations
#1447

GEARS: Local Geometry-aware Hand-object Interaction Synthesis

Keyang Zhou, Bharat Lal Bhatnagar, Jan Lenssen et al.

CVPR 2024posterarXiv:2404.01758
22
citations
#1448

Spatio-Temporal Turbulence Mitigation: A Translational Perspective

Xingguang Zhang, Nicholas M Chimitt, Yiheng Chi et al.

CVPR 2024posterarXiv:2401.04244
22
citations
#1449

A Diffusion-Based Pre-training Framework for Crystal Property Prediction

Zixing Song, Ziqiao Meng, Irwin King

AAAI 2024paper
22
citations
#1450

Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-Hoc Retrieval

Weihang Su, Qingyao Ai, Xiangsheng Li et al.

AAAI 2024paperarXiv:2312.10661
22
citations
#1451

Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-Free Multi-Exposure Image Fusion

Guanyao Wu, Hongming Fu, Jinyuan Liu et al.

AAAI 2024paperarXiv:2309.01113
22
citations
#1452

FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance Head-pose and Facial Expression Features

Andre Rochow, Max Schwarz, Sven Behnke

CVPR 2024posterarXiv:2404.09736
22
citations
#1453

GAFusion: Adaptive Fusing LiDAR and Camera with Multiple Guidance for 3D Object Detection

Xiaotian Li, Baojie Fan, Jiandong Tian et al.

CVPR 2024posterarXiv:2411.00340
22
citations
#1454

A Multi-Modal Contrastive Diffusion Model for Therapeutic Peptide Generation

Yongkang Wang, Xuan Liu, Feng Huang et al.

AAAI 2024paperarXiv:2312.15665
22
citations
#1455

DIM: Dyadic Interaction Modeling for Social Behavior Generation

Minh Tran, Di Chang, Maksim Siniukov et al.

ECCV 2024poster
22
citations
#1456

ViLA: Efficient Video-Language Alignment for Video Question Answering

Xijun Wang, Junbang Liang, Chun-Kai Wang et al.

ECCV 2024posterarXiv:2312.08367
22
citations
#1457

On the Role of Server Momentum in Federated Learning

Jianhui Sun, Xidong Wu, Heng Huang et al.

AAAI 2024paperarXiv:2312.12670
22
citations
#1458

Graph Contrastive Invariant Learning from the Causal Perspective

Yanhu Mo, Xiao Wang, Shaohua Fan et al.

AAAI 2024paperarXiv:2401.12564
22
citations
#1459

Deep SE(3)-Equivariant Geometric Reasoning for Precise Placement Tasks

Ben Eisner, Yi Yang, Todor Davchev et al.

ICLR 2024posterarXiv:2404.13478
22
citations
#1460

COCONut: Modernizing COCO Segmentation

Xueqing Deng, Qihang Yu, Peng Wang et al.

CVPR 2024posterarXiv:2404.08639
22
citations
#1461

Extend Your Own Correspondences: Unsupervised Distant Point Cloud Registration by Progressive Distance Extension

Quan Liu, Hongzi Zhu, Zhenxi Wang et al.

CVPR 2024posterarXiv:2403.03532
22
citations
#1462

Object-Centric Diffusion for Efficient Video Editing

Kumara Kahatapitiya, Adil Karjauv, Davide Abati et al.

ECCV 2024posterarXiv:2401.05735
22
citations
#1463

Learning to Prompt Knowledge Transfer for Open-World Continual Learning

Yujie Li, Xin Yang, Hao Wang et al.

AAAI 2024paperarXiv:2312.14990
22
citations
#1464

StegoGAN: Leveraging Steganography for Non-Bijective Image-to-Image Translation

Sidi Wu, Yizi Chen, Loic Landrieu et al.

CVPR 2024posterarXiv:2403.20142
22
citations
#1465

LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures

Vimal Thilak, Chen Huang, Omid Saremi et al.

ICLR 2024spotlightarXiv:2312.04000
22
citations
#1466

Generalizable Sleep Staging via Multi-Level Domain Alignment

Jiquan Wang, Sha Zhao, Haiteng Jiang et al.

AAAI 2024paperarXiv:2401.05363
22
citations
#1467

Rethinking Multi-view Representation Learning via Distilled Disentangling

Guanzhou Ke, Bo Wang, Xiao-Li Wang et al.

CVPR 2024posterarXiv:2403.10897
22
citations
#1468

Bayesian Neural Controlled Differential Equations for Treatment Effect Estimation

Konstantin Hess, Valentyn Melnychuk, Dennis Frauen et al.

ICLR 2024posterarXiv:2310.17463
22
citations
#1469

MLNet: Mutual Learning Network with Neighborhood Invariance for Universal Domain Adaptation

Yanzuo Lu, Meng Shen, Andy J Ma et al.

AAAI 2024paperarXiv:2312.07871
22
citations
#1470

Unifying Top-down and Bottom-up Scanpath Prediction Using Transformers

Zhibo Yang, Sounak Mondal, Seoyoung Ahn et al.

CVPR 2024posterarXiv:2303.09383
22
citations
#1471

LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment

Yiming Ren, Xiao Han, Chengfeng Zhao et al.

CVPR 2024highlightarXiv:2402.17171
22
citations
#1472

F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions

Jie Yang, Xuesong Niu, Nan Jiang et al.

ECCV 2024posterarXiv:2407.12435
22
citations
#1473

Understanding Certified Training with Interval Bound Propagation

Yuhao Mao, Mark N Müller, Marc Fischer et al.

ICLR 2024posterarXiv:2306.10426
22
citations
#1474

FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection

Dongmei Zhang, Chang Li, Renrui Zhang et al.

AAAI 2024paperarXiv:2312.14465
22
citations
#1475

Summarizing Stream Data for Memory-Constrained Online Continual Learning

Jianyang Gu, Kai Wang, Wei Jiang et al.

AAAI 2024paperarXiv:2305.16645
22
citations
#1476

Image Clustering Conditioned on Text Criteria

Sehyun Kwon, Jaden Park, Minkyu Kim et al.

ICLR 2024posterarXiv:2310.18297
21
citations
#1477

PracticalDG: Perturbation Distillation on Vision-Language Models for Hybrid Domain Generalization

Zining Chen, Weiqiu Wang, Zhicheng Zhao et al.

CVPR 2024posterarXiv:2404.09011
21
citations
#1478

Surface Reconstruction for 3D Gaussian Splatting via Local Structural Hints

Qianyi Wu, Jianmin Zheng, Jianfei Cai

ECCV 2024poster
21
citations
#1479

Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval

Young Kyun Jang, Donghyun Kim, Zihang Meng et al.

CVPR 2024posterarXiv:2404.15516
21
citations
#1480

ZeST: Zero-Shot Material Transfer from a Single Image

Ta-Ying Cheng, Prafull Sharma, Andrew Markham et al.

ECCV 2024posterarXiv:2404.06425
21
citations
#1481

SEED: A Simple and Effective 3D DETR in Point Clouds

Zhe Liu, Jinghua Hou, Xiaoqing Ye et al.

ECCV 2024posterarXiv:2407.10749
21
citations
#1482

PromptFusion: Decoupling Stability and Plasticity for Continual Learning

Haoran Chen, Zuxuan Wu, Xintong Han et al.

ECCV 2024posterarXiv:2303.07223
21
citations
#1483

VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions

Seokha Moon, Hyun Woo, Hongbeen Park et al.

ECCV 2024posterarXiv:2407.12345
21
citations
#1484

Robust Calibration of Large Vision-Language Adapters

Balamurali Murugesan, Julio Silva-Rodríguez, Ismail Ben Ayed et al.

ECCV 2024posterarXiv:2407.13588
21
citations
#1485

Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering

Zaid Khan, Yun Fu

CVPR 2024posterarXiv:2404.10193
21
citations
#1486

Adaptive FSS: A Novel Few-Shot Segmentation Framework via Prototype Enhancement

Jing Wang, Jiangyun Li, Chen Chen et al.

AAAI 2024paperarXiv:2312.15731
21
citations
#1487

One-Shot Diffusion Mimicker for Handwritten Text Generation

Gang Dai, Yifan Zhang, Quhui Ke et al.

ECCV 2024posterarXiv:2409.04004
21
citations
#1488

MonoHair: High-Fidelity Hair Modeling from a Monocular Video

Keyu Wu, Lingchen Yang, Zhiyi Kuang et al.

CVPR 2024posterarXiv:2403.18356
21
citations
#1489

Region-Adaptive Transform with Segmentation Prior for Image Compression

Yuxi Liu, Wenhan Yang, Huihui Bai et al.

ECCV 2024posterarXiv:2403.00628
21
citations
#1490

NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors

Yannan He, Garvita Tiwari, Tolga Birdal et al.

CVPR 2024highlightarXiv:2403.03122
21
citations
#1491

When Semantic Segmentation Meets Frequency Aliasing

Linwei Chen, Lin Gu, Ying Fu

ICLR 2024posterarXiv:2403.09065
21
citations
#1492

SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic

Kashyap Chitta, Daniel Dauner, Andreas Geiger

ECCV 2024posterarXiv:2403.17933
21
citations
#1493

Few-Shot Anomaly-Driven Generation for Anomaly Classification and Segmentation

Guan Gui, Bin-Bin Gao, Jun Liu et al.

ECCV 2024posterarXiv:2505.09263
21
citations
#1494

PreSight: Enhancing Autonomous Vehicle Perception with City-Scale NeRF Priors

Tianyuan Yuan, Mao Yucheng, Jiawei Yang et al.

ECCV 2024posterarXiv:2403.09079
21
citations
#1495

Conditional Information Bottleneck Approach for Time Series Imputation

MinGyu Choi, Changhee Lee

ICLR 2024oral
21
citations
#1496

Online Zero-Shot Classification with CLIP

Qi Qian, Juhua Hu

ECCV 2024posterarXiv:2408.13320
21
citations
#1497

UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement

Yaofeng Xie, Lingwei Kong, Kai Chen et al.

CVPR 2024posterarXiv:2404.14542
21
citations
#1498

Debiasing Algorithm through Model Adaptation

Tomasz Limisiewicz, David Mareček, Tomáš Musil

ICLR 2024posterarXiv:2310.18913
21
citations
#1499

Lipschitz Singularities in Diffusion Models

Zhantao Yang, Ruili Feng, Han Zhang et al.

ICLR 2024posterarXiv:2306.11251
21
citations
#1500

Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach

Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal et al.

CVPR 2024posterarXiv:2404.11732
21
citations
#1501

WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding

Quan Kong, Yuki Kawana, Rajat Saini et al.

ECCV 2024posterarXiv:2407.15350
21
citations
#1502

Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget

Johannes Lehner, Benedikt Alkin, Andreas Fürst et al.

AAAI 2024paperarXiv:2304.10520
21
citations
#1503

Learning to Adapt SAM for Segmenting Cross-domain Point Clouds

Xidong Peng, Runnan Chen, Feng Qiao et al.

ECCV 2024posterarXiv:2310.08820
21
citations
#1504

Self-Supervised Multi-Object Tracking with Path Consistency

Zijia Lu, Bing Shuai, Yanbei Chen et al.

CVPR 2024highlightarXiv:2404.05136
21
citations
#1505

Text-to-Image Generation for Abstract Concepts

Jiayi Liao, Xu Chen, Qiang Fu et al.

AAAI 2024paperarXiv:2309.14623
21
citations
#1506

MOFDiff: Coarse-grained Diffusion for Metal-Organic Framework Design

Xiang Fu, Tian Xie, Andrew Rosen et al.

ICLR 2024posterarXiv:2310.10732
21
citations
#1507

VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams

Liao Wang, Kaixin Yao, Chengcheng Guo et al.

CVPR 2024posterarXiv:2312.01407
21
citations
#1508

Large Language Models are Good Prompt Learners for Low-Shot Image Classification

Zhaoheng Zheng, Jingmin Wei, Xuefeng Hu et al.

CVPR 2024posterarXiv:2312.04076
21
citations
#1509

Weakly-Supervised Temporal Action Localization by Inferring Salient Snippet-Feature

Wu Yun, Mengshi Qi, Chuanming Wang et al.

AAAI 2024paperarXiv:2303.12332
21
citations
#1510

Clustering Propagation for Universal Medical Image Segmentation

Yuhang Ding, Liulei Li, Wenguan Wang et al.

CVPR 2024posterarXiv:2403.16646
21
citations
#1511

HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations

Peng Dai, Yang Zhang, Tao Liu et al.

CVPR 2024posterarXiv:2403.03561
21
citations
#1512

GCNext: Towards the Unity of Graph Convolutions for Human Motion Prediction

Xinshun Wang, Qiongjie Cui, Chen Chen et al.

AAAI 2024paperarXiv:2312.11850
21
citations
#1513

Rethinking Few-shot 3D Point Cloud Semantic Segmentation

Zhaochong An, Guolei Sun, Yun Liu et al.

CVPR 2024posterarXiv:2403.00592
21
citations
#1514

Diffusion Soup: Model Merging for Text-to-Image Diffusion Models

Benjamin J Biggs, Arjun Seshadri, Yang Zou et al.

ECCV 2024posterarXiv:2406.08431
21
citations
#1515

Pathologies of Predictive Diversity in Deep Ensembles

Geoff Pleiss, Taiga Abe, E. Kelly Buchanan et al.

ICLR 2024posterarXiv:2302.00704
21
citations
#1516

Boosting Neural Cognitive Diagnosis with Student’s Affective State Modeling

Shanshan Wang, Zhen Zeng, Xun Yang et al.

AAAI 2024paper
21
citations
#1517

AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis

Dongze Li, Kang Zhao, Wei Wang et al.

AAAI 2024paperarXiv:2312.10921
21
citations
#1518

TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models

Haomiao Ni, Bernhard Egger, Suhas Lohit et al.

CVPR 2024posterarXiv:2404.16306
21
citations
#1519

SAGS: Structure-Aware 3D Gaussian Splatting

Evangelos Ververas, Rolandos Alexandros Potamias, Song Jifei et al.

ECCV 2024posterarXiv:2404.19149
21
citations
#1520

DyFADet: Dynamic Feature Aggregation for Temporal Action Detection

Le Yang, Ziwei Zheng, Yizeng Han et al.

ECCV 2024posterarXiv:2407.03197
21
citations
#1521

Language-Driven Anchors for Zero-Shot Adversarial Robustness

Xiao Li, Wei Zhang, Yining Liu et al.

CVPR 2024posterarXiv:2301.13096
21
citations
#1522

Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning

Desai Xie, Jiahao Li, Hao Tan et al.

CVPR 2024posterarXiv:2312.13980
21
citations
#1523

Guided Slot Attention for Unsupervised Video Object Segmentation

Minhyeok Lee, Suhwan Cho, Dogyoon Lee et al.

CVPR 2024posterarXiv:2303.08314
21
citations
#1524

Hyperspectral Image Reconstruction via Combinatorial Embedding of Cross-Channel Spatio-Spectral Clues

Xingxing Yang, Jie Chen, Zaifeng Yang

AAAI 2024paperarXiv:2312.11119
21
citations
#1525

3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation

Songchun Zhang, Yibo Zhang, Quan Zheng et al.

CVPR 2024posterarXiv:2403.09439
21
citations
#1526

DreamPropeller: Supercharge Text-to-3D Generation with Parallel Sampling

Linqi Zhou, Andy Shih, Chenlin Meng et al.

CVPR 2024highlightarXiv:2311.17082
21
citations
#1527

Real-time 3D-aware Portrait Video Relighting

Ziqi Cai, Kaiwen Jiang, Shu-Yu Chen et al.

CVPR 2024highlightarXiv:2410.18355
21
citations
#1528

Factorized Diffusion: Perceptual Illusions by Noise Decomposition

Daniel Geng, Inbum Park, Andrew Owens

ECCV 2024posterarXiv:2404.11615
21
citations
#1529

An Incremental Unified Framework for Small Defect Inspection

Jiaqi Tang, Hao Lu, Xiaogang Xu et al.

ECCV 2024posterarXiv:2312.08917
21
citations
#1530

Question Calibration and Multi-Hop Modeling for Temporal Question Answering

Chao Xue, Di Liang, Pengfei Wang et al.

AAAI 2024paperarXiv:2402.13188
21
citations
#1531

IMPUS: Image Morphing with Perceptually-Uniform Sampling Using Diffusion Models

Zhaoyuan Yang, Zhengyang Yu, Zhiwei Xu et al.

ICLR 2024posterarXiv:2311.06792
21
citations
#1532

PrPSeg: Universal Proposition Learning for Panoramic Renal Pathology Segmentation

Ruining Deng, Quan Liu, Can Cui et al.

CVPR 2024posterarXiv:2402.19286
21
citations
#1533

SAVSR: Arbitrary-Scale Video Super-resolution via a Learned Scale-Adaptive Network

Zekun Li, Hongying Liu, Fanhua Shang et al.

AAAI 2024paper
21
citations
#1534

ASAM: Boosting Segment Anything Model with Adversarial Tuning

Bo Li, Haoke Xiao, Lv Tang

CVPR 2024posterarXiv:2405.00256
20
citations
#1535

Pre-training Sequence, Structure, and Surface Features for Comprehensive Protein Representation Learning

Youhan Lee, Hasun Yu, Jaemyung Lee et al.

ICLR 2024poster
20
citations
#1536

Sketch and Refine: Towards Fast and Accurate Lane Detection

Chao Chen, Jie Liu, Chang Zhou et al.

AAAI 2024paperarXiv:2401.14729
20
citations
#1537

Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions

Zeyu Han, Fangrui Zhu, Qianru Lao et al.

CVPR 2024posterarXiv:2311.17048
20
citations
#1538

Neural Spline Fields for Burst Image Fusion and Layer Separation

Ilya Chugunov, David Shustin, Ruyu Yan et al.

CVPR 2024posterarXiv:2312.14235
20
citations
#1539

Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning

Xiongye Xiao, Gengshuo Liu, Gaurav Gupta et al.

ICLR 2024posterarXiv:2404.09403
20
citations
#1540

Steerers: A Framework for Rotation Equivariant Keypoint Descriptors

Georg Bökman, Johan Edstedt, Michael Felsberg et al.

CVPR 2024posterarXiv:2312.02152
20
citations
#1541

FlowTrack: Revisiting Optical Flow for Long-Range Dense Tracking

Seokju Cho, Gabriel Huang, Seungryong Kim et al.

CVPR 2024poster
20
citations
#1542

Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection

Songmin Dai, Yifan Wu, Xiaoqiang Li et al.

AAAI 2024paperarXiv:2312.15911
20
citations
#1543

Loose Inertial Poser: Motion Capture with IMU-attached Loose-Wear Jacket

Chengxu Zuo, Yiming Wang, Lishuang Zhan et al.

CVPR 2024poster
20
citations
#1544

ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis

Kensen Shi, Joey Hong, Yinlin Deng et al.

ICLR 2024posterarXiv:2307.13883
20
citations
#1545

TeTriRF: Temporal Tri-Plane Radiance Fields for Efficient Free-Viewpoint Video

Minye Wu, Zehao Wang, Georgios Kouros et al.

CVPR 2024posterarXiv:2312.06713
20
citations
#1546

Simplifying Source-Free Domain Adaptation for Object Detection: Effective Self-Training Strategies and Performance Insights

Yan Hao, Florent Forest, Olga Fink

ECCV 2024posterarXiv:2407.07586
20
citations
#1547

Composed Video Retrieval via Enriched Context and Discriminative Embeddings

Omkar Thawakar, Muzammal Naseer, Rao Anwer et al.

CVPR 2024posterarXiv:2403.16997
20
citations
#1548

Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation

Marco Mistretta, Alberto Baldrati, Marco Bertini et al.

ECCV 2024posterarXiv:2407.03056
20
citations
#1549

Navigation Instruction Generation with BEV Perception and Large Language Models

Sheng Fan, Rui Liu, Wenguan Wang et al.

ECCV 2024posterarXiv:2407.15087
20
citations
#1550

Improving Plasticity in Online Continual Learning via Collaborative Learning

Maorong Wang, Nicolas Michel, Ling Xiao et al.

CVPR 2024posterarXiv:2312.00600
20
citations
#1551

WordRobe: Text-Guided Generation of Textured 3D Garments

Astitva Srivastava, Pranav Manu, Amit Raj et al.

ECCV 2024posterarXiv:2403.17541
20
citations
#1552

Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration

Chu Jie Qin, Ruiqi Wu, Zikun Liu et al.

ECCV 2024posterarXiv:2409.19403
20
citations
#1553

A Call to Reflect on Evaluation Practices for Age Estimation: Comparative Analysis of the State-of-the-Art and a Unified Benchmark

Jakub Paplham, Vojtech Franc

CVPR 2024posterarXiv:2307.04570
20
citations
#1554

An Upload-Efficient Scheme for Transferring Knowledge From a Server-Side Pre-trained Generator to Clients in Heterogeneous Federated Learning

Jianqing Zhang, Yang Liu, Yang Hua et al.

CVPR 2024posterarXiv:2403.15760
20
citations
#1555

ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models

Yi-Lin Sung, Jaehong Yoon, Mohit Bansal

ICLR 2024posterarXiv:2310.02998
20
citations
#1556

AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis

Tao Tang, Guangrun Wang, Yixing Lao et al.

CVPR 2024highlightarXiv:2402.17483
20
citations
#1557

On the Variance of Neural Network Training with respect to Test Sets and Distributions

Keller Jordan

ICLR 2024posterarXiv:2304.01910
20
citations
#1558

Training-free Video Temporal Grounding using Large-scale Pre-trained Models

Minghang Zheng, Xinhao Cai, Qingchao Chen et al.

ECCV 2024posterarXiv:2408.16219
20
citations
#1559

Long-Tailed Anomaly Detection with Learnable Class Names

Chih-Hui Ho, Kuan-Chuan Peng, Nuno Vasconcelos

CVPR 2024posterarXiv:2403.20236
20
citations
#1560

Leaving the Nest: Going beyond Local Loss Functions for Predict-Then-Optimize

Sanket Shah, Bryan Wilder, Andrew Perrault et al.

AAAI 2024paperarXiv:2305.16830
20
citations
#1561

FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients

Shangchao Su, Bin Li, Xiangyang Xue

ECCV 2024posterarXiv:2311.11227
20
citations
#1562

ConR: Contrastive Regularizer for Deep Imbalanced Regression

Mahsa Keramati, Lili Meng, R. Evans

ICLR 2024posterarXiv:2309.06651
20
citations
#1563

DiffAIL: Diffusion Adversarial Imitation Learning

Bingzheng Wang, Guoqiang Wu, Teng Pang et al.

AAAI 2024paperarXiv:2312.06348
20
citations
#1564

DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement

Hao Wu, Huabin Liu, Yu Qiao et al.

CVPR 2024posterarXiv:2404.02755
20
citations
#1565

MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception

Mohammad Mahbubur Rahman, Ryoma Yataka, Sorachi Kato et al.

ECCV 2024posterarXiv:2406.10708
20
citations
#1566

MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections

Jiayue Liu, Tang Xiao, Freeman Cheng et al.

ECCV 2024posterarXiv:2405.11921
20
citations
#1567

Embarrassingly Simple Dataset Distillation

Yunzhen Feng, Shanmukha Ramakrishna Vedantam, Julia Kempe

ICLR 2024poster
20
citations
#1568

Multi-Level Neural Scene Graphs for Dynamic Urban Environments

Tobias Fischer, Lorenzo Porzi, Samuel Rota Bulò et al.

CVPR 2024posterarXiv:2404.00168
20
citations
#1569

MLP Can Be A Good Transformer Learner

Sihao Lin, Pumeng Lyu, Dongrui Liu et al.

CVPR 2024posterarXiv:2404.05657
20
citations
#1570

Upper Bounding Barlow Twins: A Novel Filter for Multi-Relational Clustering

Xiaowei Qian, Bingheng Li, Zhao Kang

AAAI 2024paperarXiv:2312.14066
20
citations
#1571

PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees

Chulin Xie, De-An Huang, Wenda Chu et al.

CVPR 2024posterarXiv:2302.06637
20
citations
#1572

Aligning Geometric Spatial Layout in Cross-View Geo-Localization via Feature Recombination

Qingwang Zhang, Yingying Zhu

AAAI 2024paper
20
citations
#1573

How to Overcome Curse-of-Dimensionality for Out-of-Distribution Detection?

Soumya Suvra Ghosal, Yiyou Sun, Yixuan Li

AAAI 2024paperarXiv:2312.14452
20
citations
#1574

Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering

Antoine Guedon, Vincent Lepetit

ECCV 2024poster
20
citations
#1575

Learning to Predict Activity Progress by Self-Supervised Video Alignment

Gerard Donahue, Ehsan Elhamifar

CVPR 2024poster
20
citations
#1576

Towards Open Domain Text-Driven Synthesis of Multi-Person Motions

Shan Mengyi, Lu Dong, Yutao Han et al.

ECCV 2024posterarXiv:2405.18483
20
citations
#1577

Structure-Guided Adversarial Training of Diffusion Models

Ling Yang, Haotian Qian, Zhilong Zhang et al.

CVPR 2024posterarXiv:2402.17563
20
citations
#1578

Grid Diffusion Models for Text-to-Video Generation

Taegyeong Lee, Soyeong Kwon, Taehwan Kim

CVPR 2024posterarXiv:2404.00234
20
citations
#1579

SelEx: Self-Expertise in Fine-Grained Generalized Category Discovery

Sarah Rastegar, Mohammadreza Salehi, Yuki M Asano et al.

ECCV 2024posterarXiv:2408.14371
20
citations
#1580

Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use

Imad Eddine Toubal, Aditya Avinash, Neil Alldrin et al.

CVPR 2024posterarXiv:2403.02626
20
citations
#1581

Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation

Yunheng Li, Zhong-Yu Li, Quan-Sheng Zeng et al.

ICML 2024posterarXiv:2406.00670
20
citations
#1582

Self-Distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach

Ziyin Zhang, Ning Lu, Minghui Liao et al.

AAAI 2024paperarXiv:2308.08806
20
citations
#1583

A Graph-Based Approach for Category-Agnostic Pose Estimation

Or Hirschorn, Shai Avidan

ECCV 2024posterarXiv:2311.17891
20
citations
#1584

RealViformer: Investigating Attention for Real-World Video Super-Resolution

Yuehan Zhang, Angela Yao

ECCV 2024posterarXiv:2407.13987
20
citations
#1585

PoRF: Pose Residual Field for Accurate Neural Surface Reconstruction

Jia-Wang Bian, Wenjing Bian, Victor Prisacariu et al.

ICLR 2024posterarXiv:2310.07449
20
citations
#1586

Federated Learning with Extremely Noisy Clients via Negative Distillation

Yang Lu, Lin Chen, Yonggang Zhang et al.

AAAI 2024paperarXiv:2312.12703
20
citations
#1587

ZePT: Zero-Shot Pan-Tumor Segmentation via Query-Disentangling and Self-Prompting

Yankai Jiang, Zhongzhen Huang, Rongzhao Zhang et al.

CVPR 2024posterarXiv:2312.04964
20
citations
#1588

Improving Cross-Modal Alignment with Synthetic Pairs for Text-Only Image Captioning

Zhiyue Liu, Jinyuan Liu, Fanrong Ma

AAAI 2024paperarXiv:2312.08865
20
citations
#1589

EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models

Ruoxi Chen, Haibo Jin, Yixin Liu et al.

ECCV 2024posterarXiv:2311.12066
20
citations
#1590

ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-Order Optimization

Shuoran Jiang, Qingcai Chen, Yang Xiang et al.

AAAI 2024paperarXiv:2312.15184
20
citations
#1591

WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models

Zijian He, Peixin Chen, Guangrun Wang et al.

ECCV 2024posterarXiv:2407.10625
20
citations
#1592

Unlocking the Potential of Prompt-Tuning in Bridging Generalized and Personalized Federated Learning

Wenlong Deng, Christos Thrampoulidis, Xiaoxiao Li

CVPR 2024posterarXiv:2310.18285
20
citations
#1593

Isomorphic Pruning for Vision Models

Gongfan Fang, Xinyin Ma, Michael Bi Mi et al.

ECCV 2024posterarXiv:2407.04616
20
citations
#1594

GOODAT: Towards Test-Time Graph Out-of-Distribution Detection

Luzhi Wang, Di Jin, He Zhang et al.

AAAI 2024paperarXiv:2401.06176
20
citations
#1595

Distilling Vision-Language Models on Millions of Videos

Yue Zhao, Long Zhao, Xingyi Zhou et al.

CVPR 2024posterarXiv:2401.06129
20
citations
#1596

Diffusion for Natural Image Matting

Yihan Hu, Yiheng Lin, Wei Wang et al.

ECCV 2024posterarXiv:2312.05915
20
citations
#1597

EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE

Junyi Chen, Longteng Guo, Jia Sun et al.

AAAI 2024paperarXiv:2308.11971
20
citations
#1598

360Loc: A Dataset and Benchmark for Omnidirectional Visual Localization with Cross-device Queries

Huajian Huang, Changkun Liu, Yipeng Zhu et al.

CVPR 2024posterarXiv:2311.17389
20
citations
#1599

VideoMAC: Video Masked Autoencoders Meet ConvNets

Gensheng Pei, Tao Chen, Xiruo Jiang et al.

CVPR 2024posterarXiv:2402.19082
20
citations
#1600

Domain Randomization via Entropy Maximization

Gabriele Tiboni, Pascal Klink, Jan Peters et al.

ICLR 2024posterarXiv:2311.01885
20
citations