Most Cited 2024 "target-aware projection" Papers

12,324 papers found • Page 9 of 62

#1601

On Penalty Methods for Nonconvex Bilevel Optimization and First-Order Stochastic Approximation

Jeongyeol Kwon, Dohyun Kwon, Stephen Wright et al.

ICLR 2024spotlightarXiv:2309.01753
56
citations
#1602

Neural Operators with Localized Integral and Differential Kernels

Miguel Liu-Schiaffini, Julius Berner, Boris Bonev et al.

ICML 2024arXiv:2402.16845
56
citations
#1603

OpenNeRF: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views

Francis Engelmann, Fabian Manhardt, Michael Niemeyer et al.

ICLR 2024arXiv:2404.03650
56
citations
#1604

Analytic-Splatting: Anti-Aliased 3D Gaussian Splatting via Analytic Integration

Zhihao Liang, Qi Zhang, WENBO HU et al.

ECCV 2024arXiv:2403.11056
56
citations
#1605

Implicit Discriminative Knowledge Learning for Visible-Infrared Person Re-Identification

kaijie ren, Lei Zhang

CVPR 2024arXiv:2403.11708
56
citations
#1606

Safe Offline Reinforcement Learning with Feasibility-Guided Diffusion Model

Yinan Zheng, Jianxiong Li, Dongjie Yu et al.

ICLR 2024arXiv:2401.10700
56
citations
#1607

In-Context Learning Learns Label Relationships but Is Not Conventional Learning

Jannik Kossen, Yarin Gal, Tom Rainforth

ICLR 2024arXiv:2307.12375
56
citations
#1608

Adaptive Text Watermark for Large Language Models

Yepeng Liu, Yuheng Bu

ICML 2024arXiv:2401.13927
55
citations
#1609

Accelerating Diffusion Sampling with Optimized Time Steps

Shuchen Xue, Zhaoqiang Liu, Fei Chen et al.

CVPR 2024arXiv:2402.17376
55
citations
#1610

FlashTex: Fast Relightable Mesh Texturing with LightControlNet

Kangle Deng, Timothy Omernick, Alexander B Weiss et al.

ECCV 2024arXiv:2402.13251
55
citations
#1611

GeoGaussian: Geometry-aware Gaussian Splatting for Scene Rendering

Yanyan Li, Chenyu Lyu, Yan Di et al.

ECCV 2024arXiv:2403.11324
55
citations
#1612

Test-Time Training on Nearest Neighbors for Large Language Models

Moritz Hardt, Yu Sun

ICLR 2024arXiv:2305.18466
55
citations
#1613

Tuning LayerNorm in Attention: Towards Efficient Multi-Modal LLM Finetuning

Bingchen Zhao, Haoqin Tu, Chen Wei et al.

ICLR 2024spotlightarXiv:2312.11420
55
citations
#1614

Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis

Yuanhao Cai, Yixun Liang, Jiahao Wang et al.

ECCV 2024arXiv:2403.04116
55
citations
#1615

CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models

Zhongxi Chen, Ke Sun, Xianming Lin

AAAI 2024paperarXiv:2305.17932
55
citations
#1616

FakeInversion: Learning to Detect Images from Unseen Text-to-Image Models by Inverting Stable Diffusion

George Cazenavette, Avneesh Sud, Thomas Leung et al.

CVPR 2024arXiv:2406.08603
55
citations
#1617

How Universal Polynomial Bases Enhance Spectral Graph Neural Networks: Heterophily, Over-smoothing, and Over-squashing

Keke Huang, Yu Guang Wang, Ming Li et al.

ICML 2024arXiv:2405.12474
55
citations
#1618

HeadGaS: Real-Time Animatable Head Avatars via 3D Gaussian Splatting

Helisa Dhamo, Yinyu Nie, Arthur Moreau et al.

ECCV 2024arXiv:2312.02902
55
citations
#1619

Data Roaming and Quality Assessment for Composed Image Retrieval

Matan Levy, Rami Ben-Ari, Nir Darshan et al.

AAAI 2024paperarXiv:2303.09429
55
citations
#1620

Differentially Private Bias-Term Fine-tuning of Foundation Models

Zhiqi Bu, Yu-Xiang Wang, Sheng Zha et al.

ICML 2024arXiv:2210.00036
55
citations
#1621

SWAG: Splatting in the Wild images with Appearance-conditioned Gaussians

Hiba Dahmani, Moussab Bennehar, Nathan Piasco et al.

ECCV 2024arXiv:2403.10427
55
citations
#1622

RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation

Peng Lu, Tao Jiang, Yining Li et al.

CVPR 2024arXiv:2312.07526
55
citations
#1623

EGTR: Extracting Graph from Transformer for Scene Graph Generation

Jinbae Im, JeongYeon Nam, Nokyung Park et al.

CVPR 2024arXiv:2404.02072
55
citations
#1624

Beyond Weisfeiler-Lehman: A Quantitative Framework for GNN Expressiveness

Bohang Zhang, Jingchu Gai, Yiheng Du et al.

ICLR 2024arXiv:2401.08514
55
citations
#1625

LaRa: Efficient Large-Baseline Radiance Fields

Anpei Chen, Haofei Xu, Stefano Esposito et al.

ECCV 2024arXiv:2407.04699
55
citations
#1626

UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling

Haoyu Lu, Yuqi Huo, Guoxing Yang et al.

ICLR 2024arXiv:2302.06605
55
citations
#1627

Auto-Regressive Next-Token Predictors are Universal Learners

Eran Malach

ICML 2024arXiv:2309.06979
55
citations
#1628

Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy

Yu Fu, Deyi Xiong, Yue Dong

AAAI 2024paperarXiv:2307.13808
55
citations
#1629

A Recipe for Scaling up Text-to-Video Generation with Text-free Videos

Xiang Wang, Shiwei Zhang, Hangjie Yuan et al.

CVPR 2024arXiv:2312.15770
55
citations
#1630

SALMON: Self-Alignment with Instructable Reward Models

Zhiqing Sun, Yikang Shen, Hongxin Zhang et al.

ICLR 2024arXiv:2310.05910
55
citations
#1631

Dynamic Evaluation of Large Language Models by Meta Probing Agents

Kaijie Zhu, Jindong Wang, Qinlin Zhao et al.

ICML 2024arXiv:2402.14865
55
citations
#1632

Seer: Language Instructed Video Prediction with Latent Diffusion Models

Xianfan Gu, Chuan Wen, Weirui Ye et al.

ICLR 2024oralarXiv:2303.14897
55
citations
#1633

See and Think: Embodied Agent in Virtual Environment

Zhonghan Zhao, Xuan Wang, Wenhao Chai et al.

ECCV 2024arXiv:2311.15209
54
citations
#1634

Shadows Don't Lie and Lines Can't Bend! Generative Models don't know Projective Geometry...for now

Ayush Sarkar, Hanlin Mai, Amitabh Mahapatra et al.

CVPR 2024arXiv:2311.17138
54
citations
#1635

SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation

Yamei Chen, Yan Di, Guangyao Zhai et al.

CVPR 2024arXiv:2311.11125
54
citations
#1636

Pruner-Zero: Evolving Symbolic Pruning Metric From Scratch for Large Language Models

Peijie Dong, Lujun Li, Zhenheng Tang et al.

ICML 2024arXiv:2406.02924
54
citations
#1637

Towards Text-guided 3D Scene Composition

Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin et al.

CVPR 2024arXiv:2312.08885
54
citations
#1638

Local Search GFlowNets

Minsu Kim, Yun Taeyoung, Emmanuel Bengio et al.

ICLR 2024spotlightarXiv:2310.02710
54
citations
#1639

Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation

Zihan Wang, Xiangyang Li, Jiahao Yang et al.

CVPR 2024highlightarXiv:2404.01943
54
citations
#1640

Open-Vocabulary Segmentation with Semantic-Assisted Calibration

Yong Liu, Sule Bai, Guanbin Li et al.

CVPR 2024arXiv:2312.04089
54
citations
#1641

Dense Optical Tracking: Connecting the Dots

Guillaume Le Moing, Jean Ponce, Cordelia Schmid

CVPR 2024highlightarXiv:2312.00786
54
citations
#1642

Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains

Junhong Shen, Neil Tenenholtz, James Hall et al.

ICML 2024arXiv:2402.05140
54
citations
#1643

Visual In-Context Prompting

Feng Li, Qing Jiang, Hao Zhang et al.

CVPR 2024arXiv:2311.13601
54
citations
#1644

VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models

Junlin Han, Filippos Kokkinos, Philip Torr

ECCV 2024arXiv:2403.12034
54
citations
#1645

SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World

Kiana Ehsani, Tanmay Gupta, Rose Hendrix et al.

CVPR 2024arXiv:2312.02976
54
citations
#1646

LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time

Sensitive Test Construction - Yucheng Li, Frank Guerin, Chenghua Lin

AAAI 2024paperarXiv:2312.12343
54
citations
#1647

A Semi-supervised Nighttime Dehazing Baseline with Spatial-Frequency Aware and Realistic Brightness Constraint

Xiaofeng Cong, Jie Gui, Jing Zhang et al.

CVPR 2024arXiv:2403.18548
54
citations
#1648

End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames

Shuming Liu, Chenlin Zhang, Chen Zhao et al.

CVPR 2024arXiv:2311.17241
54
citations
#1649

A Study of Bayesian Neural Network Surrogates for Bayesian Optimization

Yucen Li, Tim G. J. Rudner, Andrew Gordon Wilson

ICLR 2024arXiv:2305.20028
54
citations
#1650

Swallowing the Bitter Pill: Simplified Scalable Conformer Generation

Yuyang Wang, Ahmed Elhag, Navdeep Jaitly et al.

ICML 2024arXiv:2311.17932
54
citations
#1651

Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering

Zeyu Liu, Weicong Liang, Zhanhao Liang et al.

ECCV 2024arXiv:2403.09622
54
citations
#1652

Relightful Harmonization: Lighting-aware Portrait Background Replacement

Mengwei Ren, Wei Xiong, Jae Shin Yoon et al.

CVPR 2024arXiv:2312.06886
54
citations
#1653

MapTracker: Tracking with Strided Memory Fusion for Consistent Vector HD Mapping

Jiacheng Chen, Yuefan Wu, Tan Jiaqi et al.

ECCV 2024arXiv:2403.15951
54
citations
#1654

Embodied Understanding of Driving Scenarios

Yunsong Zhou, Linyan Huang, Qingwen Bu et al.

ECCV 2024arXiv:2403.04593
54
citations
#1655

MVSFormer++: Revealing the Devil in Transformer's Details for Multi-View Stereo

chenjie cao, xinlin ren, Yanwei Fu

ICLR 2024arXiv:2401.11673
54
citations
#1656

VideoStudio: Generating Consistent-Content and Multi-Scene Videos

Fuchen Long, Zhaofan Qiu, Ting Yao et al.

ECCV 2024arXiv:2401.01256
54
citations
#1657

OMNI: Open-endedness via Models of human Notions of Interestingness

Jenny Zhang, Joel Lehman, Kenneth Stanley et al.

ICLR 2024arXiv:2306.01711
54
citations
#1658

Panoptic Scene Graph Generation with Semantics-Prototype Learning

Li Li, Wei Ji, Yiming Wu et al.

AAAI 2024paperarXiv:2307.15567
54
citations
#1659

Relax Image-Specific Prompt Requirement in SAM: A Single Generic Prompt for Segmenting Camouflaged Objects

Jian Hu, Jiayi Lin, Shaogang Gong et al.

AAAI 2024paperarXiv:2312.07374
54
citations
#1660

GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh

Jing Wen, Xiaoming Zhao, Jason Ren et al.

CVPR 2024arXiv:2404.07991
54
citations
#1661

MultiDiff: Consistent Novel View Synthesis from a Single Image

Norman Müller, Katja Schwarz, Barbara Roessle et al.

CVPR 2024arXiv:2406.18524
53
citations
#1662

Consistent Prompting for Rehearsal-Free Continual Learning

Zhanxin Gao, Jun Cen, Xiaobin Chang

CVPR 2024arXiv:2403.08568
53
citations
#1663

Mechanistic Design and Scaling of Hybrid Architectures

Michael Poli, Armin Thomas, Eric Nguyen et al.

ICML 2024arXiv:2403.17844
53
citations
#1664

Visual Instruction Tuning with Polite Flamingo

Delong Chen, Jianfeng Liu, Wenliang Dai et al.

AAAI 2024paperarXiv:2307.01003
53
citations
#1665

GAMC: An Unsupervised Method for Fake News Detection Using Graph Autoencoder with Masking

Shu Yin, Peican Zhu, Lianwei Wu et al.

AAAI 2024paperarXiv:2312.05739
53
citations
#1666

Protein Discovery with Discrete Walk-Jump Sampling

Nathan Frey, Dan Berenberg, Karina Zadorozhny et al.

ICLR 2024arXiv:2306.12360
53
citations
#1667

EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion

Guangyao Zhai, Evin Pınar Örnek, Dave Zhenyu Chen et al.

ECCV 2024arXiv:2405.00915
53
citations
#1668

Training Diffusion Models Towards Diverse Image Generation with Reinforcement Learning

Zichen Miao, Jiang Wang, Ze Wang et al.

CVPR 2024
53
citations
#1669

GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes

Ibrahim Ethem Hamamci, Sezgin Er, Anjany Sekuboyina et al.

ECCV 2024arXiv:2305.16037
53
citations
#1670

MOTOR: A Time-to-Event Foundation Model For Structured Medical Records

Ethan Steinberg, Jason Fries, Yizhe Xu et al.

ICLR 2024oralarXiv:2301.03150
53
citations
#1671

Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation

Zhekai Du, Xinyao Li, Fengling Li et al.

CVPR 2024arXiv:2403.02899
53
citations
#1672

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge

Xuan Shen, Peiyan Dong, Lei Lu et al.

AAAI 2024paperarXiv:2312.05693
53
citations
#1673

Learning Multi-Agent Communication from Graph Modeling Perspective

Shengchao Hu, Li Shen, Ya Zhang et al.

ICLR 2024arXiv:2405.08550
53
citations
#1674

Towards Scalable 3D Anomaly Detection and Localization: A Benchmark via 3D Anomaly Synthesis and A Self-Supervised Learning Network

wenqiao Li, Xiaohao Xu, Yao Gu et al.

CVPR 2024arXiv:2311.14897
53
citations
#1675

EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering

Junjue Wang, Zhuo Zheng, Zihang Chen et al.

AAAI 2024paperarXiv:2312.12222
53
citations
#1676

SQLdepth: Generalizable Self-Supervised Fine-Structured Monocular Depth Estimation

Dong Wu, Mingmin Chi, Xuan Zang et al.

AAAI 2024paperarXiv:2309.00526
53
citations
#1677

Few-Shot Object Detection with Foundation Models

Guangxing Han, Ser-Nam Lim

CVPR 2024
53
citations
#1678

A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment

Tianhe Wu, Kede Ma, Jie Liang et al.

ECCV 2024arXiv:2403.10854
53
citations
#1679

Particle Denoising Diffusion Sampler

Angus Phillips, Hai-Dang Dau, Michael Hutchinson et al.

ICML 2024arXiv:2402.06320
53
citations
#1680

A Comparative Study of Image Restoration Networks for General Backbone Network Design

Xiangyu Chen, Zheyuan Li, Yuandong Pu et al.

ECCV 2024arXiv:2310.11881
53
citations
#1681

Prot2Text: Multimodal Protein’s Function Generation with GNNs and Transformers

Hadi Abdine, Michail Chatzianastasis, Costas Bouyioukos et al.

AAAI 2024paperarXiv:2307.14367
53
citations
#1682

Graph Neural Networks for Learning Equivariant Representations of Neural Networks

Miltiadis (Miltos) Kofinas, Boris Knyazev, Yan Zhang et al.

ICLR 2024arXiv:2403.12143
53
citations
#1683

WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models

Changhoon Kim, Kyle Min, Maitreya Patel et al.

CVPR 2024arXiv:2306.04744
53
citations
#1684

Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training

David Wan, Jaemin Cho, Elias Stengel-Eskin et al.

ECCV 2024arXiv:2403.02325
53
citations
#1685

DéjàVu: KV-cache Streaming for Fast, Fault-tolerant Generative LLM Serving

Foteini Strati, Sara McAllister, Amar Phanishayee et al.

ICML 2024arXiv:2403.01876
53
citations
#1686

Text-Image Alignment for Diffusion-Based Perception

Neehar Kondapaneni, Markus Marks, Manuel Knott et al.

CVPR 2024arXiv:2310.00031
53
citations
#1687

AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA

Weitao Feng, Wenbo Zhou, Jiyan He et al.

ICML 2024arXiv:2405.11135
53
citations
#1688

Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models

Shangbin Feng, Weijia Shi, Yuyang Bai et al.

ICLR 2024arXiv:2305.09955
53
citations
#1689

ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation

Suraj Patni, Aradhye Agarwal, Chetan Arora

CVPR 2024arXiv:2403.18807
53
citations
#1690

Direct2.5: Diverse Text-to-3D Generation via Multi-view 2.5D Diffusion

Yuanxun Lu, Jingyang Zhang, Shiwei Li et al.

CVPR 2024arXiv:2311.15980
53
citations
#1691

Learning to Embed Time Series Patches Independently

Seunghan Lee, Taeyoung Park, Kibok Lee

ICLR 2024arXiv:2312.16427
53
citations
#1692

SGNet: Structure Guided Network via Gradient-Frequency Awareness for Depth Map Super-resolution

Zhengxue Wang, Zhiqiang Yan, Jian Yang

AAAI 2024paperarXiv:2312.05799
53
citations
#1693

A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization

Sebastian Sanokowski, Sepp Hochreiter, Sebastian Lehner

ICML 2024arXiv:2406.01661
53
citations
#1694

Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark

Fangjun Li, David C. Hogg, Anthony G. Cohn

AAAI 2024paperarXiv:2401.03991
53
citations
#1695

Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector

Yuqian Fu, Yu Wang, Yixuan Pan et al.

ECCV 2024arXiv:2402.03094
53
citations
#1696

PointOBB: Learning Oriented Object Detection via Single Point Supervision

Junwei Luo, Xue Yang, Yi Yu et al.

CVPR 2024arXiv:2311.14757
53
citations
#1697

FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition

Ganggui Ding, Canyu Zhao, Wen Wang et al.

CVPR 2024arXiv:2405.13870
53
citations
#1698

Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations

Yufeng Huang, Jiji Tang, Zhuo Chen et al.

AAAI 2024paperarXiv:2305.06152
53
citations
#1699

FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models

Shivangi Aneja, Justus Thies, Angela Dai et al.

CVPR 2024arXiv:2312.08459
53
citations
#1700

ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities

CHENMING ZHU, Tai Wang, Wenwei Zhang et al.

ECCV 2024arXiv:2407.01525
52
citations
#1701

AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ

Jonas Belouadi, Anne Lauscher, Steffen Eger

ICLR 2024arXiv:2310.00367
52
citations
#1702

Sequential Neural Score Estimation: Likelihood-Free Inference with Conditional Score Based Diffusion Models

Louis Sharrock, Jack Simons, Song Liu et al.

ICML 2024spotlightarXiv:2210.04872
52
citations
#1703

Enhancing Multimodal Cooperation via Sample-level Modality Valuation

Yake Wei, Ruoxuan Feng, Zihe Wang et al.

CVPR 2024arXiv:2309.06255
52
citations
#1704

MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space

Yanru Qu, Keyue Qiu, Yuxuan Song et al.

ICML 2024arXiv:2404.12141
52
citations
#1705

S2WAT: Image Style Transfer via Hierarchical Vision Transformer Using Strips Window Attention

Chiyu Zhang, Xiaogang Xu, Lei Wang et al.

AAAI 2024paperarXiv:2210.12381
52
citations
#1706

SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation

Aysim Toker, Marvin Eisenberger, Daniel Cremers et al.

CVPR 2024arXiv:2403.16605
52
citations
#1707

Aleth-NeRF: Illumination Adaptive NeRF with Concealing Field Assumption

Ziteng Cui, Lin Gu, Xiao Sun et al.

AAAI 2024paperarXiv:2312.09093
52
citations
#1708

SFC: Shared Feature Calibration in Weakly Supervised Semantic Segmentation

AAAI 2024paperarXiv:2401.11719
52
citations
#1709

Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation

Haofeng Liu, Chenshu Xu, Yifei Yang et al.

CVPR 2024arXiv:2404.01050
52
citations
#1710

Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving

Junhao Zheng, Chenhao Lin, Jiahao Sun et al.

CVPR 2024arXiv:2403.17301
52
citations
#1711

Cooperative Graph Neural Networks

Ben Finkelshtein, Xingyue Huang, Michael Bronstein et al.

ICML 2024arXiv:2310.01267
52
citations
#1712

M3D: Dataset Condensation by Minimizing Maximum Mean Discrepancy

Hansong Zhang, Shikun Li, Pengju Wang et al.

AAAI 2024paperarXiv:2312.15927
52
citations
#1713

Intriguing Properties of Generative Classifiers

Priyank Jaini, Kevin Clark, Robert Geirhos

ICLR 2024spotlightarXiv:2309.16779
52
citations
#1714

Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention

Jie Ren, Yaxin Li, Shenglai Zeng et al.

ECCV 2024arXiv:2403.11052
52
citations
#1715

A Framework and Benchmark for Deep Batch Active Learning for Regression

David Holzmüller, Viktor Zaverkin, Johannes Kästner et al.

ICLR 2024arXiv:2203.09410
52
citations
#1716

Premise Order Matters in Reasoning with Large Language Models

Xinyun Chen, Ryan Chi, Xuezhi Wang et al.

ICML 2024arXiv:2402.08939
52
citations
#1717

Prompt to Transfer: Sim-to-Real Transfer for Traffic Signal Control with Prompt Learning

Longchao Da, Minquan Gao, Hua Wei et al.

AAAI 2024paperarXiv:2308.14284
52
citations
#1718

CFPL-FAS: Class Free Prompt Learning for Generalizable Face Anti-spoofing

Ajian Liu, Shuai Xue, Gan Jianwen et al.

CVPR 2024highlightarXiv:2403.14333
52
citations
#1719

ArtAdapter: Text-to-Image Style Transfer using Multi-Level Style Encoder and Explicit Adaptation

Dar-Yen Chen, Hamish Tennent, Ching-Wen Hsu

CVPR 2024arXiv:2312.02109
52
citations
#1720

GPAvatar: Generalizable and Precise Head Avatar from Image(s)

Xuangeng Chu, Yu Li, Ailing Zeng et al.

ICLR 2024arXiv:2401.10215
52
citations
#1721

Spatial Transform Decoupling for Oriented Object Detection

Hongtian Yu, Yunjie Tian, Qixiang Ye et al.

AAAI 2024paperarXiv:2308.10561
52
citations
#1722

Tree-Planner: Efficient Close-loop Task Planning with Large Language Models

Mengkang Hu, Yao Mu, Xinmiao Yu et al.

ICLR 2024arXiv:2310.08582
52
citations
#1723

Understanding the Role of the Projector in Knowledge Distillation

AAAI 2024paperarXiv:2303.11098
52
citations
#1724

Describing Differences in Image Sets with Natural Language

Lisa Dunlap, Yuhui Zhang, Xiaohan Wang et al.

CVPR 2024arXiv:2312.02974
52
citations
#1725

MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World

Yining Hong, Zishuo Zheng, Peihao Chen et al.

CVPR 2024arXiv:2401.08577
52
citations
#1726

BigGait: Learning Gait Representation You Want by Large Vision Models

Dingqiang Ye, Chao Fan, Jingzhe Ma et al.

CVPR 2024arXiv:2402.19122
52
citations
#1727

Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior

Zike Wu, Pan Zhou, YI Xuanyu et al.

CVPR 2024arXiv:2401.09050
52
citations
#1728

SubT-MRS Dataset: Pushing SLAM Towards All-weather Environments

Shibo Zhao, Yuanjun Gao, Tianhao Wu et al.

CVPR 2024arXiv:2307.07607
52
citations
#1729

Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs

shiyu xuan, Qingpei Guo, Ming Yang et al.

CVPR 2024arXiv:2310.00582
52
citations
#1730

Soft Contrastive Learning for Time Series

Seunghan Lee, Taeyoung Park, Kibok Lee

ICLR 2024oralarXiv:2312.16424
52
citations
#1731

DiffusionLight: Light Probes for Free by Painting a Chrome Ball

Pakkapon Phongthawee, Worameth Chinchuthakun, Nontaphat Sinsunthithet et al.

CVPR 2024arXiv:2312.09168
52
citations
#1732

AvatarGPT: All-in-One Framework for Motion Understanding Planning Generation and Beyond

Zixiang Zhou, Yu Wan, Baoyuan Wang

CVPR 2024
52
citations
#1733

UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation

Zexiang Liu, Yangguang Li, Youtian Lin et al.

ECCV 2024arXiv:2312.08754
51
citations
#1734

Privacy-Preserving In-Context Learning for Large Language Models

Tong Wu, Ashwinee Panda, Jiachen (Tianhao) Wang et al.

ICLR 2024arXiv:2305.01639
51
citations
#1735

JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation

Yu Zeng, Vishal M. Patel, Haochen Wang et al.

CVPR 2024arXiv:2407.06187
51
citations
#1736

High-Order Structure Based Middle-Feature Learning for Visible-Infrared Person Re-identification

Liuxiang Qiu, Si Chen, Yan Yan et al.

AAAI 2024paperarXiv:2312.07853
51
citations
#1737

CONFORM: Contrast is All You Need for High-Fidelity Text-to-Image Diffusion Models

Tuna Han Salih Meral, Enis Simsar, Federico Tombari et al.

CVPR 2024arXiv:2312.06059
51
citations
#1738

FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation

Shuai Yang, Yifan Zhou, Ziwei Liu et al.

CVPR 2024arXiv:2403.12962
51
citations
#1739

Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning

Chongyu Fan, Jiancheng Liu, Alfred Hero et al.

ECCV 2024arXiv:2403.07362
51
citations
#1740

DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection

Li Xiang, Junbo Yin, Wei Li et al.

AAAI 2024paperarXiv:2312.15742
51
citations
#1741

UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity

Jialong Zuo, Hanyu Zhou, Ying Nie et al.

CVPR 2024arXiv:2312.03441
51
citations
#1742

From Zero to Turbulence: Generative Modeling for 3D Flow Simulation

Marten Lienen, David Lüdke, Jan Hansen-Palmus et al.

ICLR 2024arXiv:2306.01776
51
citations
#1743

On the Test-Time Zero-Shot Generalization of Vision-Language Models: Do We Really Need Prompt Learning?

Maxime Zanella, Ismail Ben Ayed

CVPR 2024arXiv:2405.02266
51
citations
#1744

GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer

Ding Jia, Jianyuan Guo, Kai Han et al.

ICML 2024arXiv:2406.01210
51
citations
#1745

SelfIE: Self-Interpretation of Large Language Model Embeddings

Haozhe Chen, Carl Vondrick, Chengzhi Mao

ICML 2024arXiv:2403.10949
51
citations
#1746

Jack of All Tasks Master of Many: Designing General-Purpose Coarse-to-Fine Vision-Language Model

Shraman Pramanick, Guangxing Han, Rui Hou et al.

CVPR 2024highlightarXiv:2312.12423
51
citations
#1747

Score Regularized Policy Optimization through Diffusion Behavior

Huayu Chen, Cheng Lu, Zhengyi Wang et al.

ICLR 2024arXiv:2310.07297
51
citations
#1748

Tri-Perspective View Decomposition for Geometry-Aware Depth Completion

Zhiqiang Yan, Yuankai Lin, Kun Wang et al.

CVPR 2024arXiv:2403.15008
51
citations
#1749

PIGEON: Predicting Image Geolocations

Lukas Haas, Michal Skreta, Silas Alberti et al.

CVPR 2024highlightarXiv:2307.05845
51
citations
#1750

LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection

hongcheng Guo, Jian Yang, Jiaheng Liu et al.

AAAI 2024paperarXiv:2401.04749
51
citations
#1751

Neural Implicit Representation for Building Digital Twins of Unknown Articulated Objects

Yijia Weng, Bowen Wen, Jonathan Tremblay et al.

CVPR 2024arXiv:2404.01440
51
citations
#1752

Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models

Yabin Zhang, Wenjie Zhu, Hui Tang et al.

CVPR 2024arXiv:2403.17589
51
citations
#1753

Scaling Exponents Across Parameterizations and Optimizers

Katie Everett, Lechao Xiao, Mitchell Wortsman et al.

ICML 2024arXiv:2407.05872
51
citations
#1754

GraphBEV: Towards Robust BEV Feature Alignment for Multi-Modal 3D Object Detection

Ziying Song, Lei Yang, Shaoqing Xu et al.

ECCV 2024arXiv:2403.11848
51
citations
#1755

ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models

Xinyu Tian, Shu Zou, Zhaoyuan Yang et al.

CVPR 2024arXiv:2311.16494
51
citations
#1756

PIE-NeRF: Physics-based Interactive Elastodynamics with NeRF

Yutao Feng, Yintong Shang, Xuan Li et al.

CVPR 2024arXiv:2311.13099
51
citations
#1757

Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models

Jiayi Guo, Xingqian Xu, Yifan Pu et al.

CVPR 2024arXiv:2312.04410
51
citations
#1758

Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes

Nabeel Seedat, Nicolas Huynh, Boris van Breugel et al.

ICML 2024arXiv:2312.12112
51
citations
#1759

ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions

Anindita Ghosh, Rishabh Dabral, Vladislav Golyanik et al.

ECCV 2024arXiv:2311.17057
51
citations
#1760

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Linjiang Huang, Rongyao Fang, Aiping Zhang et al.

ECCV 2024arXiv:2403.12963
51
citations
#1761

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation

Lanqing Guo, Yingqing He, Haoxin Chen et al.

ECCV 2024arXiv:2402.10491
51
citations
#1762

LCM-Lookahead for Encoder-based Text-to-Image Personalization

Rinon Gal, Or Lichter, Elad Richardson et al.

ECCV 2024arXiv:2404.03620
51
citations
#1763

Deep Patch Visual SLAM

Lahav Lipson, Zachary Teed, Jia Deng

ECCV 2024arXiv:2408.01654
51
citations
#1764

Language-driven All-in-one Adverse Weather Removal

Hao Yang, Liyuan Pan, Yan Yang et al.

CVPR 2024arXiv:2312.01381
51
citations
#1765

MVHumanNet: A Large-scale Dataset of Multi-view Daily Dressing Human Captures

Zhangyang Xiong, Chenghong Li, Kenkun Liu et al.

CVPR 2024arXiv:2312.02963
51
citations
#1766

ViTamin: Designing Scalable Vision Models in the Vision-Language Era

Jieneng Chen, Qihang Yu, Xiaohui Shen et al.

CVPR 2024arXiv:2404.02132
51
citations
#1767

GVGEN: Text-to-3D Generation with Volumetric Representation

Xianglong He, Junyi Chen, Sida Peng et al.

ECCV 2024arXiv:2403.12957
51
citations
#1768

Retrieval-Augmented Egocentric Video Captioning

Jilan Xu, Yifei Huang, Junlin Hou et al.

CVPR 2024arXiv:2401.00789
51
citations
#1769

One-Shot Open Affordance Learning with Foundation Models

Gen Li, Deqing Sun, Laura Sevilla-Lara et al.

CVPR 2024arXiv:2311.17776
50
citations
#1770

Strong Baselines for Parameter-Efficient Few-Shot Fine-Tuning

Samyadeep Basu, Shell Hu, Daniela Massiceti et al.

AAAI 2024paperarXiv:2304.01917
50
citations
#1771

BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation

Peng Xu, Wenqi Shao, Mengzhao Chen et al.

ICLR 2024arXiv:2402.16880
50
citations
#1772

CUTS+: High-Dimensional Causal Discovery from Irregular Time-Series

Yuxiao Cheng, Lianglong Li, Tingxiong Xiao et al.

AAAI 2024paperarXiv:2305.05890
50
citations
#1773

An Efficient Membership Inference Attack for the Diffusion Model by Proximal Initialization

Fei Kong, Jinhao Duan, ruipeng ma et al.

ICLR 2024arXiv:2305.18355
50
citations
#1774

Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications

Junyi Ma, Xieyuanli Chen, Jiawei Huang et al.

CVPR 2024arXiv:2311.17663
50
citations
#1775

Discovering and Mitigating Visual Biases through Keyword Explanation

Younghyun Kim, Sangwoo Mo, Minkyu Kim et al.

CVPR 2024highlightarXiv:2301.11104
50
citations
#1776

EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models

Sijie Cheng, Zhicheng Guo, Jingwen Wu et al.

CVPR 2024highlightarXiv:2311.15596
50
citations
#1777

Neural Markov Random Field for Stereo Matching

Tongfan Guan, Chen Wang, Yun-Hui Liu

CVPR 2024arXiv:2403.11193
50
citations
#1778

SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection

JUNSU KIM, Hoseong Cho, Jihyeon Kim et al.

CVPR 2024highlightarXiv:2402.17323
50
citations
#1779

DAP: A Dynamic Adversarial Patch for Evading Person Detectors

Amira Guesmi, Ruitian Ding, Muhammad Abdullah Hanif et al.

CVPR 2024arXiv:2305.11618
50
citations
#1780

Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response

Junfeng Long, ZiRui Wang, Quanyi Li et al.

ICLR 2024arXiv:2312.11460
50
citations
#1781

Towards Memorization-Free Diffusion Models

Chen Chen, Daochang Liu, Chang Xu

CVPR 2024arXiv:2404.00922
50
citations
#1782

Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection

Christos Koutlis, Symeon Papadopoulos

ECCV 2024arXiv:2402.19091
50
citations
#1783

Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model

Lingjun Zhang, Xinyuan Chen, Yaohui Wang et al.

AAAI 2024paperarXiv:2312.12232
50
citations
#1784

MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning

Chaoyi Zhang, Kevin Lin, Zhengyuan Yang et al.

CVPR 2024highlightarXiv:2311.17435
50
citations
#1785

Matching Anything by Segmenting Anything

Siyuan Li, Lei Ke, Martin Danelljan et al.

CVPR 2024highlightarXiv:2406.04221
50
citations
#1786

CoMo: Controllable Motion Generation through Language Guided Pose Code Editing

Yiming Huang, WEILIN WAN, Yue Yang et al.

ECCV 2024arXiv:2403.13900
50
citations
#1787

Large-Vocabulary 3D Diffusion Model with Transformer

Ziang Cao, Fangzhou Hong, Tong Wu et al.

ICLR 2024arXiv:2309.07920
50
citations
#1788

Dynamic Semantic-Based Spatial Graph Convolution Network for Skeleton-Based Human Action Recognition

Jianyang Xie, Yanda Meng, Yitian Zhao et al.

AAAI 2024paper
50
citations
#1789

SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer

Zijie Wu, Chaohui Yu, Yanqin Jiang et al.

ECCV 2024arXiv:2404.03736
50
citations
#1790

Patched Denoising Diffusion Models For High-Resolution Image Synthesis

Zheng Ding, Mengqi Zhang, Jiajun Wu et al.

ICLR 2024arXiv:2308.01316
50
citations
#1791

Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration

Mingyuan Meng, Dagan Feng, Lei Bi et al.

CVPR 2024arXiv:2406.00123
50
citations
#1792

Machine Unlearning for Image-to-Image Generative Models

Guihong Li, Hsiang Hsu, Chun-Fu Chen et al.

ICLR 2024arXiv:2402.00351
50
citations
#1793

Diff-Plugin: Revitalizing Details for Diffusion-based Low-level Tasks

Yuhao Liu, Zhanghan Ke, Fang Liu et al.

CVPR 2024arXiv:2403.00644
50
citations
#1794

LLM Augmented LLMs: Expanding Capabilities through Composition

Rachit Bansal, Bidisha Samanta, Siddharth Dalmia et al.

ICLR 2024arXiv:2401.02412
50
citations
#1795

MatSynth: A Modern PBR Materials Dataset

Giuseppe Vecchio, Valentin Deschaintre

CVPR 2024arXiv:2401.06056
50
citations
#1796

MaskClustering: View Consensus based Mask Graph Clustering for Open-Vocabulary 3D Instance Segmentation

Mi Yan, Jiazhao Zhang, Yan Zhu et al.

CVPR 2024arXiv:2401.07745
50
citations
#1797

ReMamber: Referring Image Segmentation with Mamba Twister

Yuhuan Yang, Chaofan Ma, Jiangchao Yao et al.

ECCV 2024arXiv:2403.17839
50
citations
#1798

MatFuse: Controllable Material Generation with Diffusion Models

Giuseppe Vecchio, Renato Sortino, Simone Palazzo et al.

CVPR 2024arXiv:2308.11408
50
citations
#1799

Protein Conformation Generation via Force-Guided SE(3) Diffusion Models

YAN WANG, Lihao Wang, Yuning Shen et al.

ICML 2024arXiv:2403.14088
50
citations
#1800

Frozen Transformers in Language Models Are Effective Visual Encoder Layers

Ziqi Pang, Ziyang Xie, Yunze Man et al.

ICLR 2024oralarXiv:2310.12973
50
citations