Most Cited 2024 "exponential lower bounds" Papers

12,324 papers found • Page 7 of 62

#1201

SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion

Hsuan-I Ho, Jie Song, Otmar Hilliges

CVPR 2024 • arXiv:2311.15855
74
citations
#1202

ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation

Mengcheng Lan, Chaofeng Chen, Yiping Ke et al.

ECCV 2024 • arXiv:2408.04883
74
citations
#1203

Polynormer: Polynomial-Expressive Graph Transformer in Linear Time

Chenhui Deng, Zichao Yue, Zhiru Zhang

ICLR 2024 • arXiv:2403.01232
74
citations
#1204

GenTron: Diffusion Transformers for Image and Video Generation

Shoufa Chen, Mengmeng Xu, Jiawei Ren et al.

CVPR 2024 • arXiv:2312.04557
74
citations
#1205

Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs

Yuxin Zhang, Lirui Zhao, Mingbao Lin et al.

ICLR 2024 • arXiv:2310.08915
74
citations
#1206

Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis

Ziyue Jiang, Jinglin Liu, Yi Ren et al.

ICLR 2024 • arXiv:2307.07218
74
citations
#1207

Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance

Donghoon Ahn, Hyoungwon Cho, Jaewon Min et al.

ECCV 2024 • arXiv:2403.17377
74
citations
#1208

DITTO: Diffusion Inference-Time T-Optimization for Music Generation

Zachary Novack, Julian McAuley, Taylor Berg-Kirkpatrick et al.

ICML 2024 • arXiv:2401.12179
74
citations
#1209

Integer-Valued Training and Spike-driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection

Xinhao Luo, Man Yao, Yuhong Chou et al.

ECCV 2024 • arXiv:2407.20708
74
citations
#1210

MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model

Muyao Niu, Xiaodong Cun, Xintao Wang et al.

ECCV 2024 • arXiv:2405.20222
74
citations
#1211

RegionGPT: Towards Region Understanding Vision Language Model

Qiushan Guo, Shalini De Mello, Danny Yin et al.

CVPR 2024 • arXiv:2403.02330
73
citations
#1212

Stay on Topic with Classifier-Free Guidance

Guillaume Sanchez, Alexander Spangher, Honglu Fan et al.

ICML 2024 • spotlight • arXiv:2306.17806
73
citations
#1213

EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models

Yefei He, Jing Liu, Weijia Wu et al.

ICLR 2024 • oral • arXiv:2310.03270
73
citations
#1214

Depth Information Assisted Collaborative Mutual Promotion Network for Single Image Dehazing

Yafei Zhang, Shen Zhou, Huafeng Li

CVPR 2024 • arXiv:2403.01105
73
citations
#1215

LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection

Yunpeng Luo, Junlong Du, Ke Yan et al.

CVPR 2024 • arXiv:2403.17465
73
citations
#1216

DE-COP: Detecting Copyrighted Content in Language Models Training Data

André Duarte, Xuandong Zhao, Arlindo Oliveira et al.

ICML 2024 • arXiv:2402.09910
73
citations
#1217

SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks

Jiwon Song, Kyungseok Oh, Taesu Kim et al.

ICML 2024 • arXiv:2402.09025
73
citations
#1218

Generating Human Interaction Motions in Scenes with Text Control

Hongwei Yi, Justus Thies, Michael J. Black et al.

ECCV 2024 • arXiv:2404.10685
73
citations
#1219

DistiLLM: Towards Streamlined Distillation for Large Language Models

Jongwoo Ko, Sungnyun Kim, Tianyi Chen et al.

ICML 2024 • arXiv:2402.03898
73
citations
#1220

ConsistNet: Enforcing 3D Consistency for Multi-view Images Diffusion

Jiayu Yang, Ziang Cheng, Yunfei Duan et al.

CVPR 2024 • arXiv:2310.10343
73
citations
#1221

Relay Diffusion: Unifying diffusion process across resolutions for image synthesis

Jiayan Teng, Wendi Zheng, Ming Ding et al.

ICLR 2024 • spotlight • arXiv:2309.03350
73
citations
#1222

Guidance with Spherical Gaussian Constraint for Conditional Diffusion

Lingxiao Yang, Shutong Ding, Yifan Cai et al.

ICML 2024 • arXiv:2402.03201
73
citations
#1223

SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentiation

Malyaban Bal, Abhronil Sengupta

AAAI 2024 • paper • arXiv:2308.10873
73
citations
#1224

VideoPrism: A Foundational Visual Encoder for Video Understanding

Long Zhao, Nitesh Bharadwaj Gundavarapu, Liangzhe Yuan et al.

ICML 2024 • arXiv:2402.13217
73
citations
#1225

VISTA-LLAMA: Reducing Hallucination in Video Language Models via Equal Distance to Visual Tokens

Fan Ma, Xiaojie Jin, Heng Wang et al.

CVPR 2024 • arXiv:2312.08870
73
citations
#1226

PromptTTS 2: Describing and Generating Voices with Text Prompt

Yichong Leng, Zhifang Guo, Kai Shen et al.

ICLR 2024 • arXiv:2309.02285
73
citations
#1227

Teaching Large Language Models to Translate with Comparison

Jiali Zeng, Fandong Meng, Yongjing Yin et al.

AAAI 2024 • paper • arXiv:2307.04408
73
citations
#1228

Overthinking the Truth: Understanding how Language Models Process False Demonstrations

Danny Halawi, Jean-Stanislas Denain, Jacob Steinhardt

ICLR 2024 • spotlight • arXiv:2307.09476
73
citations
#1229

DreamScene: 3D Gaussian-based Text-to-3D Scene Generation via Formation Pattern Sampling

Haoran Li, Haolin Shi, Wenli Zhang et al.

ECCV 2024 • arXiv:2404.03575
73
citations
#1230

MoCha-Stereo: Motif Channel Attention Network for Stereo Matching

Ziyang Chen, Wei Long, He Yao et al.

CVPR 2024 • arXiv:2404.06842
73
citations
#1231

LLMs are Good Sign Language Translators

Jia Gong, Lin Geng Foo, Yixuan He et al.

CVPR 2024 • arXiv:2404.00925
73
citations
#1232

EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion

Zehuan Huang, Hao Wen, Junting Dong et al.

CVPR 2024 • arXiv:2312.06725
72
citations
#1233

SkeletonGait: Gait Recognition Using Skeleton Maps

Chao Fan, Jingzhe Ma, Dongyang Jin et al.

AAAI 2024 • paper • arXiv:2311.13444
72
citations
#1234

Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation

Guy Yariv, Itai Gat, Sagie Benaim et al.

AAAI 2024 • paper • arXiv:2309.16429
72
citations
#1235

Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum

Zhengliang Shi, Shen Gao, Minghang Zhu et al.

AAAI 2024 • paper • arXiv:2308.14034
72
citations
#1236

Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology

Wenhao Tang, Fengtao Zhou, Sheng Huang et al.

CVPR 2024 • arXiv:2402.17228
72
citations
#1237

SMFANet: A Lightweight Self-Modulation Feature Aggregation Network for Efficient Image Super-Resolution

Mingjun Zheng, Long Sun, Jiangxin Dong et al.

ECCV 2024
72
citations
#1238

Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation

Zhenliang Ni, Xinghao Chen, Yingjie Zhai et al.

ECCV 2024 • arXiv:2405.06228
72
citations
#1239

GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation

Mukul Khanna, Ram Ramrakhya, Gunjan Chhablani et al.

CVPR 2024 • arXiv:2404.06609
72
citations
#1240

ChatPose: Chatting about 3D Human Pose

Yao Feng, Jing Lin, Sai Kumar Dwivedi et al.

CVPR 2024 • arXiv:2311.18836
72
citations
#1241

WAVES: Benchmarking the Robustness of Image Watermarks

Bang An, Mucong Ding, Tahseen Rabbani et al.

ICML 2024 • arXiv:2401.08573
72
citations
#1242

Vanilla Bayesian Optimization Performs Great in High Dimensions

Carl Hvarfner, Erik Hellsten, Luigi Nardi

ICML 2024 • arXiv:2402.02229
72
citations
#1243

OpenBias: Open-set Bias Detection in Text-to-Image Generative Models

Moreno D'Incà, Elia Peruzzo et al.

CVPR 2024 • highlight • arXiv:2404.07990
72
citations
#1244

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving

Honghui Yang, Sha Zhang, Di Huang et al.

CVPR 2024 • arXiv:2310.08370
72
citations
#1245

Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images

Kuofeng Gao, Yang Bai, Jindong Gu et al.

ICLR 2024 • oral • arXiv:2401.11170
72
citations
#1246

Fast Adversarial Attacks on Language Models In One GPU Minute

Vinu Sankar Sadasivan, Shoumik Saha, Gaurang Sriramanan et al.

ICML 2024 • arXiv:2402.15570
72
citations
#1247

HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors

Xiao Wang, Zongzhen Wu, Bo Jiang et al.

AAAI 2024 • paper • arXiv:2211.09648
72
citations
#1248

GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting

Xinjie Zhang, Xingtong Ge, Tongda Xu et al.

ECCV 2024 • arXiv:2403.08551
72
citations
#1249

OmniGlue: Generalizable Feature Matching with Foundation Model Guidance

Hanwen Jiang, Arjun Karpur, Bingyi Cao et al.

CVPR 2024 • arXiv:2405.12979
72
citations
#1250

Enhancing Job Recommendation through LLM-Based Generative Adversarial Networks

Yingpeng Du, Di Luo, Rui Yan et al.

AAAI 2024 • paper • arXiv:2307.10747
72
citations
#1251

From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Evonne Ng, Javier Romero, Timur Bagautdinov et al.

CVPR 2024 • arXiv:2401.01885
72
citations
#1252

In-Context Learning through the Bayesian Prism

Madhur Panwar, Kabir Ahuja, Navin Goyal

ICLR 2024 • arXiv:2306.04891
72
citations
#1253

Learning to Model the World With Language

Jessy Lin, Yuqing Du, Olivia Watkins et al.

ICML 2024 • arXiv:2308.01399
71
citations
#1254

Exploiting Style Latent Flows for Generalizing Deepfake Video Detection

Jongwook Choi, Taehoon Kim, Yonghyun Jeong et al.

CVPR 2024 • arXiv:2403.06592
71
citations
#1255

Frequency-Aware Transformer for Learned Image Compression

Han Li, Shaohui Li, Wenrui Dai et al.

ICLR 2024 • arXiv:2310.16387
71
citations
#1256

TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection

Hao Sun, Mingyao Zhou, Wenjing Chen et al.

AAAI 2024 • paper • arXiv:2401.02309
71
citations
#1257

Plug-In Diffusion Model for Sequential Recommendation

Haokai Ma, Ruobing Xie, Lei Meng et al.

AAAI 2024 • paper • arXiv:2401.02913
71
citations
#1258

TorchRL: A data-driven decision-making library for PyTorch

Albert Bou, Matteo Bettini, Sebastian Dittert et al.

ICLR 2024 • spotlight • arXiv:2306.00577
71
citations
#1259

HDMixer: Hierarchical Dependency with Extendable Patch for Multivariate Time Series Forecasting

Qihe Huang, Lei Shen, Ruixin Zhang et al.

AAAI 2024 • paper
71
citations
#1260

Towards Generalizable Tumor Synthesis

Qi Chen, Xiaoxi Chen, Haorui Song et al.

CVPR 2024 • arXiv:2402.19470
71
citations
#1261

Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation

Zhewei Yao, Xiaoxia Wu, Cheng Li et al.

AAAI 2024 • paper • arXiv:2303.08302
71
citations
#1262

Frequency-Spatial Entanglement Learning for Camouflaged Object Detection

Yanguang Sun, Chunyan Xu, Jian Yang et al.

ECCV 2024 • arXiv:2409.01686
71
citations
#1263

Scalable Diffusion for Materials Generation

Sherry Yang, Kwanghwan Cho, Amil Merchant et al.

ICLR 2024 • arXiv:2311.09235
71
citations
#1264

End-to-End Rate-Distortion Optimized 3D Gaussian Representation

Henan Wang, Hanxin Zhu, Tianyu He et al.

ECCV 2024 • arXiv:2406.01597
71
citations
#1265

OneRestore: A Universal Restoration Framework for Composite Degradation

Yu Guo, Yuan Gao, Yuxu Lu et al.

ECCV 2024 • arXiv:2407.04621
71
citations
#1266

DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving

Chen Min, Dawei Zhao, Liang Xiao et al.

CVPR 2024 • arXiv:2405.04390
71
citations
#1267

DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing

Minghao Chen, Iro Laina, Andrea Vedaldi

ECCV 2024 • arXiv:2404.18929
71
citations
#1268

When Do We Not Need Larger Vision Models?

Baifeng Shi, Ziyang Wu, Maolin Mao et al.

ECCV 2024 • arXiv:2403.13043
71
citations
#1269

Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion

Litu Rout, Yujia Chen, Abhishek Kumar et al.

CVPR 2024 • arXiv:2312.00852
71
citations
#1270

GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning

Ye Yuan, Xueting Li, Yangyi Huang et al.

CVPR 2024 • highlight • arXiv:2312.11461
71
citations
#1271

Simple Hierarchical Planning with Diffusion

Chang Chen, Fei Deng, Kenji Kawaguchi et al.

ICLR 2024 • oral • arXiv:2401.02644
71
citations
#1272

InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining

Boxin Wang, Wei Ping, Lawrence McAfee et al.

ICML 2024 • arXiv:2310.07713
70
citations
#1273

MindBridge: A Cross-Subject Brain Decoding Framework

Shizun Wang, Songhua Liu, Zhenxiong Tan et al.

CVPR 2024 • highlight • arXiv:2404.07850
70
citations
#1274

Accelerating Convergence of Score-Based Diffusion Models, Provably

Gen Li, Yu Huang, Timofey Efimov et al.

ICML 2024 • arXiv:2403.03852
70
citations
#1275

Learning a Diffusion Model Policy from Rewards via Q-Score Matching

Michael Psenka, Alejandro Escontrela, Pieter Abbeel et al.

ICML 2024 • arXiv:2312.11752
70
citations
#1276

Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization

Yiyang Chen, Zhedong Zheng, Wei Ji et al.

ICLR 2024 • arXiv:2211.07394
70
citations
#1277

Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining

Licong Lin, Yu Bai, Song Mei

ICLR 2024 • arXiv:2310.08566
70
citations
#1278

Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding

Zhihao Yuan, Jinke Ren, Chun-Mei Feng et al.

CVPR 2024 • arXiv:2311.15383
70
citations
#1279

InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior

Chenguo Lin, Yadong Mu

ICLR 2024 • spotlight • arXiv:2402.04717
70
citations
#1280

Harnessing Large Language Models for Training-free Video Anomaly Detection

Luca Zanella, Willi Menapace, Massimiliano Mancini et al.

CVPR 2024 • arXiv:2404.01014
70
citations
#1281

DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting

Shijie Zhou, Zhiwen Fan, Dejia Xu et al.

ECCV 2024 • arXiv:2404.06903
70
citations
#1282

DiffDA: a Diffusion model for weather-scale Data Assimilation

Langwen Huang, Lukas Gianinazzi, Yuejiang Yu et al.

ICML 2024 • arXiv:2401.05932
70
citations
#1283

On the Learnability of Watermarks for Language Models

Chenchen Gu, Xiang Li, Percy Liang et al.

ICLR 2024 • arXiv:2312.04469
70
citations
#1284

CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion

Wendi Zheng, Jiayan Teng, Zhuoyi Yang et al.

ECCV 2024 • arXiv:2403.05121
70
citations
#1285

Circumventing Concept Erasure Methods For Text-To-Image Generative Models

Minh Pham, Kelly Marshall, Niv Cohen et al.

ICLR 2024 • arXiv:2308.01508
70
citations
#1286

Detector-Free Structure from Motion

Xingyi He, Jiaming Sun, Yifan Wang et al.

CVPR 2024 • arXiv:2306.15669
70
citations
#1287

ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models

Lukas Höllein, Aljaž Božič, Norman Müller et al.

CVPR 2024 • arXiv:2403.01807
69
citations
#1288

NeRF-LiDAR: Generating Realistic LiDAR Point Clouds with Neural Radiance Fields

Junge Zhang, Feihu Zhang, Shaochen Kuang et al.

AAAI 2024 • paper • arXiv:2304.14811
69
citations
#1289

DOGE: Domain Reweighting with Generalization Estimation

Simin Fan, Matteo Pagliardini, Martin Jaggi

ICML 2024 • arXiv:2310.15393
69
citations
#1290

FINER: Flexible Spectral-bias Tuning in Implicit NEural Representation by Variable-periodic Activation Functions

Zhen Liu, Hao Zhu, Qi Zhang et al.

CVPR 2024 • arXiv:2312.02434
69
citations
#1291

Learning to Rank in Generative Retrieval

Yongqi Li, Nan Yang, Liang Wang et al.

AAAI 2024 • paper • arXiv:2306.15222
69
citations
#1292

How do Language Models Bind Entities in Context?

Jiahai Feng, Jacob Steinhardt

ICLR 2024 • arXiv:2310.17191
69
citations
#1293

Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis

Basile Van Hoorick, Rundi Wu, Ege Ozguroglu et al.

ECCV 2024 • arXiv:2405.14868
69
citations
#1294

LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

Yunsheng Ma, Can Cui, Xu Cao et al.

CVPR 2024 • arXiv:2312.04372
69
citations
#1295

Deep Confident Steps to New Pockets: Strategies for Docking Generalization

Gabriele Corso, Arthur Deng, Nicholas Polizzi et al.

ICLR 2024 • arXiv:2402.18396
69
citations
#1296

Learning Content-Enhanced Mask Transformer for Domain Generalized Urban-Scene Segmentation

Qi Bi, Shaodi You, Theo Gevers

AAAI 2024 • paper • arXiv:2307.00371
69
citations
#1297

VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model

Pengying Wu, Yao Mu, Bingxian Wu et al.

ICML 2024 • arXiv:2401.02695
69
citations
#1298

Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval

Jiamian Wang, Guohao Sun, Pichao Wang et al.

CVPR 2024 • highlight • arXiv:2403.17998
69
citations
#1299

DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing

Kaiwen Zhang, Yifan Zhou, Xudong Xu et al.

CVPR 2024 • arXiv:2312.07409
69
citations
#1300

Generative-Based Fusion Mechanism for Multi-Modal Tracking

Zhangyong Tang, Tianyang Xu, Xiaojun Wu et al.

AAAI 2024 • paper • arXiv:2309.01728
69
citations
#1301

Free3D: Consistent Novel View Synthesis without 3D Representation

Chuanxia Zheng, Andrea Vedaldi

CVPR 2024 • arXiv:2312.04551
69
citations
#1302

Generating Images of Rare Concepts Using Pre-trained Diffusion Models

Dvir Samuel, Rami Ben-Ari, Simon Raviv et al.

AAAI 2024 • paper • arXiv:2304.14530
69
citations
#1303

AVID: Any-Length Video Inpainting with Diffusion Model

Zhixing Zhang, Bichen Wu, Xiaoyan Wang et al.

CVPR 2024 • arXiv:2312.03816
69
citations
#1304

Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram

Yeongyeon Na, Minje Park, Yunwon Tae et al.

ICLR 2024 • oral • arXiv:2402.09450
69
citations
#1305

SolidGen: An Autoregressive Model for Direct B-rep Synthesis

Karl Willis, Joseph Lambourne, Nigel Morris et al.

ICLR 2024
69
citations
#1306

Task-Customized Mixture of Adapters for General Image Fusion

Pengfei Zhu, Yang Sun, Bing Cao et al.

CVPR 2024 • arXiv:2403.12494
69
citations
#1307

Open-Vocabulary Video Anomaly Detection

Peng Wu, Xuerong Zhou, Guansong Pang et al.

CVPR 2024 • arXiv:2311.07042
69
citations
#1308

Aligning and Prompting Everything All at Once for Universal Visual Perception

Yunhang Shen, Chaoyou Fu, Peixian Chen et al.

CVPR 2024 • arXiv:2312.02153
69
citations
#1309

EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis

Shuai Tan, Bin Ji, Mengxiao Bi et al.

ECCV 2024 • arXiv:2404.01647
69
citations
#1310

OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models

Kong Zhe, Yong Zhang, Tianyu Yang et al.

ECCV 2024 • arXiv:2403.10983
69
citations
#1311

Learning to Unlearn: Instance-Wise Unlearning for Pre-trained Classifiers

Sungmin Cha, Sungjun Cho, Dasol Hwang et al.

AAAI 2024 • paper • arXiv:2301.11578
69
citations
#1312

Asymmetry in Low-Rank Adapters of Foundation Models

Jiacheng Zhu, Kristjan Greenewald, Kimia Nadjahi et al.

ICML 2024 • arXiv:2402.16842
68
citations
#1313

Tensor Programs VI: Feature Learning in Infinite Depth Neural Networks

Greg Yang, Dingli Yu, Chen Zhu et al.

ICLR 2024 • arXiv:2310.02244
68
citations
#1314

Scaling Laws for Data Filtering— Data Curation cannot be Compute Agnostic

Sachin Goyal, Pratyush Maini, Zachary Lipton et al.

CVPR 2024 • arXiv:2404.07177
68
citations
#1315

Optimizing Diffusion Noise Can Serve As Universal Motion Priors

Korrawe Karunratanakul, Konpat Preechakul, Emre Aksan et al.

CVPR 2024 • arXiv:2312.11994
68
citations
#1316

Recursive Generalization Transformer for Image Super-Resolution

Zheng Chen, Yulun Zhang, Jinjin Gu et al.

ICLR 2024 • arXiv:2303.06373
68
citations
#1317

Stochastic Interpolants with Data-Dependent Couplings

Michael Albergo, Mark Goldstein, Nicholas Boffi et al.

ICML 2024 • spotlight • arXiv:2310.03725
68
citations
#1318

Towards Realistic Scene Generation with LiDAR Diffusion Models

Haoxi Ran, Vitor Guizilini, Yue Wang

CVPR 2024 • arXiv:2404.00815
68
citations
#1319

OmniSeg3D: Omniversal 3D Segmentation via Hierarchical Contrastive Learning

Haiyang Ying, Yixuan Yin, Jinzhi Zhang et al.

CVPR 2024 • arXiv:2311.11666
68
citations
#1320

Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones et al.

ICLR 2024 • arXiv:2309.15098
68
citations
#1321

QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models

Jing Liu, Ruihao Gong, Xiuying Wei et al.

ICLR 2024 • arXiv:2310.08041
68
citations
#1322

Grokking as the transition from lazy to rich training dynamics

Tanishq Kumar, Blake Bordelon, Samuel Gershman et al.

ICLR 2024 • arXiv:2310.06110
68
citations
#1323

Multi-Source Diffusion Models for Simultaneous Music Generation and Separation

Giorgio Mariani, Irene Tallini, Emilian Postolache et al.

ICLR 2024 • arXiv:2302.02257
68
citations
#1324

BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP

Jiawang Bai, Kuofeng Gao, Shaobo Min et al.

CVPR 2024 • arXiv:2311.16194
68
citations
#1325

Scaling for Training Time and Post-hoc Out-of-distribution Detection Enhancement

Kai Xu, Rongyu Chen, Gianni Franchi et al.

ICLR 2024 • arXiv:2310.00227
68
citations
#1326

MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning

Vishal Nedungadi, Ankit Kariryaa, Stefan Oehmcke et al.

ECCV 2024 • arXiv:2405.02771
68
citations
#1327

PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution

Honghao Chen, Xiangxiang Chu, Renyongjian et al.

CVPR 2024 • arXiv:2403.07589
68
citations
#1328

BAT: Behavior-Aware Human-Like Trajectory Prediction for Autonomous Driving

Haicheng Liao, Zhenning Li, Huanming Shen et al.

AAAI 2024 • paper • arXiv:2312.06371
67
citations
#1329

TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos

Yufu Wang, Ziyun Wang, Lingjie Liu et al.

ECCV 2024 • arXiv:2403.17346
67
citations
#1330

Deep Temporal Graph Clustering

Meng Liu, Yue Liu, Ke Liang et al.

ICLR 2024 • oral • arXiv:2305.10738
67
citations
#1331

MoReVQA: Exploring Modular Reasoning Models for Video Question Answering

Juhong Min, Shyamal Buch, Arsha Nagrani et al.

CVPR 2024 • arXiv:2404.06511
67
citations
#1332

Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models

Chang Liu, Haoning Wu, Yujie Zhong et al.

CVPR 2024 • arXiv:2306.00973
67
citations
#1333

Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery

Sukrut Rao, Sweta Mahajan, Moritz Böhle et al.

ECCV 2024 • arXiv:2407.14499
67
citations
#1334

Gaussian Shell Maps for Efficient 3D Human Generation

Rameen Abdal, Wang Yifan, Zifan Shi et al.

CVPR 2024 • arXiv:2311.17857
67
citations
#1335

An Aggregation-Free Federated Learning for Tackling Data Heterogeneity

Yuan Wang, Huazhu Fu, Renuga Kanagavelu et al.

CVPR 2024 • arXiv:2404.18962
67
citations
#1336

CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation

Bo-Yuan Sun, Yuqi Yang, Le Zhang et al.

CVPR 2024 • arXiv:2306.04300
67
citations
#1337

LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery

Pingchuan Ma, Johnson Tsun-Hsuan Wang, Minghao Guo et al.

ICML 2024 • arXiv:2405.09783
67
citations
#1338

Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition

Feng Lu, Lijun Zhang, Xiangyuan Lan et al.

ICLR 2024 • arXiv:2402.14505
67
citations
#1339

SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks

Xinyu Shi, Zecheng Hao, Zhaofei Yu

CVPR 2024 • arXiv:2403.14302
67
citations
#1340

Matryoshka Diffusion Models

Jiatao Gu, Shuangfei Zhai, Yizhe Zhang et al.

ICLR 2024 • arXiv:2310.15111
67
citations
#1341

Image Fusion via Vision-Language Model

Zixiang Zhao, Lilun Deng, Haowen Bai et al.

ICML 2024 • arXiv:2402.02235
67
citations
#1342

SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution

Zhixuan Liang, Yao Mu, Hengbo Ma et al.

CVPR 2024 • arXiv:2312.11598
67
citations
#1343

Successor Heads: Recurring, Interpretable Attention Heads In The Wild

Rhys Gould, Euan Ong, George Ogden et al.

ICLR 2024 • arXiv:2312.09230
67
citations
#1344

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis

Feng Liang, Bichen Wu, Jialiang Wang et al.

CVPR 2024 • highlight • arXiv:2312.17681
67
citations
#1345

RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content

Zhuowen Yuan, Zidi Xiong, Yi Zeng et al.

ICML 2024 • arXiv:2403.13031
67
citations
#1346

Looped Transformers are Better at Learning Learning Algorithms

Liu Yang, Kangwook Lee, Robert Nowak et al.

ICLR 2024 • arXiv:2311.12424
67
citations
#1347

GIVT: Generative Infinite-Vocabulary Transformers

Michael Tschannen, Cian Eastwood, Fabian Mentzer

ECCV 2024 • arXiv:2312.02116
67
citations
#1348

Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation

Shanshan Zhong, Zhongzhan Huang, Shanghua Gao et al.

CVPR 2024 • arXiv:2312.02439
67
citations
#1349

Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer

Eric Brachmann, Jamie Wynn, Shuai Chen et al.

ECCV 2024 • arXiv:2404.14351
67
citations
#1350

Space Group Constrained Crystal Generation

Rui Jiao, Wenbing Huang, Yu Liu et al.

ICLR 2024 • arXiv:2402.03992
66
citations
#1351

Attention-Challenging Multiple Instance Learning for Whole Slide Image Classification

Yunlong Zhang, Honglin Li, Yuxuan Sun et al.

ECCV 2024 • arXiv:2311.07125
66
citations
#1352

Structure Matters: Tackling the Semantic Discrepancy in Diffusion Models for Image Inpainting

Haipeng Liu, Yang Wang, Biao Qian et al.

CVPR 2024 • arXiv:2403.19898
66
citations
#1353

Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts

Jiang-Xin Shi, Tong Wei, Zhi Zhou et al.

ICML 2024 • arXiv:2309.10019
66
citations
#1354

NeRF On-the-go: Exploiting Uncertainty for Distractor-free NeRFs in the Wild

Weining Ren, Zihan Zhu, Boyang Sun et al.

CVPR 2024 • arXiv:2405.18715
66
citations
#1355

DiffusionTrack: Diffusion Model for Multi-Object Tracking

Run Luo, Zikai Song, Lintao Ma et al.

AAAI 2024 • paper • arXiv:2308.09905
66
citations
#1356

An Emulator for Fine-tuning Large Language Models using Small Language Models

Eric Mitchell, Rafael Rafailov, Archit Sharma et al.

ICLR 2024 • oral • arXiv:2310.12962
66
citations
#1357

Alleviating Exposure Bias in Diffusion Models through Sampling with Shifted Time Steps

Mingxiao Li, Tingyu Qu, Ruicong Yao et al.

ICLR 2024 • arXiv:2305.15583
66
citations
#1358

Vlogger: Make Your Dream A Vlog

Shaobin Zhuang, Kunchang Li, Xinyuan Chen et al.

CVPR 2024 • arXiv:2401.09414
66
citations
#1359

TC4D: Trajectory-Conditioned Text-to-4D Generation

Sherwin Bahmani, Xian Liu, Wang Yifan et al.

ECCV 2024 • arXiv:2403.17920
66
citations
#1360

Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot

Fabien Baradel, Thomas Lucas, Matthieu Armando et al.

ECCV 2024 • arXiv:2402.14654
66
citations
#1361

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Yushi Huang, Ruihao Gong, Jing Liu et al.

CVPR 2024 • highlight • arXiv:2311.16503
66
citations
#1362

Light the Night: A Multi-Condition Diffusion Framework for Unpaired Low-Light Enhancement in Autonomous Driving

Jinlong Li, Baolu Li, Zhengzhong Tu et al.

CVPR 2024 • arXiv:2404.04804
66
citations
#1363

Evaluating the Zero-shot Robustness of Instruction-tuned Language Models

Jiuding Sun, Chantal Shaib, Byron Wallace

ICLR 2024 • spotlight • arXiv:2306.11270
66
citations
#1364

MonoCD: Monocular 3D Object Detection with Complementary Depths

Longfei Yan, Pei Yan, Shengzhou Xiong et al.

CVPR 2024 • arXiv:2404.03181
66
citations
#1365

Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning

Sungwon Han, Jinsung Yoon, Sercan Arik et al.

ICML 2024 • arXiv:2404.09491
66
citations
#1366

Make RepVGG Greater Again: A Quantization-Aware Approach

Xuesong Nie, Yunfeng Yan, Siyuan Li et al.

AAAI 2024 • paper • arXiv:2212.01593
66
citations
#1367

Self-Discovering Interpretable Diffusion Latent Directions for Responsible Text-to-Image Generation

Hang Li, Chengzhi Shen, Philip H.S. Torr et al.

CVPR 2024 • arXiv:2311.17216
66
citations
#1368

Video Interpolation with Diffusion Models

Siddhant Jain, Daniel Watson, Aleksander Holynski et al.

CVPR 2024 • arXiv:2404.01203
66
citations
#1369

C-TPT: Calibrated Test-Time Prompt Tuning for Vision-Language Models via Text Feature Dispersion

Hee Suk Yoon, Eunseop Yoon, Joshua Tian Jin Tee et al.

ICLR 2024 • arXiv:2403.14119
66
citations
#1370

GIM: Learning Generalizable Image Matcher From Internet Videos

Xuelun Shen, Zhipeng Cai, Wei Yin et al.

ICLR 2024 • spotlight • arXiv:2402.11095
66
citations
#1371

On the Foundations of Shortcut Learning

Katherine Hermann, Hossein Mobahi, Thomas Fel et al.

ICLR 2024 • spotlight • arXiv:2310.16228
66
citations
#1372

EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars

Nikita Drobyshev, Antoni Bigata Casademunt, Konstantinos Vougioukas et al.

CVPR 2024 • arXiv:2404.19110
65
citations
#1373

TabR: Tabular Deep Learning Meets Nearest Neighbors

Yury Gorishniy, Ivan Rubachev, Nikolay Kartashev et al.

ICLR 2024 • arXiv:2307.14338
65
citations
#1374

GraphCare: Enhancing Healthcare Predictions with Personalized Knowledge Graphs

Pengcheng Jiang, Cao Xiao, Adam Cross et al.

ICLR 2024 • arXiv:2305.12788
65
citations
#1375

Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation

Danny Halawi, Alexander Wei, Eric Wallace et al.

ICML 2024 • arXiv:2406.20053
65
citations
#1376

NExT: Teaching Large Language Models to Reason about Code Execution

Ansong Ni, Miltiadis Allamanis, Arman Cohan et al.

ICML 2024 • arXiv:2404.14662
65
citations
#1377

Repoformer: Selective Retrieval for Repository-Level Code Completion

Di Wu, Wasi Ahmad, Dejiao Zhang et al.

ICML 2024 • arXiv:2403.10059
65
citations
#1378

Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships

Sebastian Koch, Narunas Vaskevicius, Mirco Colosi et al.

CVPR 2024 • arXiv:2402.12259
65
citations
#1379

GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding

Cunxiao Du, Jing Jiang, Xu Yuanchen et al.

ICML 2024 • arXiv:2402.02082
65
citations
#1380

A Closer Look at the Few-Shot Adaptation of Large Vision-Language Models

Julio Silva-Rodríguez, Sina Hajimiri, Ismail Ben Ayed et al.

CVPR 2024 • arXiv:2312.12730
65
citations
#1381

Improved Zero-Shot Classification by Adapting VLMs with Text Descriptions

Oindrila Saha, Grant Horn, Subhransu Maji

CVPR 2024 • arXiv:2401.02460
65
citations
#1382

Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization

Jin Zhou, Charles Staats, Wenda Li et al.

ICLR 2024 • arXiv:2403.18120
65
citations
#1383

Dense Reward for Free in Reinforcement Learning from Human Feedback

Alexander Chan, Hao Sun, Samuel Holt et al.

ICML 2024 • arXiv:2402.00782
65
citations
#1384

Source-Free Domain Adaptation with Frozen Multimodal Foundation Model

Song Tang, Wenxin Su, Mao Ye et al.

CVPR 2024 • arXiv:2311.16510
65
citations
#1385

Language-Image Pre-training with Long Captions

Kecheng Zheng, Yifei Zhang, Wei Wu et al.

ECCV 2024 • arXiv:2403.17007
65
citations
#1386

Robust agents learn causal world models

Jonathan Richens, Tom Everitt

ICLR 2024 • arXiv:2402.10877
65
citations
#1387

Variational Bayesian Last Layers

James Harrison, John Willes, Jasper Snoek

ICLR 2024 • spotlight • arXiv:2404.11599
65
citations
#1388

MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance

Yake Wei, Di Hu

ICML 2024 • arXiv:2405.17730
64
citations
#1389

Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons

Yuheng Chen, Pengfei Cao, Yubo Chen et al.

AAAI 2024 • paper • arXiv:2308.13198
64
citations
#1390

Koala: Key Frame-Conditioned Long Video-LLM

Reuben Tan, Ximeng Sun, Ping Hu et al.

CVPR 2024 • highlight • arXiv:2404.04346
64
citations
#1391

Relation DETR: Exploring Explicit Position Relation Prior for Object Detection

Xiuquan Hou, Meiqin Liu, Senlin Zhang et al.

ECCV 2024 • arXiv:2407.11699
64
citations
#1392

GPT4Point: A Unified Framework for Point-Language Understanding and Generation

Zhangyang Qi, Ye Fang, Zeyi Sun et al.

CVPR 2024 • highlight • arXiv:2312.02980
64
citations
#1393

Unleashing the Potential of SAM for Medical Adaptation via Hierarchical Decoding

Zhiheng Cheng, Qingyue Wei, Hongru Zhu et al.

CVPR 2024 • arXiv:2403.18271
64
citations
#1394

Focus on Your Instruction: Fine-grained and Multi-instruction Image Editing by Attention Modulation

guo, Tianwei Lin

CVPR 2024 • arXiv:2312.10113
64
citations
#1395

Masked Audio Generation using a Single Non-Autoregressive Transformer

Alon Ziv, Itai Gat, Gael Le Lan et al.

ICLR 2024 • arXiv:2401.04577
64
citations
#1396

ImagenHub: Standardizing the evaluation of conditional image generation models

Max Ku, Tianle Li, Kai Zhang et al.

ICLR 2024 • arXiv:2310.01596
64
citations
#1397

CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech

Jaehyeon Kim, Keon Lee, Seungjun Chung et al.

ICLR 2024 • arXiv:2404.02781
64
citations
#1398

Prompt-tuning Latent Diffusion Models for Inverse Problems

Hyungjin Chung, Jong Chul Ye, Peyman Milanfar et al.

ICML 2024 • arXiv:2310.01110
64
citations
#1399

Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation

Shuting He, Henghui Ding

CVPR 2024 • arXiv:2404.03645
64
citations
#1400

What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation

Aaditya Singh, Ted Moskovitz, Felix Hill et al.

ICML 2024 • spotlight • arXiv:2404.07129
64
citations