Most Cited 2024 "temporal distances" Papers

12,324 papers found • Page 11 of 62

#2001

Unveiling the Pitfalls of Knowledge Editing for Large Language Models

Zhoubo Li, Ningyu Zhang, Yunzhi Yao et al.

ICLR 2024 arXiv:2310.02129
44 citations
#2002

MG-TSD: Multi-Granularity Time Series Diffusion Models with Guided Learning Process

Xinyao Fan, Yueying Wu, Chang Xu et al.

ICLR 2024 arXiv:2403.05751
44 citations
#2003

Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation

Daichi Horita, Naoto Inoue, Kotaro Kikuchi et al.

CVPR 2024 arXiv:2311.13602
44 citations
#2004

Theoretical insights for diffusion guidance: A case study for Gaussian mixture models

Yuchen Wu, Minshuo Chen, Zihao Li et al.

ICML 2024 arXiv:2403.01639
44 citations
#2005

UniBind: LLM-Augmented Unified and Balanced Representation Space to Bind Them All

Yuanhuiyi Lyu, Xu Zheng, Jiazhou Zhou et al.

CVPR 2024 arXiv:2403.12532
44 citations
#2006

Generative Modeling of Regular and Irregular Time Series Data via Koopman VAEs

Ilan Naiman, N. Benjamin Erichson, Pu Ren et al.

ICLR 2024 arXiv:2310.02619
44 citations
#2007

Unsupervised Continual Anomaly Detection with Contrastively-Learned Prompt

Jiaqi Liu, Kai Wu, Qiang Nie et al.

AAAI 2024 paper arXiv:2401.01010
44 citations
#2008

EgoLifter: Open-world 3D Segmentation for Egocentric Perception

Qiao Gu, Zhaoyang Lv, Duncan Frost et al.

ECCV 2024 arXiv:2403.18118
44 citations
#2009

FreeKD: Knowledge Distillation via Semantic Frequency Prompt

Yuan Zhang, Tao Huang, Jiaming Liu et al.

CVPR 2024 arXiv:2311.12079
44 citations
#2010

A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation

Qucheng Peng, Ce Zheng, Chen Chen

CVPR 2024 arXiv:2403.11310
44 citations
#2011

ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image

Hallee E. Wong, Marianne Rakic, John Guttag et al.

ECCV 2024 arXiv:2312.07381
44 citations
#2012

LLMRG: Improving Recommendations through Large Language Model Reasoning Graphs

Yan Wang, Zhixuan Chu, Xin Ouyang et al.

AAAI 2024 paper
44 citations
#2013

Text2Analysis: A Benchmark of Table Question Answering with Advanced Data Analysis and Unclear Queries

Xinyi He, Mengyu Zhou, Xinrun Xu et al.

AAAI 2024 paper arXiv:2312.13671
44 citations
#2014

A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?

Agustinus Kristiadi, Felix Strieth-Kalthoff, Marta Skreta et al.

ICML 2024 arXiv:2402.05015
44 citations
#2015

DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs

Donghyun Kim, Byeongho Heo, Dongyoon Han

ECCV 2024 arXiv:2403.19588
44 citations
#2016

Parameterized Physics-informed Neural Networks for Parameterized PDEs

Woojin Cho, Minju Jo, Haksoo Lim et al.

ICML 2024 arXiv:2408.09446
44 citations
#2017

Online conformal prediction with decaying step sizes

Anastasios Angelopoulos, Rina Barber, Stephen Bates

ICML 2024 arXiv:2402.01139
44 citations
#2018

Improved Probabilistic Image-Text Representations

Sanghyuk Chun

ICLR 2024 arXiv:2305.18171
44 citations
#2019

KVQ: Kwai Video Quality Assessment for Short-form Videos

Yiting Lu, Xin Li, Yajing Pei et al.

CVPR 2024 arXiv:2402.07220
44 citations
#2020

What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection

XiaoHui Zhang, Jiangyan Yi, Chenglong Wang et al.

AAAI 2024 paper arXiv:2312.09651
43 citations
#2021

Error Detection in Egocentric Procedural Task Videos

Shih-Po Lee, Zijia Lu, Zekun Zhang et al.

CVPR 2024
43 citations
#2022

Elijah: Eliminating Backdoors Injected in Diffusion Models via Distribution Shift

Shengwei An, Sheng-Yen Chou, Kaiyuan Zhang et al.

AAAI 2024 paper arXiv:2312.00050
43 citations
#2023

TiC-CLIP: Continual Training of CLIP Models

Saurabh Garg, Mehrdad Farajtabar, Hadi Pouransari et al.

ICLR 2024 oral arXiv:2310.16226
43 citations
#2024

PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine

Chenrui Zhang, Lin Liu, Chuyuan Wang et al.

AAAI 2024 paper arXiv:2308.12033
43 citations
#2025

Mask Grounding for Referring Image Segmentation

Yong Xien Chng, Henry Zheng, Yizeng Han et al.

CVPR 2024 arXiv:2312.12198
43 citations
#2026

Readout Guidance: Learning Control from Diffusion Features

Grace Luo, Trevor Darrell, Oliver Wang et al.

CVPR 2024 highlight arXiv:2312.02150
43 citations
#2027

Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs

Yeonhong Park, Jake Hyun, SangLyul Cho et al.

ICML 2024 arXiv:2402.10517
43 citations
#2028

Consistent Video-to-Video Transfer Using Synthetic Dataset

Jiaxin Cheng, Tianjun Xiao, Tong He

ICLR 2024 arXiv:2311.00213
43 citations
#2029

Evaluating Language Model Agency Through Negotiations

Tim R. Davidson, Veniamin Veselovsky, Michal Kosinski et al.

ICLR 2024 arXiv:2401.04536
43 citations
#2030

MEMORYLLM: Towards Self-Updatable Large Language Models

Yu Wang, Yifan Gao, Xiusi Chen et al.

ICML 2024 arXiv:2402.04624
43 citations
#2031

The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective

Chi-Heng Lin, Chiraag Kaushik, Eva Dyer et al.

ICML 2024 arXiv:2210.05021
43 citations
#2032

Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer

Zhen Zhao, Jingqun Tang, Chunhui Lin et al.

CVPR 2024 arXiv:2311.13120
43 citations
#2033

Two-stage LLM Fine-tuning with Less Specialization and More Generalization

Yihan Wang, Si Si, Daliang Li et al.

ICLR 2024 arXiv:2211.00635
43 citations
#2034

HouseCat6D - A Large-Scale Multi-Modal Category Level 6D Object Perception Dataset with Household Objects in Realistic Scenarios

HyunJun Jung, Shun-Cheng Wu, Patrick Ruhkamp et al.

CVPR 2024 highlight arXiv:2212.10428
43 citations
#2035

Towards Robust Event-guided Low-Light Image Enhancement: A Large-Scale Real-World Event-Image Dataset and Novel Approach

Guoqiang Liang, Kanghao Chen, Hangyu Li et al.

CVPR 2024 arXiv:2404.00834
43 citations
#2036

Think2Drive: Efficient Reinforcement Learning by Thinking with Latent World Model for Autonomous Driving (in CARLA-v2)

Qifeng Li, Xiaosong Jia, Shaobo Wang et al.

ECCV 2024
43 citations
#2037

PEM: Prototype-based Efficient MaskFormer for Image Segmentation

Niccolò Cavagnero, Gabriele Rosi, Claudia Cuttano et al.

CVPR 2024 arXiv:2402.19422
43 citations
#2038

SemCity: Semantic Scene Generation with Triplane Diffusion

Jumin Lee, Sebin Lee, Changho Jo et al.

CVPR 2024 arXiv:2403.07773
43 citations
#2039

Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning

Yiwen Ye, Yutong Xie, Jianpeng Zhang et al.

CVPR 2024 highlight arXiv:2311.17597
43 citations
#2040

Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models

Dennis Wu, Jerry Yao-Chieh Hu, Teng-Yun Hsiao et al.

ICML 2024 arXiv:2404.03827
43 citations
#2041

TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling

Shimin Zhang, Qu Yang, Chenxiang Ma et al.

AAAI 2024 paper arXiv:2308.13250
43 citations
#2042

Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification

Jiangming Shi, Xiangbo Yin, Yeyun Chen et al.

ECCV 2024 arXiv:2401.06825
43 citations
#2043

360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model

Qian Wang, Weiqi Li, Chong Mou et al.

CVPR 2024 arXiv:2401.06578
43 citations
#2044

Conversational Drug Editing Using Retrieval and Domain Feedback

Shengchao Liu, Jiongxiao Wang, Yijin Yang et al.

ICLR 2024 arXiv:2305.18090
43 citations
#2045

Provable Offline Preference-Based Reinforcement Learning

Wenhao Zhan, Masatoshi Uehara, Nathan Kallus et al.

ICLR 2024 spotlight arXiv:2305.14816
43 citations
#2046

Fair Federated Learning under Domain Skew with Local Consistency and Domain Diversity

Yuhang Chen, Wenke Huang, Mang Ye

CVPR 2024 arXiv:2405.16585
43 citations
#2047

Learned Representation-Guided Diffusion Models for Large-Image Generation

Alexandros Graikos, Srikar Yellapragada, Minh-Quan Le et al.

CVPR 2024 arXiv:2312.07330
43 citations
#2048

Map-Relative Pose Regression for Visual Re-Localization

Shuai Chen, Tommaso Cavallari, Victor Adrian Prisacariu et al.

CVPR 2024 highlight arXiv:2404.09884
43 citations
#2049

Debiasing Multimodal Sarcasm Detection with Contrastive Learning

Mengzhao Jia, Can Xie, Liqiang Jing

AAAI 2024 paper arXiv:2312.10493
43 citations
#2050

AI Alignment with Changing and Influenceable Reward Functions

Micah Carroll, Davis Foote, Anand Siththaranjan et al.

ICML 2024 arXiv:2405.17713
43 citations
#2051

In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation

Shiqi Chen, Miao Xiong, Junteng Liu et al.

ICML 2024 arXiv:2403.01548
43 citations
#2052

Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling

Denis Blessing, Xiaogang Jia, Johannes Esslinger et al.

ICML 2024 arXiv:2406.07423
43 citations
#2053

Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond

Tianxin Wei, Bowen Jin, Ruirui Li et al.

ICLR 2024 arXiv:2403.10667
43 citations
#2054

On Exact Inversion of DPM-Solvers

Seongmin Hong, Kyeonghyun Lee, Suh Yoon Jeon et al.

CVPR 2024 arXiv:2311.18387
43 citations
#2055

Can AI Assistants Know What They Don't Know?

Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu et al.

ICML 2024 arXiv:2401.13275
43 citations
#2056

Learning the 3D Fauna of the Web

Zizhang Li, Dor Litvak, Ruining Li et al.

CVPR 2024 arXiv:2401.02400
43 citations
#2057

Scene Adaptive Sparse Transformer for Event-based Object Detection

Yansong Peng, Li Hebei, Yueyi Zhang et al.

CVPR 2024 arXiv:2404.01882
43 citations
#2058

Graph Attention Retrospective

Kimon Fountoulakis, Amit Levi, Shenghao Yang et al.

ICML 2024 arXiv:2202.13060
43 citations
#2059

EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning

Hongxia Xie, Chu-Jun Peng, Yu-Wen Tseng et al.

CVPR 2024 arXiv:2404.16670
43 citations
#2060

CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers

Dachuan Shi, Chaofan Tao, Anyi Rao et al.

ICML 2024 arXiv:2305.17455
43 citations
#2061

Conformal Prediction for Deep Classifier via Label Ranking

Jianguo Huang, HuaJun Xi, Linjun Zhang et al.

ICML 2024 arXiv:2310.06430
43 citations
#2062

Towards Real-World Test-Time Adaptation: Tri-net Self-Training with Balanced Normalization

Yongyi Su, Xun Xu, Kui Jia

AAAI 2024 paper arXiv:2309.14949
43 citations
#2063

LEMON: Learning 3D Human-Object Interaction Relation from 2D Images

Yuhang Yang, Wei Zhai, Hongchen Luo et al.

CVPR 2024 arXiv:2312.08963
43 citations
#2064

WorDepth: Variational Language Prior for Monocular Depth Estimation

Ziyao Zeng, Hyoungseob Park, Fengyu Yang et al.

CVPR 2024 arXiv:2404.03635
43 citations
#2065

Zeroth-Order Optimization Meets Human Feedback: Provable Learning via Ranking Oracles

Zhiwei Tang, Dmitry Rybin, Tsung-Hui Chang

ICLR 2024 arXiv:2303.03751
42 citations
#2066

SeD: Semantic-Aware Discriminator for Image Super-Resolution

Bingchen Li, Xin Li, Hanxin Zhu et al.

CVPR 2024 arXiv:2402.19387
42 citations
#2067

3D Face Reconstruction with the Geometric Guidance of Facial Part Segmentation

Zidu Wang, Xiangyu Zhu, Tianshuo Zhang et al.

CVPR 2024 highlight arXiv:2312.00311
42 citations
#2068

Fast Decision Boundary based Out-of-Distribution Detector

Litian Liu, Yao Qin

ICML 2024 arXiv:2312.11536
42 citations
#2069

CAGE: Controllable Articulation GEneration

Jiayi Liu, Hou In Ivan Tam, Ali Mahdavi Amiri et al.

CVPR 2024 arXiv:2312.09570
42 citations
#2070

Particle Guidance: non-I.I.D. Diverse Sampling with Diffusion Models

Gabriele Corso, Yilun Xu, Valentin De Bortoli et al.

ICLR 2024 arXiv:2310.13102
42 citations
#2071

Is In-Context Learning in Large Language Models Bayesian? A Martingale Perspective

Fabian Falck, Ziyu Wang, Christopher Holmes

ICML 2024 arXiv:2406.00793
42 citations
#2072

TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks

Zhiruo Wang, Graham Neubig, Daniel Fried

ICML 2024 arXiv:2401.12869
42 citations
#2073

HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances

Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen et al.

CVPR 2024 arXiv:2403.01693
42 citations
#2074

From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning

Wei Chen, Zhen Huang, Liang Xie et al.

ICML 2024 arXiv:2409.01658
42 citations
#2075

AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model

Zibin Dong, Yifu Yuan, Jianye Hao et al.

ICLR 2024 oral arXiv:2310.02054
42 citations
#2076

MAPE-PPI: Towards Effective and Efficient Protein-Protein Interaction Prediction via Microenvironment-Aware Protein Embedding

Lirong Wu, Yijun Tian, Yufei Huang et al.

ICLR 2024 spotlight arXiv:2402.14391
42 citations
#2077

A Vision Check-up for Language Models

Pratyusha Sharma, Tamar Rott Shaham, Manel Baradad et al.

CVPR 2024 highlight arXiv:2401.01862
42 citations
#2078

Channel Vision Transformers: An Image Is Worth 1 x 16 x 16 Words

Yujia Bao, Srinivasan Sivanandan, Theofanis Karaletsos

ICLR 2024 arXiv:2309.16108
42 citations
#2079

A Compact Dynamic 3D Gaussian Representation for Real-Time Dynamic View Synthesis

Kai Katsumata, Duc Minh Vo, Hideki Nakayama

ECCV 2024 arXiv:2311.12897
42 citations
#2080

EulerMormer: Robust Eulerian Motion Magnification via Dynamic Filtering within Transformer

Fei Wang, Dan Guo, Kun Li et al.

AAAI 2024 paper arXiv:2312.04152
42 citations
#2081

Few-Shot Detection of Machine-Generated Text using Style Representations

Rafael Rivera Soto, Kailin Koch, Aleem Khan et al.

ICLR 2024 arXiv:2401.06712
42 citations
#2082

vid-TLDR: Training Free Token Merging for Light-weight Video Transformer

Joonmyung Choi, Sanghyeok Lee, Jaewon Chu et al.

CVPR 2024 arXiv:2403.13347
42 citations
#2083

Balancing Act: Distribution-Guided Debiasing in Diffusion Models

Rishubh Parihar, Abhijnya Bhat, Abhipsa Basu et al.

CVPR 2024 arXiv:2402.18206
42 citations
#2084

CARTE: Pretraining and Transfer for Tabular Learning

Myung Jun Kim, Leo Grinsztajn, Gael Varoquaux

ICML 2024 arXiv:2402.16785
42 citations
#2085

TAPTR: Tracking Any Point with Transformers as Detection

Hongyang Li, Hao Zhang, Shilong Liu et al.

ECCV 2024 arXiv:2403.13042
42 citations
#2086

Fine-Grained Distillation for Long Document Retrieval

Yucheng Zhou, Tao Shen, Xiubo Geng et al.

AAAI 2024 paper arXiv:2212.10423
42 citations
#2087

Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes

Yifan Chen, Mark Goldstein, Mengjian Hua et al.

ICML 2024 arXiv:2403.13724
42 citations
#2088

Hard-Constrained Deep Learning for Climate Downscaling

Paula Harder, Alex Hernandez-Garcia, Venkatesh Ramesh et al.

ICLR 2024 arXiv:2208.05424
42 citations
#2089

SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing

Zeyinzi Jiang, Chaojie Mao, Yulin Pan et al.

CVPR 2024 highlight arXiv:2312.11392
42 citations
#2090

Diffusion 3D Features (Diff3F): Decorating Untextured Shapes with Distilled Semantic Features

Niladri Shekhar Dutt, Sanjeev Muralikrishnan, Niloy J. Mitra

CVPR 2024 arXiv:2311.17024
42 citations
#2091

Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images

Qingping Zheng, Yuanfan Guo, Jiankang Deng et al.

AAAI 2024 paper arXiv:2308.16582
42 citations
#2092

OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning

Noor Ahmed, Anna Kukleva, Bernt Schiele

CVPR 2024 highlight arXiv:2403.18550
42 citations
#2093

Curriculum reinforcement learning for quantum architecture search under hardware errors

Yash J. Patel, Akash Kundu, Mateusz Ostaszewski et al.

ICLR 2024 arXiv:2402.03500
42 citations
#2094

Diffusion Model-Augmented Behavioral Cloning

Shang-Fu Chen, Hsiang-Chun Wang, Ming-Hao Hsu et al.

ICML 2024 oral arXiv:2302.13335
42 citations
#2095

DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars

Tobias Kirschstein, Simon Giebenhain, Matthias Nießner

CVPR 2024 arXiv:2311.18635
42 citations
#2096

Investigating and Mitigating the Side Effects of Noisy Views for Self-Supervised Clustering Algorithms in Practical Multi-View Scenarios

Jie Xu, Yazhou Ren, Xiaolong Wang et al.

CVPR 2024 arXiv:2303.17245
42 citations
#2097

T-MARS: Improving Visual Representations by Circumventing Text Feature Learning

Pratyush Maini, Sachin Goyal, Zachary Lipton et al.

ICLR 2024 arXiv:2307.03132
42 citations
#2098

SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer

Rui Zhu, Yingwei Pan, Yehao Li et al.

CVPR 2024 arXiv:2403.17004
42 citations
#2099

A Watermark-Conditioned Diffusion Model for IP Protection

Rui Min, Sen Li, Hongyang Chen et al.

ECCV 2024 arXiv:2403.10893
42 citations
#2100

FFB: A Fair Fairness Benchmark for In-Processing Group Fairness Methods

Xiaotian Han, Jianfeng Chi, Yu Chen et al.

ICLR 2024 arXiv:2306.09468
42 citations
#2101

Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models

Yubin Wang, Xinyang Jiang, De Cheng et al.

AAAI 2024 paper arXiv:2312.06323
42 citations
#2102

FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization

Shuai Tan, Bin Ji, Ye Pan

CVPR 2024 arXiv:2403.06375
42 citations
#2103

Interpreting and Improving Large Language Models in Arithmetic Calculation

Wei Zhang, Wan Chaoqun, Yonggang Zhang et al.

ICML 2024 arXiv:2409.01659
42 citations
#2104

Prompt Learning via Meta-Regularization

Jinyoung Park, Juyeon Ko, Hyunwoo J. Kim

CVPR 2024 arXiv:2404.00851
42 citations
#2105

Stream Query Denoising for Vectorized HD-Map Construction

Shuo Wang, Fan Jia, Weixin Mao et al.

ECCV 2024 arXiv:2401.09112
42 citations
#2106

Language Models as Black-Box Optimizers for Vision-Language Models

Shihong Liu, Samuel Yu, Zhiqiu Lin et al.

CVPR 2024 arXiv:2309.05950
42 citations
#2107

LP++: A Surprisingly Strong Linear Probe for Few-Shot CLIP

Yunshi Huang, Fereshteh Shakeri, Jose Dolz et al.

CVPR 2024 arXiv:2404.02285
42 citations
#2108

Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

Jerry Yao-Chieh Hu, Pei-Hsuan Chang, Haozheng Luo et al.

ICML 2024 arXiv:2404.03828
42 citations
#2109

Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion

Kiran Chhatre, Radek Danecek, Nikos Athanasiou et al.

CVPR 2024 arXiv:2312.04466
42 citations
#2110

Object-Aware Domain Generalization for Object Detection

WooJu Lee, Dasol Hong, Hyungtae Lim et al.

AAAI 2024 paper arXiv:2312.12133
42 citations
#2111

Test-Time Domain Generalization for Face Anti-Spoofing

Qianyu Zhou, Ke-Yue Zhang, Taiping Yao et al.

CVPR 2024 arXiv:2403.19334
41 citations
#2112

DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning

Zhengxiang Shi, Aldo Lipani

ICLR 2024 arXiv:2309.05173
41 citations
#2113

Non-Vacuous Generalization Bounds for Large Language Models

Sanae Lotfi, Marc Finzi, Yilun Kuang et al.

ICML 2024 arXiv:2312.17173
41 citations
#2114

Intriguing Properties of Data Attribution on Diffusion Models

Xiaosen Zheng, Tianyu Pang, Chao Du et al.

ICLR 2024 arXiv:2311.00500
41 citations
#2115

Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models

Shuai Zhao, Xiaohan Wang, Linchao Zhu et al.

ICLR 2024 arXiv:2305.18010
41 citations
#2116

Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion

Linlan Huang, Xusheng Cao, Haori Lu et al.

ECCV 2024 arXiv:2407.14143
41 citations
#2117

Vision-Language Pre-training with Object Contrastive Learning for 3D Scene Understanding

Taolin Zhang, Sunan He, Tao Dai et al.

AAAI 2024 paper arXiv:2305.10714
41 citations
#2118

Attribute-Missing Graph Clustering Network

Wenxuan Tu, Renxiang Guan, Sihang Zhou et al.

AAAI 2024 paper
41 citations
#2119

Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment

Geyang Guo, Ranchi Zhao, Tianyi Tang et al.

ICLR 2024 arXiv:2311.04072
41 citations
#2120

A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation

Zhengbo Wang, Jian Liang, Lijun Sheng et al.

ICLR 2024 arXiv:2402.04087
41 citations
#2121

Image Sculpting: Precise Object Editing with 3D Geometry Control

Jiraphon Yenphraphai, Xichen Pan, Sainan Liu et al.

CVPR 2024 arXiv:2401.01702
41 citations
#2122

Teaching Language Models to Hallucinate Less with Synthetic Tasks

Erik Jones, Hamid Palangi, Clarisse Ribeiro et al.

ICLR 2024 arXiv:2310.06827
41 citations
#2123

Transfer CLIP for Generalizable Image Denoising

Jun Cheng, Dong Liang, Shan Tan

CVPR 2024 arXiv:2403.15132
41 citations
#2124

Frequency Spectrum Is More Effective for Multimodal Representation and Fusion: A Multimodal Spectrum Rumor Detector

An Lao, Qi Zhang, Chongyang Shi et al.

AAAI 2024 paper arXiv:2312.11023
41 citations
#2125

Correlation-Decoupled Knowledge Distillation for Multimodal Sentiment Analysis with Incomplete Modalities

Mingcheng Li, Dingkang Yang, Xiao Zhao et al.

CVPR 2024 arXiv:2404.16456
41 citations
#2126

SD4Match: Learning to Prompt Stable Diffusion Model for Semantic Matching

Xinghui Li, Jingyi Lu, Kai Han et al.

CVPR 2024 arXiv:2310.17569
41 citations
#2127

Norm Tweaking: High-Performance Low-Bit Quantization of Large Language Models

Liang Li, Qingyuan Li, Bo Zhang et al.

AAAI 2024 paper arXiv:2309.02784
41 citations
#2128

Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge

Yue Conghan, Zhengwei Peng, Junlong Ma et al.

ICML 2024 arXiv:2312.10299
41 citations
#2129

Towards Efficient Replay in Federated Incremental Learning

Yichen Li, Qunwei Li, Haozhao Wang et al.

CVPR 2024 arXiv:2403.05890
41 citations
#2130

Does CLIP’s generalization performance mainly stem from high train-test similarity?

Prasanna Mayilvahanan, Thaddäus Wiedemer, Evgenia Rusak et al.

ICLR 2024 arXiv:2310.09562
41 citations
#2131

Facing the Elephant in the Room: Visual Prompt Tuning or Full finetuning?

Cheng Han, Qifan Wang, Yiming Cui et al.

ICLR 2024 arXiv:2401.12902
41 citations
#2132

Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts

Ahmed Hendawy, Jan Peters, Carlo D'Eramo

ICLR 2024 arXiv:2311.11385
41 citations
#2133

Dual RL: Unification and New Methods for Reinforcement and Imitation Learning

Harshit Sikchi, Qinqing Zheng, Amy Zhang et al.

ICLR 2024 spotlight arXiv:2302.08560
41 citations
#2134

Motion-Guided Latent Diffusion for Temporally Consistent Real-world Video Super-resolution

Xi Yang, Chenhang He, Jianqi Ma et al.

ECCV 2024 arXiv:2312.00853
41 citations
#2135

Linguistic-Aware Patch Slimming Framework for Fine-grained Cross-Modal Alignment

Zheren Fu, Lei Zhang, Hou Xia et al.

CVPR 2024
41 citations
#2136

MuseChat: A Conversational Music Recommendation System for Videos

Zhikang Dong, Bin Chen, Xiulong Liu et al.

CVPR 2024 highlight arXiv:2310.06282
41 citations
#2137

Neural Optimal Transport with General Cost Functionals

Arip Asadulaev, Alexander Korotin, Vage Egiazarian et al.

ICLR 2024 arXiv:2205.15403
41 citations
#2138

Codebook Features: Sparse and Discrete Interpretability for Neural Networks

Alex Tamkin, Mohammad Taufeeque, Noah Goodman

ICML 2024 arXiv:2310.17230
41 citations
#2139

Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning

Mohamed Elsayed, A. Rupam Mahmood

ICLR 2024 arXiv:2404.00781
41 citations
#2140

Yet Another ICU Benchmark: A Flexible Multi-Center Framework for Clinical ML

Robin van de Water, Hendrik Schmidt, Paul Elbers et al.

ICLR 2024 oral arXiv:2306.05109
41 citations
#2141

ReconBoost: Boosting Can Achieve Modality Reconcilement

Cong Hua, Qianqian Xu, Shilong Bao et al.

ICML 2024 arXiv:2405.09321
41 citations
#2142

ID-like Prompt Learning for Few-Shot Out-of-Distribution Detection

Yichen Bai, Zongbo Han, Bing Cao et al.

CVPR 2024 arXiv:2311.15243
41 citations
#2143

Fusing Models with Complementary Expertise

Hongyi Wang, Felipe Polo, Yuekai Sun et al.

ICLR 2024 arXiv:2310.01542
41 citations
#2144

Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion

Yujia Huang, Adishree Ghatare, Yuanzhe Liu et al.

ICML 2024 arXiv:2402.14285
41 citations
#2145

VCR-Graphormer: A Mini-batch Graph Transformer via Virtual Connections

Dongqi Fu, Zhigang Hua, Yan Xie et al.

ICLR 2024 arXiv:2403.16030
41 citations
#2146

A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames

Pinelopi Papalampidi, Skanda Koppula, Shreya Pathak et al.

CVPR 2024 arXiv:2312.07395
41 citations
#2147

IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency

Linshan Hou, Ruili Feng, Zhongyun Hua et al.

ICML 2024 arXiv:2405.09786
41 citations
#2148

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

Kunpeng Song, Yizhe Zhu, Bingchen Liu et al.

ECCV 2024 arXiv:2404.05674
41 citations
#2149

Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance

Xinyu Peng, Ziyang Zheng, Wenrui Dai et al.

ICML 2024 arXiv:2402.02149
41 citations
#2150

Benchmarking and Improving Generator-Validator Consistency of Language Models

Xiang Li, Vaishnavi Shrivastava, Siyan Li et al.

ICLR 2024 arXiv:2310.01846
41 citations
#2151

Context-Guided Spatio-Temporal Video Grounding

Xin Gu, Heng Fan, Yan Huang et al.

CVPR 2024 arXiv:2401.01578
41 citations
#2152

Quality-Diversity through AI Feedback

Herbie Bradley, Andrew Dai, Hannah Teufel et al.

ICLR 2024 arXiv:2310.13032
41 citations
#2153

Distinguishing the Knowable from the Unknowable with Language Models

Gustaf Ahdritz, Tian Qin, Nikhil Vyas et al.

ICML 2024 arXiv:2402.03563
41 citations
#2154

Thought Propagation: An Analogical Approach to Complex Reasoning with Large Language Models

Junchi Yu, Ran He, Rex Ying

ICLR 2024 arXiv:2310.03965
41 citations
#2155

Context-Aware Integration of Language and Visual References for Natural Language Tracking

Yanyan Shao, Shuting He, Qi Ye et al.

CVPR 2024 arXiv:2403.19975
41 citations
#2156

MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation

Shuzhao Xie, Weixiang Zhang, Chen Tang et al.

ECCV 2024 arXiv:2409.09756
41 citations
#2157

Large Language Models Are Neurosymbolic Reasoners

Meng Fang, Shilong Deng, Yudi Zhang et al.

AAAI 2024 paper arXiv:2401.09334
41 citations
#2158

DiffAvatar: Simulation-Ready Garment Optimization with Differentiable Simulation

Yifei Li, Hsiaoyu Chen, Egor Larionov et al.

CVPR 2024 arXiv:2311.12194
41 citations
#2159

PAD: Patch-Agnostic Defense against Adversarial Patch Attacks

Lihua Jing, Rui Wang, Wenqi Ren et al.

CVPR 2024 arXiv:2404.16452
41 citations
#2160

A Diffusion-Based Framework for Multi-Class Anomaly Detection

Haoyang He, Jiangning Zhang, Hongxu Chen et al.

AAAI 2024 paper arXiv:2312.06607
41 citations
#2161

Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data

Giannis Daras, Alexandros Dimakis, Constantinos Daskalakis

ICML 2024 arXiv:2404.10177
41 citations
#2162

Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval

Zhihang Liu, Jun Li, Hongtao Xie et al.

AAAI 2024 paper arXiv:2312.12155
41 citations
#2163

Generative Proxemics: A Prior for 3D Social Interaction from Images

Lea Müller, Vickie Ye, Georgios Pavlakos et al.

CVPR 2024 arXiv:2306.09337
41 citations
#2164

Leveraging Optimization for Adaptive Attacks on Image Watermarks

Nils Lukas, Abdelrahman Ahmed, Lucas Fenaux et al.

ICLR 2024 arXiv:2309.16952
41 citations
#2165

Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning

Michal Nauman, Michał Bortkiewicz, Piotr Milos et al.

ICML 2024 arXiv:2403.00514
41 citations
#2166

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

Bu Jin, Yupeng Zheng, Pengfei Li et al.

ECCV 2024 arXiv:2403.19589
40 citations
#2167

SLiMe: Segment Like Me

Aliasghar Khani, Saeid Asgari, Aditya Sanghi et al.

ICLR 2024 arXiv:2309.03179
40 citations
#2168

Retrieval-Enhanced Contrastive Vision-Text Models

Ahmet Iscen, Mathilde Caron, Alireza Fathi et al.

ICLR 2024 arXiv:2306.07196
40 citations
#2169

Improved Operator Learning by Orthogonal Attention

Zipeng Xiao, Zhongkai Hao, Bokai Lin et al.

ICML 2024 spotlight arXiv:2310.12487
40 citations
#2170

DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing

Conglong Li, Zhewei Yao, Xiaoxia Wu et al.

AAAI 2024 paper arXiv:2212.03597
40 citations
#2171

Learning with 3D rotations, a hitchhiker's guide to SO(3)

Andreas René Geist, Jonas Frey, Mikel Zhobro et al.

ICML 2024 arXiv:2404.11735
40 citations
#2172

MegaScenes: Scene-Level View Synthesis at Scale

Joseph Tung, Gene Chou, Ruojin Cai et al.

ECCV 2024 arXiv:2406.11819
40 citations
#2173

NOPE: Novel Object Pose Estimation from a Single Image

Van Nguyen Nguyen, Thibault Groueix, Georgy Ponimatkin et al.

CVPR 2024 arXiv:2303.13612
40 citations
#2174

Texture-GS: Disentangle the Geometry and Texture for 3D Gaussian Splatting Editing

Tian-Xing Xu, Wenbo Hu, Yu-Kun Lai et al.

ECCV 2024 arXiv:2403.10050
40 citations
#2175

A Unified and General Framework for Continual Learning

Zhenyi Wang, Yan Li, Li Shen et al.

ICLR 2024 arXiv:2403.13249
40 citations
#2176

WOODS: Benchmarks for Out-of-Distribution Generalization in Time Series

Irina Rish, Kartik Ahuja, Mohammad Javad Darvishi Bayazi et al.

ICLR 2024
40 citations
#2177

BAT: Learning to Reason about Spatial Sounds with Large Language Models

Zhisheng Zheng, Puyuan Peng, Ziyang Ma et al.

ICML 2024 arXiv:2402.01591
40 citations
#2178

XKD: Cross-Modal Knowledge Distillation with Domain Alignment for Video Representation Learning

Pritam Sarkar, Ali Etemad

AAAI 2024 paper arXiv:2211.13929
40 citations
#2179

Superpoint Gaussian Splatting for Real-Time High-Fidelity Dynamic Scene Reconstruction

Diwen Wan, Ruijie Lu, Gang Zeng

ICML 2024 arXiv:2406.03697
40 citations
#2180

Stable Neural Stochastic Differential Equations in Analyzing Irregular Time Series Data

YongKyung Oh, Dongyoung Lim, Sungil Kim

ICLR 2024 spotlight arXiv:2402.14989
40 citations
#2181

Taming Mode Collapse in Score Distillation for Text-to-3D Generation

Peihao Wang, Dejia Xu, Zhiwen Fan et al.

CVPR 2024 arXiv:2401.00909
40 citations
#2182

Fourier Transporter: Bi-Equivariant Robotic Manipulation in 3D

Haojie Huang, Owen Howell, Dian Wang et al.

ICLR 2024 arXiv:2401.12046
40 citations
#2183

Agent Instructs Large Language Models to be General Zero-Shot Reasoners

Nicholas Crispino, Kyle Montgomery, Fankun Zeng et al.

ICML 2024 arXiv:2310.03710
40 citations
#2184

Generalized Neural Collapse for a Large Number of Classes

Jiachen Jiang, Jinxin Zhou, Peng Wang et al.

ICML 2024 arXiv:2310.05351
40 citations
#2185

Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval

Minkuk Kim, Hyeon Bae Kim, Jinyoung Moon et al.

CVPR 2024 arXiv:2404.07610
40 citations
#2186

Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On

Xu Yang, Changxing Ding, Zhibin Hong et al.

CVPR 2024 arXiv:2404.01089
40 citations
#2187

UnScene3D: Unsupervised 3D Instance Segmentation for Indoor Scenes

David Rozenberszki, Or Litany, Angela Dai

CVPR 2024 arXiv:2303.14541
40 citations
#2188

Multimodal Prototyping for cancer survival prediction

Andrew Song, Richard Chen, Guillaume Jaume et al.

ICML 2024 arXiv:2407.00224
40 citations
#2189

NoiseCLR: A Contrastive Learning Approach for Unsupervised Discovery of Interpretable Directions in Diffusion Models

Yusuf Dalva, Pinar Yanardag

CVPR 2024 arXiv:2312.05390
40 citations
#2190

Reverse Diffusion Monte Carlo

Xunpeng Huang, Hanze Dong, Yifan Hao et al.

ICLR 2024 arXiv:2307.02037
40 citations
#2191

FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores

Dan Fu, Hermann Kumbong, Eric Nguyen et al.

ICLR 2024 arXiv:2311.05908
40 citations
#2192

Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation

Xinshuo Hu, Dongfang Li, Zihao Zheng et al.

AAAI 2024 paper arXiv:2308.08090
40 citations
#2193

Friendly Sharpness-Aware Minimization

Tao Li, Pan Zhou, Zhengbao He et al.

CVPR 2024 arXiv:2403.12350
40 citations
#2194

Few Shot Part Segmentation Reveals Compositional Logic for Industrial Anomaly Detection

Soopil Kim, Sion An, Philip Chikontwe et al.

AAAI 2024 paper arXiv:2312.13783
40 citations
#2195

Devignet: High-Resolution Vignetting Removal via a Dual Aggregated Fusion Transformer with Adaptive Channel Expansion

Shenghong Luo, Xuhang Chen, Weiwen Chen et al.

AAAI 2024 paper arXiv:2308.13739
40 citations
#2196

LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors

Sheng Jin, Xueying Jiang, Jiaxing Huang et al.

ICLR 2024 arXiv:2402.04630
40 citations
#2197

AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation

Qingping Sun, Yanjun Wang, Ailing Zeng et al.

CVPR 2024 arXiv:2403.17934
40 citations
#2198

Mastering Memory Tasks with World Models

Mohammad Reza Samsami, Artem Zholus, Janarthanan Rajendran et al.

ICLR 2024 oral arXiv:2403.04253
40 citations
#2199

ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification

Jiangbo Shi, Chen Li, Tieliang Gong et al.

CVPR 2024 arXiv:2502.08391
40 citations
#2200

Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models

Shuang Li, Jiangjie Chen, Siyu Yuan et al.

AAAI 2024 paper arXiv:2308.13961
40 citations