Most Cited 2024 "long-form qa" Papers

12,324 papers found • Page 23 of 62

#4401

Proportional Representation in Metric Spaces and Low-Distortion Committee Selection

Yusuf Kalayci, David Kempe, Vikram Kher

AAAI 2024paperarXiv:2312.10369
17
citations
#4402

Neural Oscillators for Generalization of Physics-Informed Machine Learning

Taniya Kapoor, Abhishek Chandra, Daniel M. Tartakovsky et al.

AAAI 2024paperarXiv:2308.08989
17
citations
#4403

Efficiently Assemble Normalization Layers and Regularization for Federated Domain Generalization

Khiem Le, Tuan Long Ho, Cuong Do et al.

CVPR 2024arXiv:2403.15605
17
citations
#4404

Semi-supervised Active Learning for Video Action Detection

Ayush Singh, Aayush J Rana, Akash Kumar et al.

AAAI 2024paperarXiv:2312.07169
17
citations
#4405

RGMComm: Return Gap Minimization via Discrete Communications in Multi-Agent Reinforcement Learning

Jingdi Chen, Tian Lan, Carlee Joe-Wong

AAAI 2024paperarXiv:2308.03358
17
citations
#4406

A Unified Diffusion Framework for Scene-aware Human Motion Estimation from Sparse Signals

Jiangnan Tang, Jingya Wang, Kaiyang Ji et al.

CVPR 2024arXiv:2404.04890
17
citations
#4407

CLOSER: Towards Better Representation Learning for Few-Shot Class-Incremental Learning

Junghun Oh, Sungyong Baik, Kyoung Mu Lee

ECCV 2024arXiv:2410.05627
17
citations
#4408

Improving Domain Generalization with Domain Relations

Huaxiu Yao, Xinyu Yang, Xinyi Pan et al.

ICLR 2024spotlightarXiv:2302.02609
17
citations
#4409

Rethinking Few-shot Class-incremental Learning: Learning from Yourself

Yu-Ming Tang, Yi-Xing Peng, Jing-Ke Meng et al.

ECCV 2024arXiv:2407.07468
17
citations
#4410

LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors

Saksham Suri, Matthew Walmer, Kamal Gupta et al.

ECCV 2024arXiv:2403.14625
17
citations
#4411

Generating Images with 3D Annotations Using Diffusion Models

Wufei Ma, Qihao Liu, Jiahao Wang et al.

ICLR 2024spotlightarXiv:2306.08103
17
citations
#4412

Emergent Visual-Semantic Hierarchies in Image-Text Representations

Morris Alper, Hadar Averbuch-Elor

ECCV 2024arXiv:2407.08521
17
citations
#4413

PartCraft: Crafting Creative Objects by Parts

Kam Woh Ng, Xiatian Zhu, Yi-Zhe Song et al.

ECCV 2024arXiv:2407.04604
17
citations
#4414

From Vision to Audio and Beyond: A Unified Model for Audio-Visual Representation and Generation

Kun Su, Xiulong Liu, Eli Shlizerman

ICML 2024arXiv:2409.19132
17
citations
#4415

ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference

Ziqian Zeng, Yihuai Hong, Hongliang Dai et al.

AAAI 2024paperarXiv:2312.11882
17
citations
#4416

MDFL: Multi-Domain Diffusion-Driven Feature Learning

Daixun Li, Weiying Xie, Jiaqing Zhang et al.

AAAI 2024paperarXiv:2311.09520
17
citations
#4417

Region-Aware Exposure Consistency Network for Mixed Exposure Correction

Jin Liu, Huiyuan Fu, Chuanming Wang et al.

AAAI 2024paperarXiv:2402.18217
17
citations
#4418

Music Style Transfer with Time-Varying Inversion of Diffusion Models

Sifei Li, Yuxin Zhang, Fan Tang et al.

AAAI 2024paperarXiv:2402.13763
17
citations
#4419

Decomposing Semantic Shifts for Composed Image Retrieval

Xingyu Yang, Daqing Liu, Heng Zhang et al.

AAAI 2024paperarXiv:2309.09531
17
citations
#4420

From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior

Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon et al.

CVPR 2024arXiv:2312.10118
17
citations
#4421

CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis

Xiaoxiao Sun, Xingjian Leng, Zijian Wang et al.

ICLR 2024arXiv:2310.04414
17
citations
#4422

De-Diffusion Makes Text a Strong Cross-Modal Interface

Chen Wei, Chenxi Liu, Siyuan Qiao et al.

CVPR 2024arXiv:2311.00618
17
citations
#4423

A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis

Xiang Liu, Zhaoxiang Liu, Huan Hu et al.

ECCV 2024arXiv:2503.06973
17
citations
#4424

Weakly Supervised Open-Vocabulary Object Detection

Jianghang Lin, Yunhang Shen, Bingquan Wang et al.

AAAI 2024paperarXiv:2312.12437
17
citations
#4425

PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection

Kuan-Chih Huang, Weijie Lyu, Ming-Hsuan Yang et al.

CVPR 2024arXiv:2312.08371
17
citations
#4426

Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty

Laixi Shi, Eric Mazumdar, Yuejie Chi et al.

ICML 2024arXiv:2404.18909
17
citations
#4427

Text to Layer-wise 3D Clothed Human Generation

Junting Dong, Qi Fang, Zehuan Huang et al.

ECCV 2024arXiv:2404.16748
17
citations
#4428

Layout-Agnostic Scene Text Image Synthesis with Diffusion Models

Qilong Zhangli, Jindong Jiang, Di Liu et al.

CVPR 2024arXiv:2406.01062
17
citations
#4429

Local-Global Multi-Modal Distillation for Weakly-Supervised Temporal Video Grounding

Peijun Bao, Yong Xia, Wenhan Yang et al.

AAAI 2024paper
17
citations
#4430

City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web

Kaiwen Song, Xiaoyi Zeng, Chenqu Ren et al.

ECCV 2024arXiv:2312.16457
17
citations
#4431

De-confounded Data-free Knowledge Distillation for Handling Distribution Shifts

Yuzheng Wang, Dingkang Yang, Zhaoyu Chen et al.

CVPR 2024arXiv:2403.19539
17
citations
#4432

Feel-Good Thompson Sampling for Contextual Dueling Bandits

Xuheng Li, Heyang Zhao, Quanquan Gu

ICML 2024arXiv:2404.06013
17
citations
#4433

Fast-Slow Test-Time Adaptation for Online Vision-and-Language Navigation

Junyu Gao, Xuan Yao, Changsheng Xu

ICML 2024arXiv:2311.13209
17
citations
#4434

Lazy Diffusion Transformer for Interactive Image Editing

Yotam Nitzan, Zongze Wu, Richard Zhang et al.

ECCV 2024arXiv:2404.12382
17
citations
#4435

Defining and extracting generalizable interaction primitives from DNNs

Lu Chen, Siyu Lou, Benhao Huang et al.

ICLR 2024arXiv:2401.16318
17
citations
#4436

Plug-and-Play image restoration with Stochastic deNOising REgularization

Marien Renaud, Jean Prost, Arthur Leclaire et al.

ICML 2024arXiv:2402.01779
17
citations
#4437

Incentivized Learning in Principal-Agent Bandit Games

Antoine Scheid, Daniil Tiapkin, Etienne Boursier et al.

ICML 2024arXiv:2403.03811
17
citations
#4438

Tackling Structural Hallucination in Image Translation with Local Diffusion

Seunghoi Kim, Chen Jin, Tom Diethe et al.

ECCV 2024arXiv:2404.05980
17
citations
#4439

Exploration-Driven Policy Optimization in RLHF: Theoretical Insights on Efficient Data Utilization

Yihan Du, Anna Winnicki, Gal Dalal et al.

ICML 2024arXiv:2402.10342
17
citations
#4440

ANIM: Accurate Neural Implicit Model for Human Reconstruction from a single RGB-D Image

Marco Pesavento, Yuanlu Xu, Nikolaos Sarafianos et al.

CVPR 2024arXiv:2403.10357
17
citations
#4441

Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders

Alexandre Eymaël, Renaud Vandeghen, Anthony Cioppa et al.

ECCV 2024arXiv:2403.17823
17
citations
#4442

When should we prefer Decision Transformers for Offline Reinforcement Learning?

Prajjwal Bhargava, Rohan Chitnis, Alborz Geramifard et al.

ICLR 2024arXiv:2305.14550
17
citations
#4443

Weakly Supervised Semantic Segmentation for Driving Scenes

Dongseob Kim, Seungho Lee, Junsuk Choe et al.

AAAI 2024paperarXiv:2312.13646
17
citations
#4444

Object Pose Estimation via the Aggregation of Diffusion Features

Tianfu Wang, Guosheng Hu, Hongguang Wang

CVPR 2024highlightarXiv:2403.18791
17
citations
#4445

Leave-one-out Distinguishability in Machine Learning

Jiayuan Ye, Anastasia Borovykh, Soufiane Hayou et al.

ICLR 2024arXiv:2309.17310
17
citations
#4446

Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition

Masashi Hatano, Ryo Hachiuma, Ryo Fujii et al.

ECCV 2024arXiv:2405.19917
17
citations
#4447

Decompose-and-Compose: A Compositional Approach to Mitigating Spurious Correlation

Fahimeh Hosseini Noohdani, Parsa Hosseini, Aryan Yazdan Parast et al.

CVPR 2024arXiv:2402.18919
17
citations
#4448

A New Benchmark and Model for Challenging Image Manipulation Detection

Zhenfei Zhang, Mingyang Li, Ming-Ching Chang

AAAI 2024paperarXiv:2311.14218
17
citations
#4449

SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration

Kezheng Xiong, Maoji Zheng, Qingshan Xu et al.

AAAI 2024paperarXiv:2312.08664
17
citations
#4450

An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding

Wei Chen, Long Chen, Yu Wu

ECCV 2024arXiv:2408.01120
17
citations
#4451

LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation

Yuchen Su, Zhineng Chen, Zhiwen Shao et al.

AAAI 2024paperarXiv:2306.15142
17
citations
#4452

Context Diffusion: In-Context Aware Image Generation

Ivona Najdenkoska, Animesh Sinha, Abhimanyu Dubey et al.

ECCV 2024arXiv:2312.03584
17
citations
#4453

DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data

Qihao Liu, Yi Zhang, Song Bai et al.

CVPR 2024arXiv:2406.04322
17
citations
#4454

Non-Exchangeable Conformal Risk Control

António Farinhas, Chrysoula Zerva, Dennis Ulmer et al.

ICLR 2024arXiv:2310.01262
17
citations
#4455

$S^2$IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting

Zijie Pan, Yushan Jiang, Sahil Garg et al.

ICML 2024oralarXiv:2403.05798
17
citations
#4456

HDRFlow: Real-Time HDR Video Reconstruction with Large Motions

Gangwei Xu, Yujin Wang, Jinwei Gu et al.

CVPR 2024arXiv:2403.03447
17
citations
#4457

Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables

Haisong Gong, Weizhi Xu, Shu Wu et al.

AAAI 2024paperarXiv:2402.13028
17
citations
#4458

Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation

Xuelu Feng, Dongdong Chen, Junsong Yuan et al.

ECCV 2024arXiv:2403.12042
17
citations
#4459

GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding

Zi-Ting Chou, Sheng-Yu Huang, I-Jieh Liu et al.

CVPR 2024arXiv:2403.03608
17
citations
#4460

PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer

Tongkun Guan, Chengyu Lin, Wei Shen et al.

ECCV 2024arXiv:2407.07764
17
citations
#4461

Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers

Md Shamim Hussain, Mohammed Zaki, Dharmashankar Subramanian

ICML 2024arXiv:2402.04538
17
citations
#4462

Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning

Yunbin Tu, Liang Li, Li Su et al.

ECCV 2024arXiv:2407.11683
17
citations
#4463

TCNet: Continuous Sign Language Recognition from Trajectories and Correlated Regions

AAAI 2024paperarXiv:2403.11818
17
citations
#4464

Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation

Tao Chen, Xiruo Jiang, Gensheng Pei et al.

ECCV 2024arXiv:2407.02768
17
citations
#4465

Sliced Wasserstein with Random-Path Projecting Directions

Khai Nguyen, Shujian Zhang, Tam Le et al.

ICML 2024arXiv:2401.15889
17
citations
#4466

Image Inpainting via Iteratively Decoupled Probabilistic Modeling

Wenbo Li, Xin Yu, Kun Zhou et al.

ICLR 2024spotlightarXiv:2212.02963
17
citations
#4467

CAMIL: Context-Aware Multiple Instance Learning for Cancer Detection and Subtyping in Whole Slide Images

Olga Fourkioti, Matt De Vries, Chris Bakal

ICLR 2024spotlightarXiv:2305.05314
17
citations
#4468

CustomListener: Text-guided Responsive Interaction for User-friendly Listening Head Generation

Xi Liu, Ying Guo, Cheng Zhen et al.

CVPR 2024arXiv:2403.00274
17
citations
#4469

AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking

Yuheng Li, Tianyu Luan, Yizhou Wu et al.

ECCV 2024arXiv:2407.06468
17
citations
#4470

Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding

Tatsunori Taniai, Ryo Igarashi, Yuta Suzuki et al.

ICLR 2024arXiv:2403.11686
17
citations
#4471

Self-Supervised Video Desmoking for Laparoscopic Surgery

Renlong Wu, Zhilu Zhang, Shuohao Zhang et al.

ECCV 2024arXiv:2403.11192
17
citations
#4472

What Makes a Good Prune? Maximal Unstructured Pruning for Maximal Cosine Similarity

Gabryel Mason-Williams, Fredrik Dahlqvist

ICLR 2024
17
citations
#4473

Condition-Aware Neural Network for Controlled Image Generation

Han Cai, Muyang Li, Qinsheng Zhang et al.

CVPR 2024arXiv:2404.01143
17
citations
#4474

SfmCAD: Unsupervised CAD Reconstruction by Learning Sketch-based Feature Modeling Operations

Pu Li, Jianwei Guo, Huibin Li et al.

CVPR 2024
17
citations
#4475

Counterfactual Image Editing

Yushu Pan, Elias Bareinboim

ICML 2024arXiv:2403.09683
17
citations
#4476

One-stage Prompt-based Continual Learning

Youngeun Kim, Yuhang Li, Priyadarshini Panda

ECCV 2024arXiv:2402.16189
17
citations
#4477

On the Universality of Volume-Preserving and Coupling-Based Normalizing Flows

Felix Draxler, Stefan Wahl, Christoph Schnörr et al.

ICML 2024arXiv:2402.06578
17
citations
#4478

Stratified Avatar Generation from Sparse Observations

Han Feng, Wenchao Ma, Quankai Gao et al.

CVPR 2024arXiv:2405.20786
17
citations
#4479

Stable Unlearnable Example: Enhancing the Robustness of Unlearnable Examples via Stable Error-Minimizing Noise

Yixin Liu, Kaidi Xu, Xun Chen et al.

AAAI 2024paperarXiv:2311.13091
17
citations
#4480

Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss

Jaeha Kim, Junghun Oh, Kyoung Mu Lee

CVPR 2024arXiv:2404.01692
17
citations
#4481

Learning to Optimize Permutation Flow Shop Scheduling via Graph-Based Imitation Learning

Longkang Li, Siyuan Liang, Zihao Zhu et al.

AAAI 2024paperarXiv:2210.17178
17
citations
#4482

Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning

Jiachen Li, Qiaozi Gao, Michael Johnston et al.

ICML 2024arXiv:2310.09676
17
citations
#4483

Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment

Alireza Ganjdanesh, Shangqian Gao, Heng Huang

CVPR 2024arXiv:2403.19490
17
citations
#4484

Sliced Denoising: A Physics-Informed Molecular Pre-Training Method

Yuyan Ni, Shikun Feng, Wei-Ying Ma et al.

ICLR 2024arXiv:2311.02124
17
citations
#4485

MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors

He Zhang, Shenghao Ren, Haolei Yuan et al.

CVPR 2024arXiv:2403.17610
17
citations
#4486

QDFormer: Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition

Xiang Li, Jinglu Wang, Xiaohao Xu et al.

CVPR 2024arXiv:2310.00132
17
citations
#4487

Towards Understanding Factual Knowledge of Large Language Models

Xuming Hu, Junzhe Chen, Xiaochuan Li et al.

ICLR 2024oral
17
citations
#4488

Motion-prior Contrast Maximization for Dense Continuous-Time Motion Estimation

Friedhelm Hamann, Ziyun Wang, Ioannis Asmanis et al.

ECCV 2024arXiv:2407.10802
17
citations
#4489

Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation

Xinyu Ma, Xu Chu, Zhibang Yang et al.

ICML 2024arXiv:2404.04316
17
citations
#4490

Sparse Spiking Neural Network: Exploiting Heterogeneity in Timescales for Pruning Recurrent SNN

Biswadeep Chakraborty, Beomseok Kang, Harshit Kumar et al.

ICLR 2024arXiv:2403.03409
17
citations
#4491

Physics-Informed Neural Network Policy Iteration: Algorithms, Convergence, and Verification

Yiming Meng, Ruikun Zhou, Amartya Mukherjee et al.

ICML 2024arXiv:2402.10119
17
citations
#4492

Systematic Comparison of Semi-supervised and Self-supervised Learning for Medical Image Classification

Zhe Huang, Ruijie Jiang, Shuchin Aeron et al.

CVPR 2024arXiv:2307.08919
17
citations
#4493

HHMR: Holistic Hand Mesh Recovery by Enhancing the Multimodal Controllability of Graph Diffusion Models

Mengcheng Li, Hongwen Zhang, Yuxiang Zhang et al.

CVPR 2024highlightarXiv:2406.01334
17
citations
#4494

E.T. the Exceptional Trajectory: Text-to-camera-trajectory generation with character awareness

Robin Courant, Nicolas Dufour, Xi Wang et al.

ECCV 2024arXiv:2407.01516
17
citations
#4495

Align Before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition

Yifei Chen, Dapeng Chen, Ruijin Liu et al.

CVPR 2024arXiv:2311.15619
17
citations
#4496

Controllable 3D Face Generation with Conditional Style Code Diffusion

AAAI 2024paperarXiv:2312.13941
17
citations
#4497

Diffusion Posterior Sampling is Computationally Intractable

Shivam Gupta, Ajil Jalal, Aditya Parulekar et al.

ICML 2024arXiv:2402.12727
17
citations
#4498

Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models

Zhihe Lu, Jiawang Bai, Xin Li et al.

ICML 2024arXiv:2311.17091
17
citations
#4499

ProtoGate: Prototype-based Neural Networks with Global-to-local Feature Selection for Tabular Biomedical Data

Xiangjian Jiang, Andrei Margeloiu, Nikola Simidjievski et al.

ICML 2024arXiv:2306.12330
17
citations
#4500

Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance

Tien Toan Nguyen, Minh Nhat Vu, Baoru Huang et al.

ECCV 2024arXiv:2407.13842
17
citations
#4501

Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors

Haoxuanye Ji, Pengpeng Liang, Erkang Cheng

CVPR 2024arXiv:2403.06093
17
citations
#4502

Predictive, scalable and interpretable knowledge tracing on structured domains

Hanqi Zhou, Robert Bamler, Charley Wu et al.

ICLR 2024spotlightarXiv:2403.13179
17
citations
#4503

Low-Resource Vision Challenges for Foundation Models

Yunhua Zhang, Hazel Doughty, Cees G. M. Snoek

CVPR 2024arXiv:2401.04716
17
citations
#4504

Diffusive Gibbs Sampling

Wenlin Chen, Mingtian Zhang, Brooks Paige et al.

ICML 2024arXiv:2402.03008
17
citations
#4505

LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units

Zeyu Liu, Gourav Datta, Anni Li et al.

ICLR 2024arXiv:2402.04882
17
citations
#4506

CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process

Guangyi Chen, Yifan Shen, Zhenhao Chen et al.

ICML 2024oralarXiv:2401.14535
17
citations
#4507

Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold

Jun Chen, Haishan Ye, Mengmeng Wang et al.

ICLR 2024arXiv:2308.10547
17
citations
#4508

UniM2AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving

Jian Zou, Tianyu Huang, Guanglei Yang et al.

ECCV 2024
17
citations
#4509

Differentiable Information Bottleneck for Deterministic Multi-view Clustering

Xiaoqiang Yan, Zhixiang Jin, Fengshou Han et al.

CVPR 2024arXiv:2403.15681
17
citations
#4510

EControl: Fast Distributed Optimization with Compression and Error Control

Yuan Gao, Rustem Islamov, Sebastian Stich

ICLR 2024arXiv:2311.05645
17
citations
#4511

Unleashing the Potential of Fractional Calculus in Graph Neural Networks with FROND

Qiyu Kang, Kai Zhao, Qinxu Ding et al.

ICLR 2024spotlightarXiv:2404.17099
17
citations
#4512

SparseFormer: Sparse Visual Recognition via Limited Latent Tokens

Ziteng Gao, Zhan Tong, Limin Wang et al.

ICLR 2024arXiv:2304.03768
17
citations
#4513

Unsupervised Layer-Wise Score Aggregation for Textual OOD Detection

Maxime Darrin, Guillaume Staerman, Eduardo Dadalto Camara Gomes et al.

AAAI 2024paperarXiv:2302.09852
17
citations
#4514

NeuroBack: Improving CDCL SAT Solving using Graph Neural Networks

Wenxi Wang, Yang Hu, Mohit Tiwari et al.

ICLR 2024arXiv:2110.14053
17
citations
#4515

NeISF: Neural Incident Stokes Field for Geometry and Material Estimation

Chenhao Li, Taishi Ono, Takeshi Uemori et al.

CVPR 2024highlightarXiv:2311.13187
17
citations
#4516

M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models

Seunggeun Chi, Hyung-gun Chi, Hengbo Ma et al.

ECCV 2024arXiv:2407.14502
17
citations
#4517

Single-Trajectory Distributionally Robust Reinforcement Learning

Zhipeng Liang, Xiaoteng Ma, Jose Blanchet et al.

ICML 2024arXiv:2301.11721
17
citations
#4518

MSD: A Benchmark Dataset for Floor Plan Generation of Building Complexes

Casper van Engelenburg, Fatemeh Mostafavi, Emanuel Kuhn et al.

ECCV 2024arXiv:2407.10121
17
citations
#4519

Geometry Transfer for Stylizing Radiance Fields

Hyunyoung Jung, Seonghyeon Nam, Nikolaos Sarafianos et al.

CVPR 2024arXiv:2402.00863
17
citations
#4520

R-MAE: Regions Meet Masked Autoencoders

Duy-Kien Nguyen, Yanghao Li, Vaibhav Aggarwal et al.

ICLR 2024arXiv:2306.05411
17
citations
#4521

Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision

Hao Dong, Eleni Chatzi, Olga Fink

ECCV 2024arXiv:2407.01518
17
citations
#4522

Predicated Diffusion: Predicate Logic-Based Attention Guidance for Text-to-Image Diffusion Models

Kota Sueyoshi, Takashi Matsubara

CVPR 2024highlightarXiv:2311.16117
17
citations
#4523

Lane2Seq: Towards Unified Lane Detection via Sequence Generation

Kunyang Zhou

CVPR 2024arXiv:2402.17172
17
citations
#4524

PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts

Zewen Chen, Haina Qin, Juan Wang et al.

ECCV 2024arXiv:2403.04993
17
citations
#4525

Spider: A Unified Framework for Context-dependent Concept Segmentation

Xiaoqi Zhao, Youwei Pang, Wei Ji et al.

ICML 2024arXiv:2405.01002
17
citations
#4526

Self-supervised Representation Learning from Random Data Projectors

Yi Sui, Tongzi Wu, Jesse Cresswell et al.

ICLR 2024arXiv:2310.07756
17
citations
#4527

Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality

Xuxi Chen, Yu Yang, Zhangyang Wang et al.

ICLR 2024arXiv:2310.06982
17
citations
#4528

Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation

Xiang Gao, Zhengbo Xu, Junhan Zhao et al.

AAAI 2024paperarXiv:2407.03006
17
citations
#4529

Information Flow in Self-Supervised Learning

Zhiquan Tan, Jingqin Yang, Weiran Huang et al.

ICML 2024arXiv:2309.17281
17
citations
#4530

CycleINR: Cycle Implicit Neural Representation for Arbitrary-Scale Volumetric Super-Resolution of Medical Data

Wei Fang, Yuxing Tang, Heng Guo et al.

CVPR 2024arXiv:2404.04878
17
citations
#4531

DPN: Decoupling Partition and Navigation for Neural Solvers of Min-max Vehicle Routing Problems

Zhi Zheng, Shunyu Yao, Zhenkun Wang et al.

ICML 2024arXiv:2405.17272
17
citations
#4532

Multi-Source Collaborative Gradient Discrepancy Minimization for Federated Domain Generalization

Yikang Wei, Yahong Han

AAAI 2024paperarXiv:2401.10272
17
citations
#4533

Single-to-Dual-View Adaptation for Egocentric 3D Hand Pose Estimation

Ruicong Liu, Takehiko Ohkawa, Mingfang Zhang et al.

CVPR 2024arXiv:2403.04381
17
citations
#4534

You Only Need Less Attention at Each Stage in Vision Transformers

Shuoxi Zhang, Hanpeng Liu, Stephen Lin et al.

CVPR 2024arXiv:2406.00427
17
citations
#4535

Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space

Minji Lee, Luiz Felipe Vecchietti, Hyunkyu Jung et al.

ICML 2024spotlightarXiv:2405.18986
17
citations
#4536

Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains

Kyungeun Lee, Ye Seul Sim, Hye-Seung Cho et al.

ICML 2024arXiv:2405.07414
17
citations
#4537

Residual-Conditioned Optimal Transport: Towards Structure-Preserving Unpaired and Paired Image Restoration

Xiaole Tang, Hu Xin, Xiang Gu et al.

ICML 2024arXiv:2405.02843
17
citations
#4538

Enhancing Hyperspectral Images via Diffusion Model and Group-Autoencoder Super-resolution Network

Zhaoyang Wang, Dongyang Li, Mingyang Zhang et al.

AAAI 2024paperarXiv:2402.17285
17
citations
#4539

Grokking Group Multiplication with Cosets

Dashiell Stander, Qinan Yu, Honglu Fan et al.

ICML 2024arXiv:2312.06581
17
citations
#4540

Adaptive VIO: Deep Visual-Inertial Odometry with Online Continual Learning

Youqi Pan, Wugen Zhou, Yingdian Cao et al.

CVPR 2024arXiv:2405.16754
17
citations
#4541

Identifying Representations for Intervention Extrapolation

Sorawit (James) Saengkyongam, Elan Rosenfeld, Pradeep K Ravikumar et al.

ICLR 2024arXiv:2310.04295
17
citations
#4542

Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment

Brian Gordon, Yonatan Bitton, Yonatan Shafir et al.

ECCV 2024arXiv:2312.03766
17
citations
#4543

Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation

Kai Huang, Hanyun Yin, Heng Huang et al.

ICLR 2024arXiv:2309.13192
17
citations
#4544

More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning

Kaiwen Wang, Owen Oertell, Alekh Agarwal et al.

ICML 2024arXiv:2402.07198
17
citations
#4545

Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity

Hagyeong Lee, Minkyu Kim, Jun-Hyuk Kim et al.

ICML 2024arXiv:2403.02944
17
citations
#4546

How to Evaluate the Generalization of Detection? A Benchmark for Comprehensive Open-Vocabulary Detection

Yiyang Yao, Peng Liu, Tiancheng Zhao et al.

AAAI 2024paperarXiv:2308.13177
17
citations
#4547

DiffusionNAG: Predictor-guided Neural Architecture Generation with Diffusion Models

Sohyun An, Hayeon Lee, Jaehyeong Jo et al.

ICLR 2024arXiv:2305.16943
17
citations
#4548

Neural Visibility Field for Uncertainty-Driven Active Mapping

Shangjie Xue, Jesse Dill, Pranay Mathur et al.

CVPR 2024arXiv:2406.06948
17
citations
#4549

Self-Supervised Contrastive Learning for Long-term Forecasting

Junwoo Park, Daehoon Gwak, Jaegul Choo et al.

ICLR 2024arXiv:2402.02023
17
citations
#4550

Retraining-Free Model Quantization via One-Shot Weight-Coupling Learning

Chen Tang, Yuan Meng, Jiacheng Jiang et al.

CVPR 2024arXiv:2401.01543
17
citations
#4551

Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators

Jianhao Yuan, Francesco Pinto, Adam Davies et al.

ICML 2024arXiv:2212.11237
17
citations
#4552

RedCore: Relative Advantage Aware Cross-Modal Representation Learning for Missing Modalities with Imbalanced Missing Rates

Jun Sun, Xinxin Zhang, Shoukang Han et al.

AAAI 2024paperarXiv:2312.10386
17
citations
#4553

Label-Noise Robust Diffusion Models

Byeonghu Na, Yeongmin Kim, HeeSun Bae et al.

ICLR 2024arXiv:2402.17517
16
citations
#4554

Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs

Daniel D. Johnson, Daniel Tarlow, David Duvenaud et al.

ICML 2024arXiv:2402.08733
16
citations
#4555

Compositional Generative Inverse Design

Tailin Wu, Takashi Maruyama, Long Wei et al.

ICLR 2024spotlightarXiv:2401.13171
16
citations
#4556

Tri$^{2}$-plane: Thinking Head Avatar via Feature Pyramid

Luchuan Song, Pinxin Liu, Lele Chen et al.

ECCV 2024arXiv:2401.09386
16
citations
#4557

ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization

Yixin Yang, Jiangxin Dong, Jinhui Tang et al.

ECCV 2024arXiv:2404.06251
16
citations
#4558

Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning

Yibing Wei, Abhinav Gupta, Pedro Morgado

ECCV 2024arXiv:2407.15837
16
citations
#4559

Taming Latent Diffusion Model for Neural Radiance Field Inpainting

Chieh Lin, Changil Kim, Jia-Bin Huang et al.

ECCV 2024arXiv:2404.09995
16
citations
#4560

Jointly-Learned Exit and Inference for a Dynamic Neural Network

Florence Regol, Joud Chataoui, Mark Coates

ICLR 2024arXiv:2310.09163
16
citations
#4561

Learning Instance-Aware Correspondences for Robust Multi-Instance Point Cloud Registration in Cluttered Scenes

Zhiyuan Yu, Zheng Qin, Lintao Zheng et al.

CVPR 2024arXiv:2404.04557
16
citations
#4562

Three Heads Are Better than One: Complementary Experts for Long-Tailed Semi-supervised Learning

Chengcheng Ma, Ismail Elezi, Jiankang Deng et al.

AAAI 2024paperarXiv:2312.15702
16
citations
#4563

Real-Time Simulated Avatar from Head-Mounted Sensors

Zhengyi Luo, Jinkun Cao, Rawal Khirodkar et al.

CVPR 2024highlightarXiv:2403.06862
16
citations
#4564

Gaussian Shadow Casting for Neural Characters

Luis Bolanos, Shih-Yang Su, Helge Rhodin

CVPR 2024arXiv:2401.06116
16
citations
#4565

Compositional Generalization for Multi-Label Text Classification: A Data-Augmentation Approach

Yuyang Chai, Zhuang Li, Jiahui Liu et al.

AAAI 2024paperarXiv:2312.11276
16
citations
#4566

Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses

Inhee Lee, Byungjun Kim, Hanbyul Joo

CVPR 2024arXiv:2404.14410
16
citations
#4567

Explicitly Guided Information Interaction Network for Cross-modal Point Cloud Completion

Hang Xu, Chen Long, Wenxiao Zhang et al.

ECCV 2024arXiv:2407.02887
16
citations
#4568

Towards Multi-modal Transformers in Federated Learning

Guangyu Sun, Matias Mendieta, Aritra Dutta et al.

ECCV 2024arXiv:2404.12467
16
citations
#4569

MCPNet: An Interpretable Classifier via Multi-Level Concept Prototypes

Bor Shiun Wang, Chien-Yi Wang, Wei-Chen Chiu

CVPR 2024arXiv:2404.08968
16
citations
#4570

DDMI: Domain-agnostic Latent Diffusion Models for Synthesizing High-Quality Implicit Neural Representations

Dogyun Park, Sihyeon Kim, Sojin Lee et al.

ICLR 2024arXiv:2401.12517
16
citations
#4571

BaCon: Boosting Imbalanced Semi-supervised Learning via Balanced Feature-Level Contrastive Learning

Qianhan Feng, Lujing Xie, Shijie Fang et al.

AAAI 2024paperarXiv:2403.12986
16
citations
#4572

Balanced Resonate-and-Fire Neurons

Saya Higuchi, Sebastian Kairat, Sander Bohte et al.

ICML 2024arXiv:2402.14603
16
citations
#4573

Content-Style Decoupling for Unsupervised Makeup Transfer without Generating Pseudo Ground Truth

Zhaoyang Sun, Shengwu Xiong, Yaxiong Chen et al.

CVPR 2024arXiv:2405.17240
16
citations
#4574

Controllable Navigation Instruction Generation with Chain of Thought Prompting

Xianghao Kong, Jinyu Chen, Wenguan Wang et al.

ECCV 2024arXiv:2407.07433
16
citations
#4575

Ditto: Quantization-aware Secure Inference of Transformers upon MPC

Haoqi Wu, Wenjing Fang, Yancheng Zheng et al.

ICML 2024arXiv:2405.05525
16
citations
#4576

Adversarial Score Distillation: When score distillation meets GAN

Min Wei, Jingkai Zhou, Junyao Sun et al.

CVPR 2024arXiv:2312.00739
16
citations
#4577

LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction

Penghui Du, Yu Wang, Yifan Sun et al.

ECCV 2024arXiv:2407.11335
16
citations
#4578

Review-Enhanced Hierarchical Contrastive Learning for Recommendation

Ke Wang, Yanmin Zhu, Tianzi Zang et al.

AAAI 2024paper
16
citations
#4579

Negative Pre-aware for Noisy Cross-Modal Matching

Xu Zhang, Hao Li, Mang Ye

AAAI 2024paperarXiv:2312.05777
16
citations
#4580

Progressive Poisoned Data Isolation for Training-Time Backdoor Defense

Yiming Chen, Haiwei Wu, Jiantao Zhou

AAAI 2024paperarXiv:2312.12724
16
citations
#4581

One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts

Ruochen Wang, Sohyun An, Minhao Cheng et al.

ICML 2024arXiv:2407.00256
16
citations
#4582

Diversified and Personalized Multi-rater Medical Image Segmentation

Yicheng Wu, Xiangde Luo, Zhe Xu et al.

CVPR 2024highlightarXiv:2403.13417
16
citations
#4583

FoX: Formation-Aware Exploration in Multi-Agent Reinforcement Learning

Yonghyeon Jo, Sunwoo Lee, Junghyuk Yum et al.

AAAI 2024paperarXiv:2308.11272
16
citations
#4584

How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval?

Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain et al.

CVPR 2024arXiv:2403.07203
16
citations
#4585

Functional Interpolation for Relative Positions improves Long Context Transformers

Shanda Li, Chong You, Guru Guruganesh et al.

ICLR 2024arXiv:2310.04418
16
citations
#4586

Learning with Counterfactual Explanations for Radiology Report Generation

Mingjie Li, Haokun Lin, Liang Qiu et al.

ECCV 2024arXiv:2407.14474
16
citations
#4587

Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection

Taeheon Kim, Sebin Shin, Youngjoon Yu et al.

CVPR 2024arXiv:2403.01300
16
citations
#4588

Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis

Qian Chen, Shihao Shu, Xiangzhi Bai

ECCV 2024arXiv:2409.08042
16
citations
#4589

PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling

Xiaoyun Zheng, Liwei Liao, Xufeng Li et al.

CVPR 2024arXiv:2403.16080
16
citations
#4590

Overcoming the Pitfalls of Vision-Language Model Finetuning for OOD Generalization

Yuhang Zang, Hanlin Goh, Joshua Susskind et al.

ICLR 2024arXiv:2401.15914
16
citations
#4591

A Geometric Explanation of the Likelihood OOD Detection Paradox

Hamidreza Kamkari, Brendan Ross, Jesse Cresswell et al.

ICML 2024arXiv:2403.18910
16
citations
#4592

Binarized Low-light Raw Video Enhancement

Gengchen Zhang, Yulun Zhang, Xin Yuan et al.

CVPR 2024arXiv:2403.19944
16
citations
#4593

Progressive Divide-and-Conquer via Subsampling Decomposition for Accelerated MRI

Chong Wang, Lanqing Guo, Yufei Wang et al.

CVPR 2024highlightarXiv:2403.10064
16
citations
#4594

Beyond MOT: Semantic Multi-Object Tracking

Yunhao Li, Qin Li, Hao Wang et al.

ECCV 2024arXiv:2403.05021
16
citations
#4595

Quadratic models for understanding catapult dynamics of neural networks

Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan et al.

ICLR 2024arXiv:2205.11787
16
citations
#4596

An Interpretable Evaluation of Entropy-based Novelty of Generative Models

Jingwei Zhang, Cheuk Ting Li, Farzan Farnia

ICML 2024arXiv:2402.17287
16
citations
#4597

SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild

Andreas Engelhardt, Amit Raj, Mark Boss et al.

CVPR 2024arXiv:2401.10171
16
citations
#4598

FedSOL: Stabilized Orthogonal Learning with Proximal Restrictions in Federated Learning

Gihun Lee, Minchan Jeong, SangMook Kim et al.

CVPR 2024arXiv:2308.12532
16
citations
#4599

TULIP: Transformer for Upsampling of LiDAR Point Clouds

Bin Yang, Patrick Pfreundschuh, Roland Siegwart et al.

CVPR 2024arXiv:2312.06733
16
citations
#4600

Generative Modeling on Manifolds Through Mixture of Riemannian Diffusion Processes

Jaehyeong Jo, Sung Ju Hwang

ICML 2024arXiv:2310.07216
16
citations