Most Cited 2024 "action region localization" Papers

12,324 papers found • Page 10 of 62

#1801

CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning

Ziyang Gong, FuHao Li, Yupeng Deng et al.

ECCV 2024 • arXiv:2403.17369 • 18 citations
#1802

STDiff: Spatio-Temporal Diffusion for Continuous Stochastic Video Prediction

Xi Ye, Guillaume-Alexandre Bilodeau

AAAI 2024 • arXiv:2312.06486 • 18 citations
#1803

Implicit Concept Removal of Diffusion Models

Zhili LIU, Kai Chen, Yifan Zhang et al.

ECCV 2024 • arXiv:2310.05873 • 18 citations
#1804

Benchmarking Algorithms for Federated Domain Generalization

Ruqi Bai, Saurabh Bagchi, David Inouye

ICLR 2024 (spotlight) • arXiv:2307.04942 • 18 citations
#1805

Good Teachers Explain: Explanation-Enhanced Knowledge Distillation

Amin Parchami, Moritz Böhle, Sukrut Rao et al.

ECCV 2024 • arXiv:2402.03119 • 18 citations
#1806

MM-Point: Multi-View Information-Enhanced Multi-Modal Self-Supervised 3D Point Cloud Understanding

HaiTao Yu, Mofei Song

AAAI 2024 • arXiv:2402.10002 • 18 citations
#1807

Synchronization is All You Need: Exocentric-to-Egocentric Transfer for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs

Camillo Quattrocchi, Antonino Furnari, Daniele Di Mauro et al.

ECCV 2024 • arXiv:2312.02638 • 18 citations
#1808

FreePoint: Unsupervised Point Cloud Instance Segmentation

Zhikai Zhang, Jian Ding, Li Jiang et al.

CVPR 2024 • arXiv:2305.06973 • 18 citations
#1809

DVSAI: Diverse View-Shared Anchors Based Incomplete Multi-View Clustering

Shengju Yu, Siwei Wang, Pei Zhang et al.

AAAI 2024 • 18 citations
#1810

A Simple and Effective Point-based Network for Event Camera 6-DOFs Pose Relocalization

Hongwei Ren, Jiadong Zhu, Yue Zhou et al.

CVPR 2024 • arXiv:2403.19412 • 18 citations
#1811

MESA: Matching Everything by Segmenting Anything

Yesheng Zhang, Xu Zhao

CVPR 2024 • arXiv:2401.16741 • 18 citations
#1812

Zero-Shot Aerial Object Detection with Visual Description Regularization

Chenyu Lin, Zhengqing Zang, Chenwei Tang et al.

AAAI 2024 • arXiv:2402.18233 • 18 citations
#1813

MoST: Motion Style Transformer Between Diverse Action Contents

Boeun Kim, Jungho Kim, Hyung Jin Chang et al.

CVPR 2024 • arXiv:2403.06225 • 18 citations
#1814

Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt

Bin-Bin Gao

ECCV 2024 • arXiv:2505.09264 • 18 citations
#1815

Crowd-SAM: SAM as a smart annotator for object detection in crowded scenes

Zhi Cai, Yingjie Gao, Yaoyan Zheng et al.

ECCV 2024 • arXiv:2407.11464 • 18 citations
#1816

Traffic Flow Optimisation for Lifelong Multi-Agent Path Finding

Zhe Chen, Daniel Harabor, Jiaoyang Li et al.

AAAI 2024 • arXiv:2308.11234 • 18 citations
#1817

Code-Style In-Context Learning for Knowledge-Based Question Answering

Zhijie Nie, Richong Zhang, Zhongyuan Wang et al.

AAAI 2024 • arXiv:2309.04695 • 18 citations
#1818

Diverse Person: Customize Your Own Dataset for Text-Based Person Search

Zifan Song, Guosheng Hu, Cairong Zhao

AAAI 2024 • 18 citations
#1819

Connecting Consistency Distillation to Score Distillation for Text-to-3D Generation

Zongrui Li, Minghui Hu, Qian Zheng et al.

ECCV 2024 • arXiv:2407.13584 • 18 citations
#1820

Beta-Tuned Timestep Diffusion Model

Tianyi Zheng, Peng-Tao Jiang, Ben Wan et al.

ECCV 2024 • 18 citations
#1821

Relightable and Animatable Neural Avatars from Videos

Wenbin Lin, Chengwei Zheng, Jun-hai Yong et al.

AAAI 2024 • arXiv:2312.12877 • 18 citations
#1822

Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation

Zikai Huang, Xuemiao Xu, Cheng Xu et al.

ECCV 2024 • arXiv:2407.07554 • 18 citations
#1823

SemiReward: A General Reward Model for Semi-supervised Learning

Siyuan Li, Weiyang Jin, Zedong Wang et al.

ICLR 2024 • arXiv:2310.03013 • 18 citations
#1824

EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval

Thomas Hummel, Shyamgopal Karthik, Mariana-Iuliana Georgescu et al.

ECCV 2024 • arXiv:2407.16658 • 18 citations
#1825

FedMef: Towards Memory-efficient Federated Dynamic Pruning

Hong Huang, Weiming Zhuang, Chen Chen et al.

CVPR 2024 • arXiv:2403.14737 • 18 citations
#1826

Geometric-Facilitated Denoising Diffusion Model for 3D Molecule Generation

Can Xu, Haosen Wang, Weigang Wang et al.

AAAI 2024 • arXiv:2401.02683 • 18 citations
#1827

Open-Set Domain Adaptation for Semantic Segmentation

Seun-An Choe, Ah-Hyung Shin, Keon Hee Park et al.

CVPR 2024 • arXiv:2405.19899 • 18 citations
#1828

Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring

Huicong Zhang, Haozhe Xie, Hongxun Yao

CVPR 2024 • arXiv:2406.07551 • 18 citations
#1829

InfMAE: A Foundation Model in The Infrared Modality

Fangcen liu, Chenqiang Gao, Yaming Zhang et al.

ECCV 2024 • arXiv:2402.00407 • 18 citations
#1830

HiPose: Hierarchical Binary Surface Encoding and Correspondence Pruning for RGB-D 6DoF Object Pose Estimation

Yongliang Lin, Yongzhi Su, Praveen Nathan et al.

CVPR 2024 • arXiv:2311.12588 • 18 citations
#1831

BCLNet: Bilateral Consensus Learning for Two-View Correspondence Pruning

Xiangyang Miao, Guobao Xiao, Shiping Wang et al.

AAAI 2024 • arXiv:2401.03459 • 18 citations
#1832

SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis

Teng Hu, Ran Yi, Baihong Qian et al.

CVPR 2024 • arXiv:2406.09794 • 18 citations
#1833

A Dual-Way Enhanced Framework from Text Matching Point of View for Multimodal Entity Linking

Shezheng Song, Shan Zhao, ChengYu Wang et al.

AAAI 2024 • arXiv:2312.11816 • 18 citations
#1834

MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing

Haoyu Zhao, Tianyi Lu, Jiaxi Gu et al.

ECCV 2024 • arXiv:2311.17338 • 18 citations
#1835

FAR: Flexible Accurate and Robust 6DoF Relative Camera Pose Estimation

Chris Rockwell, Nilesh Kulkarni, Linyi Jin et al.

CVPR 2024 (highlight) • arXiv:2403.03221 • 18 citations
#1836

TAMM: TriAdapter Multi-Modal Learning for 3D Shape Understanding

Zhihao Zhang, Shengcao Cao, Yu-Xiong Wang

CVPR 2024 • arXiv:2402.18490 • 18 citations
#1837

Spectral-Based Graph Neural Networks for Complementary Item Recommendation

Haitong Luo, Xuying Meng, Suhang Wang et al.

AAAI 2024 • 18 citations
#1838

UNIC: Universal Classification Models via Multi-teacher Distillation

Yannis Kalantidis, Larlus Diane, Mert Bulent SARIYILDIZ et al.

ECCV 2024 • arXiv:2408.05088 • 18 citations
#1839

Towards Faithful XAI Evaluation via Generalization-Limited Backdoor Watermark

Mengxi Ya, Yiming Li, Tao Dai et al.

ICLR 2024 • 18 citations
#1840

EAT: Towards Long-Tailed Out-of-Distribution Detection

Tong Wei, Bo-Lin Wang, Min-Ling Zhang

AAAI 2024 • arXiv:2312.08939 • 18 citations
#1841

PartSTAD: 2D-to-3D Part Segmentation Task Adaptation

Hyunjin Kim, Minhyuk Sung

ECCV 2024 • arXiv:2401.05906 • 18 citations
#1842

Temporally and Distributionally Robust Optimization for Cold-Start Recommendation

Xinyu Lin, Wenjie Wang, Jujia Zhao et al.

AAAI 2024 • arXiv:2312.09901 • 18 citations
#1843

Diffusion Model is a Good Pose Estimator from 3D RF-Vision

Junqiao Fan, Jianfei Yang, Yuecong Xu et al.

ECCV 2024 • arXiv:2403.16198 • 17 citations
#1844

Scaling and Masking: A New Paradigm of Data Sampling for Image and Video Quality Assessment

Yongxu Liu, Yinghui Quan, Guoyao Xiao et al.

AAAI 2024 • arXiv:2401.02614 • 17 citations
#1845

Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition

Mingfang Zhang, Yifei Huang, Ruicong Liu et al.

ECCV 2024 • arXiv:2407.06628 • 17 citations
#1846

SfmCAD: Unsupervised CAD Reconstruction by Learning Sketch-based Feature Modeling Operations

Pu Li, Jianwei Guo, HUIBIN LI et al.

CVPR 2024 • 17 citations
#1847

DeiT-LT: Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets

Harsh Rangwani, Pradipto Mondal, Mayank Mishra et al.

CVPR 2024 • arXiv:2404.02900 • 17 citations
#1848

What Effects the Generalization in Visual Reinforcement Learning: Policy Consistency with Truncated Return Prediction

Shuo Wang, Zhihao Wu, X. Hu et al.

AAAI 2024 • 17 citations
#1849

SuperNormal: Neural Surface Reconstruction via Multi-View Normal Integration

Xu Cao, Takafumi Taketomi

CVPR 2024 • arXiv:2312.04803 • 17 citations
#1850

Weakly Supervised Semantic Segmentation for Driving Scenes

Dongseob Kim, Seungho Lee, Junsuk Choe et al.

AAAI 2024 • arXiv:2312.13646 • 17 citations
#1851

Differentiable Information Bottleneck for Deterministic Multi-view Clustering

Xiaoqiang Yan, Zhixiang Jin, Fengshou Han et al.

CVPR 2024 • arXiv:2403.15681 • 17 citations
#1852

Visual Alignment Pre-training for Sign Language Translation

Peiqi Jiao, Yuecong Min, Xilin CHEN

ECCV 2024 • 17 citations
#1853

Stitching Sub-trajectories with Conditional Diffusion Model for Goal-Conditioned Offline RL

Sungyoon Kim, Yunseon Choi, Daiki Matsunaga et al.

AAAI 2024 • arXiv:2402.07226 • 17 citations
#1854

SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration

Kezheng Xiong, Maoji Zheng, Qingshan Xu et al.

AAAI 2024 • arXiv:2312.08664 • 17 citations
#1855

Deep Diffusion Image Prior for Efficient OOD Adaptation in 3D Inverse Problems

Hyungjin Chung, Jong Chul Ye

ECCV 2024 • arXiv:2407.10641 • 17 citations
#1856

CycleINR: Cycle Implicit Neural Representation for Arbitrary-Scale Volumetric Super-Resolution of Medical Data

Wei Fang, Yuxing Tang, Heng Guo et al.

CVPR 2024 • arXiv:2404.04878 • 17 citations
#1857

LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation

Yuchen Su, Zhineng Chen, Zhiwen Shao et al.

AAAI 2024 • arXiv:2306.15142 • 17 citations
#1858

Every Node Is Different: Dynamically Fusing Self-Supervised Tasks for Attributed Graph Clustering

Pengfei Zhu, Qian Wang, Yu Wang et al.

AAAI 2024 • arXiv:2401.06595 • 17 citations
#1859

SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds

Yanbo Wang, Wentao Zhao, Cao Chuan et al.

ECCV 2024 • arXiv:2407.11569 • 17 citations
#1860

Revisiting Adversarial Training Under Long-Tailed Distributions

Xinli Yue, Ningping Mou, Qian Wang et al.

CVPR 2024 • arXiv:2403.10073 • 17 citations
#1861

Unsupervised Layer-Wise Score Aggregation for Textual OOD Detection

Maxime Darrin, Guillaume Staerman, Eduardo Dadalto Camara Gomes et al.

AAAI 2024 • arXiv:2302.09852 • 17 citations
#1862

CLOSER: Towards Better Representation Learning for Few-Shot Class-Incremental Learning

Junghun Oh, Sungyong Baik, Kyoung Mu Lee

ECCV 2024 • arXiv:2410.05627 • 17 citations
#1863

Music Style Transfer with Time-Varying Inversion of Diffusion Models

Sifei Li, Yuxin Zhang, Fan Tang et al.

AAAI 2024 • arXiv:2402.13763 • 17 citations
#1864

Adaptive VIO: Deep Visual-Inertial Odometry with Online Continual Learning

Youqi Pan, Wugen Zhou, Yingdian Cao et al.

CVPR 2024 • arXiv:2405.16754 • 17 citations
#1865

LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors

Saksham Suri, Matthew Walmer, Kamal Gupta et al.

ECCV 2024 • arXiv:2403.14625 • 17 citations
#1866

What Makes a Good Prune? Maximal Unstructured Pruning for Maximal Cosine Similarity

Gabryel Mason-Williams, Fredrik Dahlqvist

ICLR 2024 • 17 citations
#1867

Emergent Visual-Semantic Hierarchies in Image-Text Representations

Morris Alper, Hadar Averbuch-Elor

ECCV 2024 • arXiv:2407.08521 • 17 citations
#1868

Adapting Short-Term Transformers for Action Detection in Untrimmed Videos

Min Yang, gaohuan, Ping Guo et al.

CVPR 2024 • arXiv:2312.01897 • 17 citations
#1869

Revisiting Document-Level Relation Extraction with Context-Guided Link Prediction

Monika Jain, Raghava Mutharaju, Ramakanth Kavuluru et al.

AAAI 2024 • arXiv:2401.11800 • 17 citations
#1870

LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units

Zeyu Liu, Gourav Datta, Anni Li et al.

ICLR 2024 • arXiv:2402.04882 • 17 citations
#1871

Condition-Aware Neural Network for Controlled Image Generation

Han Cai, Muyang Li, Qinsheng Zhang et al.

CVPR 2024 • arXiv:2404.01143 • 17 citations
#1872

Label-Agnostic Forgetting: A Supervision-Free Unlearning in Deep Models

Shaofei Shen, Chenhao Zhang, Yawen Zhao et al.

ICLR 2024 • arXiv:2404.00506 • 17 citations
#1873

Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts

Andong Tan, Fengtao Zhou, Hao Chen

ECCV 2024 • arXiv:2408.02265 • 17 citations
#1874

Understanding Video Transformers via Universal Concept Discovery

Matthew Kowal, Achal Dave, Rares Andrei Ambrus et al.

CVPR 2024 (highlight) • arXiv:2401.10831 • 17 citations
#1875

eTraM: Event-based Traffic Monitoring Dataset

Aayush Atul Verma, Bharatesh Chakravarthi, Arpitsinh Vaghela et al.

CVPR 2024 (highlight) • arXiv:2403.19976 • 17 citations
#1876

PetFace: A Large-Scale Dataset and Benchmark for Animal Identification

Risa Shinoda, Kaede Shiohara

ECCV 2024 • arXiv:2407.13555 • 17 citations
#1877

Decompose-and-Compose: A Compositional Approach to Mitigating Spurious Correlation

Fahimeh Hosseini Noohdani, Parsa Hosseini, Aryan Yazdan Parast et al.

CVPR 2024 • arXiv:2402.18919 • 17 citations
#1878

City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web

Kaiwen Song, Xiaoyi Zeng, Chenqu Ren et al.

ECCV 2024 • arXiv:2312.16457 • 17 citations
#1879

Decomposing Semantic Shifts for Composed Image Retrieval

Xingyu Yang, Daqing Liu, Heng Zhang et al.

AAAI 2024 • arXiv:2309.09531 • 17 citations
#1880

HAVE-FUN: Human Avatar Reconstruction from Few-Shot Unconstrained Images

Xihe Yang, Xingyu Chen, Daiheng Gao et al.

CVPR 2024 • arXiv:2311.15672 • 17 citations
#1881

RAW-Adapter: Adapting Pretrained Visual Model to Camera RAW Images

Ziteng Cui, Tatsuya Harada

ECCV 2024 • 17 citations
#1882

Raindrop Clarity: A Dual-Focused Dataset for Day and Night Raindrop Removal

Yeying Jin, Xin Li, Jiadong Wang et al.

ECCV 2024 • arXiv:2407.16957 • 17 citations
#1883

Unlocking the Potential of Federated Learning: The Symphony of Dataset Distillation via Deep Generative Latents

Yuqi Jia, Saeed Vahidian, Jingwei Sun et al.

ECCV 2024 • arXiv:2312.01537 • 17 citations
#1884

Image Inpainting via Iteratively Decoupled Probabilistic Modeling

Wenbo Li, Xin Yu, Kun Zhou et al.

ICLR 2024 (spotlight) • arXiv:2212.02963 • 17 citations
#1885

De-confounded Data-free Knowledge Distillation for Handling Distribution Shifts

Yuzheng Wang, Dingkang Yang, Zhaoyu Chen et al.

CVPR 2024 • arXiv:2403.19539 • 17 citations
#1886

DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data

Qihao Liu, Yi Zhang, Song Bai et al.

CVPR 2024 • arXiv:2406.04322 • 17 citations
#1887

Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation

Kai Huang, Hanyun Yin, Heng Huang et al.

ICLR 2024 • arXiv:2309.13192 • 17 citations
#1888

Towards Understanding Factual Knowledge of Large Language Models

Xuming Hu, Junzhe Chen, Xiaochuan Li et al.

ICLR 2024 (oral) • 17 citations
#1889

Neural Visibility Field for Uncertainty-Driven Active Mapping

Shangjie Xue, Jesse Dill, Pranay Mathur et al.

CVPR 2024 • arXiv:2406.06948 • 17 citations
#1890

Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation

Xuelu Feng, Dongdong Chen, Junsong Yuan et al.

ECCV 2024 • arXiv:2403.12042 • 17 citations
#1891

PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer

Tongkun Guan, Chengyu Lin, Wei Shen et al.

ECCV 2024 • arXiv:2407.07764 • 17 citations
#1892

HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud

WENCAN CHENG, Hao Tang, Luc Van Gool et al.

CVPR 2024 (highlight) • arXiv:2404.03159 • 17 citations
#1893

Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation

Junyan Wang, Zhenhong Sun, Stewart Tan et al.

CVPR 2024 • arXiv:2403.05239 • 17 citations
#1894

RGMComm: Return Gap Minimization via Discrete Communications in Multi-Agent Reinforcement Learning

Jingdi Chen, Tian Lan, Carlee Joe-Wong

AAAI 2024 • arXiv:2308.03358 • 17 citations
#1895

PreRoutGNN for Timing Prediction with Order Preserving Partition: Global Circuit Pre-training, Local Delay Learning and Attentional Cell Modeling

Ruizhe Zhong, Junjie Ye, Zhentao Tang et al.

AAAI 2024 • arXiv:2403.00012 • 17 citations
#1896

Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation

Tao Chen, Xiruo Jiang, Gensheng Pei et al.

ECCV 2024 • arXiv:2407.02768 • 17 citations
#1897

Self-Supervised Video Desmoking for Laparoscopic Surgery

Renlong Wu, Zhilu Zhang, Shuohao Zhang et al.

ECCV 2024 • arXiv:2403.11192 • 17 citations
#1898

Keypoint Promptable Re-Identification

Vladimir Somers, Alexandre ALahi, Christophe De Vleeschouwer

ECCV 2024 • arXiv:2407.18112 • 17 citations
#1899

One-stage Prompt-based Continual Learning

Youngeun Kim, YUHANG LI, Priyadarshini Panda

ECCV 2024 • arXiv:2402.16189 • 17 citations
#1900

COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL

Xiyao Wang, Ruijie Zheng, Yanchao Sun et al.

ICLR 2024 • arXiv:2310.07220 • 17 citations
#1901

Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression

Runtian Zhai, Bingbin Liu, Andrej Risteski et al.

ICLR 2024 (spotlight) • arXiv:2306.00788 • 17 citations
#1902

Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors

Haoxuanye Ji, Pengpeng Liang, Erkang Cheng

CVPR 2024 • arXiv:2403.06093 • 17 citations
#1903

Motion-prior Contrast Maximization for Dense Continuous-Time Motion Estimation

Friedhelm Hamann, Ziyun Wang, Ioannis Asmanis et al.

ECCV 2024 • arXiv:2407.10802 • 17 citations
#1904

Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance

Tien Toan Nguyen, Minh Nhat Nhat Vu, Baoru Huang et al.

ECCV 2024 • arXiv:2407.13842 • 17 citations
#1905

Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding

Tatsunori Taniai, Ryo Igarashi, Yuta Suzuki et al.

ICLR 2024 • arXiv:2403.11686 • 17 citations
#1906

Data Valuation and Detections in Federated Learning

Wenqian Li, Shuran Fu, Fengrui Zhang et al.

CVPR 2024 • arXiv:2311.05304 • 17 citations
#1907

Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World

Rujie Wu, Xiaojian Ma, Zhenliang Zhang et al.

ICLR 2024 • arXiv:2310.10207 • 17 citations
#1908

UniM2AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving

Jian Zou, Tianyu Huang, Guanglei Yang et al.

ECCV 2024 • 17 citations
#1909

CoReS: Orchestrating the Dance of Reasoning and Segmentation

Xiaoyi Bao, Siyang Sun, Shuailei Ma et al.

ECCV 2024 • arXiv:2404.05673 • 17 citations
#1910

Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation

Bolin Lai, Fiona Ryan, Wenqi Jia et al.

ECCV 2024 • arXiv:2305.03907 • 17 citations
#1911

M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models

Seunggeun Chi, Hyung-gun Chi, Hengbo Ma et al.

ECCV 2024 • arXiv:2407.14502 • 17 citations
#1912

PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor

Vidit Goel, Elia Peruzzo, Yifan Jiang et al.

CVPR 2024 • 17 citations
#1913

Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision

Hao Dong, Eleni Chatzi, Olga Fink

ECCV 2024 • arXiv:2407.01518 • 17 citations
#1914

Multiagent Multitraversal Multimodal Self-Driving: Open MARS Dataset

Yiming Li, Zhiheng Li, Nuo Chen et al.

CVPR 2024 • arXiv:2406.09383 • 17 citations
#1915

Latent Diffusion Prior Enhanced Deep Unfolding for Snapshot Spectral Compressive Imaging

Zongliang Wu, Ruiying Lu, Ying Fu et al.

ECCV 2024 • arXiv:2311.14280 • 17 citations
#1916

PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts

Zewen Chen, Haina Qin, Juan Wang et al.

ECCV 2024 • arXiv:2403.04993 • 17 citations
#1917

ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference

Ziqian Zeng, Yihuai Hong, Hongliang Dai et al.

AAAI 2024 • arXiv:2312.11882 • 17 citations
#1918

Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment

Brian Gordon, Yonatan Bitton, Yonatan Shafir et al.

ECCV 2024 • arXiv:2312.03766 • 17 citations
#1919

Exploring the Transferability of Visual Prompting for Multimodal Large Language Models

Yichi Zhang, Yinpeng Dong, Siyuan Zhang et al.

CVPR 2024 (highlight) • arXiv:2404.11207 • 17 citations
#1920

De-Diffusion Makes Text a Strong Cross-Modal Interface

Chen Wei, Chenxi Liu, Siyuan Qiao et al.

CVPR 2024 • arXiv:2311.00618 • 17 citations
#1921

Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning

Yibing Wei, Abhinav Gupta, Pedro Morgado

ECCV 2024 • arXiv:2407.15837 • 16 citations
#1922

Mirage: Model-agnostic Graph Distillation for Graph Classification

Mridul Gupta, Sahil Manchanda, HARIPRASAD KODAMANA et al.

ICLR 2024 • arXiv:2310.09486 • 16 citations
#1923

Taming Latent Diffusion Model for Neural Radiance Field Inpainting

Chieh Lin, Changil Kim, Jia-Bin Huang et al.

ECCV 2024 • arXiv:2404.09995 • 16 citations
#1924

CORN: Contact-based Object Representation for Nonprehensile Manipulation of General Unseen Objects

Yoonyoung Cho, Junhyek Han, Yoontae Cho et al.

ICLR 2024 • arXiv:2403.10760 • 16 citations
#1925

CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis

Xiaoxiao Sun, Xingjian Leng, Zijian Wang et al.

ICLR 2024 • arXiv:2310.04414 • 16 citations
#1926

AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation

Jeongsoo Choi, Se Jin Park, Minsu Kim et al.

CVPR 2024 (highlight) • arXiv:2312.02512 • 16 citations
#1927

Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective

Ming Zhong, Chenxin An, Weizhu Chen et al.

ICLR 2024 • arXiv:2310.11451 • 16 citations
#1928

PARE-Net: Position-Aware Rotation-Equivariant Networks for Robust Point Cloud Registration

Runzhao Yao, Shaoyi Du, Wenting Cui et al.

ECCV 2024 • arXiv:2407.10142 • 16 citations
#1929

Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation

Yixiao Wang, Chen Tang, Lingfeng Sun et al.

ECCV 2024 • arXiv:2408.00766 • 16 citations
#1930

Grounded Object-Centric Learning

Avinash Kori, Francesco Locatello, Fabio De Sousa Ribeiro et al.

ICLR 2024 • 16 citations
#1931

VividDreamer: Invariant Score Distillation for Hyper-Realistic Text-to-3D Generation

Wenjie Zhuo, Fan Ma, Hehe Fan et al.

ECCV 2024 • arXiv:2407.09822 • 16 citations
#1932

SURER: Structure-Adaptive Unified Graph Neural Network for Multi-View Clustering

Jing Wang, Songhe Feng, Gengyu Lyu et al.

AAAI 2024 • 16 citations
#1933

FRIH: Fine-Grained Region-Aware Image Harmonization

Jinlong Peng, Zekun Luo, Liang Liu et al.

AAAI 2024 • arXiv:2205.06448 • 16 citations
#1934

SuperGaussian: Repurposing Video Models for 3D Super Resolution

Yuan Shen, Duygu Ceylan, Paul Guerrero et al.

ECCV 2024 • arXiv:2406.00609 • 16 citations
#1935

Controllable Navigation Instruction Generation with Chain of Thought Prompting

Xianghao Kong, Jinyu Chen, Wenguan Wang et al.

ECCV 2024 • arXiv:2407.07433 • 16 citations
#1936

Combating Data Imbalances in Federated Semi-supervised Learning with Dual Regulators

Sikai Bai, Shuaicheng Li, Weiming Zhuang et al.

AAAI 2024 • arXiv:2307.05358 • 16 citations
#1937

Review-Enhanced Hierarchical Contrastive Learning for Recommendation

Ke Wang, Yanmin Zhu, Tianzi Zang et al.

AAAI 2024 • 16 citations
#1938

Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment

Alireza Ganjdanesh, Shangqian Gao, Heng Huang

CVPR 2024 • arXiv:2403.19490 • 16 citations
#1939

Learning Hierarchical Image Segmentation For Recognition and By Recognition

Tsung-Wei Ke, Sangwoo Mo, Stella Yu

ICLR 2024 (spotlight) • arXiv:2210.00314 • 16 citations
#1940

Inverse Rendering of Glossy Objects via the Neural Plenoptic Function and Radiance Fields

Haoyuan Wang, Wenbo Hu, Lei Zhu et al.

CVPR 2024 • arXiv:2403.16224 • 16 citations
#1941

Day-Night Cross-domain Vehicle Re-identification

Hongchao Li, Jingong Chen, AIHUA ZHENG et al.

CVPR 2024 • 16 citations
#1942

Self-Supervised Multi-Modal Knowledge Graph Contrastive Hashing for Cross-Modal Search

Meiyu Liang, Junping Du, Zhengyang Liang et al.

AAAI 2024 • 16 citations
#1943

Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables

Haisong Gong, Weizhi Xu, Shu Wu et al.

AAAI 2024 • arXiv:2402.13028 • 16 citations
#1944

KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling

Yu Wang, Xin Li, Shengzhao Wen et al.

CVPR 2024 • arXiv:2211.08071 • 16 citations
#1945

Align Before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition

Yifei Chen, Dapeng Chen, Ruijin Liu et al.

CVPR 2024 • arXiv:2311.15619 • 16 citations
#1946

A Comprehensive Augmentation Framework for Anomaly Detection

Lin Jiang, Yaping Yan

AAAI 2024 • arXiv:2308.15068 • 16 citations
#1947

Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis

Qian Chen, Shihao Shu, Xiangzhi Bai

ECCV 2024 • arXiv:2409.08042 • 16 citations
#1948

Beyond MOT: Semantic Multi-Object Tracking

Yunhao Li, Qin Li, Hao Wang et al.

ECCV 2024 • arXiv:2403.05021 • 16 citations
#1949

Quadratic models for understanding catapult dynamics of neural networks

Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan et al.

ICLR 2024 • arXiv:2205.11787 • 16 citations
#1950

Learning Encodings for Constructive Neural Combinatorial Optimization Needs to Regret

Rui Sun, Zhi Zheng, Zhenkun Wang

AAAI 2024 • 16 citations
#1951

Comparing the Robustness of Modern No-Reference Image- and Video-Quality Metrics to Adversarial Attacks

Anastasia Antsiferova, Khaled Abud, Aleksandr Gushchin et al.

AAAI 2024 • arXiv:2310.06958 • 16 citations
#1952

IS-DARTS: Stabilizing DARTS through Precise Measurement on Candidate Importance

Hongyi He, Longjun Liu, Haonan Zhang et al.

AAAI 2024 • arXiv:2312.12648 • 16 citations
#1953

Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses

Inhee Lee, Byungjun Kim, Hanbyul Joo

CVPR 2024 • arXiv:2404.14410 • 16 citations
#1954

Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer

Junyi Wu, Bin Duan, Weitai Kang et al.

CVPR 2024 • arXiv:2403.14552 • 16 citations
#1955

Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models

Yu-Chu Yu, Chi-Pin Huang, Jr-Jen Chen et al.

ECCV 2024 • arXiv:2403.09296 • 16 citations
#1956

The Hard Positive Truth about Vision-Language Compositionality

Amita Kamath, Cheng-Yu Hsieh, Kai-Wei Chang et al.

ECCV 2024 • arXiv:2409.17958 • 16 citations
#1957

DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery

Yixuan Zhu, Ao Li, Yansong Tang et al.

CVPR 2024 • arXiv:2404.01424 • 16 citations
#1958

Progressive Poisoned Data Isolation for Training-Time Backdoor Defense

Yiming Chen, Haiwei Wu, Jiantao Zhou

AAAI 2024 • arXiv:2312.12724 • 16 citations
#1959

LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model

Dongkai Wang, shiyu xuan, Shiliang Zhang

CVPR 2024 (highlight) • arXiv:2406.04659 • 16 citations
#1960

Stable Unlearnable Example: Enhancing the Robustness of Unlearnable Examples via Stable Error-Minimizing Noise

Yixin Liu, Kaidi Xu, Xun Chen et al.

AAAI 2024 • arXiv:2311.13091 • 16 citations
#1961

Learning to Optimize Permutation Flow Shop Scheduling via Graph-Based Imitation Learning

Longkang Li, Siyuan Liang, Zihao Zhu et al.

AAAI 2024 • arXiv:2210.17178 • 16 citations
#1962

CADTalk: An Algorithm and Benchmark for Semantic Commenting of CAD Programs

Haocheng Yuan, Jing Xu, Hao Pan et al.

CVPR 2024 (highlight) • arXiv:2311.16703 • 16 citations
#1963

Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception

Lei Fan, Mingfu Liang, Yunxuan Li et al.

CVPR 2024 • arXiv:2311.13793 • 16 citations
#1964

TimeLens-XL: Real-time Event-based Video Frame Interpolation with Large Motion

Shi Guo, Yutian Chen, Tianfan Xue et al.

ECCV 2024 • 16 citations
#1965

ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open Vocabulary Object Detection

Joonhyun Jeong, Geondo Park, Jayeon Yoo et al.

AAAI 2024 • arXiv:2312.07266 • 16 citations
#1966

Lazy Diffusion Transformer for Interactive Image Editing

Yotam Nitzan, Zongze Wu, Richard Zhang et al.

ECCV 2024 • arXiv:2404.12382 • 16 citations
#1967

Joint Demosaicing and Denoising for Spike Camera

Yanchen Dong, Ruiqin Xiong, Jing Zhao et al.

AAAI 2024 • 16 citations
#1968

Transformer-Based Selective Super-resolution for Efficient Image Refinement

Tianyi Zhang, Kishore Kasichainula, Yaoxin Zhuo et al.

AAAI 2024 • arXiv:2312.05803 • 16 citations
#1969

Tackling Structural Hallucination in Image Translation with Local Diffusion

Seunghoi Kim, Chen Jin, Tom Diethe et al.

ECCV 2024 • arXiv:2404.05980 • 16 citations
#1970

Iterated Learning Improves Compositionality in Large Vision-Language Models

Chenhao Zheng, Jieyu Zhang, Aniruddha Kembhavi et al.

CVPR 2024 • arXiv:2404.02145 • 16 citations
#1971

Image Demoireing in RAW and sRGB Domains

Shuning Xu, Binbin Song, Xiangyu Chen et al.

ECCV 2024 • arXiv:2312.09063 • 16 citations
#1972

Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders

Alexandre Eymaël, Renaud Vandeghen, Anthony Cioppa et al.

ECCV 2024 • arXiv:2403.17823 • 16 citations
#1973

Revealing the Proximate Long-Tail Distribution in Compositional Zero-Shot Learning

Chenyi Jiang, Haofeng Zhang

AAAI 2024 • arXiv:2312.15923 • 16 citations
#1974

SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS

Yameng Peng, Andy Song, Haytham Fayek et al.

ICLR 2024 (spotlight) • arXiv:2403.04161 • 16 citations
#1975

Auto-GAS: Automated Proxy Discovery for Training-free Generative Architecture Search

Lujun Li, Haosen SUN, Shiwen Li et al.

ECCV 2024 • 16 citations
#1976

An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding

Wei Chen, Long Chen, Yu Wu

ECCV 2024 • arXiv:2408.01120 • 16 citations
#1977

Efficient Meshflow and Optical Flow Estimation from Event Cameras

Xinglong Luo, Ao Luo, Zhengning Wang et al.

CVPR 2024 • 16 citations
#1978

Context Diffusion: In-Context Aware Image Generation

Ivona Najdenkoska, Animesh Sinha, Abhimanyu Dubey et al.

ECCV 2024 • arXiv:2312.03584 • 16 citations
#1979

Semi-supervised Active Learning for Video Action Detection

Ayush Singh, Aayush J Rana, Akash Kumar et al.

AAAI 2024 • arXiv:2312.07169 • 16 citations
#1980

DiffusionNAG: Predictor-guided Neural Architecture Generation with Diffusion Models

Sohyun An, Hayeon Lee, Jaehyeong Jo et al.

ICLR 2024 • arXiv:2305.16943 • 16 citations
#1981

R-MAE: Regions Meet Masked Autoencoders

Duy-Kien Nguyen, Yanghao Li, Vaibhav Aggarwal et al.

ICLR 2024 • arXiv:2306.05411 • 16 citations
#1982

AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking

Yuheng Li, Tianyu Luan, Yizhou Wu et al.

ECCV 2024 • arXiv:2407.06468 • 16 citations
#1983

TP2O: Creative Text Pair-to-Object Generation using Balance Swap-Sampling

Jun Li, Zedong Zhang, Jian Yang

ECCV 2024 • arXiv:2310.01819 • 16 citations
#1984

Three Heads Are Better than One: Complementary Experts for Long-Tailed Semi-supervised Learning

Chengcheng Ma, Ismail Elezi, Jiankang Deng et al.

AAAI 2024 • arXiv:2312.15702 • 16 citations
#1985

Weakly Supervised Open-Vocabulary Object Detection

Jianghang Lin, Yunhang Shen, Bingquan Wang et al.

AAAI 2024 • arXiv:2312.12437 • 16 citations
#1986

C^2RV: Cross-Regional and Cross-View Learning for Sparse-View CBCT Reconstruction

Yiqun Lin, Jiewen Yang, hualiang wang et al.

CVPR 2024 • arXiv:2406.03902 • 16 citations
#1987

SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild

Andreas Engelhardt, Amit Raj, Mark Boss et al.

CVPR 2024 • arXiv:2401.10171 • 16 citations
#1988

Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes

Gaurav Shrivastava, Abhinav Shrivastava

CVPR 2024 • 16 citations
#1989

Breaking Physical and Linguistic Borders: Multilingual Federated Prompt Tuning for Low-Resource Languages

Wanru Zhao, Yihong Chen, Royson Lee et al.

ICLR 2024 • arXiv:2507.03003 • 16 citations
#1990

PatchRefiner: Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation

Zhenyu Li, Shariq Farooq Bhat, Peter Wonka

ECCV 2024 • arXiv:2406.06679 • 16 citations
#1991

CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding

eslam Abdelrahman, Mohamed Ayman Mohamed, Mahmoud Ahmed et al.

ICLR 2024 • arXiv:2310.06214 • 16 citations
#1992

Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models

Matthew Kowal, Richard P. Wildes, Kosta Derpanis

CVPR 2024 (highlight) • arXiv:2404.02233 • 16 citations
#1993

Object Pose Estimation via the Aggregation of Diffusion Features

Tianfu Wang, Guosheng Hu, Hongguang Wang

CVPR 2024 (highlight) • arXiv:2403.18791 • 16 citations
#1994

Interactive3D: Create What You Want by Interactive 3D Generation

Shaocong Dong, Lihe Ding, Zhanpeng Huang et al.

CVPR 2024 • arXiv:2404.16510 • 16 citations
#1995

Commonsense Prototype for Outdoor Unsupervised 3D Object Detection

Hai Wu, Shijia Zhao, Xun Huang et al.

CVPR 2024 • arXiv:2404.16493 • 16 citations
#1996

Frozen Feature Augmentation for Few-Shot Image Classification

Andreas Bär, Neil Houlsby, Mostafa Dehghani et al.

CVPR 2024 • arXiv:2403.10519 • 16 citations
#1997

Programmable Motion Generation for Open-Set Motion Control Tasks

Hanchao Liu, Xiaohang Zhan, Shaoli Huang et al.

CVPR 2024 (highlight) • arXiv:2405.19283 • 16 citations
#1998

Efficiently Assemble Normalization Layers and Regularization for Federated Domain Generalization

Khiem Le, Tuan Long Ho, Cuong Do et al.

CVPR 2024 • arXiv:2403.15605 • 16 citations
#1999

TCLC-GS: Tightly Coupled LiDAR-Camera Gaussian Splatting for Autonomous Driving

Cheng Zhao, su sun, Ruoyu Wang et al.

ECCV 2024 • arXiv:2404.02410 • 16 citations
#2000

Osmosis: RGBD Diffusion Prior for Underwater Image Restoration

Opher Bar Nathan, Deborah Steinberger-Levy, Tali Treibitz et al.

ECCV 2024 • arXiv:2403.14837 • 16 citations