Most Cited 2024 "log-derivative trick" Papers

ECCV 2024 arXiv:2407.11422

#2602

Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models

Jinrui Zhang, Teng Wang, Haigang Zhang et al.

#2603

Efficient Privacy-Preserving Visual Localization Using 3D Ray Clouds

Heejoon Moon, Chunghwan Lee, Je Hyeong Hong

ECCV 2024 arXiv:2403.04908

#2604

Self-Adapting Large Visual-Language Models to Edge Devices across Visual Modalities

Kaiwen Cai, ZheKai Duan, Gaowen Liu et al.

ECCV 2024 arXiv:2312.02202

#2605

Volumetric Rendering with Baked Quadrature Fields

Gopal Sharma, Daniel Rebain, Kwang Moo Yi et al.

ECCV 2024 arXiv:2311.16254

#2606

Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models

Samuele Poppi, Tobia Poppi, Federico Cocchi et al.

ECCV 2024 arXiv:2407.12239

#2607

Motion and Structure from Event-based Normal Flow

Zhongyang Ren, Bangyan Liao, Delei Kong et al.

CVPR 2024 arXiv:2404.02152

#2608

GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image

Chong Bao, Yinda Zhang, Yuan Li et al.

ECCV 2024 arXiv:2410.17839

#2609

Few-shot NeRF by Adaptive Rendering Loss Regularization

Qingshan Xu, Xuanyu Yi, Jianyao Xu et al.

#2610

Zero-Shot Structure-Preserving Diffusion Model for High Dynamic Range Tone Mapping

Ruoxi Zhu, Shusong Xu, Peiye Liu et al.

CVPR 2024 highlight

#2611

Multi-Person Pose Forecasting with Individual Interaction Perceptron and Prior Learning

Peng Xiao, Yi Xie, Xuemiao Xu et al.

ICLR 2024 arXiv:2401.17992

#2612

Multilinear Operator Networks

Yixin Cheng, Grigorios Chrysos, Markos Georgopoulos et al.

ICLR 2024 arXiv:2310.10780

#2613

Demystifying Poisoning Backdoor Attacks from a Statistical Perspective

Ganghua Wang, Xun Xian, Ashish Kundu et al.

ECCV 2024 arXiv:2308.11487

#2614

Free Lunch for Gait Recognition: A Novel Relation Descriptor

Jilong Wang, Saihui Hou, Yan Huang et al.

CVPR 2024 arXiv:2404.06244

#2615

Anchor-based Robust Finetuning of Vision-Language Models

Jinwei Han, Zhiwen Lin, Zhongyisun Sun et al.

ECCV 2024 arXiv:2312.08291

#2616

VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space

Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda et al.

CVPR 2024 highlight arXiv:2312.10144

#2617

Data-Efficient Multimodal Fusion on a Single GPU

Noël Vouitsis, Zhaoyan Liu, Satya Krishna Gorti et al.

AAAI 2024 arXiv:2108.10859

#2618

Cumulative Regret Analysis of the Piyavskii–Shubert Algorithm and Its Variants for Global Optimization

Kaan Gokcesu, Hakan Gökcesu

ECCV 2024 arXiv:2312.13604

#2619

Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos

Keqiang Sun, Dori Litvak, Yunzhi Zhang et al.

AAAI 2024 arXiv:2401.01528

#2620

Improved Bandits in Many-to-One Matching Markets with Incentive Compatibility

Fang Kong, Shuai Li

AAAI 2024 arXiv:2305.13562

#2621

Understanding and Improving Optimization in Predictive Coding Networks

Nicholas Alonso, Jeffrey Krichmar, Emre Neftci

ECCV 2024 arXiv:2407.09352

#2622

Imaging Interiors: An Implicit Solution to Electromagnetic Inverse Scattering Problems

Ziyuan Luo, Boxin Shi, Haoliang Li et al.

ECCV 2024 arXiv:2402.14000

#2623

Real-time 3D-aware Portrait Editing from a Single Image

Qingyan Bai, Zifan Shi, Yinghao Xu et al.

#2624

Exploring the Feature Extraction and Relation Modeling For Light-Weight Transformer Tracking

Jikai Zheng, Mingjiang Liang, Shaoli Huang et al.

ICLR 2024 arXiv:2405.00646

#2625

Learning to Compose: Improving Object Centric Learning by Injecting Compositionality

Whie Jung, Jaehoon Yoo, Sungjin Ahn et al.

AAAI 2024 arXiv:2401.14919

#2626

PARSAC: Accelerating Robust Multi-Model Fitting with Parallel Sample Consensus

Florian Kluger, Bodo Rosenhahn

#2627

Adversarially Robust Distillation by Reducing the Student-Teacher Variance Gap

Junhao Dong, Piotr Koniusz, Junxi Chen et al.

ECCV 2024 arXiv:2312.04763

#2628

Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective

Fangzhou Song, Bin Zhu, Yanbin Hao et al.

#2629

KDProR: A Knowledge-Decoupling Probabilistic Framework for Video-Text Retrieval

Xianwei Zhuang, Hongxiang Li, Xuxin Cheng et al.

ICLR 2024 arXiv:2310.00115

#2630

Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks

Yanqiao Zhu, Jeehyun Hwang, Keir Adams et al.

CVPR 2024 arXiv:2312.14124

#2631

Neural Point Cloud Diffusion for Disentangled 3D Shape and Appearance Generation

Philipp Schröppel, Christopher Wewer, Jan Lenssen et al.

CVPR 2024 arXiv:2404.03477

#2632

Towards Automated Movie Trailer Generation

Dawit Argaw Argaw, Mattia Soldan, Alejandro Pardo et al.

#2633

LTA-PCS: Learnable Task-Agnostic Point Cloud Sampling

Jiaheng Liu, Jianhao Li, Kaisiyuan Wang et al.

#2634

Low-Light Face Super-resolution via Illumination, Structure, and Texture Associated Representation

Chenyang Wang, Junjun Jiang, Kui Jiang et al.

ICLR 2024 spotlight arXiv:2310.12975

#2635

Variational Inference for SDEs Driven by Fractional Noise

Rembert Daems, Manfred Opper, Guillaume Crevecoeur et al.

ECCV 2024 arXiv:2502.05641

#2636

Generating Physically Realistic and Directable Human Motions from Multi-Modal Inputs

Aayam Shrestha, Pan Liu, German Ros et al.

ECCV 2024 arXiv:2409.13037

#2637

DNI: Dilutional Noise Initialization for Diffusion Video Editing

Sunjae Yoon, Gwanhyeong Koo, Ji Woo Hong et al.

AAAI 2024 arXiv:2303.08906

#2638

VVS: Video-to-Video Retrieval with Irrelevant Frame Suppression

Won Jo, Geuntaek Lim, Gwangjin Lee et al.

CVPR 2024 arXiv:2411.02220

#2639

SIRA: Scalable Inter-frame Relation and Association for Radar Perception

Ryoma Yataka, Pu Wang, Petros Boufounos et al.

CVPR 2024 arXiv:2212.02081

#2640

YolOOD: Utilizing Object Detection Concepts for Multi-Label Out-of-Distribution Detection

Alon Zolfi, Guy Amit, Amit Baras et al.

ECCV 2024 arXiv:2407.04345

#2641

CanonicalFusion: Generating Drivable 3D Human Avatars from Multiple Images

Jisu Shin, Junmyeong Lee, Seongmin Lee et al.

#2642

OctOcc: High-Resolution 3D Occupancy Prediction with Octree

Wenzhe Ouyang, Xiaolin Song, Bailan Feng et al.

ECCV 2024 arXiv:2405.02508

#2643

Rasterized Edge Gradients: Handling Discontinuities Differentially

Stanislav Pidhorskyi, Tomas Simon, Gabriel Schwartz et al.

ECCV 2024 arXiv:2312.02878

#2644

Towards More Practical Group Activity Detection: A New Benchmark and Model

Dongkeun Kim, Youngkil Song, Minsu Cho et al.

ICLR 2024 arXiv:2404.07863

#2645

Backdoor Contrastive Learning via Bi-level Trigger Optimization

Weiyu Sun, Xinyu Zhang, Hao LU et al.

#2646

Improving Open Set Recognition via Visual Prompts Distilled from Common-Sense Knowledge

Seong-Tae Kim, Hyungil Kim, Y. Ro

ICML 2024 arXiv:2405.18217

#2647

Understanding Inter-Concept Relationships in Concept-Based Models

Naveen Raman, Mateo Espinosa Zarlenga, Mateja Jamnik

AAAI 2024 arXiv:2309.11236

#2648

Colour Passing Revisited: Lifted Model Construction with Commutative Factors

Malte Luttermann, Tanya Braun, Ralf Möller et al.

#2649

Characteristics Matching Based Hash Codes Generation for Efficient Fine-grained Image Retrieval

Zhen-Duo Chen, Li-Jun Zhao, Zi-Chao Zhang et al.

CVPR 2024 arXiv:2312.00598

#2650

Learning from One Continuous Video Stream

Joao Carreira, Michael King, Viorica Patraucean et al.

ECCV 2024 arXiv:2403.04398

#2651

MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment

Kanglei Zhou, Liyuan Wang, Xingxing Zhang et al.

CVPR 2024 arXiv:2404.01725

#2652

Disentangled Pre-training for Human-Object Interaction Detection

Zhuolong Li, Xingao Li, Changxing Ding et al.

CVPR 2024 arXiv:2403.15019

#2653

BSNet: Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation

Jiahao Lu, Jiacheng Deng, Tianzhu Zhang

CVPR 2024 arXiv:2212.05315

#2654

Mind The Edge: Refining Depth Edges in Sparsely-Supervised Monocular Depth Estimation

Lior Talker, Aviad Cohen, Erez Yosef et al.

ECCV 2024 arXiv:2407.11668

#2655

Learning to Make Keypoints Sub-Pixel Accurate

Shinjeong Kim, Marc Pollefeys, Daniel Barath

CVPR 2024 arXiv:2403.01124

#2656

Text-guided Explorable Image Super-resolution

Kanchana Vaishnavi Gandikota, Paramanand Chandramouli

CVPR 2024 highlight arXiv:2405.10053

#2657

SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection

Mingxuan Liu, Tyler Hayes, Elisa Ricci et al.

CVPR 2024 arXiv:2404.09001

#2658

Smart Help: Strategic Opponent Modeling for Proactive and Adaptive Robot Assistance in Households

Zhihao Cao, ZiDong Wang, Siwen Xie et al.

AAAI 2024 arXiv:2401.01549

#2659

Towards Modeling Uncertainties of Self-explaining Neural Networks via Conformal Prediction

Wei Qian, Chenxu Zhao, Yangyi Li et al.

ECCV 2024 arXiv:2405.19689

#2660

Uncertainty-aware sign language video retrieval with probability distribution modeling

Xuan Wu, Hongxiang Li, Yuanjiang Luo et al.

CVPR 2024 arXiv:2404.16222

#2661

Step Differences in Instructional Video

Tushar Nagarajan, Lorenzo Torresani

AAAI 2024 arXiv:2303.09174

#2662

Grab What You Need: Rethinking Complex Table Structure Recognition with Flexible Components Deliberation

Hao Liu, Xin Li, Mingming Gong et al.

ECCV 2024 arXiv:2407.17365

#2663

ViPer: Visual Personalization of Generative Models via Individual Preference Learning

Sogand Salehi, Mahdi Shafiei, Roman Bachmann et al.

#2664

RG-GAN: Dynamic Regenerative Pruning for Data-Efficient Generative Adversarial Networks

Divya Saxena, Jiannong Cao, Jiahao Xu et al.

ECCV 2024 arXiv:2312.02249

#2665

Recursive Visual Programming

Jiaxin Ge, Sanjay Subramanian, Baifeng Shi et al.

ICLR 2024 spotlight arXiv:2305.11290

#2666

Massively Scalable Inverse Reinforcement Learning in Google Maps

Matt Barnes, Matthew Abueg, Oliver Lange et al.

ECCV 2024 arXiv:2403.14053

#2667

Leveraging Thermal Modality to Enhance Reconstruction in Low-Light Conditions

Jiacong Xu, Mingqian Liao, Ram Prabhakar Kathirvel et al.

ECCV 2024 arXiv:2404.09490

#2668

Leveraging temporal contextualization for video action recognition

Minji Kim, Dongyoon Han, Taekyung Kim et al.

CVPR 2024 arXiv:2403.07359

#2669

FSC: Few-point Shape Completion

Xianzu Wu, Xianfeng Wu, Tianyu Luan et al.

#2670

Domain Shifting: A Generalized Solution for Heterogeneous Cross-Modality Person Re-Identification

Yan Jiang, Xu Cheng, Hao Yu et al.

CVPR 2024 arXiv:2404.05001

#2671

Dual-Scale Transformer for Large-Scale Single-Pixel Imaging

Gang Qu, Ping Wang, Xin Yuan

CVPR 2024 arXiv:2403.19473

#2672

Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM

Tongyan Hua, Addison, Lin Wang

#2673

Symmetric Self-Paced Learning for Domain Generalization

Di Zhao, Yun Sing Koh, Gillian Dobbie et al.

ICLR 2024 arXiv:2405.20439

#2674

Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning

Jacob Springer, Vaishnavh Nagarajan, Aditi Raghunathan

AAAI 2024 arXiv:2403.17374

#2675

Multi-Domain Recommendation to Attract Users via Domain Preference Modeling

Hyunjun Ju, SeongKu Kang, Dongha Lee et al.

AAAI 2024 arXiv:2312.16767

#2676

Adaptive Anytime Multi-Agent Path Finding Using Bandit-Based Large Neighborhood Search

Thomy Phan, Taoan Huang, Bistra Dilkina et al.

AAAI 2024 arXiv:2308.11230

#2677

Towards Optimal Subsidy Bounds for Envy-Freeable Allocations

Yasushi Kawase, Kazuhisa Makino, Hanna Sumita et al.

#2678

Geometry-Guided Domain Generalization for Monocular 3D Object Detection

Fan Yang, Hui Chen, Yuwei He et al.

AAAI 2024 arXiv:2403.08157

#2679

Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks

Fuzhi Wu, Jiasong Wu, Youyong Kong et al.

#2680

How to Use the Metropolis Algorithm for Multi-Objective Optimization?

Weijie Zheng, Mingfeng Li, Renzhong Deng et al.

AAAI 2024 arXiv:2312.12183

#2681

Poincaré Differential Privacy for Hierarchy-Aware Graph Embedding

Yuecen Wei, Haonan Yuan, Xingcheng Fu et al.

AAAI 2024 arXiv:2401.17609

#2682

LaneGraph2Seq: Lane Topology Extraction with Language Model via Vertex-Edge Encoding and Connectivity Enhancement

Renyuan Peng, Xinyue Cai, Hang Xu et al.

CVPR 2024 arXiv:2403.11222

#2683

SpikeNeRF: Learning Neural Radiance Fields from Continuous Spike Stream

Lin Zhu, Kangmin Jia, Yifan Zhao et al.

ECCV 2024 arXiv:2409.05122

#2684

PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation

Ning Gao, Sanping Zhou, Le Wang et al.

AAAI 2024 arXiv:2312.09486

#2685

Unraveling Batch Normalization for Realistic Test-Time Adaptation

Zixian Su, Jingwei Guo, Kai Yao et al.

ECCV 2024 arXiv:2407.13609

#2686

Training-free Composite Scene Generation for Layout-to-Image Synthesis

Jiaqi Liu, Tao Huang, Chang Xu

AAAI 2024 arXiv:2402.08578

#2687

FedLPS: Heterogeneous Federated Learning for Multiple Tasks with Local Parameter Sharing

Yongzhe Jia, Xuyun Zhang, Amin Beheshti et al.

ECCV 2024 arXiv:2407.06516

#2688

VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving

Yibo Liu, Zheyuan Yang, Guile Wu et al.

AAAI 2024 arXiv:2312.15970

#2689

Learning Deformable Hypothesis Sampling for Accurate PatchMatch Multi-View Stereo

Hongjie Li, Yao Guo, Xianwei Zheng et al.

AAAI 2024 arXiv:2312.15291

#2690

Reverse Multi-Choice Dialogue Commonsense Inference with Graph-of-Thought

Li Zheng, Hao Fei, Fei Li et al.

AAAI 2024 arXiv:2312.14518

#2691

Joint Learning Neuronal Skeleton and Brain Circuit Topology with Permutation Invariant Encoders for Neuron Classification

Minghui Liao, Guojia Wan, Bo Du

AAAI 2024 arXiv:2401.16193

#2692

Contributing Dimension Structure of Deep Feature for Coreset Selection

Zhijing Wan, Zhixiang Wang, Yuran Wang et al.

CVPR 2024 arXiv:2404.00974

#2693

Improving Visual Recognition with Hyperbolical Visual Hierarchy Mapping

Hyeongjun Kwon, Jinhyun Jang, Jin Kim et al.

ECCV 2024 arXiv:2311.17050

#2694

Surf-D: Generating High-Quality Surfaces of Arbitrary Topologies Using Diffusion Models

Zhengming Yu, Zhiyang Dou, Xiaoxiao Long et al.

AAAI 2024 arXiv:2312.08187

#2695

Completing Priceable Committees: Utilitarian and Representation Guarantees for Proportional Multiwinner Voting

Markus Brill, Jannik Peters

#2696

Improved MLP Point Cloud Processing with High-Dimensional Positional Encoding

Yanmei Zou, Hongshan Yu, Zhengeng Yang et al.

AAAI 2024 arXiv:2303.17594

#2697

MobileInst: Video Instance Segmentation on the Mobile

Renhong Zhang, Tianheng Cheng, Shusheng Yang et al.

#2698

Cross-Class Feature Augmentation for Class Incremental Learning

Taehoon Kim, JaeYoo Park, Bohyung Han

AAAI 2024 arXiv:2306.15272

#2699

Delivering Inflated Explanations

Yacine Izza, Alexey Ignatiev, Peter Stuckey et al.

AAAI 2024 arXiv:2312.14472

#2700

Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement Learning with Dynamic Depth Routing

Jinmin He, Kai Li, Yifan Zang et al.

CVPR 2024 arXiv:2312.09250

#2701

Single Mesh Diffusion Models with Field Latents for Texture Generation

Thomas W. Mitchel, Carlos Esteves, Ameesh Makadia

CVPR 2024 arXiv:2403.01414

#2702

Unsigned Orthogonal Distance Fields: An Accurate Neural Implicit Representation for Diverse 3D Shapes

YuJie Lu, Long Wan, Nayu Ding et al.

CVPR 2024 arXiv:2405.05502

#2703

Towards Accurate and Robust Architectures via Neural Architecture Search

Yuwei Ou, Yuqi Feng, Yanan Sun

ECCV 2024 arXiv:2407.18899

#2704

Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence

Mengyao Lyu, Tianxiang Hao, Xinhao Xu et al.

AAAI 2024 arXiv:2312.11936

#2705

Exact ASP Counting with Compact Encodings

Mohimenul Kabir, Supratik Chakraborty, Kuldeep S Meel

AAAI 2024 arXiv:2401.07212

#2706

HiHPQ: Hierarchical Hyperbolic Product Quantization for Unsupervised Image Retrieval

Zexuan Qiu, Jiahong Liu, Yankai Chen et al.

AAAI 2024 arXiv:2403.05117

#2707

Arbitrary-Scale Point Cloud Upsampling by Voxel-Based Network with Latent Geometric-Consistent Learning

Hang Du, Xuejun Yan, Jingjing Wang et al.

AAAI 2024 arXiv:2312.16451

#2708

Domain Generalization with Vital Phase Augmentation

Ingyun Lee, WooJu Lee, Hyun Myung

ICLR 2024 arXiv:2306.03301

#2709

Estimating Conditional Mutual Information for Dynamic Feature Selection

Soham Gadgil, Ian Covert, Su-In Lee

#2710

Parsing All Adverse Scenes: Severity-Aware Semantic Segmentation with Mask-Enhanced Cross-Domain Consistency

Fuhao Li, Ziyang Gong, Yupeng Deng et al.

ICLR 2024 arXiv:2302.05326

#2711

Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks

Khurram Javed, Haseeb Shah, Richard Sutton et al.

#2712

PTMQ: Post-training Multi-Bit Quantization of Neural Networks

Ke Xu, Zhongcheng Li, Shanshan Wang et al.

ECCV 2024 arXiv:2404.09991

#2713

EgoPet: Egomotion and Interaction Data from an Animal's Perspective

Amir Bar, Arya Bakhtiar, Danny L Tran et al.

ECCV 2024 arXiv:2407.09378

#2714

Graph Neural Network Causal Explanation via Neural Causal Models

Arman Behnam, Binghui Wang

ICLR 2024 arXiv:2401.08819

#2715

Learning from Sparse Offline Datasets via Conservative Density Estimation

Zhepeng Cen, Zuxin Liu, Zitong Wang et al.

AAAI 2024 arXiv:2312.09716

#2716

Let All Be Whitened: Multi-Teacher Distillation for Efficient Visual Retrieval

Zhe Ma, Jianfeng Dong, Shouling Ji et al.

AAAI 2024 arXiv:2401.06443

#2717

BOK-VQA: Bilingual outside Knowledge-Based Visual Question Answering via Graph Representation Pretraining

Minjun Kim, SeungWoo Song, Youhan Lee et al.

AAAI 2024 arXiv:2312.05551

#2718

Multi-Dimensional Fair Federated Learning

Cong Su, Guoxian Yu, Jun Wang et al.

#2719

SaCo Loss: Sample-wise Affinity Consistency for Vision-Language Pre-training

WU Sitong, Haoru Tan, Zhuotao Tian et al.

CVPR 2024 arXiv:2403.01619

#2720

Spectrum AUC Difference (SAUCD): Human-aligned 3D Shape Evaluation

Tianyu Luan, Zhong Li, Lele Chen et al.

AAAI 2024 arXiv:2312.08009

#2721

Semi-supervised Class-Agnostic Motion Prediction with Pseudo Label Regeneration and BEVMix

Kewei Wang, Yizheng Wu, Zhiyu Pan et al.

#2722

Continuous Optical Zooming: A Benchmark for Arbitrary-Scale Image Super-Resolution in Real World

Huiyuan Fu, Fei Peng, Xianwei Li et al.

#2723

Cross-Modal Match for Language Conditioned 3D Object Grounding

Yachao Zhang, Runze Hu, Ronghui Li et al.

AAAI 2024 arXiv:2402.03561

#2724

VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation

Jialu Li, Aishwarya Padmakumar, Gaurav Sukhatme et al.

ECCV 2024 arXiv:2403.16394

#2725

Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation

Yingshan Chang, Yasi Zhang, Zhiyuan Fang et al.

ECCV 2024 arXiv:2407.10876

#2726

RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception

Shen Jianbing, Chunliang Li, Wencheng Han et al.

#2727

Multi-View Dynamic Reflection Prior for Video Glass Surface Detection

Fang Liu, Yuhao Liu, Jiaying Lin et al.

ECCV 2024 arXiv:2407.16193

#2728

CloudFixer: Test-Time Adaptation for 3D Point Clouds via Diffusion-Guided Geometric Transformation

Hajin Shim, Changhun Kim, Eunho Yang

ECCV 2024 arXiv:2403.16528

#2729

Open-Set Recognition in the Age of Vision-Language Models

Dimity Miller, Niko Suenderhauf, Alex Kenna et al.

ECCV 2024 arXiv:2407.10084

#2730

Part2Object: Hierarchical Unsupervised 3D Instance Segmentation

Cheng Shi, Yulin Zhang, Bin Yang et al.

ECCV 2024 arXiv:2408.02231

#2731

REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models

Agneet Chatterjee, Yiran Luo, Tejas Gokhale et al.

#2732

A Theory of Joint Light and Heat Transport for Lambertian Scenes

Mani Ramanagopal, Sriram Narayanan, Aswin C. Sankaranarayanan et al.

ICLR 2024 arXiv:2310.03957

#2733

Understanding prompt engineering may not require rethinking generalization

Victor Akinwande, Yiding Jiang, Dylan Sam et al.

ECCV 2024 arXiv:2407.13851

#2734

X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs

Swetha Sirnam, Jinyu Yang, Tal Neiman et al.

CVPR 2024 arXiv:2403.13351

#2735

OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning

Geng Xinyu, Jiaming Wang, Jiawei Gong et al.

ICLR 2024 arXiv:2305.19838

#2736

Relaxing the Additivity Constraints in Decentralized No-Regret High-Dimensional Bayesian Optimization

Anthony Bardou, Patrick Thiran, Thomas Begin

#2737

Identification of Necessary Semantic Undertakers in the Causal View for Image-Text Matching

Huatian Zhang, Lei Zhang, Kun Zhang et al.

ECCV 2024 arXiv:2408.12352

#2738

GarmentAligner: Text-to-Garment Generation via Retrieval-augmented Multi-level Corrections

Shiyue Zhang, Zheng Chong, Xujie Zhang et al.

ECCV 2024 arXiv:2405.10690

#2739

CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing

Faegheh Sardari, Armin Mustafa, Philip JB Jackson et al.

#2740

CatFormer: Category-Level 6D Object Pose Estimation with Transformer

Sheng Yu, Dihua Zhai, Yuanqing Xia

ECCV 2024 arXiv:2403.06443

#2741

Temporal-Mapping Photography for Event Cameras

Yuhan Bao, Lei Sun, Yuqin Ma et al.

ICLR 2024 arXiv:2402.13241

#2742

Federated Causal Discovery from Heterogeneous Data

Loka Li, Ignavier Ng, Gongxu Luo et al.

CVPR 2024 arXiv:2406.04032

#2743

Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis

Marianna Ohanyan, Hayk Manukyan, Zhangyang Wang et al.

#2744

Self-Training Based Few-Shot Node Classification by Knowledge Distillation

Zongqian Wu, Yujie Mo, Peng Zhou et al.

ECCV 2024 arXiv:2403.16167

#2745

Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models

Minchan Kim, Minyeong Kim, Junik Bae et al.

AAAI 2024 arXiv:2307.05892

#2746

SC-NeuS: Consistent Neural Surface Reconstruction from Sparse and Noisy Views

Shi-Sheng Huang, Zixin Zou, Yichi Zhang et al.

CVPR 2024 highlight arXiv:2403.13171

#2747

LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images

Jing Zhang, Irving Fang, Hao Wu et al.

CVPR 2024 arXiv:2405.14855

#2748

Synergistic Global-space Camera and Human Reconstruction from Videos

Yizhou Zhao, Tuanfeng Y. Wang, Bhiksha Raj et al.

CVPR 2024 arXiv:2311.16682

#2749

ContextSeg: Sketch Semantic Segmentation by Querying the Context with Attention

Jiawei Wang, Changjian Li

#2750

Distributionally Robust Loss for Long-Tailed Multi-Label Image Classification

Dekun Lin, Zhe Cui, Rui Chen et al.

ECCV 2024 arXiv:2407.12676

#2751

CoSIGN: Few-Step Guidance of ConSIstency Model to Solve General INverse Problems

Jiankun Zhao, Bowen Song, Liyue Shen

AAAI 2024 arXiv:2401.04348

#2752

LAMPAT: Low-Rank Adaption for Multilingual Paraphrasing Using Adversarial Training

Khoi M. Le, Trinh Pham, Tho Quan et al.

CVPR 2024 arXiv:2309.16421

#2753

Distilling ODE Solvers of Diffusion Models into Smaller Steps

Sanghwan Kim, Hao Tang, Fisher Yu

CVPR 2024 arXiv:2403.01781

#2754

Integrating Efficient Optimal Transport and Functional Maps For Unsupervised Shape Correspondence Learning

Tung Le, Khai Nguyen, Shanlin Sun et al.

ECCV 2024 arXiv:2409.20557

#2755

Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos

Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang et al.

AAAI 2024 arXiv:2310.09603

#2756

B-spine: Learning B-spline Curve Representation for Robust and Interpretable Spinal Curvature Estimation

Hao Wang, Qiang Song, Ruofeng Yin et al.

#2757

Hiding Imperceptible Noise in Curvature-Aware Patches for 3D Point Cloud Attack

Mingyu Yang, Daizong Liu, Keke Tang et al.

ECCV 2024 arXiv:2408.12316

#2758

Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement

Lingyu Zhu, Wenhan Yang, Baoliang Chen et al.

AAAI 2024 arXiv:2309.02923

#2759

Patched Line Segment Learning for Vector Road Mapping

Jiakun Xu, Bowen Xu, Gui-Song Xia et al.

ICLR 2024 arXiv:2307.06175

#2760

Learning Decentralized Partially Observable Mean Field Control for Artificial Collective Behavior

Kai Cui, Sascha Hauck, Christian Fabian et al.

ECCV 2024 arXiv:2403.13808

#2761

On Pretraining Data Diversity for Self-Supervised Learning

Hasan Abed El Kader Hammoud, Tuhin Das, Fabio Pizzati et al.

ECCV 2024 arXiv:2409.17316

#2762

Bi-TTA: Bidirectional Test-Time Adapter for Remote Physiological Measurement

Haodong LI, Hao LU, Yingcong Chen

ECCV 2024 arXiv:2407.11494

#2763

Learning Semantic Latent Directions for Accurate and Controllable Human Motion Prediction

Guowei Xu, Jiale Tao, Wen Li et al.

CVPR 2024 arXiv:2404.19696

#2764

Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners

Chun Feng, Joy Hsu, Weiyu Liu et al.

ECCV 2024 arXiv:2408.13752

#2765

Localization and Expansion: A Decoupled Framework for Point Cloud Few-shot Semantic Segmentation

Zhaoyang Li, Yuan Wang, Wangkai Li et al.

ECCV 2024 arXiv:2408.13459

#2766

Rethinking Video Deblurring with Wavelet-Aware Dynamic Transformer and Diffusion Model

Chen Rao, Guangyuan Li, Zehua Lan et al.

ECCV 2024 arXiv:2305.15798

#2767

BK-SDM: A Lightweight, Fast, and Cheap Version of Stable Diffusion

Bo-Kyeong Kim, Hyoung-Kyu Song, Thibault Castells et al.

CVPR 2024 arXiv:2311.17951

#2768

C3Net: Compound Conditioned ControlNet for Multimodal Content Generation

Juntao Zhang, Yuehuai LIU, Yu-Wing Tai et al.

CVPR 2024 arXiv:2401.06146

#2769

AAMDM: Accelerated Auto-regressive Motion Diffusion Model

Tianyu Li, Calvin Zhuhan Qiao, Ren Guanqiao et al.

ECCV 2024 arXiv:2305.03716

#2770

3D Small Object Detection with Dynamic Spatial Pruning

Xiuwei Xu, Zhihao Sun, Ziwei Wang et al.

ECCV 2024 arXiv:2312.07315

#2771

NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image

Yoonwoo Jeong, Jinwoo Lee, Chiheon Kim et al.

ECCV 2024 arXiv:2407.07324

#2772

Event-Aided Time-To-Collision Estimation for Autonomous Driving

Jinghang Li, Bangyan Liao, Xiuyuan LU et al.

ECCV 2024 arXiv:2404.06493

#2773

Flying with Photons: Rendering Novel Views of Propagating Light

Anagh Malik, Noah Juravsky, Ryan Po et al.

ICLR 2024 arXiv:2403.09274

#2774

EventRPG: Event Data Augmentation with Relevance Propagation Guidance

Mingyuan Sun, Donghao Zhang, Zongyuan Ge et al.

ECCV 2024 arXiv:2407.05352

#2775

Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model

Danni Yang, Ruohan Dong, Jiayi Ji et al.

ECCV 2024 arXiv:2407.09826

#2776

3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance

Xiaoxu Xu, Yitian Yuan, Jinlong Li et al.

#2777

Motion Diversification Networks

Hee Jae Kim, Eshed Ohn-Bar

CVPR 2024 highlight arXiv:2401.15261

#2778

Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes

Diandian Guo, Deng-Ping Fan, Tongyu Lu et al.

AAAI 2024 arXiv:2402.19119

#2779

VIXEN: Visual Text Comparison Network for Image Difference Captioning

Alexander Black, Jing Shi, Yifei Fan et al.

AAAI 2024 arXiv:2306.12681

#2780

One at a Time: Progressive Multi-Step Volumetric Probability Learning for Reliable 3D Scene Perception

Bohan Li, Yasheng Sun, Jingxin Dong et al.

CVPR 2024 arXiv:2403.19501

#2781

RELI11D: A Comprehensive Multimodal Human Motion Dataset and Method

Ming Yan, Yan Zhang, Shuqiang Cai et al.

ECCV 2024 arXiv:2311.13777

#2782

GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence

Pengyuan Wang, Takuya Ikeda, Robert Lee et al.

CVPR 2024 arXiv:2404.16123

#2783

FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication

Eric Slyman, Stefan Lee, Scott Cohen et al.

ECCV 2024 arXiv:2409.07239

#2784

PiTe: Pixel-Temporal Alignment for Large Video-Language Model

Yang Liu, Pengxiang Ding, Siteng Huang et al.

#2785

Brain Netflix: Scaling Data to Reconstruct Videos from Brain Signals

Camilo Fosco, Benjamin Lahner, Bowen Pan et al.

#2786

REGLO: Provable Neural Network Repair for Global Robustness Properties

Feisi Fu, Zhilu Wang, Weichao Zhou et al.

CVPR 2024 arXiv:2307.04760

#2787

Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos

Sagnik Majumder, Ziad Al-Halah, Kristen Grauman

#2788

Exploring Vulnerabilities in Spiking Neural Networks: Direct Adversarial Attacks on Raw Event Data

Yanmeng Yao, Xiaohan Zhao, Bin Gu

ECCV 2024 arXiv:2403.09468

#2789

Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing

Wonjun Kang, Kevin Galim, Hyung Il Koo

ICLR 2024 spotlight arXiv:2310.10434

#2790

Equivariant Matrix Function Neural Networks

Ilyes Batatia, Lars Leon Schaaf, Gábor Csányi et al.

#2791

TurboSL: Dense Accurate and Fast 3D by Neural Inverse Structured Light

Parsa Mirdehghan, Maxx Wu, Wenzheng Chen et al.

AAAI 2024 arXiv:2401.12497

#2792

Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning

Zizhao Wang, Caroline Wang, Xuesu Xiao et al.

CVPR 2024 arXiv:2002.07756

#2793

Hierarchical Correlation Clustering and Tree Preserving Embedding

Morteza Haghir Chehreghani, Mostafa Haghir Chehreghani

AAAI 2024 arXiv:2406.08799

#2794

Pareto Front-Diverse Batch Multi-Objective Bayesian Optimization

Alaleh Ahmadianshalchi, Syrine Belakaria, Janardhan Rao Doppa

ECCV 2024 arXiv:2311.15908

#2795

Enhancing Perceptual Quality in Video Super-Resolution through Temporally-Consistent Detail Synthesis using Diffusion Models

Claudio Rota, Marco Buzzelli, Joost Van de Weijer

#2796

Epitopological learning and Cannistraci-Hebb network shape intelligence brain-inspired theory for ultra-sparse advantage in deep learning

Yingtao Zhang, Jialin Zhao, Wenjing Wu et al.

ICLR 2024

ICLR 2024 arXiv:2401.10556

#2797

Symbol as Points: Panoptic Symbol Spotting via Point-based Representation

Wenlong Liu, Tianyu Yang, Yuhan Wang et al.

ECCV 2024 arXiv:2407.05008

#2798

T-CorresNet: Template Guided 3D Point Cloud Completion with Correspondence Pooling Query Generation Strategy

Fan Duan, Jiahao Yu, Li Chen

#2799

Enhancing Cross-Subject fMRI-to-Video Decoding with Global-Local Functional Alignment

Chong Li, Xuelin Qian, Yun Wang et al.

CVPR 2024 arXiv:2405.02608

#2800

UnSAMFlow: Unsupervised Optical Flow Guided by Segment Anything Model

Shuai Yuan, Lei Luo, Zhuo Hui et al.