Most Cited ECCV Papers by Hongsheng Li

14 papers found

#1

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Renrui Zhang, Dongzhi Jiang, Yichi Zhang et al.

ECCV 2024, arXiv:2403.14624
498 citations
#2

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Linjiang Huang, Rongyao Fang, Aiping Zhang et al.

ECCV 2024, arXiv:2403.12963
51 citations
#3

Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models

Xiaoshi Wu, Yiming Hao, Manyuan Zhang et al.

ECCV 2024, arXiv:2405.00760
46 citations
#4

Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation

Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang et al.

ECCV 2024, arXiv:2403.13745
30 citations
#5

GiT: Towards Generalist Vision Transformer through Universal Language Interface

Haiyang Wang, Hao Tang, Li Jiang et al.

ECCV 2024, arXiv:2403.09394
23 citations
#6

Any2Point: Empowering Any-modality Transformers for Efficient 3D Understanding

Yiwen Tang, Renrui Zhang, Jiaming Liu et al.

ECCV 2024
19 citations
#7

Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos

Keqiang Sun, Dori Litvak, Yunzhi Zhang et al.

ECCV 2024, arXiv:2312.13604
10 citations
#8

ZoLA: Zero-Shot Creative Long Animation Generation with Short Video Model

Fu-Yun Wang, Zhaoyang Huang, Qiang Ma et al.

ECCV 2024
9 citations
#9

BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation using RGB Frames and Events

Yijin Li, Yichen Shen, Zhaoyang Huang et al.

ECCV 2024, arXiv:2410.20451
8 citations
#10

Unmasking Bias in Diffusion Model Training

Hu Yu, Li Shen, Jie Huang et al.

ECCV 2024, arXiv:2310.08442
7 citations
#11

Delving Deep into Engagement Prediction of Short Videos

Dasong Li, Wenjie Li, Baili Lu et al.

ECCV 2024, arXiv:2410.00289
6 citations
#12

nuCraft: Crafting High Resolution 3D Semantic Occupancy for Unified 3D Scene Understanding

Benjin Zhu, Zhe Wang, Hongsheng Li

ECCV 2024
5 citations
#13

SPHINX: A Mixer of Weights, Visual Embeddings and Image Scales for Multi-modal Large Language Models

Ziyi Lin, Dongyang Liu, Renrui Zhang et al.

ECCV 2024
#14

Three Things We Need to Know About Transferring Stable Diffusion to Visual Dense Prediction Tasks

Manyuan Zhang, Guanglu Song, Xiaoyu Shi et al.

ECCV 2024