Lianwen Jin

Papers: 13

Total Citations: 215

Papers (13)

FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy

DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining

M2Doc: A Multi-Modal Fusion Approach for Document Layout Analysis

Bridging the Gap Between End-to-End and Two-Step Text Spotting

DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding

Revisiting Tampered Scene Text Detection in the Era of Generative AI

Predicting the Original Appearance of Damaged Historical Documents

DocNLC: A Document Image Enhancement Framework with Normalized and Latent Contrastive Representation for Multiple Degradations

Towards Modern Image Manipulation Localization: A Large-Scale Dataset and Novel Methods

DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming

UPOCR: Towards Unified Pixel-Level OCR Interface