Lianwen Jin

33
Papers
215
Total Citations

Papers (33)

FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

AAAI 2024, arXiv
74
citations

CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy

ICCV 2025, arXiv
42
citations

DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

CVPR 2024
29
citations

ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining

AAAI 2024, arXiv
18
citations

M2Doc: A Multi-Modal Fusion Approach for Document Layout Analysis

AAAI 2024
14
citations

Bridging the Gap Between End-to-End and Two-Step Text Spotting

CVPR 2024
11
citations

Revisiting Tampered Scene Text Detection in the Era of Generative AI

AAAI 2025
10
citations

DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding

CVPR 2025, arXiv
10
citations

Predicting the Original Appearance of Damaged Historical Documents

AAAI 2025
7
citations

On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering

CVPR 2020, arXiv
0
citations

Implicit Feature Alignment: Learn To Convert Text Recognizer to Text Spotter

CVPR 2021, arXiv
0
citations

Fourier Contour Embedding for Arbitrary-Shaped Text Detection

CVPR 2021, arXiv
0
citations

SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Normalization

CVPR 2022, arXiv
0
citations

SwinTextSpotter: Scene Text Spotting via Better Synergy Between Text Detection and Text Recognition

CVPR 2022, arXiv
0
citations

Look Closer To Supervise Better: One-Shot Font Generation via Component-Based Discriminator

CVPR 2022, arXiv
0
citations

Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution

CVPR 2023
0
citations

M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis

CVPR 2023
0
citations

Scale-Aware Modulation Meet Transformer

ICCV 2023, arXiv
0
citations

Revisiting Scene Text Recognition: A Data Perspective

ICCV 2023, arXiv
0
citations

ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer

ICCV 2023, arXiv
0
citations

RD-GAN: Few/Zero-Shot Chinese Character Style Transfer via Radical Decomposition and Rendering

ECCV 2020
0
citations

Don’t Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context

ECCV 2022
0
citations

Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition

CVPR 2020, arXiv
0
citations

DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming

AAAI 2025
0
citations

DocNLC: A Document Image Enhancement Framework with Normalized and Latent Contrastive Representation for Multiple Degradations

AAAI 2024
0
citations

Towards Modern Image Manipulation Localization: A Large-Scale Dataset and Novel Methods

CVPR 2024
0
citations

UPOCR: Towards Unified Pixel-Level OCR Interface

ICML 2024
0
citations

Deep Matching Prior Network: Toward Tighter Multi-Oriented Text Detection

CVPR 2017, arXiv
0
citations

Aggregation Cross-Entropy for Sequence Recognition

CVPR 2019
0
citations

Tightness-Aware Evaluation Protocol for Scene Text Detection

CVPR 2019
0
citations

ABCNet: Real-Time Scene Text Spotting With Adaptive Bezier-Curve Network

CVPR 2020, arXiv
0
citations

MSDS: A Large-Scale Chinese Signature and Token Digit String Dataset for Handwriting Verification

NeurIPS 2022
0
citations

M5HisDoc: A Large-scale Multi-style Chinese Historical Document Analysis Benchmark

NeurIPS 2023
0
citations