Phimanu/TextSR


Code for "Text Image Super-Resolution for Improved OCR in Real-Life Scenarios using Swin Transformers"

Step 0 - Automatic Setup

For the initial setup (downloading weights and test data) run

bash prepare.bash

and then start the benchmark with

bash benchmark.bash

If either script fails, follow the remaining steps manually. Otherwise, you are done :).

Step 1 - Downloading

Download the pretrained weights for Aster, MORAN and CRNN:

ASTER: https://github.com/ayumiymk/aster.pytorch
MORAN: https://github.com/Canjie-Luo/MORAN_v2
CRNN: https://github.com/meijieru/crnn.pytorch

Place them into the folder model_zoo and rename them to aster.pth.tar, moran.pth and crnn.pth.
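After placing and renaming the weights, a quick check along these lines can confirm the expected filenames are in place before moving on (this snippet is a convenience sketch, not part of the repository):

```shell
# Sanity check (sketch, not part of the repo): verify that model_zoo
# exists and contains the three renamed checkpoint files.
mkdir -p model_zoo
for f in aster.pth.tar moran.pth crnn.pth; do
  if [ -f "model_zoo/$f" ]; then
    echo "found   model_zoo/$f"
  else
    echo "missing model_zoo/$f" >&2
  fi
done
```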

Download the TextZoom dataset:

https://github.com/JasonBoy1/TextZoom

Change the paths in extract_images.py to match the locations of the test folders you just downloaded:

test_paths = {
    'easy': 'textzoom/test/easy', #To be changed
    'medium': 'textzoom/test/medium', #To be changed
    'hard': 'textzoom/test/hard' #To be changed
}
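Before running the extraction, a small check like the following can confirm the configured folders actually exist (a hypothetical helper for illustration; `missing_paths` is our own name and is not part of extract_images.py):

```python
import os

# Same dictionary as in extract_images.py; adjust to your local layout.
test_paths = {
    'easy': 'textzoom/test/easy',
    'medium': 'textzoom/test/medium',
    'hard': 'textzoom/test/hard',
}

def missing_paths(paths):
    """Return the difficulty levels whose directory is not found on disk."""
    return [level for level, path in paths.items() if not os.path.isdir(path)]

if __name__ == '__main__':
    missing = missing_paths(test_paths)
    if missing:
        print('Update extract_images.py, missing folders for:', ', '.join(missing))
    else:
        print('All TextZoom test folders found.')
```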

Download our pretrained models for Phase 2 and Phase 3 from here:

https://drive.google.com/drive/folders/14UggkVJH3RPQwF-B_mj0bHEtjoloP3if?usp=sharing

Place them into the model_zoo folder.

Step 2 - Image Extraction

Run our image extraction script to extract LR and HR images:

python extract_images.py

Step 3 - Evaluate

Run the evaluation scripts for image quality and text recognition accuracy:

bash demo_image_quality.bash
bash demo_text_recognition.bash

References

Our code is built on top of KAIR, a super-resolution repository: https://github.com/cszn/KAIR


@inproceedings{liang2021swinir,
title={SwinIR: Image Restoration Using Swin Transformer},
author={Liang, Jingyun and Cao, Jiezhang and Sun, Guolei and Zhang, Kai and Van Gool, Luc and Timofte, Radu},
booktitle={IEEE International Conference on Computer Vision Workshops},
pages={1833--1844},
year={2021}
}

@article{bshi2018aster,
  author  = {Baoguang Shi and
               Mingkun Yang and
               Xinggang Wang and
               Pengyuan Lyu and
               Cong Yao and
               Xiang Bai},
  title   = {ASTER: An Attentional Scene Text Recognizer with Flexible Rectification},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  pages   = {1-1},
  year    = {2018},
}

@article{cluo2019moran,
  author    = {Canjie Luo and Lianwen Jin and Zenghui Sun},
  title     = {MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition},
  journal   = {Pattern Recognition}, 
  volume    = {90}, 
  pages     = {109--118},
  year      = {2019},
  publisher = {Elsevier}
}

@article{shi2016end,
  title={An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition},
  author={Shi, Baoguang and Bai, Xiang and Yao, Cong},
  journal={IEEE transactions on pattern analysis and machine intelligence},
  volume={39},
  number={11},
  pages={2298--2304},
  year={2016},
  publisher={IEEE}
}

@inproceedings{wang2020scene,
  title={Scene text image super-resolution in the wild},
  author={Wang, Wenjia and Xie, Enze and Liu, Xuebo and Wang, Wenhai and Liang, Ding and Shen, Chunhua and Bai, Xiang},
  booktitle={European Conference on Computer Vision},
  pages={650--666},
  year={2020},
  organization={Springer}
}

The ASTER, MORAN, and CRNN models are copied from their respective GitHub repositories:

ASTER: https://github.com/ayumiymk/aster.pytorch
MORAN: https://github.com/Canjie-Luo/MORAN_v2
CRNN: https://github.com/meijieru/crnn.pytorch
