MJSynth
Classification
OCR/Text Detection
License: Unknown

Overview

This is a synthetically generated dataset, which we found sufficient for training text recognition models that work on real-world images.

Synthetic Data Engine process

This dataset consists of 9 million images covering 90k English words, and includes the training, validation and test splits used in our work.
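As a rough illustration of how the word labels might be read, the sketch below assumes that each image file name embeds its ground-truth word between underscores (for example `115_Lube_45484.jpg`) and that the split files list one relative image path per line, optionally followed by a lexicon index. Neither convention is stated on this page, so check both against your downloaded copy. The helper name `parse_annotation_line` is hypothetical.

```python
from pathlib import PurePosixPath


def parse_annotation_line(line: str) -> tuple[str, str]:
    """Return (relative_image_path, word) for one line of a split file.

    Assumed (not confirmed by this page) line format:
        ./2425/1/115_Lube_45484.jpg 45484
    where the file name is <id>_<word>_<lexicon index>.jpg.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty annotation line")
    path = fields[0]
    # Drop the directories and extension, then split the stem on underscores.
    parts = PurePosixPath(path).stem.split("_")
    if len(parts) < 3:
        raise ValueError(f"unexpected file name layout: {path!r}")
    # Everything between the first and last field is treated as the word.
    word = "_".join(parts[1:-1])
    return path, word


if __name__ == "__main__":
    print(parse_annotation_line("./2425/1/115_Lube_45484.jpg 45484"))
```

Reading labels from file names keeps the sketch independent of any separate lexicon file. If your copy ships a lexicon, looking up the trailing index there may be more robust.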

Citation

@InProceedings{Jaderberg14c,
  author       = "Max Jaderberg and Karen Simonyan and Andrea Vedaldi and Andrew Zisserman",
  title        = "Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition",
  booktitle    = "Workshop on Deep Learning, NIPS",
  year         = "2014",
}

@Article{Jaderberg16,
  author       = "Max Jaderberg and Karen Simonyan and Andrea Vedaldi and Andrew Zisserman",
  title        = "Reading Text in the Wild with Convolutional Neural Networks",
  journal      = "International Journal of Computer Vision",
  number       = "1",
  volume       = "116",
  pages        = "1--20",
  month        = "jan",
  year         = "2016",
}