Multispectral Images of Historical Documents dataset
Text
OCR/Text Detection
License: Unknown

Overview

The MS-TEx 2015 dataset contains 10 historical document images, both handwritten and machine-printed, each accompanied by eight spectral images.
The dataset was released for a contest whose goal is to evaluate the most recent advances in text extraction from historical document images captured by a multispectral imaging system.
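
Since each document comes with eight spectral images, a quick completeness check after download can be useful. The following is a minimal sketch only: the folder and file naming (`doc_01/band_1.png` and so on) is hypothetical and is not taken from the dataset itself, so the paths would need adapting to the actual archive layout.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# From the dataset description: eight spectral images per document.
BANDS_PER_DOCUMENT = 8


def group_bands(paths):
    """Group band image paths by their parent folder (one folder per document)."""
    groups = defaultdict(list)
    for p in paths:
        path = PurePosixPath(p)
        groups[path.parent.name].append(path.name)
    # Sort file names so bands appear in a stable order within each document.
    return {doc: sorted(names) for doc, names in groups.items()}


def incomplete_documents(groups, expected=BANDS_PER_DOCUMENT):
    """Return the IDs of documents whose band count differs from `expected`."""
    return sorted(doc for doc, names in groups.items() if len(names) != expected)


# Hypothetical listing: doc_01 is complete, doc_02 is missing one band.
example_paths = [f"doc_01/band_{k}.png" for k in range(1, 9)]
example_paths += [f"doc_02/band_{k}.png" for k in range(1, 8)]

groups = group_bands(example_paths)
missing = incomplete_documents(groups)
```

In practice the list of paths would come from walking the extracted dataset directory; a non-empty `missing` list suggests an incomplete download or a naming pattern that differs from the assumed one.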

Citation

@INPROCEEDINGS{6628610,
    author={Hedjam, R. and Cheriet, M.},
    booktitle={Document Analysis and Recognition (ICDAR), 2013 12th
    International Conference on},
    title={Ground-Truth Estimation in Multispectral Representation Space:
    Application to Degraded Document Image Binarization},
    year={2013},
    month={Aug},
    pages={190-194},
    doi={10.1109/ICDAR.2013.45},
    ISSN={1520-5363}
}

@article{DBLP:journals/pr/HedjamC13,
    author = {Rachid Hedjam and
    Mohamed Cheriet},
    title = {Historical document image restoration using multispectral imaging
    system},
    journal = {Pattern Recognition},
    volume = {46},
    number = {8},
    pages = {2297--2312},
    year = {2013},
    url = {http://dx.doi.org/10.1016/j.patcog.2012.12.015},
    doi = {10.1016/j.patcog.2012.12.015},
    timestamp = {Mon, 08 Apr 2013 20:24:35 +0200},
    biburl = {http://dblp.uni-trier.de/rec/bib/journals/pr/HedjamC13},
    bibsource = {dblp computer science bibliography, http://dblp.org}
}

Data Summary
Type: Image
Amount: --
Size: --
Provided by: Synchromedia
Synchromedia is a research group working across a wide range of research areas. It comprises several professors, researchers, and graduate students, as well as adjunct members from other departments and universities.