Kannada-MNIST
Classification
MNIST
License: Unknown

Overview

Here, we disseminate a new handwritten-digits dataset for the Kannada script, termed Kannada-MNIST,
which can potentially serve as a direct drop-in replacement for the original MNIST dataset.

Data Collection

This dataset is based on the efforts of 65 volunteers recruited in Bangalore, India, who are native
speakers of the Kannada language and day-to-day users of its numeral script. It was curated to
serve as a direct one-to-one drop-in replacement for the original MNIST dataset (akin to the
Fashion-MNIST and K-MNIST datasets).

Each volunteer filled out an A3 sheet containing a 32 × 40 grid. This yielded filled-out A3 sheets
containing 128 instances of each digit, which we posit is large enough to capture most of the
natural intra-volunteer variations of the glyph shapes. All of the sheets thus collected were
scanned at 600 dots-per-inch resolution using the Konica Accurio-Press-C6085 scanner, which
yielded 65 PNG images of 4963 × 3509 pixels each.
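
The counts in the collection procedure above can be checked with a little arithmetic. The sketch below assumes that every grid cell holds exactly one glyph and that the ten digits are evenly distributed across the grid. Neither detail is spelled out in the description, so the total is an upper bound on raw glyphs rather than a reported figure.

```python
# Figures taken from the collection description; the even split
# across digits is an assumption made for this sketch.
GRID_ROWS, GRID_COLS = 32, 40
NUM_DIGITS = 10        # digits 0-9
NUM_VOLUNTEERS = 65

cells_per_sheet = GRID_ROWS * GRID_COLS                # cells on one A3 sheet
instances_per_digit = cells_per_sheet // NUM_DIGITS    # per volunteer, per digit
total_glyphs = cells_per_sheet * NUM_VOLUNTEERS        # across all 65 sheets

print(cells_per_sheet, instances_per_digit, total_glyphs)
```

Under these assumptions, one sheet holds 1280 cells, which matches the stated 128 instances of each digit per volunteer.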

Data Format

The main Kannada-MNIST dataset consists of a training set of 60,000 gray-scale sample images,
each 28 × 28 pixels.
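
To make the image dimensions concrete, here is a minimal sketch of handling one 28 × 28 sample as a flat sequence of pixel values. The flat row-major layout, the 8-bit pixel depth, and the helper name `to_rows` are assumptions for illustration only; the page does not specify the on-disk file format.

```python
IMAGE_SIDE = 28
NUM_TRAIN = 60_000
PIXELS_PER_IMAGE = IMAGE_SIDE * IMAGE_SIDE  # 784 pixels per sample


def to_rows(flat_pixels):
    """Split a flat sequence of 784 pixel values into 28 rows of 28.

    Hypothetical helper: assumes row-major ordering.
    """
    if len(flat_pixels) != PIXELS_PER_IMAGE:
        raise ValueError(
            f"expected {PIXELS_PER_IMAGE} pixels, got {len(flat_pixels)}"
        )
    return [
        list(flat_pixels[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE])
        for r in range(IMAGE_SIDE)
    ]


# Placeholder all-zero image standing in for a real sample.
blank = bytes(PIXELS_PER_IMAGE)
rows = to_rows(blank)

# Raw pixel payload of the training set, assuming one byte per pixel.
raw_train_bytes = NUM_TRAIN * PIXELS_PER_IMAGE
```

With one byte per pixel, the raw training pixels alone would occupy 47,040,000 bytes; labels, container overhead and any further files are not accounted for in this figure.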

Citation

Please use the following citation when referencing the dataset:

@article{prabhu2019kannada,
  title={Kannada-MNIST: A new handwritten digits dataset for the Kannada language},
  author={Prabhu, Vinay Uday},
  journal={arXiv preprint arXiv:1908.01242},
  year={2019}
}

Data Summary

Type: Image
Amount: 60K
Size: 64.19MB
Provided by: Vinay Uday Prabhu (PhD, ECE, Carnegie Mellon University; Machine Learning and Data Sciences at UnifyID)