THCHS-30
Audio
NLP
License: Custom

Overview

Speech data is crucially important for speech recognition research. Quite a few speech
databases can be purchased at prices that are reasonable for most research institutes.
However, for young researchers who are just starting out, or for those who have only recently
taken an interest in this direction, the cost of data is still an annoying barrier. We support the
"free data" movement in speech recognition: research institutes (particularly those supported by
public funds) publish their data freely so that new researchers can obtain sufficient data
to kick off their careers. Here, we follow this trend and release a free Chinese speech database,
THCHS-30, that can be used to build a full-fledged Chinese speech recognition system.

Citation

Please use the following citation when referencing the dataset:

@article{DBLP:journals/corr/WangZ15e,
  author    = {Dong Wang and
               Xuewei Zhang},
  title     = {{THCHS-30} : {A} Free Chinese Speech Corpus},
  journal   = {CoRR},
  volume    = {abs/1512.01882},
  year      = {2015},
  url       = {http://arxiv.org/abs/1512.01882},
  archivePrefix = {arXiv},
  eprint    = {1512.01882},
  timestamp = {Mon, 13 Aug 2018 16:46:59 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/WangZ15e.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

License

Custom

Data Summary

Type: Audio
Amount: --
Size: 13.4GB
Provided by: CSLT at Tsinghua University

The Center for Speech and Language Technology (CSLT), Tsinghua University, was established with the goal of conducting cutting-edge research on intelligent human-machine interaction, particularly on speech and language technologies.
