Commit 9e7687ca authored by Dan Povey

Clarify README.

git-svn-id: 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
parent bad94ae0
 This directory contains example scripts that demonstrate how to
 use Kaldi. Each subdirectory corresponds to a corpus that we have
-example scripts for. Currently these are both corpora available from
+example scripts for. Currently these are all corpora available from
 the Linguistic Data Consortium (LDC).
-Explanations of the corpora are below:
+Explanations of the corpora are below.
+Note: the easiest examples to work with are rm/s3 and wsj/s3.
 wsj: The Wall Street Journal corpus. This is a corpus of read
 sentences from the Wall Street Journal, recorded under clean conditions.
@@ -20,13 +21,12 @@ Explanations of the corpora are below:
 get the same data using combinations of other catalog numbers, but this
 is the one we used).
-Recipes in progress:
+Recipes in progress (these may be less polished than the ones above).
 swbd: Switchboard. A fairly large amount of telephone speech (2-channel, 8kHz
 sampling rate).
 This directory is a work in progress.
 gp: GlobalPhone. This is a multilingual speech corpus.
 timit: TIMIT, which is an old corpus of carefully read speech.