VoxForge

What VoxForge does
VoxForge collects speech from users all around the world. We are working to create a Free speech corpus (a database of speech audio files and their text transcriptions) that can be used to build acoustic models for Free and Open Source speech recognition engines.

We have corpora in English (our largest corpus), German, Dutch, and Russian. We are also working to add Italian and Hebrew to the mix. We collect speech in several ways: through a Java applet, by telephone, and, of course, with Audacity.

We take a user's submission, and depending on its original formatting, we might downsample it to our standard formats of:



Each submission contains the following directory structure:

* [submitterID]-[date]-[3 random characters]
  * etc
    * audiofile_details - contains the submission's original formatting information (sampling rate, bits per sample).
    * GPL_License - full GPL license.
    * HDMan_log - console output from HTK's HDMan tool; gives the phone usage counts for the submission.
    * HVite_log - console output from HTK's HVite tool, which runs a "re-alignment" of the training data (to find the best pronunciation for a given word). Its main purpose is a sanity check to make sure an audio recording matches its prompt.
    * PROMPTS - sanitized prompts file (with some punctuation removed) that includes the path to the audio for acoustic model training.
    * prompts-original - prompts in their original format.
    * README - information about the user (gender, age range, language, pronunciation dialect) and their recording environment.
  * wav file - the actual audio.
  * LICENSE - GPL license notice.
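As a rough illustration of the layout described above, here is a minimal Python sketch that checks an unpacked submission for the listed entries. It is not part of VoxForge's own tooling. The function names are hypothetical, the file names are taken from the listing (with audiofile_details assumed to be a single word), and the name parser assumes that neither the date nor the 3-character suffix contains a hyphen. The audio itself is not checked, since the listing does not pin down how the wav file is named.

```python
from pathlib import Path

# Names under etc/, as given in the listing above (assumed spelling).
ETC_FILES = [
    "audiofile_details",
    "GPL_License",
    "HDMan_log",
    "HVite_log",
    "PROMPTS",
    "prompts-original",
    "README",
]


def split_submission_name(name):
    """Split '[submitterID]-[date]-[3 random characters]' into three parts.

    Assumption: the date and the suffix contain no hyphens, so splitting
    from the right is safe even if the submitter ID contains hyphens.
    """
    parts = name.rsplit("-", 2)
    if len(parts) != 3 or len(parts[2]) != 3:
        raise ValueError(f"unexpected submission name: {name!r}")
    return tuple(parts)


def missing_entries(submission_dir):
    """Return the expected entries that are absent, as relative POSIX paths."""
    root = Path(submission_dir)
    expected = [Path("etc") / f for f in ETC_FILES] + [Path("LICENSE")]
    return [p.as_posix() for p in expected if not (root / p).exists()]
```

Calling missing_entries() on an unpacked submission directory returns an empty list when every listed entry is present, which makes it easy to spot incomplete submissions before using them for training.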

Free Speech... Recognition

http://www.voxforge.org