Thursday, January 31, 2008

STEP 5 Coding the Data

This is the last step in the Data Preparation stage. We need to convert the audio wav files to another format,
called MFCC (Mel-frequency cepstral coefficients).
We create a script file, codetr.scp, that lists each source audio file followed by the name of the MFCC file it
will be converted to, one pair per line. This file is passed as a parameter to the HCopy tool, which performs
the conversion.
wav/S0001.wav mfcc/S0001.mfc
wav/S0004.wav mfcc/S0004.mfc
wav/S0005.wav mfcc/S0005.mfc
wav/S0008.wav mfcc/S0008.mfc
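
Writing codetr.scp by hand gets tedious as the number of recordings grows. A minimal sketch of generating its lines is shown here; the function name scp_lines and the default folder names are our own assumptions, chosen to match the paths used in this post.

```python
from pathlib import PurePosixPath


def scp_lines(wav_names, wav_dir="wav", mfcc_dir="mfcc"):
    """Return one 'source target' line per recording for the HCopy script file."""
    lines = []
    for name in sorted(wav_names):
        stem = PurePosixPath(name).stem  # "S0001.wav" -> "S0001"
        lines.append(f"{wav_dir}/{stem}.wav {mfcc_dir}/{stem}.mfc")
    return lines


# Example with two of the file names used in this post.
print("\n".join(scp_lines(["S0004.wav", "S0001.wav"])))
```

The printed lines can be saved as codetr.scp in the working folder.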

The HCopy command performs the conversion from wav format to MFCC. To do this, it requires a configuration file
that specifies all the needed conversion parameters. Create a file called wav_config containing the following parameters:

#Coding parameters
TARGETKIND = MFCC_0
TARGETRATE = 100000.0
SAVECOMPRESSED = T
SAVEWITHCRC = T
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12
ENORMALISE = F
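
HTK expresses TARGETRATE and WINDOWSIZE in units of 100 ns, so the values above mean a 10 ms frame shift and a 25 ms analysis window. With NUMCEPS = 12 and the _0 qualifier in TARGETKIND, each frame should hold 13 values (12 cepstra plus the 0th coefficient). The small sketch below only does this arithmetic; the helper name htk_units_to_ms is our own.

```python
def htk_units_to_ms(value):
    """Convert an HTK time value (units of 100 ns) to milliseconds."""
    return value / 10000.0  # 10,000 units of 100 ns make 1 ms


frame_shift_ms = htk_units_to_ms(100000.0)   # TARGETRATE
window_ms = htk_units_to_ms(250000.0)        # WINDOWSIZE
frames_per_second = 1000.0 / frame_shift_ms

print(frame_shift_ms, window_ms, frames_per_second)  # prints: 10.0 25.0 100.0
```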

Now create a new directory called mfcc in the working folder and execute the HCopy command from the working directory as follows:
cmd> HCopy -T 1 -C wav_config -S codetr.scp
This results in the creation of a series of MFCC files corresponding to the files listed in your codetr.scp script.

STEP 4 Creating the Transcriptional Files

In this step we create a Master Label File (MLF), a single file that contains a label entry for
each line in our PROMPTS file. We use the script prompts2mlf contained in the HTK_scripts directory:

perl ../HTK_scripts/prompts2mlf words.mlf prompts
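
To make the MLF layout concrete, here is a rough sketch of the conversion. It is not the prompts2mlf script itself, only our own approximation of the result it should give: a #!MLF!# header, then for each prompt a quoted label file name, one word per line, and a closing period.

```python
def prompts_to_mlf(prompt_lines):
    """Build word-level MLF text from prompt lines like '*/S0001 CHAWAL KA ...'."""
    out = ["#!MLF!#"]
    for line in prompt_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name, words = fields[0], fields[1:]
        out.append(f'"{name}.lab"')  # label file pattern for this utterance
        out.extend(words)            # one word per line
        out.append(".")              # a period ends each entry
    return "\n".join(out) + "\n"


print(prompts_to_mlf(["*/S0001 CHAWAL KA BHAAV DIKHAO"]))
```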
Now, create the mkphones0.led edit script with the following contents:
EX
IS sil sil
DE sp

Next we execute the HLEd command to expand the word-level transcriptions to phone-level
transcriptions, i.e. replace each word with its phonemes, and put the result in a new phone-level
Master Label File. The label editor tool HLEd does this by reviewing each word in the MLF file,
looking up the phones that make up that word in the dict file we created earlier, and writing the
result to a file called phones0.mlf.
cmd> HLEd -A -D -T 1 -l '*' -d dict -i phones0.mlf mkphones0.led words.mlf

This creates the phones0.mlf file.
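
The three commands in mkphones0.led can be pictured with a small sketch: EX expands each word into its phones, IS sil sil puts sil at the start and end of the utterance, and DE sp deletes the sp labels. This is only an illustration of the idea, not HLEd itself. The pronunciations are taken from the Hindi_Dict listing in this blog, and we assume each one ends in sp as added by the AS sp line of global.ded.

```python
# Pronunciations as they would appear in dict, with a trailing "sp" (assumption).
PRON = {
    "CHAWAL": ["ch", "aa", "w", "ax", "l", "sp"],
    "BHAAV": ["b", "h", "aa", "w", "sp"],
    "DIKHAO": ["dh", "iy", "kh", "aa", "oh", "sp"],
}


def expand_to_phones(words, pron_dict):
    """Mimic EX / IS sil sil / DE sp on one utterance."""
    phones = []
    for word in words:
        phones.extend(pron_dict[word])          # EX: word -> phones
    phones = ["sil"] + phones + ["sil"]         # IS sil sil
    return [p for p in phones if p != "sp"]     # DE sp


print(" ".join(expand_to_phones(["CHAWAL", "BHAAV", "DIKHAO"], PRON)))
# prints: sil ch aa w ax l b h aa w dh iy kh aa oh sil
```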

Wednesday, January 30, 2008

STEP 3 : RECORDING THE DATA

The training and test data are recorded using the HTK tool HSLab. HSLab is used to record each of the
utterances listed in the prompts file. For example,
HSLab S0001.wav
opens HSLab on the sound file S0001.wav, which is recorded by pressing the record button. Every line in the
prompts file has to be recorded, using the file name given at the start of that line.
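
Since every prompt line needs its own recording, it helps to list the file names up front. A minimal sketch, assuming each prompt line starts with an identifier such as */S0001 (the function name recording_names is ours):

```python
def recording_names(prompt_lines):
    """Return the wav file name to record for each prompt line."""
    names = []
    for line in prompt_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        utterance_id = fields[0].rsplit("/", 1)[-1]  # "*/S0001" -> "S0001"
        names.append(utterance_id + ".wav")
    return names


for name in recording_names(["*/S0001 CHAWAL KA BHAAV DIKHAO",
                             "*/S0004 CHAWAL KA DAAM DIKHAO"]):
    print("HSLab", name)
```

Each printed line is the HSLab command to run for one recording.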

Now we record the file S0001.wav by opening HSLab.

After recording, the phonemes have to be labeled in HSLab by listening to the word utterances, and the labels are saved in a label file named S0001.lab, i.e. with the same base name as the wave file. This process is repeated for all the wav files, producing the same number of label files, before we proceed further. Also note that the sil label marks silence in the waveform; it is needed to tell the system that the segment is silence and not the utterance of a phoneme.

..

Now, create the global.ded script in your working folder (the default script used by HDMan), which
contains:

AS sp
RS cmu
MP sil sil sp
Now, using the HDMan tool, we generate the dict and monophones1 files as follows:
HDMan -m -w wlist -n monophones1 -l dlog dict Hindi_Dict

Hindi_Dict is the Hindi dictionary we made, containing the mapping between the Hindi words used and their
respective phonemes.

AAM aa m ah
BAJRA b aa jh r aa
BHAAV b h aa w
CHAWAL ch aa w ax l
CHEENI ch ee n iy
DAAM dh aa m
DIKHAO dh iy kh aa oh

monophones1 is just the list of phonemes used in the dict file.
monophones0 is created by deleting the “sp” phoneme from monophones1.
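
The relation between the dictionary and the two monophone lists can be sketched as follows. HDMan writes monophones1 itself, so this is only an illustration; the function names and the first-seen ordering are our own assumptions.

```python
def phones_in_dict(entries):
    """Collect the distinct phones used in 'WORD p1 p2 ...' entries, in first-seen order."""
    seen = []
    for entry in entries:
        for phone in entry.split()[1:]:   # skip the word itself
            if phone not in seen:
                seen.append(phone)
    return seen


def without_sp(phones):
    """Derive the monophones0 list by dropping the short-pause model 'sp'."""
    return [p for p in phones if p != "sp"]


monophones1 = phones_in_dict(["AAM aa m ah sp", "DAAM dh aa m sp"])
print(monophones1)              # prints: ['aa', 'm', 'ah', 'sp', 'dh']
print(without_sp(monophones1))  # prints: ['aa', 'm', 'ah', 'dh']
```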

STEP 2 DICTIONARY

In this step we create a PROMPTS file that lists, sentence by sentence, the words that need to be recorded.
Each line starts with the name of the recording, and the words that follow must abide by the grammar defined in the gram file.

*/S0001 CHAWAL KA BHAAV DIKHAO CHAWAL KA BHAAV DIKHAO CHAWAL KA BHAAV DIKHAO
*/S0004 CHAWAL KA DAAM DIKHAO CHAWAL KA DAAM DIKHAO CHAWAL KA DAAM DIKHAO

Now we create the wlist file by running the script prompts2wlist, which is placed in the HTK_scripts folder during the installation of HTK 3.3. We execute the following command:

perl ../HTK_scripts/prompts2wlist prompts wlist

Now add SENT-START and SENT-END manually to the wlist file, keeping the list in sorted order.
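
The word list is simply every distinct word in the prompts plus the two sentence markers. The sketch below is our own approximation of that result, not the prompts2wlist script; it adds SENT-END and SENT-START in the same pass and keeps the list sorted.

```python
def prompts_to_wlist(prompt_lines, extra=("SENT-END", "SENT-START")):
    """Return the sorted list of distinct words in the prompts, plus the extra entries."""
    words = set(extra)
    for line in prompt_lines:
        words.update(line.split()[1:])   # the first field is the recording name
    return sorted(words)


prompts = [
    "*/S0001 CHAWAL KA BHAAV DIKHAO CHAWAL KA BHAAV DIKHAO",
    "*/S0004 CHAWAL KA DAAM DIKHAO CHAWAL KA DAAM DIKHAO",
]
print("\n".join(prompts_to_wlist(prompts)))
```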

STEPS IN BUILDING HMM: STEP 1 TASK GRAMMAR

5.3 STEPS IN BUILDING HMM

5.3.1 Data Preparation


We make a file called “gram” containing a suitable grammar for the task.
An instance of the “gram” file is:


$commodity = CHAWAL | GEHU | CHEENI | SOOJI | AAM | BAJRA | ARHAR | MOONGFALI
| MAKAI | SANTARA | SOYA;
$verb = [KA] DAAM | [KA] BHAAV;
$show = DIKHAO;
(SENT-START ($commodity $verb $show SENT-END))


Now, to represent this grammar in the form of a word network, we use the HParse tool:
cmd> HParse gram wdnet
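
To see what the grammar accepts, we can enumerate its sentences by hand. The sketch below hard-codes the three variables from the gram file, with the optional [KA] written out as two alternatives; it is only an illustration and does not read the gram file or the HParse output.

```python
from itertools import product

COMMODITIES = ["CHAWAL", "GEHU", "CHEENI", "SOOJI", "AAM", "BAJRA", "ARHAR",
               "MOONGFALI", "MAKAI", "SANTARA", "SOYA"]
# [KA] DAAM | [KA] BHAAV, with the optional KA expanded.
VERBS = [["KA", "DAAM"], ["DAAM"], ["KA", "BHAAV"], ["BHAAV"]]
SHOW = ["DIKHAO"]


def sentences():
    """Yield every word sequence allowed by the grammar."""
    for commodity, verb, show in product(COMMODITIES, VERBS, SHOW):
        yield " ".join(["SENT-START", commodity, *verb, show, "SENT-END"])


all_sentences = list(sentences())
print(len(all_sentences))   # prints: 44
print(all_sentences[0])     # prints: SENT-START CHAWAL KA DAAM DIKHAO SENT-END
```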

HTK installation

HTK-Installation on Windows:

System Requirements:

Hardware: Audio devices and the capability to record voice.

Software: ActivePerl, Visual C++, WinZip

Steps for HTK-Installation

1. Unzip the HTK file downloaded from the HTK website.
· Open a DOS command window
· cd into the htk directory
cd htk

2. Next we create a directory for the library and tools.
mkdir bin.win32
· Run VCVARS32

3. To build the library, the following commands are executed:
cd HTKLib
nmake /f htk_htklib_nt.mkf all
cd ..

4. The HTK tools are built by invoking:
cd HTKTools
nmake /f htk_htktools_nt.mkf all
cd ..
cd HLMLib
nmake /f htk_hlmlib_nt.mkf all
cd ..
cd HLMTools
nmake /f htk_hlmtools_nt.mkf all
cd ..

5. Finally, add the HTK/bin.win32 folder to the system PATH.

6. Restart the computer to make the changes effective.

7. Our HTK installation is complete and its binaries can be used from the DOS prompt.
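
A quick way to confirm that the PATH change took effect is to look up the tools used in this blog. This is a minimal sketch using Python's shutil.which; the function name missing_tools is ours, and the tool list is just the set of HTK binaries mentioned in these posts.

```python
import shutil


def missing_tools(names):
    """Return the tool names that cannot be found on the system PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools(["HCopy", "HLEd", "HDMan", "HParse", "HSLab"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All HTK tools found.")
```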