These are the detailed steps I went through to install PocketSphinx on my element14 Road Test Raspberry Pi 3B. (Thank you guys again - I love it!)
(It got late and I stopped writing things down toward the end - sorry.)
Refer to:
"Roomba, I Command Thee: Use Raspberry Pi for Voice Control" on makezine.com
http://makezine.com/projects/use-raspberry-pi-for-voice-control/
First, go get the packages required for SphinxBase by executing:
sudo apt-get update
sudo apt-get install libasound2-dev autoconf libtool bison \
swig python-dev python-pyaudio
You’ll also need to install some Python libraries for use with the demo application. To do this, install the Python pip command and then use it to install the libraries:
curl -O https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo pip install gevent grequests
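As an optional check (not part of the original article), you can make sure the libraries import cleanly:
python -c "import gevent, grequests; print('Python libraries OK')"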
OBTAINING THE SPHINX TOOLS
Now you can go about getting the SphinxBase package, which is used by PocketSphinx as well as other software in the CMU Sphinx family.
To obtain SphinxBase, execute the following commands:
git clone git://github.com/cmusphinx/sphinxbase.git
cd sphinxbase
git checkout 3b34d87
./autogen.sh
make
(At this stage you may want to go make coffee …)
sudo make install
cd ..
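If you want a quick sanity check that the install landed where expected (this assumes the default /usr/local prefix), you can list the installed SphinxBase files:
ls /usr/local/lib/libsphinxbase*
ls /usr/local/include/sphinxbase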
You’re ready to move on to PocketSphinx.
To obtain PocketSphinx, execute the following commands:
git clone git://github.com/cmusphinx/pocketsphinx.git
cd pocketsphinx
git checkout 4e4e607
./autogen.sh
make
(Time for a second cup of coffee …)
sudo make install
cd ..
To update the system with your new libraries, run:
sudo ldconfig
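As a quick optional check, you can confirm the linker now sees the new libraries by querying the cache:
ldconfig -p | grep -i sphinx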
TESTING THE SPEECH RECOGNITION
Now that you have the building blocks of your speech recognition in place, you’ll want to test that it actually works before continuing.
Now you can run a test of PocketSphinx using:
pocketsphinx_continuous -inmic yes
You should see something like the following, which indicates the system is ready for you to start speaking:
...
...
Listening...
Input overrun, read calls are too rare (non-fatal)
You can safely ignore the warning. Go ahead and speak!
When you’re finished, you should see some technical information along with PocketSphinx’s best guess as to what you said, and then another READY prompt letting you know it’s ready for more input.
INFO: ngram_search.c(874): bestpath 0.10 CPU 0.071 xRT
INFO: ngram_search.c(877): bestpath 0.11 wall 0.078 xRT
what
READY....
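If pocketsphinx_continuous does not pick up the microphone you expect, you can point it at a specific ALSA capture device with the -adcdev option. The plughw:1,0 name below is just an example; list your capture devices with arecord -l first:
arecord -l
pocketsphinx_continuous -inmic yes -adcdev plughw:1,0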
RECO FROM FILE WITH A LARGE LM:
arecord -f S16_LE -r 16000 test16k.wav
pocketsphinx_continuous -infile test16k.wav 2>&1 | tee ./psphinx.log
xRT = sum of the fwdtree, fwdflat, and bestpath CPU xRT values (from Nikolay)
My files are located at: https://github.com/slowrunner/Pi3RoadTest
Pi3RoadTest/ top directory
recoMic/ contains the recognition test using PocketSphinx with the microphone
recoFile/ contains the recognition test and results for PocketSphinx with file input
Copy the folder cmusphinx-5prealpha-en-us-ptm-2.0 into recoMic/ and into recoFile/.
Download the prebuilt acoustic model from the Sphinx SourceForge site:
http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English%20Generic%20Acoustic%20M…
Copy the download into recoMic/ and into recoFile/, then extract it in each directory with:
tar -xvf cmusphinx-en-us-ptm-5.2.tar.gz
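Once the model is extracted, you can point pocketsphinx_continuous at it with the -hmm option. This is just a sketch of the kind of command involved; it assumes the tarball extracts to a cmusphinx-en-us-ptm-5.2 directory in the current folder (the notes files mentioned below have the exact commands I used):
pocketsphinx_continuous -hmm cmusphinx-en-us-ptm-5.2 -infile test16k.wav 2>&1 | tee ./psphinx.log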
See the notes file in each dir for exact commands to run the test and how to process the log file to get results statistics.
================
In LM mode the log shows performance information (the latest PocketSphinx supposedly shows it in grammar mode as well).
For an individual utterance or for the TOTAL:
Add up the fwdtree + fwdflat + bestpath CPU times = CPU time spent recognizing.
Add up the fwdtree + fwdflat + bestpath xRT values (a value below 1, e.g. 0.52, means 1 s of audio takes 0.52 s of CPU time, i.e. 52% of one core, to perform the reco).
(To calculate the length of audio processed, divide the total CPU time by the xRT value, e.g. 75.7 s / 0.52 ≈ 146 s of audio.)
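As a rough sketch of how you might total the CPU time from psphinx.log (the field position is an assumption based on the INFO lines shown above, so adjust it if your log format differs; the grep -v TOTAL just guards against any end-of-run summary lines):
grep -E '(fwdtree|fwdflat|bestpath) .* CPU ' psphinx.log | grep -v TOTAL \
| awk '{ cpu += $4 } END { printf "CPU time spent recognizing: %.1f s\n", cpu }'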
On the Pi 3 (126-phrase corpus using 136 words, 278 bi-grams, 295 tri-grams):
1.2 GHz, single-core processing
64 phrases:
Total CPU:  75.7 s   0.52 xRT   146 s of audio
Total wall: 190 s    1.3 xRT    (146 s of audio)
Wall time includes startup, teardown, output, and logging.
Reco from the mic cannot be faster than real time.
0.52 xRT CPU means 52% of one core is used by the ASR.