Setting Up CloudSpeech API on the Raspberry Pi with VoiceHAT

This notebook is licensed under the Apache License, Version 2.0. Written by Geoffrey Momin.
We recommend starting from a clean installation of Raspbian on an SD card of at least 16 GB. Plenty of guides online walk you through the Raspbian setup process.
Once you're ready to install CloudSpeech, continue below:

Installing the Dependencies

sudo apt-get install git
cd
git clone https://github.com/geoffmomin/aiyprojects-raspbian.git AIY-projects-python

Now set up the services:

cd ~/AIY-projects-python
scripts/install-deps.sh
sudo scripts/install-services.sh

Configuring the VoiceHAT driver

To use the VoiceHAT, make sure you are running the latest Raspbian kernel. Then configure ALSA:

sudo scripts/configure-driver.sh
sudo reboot

After the reboot, run:

cd ~/AIY-projects-python
sudo scripts/install-alsa-config.sh
python3 checkpoints/check_audio.py
sudo reboot

Now get your Google Cloud Platform Credentials

Google's guide to setting up your credentials is available here: https://cloud.google.com/docs/authentication/
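
Before moving on, it can help to confirm that the downloaded key file looks usable. The following is a minimal sketch, assuming you created a service account key in JSON format and saved it as ~/cloud_speech.json (an assumed location; adjust it to match your setup). The helper name missing_credential_fields is hypothetical. It only checks that the usual service-account fields are present, not that the key is actually valid.

```python
import json
import os

# Fields normally present in a Google service-account JSON key.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def missing_credential_fields(path):
    """Return the required fields that are absent from the JSON key at path."""
    with open(os.path.expanduser(path)) as f:
        data = json.load(f)
    return [field for field in REQUIRED_FIELDS if field not in data]


if __name__ == "__main__":
    # Assumed location; change it if your key file lives elsewhere.
    key_path = "~/cloud_speech.json"
    if not os.path.exists(os.path.expanduser(key_path)):
        print("No key file found at", key_path)
    else:
        missing = missing_credential_fields(key_path)
        print("Missing fields:", missing if missing else "none")
```

An empty list means every expected field was found; anything else suggests the wrong kind of file was downloaded.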

Time to Code!

We are going to develop this program in Python 3. Make sure a recent Python 3 release is installed. You can check by running "python3 --version" (without quotes) in a terminal.
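
The same check can be made from inside the notebook. This is a small sketch using only the standard library; the function name is_python3 is hypothetical and not part of the AIY code.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version tuple is Python 3 or newer."""
    return version_info[0] >= 3


if __name__ == "__main__":
    # sys.version starts with the version number, e.g. "3.5.3 (default, ...)".
    print("Running Python", sys.version.split()[0])
    if not is_python3():
        print("Please run this notebook with python3")
```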

Import all dependencies


In [ ]:
import aiy.audio
import aiy.cloudspeech
import aiy.voicehat
import serial

# Open the serial link at 9600 baud; adjust the device path if yours differs.
ser = serial.Serial('/dev/ttyUSB0', 9600)

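The modules above handle audio recording (aiy.audio), calls to the CloudSpeech API (aiy.cloudspeech), and the VoiceHAT button and LED (aiy.voicehat). The serial module opens a 9600-baud connection on /dev/ttyUSB0, which the main function uses to send single-character commands. Now you can define the main function.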


In [ ]:
def main():
    recognizer = aiy.cloudspeech.get_recognizer()
    # Hint the recognizer with the phrases this program acts on.
    recognizer.expect_phrase("move forward")
    recognizer.expect_phrase("move backwards")
    recognizer.expect_phrase("stop")
    recognizer.expect_phrase("goodbye")
    
    button = aiy.voicehat.get_button()
    led = aiy.voicehat.get_led()
    aiy.audio.get_recorder().start()
    
    while True:
        print("Please press the button and speak")
        button.wait_for_press()
        print("Listening")
        text = recognizer.recognize()
        if not text:
            print("Sorry, I did not understand that")
        else:
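            print("You said:", text)
            # pyserial expects bytes in Python 3, so send byte literals.
            if 'move forward' in text:
                ser.write(b"F")
            elif 'move backwards' in text:
                ser.write(b"B")
            elif 'stop' in text:
                ser.write(b"X")
            elif 'goodbye' in text:
                # Leave the loop and end the program.
                break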

if __name__ == '__main__':
    main()
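
As the list of commands grows, the if/elif chain in main() could be replaced by a lookup table. The following is a sketch of that idea, not part of the original program: the names COMMANDS and command_for are hypothetical, the phrases and command characters mirror the ones used in main(), and the values are byte literals because pyserial's write expects bytes under Python 3. Lowercasing the recognized text is an added assumption so that capitalized transcripts still match.

```python
# (phrase, command byte) pairs, checked in order.
COMMANDS = (
    ("move forward", b"F"),
    ("move backwards", b"B"),
    ("stop", b"X"),
)


def command_for(text):
    """Return the command byte for the first known phrase in text, or None."""
    lowered = text.lower()
    for phrase, command in COMMANDS:
        if phrase in lowered:
            return command
    return None
```

Inside the loop, the result of command_for(text) could be passed to ser.write whenever it is not None, which keeps the hardware call separate from the phrase matching and makes the matching easy to test without a serial device attached.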