In a typical free recall experiment, once data collection is complete, the experimenter (or a team of experience-hungry undergraduates) manually transcribes each subject's verbal responses by listening to the audio files and coding each word. This process can take hours and is not exciting, to say the least. To help with this problem, we created a decode_speech function, which wraps the Google Speech API and a software package called ffmpeg to automatically transcribe the responses. It also allows the experimenter to transcribe in (almost) real time, which makes adaptive free recall experiments a possibility. To use this feature (assuming that you are using a Mac or Linux machine), you must first set up ffmpeg and the Google Speech API:
ffmpeg
ffmpeg is a native application that processes audio and video files. We will use it to convert .wav files to the .flac format, which will allow us to send the files to Google Speech. To set it up, install Homebrew (if you don't already have it) and then use it to install ffmpeg:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install ffmpeg
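To confirm that ffmpeg is installed and to see the kind of conversion described above, the following is a minimal sketch of a .wav to .flac conversion driven from Python. The helper names build_ffmpeg_command and wav_to_flac are hypothetical and are not part of quail; the sketch only assumes that the ffmpeg executable is on your PATH and uses the standard `-i` (input file) and `-y` (overwrite output) options.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(wav_path, flac_path=None):
    """Return the ffmpeg argument list that converts a .wav file to .flac."""
    wav_path = Path(wav_path)
    # Default output: same name and folder, with a .flac extension.
    flac_path = Path(flac_path) if flac_path else wav_path.with_suffix(".flac")
    # -y overwrites an existing output file; -i names the input file.
    return ["ffmpeg", "-y", "-i", str(wav_path), str(flac_path)]


def wav_to_flac(wav_path, flac_path=None):
    """Run ffmpeg (which must be on the PATH) and return the output path."""
    cmd = build_ffmpeg_command(wav_path, flac_path)
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(cmd, check=True, capture_output=True)
    return Path(cmd[-1])


# Show the command without running it.
print(" ".join(build_ffmpeg_command("sample.wav")))
```

Calling wav_to_flac('sample.wav') on a real recording should leave a sample.flac file next to it. If the call raises FileNotFoundError for the ffmpeg executable itself, the installation step above has not taken effect in the current shell.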
Under the hood, quail uses the Google Speech API to transcribe audio responses. Note: the API is not free, but it's quite reasonable. Up to 60 minutes/month is free, and after that it costs $0.006 per 15 seconds. For a typical study (16 study/test blocks) allowing for a minute of recall after each, that is 64 fifteen-second increments, so the price comes out to ~$0.38 per subject. To set it up, follow these steps:
Sign up for a Google Cloud account.
Create a project.
Enable the Speech API.
Set up a service account.
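For budgeting, the per-subject price quoted above can be reproduced with a few lines of arithmetic. This is only a sketch: the function name recall_cost is hypothetical, the rate and free allowance are the figures quoted in this section (check current pricing before relying on them), and it assumes each recording is billed separately and rounded up to the next 15 seconds.

```python
import math

PRICE_PER_INCREMENT = 0.006        # USD per 15 seconds, as quoted above
INCREMENT_SECONDS = 15
FREE_SECONDS_PER_MONTH = 60 * 60   # first 60 minutes each month are free


def recall_cost(n_blocks, seconds_per_block, free_seconds=0):
    """Estimated USD cost of transcribing n_blocks recall recordings."""
    # Assumption: each recording is rounded up to a whole 15 s increment.
    increments_per_block = math.ceil(seconds_per_block / INCREMENT_SECONDS)
    total_seconds = n_blocks * increments_per_block * INCREMENT_SECONDS
    # Subtract whatever free allowance is still available this month.
    billable_seconds = max(0, total_seconds - free_seconds)
    return billable_seconds / INCREMENT_SECONDS * PRICE_PER_INCREMENT


# 16 study/test blocks with one minute of recall each, no free allowance left.
per_subject = recall_cost(n_blocks=16, seconds_per_block=60)
print(f"{per_subject:.2f} USD per subject")  # prints "0.38 USD per subject"
```

Passing free_seconds=FREE_SECONDS_PER_MONTH shows that a single subject of this size fits entirely within the monthly free allowance.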
If you followed these steps, a JSON-formatted API keyfile will have been downloaded to your local computer. This file is your ticket to speech decoding, so keep it safe. Everything should now be set up! Below is a basic example of how to use it:
# import
import quail
# decode speech
recall_data = quail.decode_speech('../data/sample.wav', keypath='path/to/keyfile.JSON')
# print results
print(recall_data)
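Since a subject usually produces one recording per recall period, it can be convenient to loop over several files. The sketch below is one possible way to do that; transcribe_files is a hypothetical helper, not part of quail. It takes the decoding function as an argument and relies only on the call form shown above (a path plus an optional keypath keyword), so nothing else about decode_speech is assumed.

```python
def transcribe_files(paths, decode, keypath=None):
    """Apply decode to each audio path and return {path: transcription}."""
    results = {}
    for path in paths:
        try:
            if keypath is None:
                results[path] = decode(path)
            else:
                results[path] = decode(path, keypath=keypath)
        except Exception as err:
            # Record the failure and keep going, so that one bad file
            # does not discard the rest of the session.
            results[path] = None
            print(f"could not transcribe {path}: {err}")
    return results


# Intended use (requires quail, ffmpeg, and a valid keyfile):
# import quail
# recalls = transcribe_files(['../data/sample.wav'], quail.decode_speech,
#                            keypath='path/to/keyfile.JSON')
```

Files that could not be transcribed map to None, which makes them easy to find and re-run or transcribe by hand.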
The credentials can also be set up as an environment variable. To do this, edit your .bash_profile, adding the line:
export GOOGLE_APPLICATION_CREDENTIALS='/path/to/keyfile.JSON'
You'll need to launch a fresh terminal instance, and then the decode_speech function should work without the explicit keypath:
# decode speech
recall_data = quail.decode_speech('../data/sample.wav')
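If the call without a keypath fails, the usual cause is that the variable is not visible to the Python process. The following sketch checks which keyfile would be used and whether it looks like a service account key. Both helper names are hypothetical and not part of quail; the only assumptions are the GOOGLE_APPLICATION_CREDENTIALS variable described above and the "type": "service_account" field that Google writes into service account keyfiles.

```python
import json
import os


def resolve_keypath(keypath=None, environ=os.environ):
    """Return the explicit keypath if given, else the environment value."""
    # An explicit argument takes priority over the environment variable.
    return keypath or environ.get("GOOGLE_APPLICATION_CREDENTIALS")


def check_keyfile(path):
    """Raise a descriptive error if path is not a usable keyfile."""
    if not path:
        raise RuntimeError(
            "No keypath given and GOOGLE_APPLICATION_CREDENTIALS is not set."
        )
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Keyfile not found: {path}")
    with open(path) as f:
        key = json.load(f)
    # Service account keyfiles identify themselves with this field.
    if key.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key.")
    return path


print("keyfile in use:", resolve_keypath())
```

Running check_keyfile(resolve_keypath()) in the same terminal you use for quail gives a quick answer as to whether the export line has been picked up.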