Playing Sound

This tutorial will show you how to create a Python voice assistant!

Installing Packages

For this tutorial, we will use a few different Python packages to get input from the microphone and to output speech through the computer's speakers.

Modules Needed
- PyAudio
- SpeechRecognition
- gTTS
- playsound

We can install these with pip. If for some reason your pip is not working, check out this link.

To install the first three modules, open a command prompt window and type the following. PyAudio is installed separately in the next section.

pip install gTTS
pip install SpeechRecognition
pip install playsound
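
Once the commands above have finished, a quick check like the following can confirm that the packages are visible to Python. This is only a minimal sketch: the helper name `missing_modules` is made up for illustration, and note that the import names (gtts, speech_recognition, playsound) differ from the pip package names.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that Python cannot find for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names, not pip package names.
required = ["gtts", "speech_recognition", "playsound"]
missing = missing_modules(required)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All required modules are available.")
```

After PyAudio is installed in the next section, "pyaudio" can be added to the list to check it as well.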

Installing PyAudio

SpeechRecognition needs PyAudio to read from the microphone, so we need to install it as well. You can attempt to install it with pip, but I have personally run into issues doing that, so below are instructions for installing PyAudio from a prebuilt wheel (.whl) file.

Step 1: Determine your Python version.
You can do this by typing "python" or "python3" in your cmd window. The version is shown in the first line the interpreter prints.

Step 2: Download appropriate PyAudio wheel file from here.
Look for a wheel file that matches the version and architecture (32-bit or 64-bit) of your Python install.

Step 3: Open a CMD window in the same directory as the downloaded .whl file.

Step 4: Install the .whl file with pip: type "pip install " followed by the name of the downloaded .whl file.
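
If you are unsure which wheel matches your install, Python itself can report the two values the steps above ask for. The snippet below is a small sketch (the function name `interpreter_info` is just for illustration); it derives 32-bit vs 64-bit from the size of a pointer in the running interpreter.

```python
import struct
import sys


def interpreter_info():
    """Return (version, bits), for example ("3.7", 64), for this interpreter."""
    version = "{}.{}".format(sys.version_info.major, sys.version_info.minor)
    bits = struct.calcsize("P") * 8  # size of a pointer, in bits
    return version, bits


version, bits = interpreter_info()
print("Python {} ({}-bit)".format(version, bits))
```

Run it with the same interpreter you plan to use for the assistant, since several Python installs on one machine can differ in version and architecture.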

Importing Modules

We will start by importing all the things we will be using in the next few tutorials.

import speech_recognition as sr
from gtts import gTTS
import os
import time
import playsound

Playing Sound

Now that we have these modules installed, we can start writing the code to play sound from our computer's speakers. We will use the gTTS (Google Text-to-Speech) module to do this.

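def speak(text):
    # Convert the text to speech, save it as an .mp3 file, then play it
    tts = gTTS(text=text, lang='en')
    filename = 'voice.mp3'
    tts.save(filename)
    playsound.playsound(filename)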

This function creates an audio file (.mp3) of the Google voice saying whatever text we pass in. It saves that file to the same directory as our Python script (assuming the script is run from that directory), then loads it with playsound and plays it.

To use this function we can call it and pass some text.

speak("hello tim")
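
Because speak always writes to voice.mp3, every call overwrites the same file and leaves it on disk afterwards. If that is not what you want, one possible variation is to write each clip to a temporary file and delete it after playback. The sketch below is an assumption-laden alternative, not part of the tutorial's code: `temp_mp3_path` and `speak_once` are hypothetical names, and it relies on playsound blocking until playback finishes, which is its default behaviour.

```python
import os
import tempfile


def temp_mp3_path():
    """Create an empty .mp3 file in the system temp directory and return its path."""
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)  # close our handle; gTTS reopens the file by name when saving
    return path


def speak_once(text):
    """Speak `text`, then delete the audio file (hypothetical variant of speak)."""
    # Imported here so temp_mp3_path can be used without these packages installed.
    from gtts import gTTS
    import playsound

    path = temp_mp3_path()
    try:
        gTTS(text=text, lang="en").save(path)
        playsound.playsound(path)  # blocks until the clip has finished playing
    finally:
        os.remove(path)  # clean up even if saving or playback failed
```

It is called the same way as the original, for example speak_once("hello tim"), and like speak it needs an internet connection because gTTS requests the audio from Google.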