In the last tutorial we learned how to output sound from our Python script using the gTTS module. In this video we will do the opposite: we will get voice input from the user and turn it into text that we can process.
Let's get started by creating a function called get_audio.
This function will detect a user's voice, convert the audio to text, and return that text to us. It will even wait until the user starts speaking before it begins recording.
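The listen, transcribe, and return steps described above can be sketched roughly as follows. This is a minimal sketch, not the tutorial's exact function: the name `listen_and_transcribe` is hypothetical, and it takes the recognizer and microphone as parameters (assumed to behave like `sr.Recognizer()` and `sr.Microphone()` from the speech_recognition package) so the flow can be tried without real hardware.

```python
def listen_and_transcribe(recognizer, microphone):
    """Record one phrase from `microphone` and return it as text.

    Hypothetical helper: `recognizer` is assumed to offer listen() and
    recognize_google(), and `microphone` is assumed to be usable in a
    `with` block, as the speech_recognition objects are.
    Returns an empty string if the audio could not be transcribed.
    """
    # Open the audio source and record a single phrase from it.
    with microphone as source:
        audio = recognizer.listen(source)

    try:
        # Send the recorded audio off for transcription.
        return recognizer.recognize_google(audio)
    except Exception as e:
        # Report the problem and fall back to an empty result.
        print("Exception: " + str(e))
        return ""
```

With the real library installed, this would be called as `listen_and_transcribe(sr.Recognizer(), sr.Microphone())`. Returning an empty string on failure keeps later checks such as `"hello" in text` safe to run.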
We'll do some more advanced things using specific commands in later videos, but for now let's test out responding to some simple messages from our user.
    text = get_audio()

    if "hello" in text:
        speak("hello, how are you?")
    elif "what is your name" in text:
        speak("My name is Tim")
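As the list of phrases grows, the same check could be driven by a small table instead of a chain of if/elif branches. The sketch below is one possible way to do that, not part of the tutorial's code: `RESPONSES` and `pick_response` are hypothetical names, and lowercasing the text is an added assumption in case the recognized text comes back capitalized.

```python
# Hypothetical phrase table; the phrases and replies mirror the example above.
RESPONSES = [
    ("hello", "hello, how are you?"),
    ("what is your name", "My name is Tim"),
]


def pick_response(text):
    """Return the reply for the first phrase found in `text`, or None."""
    # Assumption: compare in lowercase so "Hello" and "hello" both match.
    lowered = text.lower()
    for phrase, reply in RESPONSES:
        if phrase in lowered:
            return reply
    # No known phrase was found in the text.
    return None
```

Like the if/elif version, the first matching phrase wins. The result would then be passed to `speak` only when it is not `None`.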
Now try testing it out by running the program and seeing what the computer says back to you.
Just make sure that, before running this, you've set up a microphone as an input device in your computer's settings. Also make sure you have an audio output device!
    import os
    import time
    import playsound
    import speech_recognition as sr
    from gtts import gTTS


    def speak(text):
        tts = gTTS(text=text, lang="en")
        filename = "voice.mp3"
        tts.save(filename)
        playsound.playsound(filename)


    def get_audio():
        r = sr.Recognizer()
        with sr.Microphone() as source:
            audio = r.listen(source)
            said = ""

            try:
                said = r.recognize_google(audio)
                print(said)
            except Exception as e:
                print("Exception: " + str(e))

        return said


    text = get_audio()

    if "hello" in text:
        speak("hello, how are you?")
    elif "what is your name" in text:
        speak("My name is Tim")