Making a Voice Assistant – part 1

I use my Amazon Echo devices daily for various purposes. While I find Alexa incredibly useful, it does have limitations, and it would be great if it were more customizable. So I decided to try making my own voice assistant that I can fully customize to my needs. I attempted this many years ago when I was first learning how to code, and mostly failed. At the time, the only method available to me for speech synthesis was to keep a database of pre-recorded audio files that I could play when appropriate, which was very limiting. Now, however, there is a nifty text-to-speech Python library called “pyttsx3”. Using this library alongside Google’s Web Speech API (which can be accessed using the SpeechRecognition 3.10.0 library) allowed me to put together a very capable voice assistant. So far, this is what I have:

# -*- coding: utf-8 -*-
"""
Created on Fri Oct 13 13:31:34 2023

@author: austin dixon
"""

import time
from datetime import date, datetime

import pyttsx3
import speech_recognition as sr
import wikipedia

def voiceChange(eng): #pick the voice for the engine that is passed in
    voices = eng.getProperty('voices') #get the available voices
    #index 0 is often a male voice and index 1 a female voice, but this depends on the system
    if len(voices) > 1:
        eng.setProperty('voice', voices[1].id)

def get_audio(): #listen to input from microphone
    r = sr.Recognizer()
    said = ""
    with sr.Microphone() as source:
        audio = r.listen(source)
    try:
        said = r.recognize_google(audio) #send the audio to Google's Web Speech API
        print(said)
    except sr.UnknownValueError:
        print("Could not understand the audio")
    except sr.RequestError as e:
        print("Exception: " + str(e))

    return said.lower() #lowercase so the trigger checks are not case-sensitive

wake_word = "hey darla"
break_word = "stop"

engine = pyttsx3.init() #start the voice engine
voiceChange(engine) #apply the voice to the engine that actually speaks

while True:
    text = get_audio() #get microphone input as string
    if break_word in text: #stop running loop
        break

    elif wake_word in text or "darling" in text: #listen for wake word
        text = text.replace(wake_word, '')

#########################################################################################################################
#smalltalk
        if "what is your name" in text or "who are you" in text or "what are you" in text:
            engine.say("My name is Darla. I am a virtual assistant created by the, brilliant, Austin Dixon.")
            engine.runAndWait()
        elif "hello" in text:
            engine.say("Hello, how are you today?")
            engine.runAndWait()
        elif "how are you" in text: #also matches "how are you doing" and "how are you feeling"
            engine.say("I'm doing ok, but I do feel a little artificial... Ha Ha.")
            engine.runAndWait()
        
#########################################################################################################################
#search wikipedia
        elif "search" in text:
            #drop filler words as whole words, so "the" is not cut out of "mathematics"
            filler = {"search", "for", "the", "internet", "wikipedia"}
            query = " ".join(word for word in text.split() if word.lower() not in filler)
            result = ""
            try:
                result = wikipedia.summary(query)
            except Exception:
                pass #covers missing pages and ambiguous titles
            if result == "":
                engine.say("Sorry, I didn't find any results for " + query)
            else:
                engine.say(result)
            engine.runAndWait()

#########################################################################################################################
# timer
        elif "timer" in text:
            words = text.split()
            numbers = [word for word in words if word.isdigit()]
            #a singular unit with no digit ("a minute", "one hour") means 1; otherwise a digit is required
            singular = any(unit in words for unit in ("hour", "minute", "second"))
            amount = int(numbers[0]) if numbers else (1 if singular else 0)
            plural = "" if amount == 1 else "s"
            done = "Your time is up!"
            if "coffee" in text:
                secs, label, done = 300, "coffee timer for 5 minutes", "Your coffee is ready!"
            elif "hour" in text:
                secs, label = amount * 3600, "timer for " + str(amount) + " hour" + plural
            elif "minute" in text:
                secs, label = amount * 60, "timer for " + str(amount) + " minute" + plural
            elif "second" in text:
                secs, label = amount, "timer for " + str(amount) + " second" + plural
            else:
                secs, label = 0, ""
            if secs > 0:
                engine.say("Setting " + label)
                engine.runAndWait()
                time.sleep(secs) #this blocks the loop, so nothing is heard until the timer ends
                engine.say(" .... ".join([done] * 3))
                engine.runAndWait()
            else:
                engine.say("Sorry, I did not catch how long the timer should be.")
                engine.runAndWait()
        
#########################################################################################################################
#time/date
        elif "time" in text:
            time_now = datetime.now()
            current_time = time_now.strftime("%I:%M %p")
            engine.say("The current time is " + current_time)
            engine.runAndWait()
        elif "date" in text:
            today = date.today()
            engine.say("Today's date is " + today.strftime("%A, %B %d, %Y")) #spoken form, e.g. weekday, month, day, year
            engine.runAndWait()
            
#########################################################################################################################
#if input doesn't match any trigger words
        else:
            engine.say("You just asked me " + text + "... sorry, but I do not know anything about that yet.")
            engine.runAndWait()

At the moment, the voice assistant has only a few capabilities. It listens for a wake word to activate, and can then give the current time or date, set a timer, search Wikipedia to answer questions, and make some small talk. I intend to build in many more features, including fetching news headlines, giving weather forecasts, scraping various websites, setting reminders, sending emails, and controlling some of my smart home gadgets.
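
As the feature list grows, the long if/elif chain will get harder to maintain. One possible way to organize it, shown here only as a rough sketch and not as what the script currently does, is a small registry that maps trigger phrases to handler functions. The names `COMMANDS`, `command`, and `respond` are hypothetical and invented for this illustration.

```python
from datetime import datetime
from typing import Callable, List, Tuple

# Hypothetical registry: each entry pairs trigger phrases with a handler
# that takes the recognized text and returns the sentence to speak.
COMMANDS: List[Tuple[Tuple[str, ...], Callable[[str], str]]] = []

def command(*triggers: str):
    """Register the decorated function under the given trigger phrases."""
    def register(handler: Callable[[str], str]) -> Callable[[str], str]:
        COMMANDS.append((triggers, handler))
        return handler
    return register

@command("what is your name", "who are you")
def introduce(text: str) -> str:
    return "My name is Darla."

@command("time")
def tell_time(text: str) -> str:
    return "The current time is " + datetime.now().strftime("%I:%M %p")

def respond(text: str) -> str:
    """Return the reply of the first handler whose trigger appears in text."""
    for triggers, handler in COMMANDS:
        if any(trigger in text for trigger in triggers):
            return handler(text)
    return "You just asked me " + text + "... sorry, but I do not know anything about that yet."
```

Inside the main loop, the whole chain could then shrink to `engine.say(respond(text))` followed by `engine.runAndWait()`. Handlers are checked in registration order, so more specific triggers (such as "timer") would need to be registered before more general ones (such as "time").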

Some of these features are going to require a web server running some custom JavaScript that can gather data from online sources and send it to my voice assistant as JSON. Building the web app that will serve as an API for my voice assistant is therefore one of my next steps.
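
On the Python side, talking to such a web app could look roughly like the sketch below. Since the web app does not exist yet, everything specific here is an assumption made for illustration: the URL, the `summary` and `temperature` fields, and the function names `fetch_json` and `weather_sentence`. Only the standard library is used.

```python
import json
from urllib.request import urlopen

# Hypothetical endpoint; the real web app has not been built yet.
API_URL = "http://localhost:3000/weather"

def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """Download a URL and decode its body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

def weather_sentence(payload: dict) -> str:
    """Turn an assumed payload shape into a sentence the assistant can speak."""
    summary = payload.get("summary", "unknown conditions")
    temperature = payload.get("temperature")
    if temperature is None:
        return "The forecast is " + summary + "."
    return ("The forecast is " + summary + " with a temperature of "
            + str(temperature) + " degrees.")
```

A weather branch in the main loop could then call `engine.say(weather_sentence(fetch_json(API_URL)))`, wrapped in a try/except so that a network failure does not stop the assistant.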

I also want the voice assistant to run on hardware as an IoT device, so that is going to take some tinkering. I have ordered the components I need to build this, and I am waiting for them to arrive. The final step will be deciding what to use to house my project. Maybe 3D printing an animatronic talking head? Or recreating the HAL 9000? I haven’t decided yet.
