I use my Amazon Echo devices daily for various purposes. While I find them incredibly useful, Alexa does have limitations, and I wish it were more customizable. So I decided to try building my own voice assistant that I can fully tailor to my needs. I attempted this many years ago when I was first learning to code, and mostly failed. At the time, the only speech synthesis method available to me was a database of pre-recorded audio files that I could play when appropriate, which was very limiting. Now, however, there is a nifty text-to-speech Python library called “pyttsx3”. Using this library alongside Google’s speech recognition API (accessed through the SpeechRecognition 3.10.0 library) allowed me to put together a very capable voice assistant. So far, this is what I have:
# -*- coding: utf-8 -*-
"""
Created on Fri Oct 13 13:31:34 2023
@author: austin dixon
"""
import time
from datetime import date, datetime

import pyttsx3
import speech_recognition as sr
import wikipedia

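def voiceChange():
    eng = pyttsx3.init() #initialize an instance
    voice = eng.getProperty('voices') #get the available voices (which voice sits at each index depends on the system)
    # eng.setProperty('voice', voice[0].id) #set the voice to index 0 for male voice
    eng.setProperty('voice', voice[1].id) #changing voice to index 1 for female voice
    return eng #hand the configured engine back so the caller can hold on to it

if __name__ == "__main__":
    configured_engine = voiceChange() #keep a reference so a later pyttsx3.init() should reuse this configured engine
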
def get_audio(): #listen to input from microphone
    r = sr.Recognizer()
    with sr.Microphone() as source:
        audio = r.listen(source)
    said = ""
    try:
        said = r.recognize_google(audio) #send the audio to Google's speech recognition service
        print(said)
    except sr.UnknownValueError:
        print("Exception: could not understand audio") #this error carries no message of its own
    except Exception as e:
        print("Exception: " + str(e))
    return said

wake_word = "hey Darla"
break_word = "stop"
engine = pyttsx3.init() #start the voice engine
while True:
    text = get_audio() #get microphone input as string
    if break_word in text: #stop running loop
        break
    elif wake_word in text or "darling" in text: #listen for wake word
        text = text.replace(wake_word, '')
#########################################################################################################################
#smalltalk
        if "what is your name" in text or "who are you" in text or "what are you" in text:
            engine.say("My name is Darla. I am a virtual assistant created by the, brilliant, Austin Dixon.")
            engine.runAndWait()
        elif "hello" in text:
            engine.say("Hello, how are you today?")
            engine.runAndWait()
        elif "how are you" in text: #also matches "how are you doing" and "how are you feeling"
            engine.say("I'm doing ok, but I do feel a little artificial... Ha Ha.")
            engine.runAndWait()
#########################################################################################################################
#search wikipedia
        elif "search" in text:
            #remove filler as whole words, so a term like "theory" is not mangled by stripping "the"
            filler = {'search', 'for', 'the', 'internet', 'wikipedia'}
            text = ' '.join(word for word in text.split() if word.lower() not in filler)
            result = ""
            try:
                result = wikipedia.summary(text)
            except Exception:
                pass #no matching page, an ambiguous title, or a network error
            if result == "":
                engine.say("Sorry, I didn't find any results for " + text)
                engine.runAndWait()
            else:
                engine.say(result)
                engine.runAndWait()
#########################################################################################################################
# timer
        elif "timer" in text:
            #remove filler as whole words; replace('a', '') would strip the letter "a" out of every word
            filler = {'set', 'a', 'an', 'timer', 'for'}
            text = ' '.join(word for word in text.split() if word.lower() not in filler)
            #note: int() below assumes the recognizer returns digits ("5"), not words ("five")
            if "coffee" in text:
                engine.say("Setting coffee timer for 5 minutes.")
                engine.runAndWait()
                time.sleep(300) #set timer for five mins for coffee
                engine.say("Your coffee is ready! .... Your coffee is ready! .... Your coffee is ready!")
                engine.runAndWait()
            elif "hour" in text:
                if "hours" in text:
                    hours = int(text.replace('hours', '')) #convert str to int to get hours
                    engine.say("Setting timer for " + str(hours) + " hours")
                else:
                    hours = 1 #"an hour" or "one hour"; int("one") would raise ValueError
                    engine.say("Setting timer for one hour")
                engine.runAndWait()
                time.sleep(hours * 3600) #convert hours to secs
                engine.say("Your time is up! .... Your time is up! .... Your time is up!")
                engine.runAndWait()
            elif "minute" in text:
                if "minutes" in text:
                    mins = int(text.replace('minutes', '')) #convert str to int to get mins
                    engine.say("Setting timer for " + str(mins) + " minutes")
                else:
                    mins = 1 #"a minute" or "one minute"
                    engine.say("Setting timer for one minute")
                engine.runAndWait()
                time.sleep(mins * 60) #convert mins to secs
                engine.say("Your time is up! .... Your time is up! .... Your time is up!")
                engine.runAndWait()
            elif "seconds" in text:
                secs = int(text.replace('seconds', '')) #get the number from text and convert str to int
                engine.say("Setting timer for " + str(secs) + " seconds")
                engine.runAndWait()
                time.sleep(secs)
                engine.say("Your time is up! .... Your time is up! .... Your time is up!")
                engine.runAndWait()
#########################################################################################################################
#time/date
        elif "time" in text:
            time_now = datetime.now()
            current_time = time_now.strftime("%I:%M %p")
            engine.say("The current time is " + current_time)
            engine.runAndWait()
        elif "date" in text:
            today = date.today()
            engine.say("Today's date is " + str(today))
            engine.runAndWait()
#########################################################################################################################
#if input doesn't match any trigger words
        else:
            engine.say("You just asked me " + text + "... sorry, but I do not know anything about that yet.")
            engine.runAndWait()

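One thing worth noting about the timer branch in the script: `time.sleep` blocks the main loop, so the assistant cannot hear anything until the timer ends. A possible alternative, sketched here under my own assumptions, is to parse the spoken duration into seconds and hand it to `threading.Timer`, which runs in the background. The names `parse_duration`, `start_timer`, and `UNIT_SECONDS` are hypothetical helpers, not part of the script above, and I have not checked how pyttsx3 behaves when `say` is called from a second thread.

```python
import re
import threading

# Seconds per unit (illustrative table; the regex below also accepts plurals).
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def parse_duration(phrase):
    """Return the seconds named in a phrase like '5 minutes', or None."""
    match = re.search(r"\b(\d+|an?|one)\s+(second|minute|hour)s?\b", phrase.lower())
    if match is None:
        return None
    amount, unit = match.groups()
    count = int(amount) if amount.isdigit() else 1  # "a", "an", "one" all mean 1
    return count * UNIT_SECONDS[unit]


def start_timer(phrase, on_done):
    """Start a background timer for the phrase; return it, or None if unparsed."""
    seconds = parse_duration(phrase)
    if seconds is None:
        return None
    timer = threading.Timer(seconds, on_done)  # calls on_done() after the delay
    timer.daemon = True  # do not keep the program alive just for a timer
    timer.start()
    return timer
```

With something like this, the timer branch could call `start_timer(text, callback)` and return to listening right away, where `callback` is whatever function announces that the time is up.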
At the moment, the voice assistant has only a few capabilities. It listens for a wake word, and once activated it can tell you the current time or date, set a timer, search Wikipedia to answer questions, and make some small talk. I intend to build in many more features, including fetching news headlines, giving weather forecasts, web scraping various sites, setting reminders, sending emails, and controlling some of my smart home gadgets.
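As the feature list grows, one long if/elif chain may get hard to maintain. One option, shown as a rough sketch and not as how the script currently works, is a table that maps trigger keywords to small handler functions that return the sentence to speak. `HANDLERS`, `respond`, `greet`, and `tell_time` are names I made up for illustration.

```python
from datetime import datetime


def greet(text):
    """Small-talk handler: returns a fixed greeting."""
    return "Hello, how are you today?"


def tell_time(text):
    """Time handler: formats the current time the same way the script does."""
    return "The current time is " + datetime.now().strftime("%I:%M %p")


# Ordered (keywords, handler) pairs; the first entry with a matching keyword wins,
# so more specific triggers should be listed before more general ones.
HANDLERS = [
    (("hello",), greet),
    (("time",), tell_time),
]


def respond(text):
    """Return the reply for a recognized phrase, or a fallback sentence."""
    lowered = text.lower()
    for keywords, handler in HANDLERS:
        if any(keyword in lowered for keyword in keywords):
            return handler(text)
    return "You just asked me " + text + "... sorry, but I do not know anything about that yet."
```

The main loop would then only need `engine.say(respond(text))` followed by `engine.runAndWait()`, and a new feature becomes one function plus one line in the table.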
Some of these features will require a web server running some custom JavaScript that gathers data from online sources and sends it to my voice assistant as JSON. So building the web app that will serve as an API for my voice assistant is one of my next steps.
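On the assistant's side, consuming such an API could look roughly like the sketch below. Since the web app does not exist yet, everything specific here is an assumption: the URL, the `headlines` key, and the `title` field are placeholders for whatever the real API ends up returning. Only the standard library is used.

```python
import json
from urllib.request import urlopen

# Hypothetical endpoint; the real web app and its routes are not built yet.
API_URL = "http://localhost:3000/headlines"


def fetch_payload(url=API_URL, timeout=5):
    """Download the raw JSON text from the (assumed) web app."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def headlines_to_speech(payload, limit=3):
    """Turn a JSON payload into one sentence the speech engine can read aloud.

    Assumes a shape like {"headlines": [{"title": "..."}, ...]}.
    """
    data = json.loads(payload)
    titles = [item["title"] for item in data.get("headlines", [])][:limit]
    if not titles:
        return "Sorry, I could not find any headlines."
    return "Here are the top headlines. " + " ... ".join(titles)
```

A news branch in the main loop could then pass `headlines_to_speech(fetch_payload())` to `engine.say`, ideally wrapped in a try/except so a server outage does not crash the assistant.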
I also want the voice assistant to run on its own hardware as an IoT device, so that is going to take some tinkering. I have ordered the components I need to build this, and I am waiting for them to arrive. The final step will be deciding what to house my project in. Maybe a 3D-printed animatronic talking head? Or a recreation of the HAL 9000? I haven’t decided yet.