So I mentioned HAPP in my last post, but didn’t get into too much detail about what I’m working on. As I said last time, I have always wanted to do something with home automation, and I needed a project for Hack Huntsville.
So my first idea was to voice control Pandora. When I’m across the room and want to change a song, I’m too lazy to get up and do it. (Pathetic, I know.) So I started a project in Python.
At first, I wanted to figure out SpeechRecognition. I found a nice library, which of course needed an audio library as well (they suggested pyaudio, which I also downloaded). I spent two hours trying to get my microphone to work (and went through a few different speech libraries in the meantime). It must have been comical at the Hack-a-thon, because there I was, shouting at my computer in a crowded room or walking around trying to find a quiet place. The looks I got were pretty good.
However, I started digging into the API (always look further into the API when you are stuck, a lesson I learn again and again). I found that I could tweak some microphone settings, and it turned out that my settings were not ideal for a crowded room. After playing with energy thresholds, I was finally able to see the text “HELLO” appear on my screen (I had it printing out any words it could decipher).
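To make the energy threshold idea concrete, here is a rough sketch. The first two helpers show the general concept (measure how loud a chunk of audio is and ignore anything below a cutoff); they are an illustration, not the library's actual code. The `listen_once` function assumes the library in question is the SpeechRecognition package with PyAudio installed, and the threshold of 4000 is a made-up value for a noisy room, not the setting I actually used.

```python
import math
import struct


def rms_energy(frame):
    """RMS amplitude of a chunk of 16-bit little-endian mono PCM audio."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, frame[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def is_loud_enough(frame, energy_threshold):
    """True when the chunk is louder than the threshold (worth treating as speech)."""
    return rms_energy(frame) > energy_threshold


def listen_once(energy_threshold=4000):
    """Capture one phrase from the microphone and return the recognized text."""
    # Imported here so the helpers above work without the third-party packages.
    import speech_recognition as sr  # assumed library; Microphone needs PyAudio

    recognizer = sr.Recognizer()
    recognizer.energy_threshold = energy_threshold  # hypothetical value for a loud room
    recognizer.dynamic_energy_threshold = False     # keep the threshold fixed
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio)
```

Raising the threshold means background chatter gets ignored, at the cost of having to speak up (hence the shouting).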
After a quick burst of elation, I knew that I could figure out quite a bit from there. I was trying to decide what to call my project at this point. Iron Man has JARVIS, Eureka had S.A.R.A.H., and I wanted some cool name to go with mine. I started off by keying off the words “Hey, Listen”, but did not want to go with Navi. I wanted a name, just like KITT or HAL from media past. I’m a sucker for acronyms, and eventually I thought of HAPP, or Home Automation Personal Project. As a bonus, the Python script is happ.py, which is how I was feeling while I was working on the Hack-a-thon project.
So I was able to enter my first command and needed to know what to do next. I had quite a bit of familiarity with Selenium, so I decided to open up a Pandora window in Selenium with a default station (Dave Matthews Band, of course). You can see that work here. From there, coding up an exit and a “next song” feature wasn’t too bad. Within about three hours total, I went from nothing to being able to start/stop/skip songs on Pandora.
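The command handling boils down to matching recognized text against a small table of actions. Here is a minimal sketch of that idea. The `Player` class is a hypothetical stand-in for the Selenium-driven Pandora window (the real one would click buttons in the browser), and treating “hey listen” as a required wake phrase in front of every command is an assumption for illustration.

```python
WAKE_PHRASE = "hey listen"


def parse_command(heard):
    """Return the text after the wake phrase, or None if the phrase is missing."""
    text = " ".join(heard.lower().replace(",", " ").split())
    if text == WAKE_PHRASE:
        return ""
    prefix = WAKE_PHRASE + " "
    if text.startswith(prefix):
        return text[len(prefix):]
    return None


class Player:
    """Hypothetical stand-in for the browser window that plays music."""

    def __init__(self):
        self.log = []        # records which actions were triggered
        self.running = True

    def next_song(self):
        self.log.append("next")

    def stop(self):
        self.log.append("stop")

    def exit(self):
        self.running = False


def build_commands(player):
    """Map spoken phrases to the actions they trigger."""
    return {
        "next song": player.next_song,
        "stop": player.stop,
        "exit": player.exit,
    }


def handle(heard, commands):
    """Run the matching action; return True if a command was recognized."""
    action = commands.get(parse_command(heard))
    if action is None:
        return False
    action()
    return True
```

Adding a new voice command is then just one more entry in the dictionary.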
I decided to take things a little bit further and added a new voice command, “tornado warning”, to pull up the weather. Then, as a spur-of-the-moment idea, I wanted the computer to talk back. At a recent PyHSV talk, I showed what GhostDriver can do with Python. It pretty much opens up a headless web browser and lets me control it through Python. Knowing this tool, I settled on having the computer tell me the temperature when I ask for it. After a quick download of Python text-to-speech (pyttsx), I was in business.
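For the talking-back part, the text-to-speech side is only a few lines. This is a sketch rather than the code from happ.py: it uses pyttsx3, the Python 3 fork of pyttsx, and the wording of the sentence and the `temperature_sentence` helper are made up for illustration. Fetching the temperature itself (the headless browser part) is left out.

```python
def temperature_sentence(degrees):
    """Build the sentence to speak from a temperature reading."""
    return "The temperature is %d degrees." % round(degrees)


def speak(text):
    """Say the text out loud through the default speech engine."""
    # Imported here so the helper above works without the third-party package.
    import pyttsx3  # assumption: Python 3 fork of the pyttsx library

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()  # blocks until the sentence has been spoken
```

Calling `speak(temperature_sentence(71.6))` would then read the sentence aloud.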
It ended up being a piece of cake, and I was ready to call it a night. I had already worked a full day and had no intention of staying through the night.
All in all, I got a handful of voice commands to work, had the computer speaking back to me, and was on the road to automating the things that I want for my house. It’s a long way to light switches and thermostats, but if I can control my computer with voice commands, I think that’s a pretty good start.
My next step is to clean up the code a bit (make it a little less hacky). This involves config files, Python 3 conversions, and making it work on Linux.
Maybe by next post, I can share some of the technical fun that I got to explore.