Voice Controlled Smart Home Automation

The objective of this project is to look at technologies and services that can be used to simplify interaction with our smart home using speech recognition. This is a logical extension of our natural language project, which is text based.

Voice recognition has improved in recent years and has become much more mainstream with services like Siri from Apple, Voice Search from Google and S Voice from Samsung.

Voice recognition is not just limited to Smartphones and tablets though. Samsung has recently added voice control to their Smart TVs.

Speech Recognition

Speech recognition is difficult to do well. When I started work in 1989, it was in the area of speech recognition systems and voice interactive services. Whilst things have moved on, they haven't really moved on that far. Free-flow (connected) speech recognition for any speaker is the holy grail, but it is still a long way off. Most speech recognition systems are much more accurate when they are trained to an individual speaker.

Keyword Spotting

Keyword spotting is one way to improve accuracy. If you can spot a control keyword, which means that what follows is a command or request, it is much easier to translate the small segment of speech that follows. Services like Siri do something similar but you press a button to start the recognition process.
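As a rough illustration of the idea, here is a minimal Python sketch. It works on text that has already been transcribed, whereas real keyword spotting operates on the audio itself, so treat it only as a model of the logic. The function name `extract_command` and the keyword "computer" are assumptions made for this example and are not part of our system.

```python
import re
from typing import Optional


def extract_command(transcript: str, keyword: str = "computer") -> Optional[str]:
    """Return the words that follow the control keyword, or None.

    The keyword "computer" is a hypothetical choice for illustration.
    """
    # Lower-case the transcript and split it into words, dropping punctuation.
    words = re.findall(r"[a-z0-9']+", transcript.lower())
    key = keyword.lower()
    if key not in words:
        # No control keyword heard, so this is not a command for us.
        return None
    # Everything after the first occurrence of the keyword is the command.
    following = words[words.index(key) + 1:]
    return " ".join(following) if following else None


if __name__ == "__main__":
    print(extract_command("OK computer, turn on the kitchen lights"))
```

Only the short segment after the keyword needs to be interpreted, which is what makes the approach more robust than trying to understand everything that is said in the room.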

Implementation Options

Android

Android natively provides speech to text capability and this could be wrapped into a custom home automation controller app. This is a good example. The speech recognition is good, and certainly good enough for home automation applications.

AT&T Speech To Text API

The AT&T Speech To Text API uses the Watson speech engine.

Voice Shortcuts Launcher Android App

Having come across this article, we started looking at the Voice Shortcuts Launcher app, which is free on the Google Play store.

This app can also be used as a widget on your desktop. Our initial experiments show that this is an excellent way to provide voice control of our home automation system as this initial YouTube video shows:

The above video shows custom voice commands set up to mirror the commands supported by our natural language text interface. Not only does this work well architecturally, but it also enables us to re-use the interface and intelligence already developed. The 'response engine' is now aware of speech recognition interfaces, so that it can generate appropriate voice responses.
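The re-use described above can be pictured with a small Python sketch. It is not our actual response engine: the `Response` class, the `handle_command` function, the interface names "text" and "voice" and the example commands are all assumptions made for illustration. The point is simply that one command handler serves both interfaces and only adds a spoken reply when the request arrived by voice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    text: str                     # reply shown in the text interface
    speech: Optional[str] = None  # phrase to speak, voice interfaces only


def handle_command(command: str, interface: str = "text") -> Response:
    """Handle a command from either interface (hypothetical sketch)."""
    # Normalise case and whitespace so typed and spoken commands match.
    normalised = " ".join(command.lower().split())
    # Hypothetical command table standing in for the real intelligence.
    replies = {
        "turn on the kitchen lights": "Kitchen lights are now on.",
        "turn off the kitchen lights": "Kitchen lights are now off.",
    }
    text = replies.get(normalised, "Sorry, I did not understand that.")
    # Only voice interfaces get a spoken response.
    speech = text if interface == "voice" else None
    return Response(text=text, speech=speech)
```

A voice shortcut then only has to submit the same command string that a user would have typed, along with a marker saying it came from a speech interface.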

Many of the voice responses are pre-generated audio files, but we also have the ability to generate speech responses dynamically. There are various ways to play out voice files from PHP but, on our Windows 7 machine, we are using ffplay. We have set up our Home Control System (HCS) to play audio through speakers integrated into our current home.
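For readers who want to try something similar, the sketch below shows one way of calling ffplay from a script. Our own system calls it from PHP; this example uses Python purely for illustration, and the function names and the file name are assumptions. The `-nodisp` option stops ffplay opening a display window and `-autoexit` makes it exit once the file has finished playing.

```python
import subprocess
from typing import List


def build_ffplay_command(audio_file: str) -> List[str]:
    """Build the ffplay command line for playing one audio file."""
    return [
        "ffplay",
        "-nodisp",             # no video/visualisation window
        "-autoexit",           # quit when playback finishes
        "-loglevel", "quiet",  # suppress console output
        audio_file,
    ]


def play_audio(audio_file: str) -> int:
    """Play the file and return ffplay's exit code (ffplay must be on the PATH)."""
    return subprocess.run(build_ffplay_command(audio_file), check=False).returncode


if __name__ == "__main__":
    # "lights_on.wav" is a hypothetical pre-generated response file.
    print(build_ffplay_command("lights_on.wav"))
```

Passing the arguments as a list rather than a single string avoids quoting problems with file names that contain spaces.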

The above video shows a later version of the web interface, with speech confirmation of commands being provided by our Home Control System (HCS).

Fibaro Home Center

The Fibaro Home Center now supports some voice control capability, as featured in their latest video, but this is specific to the Fibaro hub and ecosystem.

Fibaro - Your home, Your Imagination

Guile 3D

Guile 3D produce software to implement a virtual assistant with both speech recognition and text to speech synthesis. There are also a range of avatars to choose from.

Ivee

Ivee is a physical device that connects via Wi-Fi and claims to enable voice control of your smart home. Its functionality seems quite limited though, and many of the useful features and capabilities are marked as 'coming soon'. We have ruled it out because of its limited features and limited integration capability. All of the useful functionality (and more) can be better achieved using a Smartphone. There is a lot more information on Ivee on the Kickstarter project page. Ivee uses AT&T's Speech API (Watson).

Nuance Nina

Nuance's Nina (Nuance Interactive Natural Assistant) is a virtual assistant with speech recognition and text to speech capability.

Siri Proxy

We have had a look at using the Siri Proxy, but it is not really suitable for our needs. It requires a separate home network and the installation of a DNS server. It also only works with a limited set of Apple devices.

Siri proxy on the Raspberry Pi:

Siri Proxy

Windows 7

The Windows 7 operating system has speech recognition built in as standard, as described here, but we can't see this being easily applied to our usage scenarios (though our Home Control System (HCS) is running Windows 7 at the moment).

Google Web Speech API

Google have a very interesting Web Speech API (specification) which looks very suitable.

We tested it in the latest PC Chrome browser and it works really well.

Strangely, the Chrome browser on the Nexus 7 doesn't support this API yet, despite being at version 29.x.x.

Summary

The voice recognition part of this project is ongoing and we will be updating this project page as things progress. The main focus of this project is now as part of our wider Smart Home Assistant, though.
