Speech assistants are set to become one of the most important ways to access services. Even children and the elderly can use them without further instructions, and they are hands-free. These advantages are reflected in their growing adoption: according to voicebot.ai, one third of American households already have a smart speaker.
Amazon’s Alexa leads the market, followed by Google Assistant. Baidu, Alibaba, Xiaomi and Apple Siri are also important players. Strategy Analytics publishes regular reports on market share data. Usage differs considerably by market; for example, Baidu, Alibaba and Xiaomi are stronger in Asian markets. But overall, Amazon Alexa together with its Echo smart speaker ecosystem is the perfect place to start if you want to reach as many people as possible, globally.
Developing for Amazon Alexa
When you decide to create a “Skill” for Amazon Alexa, you have two basic options:
- Alexa Skills Kit: Amazon’s own toolkit. You design the interaction model in the Alexa developer console and write the skill’s backend code yourself.
- 3rd Party Tools: for example, Voiceflow or the Microsoft Bot Framework. You still need to create the Alexa skill in Amazon’s frontend (so that it is discoverable by Alexa-powered devices), but the skill design & development then mostly happens in these tools. Often, their editors are easier to use, and some even offer cross-platform support.
Also keep in mind that developing an Alexa skill with the Alexa Skills Kit is free (unless it grows huge in terms of traffic), while most 3rd party options add some costs to running the skill. After all, these service providers need to make a living, too!
Getting Started with Voiceflow
In our Alexa for Wellbeing Online Challenge, many attendees had never written a single line of code. Therefore, we decided to focus on Voiceflow during the main part of the hackathon. With this approach, all teams managed to create & pitch a prototype that fully works on an Amazon Echo speaker.
In Voiceflow, creating a skill is as simple as writing what the smart speaker should say, providing possible user answers and finally connecting these blocks by drawing lines. The conversation will then flow along these connected paths. This approach is especially well-suited for linear conversations, where the smart speaker asks questions, which the user then answers.
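To make the idea of connected blocks more concrete, here is a minimal Python sketch of such a flow as a small graph. It is not how Voiceflow is implemented internally; the `Block` class, the `FLOW` dictionary and the `run` function are hypothetical names chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Block:
    """One step of the conversation (hypothetical structure, for illustration)."""
    speak: str                                              # what the speaker says
    choices: Dict[str, str] = field(default_factory=dict)   # user answer -> next block id


# A tiny invented flow: one question with two possible paths.
FLOW = {
    "start": Block(
        "Would you like a breathing exercise or a stretch?",
        {"breathing": "breathing", "stretch": "stretch"},
    ),
    "breathing": Block("Breathe in slowly for four seconds."),
    "stretch": Block("Raise both arms above your head."),
}


def run(flow: Dict[str, Block], answers: List[str], start: str = "start") -> List[str]:
    """Follow the connections using scripted answers; return everything spoken."""
    spoken = []
    pending = iter(answers)
    current: Optional[str] = start
    while current is not None:
        block = flow[current]
        spoken.append(block.speak)
        if not block.choices:
            break  # a block without outgoing lines ends the conversation
        answer = next(pending, None)
        current = block.choices.get(answer)  # unknown answer: stop (no error handling yet)
    return spoken
```

Calling `run(FLOW, ["stretch"])` returns the question followed by the stretch instruction. Each entry in `choices` corresponds to one line you would draw between two blocks in the editor.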
Simple data storage across sessions is built in. Additional blocks allow extending the Voiceflow-powered skills with more features, like accessing Google Docs or showing text & images on smart speakers with screens.
Video Tutorials & Language
At the build.well.being 2020 event, I hosted live tutorials for all event participants. The University later published the recordings online.
Due to the target group, the videos are in German. Therefore, they are especially helpful to German-speaking people, as few resources exist outside of English.
However, you can simply activate the auto-generated captions on YouTube and have them automatically translated into your native language. It’s a great new world where language is becoming less and less of a barrier!
1. Voiceflow Setup & Your First Skill
In the first video, I demonstrate how you can build your first Alexa skill with Voiceflow. You will see how to connect blocks of Alexa speech output with prompts for users, so that they can decide which path to follow.
At the end, I’m also uploading the skill to the Alexa Skills Console, so that it’s possible to test with the official Alexa simulator or a real device. All of this is possible within 40 minutes, including explanations!
Of course, with Voiceflow you don’t get access to the backend source code. The wonderful thing is that you don’t need it: Voiceflow skills just work out of the box.
2. Decisions and Errors
After the first quick start, this video takes a closer look at how users can make decisions through voice. How can a skill extract information (“Slots”) from sentences (“Intents” / “Utterances”)?
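The relationship between intents, utterances and slots can be sketched in a few lines of Python. This is a deliberately simplified illustration: Alexa relies on trained language understanding rather than the plain pattern matching used here, and the intent names, sample utterances and helper functions are all invented for this example.

```python
import re

# Hypothetical interaction model: each intent lists sample utterances,
# and {curly braces} mark the slots to extract.
INTENTS = {
    "StartWorkoutIntent": [
        "start a {duration} minute workout",
        "i want to train for {duration} minutes",
    ],
    "SetNameIntent": [
        "my name is {name}",
        "call me {name}",
    ],
}


def compile_utterance(utterance):
    """Turn a sample utterance into a regex with one named group per slot."""
    # re.split with a capture group yields: literal, slot, literal, slot, ...
    parts = re.split(r"\{(\w+)\}", utterance)
    pattern = "".join(
        re.escape(part) if index % 2 == 0 else f"(?P<{part}>.+?)"
        for index, part in enumerate(parts)
    )
    return re.compile(f"^{pattern}$", re.IGNORECASE)


def understand(text):
    """Return (intent name, slot values) for the first matching utterance."""
    for intent, utterances in INTENTS.items():
        for utterance in utterances:
            match = compile_utterance(utterance).match(text.strip())
            if match:
                return intent, match.groupdict()
    return None, {}  # nothing matched: the skill needs a fallback
```

For example, `understand("Start a ten minute workout")` yields the intent `StartWorkoutIntent` with the slot `duration` set to `"ten"`. A sentence that matches no utterance returns `None`, which is exactly the case the error handling has to cover.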
Additionally, it’s common that something goes wrong. With traditional PC or mobile interfaces, the user can only click on the buttons you offer. But in a voice user interface, the user can say anything, at any time. Even if you don’t expect it. An additional error source is that the speaker might not have understood what the user said.
Therefore, every skill should include thorough error handling. Designing that is one of the hardest parts of skill development. You should reserve a lot of testing time to find possible paths where your skill gets stuck in a dead end!
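One common pattern is to re-ask a question a limited number of times and then end the conversation gracefully instead of looping forever. The following Python sketch shows that logic under simple assumptions: the function `ask`, its parameters and the wording of the prompts are hypothetical, and `heard` stands in for what the speaker understood on each attempt (`None` meaning nothing was understood).

```python
def ask(question, valid_answers, heard, max_attempts=3):
    """Ask a question, re-prompt on errors, and give up after max_attempts.

    Returns (accepted answer or None, list of everything the speaker said).
    """
    spoken = [question]
    for attempt, text in enumerate(heard[:max_attempts], start=1):
        if text is not None and text.lower() in valid_answers:
            return text.lower(), spoken
        if attempt < max_attempts:
            if text is None:
                # Error source 1: the speaker did not understand anything.
                spoken.append("Sorry, I didn't catch that. " + question)
            else:
                # Error source 2: the user said something unexpected.
                options = " or ".join(valid_answers)
                spoken.append(f"You can say {options}. " + question)
    # No dead end: always finish with a clear exit.
    spoken.append("Let's try again later. Goodbye!")
    return None, spoken
```

With `heard = [None, "maybe", "Yes"]`, the skill re-prompts twice and then accepts `"yes"`. With three failed attempts it says goodbye rather than getting stuck.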
3. Personalize Skills & Remember in Voiceflow
A smart speaker should be a personal experience. Your skill should be able to remember what the user said and use this in future conversations. Imagine speaking to someone who immediately forgets everything you said. You wouldn’t like that person, right?
Therefore, in this video I explain how you can use variables to personalize the skill. For example, you can ask users for their name and greet them by name in every future session!
In addition to the user-provided name, your skill can remember any kind of information. For example, how far you got in a continuous physical workout program that your skill accompanies.
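Voiceflow takes care of storing such variables for you. To show what remembering across sessions means conceptually, here is a small Python sketch that keeps per-user variables in a JSON file. The `UserStore` class, the `greet` function and the variable names are assumptions made for this example, not part of Voiceflow or Alexa.

```python
import json
from pathlib import Path


class UserStore:
    """Hypothetical per-user variable storage backed by a single JSON file."""

    def __init__(self, path):
        self.path = Path(path)

    def _read_all(self):
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def load(self, user_id):
        """Return the stored variables of one user (empty on first contact)."""
        return self._read_all().get(user_id, {})

    def save(self, user_id, variables):
        """Persist the variables so that the next session can read them."""
        data = self._read_all()
        data[user_id] = variables
        self.path.write_text(json.dumps(data), encoding="utf-8")


def greet(store, user_id, name_if_asked=None):
    """Pick the greeting depending on what the skill already knows."""
    variables = store.load(user_id)
    if "name" not in variables:
        if name_if_asked is None:
            return "Hello! What is your name?"
        variables["name"] = name_if_asked
        variables["workouts_done"] = 0
        store.save(user_id, variables)
        return f"Nice to meet you, {name_if_asked}!"
    return (
        f"Welcome back, {variables['name']}! "
        f"You have finished {variables['workouts_done']} workouts."
    )
```

The first session asks for the name, the second one stores it, and every later session greets the user personally and can report progress, such as the number of completed workouts.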
Also, at some point you might want to deviate from linear conversations with branches, and let the user freely choose what to do next. This is also covered by this part of the skill design course.
4. Extend Your Skill: Reminders & Web Services
Now that you’ve mastered the basics, let’s dig into some details that many skills need:
- Query and integrate data from web services: extract information from JSON data provided by a REST service. This could be useful if you’d like to integrate, for example, the current weather, or if your skill should fetch the latest information from a service provided by your company.
- Visuals: many Amazon Echo devices have a screen. When your skill is running, you can utilize it to show additional information. This can help to improve the user experience.
- Set reminders: want the user to perform some task at a specific point in time, like to start their next workout? A skill can set a reminder – but you need the user’s permission to do so first. In this video, I show the complete workflow of what you need to do to enable this functionality.
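As an illustration of the first item in this list, the following Python sketch pulls individual values out of a JSON response and turns them into a spoken sentence. The payload in `SAMPLE_RESPONSE` is invented and does not belong to any real weather service, and the dotted path notation in `extract` is an assumption made for this example. In a real skill, the JSON text would come from an HTTP request to your REST service.

```python
import json

# Invented example payload, standing in for the body of a REST response.
SAMPLE_RESPONSE = """
{
  "location": "Vienna",
  "current": {
    "temperature_c": 21,
    "conditions": [{"text": "sunny"}]
  }
}
"""


def extract(data, path):
    """Follow a dotted path such as 'current.conditions.0.text' through JSON data."""
    current = data
    for key in path.split("."):
        if isinstance(current, list):
            current = current[int(key)]  # numeric path parts index into lists
        else:
            current = current[key]
    return current


def weather_sentence(raw_json):
    """Build the sentence the smart speaker would say from the JSON text."""
    data = json.loads(raw_json)
    location = extract(data, "location")
    temperature = extract(data, "current.temperature_c")
    conditions = extract(data, "current.conditions.0.text")
    return f"In {location} it is {temperature} degrees and {conditions}."
```

For the sample payload, `weather_sentence(SAMPLE_RESPONSE)` produces "In Vienna it is 21 degrees and sunny." A missing key raises an error in this sketch, so a production skill should catch that and answer with a friendly fallback.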
With these additional features, you can create almost any skill. So, the last part of the video focuses on the limitations of Voiceflow: how far can you get with the tool, and how does it differ from Amazon’s Alexa Skills Kit?