Speech & Gestures with Amazon Sumerian (Part 2)

Configuring speech for the Amazon Sumerian Host

In the first part of this article series, we set up an Augmented Reality app with a host (= avatar). Now, we’ll dive deeper and integrate host interactions. To make the character more lifelike, it will look at the user. We’ll also assign speech files and ensure that the character’s gestures match the spoken content.

But before we set out on these tasks, let’s take a minute to look at some vital concepts of Amazon Sumerian.

Behaviors, State Machines & Events

Unless you want your app to just show a static scene, you’ll need to integrate actions. An action can be triggered by interactive user input. Alternatively, you can define what happens sequentially – e.g., first a new object appears in the scene, then the host avatar explains it.

Technically, this is solved with a state machine. Each entity can have several states, and a behavior is a collection of these states. States transition from one to another based on actions & their events (= interactions or timing).

Sumerian State Machines – Behaviors contain states, which have actions that can trigger events, which lead to transitions to other states.

Each state has a name: e.g., “Waiting”, “Moving”, “Talking”. In addition, each state typically has one or more actions: e.g., waiting for five seconds, animating the movement of the entity or playing a sound file. Sumerian comes with pre-defined actions. Additionally, you can provide your own JavaScript code for custom or more complex tasks.

These actions can trigger events – for example: the five-second wait is over, the movement is completed, or the sound file finished playing. Each event can then trigger a transition to a different state.

By connecting several states through transitions, you can make entities interact with the user or perform other tasks that keep your scene dynamic.
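To make the concept more concrete, here is a minimal plain-JavaScript sketch of states, actions, events and transitions. It is only an illustration of the idea – the state names and the structure are assumptions for this example, not Sumerian’s actual scripting API, which wires these pieces up visually in the behavior editor.

```javascript
// Minimal state machine sketch (illustrative only, not the Sumerian API).
// Each state runs an action; when the action finishes, it emits an event
// that the transition table maps to the next state.
const states = {
  Waiting: {
    action: (emit) => setTimeout(() => emit("waitDone"), 5000), // wait five seconds
    transitions: { waitDone: "Talking" },
  },
  Talking: {
    action: (emit) => playSpeechFile("intro.mp3", () => emit("speechDone")),
    transitions: { speechDone: "Idle" },
  },
  Idle: {
    action: () => console.log("Nothing left to do."),
    transitions: {},
  },
};

function enterState(name) {
  const state = states[name];
  console.log(`Entering state: ${name}`);
  // The action reports back through an event; the transition table maps
  // that event to the next state.
  state.action((event) => {
    const next = state.transitions[event];
    if (next) enterState(next);
  });
}

// Hypothetical helper standing in for Sumerian's built-in "play sound" action.
function playSpeechFile(file, onDone) {
  console.log(`Playing ${file} ...`);
  setTimeout(onDone, 1000); // pretend the file takes one second to play
}

enterState("Waiting");
```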

Continue reading “Speech & Gestures with Amazon Sumerian (Part 2)”

Amazon Sumerian & Augmented Reality, Part 1

Amazon Sumerian Host, placed in the real world with Google ARCore

Many AR / VR use cases involve virtual training or guidance. With Amazon Sumerian, you can quickly create cross-platform apps for these scenarios. The main advantage is the large amount of ready-made content: avatars (called hosts) and virtual environment templates. Through the direct integration of Amazon Web Services (AWS), it’s easy to make the host speak to the user – including lip sync, gestures and even conversations through bots.

Of course, you can create similar solutions with Unity. But Sumerian requires far less prior 3D software knowledge and is therefore ideal for smaller projects as well as prototypes. The interface and general setup are still quite similar to Unity, so switching to Unity later – if needed – is a natural next step after you’ve created your first few apps and services with Amazon Sumerian.

Additionally, Amazon is currently hosting an AR / VR challenge with lots of prizes for the best apps in various categories. So, it’s a great time to explore Sumerian!

What is Amazon Sumerian?

Essentially, Sumerian is a browser-based 3D editing platform. It lets you develop for most AR and VR platforms, including Oculus, Vive and Windows Mixed Reality, as well as the browser, Google ARCore and Apple ARKit.

Behind the scenes, it’s based on WebXR, the evolution of WebVR, which mainly targeted VR headsets. With WebXR, you can access sound and controllers, and also anchor objects to the real environment in Mixed Reality scenarios.

Amazon Sumerian Account Setup

First, you need to set up your Amazon account. Amazon offers an AWS free tier, which gives you access to many services with free usage quotas for the first 12 months. Afterwards, you can still continue using selected services for free. Note that Sumerian is not among them, but 12 months is enough time to test & develop your service.

Continue reading “Amazon Sumerian & Augmented Reality, Part 1”

Node.js and Cloud NoSQL Databases: Azure Cosmos DB

Azure Cosmos DB Quickstart

Learn how to access a cloud-based NoSQL database from Node.js. Azure Cosmos DB stores documents (e.g., JSON) and lets you scale for improved performance and add geo-redundancy with one click. The access interface also supports well-known SQL queries.

This guide uses the latest Azure Cosmos DB JavaScript module (released as a final version just 17 days ago). Additionally, this article is based on the ECMAScript 2017 standard: the async / await syntax makes the code short and readable. In contrast to many other tutorials, this article focuses on the minimum code required to understand the concepts.
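As a rough sketch of what that minimal access pattern can look like with the @azure/cosmos client – the endpoint, key and the database / container names below are placeholders, and the exact call signatures may differ slightly between SDK versions:

```javascript
// Minimal Cosmos DB access sketch with async / await.
// Endpoint, key and the database / container names are placeholders.
const { CosmosClient } = require("@azure/cosmos");

const client = new CosmosClient({
  endpoint: "https://your-account.documents.azure.com",
  key: "your-primary-key",
});

async function main() {
  // Create (or open) a database and a container.
  const { database } = await client.databases.createIfNotExists({ id: "PartsDb" });
  const { container } = await database.containers.createIfNotExists({ id: "Parts" });

  // Insert a JSON document.
  await container.items.create({ id: "1", name: "Gear", inStock: true });

  // Query it back with familiar SQL syntax.
  const { resources } = await container.items
    .query("SELECT * FROM c WHERE c.inStock = true")
    .fetchAll();
  console.log(resources);
}

main().catch(console.error);
```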

The complete source code of this article is available on GitHub.

Continue reading “Node.js and Cloud NoSQL Databases: Azure Cosmos DB”

Asynchronous JavaScript with Promises & Async/Await in JavaScript

From the perspective of a C# developer, the introduction of async and await into the latest JavaScript version (ECMAScript 2017+) is a welcome addition. It makes asynchronous code a lot cleaner and more readable.

However, a lot of legacy libraries and code snippets are out there. It’s usually difficult to go all-in with async/await. This article is a short intro to error handling and the evolution of asynchronous development in JavaScript.

Error Handling in JavaScript

Most asynchronous operations, such as web requests, can fail with an error. Thus, let’s spend a minute reviewing the basics of the error-handling code flow.
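As a quick, hedged illustration of the two styles involved – classic promise chains with .catch() versus async / await with try / catch. The fetchJson helper below is just a stand-in for any promise-returning operation:

```javascript
// Stand-in for any asynchronous operation that may fail.
function fetchJson(url) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      url.startsWith("https://")
        ? resolve({ ok: true, url })
        : reject(new Error(`Invalid URL: ${url}`));
    }, 100);
  });
}

// Classic promise chain: errors are handled in .catch().
fetchJson("https://example.com/data")
  .then((data) => console.log("Promise chain result:", data))
  .catch((err) => console.error("Promise chain error:", err.message));

// ECMAScript 2017 async / await: the same flow with try / catch.
async function loadData() {
  try {
    const data = await fetchJson("ftp://example.com/data");
    console.log("Async/await result:", data);
  } catch (err) {
    console.error("Async/await error:", err.message);
  }
}

loadData();
```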

Continue reading “Asynchronous JavaScript with Promises & Async/Await in JavaScript”

How to Record a Video from a Unity ARCore App on Android

ARCore Recorded Video converted to an Animated GIF

A video is a great way to showcase your Unity app. To capture the full visual fidelity of your app, you need to record at the highest possible quality with a smooth frame rate.

Several screen recording apps are available in the Google Play Store. However, there’s an easy and completely free way that provides the highest possible quality.

This short guide demonstrates how to record the screen with an APK file generated by Unity. Of course, it works for both AR and Non-AR Apps. Continue reading “How to Record a Video from a Unity ARCore App on Android”

Remote ARCore with Unity’s Experimental ARInterface

Overall, the AR ecosystem is still small – yet it’s already fragmented. Google develops ARCore, Apple creates ARKit and Microsoft is working on the Mixed Reality Toolkit. Fortunately, Unity has started unifying these APIs with the ARInterface.

At Unite Austin, two Unity engineers introduced the new experimental ARInterface. In November 2017, they released it to the public via GitHub. It looks like this will be integrated into Unity 2018 – the new features of Unity 2018.1 include “AR Crossplatfom Kit (ARCore/ARKit API)“.

Remote Testing of AR Apps

The traditional mobile AR app development cycle includes compiling and deploying apps to a real device. That takes a long time and is tedious for quick testing iterations.

A big advantage of ARKit so far has been the ARKit Unity Remote feature. The iPhone runs a simple “tracking” app and transmits the captured live data to the PC. Your actual AR app runs directly in the Unity Editor on the PC, based on the data it receives from the device. With this approach, you can run the app by simply pressing the Play button in Unity, without native compilation.

This is similar to the Holographic Emulation for the Microsoft HoloLens, which has been available for Unity for some time.

The great news is that the new Unity ARInterface finally adds a similar feature to Google ARCore: ARRemoteInterface. It’s available cross-platform for ARKit and ARCore.

ARInterface Demo App

In this article, I’ll explain the steps to get AR Remote running on Google ARCore. For reference: “Pirates Just AR” also posted a helpful short video on YouTube. Continue reading “Remote ARCore with Unity’s Experimental ARInterface”

NFC Tags, NDEF and Android (with Kotlin)

Android: Launch App through NFC Tags

In this article, you will learn how to add NFC tag reading to an Android app. The app registers to auto-start when the user taps a specific NDEF NFC tag with the phone. In addition, it reads the NDEF records from the tag.

NFC & NDEF

Apple added support for reading NFC tags with iOS 11 in September 2017. All iPhones starting with the iPhone 7 offer an API to read NFC tags. While Android has included NFC support for many years, this was the final missing piece to bring NFC tag scenarios to the masses. Continue reading “NFC Tags, NDEF and Android (with Kotlin)”

How To: RecyclerView with a Kotlin-Style Click Listener in Android

Android RecyclerView - Click Listener - Flow

In this article, we add a click listener to a RecyclerView on Android. Advanced language features of Kotlin make this far easier than it was with Java. However, you need to understand a few core concepts of the Kotlin language.

To get started with the RecyclerView, follow the steps in the previous article or check out the finished project on GitHub. Continue reading “How To: RecyclerView with a Kotlin-Style Click Listener in Android”

Kotlin & RecyclerView for High Performance Lists in Android

Android: RecyclerView - Adapter Flow

RecyclerView is the best approach to show scrolling lists on Android. It ensures high performance & smooth scrolling, while providing list elements with flexible layouts. Combined with modern language features of Kotlin, the code overhead of the RecyclerView is greatly reduced compared to the traditional Java approach.

Sample Project: PartsList – Getting Started

In this article, we’ll walk through a sample scenario: a scrolling list for a maintenance app, listing machine parts: “PartsList”. However, this scenario only affects the strings we use – you can copy this approach for any use case you need. Continue reading “Kotlin & RecyclerView for High Performance Lists in Android”

Using Natural Language Understanding, Part 4: Real-World AI Service & Socket.IO

The final vital sign checklist app with natural language understanding

In this last part, we bring the vital signs checklist to life. Artificial Intelligence interprets assessments spoken in natural language. It extracts the relevant information and manages an up-to-date, browser-based checklist. Real-time communication is handled through Web Sockets with Socket.IO.

The example scenario focuses on a vital signs checklist in a hospital. The same concept applies to countless other use cases.

In this article, we’ll query the Microsoft LUIS Language Understanding service from a Node.js backend. The results are communicated to the client through Socket.IO.
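For a rough idea of how these pieces can fit together – querying the LUIS REST endpoint from Node.js and pushing the result to the browser via Socket.IO – here is a hedged sketch. The endpoint URL, app ID, key and event names are placeholders, axios is just one possible HTTP client, and the exact response shape depends on the LUIS API version:

```javascript
// Sketch: forward a spoken / typed assessment to LUIS and push the result
// to connected browsers through Socket.IO. URL, app ID and key are placeholders.
const http = require("http");
const axios = require("axios");
const socketIo = require("socket.io");

const server = http.createServer();
const io = socketIo(server);

// Placeholder LUIS v2 endpoint – replace region, app ID and subscription key.
const LUIS_URL =
  "https://westeurope.api.cognitive.microsoft.com/luis/v2.0/apps/YOUR-APP-ID";
const LUIS_KEY = "YOUR-SUBSCRIPTION-KEY";

async function analyzeUtterance(utterance) {
  const response = await axios.get(LUIS_URL, {
    params: { q: utterance, "subscription-key": LUIS_KEY },
  });
  // Typical LUIS result: top scoring intent plus extracted entities.
  return response.data;
}

io.on("connection", (socket) => {
  // The browser sends the assessment text; we reply with the LUIS analysis.
  socket.on("assessment", async (utterance) => {
    try {
      const result = await analyzeUtterance(utterance);
      socket.emit("checklistUpdate", result);
    } catch (err) {
      socket.emit("luisError", err.message);
    }
  });
});

server.listen(3000, () => console.log("Listening on port 3000"));
```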

Connecting LUIS to Node.js

In the previous article, we verified that our LUIS service works fine. Now, it’s time to connect all components. The aim is to query LUIS from our Node.js backend. Continue reading “Using Natural Language Understanding, Part 4: Real-World AI Service & Socket.IO”