In the first part of the article series, we set up an Augmented Reality app with a host (= avatar). Now, we’ll dive deeper and integrate host interactions. To make the character more life-like, it should look at you. We’ll assign speech files and ensure that the gestures of the character match the spoken content.
But before we set out on these tasks, let’s take a minute to look at some vital concepts of Amazon Sumerian.
Behaviors, State Machines & Events
Unless you want your app to just show a static scene, you’ll need to integrate actions. The trigger for an action could react to interactive user inputs. Alternatively, you define what happens sequentially – e.g., first a new object appears in the scene, then the host avatar explains it.
Technically, this is solved using a state machine. Each entity can have multiple different states. A behavior is a collection of these states. States transition from one to another based on actions & their events (= interactions or timing).
These actions can trigger events. Some examples: the wait time of 5 seconds is over, the movement is completed or the sound file finished playing. Using a transition, you can then transition to a different state.
By combining several states together with transitions, you can make entities interact with the user or perform other tasks to ensure your scene is dynamic.
Many AR / VR use cases involve virtual trainings or guide topics. With Amazon Sumerian, you can quickly create cross-platform apps for these scenarios. The main advantage is the large amount of ready-made content: avatars (called hosts) and virtual environment templates. Through the direct integration of Amazon Web Services (AWS), it’s easy to make the host speak to the user – including lip sync, gestures and even conversations through bots.
Of course, you can create similar solutions with Unity. But Sumerian requires far less prior 3D software knowledge and is therefore ideal for smaller projects as well as prototypes. The interface and generic setup is still quite similar to Unity; so it’s a good evolution to switch to Unity – if needed – after you’ve created your first few apps and services with Amazon Sumerian.
Additionally, right now Amazon is hosting an AR / VR challenge with lots of prizes for the best apps of various categories. So, it’s a great time to explore Sumerian!
What is Amazon Sumerian?
Essentially, Sumerian is a browser-based 3D editing platform. It allows developing for most AR and VR platforms, including Oculus, Vive, Windows Mixed Reality, as well as the browser, Google ARCore and Apple ARKit.
Behind the scenes, it’s based on WebXR. That’s the evolution of WebVR, which was mainly targeting VR headsets. With WebXR, you can access sound, controllers and also anchor objects to the real environment in Mixed Reality scenarios.
Amazon Sumerian Account Setup
First, you need to set up your Amazon account. Amazon offers an AWS free tier, which gives you access to many services and provides some usage quotas for free for the first 12 months. Afterwards, you can still continue using selected services for free. Note that Sumerian is not part of these, but 12 months provides enough time to test & develop your service.
In the first part, we took a look at how an algorithm identifies keypoints in camera frames. These are the base for tracking & recognizing the environment.
For Augmented Reality, the device has to know more: its 3D position in the world. It calculates this through the spatial relationship between itself and multiple keypoints. This process is called “Simultaneous Localization and Mapping” – SLAM for short.
Sensors for Perceiving the World
The high-level view: when you first start an AR app using Google ARCore, Apple ARKit or Microsoft Mixed Reality, the system doesn’t know much about the environment. It starts processing data from various sources – mostly the camera. To improve accuracy, the device combines data from other useful sensors like the accelerometer and the gyroscope.
Creating apps that work well with Augmented Reality requires some background knowledge of the image processing algorithms that work behind the scenes. One of the most fundamental concepts involves anchors. These rely on keypoints and their descriptors, detected in the recording of the real world.
Overall, the AR ecosystem is still small. Nevertheless, it’s fragmented. Google develops ARCore, Apple creates ARKit and Microsoft is working on the Mixed Reality Toolkit. Fortunately, Unity started unifying these APIs with the ARInterface.
The traditional mobile AR app development cycle includes compiling and deploying apps to a real device. That takes a long time and is tedious for quick testing iterations.
A big advantage of ARKit so far has been the ARKit Unity Remote feature. The iPhone runs a simple “tracking” app. It transmits its captured live data to the PC. Your actual AR app is running directly in the Unity Editor on the PC, based on the data it gets from the device. Through this approach, you can run the app by simply pressing the Play-button in Unity, without native compilation.
This is similar to the Holographic Emulation for the Microsoft HoloLens, which has been available for Unity for some time.
The great news is that the new Unity ARInterface finally adds a similar feature to Google ARCore: ARRemoteInterface. It’s available cross-platform for ARKit and ARCore.
We don’t have a Christmas tree in our apartment. But in today’s world, this is what Augmented Reality is for, right? Therefore, I decided to create an AR Christmas Tree in 5 minutes. This also gave me an opportunity to check out the new Google ARCore Developer Preview 2.
Christmas Tree 3D Model
First off, you need a 3D model of a Christmas tree. Two of the most accessible sources are Google Poly and Microsoft Remix 3D. Sticking to models created directly by Google and Microsoft, these two are the choices:
ARCore has a great feature – light estimation. The ARCore SDK estimates the global lighting, which you can use as input for your own shaders to make the virtual objects fit in better with the captured real world. In this article, I’m taking a closer look at how the light estimation works in the current ARCore preview SDK.
Following the basic project setup of the first part of this article, we now get to the fascinating details of the ARCore SDK. Learn how to find and visualize planes. Additionally, I’ll show how to instantiate objects and how to anchor them to the real world using Unity.
ARCore by Google is still in preview and only runs on a select few phones including the Google Pixel 2. In this article, I’m creating a demo app for ARCore using the ARCore SDK for Unity (Preview 1).
It’s following up on the blog post series where I segmented a 3D model of the brain from an MRI image. Instead of following these steps, you can download the final model used in this article for free from Google Poly.