In the first part of the article series, we set up an Augmented Reality app with a host (= avatar). Now, we’ll dive deeper and integrate host interactions. To make the character more life-like, it should look at you. We’ll assign speech files and ensure that the gestures of the character match the spoken content.
But before we set out on these tasks, let’s take a minute to look at some vital concepts of Amazon Sumerian.
Behaviors, State Machines & Events
Unless you want your app to just show a static scene, you’ll need to integrate actions. An action can be triggered by interactive user input. Alternatively, you define what happens sequentially – e.g., first a new object appears in the scene, then the host avatar explains it.
Technically, this is solved using a state machine. Each entity can have multiple different states. A behavior is a collection of these states. States transition from one to another based on actions & their events (= interactions or timing).
These actions can trigger events. Some examples: a wait time of 5 seconds is over, a movement has completed, or a sound file has finished playing. A transition then moves the entity to a different state.
By combining several states together with transitions, you can make entities interact with the user or perform other tasks to ensure your scene is dynamic.
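The idea of states, events and transitions can be sketched in a few lines of plain Python. This is only a conceptual illustration and not Sumerian's own API: the class names `State` and `Behavior`, the `emit` method and the event names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """One state of a behavior: maps event names to the next state's name."""
    name: str
    transitions: dict[str, str] = field(default_factory=dict)


class Behavior:
    """A collection of states with exactly one active state at a time."""

    def __init__(self, states: list[State], initial: str) -> None:
        self.states = {s.name: s for s in states}
        self.current = initial

    def emit(self, event: str) -> str:
        """Follow the transition for `event`, if the active state defines one."""
        target = self.states[self.current].transitions.get(event)
        if target is not None:
            self.current = target
        return self.current


# Hypothetical host behavior: wait, then speak, then go back to idle.
host = Behavior(
    states=[
        State("Waiting", {"wait_over": "Speaking"}),
        State("Speaking", {"sound_finished": "Idle"}),
        State("Idle", {"user_click": "Speaking"}),
    ],
    initial="Waiting",
)

host.emit("wait_over")       # Waiting -> Speaking
host.emit("user_click")      # ignored: Speaking has no such transition
host.emit("sound_finished")  # Speaking -> Idle
print(host.current)
```

Events that the active state does not list are simply ignored, which is why the click while speaking has no effect in this sketch.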
Many AR / VR use cases involve virtual training or guidance. With Amazon Sumerian, you can quickly create cross-platform apps for these scenarios. The main advantage is the large amount of ready-made content: avatars (called hosts) and virtual environment templates. Through the direct integration of Amazon Web Services (AWS), it’s easy to make the host speak to the user – including lip sync, gestures and even conversations through bots.
Of course, you can create similar solutions with Unity. But Sumerian requires far less prior 3D software knowledge and is therefore ideal for smaller projects as well as prototypes. The interface and general setup are still quite similar to Unity, so moving on to Unity, if needed, is a natural next step after you’ve created your first few apps and services with Amazon Sumerian.
Additionally, right now Amazon is hosting an AR / VR challenge with lots of prizes for the best apps of various categories. So, it’s a great time to explore Sumerian!
What is Amazon Sumerian?
Essentially, Sumerian is a browser-based 3D editing platform. It allows developing for most AR and VR platforms, including Oculus, Vive, Windows Mixed Reality, as well as the browser, Google ARCore and Apple ARKit.
Behind the scenes, it’s based on WebXR. That’s the evolution of WebVR, which was mainly targeting VR headsets. With WebXR, you can access sound, controllers and also anchor objects to the real environment in Mixed Reality scenarios.
Amazon Sumerian Account Setup
First, you need to set up your Amazon account. Amazon offers an AWS free tier, which gives you access to many services and provides some usage quotas for free for the first 12 months. Afterwards, you can still continue using selected services for free. Note that Sumerian is not part of these, but 12 months provides enough time to test & develop your service.
In this last part, we bring the vital signs checklist to life. Artificial Intelligence interprets assessments spoken in natural language. It extracts the relevant information and manages an up-to-date, browser-based checklist. Real-time communication is handled through WebSockets with Socket.IO.
The example scenario focuses on a vital signs checklist in a hospital. The same concept applies to countless other use cases.
Training Artificial Intelligence to perform real-life tasks has been painful. The latest AI services now offer more accessible user interfaces. These require little knowledge about machine learning. The Microsoft LUIS service (Language Understanding Intelligent Service) performs an amazing task: interpreting natural language sentences and extracting relevant parts. You only need to provide 5+ sample sentences per scenario.
In this article series, we’re creating a sample app that interprets assessments from vital signs checks in hospitals. It filters out relevant information like the measured temperature or pupillary response. Yet, it’s easy to extend the scenario to any other area.
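To make the extraction step more concrete, here is a minimal sketch of how a backend might pull the relevant values out of a language understanding result. It assumes a response shaped like the JSON of the LUIS v2 prediction endpoint (`topScoringIntent` plus an `entities` list). The intent name `Temperature`, the sample values and the helper `extract_assessment` are hypothetical.

```python
import json
from typing import Optional

# Hypothetical response, modeled on the LUIS v2 prediction JSON (assumption).
sample_response = json.loads("""
{
  "query": "the temperature is 38.5 degrees",
  "topScoringIntent": {"intent": "Temperature", "score": 0.97},
  "entities": [
    {"entity": "38.5", "type": "builtin.number", "startIndex": 19, "endIndex": 22}
  ]
}
""")


def extract_assessment(response: dict, min_score: float = 0.5) -> Optional[dict]:
    """Return the recognized intent and its entity values, or None if unsure."""
    top = response.get("topScoringIntent", {})
    if top.get("score", 0.0) < min_score:
        return None  # the service was not confident enough about this sentence
    # Map each entity type to the text that was recognized for it.
    values = {e["type"]: e["entity"] for e in response.get("entities", [])}
    return {"intent": top["intent"], "values": values}


result = extract_assessment(sample_response)
print(result)
```

The `min_score` cut-off is a design choice for this sketch: sentences the service cannot classify with reasonable confidence are dropped instead of being written to the checklist.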
The vision: automatic checklists, filled out by simply listening to users explaining what they observe. The sample app uses a lightweight architecture: HTML5, Node.js + the LUIS service in the cloud.
Such an app would be incredibly useful in a hospital, where nurses need to perform and log countless vital sign checks with patients every day.
In part 1 of the article, I explained the overall architecture of the service. In this part, we get hands-on and start implementing the Node.js-based backend. It will ultimately handle all the central messaging. It communicates with both the client user interface running in a browser and the Microsoft LUIS language understanding service in the Azure Cloud.
Creating the Node Backend
During the last few years, cognitive services have become immensely powerful. Especially interesting is natural language understanding. Using the latest tools, training the computer to understand real spoken sentences and to extract information is reduced to a matter of minutes. We as humans no longer need to learn how to speak with a computer; it simply understands us.
I’ll show how to use the Language Understanding Cognitive Service (LUIS) from Microsoft. The aim is to build an automated checklist for nurses working at hospitals. Every morning, they record the vital signs of every patient. At the same time, they document the measurements on paper checklists.
With the new app developed in this article, the process is much easier. While checking the vital signs, nurses usually talk to the patients about their assessments. The “Vital Signs Checklist” app filters out the relevant data (e.g., the temperature or the pupillary response) and marks it in a checklist. Nurses no longer have to pick up a pen to manually record the information.
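The checklist itself can be pictured as a small data structure that gets filled in as assessments are recognized. The following is only a sketch under that assumption. The item names and the helpers `mark` and `pending` are hypothetical.

```python
# Hypothetical checklist: each vital sign starts out unrecorded (None).
checklist = {"Temperature": None, "PupillaryResponse": None, "Pulse": None}


def mark(items: dict, name: str, value: str) -> bool:
    """Record a value for a known checklist item; ignore unknown items."""
    if name not in items:
        return False
    items[name] = value
    return True


def pending(items: dict) -> list:
    """List the items that have not been recorded yet."""
    return [name for name, value in items.items() if value is None]


mark(checklist, "Temperature", "38.5")
mark(checklist, "PupillaryResponse", "normal")
print(pending(checklist))
```

Keeping unrecorded items as `None` makes it easy to show which checks are still open.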
ARCore has a great feature – light estimation. The ARCore SDK estimates the global lighting, which you can use as input for your own shaders to make the virtual objects fit in better with the captured real world. In this article, I’m taking a closer look at how the light estimation works in the current ARCore preview SDK.
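As a rough idea of what using the estimate as shader input means, the sketch below scales a surface color by a single global intensity value. It assumes the SDK reports the estimate as one scalar between 0 and 1. The function `apply_light_estimate` is hypothetical and only mimics on the CPU what a fragment shader would do per pixel.

```python
def apply_light_estimate(color, pixel_intensity):
    """Darken an RGB color (components 0..1) by a global light intensity."""
    # Clamp the estimate so that out-of-range values cannot produce invalid colors.
    intensity = min(max(pixel_intensity, 0.0), 1.0)
    return tuple(channel * intensity for channel in color)


# An orange object in a room where the estimated intensity is 50 %.
lit_color = apply_light_estimate((1.0, 0.5, 0.0), 0.5)
print(lit_color)
```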
Is there another way to 3D print segmented medical data coming from MRI / CT / Ultrasound, for example by splitting the model into two halves?
In the first part of this article, the result was that the support structures required by a standard 3D printer significantly reduce the details present on the surface of the printed body part.
Christoph Braun had the idea for another method to reduce the support structures to a minimum: by splitting the object into two halves, each has a flat surface area that can be used as the base for the 3D print.
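The first step of such a split can be sketched as sorting the triangles of a mesh by which side of a horizontal cutting plane they lie on. This is a simplified illustration, not the workflow of any particular tool: the function `split_by_plane` is hypothetical, and a real mesh editor would additionally cut the crossing triangles and close both halves with a flat cap.

```python
def split_by_plane(triangles, z_cut):
    """Group triangles into those below, above and crossing the plane z = z_cut."""
    lower, upper, crossing = [], [], []
    for triangle in triangles:
        heights = [vertex[2] for vertex in triangle]
        if max(heights) <= z_cut:
            lower.append(triangle)
        elif min(heights) >= z_cut:
            upper.append(triangle)
        else:
            crossing.append(triangle)  # would need to be cut in a real tool
    return lower, upper, crossing


# Three sample triangles, each given as three (x, y, z) vertices.
mesh = [
    ((0, 0, 0), (1, 0, 0), (0, 1, 1)),   # entirely below z = 2
    ((0, 0, 3), (1, 0, 4), (0, 1, 5)),   # entirely above z = 2
    ((0, 0, 1), (1, 0, 3), (0, 1, 2)),   # crosses z = 2
]
lower, upper, crossing = split_by_plane(mesh, z_cut=2)
print(len(lower), len(upper), len(crossing))
```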
Based on the 4-part tutorial where we segmented the brain from an MRI image, one of the most interesting application areas is printing such 3D models. In that sense, it makes no difference if the data is coming from an MRI (e.g., a brain or tumor), CT (e.g., the skull) or ultrasound. In this article, we’ll look at how to prepare the 3D model for 3D printing.
In this part, we print the MRI brain model using the Witbox 2 3D printer with plastic and deal with support structures. The aim is to make this process accessible for everyone – so you don’t need specialized and expensive software & hardware; we’ll instead use open source and free tools as much as possible.
In the previous blog posts, we’ve used a simple grayscale threshold to define the model surface for visualizing an MRI / CT / Ultrasound in 3D. In many cases, you need to have more control over the 3D model generation, e.g., to only visualize the brain, a tumor or a specific part of the scan.
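For reference, the simple threshold approach boils down to the following sketch: every pixel at or above a chosen gray level counts as part of the model. The function `threshold_mask` and the sample values are hypothetical, and real scans are 3D volumes rather than a single small slice.

```python
def threshold_mask(image_slice, level):
    """Return a binary mask: 1 where the gray value reaches the threshold."""
    return [[1 if value >= level else 0 for value in row] for row in image_slice]


# A tiny 3x3 grayscale slice with values from 0 (black) to 255 (white).
scan_slice = [
    [10, 200, 30],
    [180, 220, 90],
    [20, 150, 15],
]
mask = threshold_mask(scan_slice, level=128)
print(mask)
```

Because the threshold only looks at brightness, it cannot tell a brain from a skull of similar gray value, which is why more control over the selection is needed.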