Digital Healthcare, Augmented Reality, Machine Learning, Cloud Computing and more! Andreas Jakl is a professor @ St. Pölten University of Applied Sciences, Microsoft MVP for Windows Development and Amazon AWS Educate Cloud Ambassador & Community Builder.
Currently, Facebook’s Spark AR Studio is restrictive with supported audio formats. Unfortunately, only M4A with specific settings is allowed. This short tutorial explains how to convert artificially generated neural voices (in this case coming from an MP3 file as produced by Amazon Polly) to the M4A format accepted by Spark AR. I’m using the free Audacity tool, which integrates the open-source FFmpeg plug-in.
Neither Amazon Polly nor the Microsoft Azure Text-to-Speech cognitive service can directly produce an m4a audio file. In its additional settings, Polly offers MP3, OGG, PCM and Speech Marks. MP3 goes up to a sample rate of 24000 Hz, PCM is limited to 16000 Hz.
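If you prefer the command line to Audacity, the same conversion can be sketched with the FFmpeg CLI directly. The following Python snippet only builds the command; it is a minimal sketch, not the workflow from the tutorial. The target settings (mono, 44100 Hz, AAC) are an assumption about what Spark AR accepts, and the file name `polly_voice.mp3` and the helper `build_ffmpeg_command` are hypothetical.

```python
import shlex
from pathlib import Path


def build_ffmpeg_command(source, sample_rate=44100, channels=1):
    """Build the ffmpeg argument list that converts `source` to an .m4a file.

    The defaults (mono, 44100 Hz) are assumed Spark AR settings, not values
    taken from the article.
    """
    src = Path(source)
    target = src.with_suffix(".m4a")  # same name, new extension
    return [
        "ffmpeg",
        "-i", str(src),            # input file, e.g. the MP3 from Amazon Polly
        "-ac", str(channels),      # number of audio channels
        "-ar", str(sample_rate),   # output sample rate in Hz
        "-c:a", "aac",             # AAC audio codec inside the M4A container
        str(target),
    ]


command = build_ffmpeg_command("polly_voice.mp3")
print(shlex.join(command))
```

To actually run the conversion, pass the list to `subprocess.run(command, check=True)` on a machine where `ffmpeg` is installed and on the PATH.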
In a recent research project, we explored possibilities for interactive storytelling, usability, and interaction methods of an Augmented Reality app for patient education. We developed an ARCore app with Unity that helps patients with strabismus to better understand the processes of examinations and eye surgeries. Afterwards, we performed a two-phase evaluation with a total of 24 test subjects.
Low health literacy is a well-known and serious issue. One in five American adults lacks the skills to fully understand the implications of processes related to their health. Audio and computer-aided instructions can be helpful. Spoken instructions in particular lead to a higher rate of understanding. A smartphone app that combines multiple approaches can therefore provide great benefits.
We developed and evaluated a prototype Augmented Reality (AR) mobile application called Enlightening Patients with Augmented Reality (EPAR). The app is designed for patient education about strabismus and the corresponding eye surgery. It is intended to be used in addition to the doctors’ mandatory consultations.
The topic was free to choose and up to the creativity of the students. Their creation had to pass the manual skill certification process performed by Amazon. This means that they didn’t have to just develop the skill, but also provide all required metadata like description and icons.
With 2D image tracking, you can create real-life anchors. You need pre-defined markers; Google calls the system Augmented Images. Just point your phone at the image, and your app lets the 3D model immediately appear on top of it.
In the previous part of the tutorial, we wrote Unity scripts so that the user could place 3D models in the Augmented Reality world. A raycast from the smartphone’s screen hit a trackable in the real world, where we then anchored the object. However, this approach requires user interaction and a good user experience to guide users, especially if they’re new to AR.
Using 2D Image Tracking
You need to provide reference images, which your app’s users will then encounter in the real world. AR Foundation distinguishes these images and tracks their physical location.
Some usage scenarios where 2D image tracking is helpful:
In the first two parts, we set up an AR Foundation project in Unity. Next, we looked at how to handle trackables in AR. Now, we’re finally ready to place virtual objects in the real world. For this, we perform a raycast and then create an anchor at the target position. How do you perform this with AR Foundation? How do you attach an anchor to the world or to a plane?
AR Raycast Manager
If you’d like to let the user place a virtual object in relation to a physical structure in the real world, you need to perform a raycast. You “shoot” a ray from the position of the finger tap into the perceived AR world. The raycast then tells you if and where this ray intersects with a trackable like a plane or a point cloud.
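The geometry behind such a raycast can be illustrated with a plain ray-plane intersection. The snippet below is a minimal Python sketch of that math only; it is not AR Foundation’s API, and the names `Vec3` and `intersect_ray_plane` are hypothetical helpers introduced for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Vec3:
    x: float
    y: float
    z: float

    def __add__(self, other):
        return Vec3(self.x + other.x, self.y + other.y, self.z + other.z)

    def __sub__(self, other):
        return Vec3(self.x - other.x, self.y - other.y, self.z - other.z)

    def scale(self, factor):
        return Vec3(self.x * factor, self.y * factor, self.z * factor)

    def dot(self, other):
        return self.x * other.x + self.y * other.y + self.z * other.z


def intersect_ray_plane(origin, direction, plane_point, plane_normal,
                        epsilon=1e-6) -> Optional[Vec3]:
    """Return the point where the ray hits the plane, or None if it misses."""
    denom = direction.dot(plane_normal)
    if abs(denom) < epsilon:
        return None  # ray runs parallel to the plane
    t = (plane_point - origin).dot(plane_normal) / denom
    if t < 0:
        return None  # plane lies behind the ray origin
    return origin + direction.scale(t)


# Camera 1.5 m above the floor, looking diagonally down and forward.
camera = Vec3(0.0, 1.5, 0.0)
floor_point = Vec3(0.0, 0.0, 0.0)
floor_normal = Vec3(0.0, 1.0, 0.0)
hit = intersect_ray_plane(camera, Vec3(0.0, -1.0, 1.0), floor_point, floor_normal)
print(hit)
```

In a real AR app, the framework additionally checks whether the hit point lies within the detected boundary of the plane; this sketch treats the plane as infinite.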
A traditional raycast only considers objects present in its physics system, which isn’t the case for AR Foundation trackables. Therefore, AR Foundation comes with its own variant of raycasts. They support two modes:
After setting up the initial AR Foundation project in Unity in part 1, we’re now adding the first basic augmented reality features to our project. How does AR Foundation ensure that your virtual 3D objects stay in place in the live camera view by moving them accordingly in Unity’s world space? AR Foundation uses the concept of trackables. For each AR feature you’d like to use, you will additionally add a corresponding trackable manager to your AR Session Origin.
When developing mobile Augmented Reality apps, you usually want to target both Android and iOS phones. AR Foundation is Unity’s approach to provide a common layer, which unifies both Google’s ARCore and Apple’s ARKit. As such, it is the recommended way to build AR apps with Unity.
To work with AR Foundation, you first have to understand its structure. The top layer of its modular design doesn’t hide everything else. Sometimes, the platform-dependent layers and their respective capabilities shine through, and you must consider these as well.
AR Foundation is a highly modular system. At the bottom, individual provider plug-ins contain the glue to the platform-specific native AR functionality (ARCore and ARKit). On top of that, the XR Subsystems provide different functionality through a platform-agnostic interface.
In dialog trees for voice assistants, you often need to introduce some randomness. If the smart speaker doesn’t always repeat the same phrases, the dialog sounds more natural. Many other use cases exist as well, e.g., you might want to ask the user a random question in a quiz.
Random Block in Voiceflow
To enable this functionality, Voiceflow includes a “Random” block. It chooses a different path each time. The “no duplicates” option ensures that the dialog does not take the same path twice.
However, while this works fine in the Voiceflow testing environment, it currently has issues when using the skill live on Amazon Alexa. Additionally, you might sometimes want to have more control over the process – e.g., pre-set the random choices, store them in a database for advanced logging or tease the next item when the skill ends.
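One way to get this extra control is to pre-shuffle the choices once and then work through them in order. The following Python snippet is a minimal sketch of that idea, independent of Voiceflow and Alexa; the class name `ShuffleBag` and the sample questions are hypothetical.

```python
import random


class ShuffleBag:
    """Hand out items in random order without repeats within one round."""

    def __init__(self, items, rng=None):
        if not items:
            raise ValueError("ShuffleBag needs at least one item")
        self._items = list(items)
        self._rng = rng or random.Random()
        self._queue = []  # pre-set order for the current round

    def _refill(self):
        # Start a new round: copy all items and fix their order up front.
        self._queue = self._items[:]
        self._rng.shuffle(self._queue)

    def peek(self):
        """Return the upcoming item without consuming it (for teasers)."""
        if not self._queue:
            self._refill()
        return self._queue[-1]

    def next(self):
        """Return and consume the next item."""
        if not self._queue:
            self._refill()
        return self._queue.pop()


questions = ["capital of Austria", "highest mountain", "longest river"]
bag = ShuffleBag(questions, rng=random.Random(42))

teaser = bag.peek()
first_round = [bag.next() for _ in questions]
print(first_round)
```

Because the order is fixed before the first item is used, the remaining queue could be stored in a database for logging, and `peek()` supports teasing the next item when the skill ends. Note that the last item of one round can still match the first item of the next round.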
In the final part, let’s look at how we can generate and use the AR depth maps through Unity’s AR Foundation. In the previous part, we tested the ready-made example. Now, it’s time to write code ourselves.
In this case, I’m using Unity 2021.1 (Alpha) together with AR Foundation 4.1.1 to make sure we have the latest AR support & features in our app. But as written in the previous article, Unity 2020.2 should be sufficient.
I’ve tested the example on Android (Google Pixel 4 with Android 11 & ARCore), but it should also work fine on iOS with ARKit.
XR Plug-in management: activate the management in the project settings. Additionally, enable the ARCore Plug-in provider. To check if everything was installed, open Window > Package Manager. You should see both AR Foundation as well as ARCore XR Plugin with at least version 4.1.1.
Android player settings: switch to the Android build platform, uncheck multithreaded rendering, remove Vulkan from the rendering APIs, make sure the package name is personalized and finally set the minimum API level to at least 24 (Android 7.0).
Scene setup: add the required prefabs and GameObjects to your scene. Right-click in the hierarchy panel > XR > XR Session. Also add the XR Session Origin.
By default, the AR depth map is always returned in Landscape Right orientation, no matter what screen orientation your app is currently in. While we could of course adapt the map to the current screen rotation, we want to keep this example focused on the depth map. Therefore, simply lock the screen orientation through Project Settings > Player > Resolution and Presentation > Orientation > Default Orientation: Landscape Right.