Digital Healthcare, Augmented Reality, Mobile Apps and more! Andreas Jakl is a lecturer for Digital Healthcare & Smart Engineering @ St. Pölten University of Applied Sciences, Microsoft MVP for Windows Development and Amazon AWS Educate Cloud Ambassador.
How to make real-time HDR lighting and reflections possible on a smartphone? Based on the unique properties of human perception and the challenges of capturing the world’s state and applying it to virtual objects. Is it still possible?
Google found an interesting approach, which is based on using Artificial Intelligence to fill the missing gaps. In this article, we’ll take a look at how ARCore handles this. The practical implementation of this research is available in the ARCore SDK for Unity. Based on this, a short hands-on guide demonstrates how to create a sphere that reflects the real world – even though the smartphone only captures a fraction of it.
Google ARCore Approach to Environmental HDR Lighting
To still make environmental HDR lighting possible in real-time on smartphones, Google uses an innovative approach, which they also published as a scientific paper . Here, I’ll give you a short, high-level overview of their approach:
First, Google captured a massive amount of training data. The video feed of the smartphone camera captured both the environment, as well as three different spheres. The setup is shown in the image below.
In part 1, we looked at how humans perceive lighting and reflections – vital basic knowledge to estimate how realistic these cues need to be. The most important goal is that the scene looks natural to human viewers. Therefore, the virtual lighting needs to be closely aligned with real lighting.
But how to measure lighting in the real world, and how to apply it to virtual objects?
How do you need to set up virtual lighting to satisfy the criteria mentioned in part 1? Humans recognize if an object doesn’t fit in:
The image above from the Google Developer Documentation shows both extremes. Even though you might still recognize that the rocket is a virtual object in the right image, you’ll need to look a lot harder. The image on the left is clearly wrong, especially due to the misplaced shadow.
Realistically merging virtual objects with the real world in Augmented Reality has a few challenges. The most important:
Realistic positioning, scale and rotation
Lighting and shadows that match the real-world illumination
Occlusion with real-world objects
The first is working very well in today’s AR systems. Number 3 for occlusion is working OK on the Microsoft HoloLens; and it’s soon also coming to ARCore (a private preview is currently running through the ARCore Depth API – which is probably based on the research by Flynn et al. ).
But what about the second item? Google put a lot of effort into this recently. So, let’s look behind the scenes. How does ARCore estimate HDR (high dynamic range) lighting and reflections from the camera image?
Remember that ARCore needs to scale to a variety of smartphones; thus, a requirement is that it also works on phones that only have a single RGB camera – like the Google Pixel 2.
In the near future, we will primarily interact with technology through voice. Especially for older generations and kids, voice has the lowest entry barrier – compared to the complexity of computers or even smartphones. Simply start talking to speech assistants like Amazon Alexa, and they will help immediately.
To make most use of it, I’ve co-organized the “Alexa for Wellbeing Online Challenge” during the last few weeks. Together with AWS Educate and Hilfswerk Lower Austria, we’ll host a 10-day online hackathon, open to everyone.
In this article, we’ll configure AWS Identity and Access Management (IAM) to easily use Amazon Sumerian with multiple users. This is especially important for classrooms or trainings. You often don’t want to loose time by having attendees set up and activate their own AWS accounts, including their personal credit cards.
Instead, by setting up sub-users in your account beforehand, you have complete control over the experience and can get started right away. Additionally, it helps with troubleshooting for exercises.
Right now, no ready-made AWS Educate classrooms are available that support Amazon Sumerian. If that changes, the classrooms would be a good alternative, as it gives students their own free AWS credits instead of everything billed to a central account.
Securing Your Account
The first step is making sure you own root account is properly secured. A major part is enabling Multi-Factor Authentication (MFA) for your root account. Especially when working in teams and with source control, it’s an easy-to-make mistake to upload your credentials somewhere; you don’t want others to have full control over your whole AWS account, as this can incur major charges to your credit card. Therefore, it’s best to enable MFA before you continue.
In an Augmented Reality scene, users looks at the live camera feed. Virtual objects anchor at specific positions of the real world. Our task is to let the user place virtual in the real world. To achieve that, the user simply taps on the smartphone screen. Through a hit test, our script then creates an anchor in the real world and links that to a virtual 3D model entity.
That’s the high level overview. To code this anchoring logic, a few intermediate steps are needed:
Hit Test: converts coordinates of the user’s screen tap and sends the normalized coordinates to the AR system. This checks what’s in the real world at that position.
Register Anchor: next, our script instructs the AR system to create an anchor at that position.
Link Anchor: finally, the ID of the created anchor is linked to our entity. This allows Sumerian to continually update the transform of our 3D entity. Thus, the object stays in place in the real world, even when the user moves around.
Transforming these steps into code, this is what our code architecture looks like. It includes three call-backs, starting with the touch event and ending with the registered and linked AR anchor.
The WebXR standard isn’t finished yet. How does the web-based Amazon Sumerian platform integrate with the real world for Augmented Reality? We’ll take a look at the glue that binds the 3D WebGL contents from the web view to the native AR platform (ARCore / ARKit). To access this, we will also look at Sumerian internal engine classes like ArAnchorComponent, which handle the cross-platform web-to-native mapping.
This article continues from part 1, which covered the scripting basics of Amazon Sumerian and prepared the scene for AR placement.
Anchors in Amazon Sumerian
Let’s start with a bit of background of how Sumerian handles AR.
Ultimately, a 3D model is placed in the user’s real environment using an “Anchor”. This is directly represented in Sumerian. To create an anchor in your scene, your code goes through the following steps:
Activate placement mode by tapping on a specific object in the 3D scene. In this case, I decided that tapping the host avatar triggers placement mode.
The host then explains what to do: tapping on another surface moves the host and related objects. Guide users through the process. The Sumerian hosts are ideal to explain the process.
The user taps on a real-world surface in the AR scene.
Next, the scene contents move, the anchor updates and the host confirms.
New ES6 Based Scripting
Additionally, Amazon Sumerian is evolving its scripting language. A major upgrade to ES6 is underway. It’s fully based on classes and fits better into the actions and state machines used in other places of Sumerian. The new APIs are still marked as “Preview”. However, the old APIs are already called “Legacy” or “Old Script Format”.
I decided to re-write the script using the new APIs. It involves calling a lot of internal parts of Sumerian. Thus, it’s a lot more complex than all other examples for the new API currently out there. However, it’s interesting to dig more into the internals of how a modern, web-based AR environment works.
Download hands-on workshop slides and material for a complete getting-started guide to your first 3D experience – with a background in digital healthcare!
The build.well.being conference is an annual networking event for the doers in Digital Healthcare. The fast-paced event compresses a lot of useful information into a short day: sessions from health professionals (including a keynote from Brian Anthony, associate director at the MIT.nano). Student project pitches. Plus: hands-on workshops.
Together with Anna Runefelt, I was running a challenging workshop: introducing attendees with a healthcare background to the world of Augmented / Virtual Reality. The aim of the hands-on workshop: creating your first live 3D experience in about 1 hour. This was possible thanks to the easy-to-use interface of Amazon Sumerian. Most of the attendees who followed along indeed managed to get a fully working 3D scene running on their laptops.
Image classification & content description is incredibly powerful. Cloud-based computer vision services instantly return a JSON-based description of what they see in photos.