Categories
App Development, Artificial Intelligence, Speech Assistants

Local Debugging of Alexa Skills with Visual Studio Code

Creating an Alexa-hosted skill is a fantastic way to start developing for voice assistants. However, you will eventually face issues that you need to debug in code. Alexa offers local skill debugging through Visual Studio Code, but setting it up is a bit tricky. This guide will take you through the necessary steps.

Skill Environment

This guide focuses on a Python-based skill and uses Windows as the local development environment. Most of it also applies to other setups.

I’ll start with a blank skill. First, create the skill in the Alexa Developer Console. The skill name I’m using in this example is “local debugging test”. The “type of experience” is “Other”, with a “Custom” model, as I’d like to start with a minimal blank skill. In the “Hosting services” category, choose “Alexa-hosted (Python)”. In the last step about templates, stick with “Start from Scratch”, which will give you a minimal Hello World-type voice interaction. The following screenshot summarizes the settings:

Review of the settings for the new Alexa skill that we will configure for local debugging through Visual Studio Code.
Categories
App Development, AR / VR, Cloud, Speech Assistants

How-To: Convert Neural Voice Audio from Amazon Polly (mp3) to Spark AR (m4a)

Currently, Facebook’s Spark AR Studio is restrictive with supported audio formats. Unfortunately, only M4A with specific settings is allowed. This short tutorial shows how to convert artificially generated neural voices (in this case coming from an mp3 file as produced by Amazon Polly) to the m4a format accepted by Spark AR. I’m using the free Audacity tool, which integrates the open-source FFmpeg plug-in.

Spark AR has the following requirements on audio files:

  • M4A format
  • Mono
  • 44.1 kHz sample rate
  • 16-bit depth

Generating Audio using Text-to-Speech (mp3 / PCM)

Neither Amazon Polly nor the Microsoft Azure Text-to-Speech cognitive service can directly produce an m4a audio file. In its additional settings, Polly offers MP3, OGG, PCM and Speech Marks. MP3 goes up to a sample rate of 24,000 Hz, while PCM is limited to 16,000 Hz.
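
If you generate the speech programmatically instead of through the Polly console, a minimal sketch with the AWS SDK for .NET could look like the following. The text, voice and output file name are only placeholders, not taken from the original tutorial; the important parts are the MP3 output format and the 24,000 Hz sample rate mentioned above.

```csharp
// Minimal sketch using the AWS SDK for .NET (AWSSDK.Polly NuGet package).
// It requests the neural "Joanna" voice as MP3 at Polly's maximum sample
// rate of 24,000 Hz; the m4a conversion then happens later in Audacity.
using System.IO;
using System.Threading.Tasks;
using Amazon.Polly;
using Amazon.Polly.Model;

class PollyNeuralVoiceSample
{
    static async Task Main()
    {
        using var polly = new AmazonPollyClient();

        var response = await polly.SynthesizeSpeechAsync(new SynthesizeSpeechRequest
        {
            Text = "Welcome to the augmented reality effect!",   // placeholder text
            Engine = Engine.Neural,
            VoiceId = VoiceId.Joanna,
            OutputFormat = OutputFormat.Mp3,
            SampleRate = "24000"          // MP3 maximum; PCM would be capped at 16,000 Hz
        });

        // Save the returned audio stream to disk for the conversion step.
        using var file = File.Create("neural-voice.mp3");
        await response.AudioStream.CopyToAsync(file);
    }
}
```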

Categories
Android, App Development, AR / VR

2D Image Tracking with AR Foundation (Part 4)

With 2D image tracking, you can use pre-defined marker images as real-life anchors; Google calls this system Augmented Images. Just point your phone at the image, and your app immediately makes the 3D model appear on top of it.

In the previous part of the tutorial, we wrote Unity scripts so that the user could place 3D models in the Augmented Reality world. A raycast from the smartphone’s screen hit a trackable in the real world, where we then anchored the object. However, this approach requires user interaction and a good user experience to guide users, especially if they’re new to AR.

Using 2D Image Tracking

You need to provide reference images, which your app’s users will then encounter in the real world. AR Foundation recognizes these images in the camera feed and tracks their physical location.

Some usage scenarios where 2D image tracking is helpful:

  • Recognition of real-world objects
  • Automatically place information on top of objects
  • Create an indoor info or navigation system
  • Often quicker & easier than plane detection
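
Under the hood, AR Foundation exposes this feature through the ARTrackedImageManager. A minimal sketch of reacting to its change events could look like this; the content prefab and the handler class are placeholders, not taken from the tutorial.

```csharp
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

// Attach next to the ARTrackedImageManager on the AR Session Origin.
public class ImageTrackingHandler : MonoBehaviour
{
    [SerializeField] private ARTrackedImageManager trackedImageManager;
    [SerializeField] private GameObject contentPrefab;   // 3D model to show on the marker

    private readonly Dictionary<string, GameObject> spawned = new Dictionary<string, GameObject>();

    private void OnEnable() => trackedImageManager.trackedImagesChanged += OnChanged;
    private void OnDisable() => trackedImageManager.trackedImagesChanged -= OnChanged;

    private void OnChanged(ARTrackedImagesChangedEventArgs args)
    {
        // A reference image was detected for the first time: place the content on it.
        foreach (var trackedImage in args.added)
        {
            var instance = Instantiate(contentPrefab, trackedImage.transform);
            spawned[trackedImage.referenceImage.name] = instance;
        }

        // AR Foundation keeps updating trackedImage.transform, so parented content
        // follows the physical image; only hide it while tracking is lost.
        foreach (var trackedImage in args.updated)
        {
            spawned[trackedImage.referenceImage.name]
                .SetActive(trackedImage.trackingState == TrackingState.Tracking);
        }
    }
}
```
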
Categories
Android, App Development, AR / VR

Raycast & Anchor: Placing AR Foundation Holograms (Part 3)

In the first two parts, we set up an AR Foundation project in Unity. Next, we looked at how to handle trackables in AR. Now, we’re finally ready to place virtual objects in the real world. For this, we perform a raycast and then create an anchor at the target position. How do you perform this with AR Foundation? And how do you attach an anchor to the world or to a plane?

AR Raycast Manager

If you’d like to let the user place a virtual object in relation to a physical structure in the real world, you need to perform a raycast. You “shoot” a ray from the position of the finger tap into the perceived AR world. The raycast then tells you if and where this ray intersects with a trackable like a plane or a point cloud.
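Translated into AR Foundation code, such a tap-to-place flow could look roughly like this minimal sketch; the prefab reference and the class name are placeholders, and the ARRaycastManager used here is discussed in more detail below.

```csharp
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

// Attach to the AR Session Origin next to ARRaycastManager and ARAnchorManager.
public class TapToPlace : MonoBehaviour
{
    [SerializeField] private ARRaycastManager raycastManager;
    [SerializeField] private ARAnchorManager anchorManager;
    [SerializeField] private GameObject prefabToPlace;   // hologram to anchor

    private static readonly List<ARRaycastHit> hits = new List<ARRaycastHit>();

    private void Update()
    {
        if (Input.touchCount == 0) return;
        var touch = Input.GetTouch(0);
        if (touch.phase != TouchPhase.Began) return;

        // "Shoot" a ray from the tapped screen position into the perceived AR world.
        if (raycastManager.Raycast(touch.position, hits, TrackableType.PlaneWithinPolygon))
        {
            // The first hit is the closest trackable along the ray.
            var hitPose = hits[0].pose;
            var hitPlane = hits[0].trackable as ARPlane;

            // Attach an anchor to the plane so the hologram stays in place.
            var anchor = anchorManager.AttachAnchor(hitPlane, hitPose);
            Instantiate(prefabToPlace, anchor.transform);
        }
    }
}
```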

A traditional raycast only considers objects present in its physics system, which isn’t the case for AR Foundation trackables. Therefore, AR Foundation comes with its own variant of raycasts. They support two modes:

Categories
Android, App Development, AR / VR

Trackables and Managers in AR Foundation (Part 2)

After setting up the initial AR Foundation project in Unity in part 1, we’re now adding the first basic augmented reality features to our project. How does AR Foundation ensure that your virtual 3D objects stay in place in the live camera view by moving them accordingly in Unity’s world space? It uses the concept of trackables. For each AR feature you’d like to use, you additionally add a corresponding trackable manager to your AR Session Origin.

Trackables

In general, a trackable in AR Foundation is anything that can be detected and tracked in the real world. This starts with basics like anchors, point clouds and planes. More advanced tracking allows environmental probes for realistic reflection cube maps, face tracking, or even information about other participants in a collaborative AR session.

Trackable managers available in AR Foundation.

Each type of trackable has a corresponding manager class as part of the AR Foundation package that we added to our project.
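
For example, enabling plane detection only requires adding the ARPlaneManager to the AR Session Origin; a script can then subscribe to its change events. Here is a minimal sketch, with logging that is purely illustrative and not part of the tutorial project.

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;

// Attach next to the ARPlaneManager on the AR Session Origin.
public class PlaneLogger : MonoBehaviour
{
    [SerializeField] private ARPlaneManager planeManager;

    private void OnEnable() => planeManager.planesChanged += OnPlanesChanged;
    private void OnDisable() => planeManager.planesChanged -= OnPlanesChanged;

    private void OnPlanesChanged(ARPlanesChangedEventArgs args)
    {
        // The manager creates one ARPlane trackable per detected surface and
        // keeps updating it while the understanding of the real world improves.
        foreach (var plane in args.added)
        {
            Debug.Log($"New plane {plane.trackableId}, alignment: {plane.alignment}");
        }
    }
}
```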

Categories
Android, App Development, AR / VR

AR Foundation Fundamentals with Unity (Part 1)

When developing mobile Augmented Reality apps, you usually want to target both Android and iOS phones. AR Foundation is Unity’s approach to provide a common layer, which unifies both Google’s ARCore and Apple’s ARKit. As such, it is the recommended way to build AR apps with Unity.

However, few examples and instructions are available. This article provides a thorough, step-by-step guide for getting started with AR Foundation. The full source code is available on GitHub.

AR Foundation Architecture and AR SDKs

To work with AR Foundation, you first have to understand its structure. The top layer of its modular design doesn’t hide everything else. Sometimes, the platform-dependent layers and their respective capabilities shine through, and you must consider these as well.

AR Foundation is a highly modular system. At the bottom, individual provider plug-ins contain the glue to the platform-specific native AR functionality (ARCore and ARKit). On top of that, the XR Subsystems provide different functionalities through a platform-agnostic interface.
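
One place where the platform-dependent layer shines through is session availability: before starting the session, you can ask the underlying provider (ARCore / ARKit) whether AR is supported and installed on the device at all. A minimal sketch, assuming the ARSession reference is assigned in the Inspector:

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.XR.ARFoundation;

public class SessionAvailabilityCheck : MonoBehaviour
{
    [SerializeField] private ARSession session;

    private IEnumerator Start()
    {
        // Asks the platform-specific provider (ARCore / ARKit) whether AR is available.
        yield return ARSession.CheckAvailability();

        if (ARSession.state == ARSessionState.Unsupported)
        {
            Debug.Log("AR is not supported on this device.");
        }
        else if (ARSession.state == ARSessionState.NeedsInstall)
        {
            // On Android, ARCore might have to be installed or updated first.
            yield return ARSession.Install();
        }

        session.enabled = true;
    }
}
```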

Categories
Android, App Development, AR / VR, Digital Healthcare

Hit Test & Augmented Reality Anchors with Amazon Sumerian (Part 3)

In an Augmented Reality scene, users look at the live camera feed. Virtual objects anchor at specific positions in the real world. Our task is to let the user place virtual objects in the real world. To achieve that, the user simply taps on the smartphone screen. Through a hit test, our script then creates an anchor in the real world and links that to a virtual 3D model entity.

That’s the high-level overview. To code this anchoring logic, a few intermediate steps are needed:

  1. Hit Test: converts the coordinates of the user’s screen tap to normalized coordinates and sends them to the AR system, which checks what’s in the real world at that position.
  2. Register Anchor: next, our script instructs the AR system to create an anchor at that position.
  3. Link Anchor: finally, the ID of the created anchor is linked to our entity. This allows Sumerian to continually update the transform of our 3D entity. Thus, the object stays in place in the real world, even when the user moves around.

Transforming these steps into code, this is what our architecture looks like. It includes three callbacks, starting with the touch event and ending with the registered and linked AR anchor.

Categories
Android, App Development, AR / VR, Digital Healthcare

Augmented Reality Anchors and Amazon Sumerian’s ArAnchorComponent (Part 2)

The WebXR standard isn’t finished yet. How does the web-based Amazon Sumerian platform integrate with the real world for Augmented Reality? We’ll take a look at the glue that binds the 3D WebGL contents from the web view to the native AR platform (ARCore / ARKit). To access this, we will also look at Sumerian internal engine classes like ArAnchorComponent, which handle the cross-platform web-to-native mapping.

This article continues from part 1, which covered the scripting basics of Amazon Sumerian and prepared the scene for AR placement.

Anchors in Amazon Sumerian

Let’s start with a bit of background on how Sumerian handles AR.

Ultimately, a 3D model is placed in the user’s real environment using an “Anchor”. This is directly represented in Sumerian. To create an anchor in your scene, your code goes through the following steps:

Categories
Android, App Development, AR / VR, Digital Healthcare

Augmented Reality Object Placement with Amazon Sumerian (Part 1)

How do you (re)position virtual objects in the real world in an Augmented Reality experience while still keeping the scene interactive? Elegantly guide your users through the placement process.

The official AR tutorial from Amazon contains a simple script: by tapping anywhere in the scene, it instantly moves the objects to that position. However, for the Digital Healthcare Explained app, I needed a more flexible behavior:

  1. Activate placement mode by tapping on a specific object in the 3D scene. In this case, I decided that tapping the host avatar triggers placement mode.
  2. The host then explains what to do: tapping on another surface moves the host and related objects. The Sumerian hosts are ideal for guiding users through this process.
  3. The user taps on a real-world surface in the AR scene.
  4. Next, the scene contents move, the anchor updates and the host confirms.

New ES6 Based Scripting

Additionally, Amazon Sumerian is evolving its scripting language. A major upgrade to ES6 is underway. It’s fully based on classes and fits better into the actions and state machines used elsewhere in Sumerian. The new APIs are still marked as “Preview”, while the old APIs are already called “Legacy” or “Old Script Format”.

While documentation for the new Sumerian Engine APIs is already available, it’s very brief and doesn’t contain many examples. The official tutorials are still based on the legacy API.

I decided to rewrite the script using the new APIs. It involves calling a lot of internal parts of Sumerian, which makes it considerably more complex than the other examples for the new API currently out there. However, it’s interesting to dig deeper into the internals of how a modern, web-based AR environment works.

Categories
App Development, AR / VR, Digital Healthcare

Create engaging Healthcare Experiences with Augmented Reality

Download the hands-on workshop slides and material for a complete getting-started guide to your first 3D experience, aimed at readers with a background in digital healthcare!

Conference Session

The build.well.being conference is an annual networking event for the doers in Digital Healthcare. The fast-paced event compresses a lot of useful information into a short day: sessions from health professionals (including a keynote from Brian Anthony, associate director at MIT.nano), student project pitches, and hands-on workshops.

Amazon Sumerian Workshop @ build.well.being, © FH St. Pölten | Tobias Sautner

Together with Anna Runefelt, I ran a challenging workshop: introducing attendees with a healthcare background to the world of Augmented / Virtual Reality. The aim of the hands-on workshop was to create your first live 3D experience in about one hour. This was possible thanks to Amazon Sumerian’s easy-to-use interface. Most of the attendees who followed along indeed managed to get a fully working 3D scene running on their laptops.