Augmented Reality Anchors and Amazon Sumerian's ArAnchorComponent (Part 2)

The WebXR standard isn’t finished yet. How does the web-based Amazon Sumerian platform integrate with the real world for Augmented Reality? We’ll take a look at the glue that binds the 3D WebGL contents from the web view to the native AR platform (ARCore / ARKit). To access this, we will also look at Sumerian internal engine classes like ArAnchorComponent, which handle the cross-platform web-to-native mapping.

This article continues from part 1, which covered the scripting basics of Amazon Sumerian and prepared the scene for AR placement.

Anchors in Amazon Sumerian

Let’s start with a bit of background on how Sumerian handles AR.

Ultimately, a 3D model is placed in the user’s real environment using an “Anchor”. This is directly represented in Sumerian. To create an anchor in your scene, your code goes through the following steps:

Create Anchor component: Create an ArAnchorComponent() and attach it to an entity in your scene.
Hit test: Perform a hit test through the native AR system of the device. This connects the virtual 3D model to a physical place in the world. A hit test answers the question: the user tapped on the screen – what is in the real world at that position? And what are its 3D coordinates, represented by a transformation?
Register anchor: Register the anchor with the AR system of the device. For this, supply the transform (position and orientation) from the previous step.
Anchor ID: The device’s AR system returns an anchor ID. Register this with Sumerian’s ArAnchorComponent(). From now on, the engine takes care of updating the entity’s transform every frame, according to the user’s movement in the world.
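
The four steps can be sketched as follows. This is only a minimal illustration that runs outside Sumerian: FakeArSystem, FakeAnchorComponent and the hitTest() / registerAnchor() signatures are hypothetical stand-ins for the device's AR system, and the returned transform and anchor ID are made-up values.

```javascript
// Hypothetical stand-in for the native AR system (ARCore / ARKit).
// Method names and callback shapes are assumptions for illustration only.
class FakeArSystem {
  constructor() {
    this.nextAnchorId = 1;
  }
  // Step 2: answers "what real-world point lies under this screen position?"
  hitTest(screenX, screenY, callback) {
    // Column-major 4x4 matrix: identity rotation, moved 1 m along negative z.
    const transform = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, -1, 1];
    callback(transform);
  }
  // Step 3: registers an anchor at the given transform and reports its ID.
  registerAnchor(transform, callback) {
    callback('anchor-' + this.nextAnchorId++);
  }
}

// Step 1: simplified anchor component that would live on a scene entity.
class FakeAnchorComponent {
  constructor() {
    this.anchorId = null;
  }
}

function placeAnchor(arSystem, anchorComponent, screenX, screenY) {
  arSystem.hitTest(screenX, screenY, (transform) => {
    if (!transform) return; // nothing was hit, so there is nothing to anchor
    arSystem.registerAnchor(transform, (anchorId) => {
      // Step 4: store the ID so the engine can keep the entity up to date.
      anchorComponent.anchorId = anchorId;
    });
  });
}

const component = new FakeAnchorComponent();
placeAnchor(new FakeArSystem(), component, 0.5, 0.5);
console.log(component.anchorId); // ID handed back by the fake AR system
```

In the real scene, both calls are asynchronous because they cross the boundary between the web view and the native app. That is why the sketch uses call-backs instead of return values.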

Sumerian & ARCore / ARKit

On a mobile device, Amazon Sumerian renders the scene contents using a web view. A project template provides the glue from the web-based scene to the phone’s AR system. It uses ARCore / ARKit to track the environment and to update the anchors. These changes are then sent to the JavaScript logic of Sumerian running in the web view. This is necessary as mobile browsers don’t fully support WebXR yet.

In the ARCore starter app template, the SumerianConnector.java file contains the relevant code to handle anchors.

The registerAnchor() @JavascriptInterface function calls createAnchor() from Google’s ARCore Session. Then, the Java code triggers a JavaScript call-back. This provides the request ID as well as the hash code of the ARCore anchor.

The code in the Amazon Sumerian ARCore starter app that creates an ARCore anchor when requested by the JavaScript code from Sumerian. It then uses a call-back to JavaScript to send the anchor ID (hash code).

In the same SumerianConnector class, update() is called once per frame. It applies the ARCore camera transform to the camera in the Sumerian world. Additionally, it loops over all currently tracking anchors in ARCore. Their transforms are then sent to the JavaScript Sumerian scene through: ARCoreBridge.anchorTransformUpdate().

To summarize, an ARCoreBridge JavaScript class handles the communication between our Amazon Sumerian scripts and the native Android app which interfaces with ARCore. Whenever you create an anchor in Sumerian, that request gets forwarded to ARCore – and its response gets back to JavaScript together with some data.
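
To make this round trip more tangible, here is a hedged sketch of the pattern in plain JavaScript. SketchBridge, fakeNative and registerAnchorResponse() are invented names for this illustration and are not copied from the Sumerian source. The sketch only shows the idea of matching a response to its request through a request ID and of receiving per-frame transform updates.

```javascript
// Hypothetical JavaScript-side bridge; names loosely follow the description
// in the text but are NOT taken from the Sumerian source code.
class SketchBridge {
  constructor(nativeSide) {
    this.nativeSide = nativeSide;
    this.pending = new Map(); // requestId -> call-back waiting for an anchor ID
    this.anchors = new Map(); // anchorId -> latest transform from native code
    this.nextRequestId = 0;
  }
  // Called by scene scripts: forward the request to the native side.
  registerAnchor(transform, callback) {
    const requestId = String(this.nextRequestId++);
    this.pending.set(requestId, callback);
    this.nativeSide.registerAnchor(requestId, transform);
  }
  // Called by native code once the anchor exists.
  registerAnchorResponse(requestId, anchorId) {
    const callback = this.pending.get(requestId);
    this.pending.delete(requestId);
    if (callback) callback(anchorId);
  }
  // Called by native code every frame for each tracked anchor.
  anchorTransformUpdate(anchorId, transform) {
    this.anchors.set(anchorId, transform);
  }
}

// Simulated native side (in the real app, this is Java code calling ARCore).
const fakeNative = {
  bridge: null,
  registerAnchor(requestId, transform) {
    const anchorId = 'hash-' + requestId; // stands in for the anchor's hash code
    this.bridge.registerAnchorResponse(requestId, anchorId);
    this.bridge.anchorTransformUpdate(anchorId, transform);
  },
};

const bridge = new SketchBridge(fakeNative);
fakeNative.bridge = bridge;

let receivedId = null;
bridge.registerAnchor([0, 0, -1], (id) => { receivedId = id; });
console.log(receivedId, bridge.anchors.get(receivedId));
```

The request ID is what allows several anchor requests to be in flight at the same time without mixing up their responses.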

A similar process is performed for requesting a hit test. Again, your Sumerian code interfaces with the native ARCore code, which actually handles the hit test and sends the response back to your code.

ArSystem & ARCoreBridge in Sumerian

If you look closely at the Java code above, you will notice that it actually calls functions from an ARCoreBridge JavaScript class. Also, you will later see that the functions we call from the Sumerian JavaScript API sometimes have slightly different names than the ones in the Java code. So there is a bridge class involved that takes care of gluing the native code to our custom Sumerian JavaScript.

These components are part of Amazon Sumerian. What you can do with ArSystem is listed in the documentation. It forwards hit tests, anchors, lighting information, image targets as well as transforms between the native system and your code.

Deep Dive into Amazon Sumerian Engine Code

In case you’re interested and would like to dive deeper: you can take a look at Sumerian’s actual code. Since we’re using JavaScript, the browser obviously loads the ArSystem class as part of Sumerian.

While you have the Sumerian Console open, open the Chrome DevTools. Go to search (Ctrl + Shift + F). Search for “registerAnchor”: this is the method we’d like to find. This shows the ArSystem.js class as part of the results. Click to see its code. As the license notes at the top reveal, this part of the engine is licensed under MIT, so it’s actually open source – great!

The registerAnchor() JavaScript function in Amazon Sumerian’s ArSystem.

This short class provides the missing link between our code and the native code. For example, in Sumerian’s ArSystem, we call registerAnchor(). The ArSystem class then forwards this to this._delegate.registerAnchor() through a delegate, which ultimately results in a call into the native Java / Swift code. The delegate implementation differs between ARCore and ARKit, depending on which platform the app runs on.
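
The delegate idea can be sketched in a few lines. This is a simplified illustration and not the engine code: SketchArSystem, the two delegate objects and createArSystem() are hypothetical, and how Sumerian actually selects its delegate is not shown here.

```javascript
// Simplified sketch of a delegating system. The real ArSystem offers more
// methods; the delegate objects below are invented stand-ins.
class SketchArSystem {
  constructor(delegate) {
    this._delegate = delegate;
  }
  registerAnchor(transform, callback) {
    // Scene scripts call this method; the platform-specific work is delegated.
    this._delegate.registerAnchor(transform, callback);
  }
}

const arCoreDelegate = {
  registerAnchor(transform, callback) { callback('arcore-anchor'); },
};
const arKitDelegate = {
  registerAnchor(transform, callback) { callback('arkit-anchor'); },
};

// Hypothetical platform switch, for illustration only.
function createArSystem(platform) {
  return new SketchArSystem(platform === 'ios' ? arKitDelegate : arCoreDelegate);
}

const ids = [];
createArSystem('android').registerAnchor([], (id) => ids.push(id));
createArSystem('ios').registerAnchor([], (id) => ids.push(id));
console.log(ids);
```

The benefit of this design is that our scene script always calls the same registerAnchor() method, no matter which phone it runs on.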

Software Architecture for AR Anchors

Now that we’ve seen how Sumerian handles AR behind the scenes, let’s get back to our Sumerian scene and walk through the necessary steps to create an anchor.

Our AnchorPositioning script will interface with several other entities in the Sumerian scene. For example, the status of the placement mode has to be global, so that other entities can react to it as well. In addition, our script will interface with the AR system of Sumerian as well as other components of the world (= the entire scene).

This diagram shows the software architecture of our script, as well as its connections to other parts. We will add them in this part of the tutorial. These steps conclude the preparation for the actual AR raycast and the AR anchor creation.

Software architecture of how the AnchorPositioning script interfaces with the other components of Amazon Sumerian.

Step 1: Setting Up Global Communication

First of all, we initialize the global communication between the script and the rest of the scene (= the world).

A global state should be available so that all other entities in the scene can easily react to whether the user is currently placing the scene or not. We call this variable placeMode, which can be true / false. Other components can then for example ignore interactions while this variable is set to true.

The world our script lives in is accessed through ctx.world. That class for example allows searching for other entities in the scene. We’re interested in its value function. You can give a global value a (string) name, get and set the value, as well as subscribe to updates for it (monitor). Sounds just like what we need. Here is the complete code of our script so far:

The complete script code so far. Here, we’re writing the first lines of the start(ctx) function.
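
A minimal sketch of these first lines of start(ctx) could look as follows. To keep it runnable outside Sumerian, createFakeWorld() is a hypothetical stand-in for the engine's world. Its get / set / monitor and emit / listen methods are modelled on the description above and are assumptions, not the documented API signatures. The property name worldPlacedEvt is also just an illustrative choice. In a real script, the class would derive from Sumerian's script base class and ctx would be supplied by the engine.

```javascript
// Hypothetical stand-in for the Sumerian world so the sketch runs anywhere.
function createFakeWorld() {
  const values = new Map();
  const events = new Map();
  return {
    // Named global value with get / set / monitor (assumed shape).
    value(name) {
      if (!values.has(name)) {
        let current;
        const monitors = [];
        values.set(name, {
          get: () => current,
          set: (v) => { current = v; monitors.forEach((fn) => fn(v)); },
          monitor: (fn) => monitors.push(fn),
        });
      }
      return values.get(name);
    },
    // Named world event with emit / listen (assumed shape).
    event(name) {
      if (!events.has(name)) {
        const listeners = [];
        events.set(name, {
          emit: (data) => listeners.forEach((fn) => fn(data)),
          listen: (fn) => listeners.push(fn),
        });
      }
      return events.get(name);
    },
  };
}

class AnchorPositioning {
  start(ctx) {
    // Global state: is the user currently placing the scene?
    this.placeMode = ctx.world.value('placeMode');
    // Event our script emits once placement has finished.
    this.worldPlacedEvt = ctx.world.event('WorldPlaced');
  }
}

const ctx = { world: createFakeWorld() };
const script = new AnchorPositioning();
script.start(ctx);

// Another entity reacts to changes of the shared value.
const seen = [];
ctx.world.value('placeMode').monitor((v) => seen.push(v));
script.placeMode.set(true);
script.placeMode.set(false);
console.log(seen, script.placeMode.get());
```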

As you can see, we get the reference to this world value and store it in this.placeMode. This essentially stores it as a property of our script class instance. It makes it easier to update this value from various places of our code, including from callbacks.

In addition, we also save a reference to the world event called WorldPlaced. Our script will emit this event when the user has finished placing the world.

So essentially, other scripts can monitor placement mode in two ways, depending on what is more convenient to them:

By checking the state of the world.placeMode variable (true / false).
By listening to the StartPlaceWorld and WorldPlaced world events.
(The first event is sent by another part of the scene, and our script will listen for it. We’ll come to that shortly.)

Step 2: Anchor Entity

In part 1 of this article series, we already set up an entity in the scene that owns the AR anchor. We called the reference property in our AnchorPositioning script anchorEntity. The property automatically shows up in the Sumerian web editor.

Every class property is automatically accessible through the this keyword. In case the developer assigned an entity in the editor, everything’s done and we can directly work with the variable.

However, if no anchor entity was defined in the editor, this.anchorEntity will be undefined / null. In this case, we want to use the entity this script component is attached to, which is a self-reference. Through ctx.entity, we ensure that a reference to our ‘own’ entity is kept for the rest of the script.

Append the following code in the start(ctx) method:

Check if the user assigned an anchor entity through the Sumerian web editor. If not, use a self-reference to the entity that owns our script component.
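
As a hedged sketch, the fallback boils down to a single check. The plain objects below are hypothetical stand-ins for Sumerian entities, so that both cases can be tried outside the engine.

```javascript
class AnchorPositioning {
  start(ctx) {
    // Use the entity assigned in the editor, or fall back to our own entity.
    if (!this.anchorEntity) {
      this.anchorEntity = ctx.entity;
    }
  }
}

// Hypothetical plain objects standing in for Sumerian entities.
const ownEntity = { name: 'ScriptOwner' };
const assignedEntity = { name: 'AssignedInEditor' };

// Case 1: nothing assigned in the editor.
const withoutAssignment = new AnchorPositioning();
withoutAssignment.start({ entity: ownEntity });

// Case 2: the editor injected an entity into the property.
const withAssignment = new AnchorPositioning();
withAssignment.anchorEntity = assignedEntity;
withAssignment.start({ entity: ownEntity });

console.log(withoutAssignment.anchorEntity.name, withAssignment.anchorEntity.name);
```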

Step 3: World Reference for HTML DOM Events

The next initialization step: retrieve the Sumerian World reference. This represents the loaded scene. We need it to gain wider access to the environment.

The Sumerian World has a public and an internal version. If it’s internal, that doesn’t mean it’s not accessible to you. It only means that it’s a lower-level API. The Sumerian engineers don’t think you will need it too often. We do dive deeper into Sumerian, though. As such, we also imported the classes from the internal module under the name si at the beginning of the script.

As we’re pro developers doing low-level code, of course we need the internal variant. The way to access it is simple once you have seen it. First, you get the public world reference through an entity. Then, the internal version of the World class has a static method called forPublicWorld(). It returns the internal variant of the public world. Easy, right?

We just store it in this.internalWorld for further use as an instance property:

Get a reference to the Amazon Sumerian engine internal version of the World.
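
The lookup can be sketched like this. The si object below is a hypothetical stand-in for the internal module, so the snippet runs on its own. Only forPublicWorld() is named in the text; the WeakMap-based mapping is an invented detail for this illustration.

```javascript
// Hypothetical stand-in for the internal module that the script imports as `si`.
const internalByPublic = new WeakMap();
const si = {
  World: {
    // Maps a public world to its internal counterpart (invented implementation).
    forPublicWorld(publicWorld) {
      return internalByPublic.get(publicWorld);
    },
  },
};

// Fake public / internal world pair.
const publicWorld = { kind: 'public' };
const internalWorld = { kind: 'internal' };
internalByPublic.set(publicWorld, internalWorld);

class AnchorPositioning {
  start(ctx) {
    // Public world through the entity, then switch to the internal variant.
    this.internalWorld = si.World.forPublicWorld(ctx.entity.world);
  }
}

const script = new AnchorPositioning();
script.start({ entity: { world: publicWorld } });
console.log(script.internalWorld.kind);
```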

Step 4: Check for AR Support

Next, we check if the platform our script is running on has AR support. The ArSystem class of Amazon Sumerian is only available on ARCore / ARKit enabled Android / iOS phones. If we don’t get a reference to this class, the script stops further initialization.

ArSystem is part of the internal Amazon Sumerian APIs. Now, we use si to retrieve the internal ArSystem class from the Sumerian engine. Again, append this code snippet within the start(ctx) method:

Check for AR platform support with the internal ArSystem class of Amazon Sumerian.
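
A hedged sketch of this check is shown below. The name-based getSystem('ArSystem') lookup is an assumption about how the internal world exposes its systems, and createInternalWorld() as well as the initialized flag are hypothetical helpers, so that both outcomes can be tried without a phone.

```javascript
// Hypothetical internal world with a name-based system registry.
function createInternalWorld(systems) {
  return {
    getSystem(name) {
      return systems[name] || null;
    },
  };
}

class AnchorPositioning {
  constructor(internalWorld) {
    this.internalWorld = internalWorld; // retrieved in step 3
    this.initialized = false;
  }
  start(ctx) {
    // Assumed lookup call; without AR support, no system is returned.
    this.arSystem = this.internalWorld.getSystem('ArSystem');
    if (!this.arSystem) {
      // No AR support (for example, a desktop browser): stop initializing.
      return;
    }
    this.initialized = true;
  }
}

const onDesktop = new AnchorPositioning(createInternalWorld({}));
onDesktop.start({});

const onPhone = new AnchorPositioning(createInternalWorld({ ArSystem: {} }));
onPhone.start({});

console.log(onDesktop.initialized, onPhone.initialized);
```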

Step 5: Create & Attach Sumerian ArAnchorComponent

After the general setup, the next code snippet actually modifies our Amazon Sumerian scene at run-time. We know that the device supports AR. Also, we ensured that we have a valid reference to an entity which will own the anchor in our scene.

Now, create an instance of the ArAnchorComponent. It manages the relationship between the platform’s AR system and our Amazon Sumerian scene. This component is also part of Sumerian’s internal API.

Attach the AR Anchor component to the entity in Amazon Sumerian

What’s tricky is what follows next. The ArAnchorComponent must be added to the entity in the scene. The normal engine API doesn’t provide the function to do so.

Similar to the Sumerian World, the Sumerian Entity has two versions: public and internal. We already have a reference to the public entity (see step 2). The internal Entity class has a function that returns the internal Entity object based on a public Entity: si.Entity.forPublicEntity().

Store this internal anchor entity through a new this.internalAnchorEntity instance property. The script will need it again later within a call-back to assign the anchor ID.

Additionally, the internal entity allows adding a component. Thus, we attach the newly created arAnchorComponent to the Entity.
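
Put together, this step could be sketched as follows. The si object is again a hypothetical stand-in so the snippet runs outside Sumerian. ArAnchorComponent and forPublicEntity() follow the names used in the text, while setComponent() is an assumed name for the call that adds a component, and the anchorId field is invented for this illustration.

```javascript
// Hypothetical stand-ins for the internal module (`si`).
class FakeInternalEntity {
  constructor() {
    this.components = [];
  }
  // Assumed name for the "add a component" call.
  setComponent(component) {
    this.components.push(component);
  }
}

const internalEntities = new WeakMap();
const si = {
  ArAnchorComponent: class {
    constructor() {
      this.anchorId = null; // filled in later, once the AR system reports an ID
    }
  },
  Entity: {
    forPublicEntity(publicEntity) {
      if (!internalEntities.has(publicEntity)) {
        internalEntities.set(publicEntity, new FakeInternalEntity());
      }
      return internalEntities.get(publicEntity);
    },
  },
};

class AnchorPositioning {
  start(ctx) {
    this.anchorEntity = this.anchorEntity || ctx.entity;
    const arAnchorComponent = new si.ArAnchorComponent();
    // Keep the internal entity: a later call-back needs it to assign the anchor ID.
    this.internalAnchorEntity = si.Entity.forPublicEntity(this.anchorEntity);
    this.internalAnchorEntity.setComponent(arAnchorComponent);
  }
}

const script = new AnchorPositioning();
script.start({ entity: { name: 'Anchor' } });
console.log(script.internalAnchorEntity.components.length);
```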

Setup is Done!

In this part, we went deep into the Amazon Sumerian engine. We set up the script and got references to public and internal Sumerian objects. Also, we analyzed how native ARCore / ARKit actually interface with the web-based 3D scene that is developed in JavaScript.

That’s all the required preparation for actually handling user interactions in the next part. This includes performing a raycast from the touch position, setting an anchor and finally linking that to our anchorEntity.