Environmental HDR Lighting & Reflections in ARCore: Human Perception (Part 1)

Realistically merging virtual objects with the real world in Augmented Reality poses a few challenges. The most important are:

  1. Realistic positioning, scale and rotation
  2. Lighting and shadows that match the real-world illumination
  3. Occlusion with real-world objects

The first works very well in today’s AR systems. The third, occlusion, works reasonably well on the Microsoft HoloLens and will soon also come to ARCore (a private preview is currently running through the ARCore Depth API, which is probably based on the research by Flynn et al.).

But what about the second item? Google put a lot of effort into this recently. So, let’s look behind the scenes. How does ARCore estimate HDR (high dynamic range) lighting and reflections from the camera image?

Remember that ARCore needs to scale to a variety of smartphones; thus, a requirement is that it also works on phones that only have a single RGB camera – like the Google Pixel 2.
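
From a developer’s point of view, this light estimation is opt-in: you request it through the session configuration. As a minimal sketch – assuming an ARCore Session has already been created and its lifecycle is handled elsewhere – enabling the Environmental HDR mode looks roughly like this:

import com.google.ar.core.Config
import com.google.ar.core.Session

// Minimal sketch: enable Environmental HDR light estimation on an existing
// ARCore session. Session creation and lifecycle handling are omitted here.
fun enableEnvironmentalHdr(session: Session) {
    val config = Config(session)
    // Estimates the main directional light, ambient spherical harmonics and
    // an HDR reflection cubemap from the camera stream.
    config.lightEstimationMode = Config.LightEstimationMode.ENVIRONMENTAL_HDR
    session.configure(config)
}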

Light Estimation

The goal is simple: realistic HDR lighting for virtual objects. Ideally, this should work from low dynamic range source images – as the real-time camera stream that feeds into smartphone-based AR systems can’t capture HDR lighting. The less source material the algorithm requires to predict the lighting, the better; a single frame would of course be ideal. Is that possible?

In the publication Learning to predict indoor illumination from a single image, Gardner et al. showed impressive results: they estimate the light sources from a normal photo and apply a similar lighting situation to the virtual objects. This affects both the location of the lights and their intensity. The underlying algorithm is based on a deep convolutional neural network.

Sample results by Gardner et al.: the left image shows the original photo; the right image contains an inserted virtual object with realistic lighting, estimated from the standard color photo.

Lighting Cues and Concepts

The Google AR developer documentation highlights several important properties that need to be correctly simulated to blend virtual and real objects:

  • Specular highlights: the shiny spots on surfaces that move with the viewer
  • Shadows: indicate the scene layout and where the lights come from
  • Shading: how surface orientation and incoming light determine the reflected brightness (see the sketch below)
  • Reflection: colors of the surroundings mirrored on glossy surfaces
Key properties of lighting cues visible in an image. Source image by Google.
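
To make these cues a bit more tangible, here is a small illustrative sketch (not ARCore code; all names and parameters are only examples) of how a basic Lambert diffuse term plus a Blinn-Phong specular term turn surface orientation, light direction and viewing direction into the brightness of a single surface point:

import kotlin.math.max
import kotlin.math.pow
import kotlin.math.sqrt

// Illustrative sketch only: a basic Lambert + Blinn-Phong shading model for a
// single surface point. Vectors are normalized, 3-component float arrays.
fun shade(
    normal: FloatArray,    // surface normal
    lightDir: FloatArray,  // direction towards the light
    viewDir: FloatArray,   // direction towards the viewer
    albedo: Float,         // diffuse surface reflectance
    shininess: Float       // specular exponent (higher = sharper highlight)
): Float {
    fun dot(a: FloatArray, b: FloatArray) = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    // Shading: the diffuse term depends on the surface orientation relative to the light.
    val diffuse = albedo * max(0f, dot(normal, lightDir))

    // Specular highlight: depends on the viewer via the half vector, which is
    // why highlights move across the surface when the camera moves.
    val half = floatArrayOf(
        lightDir[0] + viewDir[0],
        lightDir[1] + viewDir[1],
        lightDir[2] + viewDir[2]
    )
    val len = sqrt(dot(half, half))
    val halfN = floatArrayOf(half[0] / len, half[1] / len, half[2] / len)
    val specular = max(0f, dot(normal, halfN)).pow(shininess)

    return diffuse + specular
}

Shadows and reflections are not part of such a local model; they depend on the scene layout and the surrounding environment – which is exactly why an AR engine has to estimate them from the camera image.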

All these properties directly influence the color and brightness of each pixel in an image. On the one hand, the AR engine needs to work with these inputs to estimate the light and material properties. On the other hand, similar settings then have to be applied to the virtual objects.
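
With the Environmental HDR mode enabled, ARCore exposes these quantities per frame through its light estimate: a main directional light for highlights and shadows, ambient spherical harmonics for diffuse shading, and a cubemap for reflections. A hedged sketch of reading them out (rendering itself is omitted):

import com.google.ar.core.Frame
import com.google.ar.core.LightEstimate

// Sketch: read ARCore's per-frame Environmental HDR light estimate so the same
// lighting can be applied to the virtual objects. Rendering itself is omitted.
fun readHdrLightEstimate(frame: Frame) {
    val estimate = frame.lightEstimate
    if (estimate.state != LightEstimate.State.VALID) return

    // Dominant directional light: drives specular highlights and shadow direction.
    val mainLightDirection = estimate.environmentalHdrMainLightDirection  // float[3]
    val mainLightIntensity = estimate.environmentalHdrMainLightIntensity  // linear RGB

    // Ambient spherical harmonics (3 color channels x 9 coefficients): diffuse shading.
    val ambientSh = estimate.environmentalHdrAmbientSphericalHarmonics    // float[27]

    // Six cubemap faces for reflections on glossy materials; each must be closed.
    val cubeMapFaces = estimate.acquireEnvironmentalHdrCubeMap()
    try {
        // ... upload the faces to a reflection probe / environment map ...
    } finally {
        cubeMapFaces.forEach { it.close() }
    }
}

The cubemap is what part 3 of this series will visualize in Unity.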

Human Perception & Lighting

A key fact to keep in mind is that we as humans usually only perceive the light field indirectly. We only see how objects appear and use our experience to infer the light sources.

The question is: how good are we at this? Several studies have tried to find out. te Pas et al. asked humans to judge images of two spheres and answer three questions:

  1. Are they the same material?
  2. Are they illuminated the same way?
  3. Is illumination or material the same?

The spheres were either photographs or computer-generated images. The following is an example of what users had to judge in the experiment:

Sample image that users saw in the study – are material and illumination the same? Image by te Pas et al.

In the image above, both spheres are real photographs, not computer-generated. However, they differ in both illumination and material. Not easy to judge, right? It gets a bit clearer once you look at the images of the test set with a little more context, as seen here:

Sample images from one of the test sets used by te Pas et al.: photographs of real, smooth spheres that differ in material and / or illumination.

What were the results of the study? The authors recommend the following:

  • To make correct material perception possible, include higher-order aspects of the light field and apply a realistic 3D texture (meso-scale texture).
  • To make correct perception of the light field possible, put emphasis on the realism of its global properties (in particular the mean light direction and diffuseness).

Article Series

How is it possible to computationally perceive the light field of a scene? This will be covered in the second part of the article series. Finally, in the third part, I’ll show an example of how you can visualize ARCore’s reflection map in a Unity scene.

  1. Human Perception (Part 1)
  2. Virtual Lighting (Part 2)
  3. Real-World Reflections in Unity 3D (Part 3)

Bibliography

[1] J. Flynn et al., “DeepView: View Synthesis With Learned Gradient Descent,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 2367–2376. Accessed: Nov. 18, 2020. [Online]. Available: https://openaccess.thecvf.com/content_CVPR_2019/html/Flynn_DeepView_View_Synthesis_With_Learned_Gradient_Descent_CVPR_2019_paper.html
[2] S. F. te Pas and S. C. Pont, “A comparison of material and illumination discrimination performance for real rough, real smooth and computer generated smooth spheres,” in Proceedings of the 2nd Symposium on Applied Perception in Graphics and Visualization (APGV ’05), A Coruña, Spain: Association for Computing Machinery, Aug. 2005, pp. 75–81. doi: 10.1145/1080402.1080415.
[3] M.-A. Gardner et al., “Learning to predict indoor illumination from a single image,” ACM Trans. Graph., vol. 36, no. 6, pp. 176:1–176:14, Nov. 2017, doi: 10.1145/3130800.3130891.