Augmented Reality and the Web

Past, Present and Future


Blair MacIntyre
Principal Research Scientist, Mozilla

Professor, School of Interactive Computing, Georgia Tech
@blairmacintyre / bmacintyre@mozilla.com

Past

Background, Terminology, Technology

Present

Web and AR right now

Future

Where do we want to go, how can we get there?

But First, Why?



Media Mission Mashup

Preliminary Ideas, I am Looking for Feedback!

And Hopefully Involvement!

Past

What is AR?

(or MR or Holographic Computing?)

Mixing media with a
person's perception of the world
registered in 3D, in real-time

Other approaches to context-based media:

  • Heads-up-Displays (Glass)
  • Map mashups (Ingress, Pokemon GO)
  • Geofencing (alerts)

Ivan Sutherland, "The Ultimate Display", mid-1960s












Authoring Tools for AR












Again: Why the Web?

The "Current" Web is Not Quite There

Local Video: Latency, no camera intrinsics

Tracking: Overheads and Speed


W3C standards, AWE.js, various Web-for-Native plugins

AR Technology

System Setup

Ron Azuma. "A Survey of Augmented Reality" Presence (1997)

Can only display relative to what we
already know or can sense
about the world


Must display in real time (akin to VR)


Core problems are displays, sensing, and services for world knowledge.

Sense the world relative to display


Inside-out

(e.g., computer vision & depth sensing, object recognition)

Outside-in

(e.g., GPS, Vive Lighthouse)

Current Approaches to AR








Head Mounted Displays

(Perhaps HMDs will dominate when
they look like this, but that's a long way off)








Traditional (Handheld and Desktop) Displays











Projection Displays








Observations About Using AR

1: Intended Use

Tools vs Awareness












2: The "R" in "AR" Matters

AR depends on knowledge of the world

Can only augment what the system knows about


Knowledge (in apps) equals risk (for users and others)

3: Different ways to Leverage "R"

World as Massive Display Space (Structure)












3: Different ways to Leverage "R"

Recognize and/or Augment Specific Things (Semantics)












Present

This Presentation is Running in Argon4

on an iPhone

Using argon.js + reveal.js + aframe.js + argon-aframe.js

Stuff around the room












Computer vision AR w/ Vuforia












Planetary scale geographic AR












Custom Reality (Panorama)












Added Argon to reveal.js Demo

Add some new scripts up top


<script src="resources/js/aframe.js"> </script> 
<script src="resources/js/argon.js"> </script> 
<script src="resources/js/argon-aframe.js"> </script> 
						

Add a simple AFrame scene down below


<ar-scene> 
 ...
</ar-scene>
						

Adjust the CSS a bit, add some JavaScript and we're off...

Simple Declarative 3D AR Content


<a-box position="0 3 -10" radius="0.25" color="gold" 
       rotation="0 0 45">
  <a-animation attribute="rotation" from="0 0 45" to="0 360 45" 
	       dur="1000" easing="ease-in-out" 
	       repeat="indefinite">
  </a-animation>
</a-box>
						

A-Frame markup to create a spinning gold diamond

Geospatial frames of reference


<ar-geopose id="GT" lla="-84.394539 33.772501" userotation="false"> 
  <a-entity fixedsize="20" billboard>
    <a-plane rotation="0 0 0" width="2.9" height="4" src="#buzzpin" 
             transparent="true" ></a-plane>
    <a-entity css-object="div: #gtdiv" scale="0.02 0.02 0.02" 
        position="0 4 0" 
        showdistance="Tech Tower @ GT<br>Atlanta, GA, USA<br>It is ">
    </a-entity>
  </a-entity>
</ar-geopose>
						

A-Frame markup to put a pin at Georgia Tech
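To place a pin like this, a geospatial frame of reference ultimately has to be turned into Cartesian coordinates. The sketch below shows one common way to do that, converting WGS84 longitude/latitude/altitude into Earth-centered, Earth-fixed (ECEF) meters and measuring a straight-line distance of the kind a `showdistance` label might display. It is an illustration only, not how argon.js or argon-aframe.js does it; the function names `llaToEcef` and `distanceMeters` are hypothetical, and the argument order (longitude first) is assumed from the values in the `lla` attribute above.

```javascript
// WGS84 ellipsoid constants.
const WGS84_A = 6378137.0;                 // semi-major axis, meters
const WGS84_F = 1 / 298.257223563;         // flattening
const WGS84_E2 = WGS84_F * (2 - WGS84_F);  // first eccentricity squared

// Convert longitude/latitude (degrees) and altitude (meters) to ECEF meters.
// Hypothetical helper: longitude comes first to mirror the lla="lon lat" values.
function llaToEcef(lonDeg, latDeg, altMeters = 0) {
  const lon = (lonDeg * Math.PI) / 180;
  const lat = (latDeg * Math.PI) / 180;
  const sinLat = Math.sin(lat);
  const cosLat = Math.cos(lat);
  // Radius of curvature in the prime vertical at this latitude.
  const n = WGS84_A / Math.sqrt(1 - WGS84_E2 * sinLat * sinLat);
  return {
    x: (n + altMeters) * cosLat * Math.cos(lon),
    y: (n + altMeters) * cosLat * Math.sin(lon),
    z: (n * (1 - WGS84_E2) + altMeters) * sinLat,
  };
}

// Straight-line (chord) distance between two ECEF points, in meters.
function distanceMeters(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Usage: the Georgia Tech pin from the markup above, at an assumed altitude of 0.
const techTower = llaToEcef(-84.394539, 33.772501);
```

Given the viewer's own position in the same frame, `distanceMeters(viewer, techTower)` gives the number to append to the label. The chord distance is a reasonable stand-in for ground distance only at short range.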

Simple Vuforia Setup and Use


<ar-scene vuforiakey="#vuforiakey"
       vuforiadataset__stonesandchips="src:url(StonesAndChips.xml);">
  <a-assets>
    <a-asset-item id="vuforiakey" src="key.txt"></a-asset-item>
  </a-assets>

  <ar-frame id="frame" trackvisibility="true" visible="false"
            parent="vuforia.stonesandchips.stones">
		...
  </ar-frame>
</ar-scene>
						

A-Frame markup to put content on a visual target

Web Ecosystem is Rich and Diverse

Many tools, from the simple to the elaborate

Mashups may suggest new ways of creating 3D!

Twine












(twinery.org)

Thoughts on AR plus Web

1: There will be Apps, Not Just "Content"

(or something like apps)

Non-trivial content requires code and interaction

Security and privacy require sandboxing

2: Will want to run many apps at once

Current focus on individual apps w/ full control

As with 2D/3D before, users will want to mix and match

Mash up pages/apps, mix 2D/AR/VR content, mix displays and sensing

3: Decouple apps from "Reality"

Decoupling necessary for portability

Let app creators and users control "representation of reality"

Don't limit users to their local reality

4: AR is "just" a capability

There will still be apps, but not "AR apps"

AR is something you add to a system or app


Think about what AR is good for and how it fits into web ecosystem

Future

My Hope: Mixed Media Mashups

2D + 3D + AR + VR + mobile + desktop + HMDs + ...

Focus on User choice


Web for AR/VR
and
Web in AR/VR

Plans ...

Time Frame     Activity                                   Goals
now            "Web Now" (w/ argon.js)                    use cases; mashups w/ 2D, VR; web services
3-12 months    Mozilla AR app (mobile + desktop)          technical experiments; browser interfaces
1-2 years      W3C, etc. standards (WebRTC, WebVR, ...)
1-3 years      Firefox (and peers)                        WebAR
1-5 years      ecosystem, new devices                     HMDs, IoT, wearables, cars, ...

Going Beyond "Web Now"

Enabling Technology

Low-latency Video Processing Pipeline

Video frames into GPU/JS + vision processing

(WebRTC, workers, WebAssembly, Servo, etc)
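For context, here is roughly what getting video frames into JavaScript looks like with today's standard web APIs. This is a minimal sketch, not the pipeline proposed here: the names `rgbaToGray` and `startCameraLoop` are hypothetical, the frame size is an arbitrary assumption, and the camera loop only runs in a browser. The per-frame copy through a 2D canvas is one example of the kind of overhead a lower-latency pipeline would aim to avoid.

```javascript
// Pure step: reduce an RGBA pixel buffer to 8-bit luminance,
// a typical first stage before vision processing.
function rgbaToGray(rgba) {
  const gray = new Uint8ClampedArray(rgba.length / 4);
  for (let i = 0, j = 0; i < rgba.length; i += 4, j++) {
    // Rec. 601 luma weights; the alpha channel is ignored.
    gray[j] = 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
  }
  return gray;
}

// Browser-only: open the camera and hand grayscale frames to onFrame.
// Hypothetical helper; width and height are assumed processing sizes.
async function startCameraLoop(onFrame, width = 320, height = 240) {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { facingMode: "environment" }, // prefer the rear camera
  });
  const video = document.createElement("video");
  video.srcObject = stream;
  video.playsInline = true;
  await video.play();

  const canvas = document.createElement("canvas");
  canvas.width = width;
  canvas.height = height;
  const ctx = canvas.getContext("2d");

  function tick() {
    // Copy the current video frame to the canvas, then read it back.
    ctx.drawImage(video, 0, 0, width, height);
    const pixels = ctx.getImageData(0, 0, width, height).data;
    onFrame(rgbaToGray(pixels), width, height);
    requestAnimationFrame(tick);
  }
  requestAnimationFrame(tick);
}
```

In a page, `startCameraLoop((gray, w, h) => { /* tracking code */ })` would start the loop. Moving the grayscale step into a worker or WebAssembly changes where the work happens, but not the cost of reading the frame back from the canvas.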


Flesh out Needed Services

Already massive amount of content tied to the world
(Much of it lacks altitude, is inaccurate)

Camera Intrinsics database
(Vuforia, ARToolkit)
(eventually: W3C media-capture-depth, KHRONOS OpenKCam, ...)
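To illustrate why camera intrinsics matter: rendered content only lines up with the video if the virtual camera's projection matches the physical camera's focal length. The sketch below derives a vertical field of view and a WebGL-style projection matrix from pinhole intrinsics. It assumes the principal point sits at the image center and ignores lens distortion; the function names and the sample numbers are illustrative and do not come from Vuforia, ARToolkit, or any database.

```javascript
// Vertical field of view (radians) from the focal length in pixels (fy)
// and the image height in pixels, under a pinhole camera model.
function fovYFromIntrinsics(fy, imageHeight) {
  return 2 * Math.atan(imageHeight / (2 * fy));
}

// Column-major 4x4 perspective matrix (WebGL/OpenGL convention) built from
// intrinsics. Assumes the principal point is at the image center.
function projectionFromIntrinsics({ fx, fy, width, height, near, far }) {
  const m = new Float64Array(16); // all zeros initially
  m[0] = (2 * fx) / width;                 // x scale
  m[5] = (2 * fy) / height;                // y scale
  m[10] = (far + near) / (near - far);     // depth remap
  m[11] = -1;                              // perspective divide by -z
  m[14] = (2 * far * near) / (near - far); // depth offset
  return m;
}

// Usage with made-up intrinsics for a 640x480 video stream.
const sample = { fx: 500, fy: 500, width: 640, height: 480, near: 0.1, far: 1000 };
const sampleFovDegrees = (fovYFromIntrinsics(sample.fy, sample.height) * 180) / Math.PI;
const sampleProjection = projectionFromIntrinsics(sample);
```

Without real values for `fx` and `fy`, a web page has to guess the field of view, which shows up as virtual content sliding against the video as the camera turns.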


Need new services, for object or location-based search and discovery,
world knowledge, object recognition, ...

What Might a First AR Browser App Look Like?

How Might This Be Done Using Gecko (or Servo)?

Keep: multiple apps, custom realities

Add: live video access, custom device support (Tango, Hololens, etc)

Explore: Privacy issues

Nice Features

Efficiency

Privacy / Security

Clean Multi-app Integration

Outstanding Issues

Battery

Device position (GPS, indoor localization)

Camera intrinsics

Experiment with Browser Experience

Extend Windows+Tabs model: "webRTC+vision", enhanced WebVR

Embeddable AR/VR Gecko

AR/VR-centric: Argon, sort of. AR/VR-first?

Other: link to the world, object recognition-based, Prox

Experiment with Use Cases for Web-AR

AR at intersection of Web, context/IoT, and 3D/VR










Perhaps Aim for a Complete Experience:
A Real AR "Pokemon GO"-like Game?












Thanks!

Contact me via email: bmacintyre@mozilla.com

To try some of this yourself



https://blairmacintyre.github.io/hawaii-all-hands-2016/