Augmented Reality and the Web

Past, Present and Future

Blair MacIntyre
Principal Research Scientist, Mozilla

Professor, School of Interactive Computing, Georgia Tech
@blairmacintyre / bmacintyre@mozilla.com

Past

Background, Terminology, Technology

Present

Web and AR right now

Future

Where do we want to go, how can we get there?

But First, Why?

Media

Mission

Mashup

Preliminary Ideas, I am Looking for Feedback!

And Hopefully Involvement!

Past

What is AR?

(or MR or Holographic Computing?)

Mixing media with a
person's perception of the world
registered in 3D, in real-time

Other approaches to context-based media:

Heads-up-Displays (Glass)
Map mashups (Ingress, Pokemon GO)
Geofencing (alerts)

Which is not to say that there's anything wrong with a small, lightweight, heads-up display for giving continuous access to contextually relevant information. Until all aspects of AR technology improve dramatically, we aren't going to be able to do much more in something that could be worn all the time. Beyond wearable heads-up displays there are other ways we can deliver location- or context-based content to users, and while some call themselves AR, those are not what I'm focusing on here. When I talk about AR, I'm referring to the idea the NASA and Boeing folks had when they coined the term, namely mixing media with a person's perception of the world, registered in 3D. I don't limit AR to just visual media, or to specific display technologies. I say this not to start an argument about what is or isn't AR; rather, I simply want to clarify what I am thinking about when I use the term.

Ivan Sutherland, "The Ultimate Display", mid-1960s

One of the most exciting uses of AR is to create collaborative experiences, for both colocated and remote participants. While we've imagined these for years, it's finally becoming possible to actually deliver such experiences, as illustrated by the Hololens use of Skype for remote assistance. Many people have imagined or experimented with multiplayer games, and with using avatars in the place of remote participants in meetings or other experiences. At Georgia Tech, we've recently prototyped a classroom for CS education, using projection AR based on Microsoft Research's RoomAlive software, where we are building on what we know about studio-based pedagogy to improve CS education by continuously exposing the hidden work products students create as they learn to program. This idea of passively and continuously exposing hidden work to facilitate collaboration or education is very powerful, and may one day represent a major win for AR technologies.

Authoring Tools for AR

Again: Why the Web?

The "Current" Web is Not Quite There

Local Video: Latency, no camera intrinsics

Tracking: Overheads and Speed

W3C standards, AWE.js, various Web-for-Native plugins

AR Technology

System Setup

Ron Azuma. "A Survey of Augmented Reality" Presence (1997)

Can only display relative to what we
already know or can sense
about the world

Must display in real time (akin to VR)

Core problems are displays, sensing, and services for world knowledge.

Sense the world relative to display

Inside-out

(e.g., computer vision & depth sensing, object recognition)

Outside-in

(e.g., GPS, Vive Lighthouse)

Current Approaches to AR

Head Mounted Displays

When deciding on a platform, what matters most is what the user wants to do. First, the companies in the space have been trying to distinguish themselves from each other with terms that pretty much mean the same thing. Google used AR to mean heads-up displays, so Microsoft decided to use Holographic Computing to distinguish Hololens from Glass. And then Magic Leap and others decided to use Mixed Reality to emphasize that their displays do a better job of mixing graphics with the world than the traditional approach to see-through displays. I understand why they did this, from a marketing and branding perspective, and even agree with the distinctions each of these companies is making, but the downside is that there are a bunch of false distinctions and confusion when people try to talk about the technologies. To me, these terms are interchangeable.

Similarly, I think it's worth remembering that a lot of what is done in AR using different technologies is often done for expedience. There are good reasons for picking different technologies for different applications, and it's often the case that the wrong technologies were used for experiences in the past simply because they were the only ones available. For example, a lot of our work on AR games would have been much better suited to HMDs like Hololens, but the tech wasn't available. I've seen a lot of discussion where people propose AR experiences on new HMDs, contrast them with the work people have done in the past, and present that past work as flawed because the developers chose not to use HMDs.

(Perhaps HMDs will dominate when
they look like this, but that's a long way off)

Traditional (Handheld and Desktop) Displays

Projection Displays

Observations About Using AR

1: Intended Use

Tools vs Awareness

2: The "R" in "AR" Matters

AR depends on knowledge of the world

Can only augment what the system knows about

Knowledge (in apps) equals risk (for users and others)

3: Different ways to Leverage "R"

World as Massive Display Space (Structure)

Recognize and/or Augment Specific Things (Semantics)

Present

This Presentation is Running in Argon4

on an iPhone

Using argon.js + reveal.js + aframe.js + argon-aframe.js

Stuff around the room

Computer vision AR w/ Vuforia

Planetary scale geographic AR

Custom Reality (Panorama)

Added Argon to reveal.js Demo

Add some new scripts up top


<script src="resources/js/aframe.js"></script>
<script src="resources/js/argon.js"></script>
<script src="resources/js/argon-aframe.js"></script>

Add a simple A-Frame scene down below


<ar-scene> 
 ...
</ar-scene>

Adjust the CSS a bit, add some JavaScript and we're off...

Simple Declarative 3D AR Content


<!-- A box tilted 45 degrees about Z (so it reads as a diamond), spinning about Y -->
<a-box position="0 3 -10" color="gold" rotation="0 0 45">
  <a-animation attribute="rotation" from="0 0 45" to="0 360 45"
               dur="1000" easing="ease-in-out"
               repeat="indefinite">
  </a-animation>
</a-box>

A-Frame markup to create a spinning gold diamond

Geospatial frames of reference


<ar-geopose id="GT" lla="-84.394539 33.772501" userotation="false"> 
  <a-entity fixedsize="20" billboard>
    <a-plane rotation="0 0 0" width="2.9" height="4" src="#buzzpin" 
             transparent="true" ></a-plane>
    <a-entity css-object="div: #gtdiv" scale="0.02 0.02 0.02" 
        position="0 4 0" 
        showdistance="Tech Tower @ GT<br>Atlanta, GA, USA<br>It is ">
    </a-entity>
  </a-entity>
</ar-geopose>

A-Frame markup to put a pin at Georgia Tech
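
To give a feel for what a geospatial frame of reference involves, here is a minimal JavaScript sketch that converts the longitude/latitude used for the pin above into Earth-centered, Earth-fixed (ECEF) coordinates on the WGS84 ellipsoid. This is the standard textbook conversion, not a description of how argon.js implements `ar-geopose`; the function names `llaToEcef` and `ecefDistance`, and the altitude of zero, are assumptions made for illustration.

```javascript
// WGS84 ellipsoid constants.
const WGS84_A = 6378137.0;                 // semi-major axis, in metres
const WGS84_F = 1 / 298.257223563;         // flattening
const WGS84_E2 = WGS84_F * (2 - WGS84_F);  // first eccentricity squared

// Convert longitude/latitude (degrees) and altitude (metres above the
// ellipsoid) to ECEF coordinates in metres. The argument order follows the
// "longitude latitude" order of the lla attribute in the markup above.
function llaToEcef(lonDeg, latDeg, altMetres = 0) {
  const lon = (lonDeg * Math.PI) / 180;
  const lat = (latDeg * Math.PI) / 180;
  const sinLat = Math.sin(lat);
  const cosLat = Math.cos(lat);
  // Prime vertical radius of curvature at this latitude.
  const n = WGS84_A / Math.sqrt(1 - WGS84_E2 * sinLat * sinLat);
  return {
    x: (n + altMetres) * cosLat * Math.cos(lon),
    y: (n + altMetres) * cosLat * Math.sin(lon),
    z: (n * (1 - WGS84_E2) + altMetres) * sinLat,
  };
}

// Straight-line (not along-the-surface) distance between two ECEF points.
function ecefDistance(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// The Georgia Tech pin from the slide, assuming zero altitude.
const techTower = llaToEcef(-84.394539, 33.772501);
```

Once both the user and the content are expressed in one Earth-fixed frame like this, a label such as the `showdistance` text only needs the distance between the two points.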

Simple Vuforia Setup and Use


<ar-scene vuforiakey="#vuforiakey"
       vuforiadataset__stonesandchips="src:url(StonesAndChips.xml);">
  <a-assets>
    <a-asset-item id="vuforiakey" src="key.txt"></a-asset-item>
  </a-assets>

  <ar-frame id="frame" trackvisibility="true" visible="false"
            parent="vuforia.stonesandchips.stones">
		...
  </ar-frame>
</ar-scene>

A-Frame markup to put content on a visual target

Web Ecosystem is Rich and Diverse

Many tools, from the simple to the elaborate

Mashups may suggest new ways of creating 3D!

Twine

(twinery.org)

Thoughts on AR plus Web

1: There will be Apps, Not Just "Content"

(or something like apps)

Non-trivial content requires code and interaction

Security and privacy require sandboxing

2: Will want to run many apps at once

Current focus on individual apps w/ full control

As with 2D/3D before, users will want to mix and match

Mash up pages/apps, mix 2D/AR/VR content, mix displays and sensing

3: Decouple apps from "Reality"

Decoupling necessary for portability

Let app creators and users control "representation of reality"

Don't limit users to their local reality

4: AR is "just" a capability

There will still be apps, but not "AR apps"

AR something you add to a system or app

Think about what AR is good for and how it fits into web ecosystem

Future

My Hope: Mixed Media Mashups

2D + 3D + AR + VR + mobile + desktop + HMDs + ...

Focus on User choice

Web for AR/VR
and
Web in AR/VR

Plans ...

Time Frame	Activity	Goals
now	"Web Now" (w/ argon.js)	use cases; mashups w/ 2D, VR; web services
3-12 months	Mozilla AR app (mobile + desktop)	technical experiments; browser interfaces
1-2 years	W3C, etc.	standards (WebRTC, WebVR, ...)
1-3 years	Firefox (and peers)	WebAR
1-5 years	ecosystem, new devices	HMDs, IoT, wearables, cars, ...

Going Beyond "Web Now"

Enabling Technology

Low-latency Video Processing Pipeline

Video frames into GPU/JS + vision processing

(WebRTC, workers, WebAssembly, Servo, etc)
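
As a small illustration of the "vision processing" step, the sketch below converts an RGBA video frame into a grayscale buffer, a common first step in many computer vision trackers. It assumes the frame has already been copied into a typed array (in a browser, for example, by drawing the video to a canvas and calling `getImageData`); the function name `rgbaToGray` is hypothetical. A per-pixel loop like this in plain JavaScript is the kind of per-frame cost that workers or WebAssembly might help with.

```javascript
// Convert an RGBA pixel buffer (4 bytes per pixel, row-major, as found in
// ImageData.data) into a one-byte-per-pixel grayscale buffer.
function rgbaToGray(rgba, width, height) {
  if (rgba.length !== width * height * 4) {
    throw new RangeError("buffer size does not match width * height * 4");
  }
  const gray = new Uint8ClampedArray(width * height);
  for (let i = 0, p = 0; i < gray.length; i++, p += 4) {
    // Rec. 601 luma weights; the alpha channel (rgba[p + 3]) is ignored.
    gray[i] = 0.299 * rgba[p] + 0.587 * rgba[p + 1] + 0.114 * rgba[p + 2];
  }
  return gray;
}

// Usage with a tiny made-up 2x1 frame: one white pixel, one black pixel.
const frame = new Uint8ClampedArray([255, 255, 255, 255, 0, 0, 0, 255]);
const grayFrame = rgbaToGray(frame, 2, 1);
```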

Flesh out Needed Services

Already massive amount of content tied to the world
(Much of it lacks altitude, is inaccurate)

Camera Intrinsics database
(Vuforia, ARToolkit)
(eventually: W3C media-capture-depth, KHRONOS OpenKCam, ...)

Need new services, for object or location-based search and discovery,
world knowledge, object recognition, ...
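
Why the camera intrinsics above matter: to register graphics with video, the virtual camera's field of view has to match the physical camera's. Here is a minimal sketch, assuming a simple pinhole model with the principal point at the image center and no lens distortion. The name `fovFromIntrinsics` and the sample numbers are hypothetical, not taken from any particular device or library.

```javascript
// Field of view (in radians) implied by pinhole camera intrinsics.
// fx, fy: focal lengths in pixels; width, height: image size in pixels.
// Assumes the principal point sits at the image center.
function fovFromIntrinsics({ fx, fy, width, height }) {
  return {
    horizontal: 2 * Math.atan(width / (2 * fx)),
    vertical: 2 * Math.atan(height / (2 * fy)),
  };
}

// Made-up example values for a 640x480 camera.
const fov = fovFromIntrinsics({ fx: 500, fy: 500, width: 640, height: 480 });
const verticalFovDegrees = (fov.vertical * 180) / Math.PI;
```

Without per-device values for `fx` and `fy`, a web page can only guess at the field of view, which is one reason a database of intrinsics is on the list.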

What Might a First AR Browser App Look Like?

How Might This Be Done Using Gecko (or Servo)?

Keep: multiple apps, custom realities

Add: live video access, custom device support (Tango, Hololens, etc)

Explore: Privacy issues

Nice Features

Efficiency

Privacy / Security

Clean Multi-app Integration

Outstanding Issues

Battery

Device position (GPS, indoor localization)

Camera intrinsics

Experiment with Browser Experience

Extend Windows+Tabs model: "webRTC+vision", enhanced WebVR

Embeddable AR/VR Gecko

AR/VR-centric: Argon, sort of. AR/VR-first?

Other: link to the world, object recognition-based, Prox

Experiment with Use Cases for Web-AR

AR at intersection of Web, context/IoT, and 3D/VR

Perhaps Aim for a Complete Experience:
A Real AR "Pokemon GO"-like Game?

Thanks!

Contact me via email: bmacintyre@mozilla.com

To try some of this yourself

Download Argon4 for iOS
http://argonjs.io: argon.js source code, samples & docs

https://blairmacintyre.github.io/hawaii-all-hands-2016/
