Beyond the web of today

Introduction to Web Directions Conference

Kenneth Rohde Christiansen opens the conference, introducing himself, his long-standing work on WebKit and browser implementations, and his role as a co-inventor of progressive web apps. He currently leads web strategy at Intel.

Intel's Involvement in Web Development

Christiansen explains Intel's focus on software and its role in enhancing hardware functionality. He emphasizes Intel's commitment to web standards and sustainability, aiming to make the web a key application platform with high performance and extensive capabilities.

Strengthening Web's Unique Qualities

He discusses Intel's goals to strengthen the web's unique qualities such as speed, cross-platform functionality, and safety, while also addressing the challenges developers face in supporting multiple platforms.

Progressive Web App Specifications and Core Web Improvements

Christiansen highlights developments in progressive web apps, including the web manifest and service workers. He also covers improvements in core web technologies like CSS Grid, container queries, and Open UI.

Focus on Performance and Diverse APIs

Intel's focus on performance and WebAssembly is detailed, along with their efforts to support diverse experiences through APIs, particularly for applications like Bluetooth connectivity.

Importance of Interoperability and Web Trends

He emphasizes interoperability across browsers and the significant growth in progressive web apps, noting the high usage of browsers and web applications on desktop platforms.

Project Fugu and Bridging Gaps Between Native and Web

Project Fugu's role in bridging the gap between native and web functionalities is discussed, with examples like Visual Studio Code and Photoshop running in browsers through advancements in WebAssembly.

Performance Enhancements and WebAssembly

Intel's work on WebAssembly, SIMD, WebGPU, and Web Neural Networks is elaborated, showcasing their efforts to improve web performance and bring advanced computing capabilities to the web.

Introducing Web Neural Network API

Christiansen introduces the Web Neural Network API and its integration into Intel's hardware roadmaps, highlighting the collaboration with Microsoft and the performance benefits of using Web Neural Network.

Focus on Creating Diverse Web Experiences

He discusses Intel's focus on enabling diverse web experiences, including advancements in telemetry, streaming, AI, and other APIs for web developers.

Innovating in Stylus Support and Other APIs

Christiansen presents innovations in stylus support, highlighting the Universal Stylus Initiative and demonstrating the Pen Customization API's potential in enhancing web interactions.

Closing Remarks and Empowering the Web

In his closing remarks, Christiansen reiterates Intel's commitment to empowering the web and enhancing future web experiences, inviting attendees to engage further during the conference.

Kenneth Rohde Christiansen: Good morning and welcome to Web Directions.

I hope you're going to have two amazing days.

Let's get started.

So thanks for introducing me, John.

This is me.

I've been working on the web for many years.

I guess more than 15 years, actually.

I've been working on WebKit, browser implementations, and different standards. I'm one of the co-inventors of progressive web apps.

You might've heard about that.

And today I work at Intel, trying to lead our strategy on the web and actually go and implement all of these things.

So yeah, the first thing you might be wondering is: you said Intel, you're a semiconductor company, you do chips.

Why do you care about the web?

The thing is that software is what's controlling the world today.

And I can make a chip that has an amazing new feature.

If you can't use it as a developer, it's useless.

You're just paying for dead silicon.

So we need to enable everything to run on our hardware, to optimize it, and to make sure that you have the APIs available to you.

So at Intel, we really value openness, choice, and trust.

Everything should run on our systems.

So we're heavily involved in standards.

You might recognize Wi-Fi, the W3C, that's the World Wide Web Consortium.

That's what's relevant to us today.

The Bytecode Alliance, USB, et cetera.

We also love sustainability.

This is something that's really close to my heart.

Last year we were actually the number one most sustainable company in the US.

I'm pretty proud of that.

The Intel vision is that the web is an unquestionable key application platform.

We have all those core building blocks that you need to create applications.

It should also be fast, as fast as native apps.

And you should have all the capabilities available to you to create those applications you want to create.

But, that said, we also want to strengthen the unique qualities of the web.

It's really cool that you can type in a URL and it just immediately loads.

You don't have to wait five minutes to download a big app and install it and all of that.

It works across different platforms, so it's a cross-platform thing.

And, most importantly, it's safe.

You can go to many sites and you shouldn't have to worry about them stealing the data on your computer.

So why do we care?

If you're a developer, you probably want to support more than one platform.

That costs money.

So are you going to do an iOS app, an Android app, a Windows app, a Mac app?

You probably don't have the money for that.

And you might also lack people with the skills.

So people have always been looking at cross-platform or hybrid solutions to this.

What we're seeing today, especially on desktop, is that a lot of people are using the web, or maybe some hybrid solution like Electron, to get those experiences out there.

So we can say the web is doing really well because it's on the right path.

We have really good building blocks for installability, creating these app like experiences.

We have the progressive web app specs like Web App Manifest, which is one thing I've worked on, and service workers.

We also keep on improving the core.

We have better CSS.

Of course, we have CSS Grid now.

A lot of people are excited about that.

A lot of people have always wanted container queries.

That's finally in browsers.

We have Open UI.

There's a group of people, a community group, working together on creating new HTML elements that people really need, like a styleable select element.

They just created popover, which is even implemented in the upcoming version of Safari.

So that's also going to be available to everyone.

New APIs, especially around navigation.

That's something I've complained a lot about.

Like, session history has a lot of quirks.

Now, at least in Chromium-based browsers such as Edge and Chrome, we have the Navigation API.

There's the upcoming View Transitions API that allows you to create all these nice animations: you click on a video, it expands to a different view, and then back again.

All of that is being solved on the web, so you can create those amazing experiences.
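As a rough sketch of how a page might opt into such a transition: `document.startViewTransition` is the entry point Chromium exposes for same-document view transitions. The `runWithTransition` helper and its fallback path are illustrative assumptions, not something shown in the talk.

```typescript
// Hypothetical helper: run a DOM update inside a view transition when the
// browser supports it, and fall back to a plain update everywhere else.
function runWithTransition(update: () => void): void {
  const doc: any = (globalThis as any).document;
  if (doc && typeof doc.startViewTransition === "function") {
    // The browser snapshots the old view, runs the callback, then animates
    // between the old and new state.
    doc.startViewTransition(update);
  } else {
    // No support (or no document at all): apply the change immediately.
    update();
  }
}
```

Written this way, the same call site works in browsers without the API; the user simply sees the new view without an animation.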

At Intel, we have a relentless focus on performance, so we need those applications to run fast on our hardware as well.

So we're working on WebAssembly.

We've been adding support for something called SIMD.

I'll talk about what that is later.

It stands for Single Instruction, Multiple Data.

We worked together with Google on, for instance, MediaPipe.

That's their library based on WebAssembly that does things like background blurring in, say, Google Meet.

And, we're bringing native Machine Learning to the web as well.

This is something we're working on.

So what is really exciting for me is that today Intel is building the web and web standards into our hardware roadmaps.

So we always consider this.

And we're innovating on new capabilities as part of a project called Project Fugu.

We'll also talk about that later.

We want developers to be able to bet on the web.

It's really important for us.

You should be able to create great experiences with ease, without having to rely on big libraries that might not work together with other libraries and that have a lot of magic going on.

So if you look at DevTools, you might not understand what's going on.

You can't modify it.

And we want to provide APIs for diverse experiences.

We don't want the web to just be for one specific set of apps.

If you're Lego and you want to create a web app where you can control a Lego device, you probably need Bluetooth.

So we need to make sure that you have that API available, otherwise you can't create that sort of app on the web.

And we want the web to be there for the next industry revolution.

Today, for example, we're getting AI.

That's very important, because you're investing in the web as a developer.

You want to make sure that you can keep investing in it five years from now, and that you don't suddenly need to throw everything away, start over, and do a native app.

Interop also really matters.

So it's really important that what you build works across different browsers.

Sometimes a feature is not available in a certain browser and you can do feature detection.

That's fine.

But at least behavior should be the same.
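A minimal sketch of the feature detection mentioned here, assuming a small helper (the name `hasGlobalFeature` is made up for illustration) that walks a dotted path on the global object and reports whether every step exists:

```typescript
// Hypothetical helper: returns true only if every segment of the dotted
// path exists on the global object, e.g. "navigator.bluetooth".
function hasGlobalFeature(path: string): boolean {
  let current: any = globalThis;
  for (const key of path.split(".")) {
    if (current == null || !(key in Object(current))) {
      return false; // this step is missing, so the feature is unavailable
    }
    current = current[key];
  }
  return true;
}
```

A page could then guard optional functionality with something like `if (hasGlobalFeature("navigator.bluetooth")) { ... }` and offer a fallback otherwise.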

So I'm very happy that all major browsers have come together and work on this Interop project every year.

They choose a set of APIs that they believe are the core APIs developers need today.

They actually ask for feedback from developers and then they go and make sure that they're really well tested and they work the exact same way in all browsers.

It has been a pretty winning strategy.

At Intel, we see that there are 17.4 million JavaScript developers today.

More than 60 percent of time on a PC is spent on the web.

And we're seeing enormous growth in progressive web apps.

This is the data from Intel.

We have some stats ourselves.

And it's interesting.

You see the top application on Windows is actually Chrome.

It's a browser.

Second most popular application is Edge.

That is also a browser.

Then you have Firefox, that's also a browser.

Then you have Outlook, which is becoming a progressive web app, so also web.

Then you have other browsers like Brave, Vivaldi, etc.

Then you have Excel and Word, they're also becoming progressive web apps.

It's basically all turning into web.

Really good.

But as I said before, we really need diverse APIs, so you can build all these kinds of experiences.

On the web, we've really done that over the last few years: we've come up with seamless copy and paste, where you can really control what kind of data you're pasting in; frictionless access to local files, which is very important for desktop; and safe access to external hardware.

This is part of a project we call Project Fugu.

So Project Fugu is a project that was started by Google, then Intel and Microsoft joined, and later Samsung and Electron, to make sure that we have all those APIs available to developers, basically bridging the gap between native and web and making sure that you can rely on the web in the future.

Here are some great examples.

First, Visual Studio Code.

I don't know if you knew it, but you can go to vscode.dev today, and you can even install it as a progressive web app.

And you can start coding.

You can open your local repositories because it's using a new API called the File System Access API that we engineered together as part of Project Fugu. And then there's this next app.

I always heard people say the one app that is never going to be available on the web is going to be Photoshop because that's impossible.

And here we are, Photoshop is running in the browser.

They made that possible because we had WebAssembly.

WebAssembly allowed them to take their old C++ code and get it compiled to something that works in the browser.

It was unfortunately too slow, so we worked together with them on bringing multithreading support to WebAssembly, and that made a big difference.

Then we worked on SIMD, which I'll talk about later, and some other specific instruction sets we have in our chips.

And there was actually one case that was 160 times faster, but it averaged three to four times faster, and that made it possible for them to have Photoshop in the browser.

They had another issue: images are really big. You might have an image that's five gigabytes, which doesn't fit in memory, so they needed some kind of swap cache, and normally you would use the hard drive for that.

So Google came up with an API that was originally called NativeIO and then turned into the origin private file system.

So it's based on the file system access APIs, very similar.

But it allows a website to create files that are private to its origin.

So they're only yours: you don't see them if you go to your file explorer, or whatever you use to look at the file system.

They're hidden.

They belong to your site only.

And these files have specific APIs that make access really fast.
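A hedged sketch of what using the origin private file system as a scratch area can look like. `navigator.storage.getDirectory()`, `getFileHandle()`, and `createSyncAccessHandle()` are the standard entry points (the sync access handle is only available inside a dedicated worker); the function name `writeScratchData` and the file name passed to it are assumptions for illustration.

```typescript
// Hypothetical helper, intended to run inside a dedicated worker.
// Returns false when the origin private file system is not available.
async function writeScratchData(name: string, data: Uint8Array): Promise<boolean> {
  const storage: any = (globalThis as any).navigator?.storage;
  if (!storage || typeof storage.getDirectory !== "function") {
    return false;
  }
  // Root directory of the storage that is private to this origin.
  const root = await storage.getDirectory();
  const handle = await root.getFileHandle(name, { create: true });
  if (typeof handle.createSyncAccessHandle !== "function") {
    return false; // the fast synchronous handle is not exposed here
  }
  const access = await handle.createSyncAccessHandle();
  try {
    access.write(data, { at: 0 }); // synchronous write at byte offset 0
    access.flush();
  } finally {
    access.close(); // release the exclusive lock on the file
  }
  return true;
}
```

The file never shows up in the user's file explorer; it is only reachable from pages on the same origin.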

This has also allowed people to bring SQLite to the web.

WebSQL is being deprecated in browsers.

Instead, you can just download SQLite and have it work fast.

Another example, here you have iRobot.

They have this tool for kids and schools called Root.

Basically, you can learn to program in a visual programming style.

Then you can put a pen in the middle, I believe, and it can draw on some paper.

And all of this is controlled over Bluetooth, because we have the Web Bluetooth API, also as part of Project Fugu.

What is really important, is that this stays safe on the web.

A lot of people have this idea that native apps are great, but it's a fake kind of safety. Just because an app is being reviewed by someone doesn't mean that it's safe. The way things were designed on native long ago, everything was assumed to be safe.

You had access to everything.

So you don't know what these apps are doing.

You just trust them. But on the web, you visit many sites every day, every month.

So it really needs to be safe.

One example is Web Bluetooth.

You don't want a website to be able to see what Bluetooth devices are around you.

It might use that for tracking.

So on the web, you don't get access to Bluetooth unless you click on something, and then the browser is able to show you a dialog.

The site can add some filters for which devices to show you.

The devices in that dialog are not shared with the website.

Only the one you select and connect to.
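The flow described above can be sketched roughly as follows. `navigator.bluetooth.requestDevice()` with a `filters` list is the real Web Bluetooth entry point and must be called from a user gesture such as a click; the helper names and the `"battery_service"` example value are assumptions for illustration.

```typescript
// Shape of the options object passed to requestDevice (filters only).
interface BluetoothRequest {
  filters: Array<{ services: string[] }>;
}

// Hypothetical helper: only devices advertising this service are listed
// in the browser's chooser dialog.
function buildBluetoothRequest(service: string): BluetoothRequest {
  return { filters: [{ services: [service] }] };
}

// Call this from a click handler. The page never sees the list of nearby
// devices, only the single device the user picks in the dialog.
async function connectToChosenDevice(service: string): Promise<string | null> {
  const bluetooth: any = (globalThis as any).navigator?.bluetooth;
  if (!bluetooth) {
    return null; // Web Bluetooth is not available in this environment
  }
  const device = await bluetooth.requestDevice(buildBluetoothRequest(service));
  await device.gatt?.connect();
  return device.name ?? null;
}
```

A button handler might call `connectToChosenDevice("battery_service")`; if the user dismisses the dialog, the returned promise rejects and nothing about the surrounding devices is revealed.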

So we're always trying to keep this in mind and to innovate in safety.

And we're always improving.

So what's the current focus at Intel?

At Intel, we're working on capabilities, bridging the gap as part of Project Fugu, and we're working on performance to make sure that you have near-native performance. As we say at Intel, we don't want any silicon left behind, and we really want to optimize for silicon.

So sometimes performance is actually enabled by new capabilities.

Here are some examples. WebAssembly provides really fast CPU execution.

If you have some code that really has to be fast, you could write that in WebAssembly.

It now has support for SIMD and multithreading.

WebGL is also something we've worked on, and it works on almost every device today.

But we've been working on WebGPU and bringing that to the web.

That's 3.7 times faster than WebGL in some of our testing.

That's really amazing.

And we're bringing Web Neural Network to the web, which can work on the CPU, the GPU, and any accelerator that supports machine learning.

So SIMD, the first thing we added to WebAssembly, it stands for Single Instruction Multiple Data.

So you can imagine that normally I add two numbers; that's a scalar operation.

But many CPUs actually support doing this per vector.

And that's really important if, for instance, you are doing something with an image, where every pixel is normally four numbers.

There's red, green, blue, and alpha.

So now you can multiply all of these in one instruction.

That's generally four times faster.
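To make the "four values at a time" idea concrete, here is a small illustration in plain TypeScript. It does not emit real SIMD instructions (those come from a WebAssembly compiler); it only mimics how a 128-bit vector groups four 32-bit values, so that one loop step covers a whole RGBA pixel. All function names are made up for this sketch.

```typescript
const LANE_BITS = 32;    // one 32-bit float per colour channel
const VECTOR_BITS = 128; // the baseline SIMD width mentioned in the talk
const LANES = VECTOR_BITS / LANE_BITS; // four values per vector

// Scalar version: one multiplication per loop step.
function multiplyScalar(a: Float32Array, b: Float32Array): Float32Array {
  const out = new Float32Array(a.length);
  for (let i = 0; i < a.length; i++) {
    out[i] = a[i] * b[i];
  }
  return out;
}

// Grouped version: each loop step handles four lanes, the way a single
// 128-bit SIMD multiply would. Same results, a quarter of the steps.
function multiplyInGroups(a: Float32Array, b: Float32Array): Float32Array {
  const out = new Float32Array(a.length);
  let i = 0;
  for (; i + LANES <= a.length; i += LANES) {
    out[i] = a[i] * b[i];
    out[i + 1] = a[i + 1] * b[i + 1];
    out[i + 2] = a[i + 2] * b[i + 2];
    out[i + 3] = a[i + 3] * b[i + 3];
  }
  for (; i < a.length; i++) {
    out[i] = a[i] * b[i]; // leftover values that do not fill a full vector
  }
  return out;
}

// Number of multiply operations needed for a given vector width.
function operationCount(values: number, lanes: number): number {
  return Math.ceil(values / lanes);
}
```

For two RGBA pixels (eight values), `operationCount(8, 1)` is 8 while `operationCount(8, LANES)` is 2, which is where the "generally four times faster" figure comes from.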

The standard is 128 bits, because that's supported globally, on ARM and every other kind of silicon, so that's normally four values of 32 bits.

We actually have hardware that supports up to 512 bits.

Normal CPUs actually have 256 bits.

So what we are doing now is this: if we can see that you're using two instructions with 128 bits, why not just use our hardware for 256 bits?

So we detect that dynamically.

And that gives you a 20 to 50 percent speedup.

Just using SIMD.

So another example of what we've been doing: we talked about enabling hardware.

Chips today often have encoders and decoders in the chip.

In the SoC.

So you can imagine that to get good YouTube playback time on your phone, it should use that hardware.

So that's great.

That works fine for playback, but if you're a game developer and you want to do streaming, or you're creating something like Google Meet or Microsoft Teams, you might want to modify that.

You probably want to set the resolution, the frame rate, and all of that.

And we haven't been able to do that before.

So we've been working together with Google on WebCodecs, an API that exposes all of these hardware-accelerated encoders and decoders to web developers.

And that is being adopted by companies such as Zoom to make this much better and more performant on the web.
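As a rough sketch of the kind of control WebCodecs gives you: the configuration fields below (`codec`, `width`, `height`, `framerate`, `bitrate`, `hardwareAcceleration`) and the static `VideoEncoder.isConfigSupported()` check are part of the WebCodecs API. The helper names, the `"vp8"` codec choice, and the bitrate value are assumptions picked for illustration.

```typescript
interface EncoderSettings {
  codec: string;
  width: number;
  height: number;
  framerate: number;
  bitrate: number;
  hardwareAcceleration: "no-preference" | "prefer-hardware" | "prefer-software";
}

// Hypothetical helper: the app, not the browser, decides resolution and
// frame rate, and asks for the hardware encoder when one exists.
function buildEncoderSettings(width: number, height: number, framerate: number): EncoderSettings {
  return {
    codec: "vp8",
    width,
    height,
    framerate,
    bitrate: 1_000_000, // illustrative value: roughly 1 Mbit/s
    hardwareAcceleration: "prefer-hardware",
  };
}

// Resolves to false where WebCodecs is unavailable or the config is rejected.
async function canEncode(settings: EncoderSettings): Promise<boolean> {
  const Encoder: any = (globalThis as any).VideoEncoder;
  if (!Encoder || typeof Encoder.isConfigSupported !== "function") {
    return false;
  }
  const result = await Encoder.isConfigSupported(settings);
  return result.supported === true;
}
```

An app that detects a stressed system could call `buildEncoderSettings(640, 360, 15)` instead of a full HD configuration and reconfigure its encoder accordingly.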

Something else that's happened, and you've probably heard about this on your phone:

ARM talks about big.LITTLE, and Apple talks about performance cores and efficiency cores.

We have the same today.

So you can see the difference here.

Every time we have one of these performance cores, that's the best performance you can get.

You could actually just have maybe four efficiency cores instead.

That's better if you're doing parallelism, it's better for battery, but we need to use that on the web as well.

So one of the things we did is that we just looked, for instance, at different threads inside of Chrome.

Some of them are not so important.

So we marked them: this should just be a background task, this should be a resource-efficient task, this one is display critical, and the scheduler will try to schedule those the right way.

And for instance, with video conferencing, we had like more than 100 milliwatts power savings in Windows just by making this one change.

So another example.

We also just looked at Chrome. Chrome has a multi-process architecture.

You have many processes.

Every tab is its own process.

There's the compositor thread where it's doing all the graphics.

So there's a lot of what's called IPC, inter-process communication, going on.

And here we just went and examined it.

Maybe we can reduce that communication to the bare minimum.

And by doing that, we got a 10 percent power saving on video playback on Windows as well.

And this benefits everyone, ARM and AMD as well, so we're doing all these kinds of optimizations inside of Chromium.

I mentioned WebGPU before. WebGPU is bringing modern graphics and compute to the web.

Intel has been involved basically since the beginning, on the standard and on the implementation, and it was recently announced at Google I/O by Sundar, the CEO of Google, that they're now shipping WebGPU in Chrome.

And you'll see there's also a shout-out to Intel for invaluable support in making this possible.

So this is really cool.

WebGPU is awesome for graphics.

In the old days, everyone was doing something like OpenGL.

That had a lot of state, and it managed that state for you.

So it used a lot of CPU, and that was great when people started to learn how to program 3D graphics.

it was awesome.

Then people found out it's not always the most efficient, especially if you're doing a game engine, you could probably do it better manually.

So people came up with, Apple came up with Metal.

there's Vulkan, used on Android and other platforms.

There's DirectX, all the modern graphics libraries.

And basically, this is what WebGPU is for the web.

It's a modern, low-level API that allows you to do amazing graphics.

You're probably not going to use it directly.

You're going to use something like Babylon.js or Unity or Three.js when developing.

And that also gives me a segue to AI.

Let's talk about AI because WebGPU is modern graphics.

It also allows you to run compute.

Basically, GPUs have these things called shaders.

You have a vertex shader, but you also have a compute shader.

We can actually do calculations.

That's why a lot of people are using GPUs for Machine Learning because it's great at that.

So cool.

We have this available to the web as well now and actually we're trying to support those existing frameworks.

So Intel is the owner of the WebGPU backend for TensorFlow.js, for machine learning.

So we've been collaborating closely with Google on making this possible.

And here you see how much faster it is.

This one case here was a bit slower, but in general it's much faster, just moving to WebGPU.

And in one case it's almost two times as fast.

It's more than two times as fast.

So here's one example of what you can do now in the browser without using the cloud.

So we're running Stable Diffusion here.

It should take around seven seconds, I believe.

So I'm generating a sunflower here.

And that's just running in the web browser using WebGPU.

What is interesting about machine learning, and I'm focusing on machine learning because it's one of the big new things on everyone's mind, is that there are a lot of different kinds of hardware.

So in some cases, the CPU is actually the best for inference.

It could be that you just need to get a result really fast.

Memory is on the CPU, so you have access to that.

So it might be very fast to get a result, but you also see that the performance is never going to be great, and you're spending a lot of power.

In some use cases, that's better.

For some models, this actually works better.

Then you have the very best today, which is a discrete GPU.

You don't normally have a discrete GPU in your phone, because it would take too much power, but you see it scales really nicely and gets good performance.

What you notice is that it also has a minimum cost: you first need to prepare all your data and copy it to the GPU, so it's not good for a quick result.

It's good for sustained AI: keep on figuring out what's on the screen, track my eyes, something like that.

It's really great for that.

If you go to an integrated GPU, it normally shares memory with the CPU.

So it's slightly better, but it doesn't scale as much as a discrete one, because of power envelopes, et cetera.

And then you have the new thing, what's called a hardware accelerator.

We call it a VPU, for Versatile Processing Unit.

It used to be Video Processing Unit.

That's why the V, I believe. Microsoft calls it an NPU, for Neural Processing Unit.

Google calls it a Tensor Processing Unit.

Yeah.

Many names, same thing.

But it would be really cool to get that available to the developers.

You also have to understand that AI can also just be there in a general API.

Like we've been working together with Zoom on, for instance, adding background blurring to the web platform.

So that might use like a VPU in the background and be really fast.

You don't necessarily need to use TensorFlow or anything like that.

So let's look at this new Web Neural Network API that's soon coming to browsers.

Actually, I said that these web APIs are becoming part of our roadmap, and this is the first time I've seen an official intro slide that actually shows Web Neural Network.

That's a Web API.

I think that's pretty amazing for our team.

Very proud of the team here.

And, as it says on the slide, our next CPUs, coming out this year, actually have a VPU built in.

Really great.

We have some good quotes here from Build.

There's one from Microsoft saying: we are working on Web Neural Network, which can do local inferencing in the browser.

We also implement the Web Neural Network standard and are working with Intel.

It's coming soon at this point.

So Microsoft and Intel and others are very excited about this coming to the web.

Here are some stats.

Let's just zoom in.

This is just using the CPU.

That's the implementation we have in Chrome today.

We also have an experimental GPU implementation.

There'll be a VPU implementation.

But you see here, if you're using WebAssembly today, you get 1.

That's our baseline.

If you go native, that's 9.8 times faster than WebAssembly.

If we start adding SIMD, then you see WebAssembly becomes much faster.

You get a three times speedup.

If you use Web Neural Network, it's basically as fast as native.

So I think that's pretty amazing.

That really shows how much faster we can get this on the web.

So let's look at this in a demo. I don't know if the sound is going to work here, but here we have one of my colleagues, Benham Jiang, who did this demo.

He's trying to show you background blurring using a model.

First using, I believe it's WebGL.

Let's see. I think it's SIMD, and then with Web Neural Network, and then he's doing background segmentation as well.

And you'll see the speedup.

Let's see if it's going to play with sound.

Hello, you will see the inference time and frames per second data shown in the indicators.

Furthermore, on the right side, I collect the performance data every two seconds, display a total of 11 groups, and calculate the median value of them.

Let me click the WebNN backend button.

The inference is faster and the frames per second are higher, and the performance comparison result is based on the median value above.

As you can see, WebNN is about 2.5 times faster than WebAssembly.

Now, I will showcase the background replacement performance with DeepLab model.

It is a relatively bigger model than Selfie Segmentation, and I will compare the WebAssembly performance with WebNN.

Let me go to the full screen and click the Background Replacement button, and change the background at first.

It's running on Selfie Segmentation now.

Let me click the DeepLab button.

By using the WebAssembly backend, you can see the inference time is more than 130 milliseconds, and the frames per second is about 7 in the indicators.

Let me switch to the WebNN backend.

The performance is much faster now.

The inference time is less than 40 milliseconds, and the frames per second is more than 30.

It's about 4 times faster than WebAssembly.

So you really see that this can make a difference for web developers, especially because it's going to democratize AI: you can get it on any website, you don't have to install an app, and you can start to use this for any kind of use case you find for AI, right on the client.

So why are we bringing this to the web?

We want to enable those new, exciting use cases for web developers.

And of course we want to take advantage of our hardware, right?

So there are different options today for using machine learning.

You can use WebAssembly, which runs on the CPU, great for some use cases.

You can use WebGPU, which runs on the GPU only.

With the new Web Neural Network API, it'll run on whatever hardware you have available, and it will allow you to choose.
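One way to picture these three options is a small selection routine. `navigator.ml` (WebNN) and `navigator.gpu` (WebGPU) are the real entry points on the navigator object; the `pickBackend` function, the backend labels, and the preference order are assumptions made for this sketch, not how any particular framework decides.

```typescript
type Backend = "webnn" | "webgpu" | "wasm";

// Only the two properties this sketch looks at.
interface NavigatorLike {
  ml?: unknown;  // present when the Web Neural Network API is exposed
  gpu?: unknown; // present when WebGPU is exposed
}

// Hypothetical policy: prefer WebNN (CPU, GPU, or accelerator), then
// WebGPU (GPU only), and fall back to WebAssembly on the CPU.
function pickBackend(nav: NavigatorLike | undefined): Backend {
  if (nav && nav.ml) {
    return "webnn";
  }
  if (nav && nav.gpu) {
    return "webgpu";
  }
  return "wasm";
}
```

In a page this would be called as `pickBackend((globalThis as any).navigator)`; in practice a framework would usually make this choice for you.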

So here's an overview of how it works.

At the bottom you have hardware, so there are chips, often called an SoC.

It contains maybe a GPU, a CPU, and a VPU. Then you have native APIs: on Windows, that'll be DirectML.

Or you can use an API like Intel's OpenVINO.

Then you have browsers, and on top of that you get those low-level APIs. But as a developer, you're probably going to use something like TensorFlow.js, MediaPipe for web from Google, or OpenCV.js, or you're going to use high-level APIs where, like I said, you just turn on noise suppression, background segmentation, etc., as we're bringing those to the web.

So WebNN is a low-level API, just like WebGPU. You're not supposed to use it directly unless you're a framework developer, but it provides really native-like performance and reliability of results.

If you know anything about machine learning, you can see this is how it looks, very low level. If you don't know about it, you can ignore it.

This is how you would code it, but as I said, you're probably going to use a framework instead.

Our current focus at Intel is also on capabilities for creating these amazing experiences.

One of the things I've been working on is telemetry.

How could we provide some of the telemetry we have on our chips to web developers?

We actually had a lot of requests for this. Zoom, for instance, was saying: oh, it's really bad on some hardware.

People are in a meeting, someone's taking meeting notes, and they type.

And nothing happens.

And then they see 500 letters on the screen.

And we really want to avoid that.

So maybe it would be good to know that the system is being stressed, and maybe we should just turn off the video feeds or lower the resolution or whatever we can do.

So we've been working on bringing that to the web.

For example, you can turn off noise suppression, you can adjust the resolution of the feeds, et cetera.

Because everything on the web can be abused, we want to make the API privacy preserving and secure.

So we came up with some high-level pressure states: the system is under nominal pressure, it's under fair pressure, now it's getting serious.

So you see there's some explanation here of what that would mean, and critical is really bad.

It's a very simple API. We try to make sure that it works in workers and iframes, because that's what people want.

Maybe you want to have your Zoom or Meet SDK built into a website as something like a hotline. It has integration with permissions.

We're building this on the Observer pattern.

It's very popular on the web today.
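A minimal sketch of how an app might react to those pressure states. The four state names and the `PressureObserver` constructor with `observe("cpu")` follow the Compute Pressure API; the exact options accepted during the origin trial may differ from what is shown here, and the `settingsFor` policy (which features to switch off at which state) is purely an assumption for illustration.

```typescript
type PressureState = "nominal" | "fair" | "serious" | "critical";

interface CallSettings {
  video: boolean;
  noiseSuppression: boolean;
  maxHeight: number; // vertical resolution cap for outgoing video
}

// Hypothetical policy: degrade features step by step as pressure rises.
function settingsFor(state: PressureState): CallSettings {
  switch (state) {
    case "nominal":
      return { video: true, noiseSuppression: true, maxHeight: 1080 };
    case "fair":
      return { video: true, noiseSuppression: true, maxHeight: 720 };
    case "serious":
      return { video: true, noiseSuppression: false, maxHeight: 360 };
    case "critical":
      return { video: false, noiseSuppression: false, maxHeight: 0 };
  }
}

// Observer-pattern wiring. Returns false where the API is not available.
function watchPressure(onChange: (settings: CallSettings) => void): boolean {
  const Observer: any = (globalThis as any).PressureObserver;
  if (typeof Observer !== "function") {
    return false;
  }
  const observer = new Observer((records: Array<{ state: PressureState }>) => {
    const latest = records[records.length - 1];
    if (latest) {
      onChange(settingsFor(latest.state));
    }
  });
  // observe() returns a promise; ignore rejections (e.g. permission denied).
  Promise.resolve(observer.observe("cpu")).catch(() => {});
  return true;
}
```

Because the states are coarse, the app learns that the system is struggling without getting precise utilization numbers that could be abused for fingerprinting.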

I have one example and let me see if I can actually show that here.

And if that's actually going to work.

The idea here is quite simple: it's a website, you turn it on, and it shows a nice emoji of how the system is doing. If it's not turned on, it will just be sleeping. And then I have this Mandelbrot, which I'm running in as many workers as possible, so you can stress the system by adding workers, and you'll see the guy here is under critical pressure.

It's just a very nice example. We're announcing an origin trial, so that's in the next version of Chrome, I believe 115.

That's an origin trial.

You can sign up and start using this API, try it out, on your websites.

Pretty cool.

There's an article here on developer.chrome.com if you want to check that out.

We're also working on improvements for streaming.

So I said a bit about AI before.

You've probably noticed that during COVID, a lot of people started working from home.

We saw that Google Meet had a 30 times increase in daily usage from January 2020.

So it's really important.

That also means it's very important to us.

Because people use that on their machines.

So we've been talking to developers: what kind of APIs do you need for meetings?

And they said: native background blurring that's fast and efficient.

Face detection, face framing, noise suppression, and WebCodecs, which we already talked about.

So here are some examples of what we've been working on.

So here's one of my coworkers showing that we have this background blurring natively supported in browsers.

So this is a prototype.

I actually believe there's an [indistinct] as well in Chrome today.

So it's one of the examples.

We played around with the APIs, tried to make it really simple.

So this is using WebRTC.

So you see, basically you just add a capability.

I don't know if we ended up with exactly this API.

There's some playing around to find out what is the best API for users.

Do they really want to configure the amount?

Maybe they want just two states.

Maybe they just want background blurring.
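As a sketch of the "two states" option, the helper below decides whether to request blur based on what a camera track reports. The `backgroundBlur` name and the list-of-booleans capability shape are my assumption, based on the Media Capture extensions draft, and `blurConstraintFor` is a hypothetical helper; the talk itself notes the final API shape was still open.

```typescript
// Relevant slice of a track's capabilities; this shape is an assumption for the sketch.
interface BlurCapabilities {
  backgroundBlur?: boolean[];
}

type BlurRequest = { backgroundBlur: boolean } | null;

// Return a constraint to apply, or null when the camera cannot honor the request.
function blurConstraintFor(caps: BlurCapabilities, wantBlur: boolean): BlurRequest {
  const supported = caps.backgroundBlur ?? [];
  return supported.indexOf(wantBlur) !== -1 ? { backgroundBlur: wantBlur } : null;
}

// In a browser this might be used roughly as follows (assumed, not verified here):
//   const request = blurConstraintFor(track.getCapabilities(), true);
//   if (request) await track.applyConstraints(request);
```

A boolean keeps the developer-facing surface small. A configurable blur amount would need a numeric range instead, which is the trade-off being weighed here.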

Here are some examples: is it actually faster and better?

And yes, it is.

MediaPipe, that's what Google was using for Meet.

You see that's using the GPU here, so probably WebGL.

That takes a lot more power than not using the feature.

And with our prototype, you see it's a little bit more power, and this is not even the final prototype.

This is just, I don't know how to implement that.

That's not even using our VPU yet, so that would be more efficient even.

Face detection, something we worked on at Intel for a long time.

We played around with OpenCV many years ago. It's 8 megabytes with WebAssembly.

It's a bit too big for regular users to adopt it.

Google did something very similar with MediaPipe.

It's two megabytes.

It's also big, but it's maybe manageable.

We worked on shape detection, which apparently is actually an experimental feature on the latest version of iOS.

So you can apparently play around with that.

So that's pretty cool.

But that works on still images.

So it's really good for, say, detecting a barcode or a QR code.

So now we're working on doing this as part of WebRTC instead.

And of course we have an example here, so another one of my coworkers, and you'll see it's detecting where his face is, and it even works with more than one person. So let's see, someone else is joining, and it should be tracking two people.

Perfect.

But same thing here, we're playing around with the API shape.

We always want feedback from developers like yourself.

Here we're just looking at a bounding box, but in reality we probably want something like a contour as well.

So you can do things like funny hats.

It's very nice to know where the eyes are, where the nose is, and you can do funny things.

So yeah, early prototype, same thing as before, much more efficient.

People today will be using something like MediaPipe.

Wow, it takes a lot of power.

You're losing many hours of your battery.

But with this new implementation, it's not so bad.

Maybe it's just 20 percent more.

We also worked on APIs some years ago, with my coworker, Rijo, showing that some cameras actually support native zoom and pan and tilt.

So you can do that manually on the web.

You could run a Machine Learning algorithm and do that yourself.

But why not just bring that also as a high level API, just follow my face, follow me around.

So we also have a prototype of that working.
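To illustrate what "follow my face" involves if you do it yourself, here is a small sketch that turns detected face bounding boxes into a framing region. The `FaceBox` shape and the `framingRegion` helper are hypothetical, not part of any proposed API; a high-level API would hide this step.

```typescript
// Axis-aligned box in pixel coordinates; a hypothetical shape for this sketch.
interface FaceBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Compute a crop that contains every face, padded by `margin` (a fraction of the
// union's size on each side) and clamped to the video frame.
function framingRegion(faces: FaceBox[], frameW: number, frameH: number, margin = 0.2): FaceBox {
  if (faces.length === 0) return { x: 0, y: 0, width: frameW, height: frameH };
  let left = Infinity;
  let top = Infinity;
  let right = -Infinity;
  let bottom = -Infinity;
  for (const f of faces) {
    left = Math.min(left, f.x);
    top = Math.min(top, f.y);
    right = Math.max(right, f.x + f.width);
    bottom = Math.max(bottom, f.y + f.height);
  }
  const padX = (right - left) * margin;
  const padY = (bottom - top) * margin;
  const x0 = Math.max(0, left - padX);
  const y0 = Math.max(0, top - padY);
  const x1 = Math.min(frameW, right + padX);
  const y1 = Math.min(frameH, bottom + padY);
  return { x: x0, y: y0, width: x1 - x0, height: y1 - y0 };
}
```

With a camera that supports native pan, tilt and zoom, a region like this could drive the camera directly instead of cropping in software.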

And we've been looking at something like eye gaze correction.

It's not the best picture, it's a bit difficult to see, but a lot of people have the camera up here.

And you're looking at the screen, at your friends, and it looks like everyone's looking down.

So that's actually using machine learning and slightly modifying the eyes. It's maybe a bit Uncanny Valley, so let's see if we do this or not.

But it's like, those are the things we're toying around with.

And of course, some years ago we also brought all these knobs to the web: with media capture, you can actually change the color temperature and brightness manually.

But why not just do that as well? Let's just do it as a high level API: correct lighting.

Perfect.

Before we're ending off, I just want to show you one last thing we've been working on.

So this is my coworker, Alexis Minard.

He's working on stylus support because it's also one of those use cases for the web.

This is super exciting.

And there's actually a standard Intel is involved with called Universal Stylus Initiative that allows people to buy styluses from different companies.

They can use them at the same time.

And this small video is going to show you that you can actually configure and store settings in every stylus, like the size of the pen and the color, and you can just say, oh, this is my red pen, this is my blue pen. There are even some of these pens that you can use for detecting colors in the real world and have that work.

So let's check it out and see if sound is working.

[computer audio] Hi everyone, my name is Alexis Minard and I'm a software engineer at Intel.

Today I'm going to showcase the use cases that we can enable with the Pen Customization API that we're proposing in the W3C.

So let me take this laptop and switch it to a kind of tablet, and then showcase the API.

I have with me the styluses.

They are compatible with the USI standard.

So USI stands for Universal Stylus Initiative.

It's a pen-to-touchscreen protocol that works across brands and touch controllers, right?

The benefit of this Universal Stylus Initiative is that the pens don't need any pairing whatsoever.

And this pen from HP works, but I can take this pen from another brand and it works.

And then the last one is from Lenovo, and same thing, it does work, right?

So that's very convenient, you can take your pens and use them on different hardware of different brands, and you don't have to worry about buying a pen for a given computer, right?

The other thing that the Universal Stylus Initiative pens have is a little memory inside them, where you can store preferences or customization, for example, your preferred inking color, your preferred inking style or your preferred inking width, right?

So, in this web application, we made a little panel that only shows up if the Pen Customization API is supported, and then if you click on it, you can see what's inside the stylus, right?

In this case, there was a red color that was stored with a 21 pixel width and then a pencil style, right?

Let's say I draw and I like my setup, and then I want to store, for example, the black color into the pen, right?

And I can just store it and then automatically it gets stored into the memory of the stylus, right?

As you can see, if I pick another pen, then we can see a different set of preferences, right?

And then if I take the last pen, for example, you can see here a different set of colors and attributes, right?

As it is right now, it's not super useful.

Now, imagine a scenario where you set up your pens the way you like, with maybe a different tip, and then you want to be able to quickly switch between them like you would in real life, right?

So in this application, we actually developed a little setting here, which says that anytime a pen comes close to the screen, we will automatically fetch the preferred customization and set it in the editor, right?

So here, as you can see, it's drawing with the right preferences, which were black, 21, and pencil, right?
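The flow described in the video (preferences stored in the pen, then applied automatically when the pen approaches) can be sketched roughly as below. None of these names come from the proposed Pen Customization API: `PenPreferences`, `FakeStylus` and `InkEditor` are hypothetical stand-ins that only model the behavior shown in the demo.

```typescript
// Preferences the demo stores in the pen: color, width and style (hypothetical shape).
interface PenPreferences {
  color: string;
  width: number;
  style: "pen" | "pencil" | "marker";
}

// Hypothetical stand-in for a USI stylus with a little on-board memory.
class FakeStylus {
  constructor(private prefs: PenPreferences) {}

  read(): PenPreferences {
    return { ...this.prefs };
  }

  store(update: Partial<PenPreferences>): void {
    this.prefs = { ...this.prefs, ...update };
  }
}

// Hypothetical editor that adopts a pen's preferences when the pen comes near.
class InkEditor {
  autoApply = true;
  current: PenPreferences = { color: "black", width: 1, style: "pen" };

  onPenNear(stylus: FakeStylus): void {
    if (this.autoApply) this.current = stylus.read();
  }
}
```

For example, a stylus created with red, 21 and pencil, then updated with `store({ color: "black" })`, would make the editor draw in black at width 21 with the pencil style the next time it approaches.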

Now, let's say I take this other pen here, and then it will draw with another color, right?

And same thing with the last one, which was yellow, if you remember.

It's very convenient.

I can switch pens very quickly and then do things, right?

As a designer, it's very cool.

So that's one of the use cases.

The other use case that we enable is, for example, on this pen.

This pen has a little color sensor on the end of it.

It's on the other end from its tip, right?

It's basically a kind of color picker, but in real life, right?

For example, you bring it close to a color like here, and then you can scan the color with it by just clicking on the button, right?

And then when we bring it here, automatically we read the color.

You can see it was picked up right here, right?

Which is quite convenient, right?

So now I can draw with it, right?

You can see it's a different color.

Now, let's say I scan that blue color. So I can scan it, and then if I bring it here you can see it picks it up, right?

And then I can, you know, draw with it, right?

And then let's do it one more time with the red, or the green color, sorry. And then you can bring it here, you can see it picks it up, right?

So it's very convenient, if you have an actual color palette here, you can read to it and then store it and then bring it to your computer, right?

So yeah, those are basically the two main use cases that we enable with the new API. I hope you enjoyed the video. Feel free to give me feedback, and I'm looking forward to talking to you.

Thank you.

Yeah.

So that was my last example, to show that we work on all kinds of different APIs. It's not always just CPU or hardware, sometimes we're actually working on new experiences as well.

Even foldable displays, we worked on a spec for that.

So with that said, at Intel we are working to empower the web and future experiences.

And I hope you enjoyed this conference.

And I will stick around both days, so come talk to me if you want to learn more.

And with that said, I have now kicked off this conference.

Take care.

Thanks for listening.

A collection of logos and images on a plain white background. On the top left, there is a blue square with the text "TAG W3C alumni". In the center, there is a photo of a smiling man with sunglasses, posing by the sea. On the top right, there's a technical drawing of a skateboard. Below the skateboard drawing, a yellow square with "JS" written in black. Various logos are scattered around: Google Chrome, Microsoft Edge, Intel, the W3C logo, and Code23 at the bottom left.

I'm @KennethRohde

Principal Engineer and Web Platform Architect at Intel Corp.

  • 15+ years experience building browsers and web standards
  • Tech-lead on the Nokia N9 browser (MeeGo)
  • Co-inventor of Progressive Web Apps
  • Was part of the W3C Technical Architecture Group for 4 years
  • Leading Intel’s participation in the capabilities effort, Project Fugu

We're a semiconductor

But the world today runs on software

Open, Choice, Trust

Strong supporter of open-source and standards

intel.

The image shows a collection of logos and phrases emphasizing openness, choice, and trust, highlighting a commitment to open-source and standards. Logos include USB, BYTECODE ALLIANCE, a Chromium-like logo, oneAPI, OpenVINO, Code, PCI Express, W3C, Intel, Wi-Fi, and a WebAssembly logo (denoted by "WA"). A Linux penguin mascot is also present, symbolizing open-source.

We also love sustainability

Something close to my heart

Strong believer in sustainability, responsibility and inclusiveness

https://csrreportbuilder.intel.com/pdfbuilder/pdfs/CSR-2021-22-Full-Report.pdf
Code23

The slide features a large purple heading that reads "We also love sustainability," followed by a subtitle "Something close to my heart." Below is a symbol representing sustainability and a statement about being a strong believer in sustainability, responsibility, and inclusiveness, along with a URL link to a CSR report. On the right side of the slide, there's a section from Barron's report titled "Special Report: Barron's 100 Most Sustainable U.S. Companies 2022," highlighting Intel's rise from 47th to 1st place in sustainability rankings. The section includes a comparison of 2022 and 2021 ranks, with Intel's current rank, ticker symbol, and final score at the top of a list of companies. The Intel logo is situated at the top right and bottom right corners.

Intel Vision

The web is an unquestioned Key Application Platform

  • Offer the core building blocks for application development
  • Achieve great, near native-like performance
  • Offer capabilities and common features available to native apps

The unique qualities of the web, strengthened

  • Remain cross platform to work across all desktop OSes
  • More responsive - load instantly, be responsive from the get-go
  • Meet and exceed the user safety and privacy needs

Vision

Why do we care?

  • Platforms/ecosystems are only as important as the experience they bring
  • The cost of creating high quality apps for the target audience often a balancing act

Today it is often mobile first with a web based (or hybrid solution) for the desktop

Platforms?

Native or cross platform solution?

Skillsets?


The slide is titled "Why do we care?" and explores the decision-making process around software development platforms. It includes three points: "Platforms/ecosystems are only as important as the experience they bring," "The cost of creating high quality apps for the target audience often a balancing act," and "Today it is often mobile first with a web-based (or hybrid solution) for the desktop." To the right, three questions are posed: "Platforms?" with a thinking face emoji, "Native or cross platform solution?" and "Skillsets?" with an image of a wallet with a dollar sign and a juggler juggling colored balls.

Vision

The web is doing well because it is on the right path

  • The Web is becoming an unrivaled Application Development Platform
    • Solid application building blocks
      • Installability: Web App Manifest and Service Workers
      • Improving the core: CSS Grid, Flexbox, Container Queries, Open UI
      • Design Systems powered by Web Components
      • URLPattern, Navigation API and View Transitions
    • Relentless focus on performance
      • Web Assembly, SIMD optimizations
      • MediaPipe with native support for framing, background blurring etc.
      • Native support for machine learning
    • We are building the web into our hardware roadmaps!
    • Innovating on capabilities via Project Fugu, e.g., File System Access

Vision

Interop matters

Interop 2023 Dashboard

The image displays the "Interop 2023 Dashboard" with the subtitle "Interop matters". At the top, there are two tabs labeled "STABLE" and "EXPERIMENTAL". Below, there's a split between "INTEROP" and "INVESTIGATIONS". Under "INTEROP", there's a number '62' in gold, and under "INVESTIGATIONS", there's a red no-entry symbol. Along the bottom, there are scores and logos for different web browsers: '86' above the Chrome Dev logo, '86' above the Edge Dev logo, '74' above the Firefox Nightly logo, and another '86' above the Safari Technology Preview logo. In the bottom-left corner, there's the Code23 logo in purple, and in the bottom-right corner, there's the Intel logo.

Vision

A winning strategy

JS Developers – 17.4 M[1]

  • Web app & front end
  • Web framework & runtime

60% of Dev community

60-65%[2] of the time spent on PC is spent browsing the web

~65% Time spent on Web

Rise of modern web fueled by PWA[3]

  • Growth of available PWAs in 2021
    • 270%
  • CAGR 2021-27
    • 31.9%
  • Google & MSFT leading the way
  • Cross platform with great support on desktop OSes like Windows and ChromeOS
  • Most browsers are built around open-source projects which we can contribute to
  • Web features are being standardized in the open by interested parties
  • Interoperability and backwards compatibility ensure the platform keeps evolving

Web & Progressive Web Apps dominate PC users’ time

Slide has two sections: "Application Usage by Category," depicted as a donut chart with various categories like Communication (9%), Office (9%), Game (5%), and others with corresponding usage times. "Browsers are top apps on PC," displayed as a horizontal bar chart with Chrome, Edge, Firefox, Outlook, and other applications showing varying usage levels. "Almost 2/3 of PC cycles worldwide are spent on the web" is stated at the bottom, with an inset box showing Web use at 61% on weekdays and 66% on weekends, sourced from Intel DCA, July 2022, with data from ~8M PCs worldwide.

Capabilities

Project Fugu

A diverse set of APIs is required

  • Core capabilities for desktop apps are now built in
  • Seamless copy and paste
  • Frictionless access to local files
  • Safe access to external hardware for education, hobbyists, and enterprise usage

Project Fugu effort

Features a love heart and logos of Google, Microsoft, Intel, Samsung, and Electron, highlighting collaboration or support for the Project Fugu effort.

Capabilities

Enabling new developer experiences

Visual Studio Code

Visual Studio Code for the Web

Visual Studio Code for the Web provides a free, zero-install Microsoft Visual Studio Code experience running entirely in your browser, allowing you to quickly and safely browse source code repositories and make lightweight code changes. To get started, go to https://vscode.dev in your browser.

Made possible with File System Access API

Screenshot of the Visual Studio Code interface.

Capabilities

Give new life to legacy apps

Made possible with a set of new APIs

  • High performance storage - Origin Private File Systems provide highly optimized in-place and exclusive write access
  • Web Assembly to bring existing C++ code to the web
  • Dynamic Multithreading support for Web Assembly
  • SIMD - Halide is essential to Adobe's performance and it provides a 3-4x speedup on average and in some cases an 80-160x speedup.
  • P3 colorspace support for canvas
  • Web Components

Logos for Web Assembly and Code23, an image of an elephant being edited in a software program, and the Intel logo.

Capabilities

Connect with external devices

Introducing the Root® robots for coding, discovery, and play.

Creative and seriously fun, a love for coding is one of the best gifts you can give a child. The iRobot® Root® coding robots make learning to code easy and natural in any environment, at home or at school.

Icons for Bluetooth and a pufferfish symbol with a copyright sign. Image of a robotic device on a golf-themed mat with a GUI interface in the foreground showing coding elements. A web browser URL bar is at the top.

Capabilities

Safety and privacy

Native apps offer "fake safety", but users use just a few apps

  • Signing, app store approval, yet often own installers (Windows)
  • Very powerful direct access to many native APIs without user approval
  • Users install a limited set of apps, but browse many web sites/apps per month

Web APIs are designed with safety and privacy in mind (hard challenge).

We are constantly innovating and refining our approaches

Logos for USB, Bluetooth, and NFC are shown.

Current Focus

Capabilities

Bridge the gap between the web and native

No silicon left behind as the desktop moves to the web

Performance

Achieve near-native performance

Optimize the use of Intel silicon

Performance

SIMD: Single Instruction Multiple Data

Web Assembly standardizes 128 bit (4 times 32 bit values) SIMD operations
Intel (Adv. Vector Extensions) has hardware supporting 256 bit (AVX2) and 512 bit (AVX512)

There are two diagrams comparing SIMD and Scalar modes. SIMD Mode diagram shows eight pairs of labeled boxes (A7 to B0) side by side above an addition symbol, resulting in eight new boxes that represent the sum of the corresponding A and B boxes. Scalar Mode diagram shows a single pair of labeled boxes (A and B) above an addition symbol leading to a single box representing the sum. Below, there are two horizontal bar representations showing the width of SIMD data paths for SSE (128-bit) and AVX (256-bit).
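As a rough illustration of the scalar versus SIMD difference on this slide, the sketch below adds two arrays one element at a time and then in groups of lanes, counting how many add steps each approach needs. This is plain TypeScript simulating lanes, not real SIMD instructions; `addScalar` and `addInLanes` are hypothetical names.

```typescript
interface LaneResult {
  sum: Float32Array;
  steps: number; // number of "add instructions" issued
}

// Scalar mode: one addition per element.
function addScalar(a: Float32Array, b: Float32Array): LaneResult {
  const sum = new Float32Array(a.length);
  let steps = 0;
  for (let i = 0; i < a.length; i++) {
    sum[i] = a[i] + b[i];
    steps++;
  }
  return { sum, steps };
}

// SIMD-like mode: one step covers `lanes` elements
// (4 lanes of 32 bits = 128 bit, 8 lanes = 256 bit).
function addInLanes(a: Float32Array, b: Float32Array, lanes: number): LaneResult {
  const sum = new Float32Array(a.length);
  let steps = 0;
  for (let base = 0; base < a.length; base += lanes) {
    const end = Math.min(base + lanes, a.length);
    for (let i = base; i < end; i++) sum[i] = a[i] + b[i];
    steps++;
  }
  return { sum, steps };
}
```

For eight 32 bit values, the scalar version takes 8 steps, 128 bit lanes take 2, and 256 bit lanes take 1, which is the motivation for mapping Wasm SIMD onto wider hardware registers.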

Performance

Accelerating Wasm performance

Convert 128-bit Wasm SIMD instructions into 256-bit IA instructions dynamically

The left side of the slide includes two code snippet boxes labeled "128bit SIMD" with example instructions, pointing towards a box labeled "256bit SIMD" that has a similar but shorter code snippet with "ymm" registers instead of "xmm" ones. On the right side, there's a graph with the title "XNNPACK end2end Benchmark" and the subtitle "Speedup: higher is better." The graph displays a comparison of performance speed-ups between "YMM" and "XMM" across several benchmarks, with "YMM" consistently higher. At the bottom, there's a reference to the source of the benchmark and the machine configuration used.

Performance

WebCodecs: Accelerating media processing

Media is not just playback but also video conferencing

Modern SOCs have hardware decoders and encoders

Highly efficient, great for battery life, configurable (size, framerate etc.)

Now also available to web apps and sites

  • Video formats: AV1, AVC1, VP8, VP9, HEVC
  • Audio formats: MP3, MP4a, Opus, Vorbis, ULAW, ALAW
  • HDR (High Dynamic Range) support on Intel silicon

Graphical representations of a media processing flow, which includes a filmstrip icon leading to a "Demuxer JavaScript/Wasm," which then connects to both a "WebCodecs VideoDecoder" pointing to a computer monitor icon labeled "Rendering WebGL/WebGPU/Canvas," and a "WebCodecs AudioDecoder" pointing to a speaker icon labeled "AudioWorklet."

The logos of Opus, AV1, and a film reel are displayed at the top-right, and the Intel logo is at the bottom-right.

Performance

Heterogeneous Compute

Since Alder Lake (12th Gen Intel Core) we support E-cores and P-cores.

On the left, a vertical block diagram representing a CPU architecture with various components labeled, such as "Display," "GNA 3.0," multiple "LLC," "Media," and "32EU," among others. On the right, two separate blue blocks labeled "P-Core" and "E-Cores" are connected by arrows to a concept labeled "Building Blocks," with another arrow pointing to "P-Core" indicating "Best Performance" and an arrow pointing to "E-Cores" indicating "Most Efficient."

Performance

Power efficient with heterogeneous compute

Reduce power by tagging threads/tasks with their roles instead of priority

A diagram with various elements:

  • 1. At the top is a flowchart showing Chromium processes and threads, where new thread types are highlighted.
  • 2. Below, APIs and schedulers are shown for different operating systems (Windows, MacOS, Linux/CrOS) aligning with specific thread management tools such as Thread QoS API, Thread Priority API, QoS Class API, nice, cgroup, and uclamp.
  • 3. At the bottom, hardware components are diagrammed with SoC connected to HGS+ (for Intel platform), and dynamic core/frequency scheduling that leads to a P-Cluster and an E-Cluster.
  • 4. A legend on the right side distinguishes between Chromium existing (green bars), Chromium new (light green bar), Operating system (orange bar), and HW components (blue bar).

Performance

Improving power efficiency for video conferencing

  • Assign threads with different ThreadTypes
    • ThreadPriority: desired task starting time
    • ThreadQoS: desired task completion time
  • Assign WebRTC signal/network/worker threads with ResourceEfficient Type
    • Scheduled frequently
    • Less sensitive to latency
    • Less computation heavy

ResourceEfficient threads will be scheduled to E-Cores whenever possible

100+mW power savings achieved for video call on Windows (w/ 12th Gen Intel Core)

Diagram shows a workflow starting with a Decoder, leading to an Encoder, then splitting into Signal, Network, and Worker processes, labeled as "ResourceEfficient." Arrows point from these processes to "P-core" and "E-core" CPU cores. Below, the cores are labeled with "L2 (MLC)" and "L3 (LLC)".

Performance

Video playback power-efficiency improvements

  • Single compressed frame (CPU mem)
  • Decoded frame (GPU mem)
  • Decode IPCs
  • Composition IPCs
  • Driver invocation

Media Player

Browser Process

1 sample
16.67ms

Repeat

Media Player

Browser Process

20 samples
333ms

Repeat

Decoder

Compositor

GPU process

Decoder

MF Utility Process

Renderer Algorithm

16.67 ms

GPU Driver/Hardware

10% SoC power saving for video playback on Windows (w/ 12th Gen Intel Core)

Video playback power-efficiency improvements diagram showing two flowcharts. The left flowchart has a sequence of operations for video playback with frequent interactions labeled 'Media Player,' 'Browser Process,' 'Decoder,' 'Compositor,' and 'GPU Process' with a cycle taking 16.67ms. The right flowchart shows an optimized version with 'IPC Reduction' leading to fewer interactions, same sequence labeled ‘Media Player,’ ‘Decoder,’ and ‘MF Utility Process’ with a cycle taking 333ms.
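The numbers on this slide follow from simple arithmetic, assuming 60 frames per second (my assumption, inferred from the 16.67 ms figure). The sketch below works it out; `batchIntervalMs` and `roundTripsPerSecond` are hypothetical helper names.

```typescript
// Time covered by one batch of samples at a given frame rate.
function batchIntervalMs(samplesPerBatch: number, fps: number): number {
  return (samplesPerBatch * 1000) / fps;
}

// How many times per second the processes need to talk to each other.
function roundTripsPerSecond(samplesPerBatch: number, fps: number): number {
  return fps / samplesPerBatch;
}
```

At 60 fps, one sample per batch means a wake-up about every 16.67 ms, or 60 round trips per second. Twenty samples per batch stretches that to about 333 ms, or 3 round trips per second, so the processes can stay idle longer between wake-ups.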

Performance

Web GPU

Modern graphics and compute

On the right, there is an animated character running.

Deep Intel involvement in WebGPU

  • Spec
    • Both API and WGSL specs are ready
    • Close to official release
  • Browser Implementation
    • Chrome shipped WebGPU in M113 on May 2
    • Firefox and Safari are catching up
  • Early Adopters
    • Machine Learning frameworks: TensorFlow.js, TVM, IREE
    • Machine Learning applications: Google Meet, Adobe Photo Web, Zoom
    • Game engines: Unity, Unreal, Cocos, PlayCanvas, Construct3
    • Rendering frameworks: Three.js, Babylon.js, Orillusion
    • Others: Snap, Node.js, Deno, Google Meet, Google Earth, Sketchfab

Chrome ships WebGPU

After years of development, the Chrome team ships WebGPU which allows high-performance 3D graphics and data-parallel computation on the web.

A timeline starting from February 2017 to May 2023, marking significant milestones in the development and implementation of WebGPU where Intel has been involved. There are bullet points detailing API and WGSL specifications readiness, browser implementation progress in Chrome, Firefox, and Safari, and mentions of early adopters across various sectors like machine learning frameworks, applications, game engines, rendering frameworks, and other technology platforms.

AI at Scale

WebGPU is great for graphics!

A large, central image of a shiny robotic head against a blurred street background. To the right of the image are the logos of Babylon.js, Unreal Engine, and Unity, indicating compatibility or support.

Segue to AI at Scale

The web is ready for the next industry revolution

AI at Scale

WebGPU is more than graphics

  • It's a clear trend that AI is moving to the client side
  • At client side, GPU is the most powerful accelerator for AI
  • WebGPU unleashes the GPU compute capability to the web for the first time, thus makes the web AI ready

AI at Scale

WebGPU support for existing frameworks

  • Intel is owner of WebGPU backend for Tensorflow.js
  • Close collaboration with Google team
  • Reached official release in May
  • Support 160/174 kernels, with specific optimizations on Intel platforms
  • Better perf than WebGL; Much better perf than WASM on middle or large sized models
  • Help to mature WebGPU, including Video Uploading, Timestamp Query, Workgroup Memory Init, Uniformity Analysis, etc.

The TensorFlow logo. A timeline graphic is displayed with milestones: "BEGAN EFFORT" in September 2019, "FIRST NPM RELEASE" in January 2021, "CHROME OT COMPATIBLE RELEASE" in October 2021, and "OFFICIAL RELEASE" in July 2023.

AI at Scale

WebGPU Performance

webgpu vs webgl

Benchmarks

Config: Tigerlake Windows

Bar chart comparing WebGPU versus WebGL performance across various benchmarks with bars representing percentage improvement of WebGPU over WebGL. The chart shows positive performance gains for all benchmarks. The slide is branded with the TensorFlow logo on the top right and features the text "Config: Tigerlake Windows" in the bottom right corner.

AI at Scale

WebGPU Stable Diffusion

A screenshot of a user interface for a demo. The interface contains fields for input prompt (pre-filled with "Vincent van Gogh sunflower"), negative prompt (optional), a drop-down menu for selecting scheduler set to "Multi-step DPM Solver (20 steps)," an option to render intermediate steps, and an initiation setting for GPU device utilizing WebGPU and Nvidia technology. Below the interface options, there's a status message "Generating ... at stage vae, 7 secs elapsed." and a "Generate" button. A video control bar is seen below this. At the bottom right corner, there's a logo for Intel, and text states "GPU: NV3090." There's also an image within the UI showing a street flooded with water and a dog wading in the middle, beneath a traffic light displaying a red signal.

AI at Scale

Heterogenous Hardware for AI

Source: Intel illustration of generally accepted principles.

A chart plotting 'Performance / Throughput (TOPS)' against 'IP Power (W)'. Various computing units are represented as colored shapes, indicating their relative positions in terms of power and performance. The units include VPU (Dedicated AI Engine), TPU, iGPU (Best for Parallelism & Throughput), dGPU (Highest Scalable Performance), and CPU (Fastest Response). Each unit is accompanied by a brief description of its ideal use-case, such as power efficiency, parallelism, advanced content creation, or low-latency AI tasks.

AI at Scale

AI everywhere, even transparently

PWA technology enables modern Zoom experiences on ChromeOS and on any browser

  • Leverages Web Codecs for full control over media processing, incl. hardware acceleration

New web capabilities led by Intel to enhance these experiences:

  • Background blurring
  • Face detection / auto-framing
  • Eye gaze correction
  • Lighting correction
  • Noise suppression
  • Compute Pressure

https://pwa.zoom.us/wc


A thumbnail of a video call interface with reaction emojis.

AI at Scale

Coming soon! Access the best hardware

Web Neural Network API

Bringing efficient machine learning to the web

The slide features a screenshot of an AnandTech website article with the headline "Intel Discloses New Details On Meteor Lake VPU, Lays Out Vision For Client AI" by Ryan Smith dated May 29, 2023. Beneath the headline is an image of a semiconductor chip. To the right, there is a diagram of connected nodes and text stating, "ISV Frameworks for Innovation and Scale," along with logos for ONNX, OpenVINO, Web Neural Network API, and DirectML. The slide footer includes "Roadmap for Innovation" and "Embracing and enabling an open ecosystem," along with the Intel logo and confidentiality notice.

AI at Scale

"We’re also working on WebNN [...] can do local inferencing in-browser [...] we’re also implementing the WebNN standard and working with Intel [...] it’s coming soon at this point."

Jeff Mendenhall, Microsoft Principal PM, AI Frameworks Hybrid & Edge AI
Microsoft Build 2023: Deliver AI-powered experiences across cloud and edge, with Windows
https://youtu.be/ngdDL6Bj7lw?t=375

"Embracing and enabling an open ecosystem [with] W3C Web Neural Network API"

Anandtech
Intel Client AI at Scale
https://www.anandtech.com/Gallery/Album/8282#11

"Another big piece we've been working on in this space is WebNN and that's to enable web-based apps."

John Rayfield (Intel VP & GM of Client AI)
Intel Technology showcase

AI at Scale

Bar chart comparing TensorFlow-Lite Web CPU performance for MobileNetV2 and ResNet50 models, highlighting higher performance in native environments versus WebNN with XNNPACK backend and WebAssembly SIMD. Labels indicate speedup times, with WebNN achieving close to native performance. The footer notes that WebNN delivers near-native power and performance, with disclaimers regarding test conditions and proprietary information.

AI at Scale

AI in action

  • Video shows background blur and video segmentation
  • Models run via Web Neural Network (CPU backend) and Web Assembly
  • Web Neural Network provides a great speedup and near native performance

Belem Zhang, Engineering Manager at Intel Corporation


A headshot of a person labeled "Belem Zhang, Engineering Manager at Intel Corporation," and a diagram of a neural network.

Screencast of a video call demo

AI at Scale

Why bring “native” machine learning to the web?

  • Machine Learning enables new compelling user experiences
  • Lack of access to purpose-built AI hardware disadvantages the web platform in comparison to native platforms

MediaPipe

Background blur and replacement powered by MediaPipe Web, source

There are also three logos displayed at the bottom right: TensorFlow.js, ONNX Runtime, and an Intel logo at the bottom.

On the right side, there is an image demonstrating background blur and replacement powered by MediaPipe Web. The top part of the image shows a woman with a blurred background, while the bottom part shows the same woman with a different background inserted behind her.

AI at Scale

Hardware Accelerated Web APIs

Web API     | Programming model                | Kernel representation | Execution environment | Processing unit
WebAssembly | Code-generation (SIMD)           | Assembly code (Wasm)  | Wasm runtime          | CPU only
WebGPU      | Explicit GPU programming (SPMD)  | Compute shader (WGSL) | Compute pipeline      | GPU only
WebNN       | Computational dataflow (Model)   | MLOperator (graph)    | Native ML runtime     | CPU, GPU, or VPU
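
The table above suggests a simple feature-detection order when choosing an acceleration path. The following is a minimal sketch and not part of the talk: `pickBackend` is a hypothetical helper, and the preference order (WebNN, then WebGPU, then WebAssembly) is an assumption. It only checks whether each entry point is exposed, not whether a given device is actually usable.

```javascript
// Hypothetical helper: report which acceleration API the environment exposes.
// `env` is a global-like object (for example `globalThis` in a page or worker).
function pickBackend(env) {
  const nav = env.navigator;
  if (nav && 'ml' in nav) return 'webnn';    // WebNN entry point: navigator.ml
  if (nav && 'gpu' in nav) return 'webgpu';  // WebGPU entry point: navigator.gpu
  if (typeof env.WebAssembly === 'object') return 'wasm';
  return 'js';                               // plain JavaScript fallback
}

console.log(pickBackend(globalThis));
```

A framework would typically go further and try to create a context or adapter, since an exposed entry point does not guarantee that suitable hardware is present.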

AI at Scale

Hardware Accelerated Web Overview

Use cases

  • Image Classification
  • Object Detection
  • Background Segmentation
  • Noise Suppression
  • Natural Language

Frameworks

  • TensorFlow.js
  • ONNXRuntime Web
  • MediaPipe Web
  • OpenCV.js

Web API

  • WebAssembly (CPU)
  • WebGPU (GPU)
  • WebNN (VPU etc)

Web Engines

  • Web Browser e.g. Chrome, Edge, Safari
  • Web Interoperable Runtime e.g. Node, Deno

Native ML APIs

  • XNNPACK
  • NNAPI
  • DirectML
  • oneDNN
  • OpenVINO

AI at Scale

WebNN, a low-level ML API

  • Web Neural Network is an emerging web standard API for AI acceleration
  • Bring a unified abstraction of neural networks for framework authors
  • Access AI hardware through native machine learning
  • Close to native performance and reliability of results

AI at Scale

Low Level Programming Model

Diagram of a "Low Level Programming Model" for machine learning at scale. There's a flowchart describing the process from MLContext creation through to computing context with MLGraphs, highlighting steps like MLGraphBuilder creation, Computational Graph (Web) with operations like conv2d, add, and relu, and compilation into a Compiled Graph (Native) with a fused conv2d operation. A legend explains the color-coding for input, constant, output, and intermediate operands, as well as operations. Device and power preferences are noted, and associations for WebNN API and other Web APIs are indicated with color-coded squares and arrows depicting call flow and association.

Usage example

// createContext() returns a promise, so it must be awaited.
const context = await navigator.ml.createContext();
const builder = new MLGraphBuilder(context);
// 1. Create a computational graph 'c = a * b'.
const a = builder.input('a', {type: 'float32', dimensions: [3, 4]});
const b = builder.input('b', {type: 'float32', dimensions: [4, 3]});
const c = builder.matmul(a, b);
// 2. Compile it into an executable.
const graph = await builder.build({'c': c});
// 3. Bind inputs to the graph and execute for the result.
const bufferA = new Float32Array(3*4).fill(1.0);
const bufferB = new Float32Array(4*3).fill(0.8);
const bufferC = new Float32Array(3*3);
await context.compute(graph, {'a': bufferA, 'b': bufferB, 'c': bufferC});
// 4. Print the results.
console.log('Output value: ' + bufferC);

On the left is a graphical representation showing two matrices labeled "a" and "b" with dimensions 3x4 and 4x3 respectively, being multiplied to produce a matrix "c" with dimensions 3x3, depicted by an operation block "MatMul".
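
As a sanity check for the example, the same product can be computed in plain JavaScript without WebNN. This reference sketch is an addition, and `matmul` is a hypothetical helper operating on row-major `Float32Array` buffers. With the inputs used on the slide, every element of `c` should come out at about 3.2, since each one sums four products of 1.0 and 0.8.

```javascript
// Reference matrix multiply on row-major buffers: (m x k) * (k x n) -> (m x n).
function matmul(a, b, m, k, n) {
  const c = new Float32Array(m * n);
  for (let i = 0; i < m; i++) {
    for (let j = 0; j < n; j++) {
      let sum = 0;
      for (let p = 0; p < k; p++) {
        sum += a[i * k + p] * b[p * n + j];
      }
      c[i * n + j] = sum;
    }
  }
  return c;
}

// Same inputs as the WebNN example: a is 3x4 filled with 1.0, b is 4x3 filled with 0.8.
const bufferA = new Float32Array(3 * 4).fill(1.0);
const bufferB = new Float32Array(4 * 3).fill(0.8);
const bufferC = matmul(bufferA, bufferB, 3, 4, 3);
console.log('Output value: ' + bufferC);
```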

Intel® Current Focus

Capabilities for amazing user experiences

Telemetry

Compute Pressure API

Check if there's enough juice for the frills

A worried face emoji on the right.

Telemetry

Web experiences informed by system insights

Gain insights into different kinds of system pressure, starting with CPU

  • Adjust number of video feeds
  • Adjust video resolution and frames per second
  • Skip feed filters and non-essentials like WebRTC noise suppression
  • Turn quality-vs-speed and size-vs-speed towards “speed” in WebCodecs

The four key points are illustrated with images.

Telemetry

High-level state changes

Abstract CPU stalls, temperature, other factors

Pressure states represent the minimal set of useful states that allows websites to react to changes in compute and system pressure with minimal degradation in quality or service, or user experience.

WebIDL

enum PressureState { "nominal", "fair", "serious", "critical" };

The PressureState enum represents the pressure state with the following states:

  • "nominal": The conditions of the target device are at an acceptable level with no noticeable adverse effects on the user.
  • "fair": Target device pressure, temperature and/or energy usage are slightly elevated, potentially resulting in reduced battery-life, as well as fans (or systems with fans) becoming active and audible. Apart from that the target device is running flawlessly and can take on additional work.
  • "serious": Target device pressure, temperature and/or energy usage is consistently highly elevated. The system may be throttling as a countermeasure to reduce thermals.
  • "critical": The temperature of the target device or system is significantly elevated and it requires cooling down to avoid any potential issues
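
One way to act on these states, following the adaptations listed earlier for video calls, is a small lookup from state to quality settings. This is a sketch: the function name and all numeric values are illustrative assumptions, not recommendations from the talk or the specification.

```javascript
// Hypothetical mapping from a PressureState value to video call settings.
// All numbers are made up for illustration.
function settingsForPressure(state) {
  switch (state) {
    case 'nominal':
      return { maxFeeds: 9, height: 720, frameRate: 30, filters: true };
    case 'fair':
      return { maxFeeds: 6, height: 720, frameRate: 30, filters: true };
    case 'serious':
      return { maxFeeds: 4, height: 480, frameRate: 15, filters: false };
    case 'critical':
      return { maxFeeds: 1, height: 360, frameRate: 10, filters: false };
    default:
      throw new RangeError(`Unknown pressure state: ${state}`);
  }
}

console.log(settingsForPressure('serious'));
```

Keeping the policy in one pure function makes it easy to tune the thresholds separately from the code that observes pressure.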

Telemetry

Simple, yet flexible API

  • Observe/unobserve
  • Set frequency of observation
  • Changes are queued and pending changes can be retrieved at any time like after un-observation
  • High-level state changes and contributing factors, e.g., power supply and/or thermals
  • Support for workers and <iframe>s
  • Integration with permissions
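
In code, the observe/unobserve flow might look like the sketch below. It assumes the `PressureObserver` interface from the Compute Pressure specification. The `sampleInterval` option (in milliseconds) follows the specification at the time of writing, and earlier drafts named this option differently, so check the version you target. `watchCpuPressure` and its `Observer` parameter are hypothetical; the parameter exists so the function can be exercised where the API is unavailable.

```javascript
// Start observing CPU pressure and report each new state to `onState`.
// Returns a function that stops observation, or null if the API is missing.
function watchCpuPressure(onState, Observer = globalThis.PressureObserver) {
  if (typeof Observer !== 'function') return null;

  const observer = new Observer((records) => {
    // Records arrive in order; the last one is the most recent state.
    const latest = records[records.length - 1];
    if (latest) onState(latest.state);
  });

  // observe() returns a promise that rejects if the source is not supported.
  Promise.resolve(observer.observe('cpu', { sampleInterval: 1000 }))
    .catch((err) => console.error('Cannot observe CPU pressure:', err));

  return () => {
    // Drain any queued records, then stop observing.
    const pending = observer.takeRecords();
    observer.unobserve('cpu');
    if (pending.length > 0) onState(pending[pending.length - 1].state);
  };
}

const stop = watchCpuPressure((state) => console.log('CPU pressure:', state));
```

Calling `stop()` later (when it is not null) ends observation for the `'cpu'` source.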

Telemetry

Compute Pressure API

Click for demo!

Compute Pressure demo

API Status: enabled

On the right, there's a "Compute Pressure demo" section indicating "API Status: enabled" with a red stop button, below which is a warning icon and the words "Critical pressure". Below that is a "Mandelbrot simulation" section with instructions to start the simulation and controls including "Start simulation", "Stop simulation", "Add worker", and "Remove worker". A colorful Mandelbrot set image is shown with the number '5' overlayed.

Announcing the second Compute Pressure origin trial

https://developer.chrome.com/en/blog/compute-pressure-origin-trial-2/

The image shows a webpage with a header reading "Announcing the second Compute Pressure origin trial" and a subheader stating "Published on Tuesday, May 30, 2023." There are two contributing authors listed: Kenneth Christiansen and Arnaud (Arno) Mandy. Below, there's a brief paragraph mentioning a year-long collaboration between Intel, Google, and other parties on the Compute Pressure API, noting the ability to register for an origin trial to test this new API in Chrome 115. It mentions this post explains the problems the API is designed to solve and shows how to use it. The page features an industrial-themed image with pipes, a pressure gauge, and two illuminated light bulbs against a metal wall.

Collaboration

Improvements for streaming

A laughing emoji is placed on the right side

Collaboration

The new normal

  • 30% of companies used web conferencing for the first time after COVID-19
  • Microsoft Teams: 270 M monthly users
  • Cisco Webex: 22 M meetings/month
  • Google Meet: 30X increase in daily usage from Jan 2020
  • Zoom: 10 M (Dec ’19) → 300 M (April ’20)
  • Downloaded in record numbers:
    • 14X (US)
    • 55X (Italy)
    • 22X (France)
    • 20X (UK)

Devices People Use to Access Video Calls

  • Laptop or desktop computer: 77%
  • Conference room equipment: 34%
  • Mobile phone: 31%
  • Tablet: 13%

Most Popular Places for Taking a Work Video Call

  • Home office: 51%
  • Coworking space: 33%
  • Restaurants or coffee shops: 24%
  • Bedroom: 21%
  • Airport: 14%

Collaboration

What kind of features do developers need to improve the video call experience on client devices

  • Background Blur
  • Face Detection
  • Face Framing
  • Eye Gaze Correction
  • Lighting Correction
  • Noise Suppression
  • Speech to Text
  • WebCodecs

Screencast of getUserMedia Camera Constraint Effects Test

Collaboration

Add intelligence to your existing web app

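// Main thread: capture the camera and hand the video track to a worker.
const stream = await navigator.mediaDevices.getUserMedia({video: true});
const [videoTrack] = stream.getVideoTracks();

const videoElement = document.querySelector('video');
const videoWorker = new Worker('video_worker.js');
// Transfer the track to the worker (requires transferable MediaStreamTrack support).
videoWorker.postMessage({videoTrack}, [videoTrack]);
// Wait for the worker to send back the processed track.
const {data} = await new Promise(resolve => { videoWorker.onmessage = resolve; });
videoElement.srcObject = new MediaStream([data.videoTrack]);

// video_worker.js: process frames off the main thread.
self.onmessage = async ({data: {videoTrack}}) => {
  const processor = new MediaStreamTrackProcessor({track: videoTrack});
  //...
}

// Alternatively, ask the platform to blur the background when the camera supports it.
const capabilities = videoTrack.getCapabilities();
if (capabilities.backgroundBlur && capabilities.backgroundBlur.max > 0){
  await videoTrack.applyConstraints({
    advanced: [{backgroundBlur: capabilities.backgroundBlur.max}]
  });
} else {
  // Background blur is not supported by this camera.
}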

Collaboration

Do more with less power

A bar chart titled "Package Power Consumption: Background Blur at 5 fps" compares the relative power consumption of different technologies: Chromium+getUserMedia, WebRTC API our proposal, MediaPipe-GPU bokehEffect, and TF.js BodyPix. The chart indicates that "No background blur" and the "Early prototype" consume significantly less power than MediaPipe-GPU and TF.js implementations. Annotations highlight "No background blur" and "Early prototype" as reference points. On the bottom right, the test setup is noted: Dell Latitude 9420, Tiger Lake, Windows 11 Pro.

Collaboration

Face Detection on the Web

Over the years

OpenCV
  • OpenCV.js
    • WASM
    • CPU
    • 8 MB
MediaPipe

WebRTC

  • ShapeDetection
  • Works only with still images
  • gUM() constraint
  • In-stream face detection

getUserMedia Camera Constraint Effects Test

The screencast shows a man's face centered on a computer screen with a yellow-green facial detection box around it, illustrating the "getUserMedia Camera Constraint Effects Test." The interface has several icons on the right-hand side, indicating various camera controls or settings.

Collaboration

const supports = navigator.mediaDevices.getSupportedConstraints();
if (!supports.faceDetectionMode) {
    throw new Error('Face detection is not supported');
}

const stream = await navigator.mediaDevices.getUserMedia({
    video: { faceDetectionMode: 'bounding-box' }
});

const [videoTrack] = stream.getVideoTracks();

// Use a video worker and show to user.
const videoElement = document.querySelector("video");
const videoGenerator = new MediaStreamTrackGenerator({kind: 'video'});
const videoProcessor = new MediaStreamTrackProcessor({track: videoTrack});
const videoSettings = videoTrack.getSettings();
const videoWorker = new Worker('video-worker.js');
videoWorker.postMessage({
    videoReadable: videoProcessor.readable,
    videoWritable: videoGenerator.writable
}, [videoProcessor.readable, videoGenerator.writable]);
videoElement.srcObject = new MediaStream([videoGenerator]);
videoElement.onloadedmetadata = event => videoElement.play();

Collaboration

// video-worker.js:
self.onmessage = async function(e) {
  const videoTransformer = new TransformStream({
    async transform(videoFrame, controller) {
      for (const face of videoFrame.detectedFaces) {
        let s = '';
        for (const f of face.contour) {
          s += `(${f.x}, ${f.y}),`
        }
        // Template literals need backticks for ${...} to be interpolated.
        console.log(`Face @ (${s})`);
      }
      controller.enqueue(videoFrame);
    }
  });
  e.data.videoReadable
    .pipeThrough(videoTransformer)
    .pipeTo(e.data.videoWritable);
}
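
If the application needs a rectangle rather than a point list, for example to drive face framing, the contour can be reduced to a bounding box. A minimal sketch, assuming `face.contour` is an array of `{x, y}` points as in the worker above; `contourToBox` is a hypothetical helper and the sample points are made up.

```javascript
// Reduce a list of {x, y} points to an axis-aligned bounding box.
// Returns null for an empty contour.
function contourToBox(contour) {
  if (contour.length === 0) return null;
  let minX = Infinity, minY = Infinity;
  let maxX = -Infinity, maxY = -Infinity;
  for (const { x, y } of contour) {
    minX = Math.min(minX, x);
    minY = Math.min(minY, y);
    maxX = Math.max(maxX, x);
    maxY = Math.max(maxY, y);
  }
  return { x: minX, y: minY, width: maxX - minX, height: maxY - minY };
}

// Example with four corner points in normalized coordinates.
const box = contourToBox([
  { x: 0.25, y: 0.5 },
  { x: 0.75, y: 0.5 },
  { x: 0.75, y: 1.0 },
  { x: 0.25, y: 1.0 },
]);
console.log(box);
```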

Collaboration

More Power-efficient Intelligent Collaboration

Bar graph depicting package power consumption for face detection at 15 fps across various technologies: Chromium+ getUserMedia, WebRTC API our proposal, Shape Detection API, OpenCV /WASM, face.js TF TinyFace, and MediaPipe SR GPU. Annotations point out "No face detection" and "Early prototype" for specific bars. Test setup listed: Dell Latitude 9420, Tiger Lake, Windows 11 Pro.

Collaboration

Center of attention

An image of a man standing in a room that appears to be a living space with a cat tree on the left and a window on the right. The man is wearing a plaid shirt and has a neutral expression.

Collaboration

Eye Gaze Correction

const videoStream = await navigator.mediaDevices.getUserMedia({
  video: true,
});

// Show camera video stream to the user.
const video = document.querySelector("video");
video.srcObject = videoStream;

// Get video track capabilities.
const videoTrack = videoStream.getVideoTracks()[0];
const capabilities = videoTrack.getCapabilities();

// Apply eye gaze correction, but only if the track reports support for it.
async function applyEyegazeCorrection() {
  // Check whether eyegazeCorrection is supported.
  if (!capabilities.eyegazeCorrection) return;
  try {
    await videoTrack.applyConstraints({
      eyegazeCorrection: true
    });
  } catch (err) {
    console.error(err);
  }
}

Eye Gaze Correction concept with two side-by-side images of a man: left with eyes looking slightly away, and right with eyes corrected to look at the camera.

Low level Knobs – Image Capture

Developers today have full control over:

  • Exposure time
  • Brightness
  • Color temperature
  • Saturation
  • Sharpness

An inset image of a camera interface with a person's face on the screen, indicating a practical application of these settings. The image includes a web browser header indicating the source from "github.io".
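
Each of these controls is exposed as a constrainable property whose supported range is reported by `getCapabilities()`. The sketch below clamps a requested value into such a range before it would be passed to `applyConstraints()`. `clampToRange` is a hypothetical helper, the `{min, max, step}` shape is assumed from the range capabilities in the MediaStream Image Capture specification, and the example range is made up.

```javascript
// Clamp `value` into [min, max] and snap it to the nearest step from `min`.
function clampToRange(value, { min, max, step }) {
  const clamped = Math.min(max, Math.max(min, value));
  if (!step) return clamped;
  const snapped = min + Math.round((clamped - min) / step) * step;
  return Math.min(max, snapped);
}

// Example range, as a camera might report for brightness (values are made up).
const brightnessRange = { min: 0, max: 255, step: 5 };
console.log(clampToRange(132, brightnessRange));
console.log(clampToRange(999, brightnessRange));
```

In a page, the result could then be applied with something like `videoTrack.applyConstraints({advanced: [{brightness: value}]})`, using the range from `videoTrack.getCapabilities().brightness`.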

Collaboration

const videoCapabilities = videoTrack.getCapabilities();
if ((videoCapabilities.lightingCorrection || []).includes(true)) {
    await videoTrack.applyConstraints({
        advanced: [{
            lightingCorrection: true
        }]
    });
} else {
    // Lighting correction is not supported.
    // Consider falling back to some other method.
}

Lighting Correction will choose the perfect values automatically
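
The check-then-apply pattern in the lighting correction snippet can be factored into a reusable function. This is a sketch under the assumption that the capability is reported as a list of supported boolean values, as in that snippet; effects that report a numeric range would need a different check. `enableIfSupported` is a hypothetical helper.

```javascript
// Turn on a boolean camera effect if the track reports support for it.
// Resolves to true when the constraint was applied, false otherwise.
async function enableIfSupported(track, name) {
  const supported = track.getCapabilities()[name];
  if (!Array.isArray(supported) || !supported.includes(true)) return false;
  await track.applyConstraints({ advanced: [{ [name]: true }] });
  return true;
}
```

A caller could then write `if (!(await enableIfSupported(videoTrack, 'lightingCorrection'))) { /* fall back */ }`.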

One last thing!

Better stylus support

Alexis Menard, Web Platform Architect at Intel Corporation

USI
Universal Stylus Initiative® logo

An open laptop displaying a blank, white screen with a sticker on the wrist rest, placed on a dark surface with three markers to the left and a partially visible Rubik's cube to the right. In the top right corner, there's a smartphone displaying a person's photo. The word "Creation" is in the upper left corner of the screen.

Intel®

Empowering the web and future experiences!