Name: TensorFlow Lite for mobile developers (Google I/O '18)
Uploaded: 2021-01-14T10:33:29.000Z
Duration: 37 min 7 s
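Description: [Learn English with videos] Tens of thousands of YouTube videos, paired with a click-to-look-up English-Chinese dictionary, make it easy to master the pronunciation and usage of words; over time you will no longer need subtitles when watching films.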

Thank you so much for coming to our session this morning.

I'm on the TensorFlow Lite team, and we work on bringing machine learning to mobile and small devices.

And later on I will introduce my colleague Andrew Selle, who will be doing the second half of this talk.

So the last couple of days have been really fun for me.

I've gotten to meet and speak with many of you, and it's been really nice to see the excitement around TensorFlow Lite.

And today I'm happy to be here and talk to you about all the work that our team is doing to make machine learning on small devices possible and easy.

So in today's talk, we'll cover three areas.

First, we'll talk about why machine learning directly on device is important and how it's different than what you may do on the server.

Second, we'll walk you through what we have built with TensorFlow Lite, and lastly we'll show you how you can use TensorFlow Lite in your own apps.

So first, let's talk about devices for a bit.

What do we mean when we say a device? Well, usually a mobile device, basically our phones. Our phones are with us all the time.

We interact with them so many times during the day, and modern phones come with a large number of sensors on them, which give us really rich data about the physical world around us.

Another category of devices is what we call edge devices, and this industry has seen a huge explosion in the last few years.

So some examples are smart speakers, smartwatches and smart cameras.

And as this market has grown, we see that technology, which only used to be available on more expensive devices, is now available on far cheaper ones.

So now we're seeing that there is this massive growth in devices. They're becoming increasingly capable, both mobile and edge.

And this is opening up many opportunities for novel applications for machine learning.

So I expect that many of you are already familiar with the basic idea of machine learning.

But for those that aren't, I'm going to really quickly cover the core concepts.

So let's start with an example of something that we may want to do.

So in the past, what we would have done was to write a lot of rules that were hard-coded, very specific about some specific characteristics that we expected to see in parts of the image.

This was time consuming, hard to do and frankly didn't work all that well.

And this is where machine learning comes in. With machine learning, we learn based on examples.

So a simple way to think about machine learning is that we use algorithms to learn from data, and then we make predictions about similar data that has not been seen before.

So it's a two-step process: first the model learns, and then we use it to make predictions.

The process of model learning is what we typically call training, and when the model is making predictions about data, that is what we call inference.

This is a high-level view of what's happening during training.

The model is passed in labeled data, that is, input data along with the associated prediction. Since in this case we know what the right answer is, we're able to calculate the error, that is, how many times the model is getting it wrong and by how much. We use these errors to improve the model, and this process is repeated many, many times until we reach the point that we think that the model is good enough or that this is the best that we can do.
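The training loop described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not anything shown in the talk: the data set, the one-parameter model `y = w * x`, the learning rate and the step count are all made-up assumptions.

```python
# Toy training loop: learn w in y = w * x from labeled examples.
# All numbers here are invented for illustration.
labeled_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # (input, label)


def train(data, learning_rate=0.01, steps=500):
    w = 0.0  # start from an arbitrary guess
    for _ in range(steps):
        gradient = 0.0
        for x, label in data:
            prediction = w * x
            error = prediction - label   # how wrong the model is, and by how much
            gradient += 2 * error * x    # derivative of the squared error w.r.t. w
        # Use the averaged errors to improve the model a little.
        w -= learning_rate * gradient / len(data)
    return w


def predict(w, x):
    # Inference: apply the learned model to data it has not seen.
    return w * x


w = train(labeled_data)
unseen_prediction = predict(w, 10.0)
```

Training is the repeated loop in `train`; inference is the single call to `predict` on an input that was not in the training data.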

This involves a lot of steps and coordination, and that is why we need a framework to make this easier.

TensorFlow is Google's framework for machine learning.

It makes it easy to train and build neural networks, and it is cross platform.

It works on CPUs, GPUs and TPUs, as well as mobile and embedded platforms, and the mobile and embedded piece of TensorFlow, which we call TensorFlow Lite, is what we're going to be focusing on in our talk today.

So now we want to talk about why you would consider doing machine learning directly on device.

And there's several reasons that you may consider.

But probably the most important one is latency.

If the processing is happening on the device, then you're not sending data back and forth to the server.

So if your use case involves real-time processing of data such as audio or video, then it's quite likely that you would consider doing this.

Other reasons are that your processing can happen even when your device is not connected to the Internet.

The data stays on device.

This is really useful if you're working with sensitive user data, which you don't want to put on servers.

It's more power efficient because your device is not spending power transmitting data back and forth.

And lastly, we're in a position to take advantage of all the sensor data that's already available and accessible on the device.

But there's a catch, like there always is, and the catch is that doing on-device ML is hard.

Many of these devices have some pretty tight constraints.

They have small batteries, tight memory and very little computation power.

TensorFlow was built for processing on the server, and it wasn't a great fit for these use cases.

And that is the reason that we built TensorFlow Lite.

It's a lightweight machine learning library for mobile and embedded platforms. So this is a high-level overview of the system.

It consists of a converter, where we convert models from TensorFlow format to TensorFlow Lite format, and for efficiency reasons we use a format which is different.

Then it consists of an interpreter, which runs on device.

There is a library of ops and kernels, and then we have APIs which allow us to take advantage of hardware acceleration whenever it is available.

TensorFlow Lite is cross-platform, so it works on Android, iOS and Linux. A high-level developer workflow here would be to take a trained TensorFlow model, convert it to TensorFlow Lite format, and then update your apps to use the TensorFlow Lite interpreter using the appropriate API.
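The convert-then-interpret workflow can be sketched with TensorFlow's Python API. This is a hedged sketch, not the code from the talk: it assumes a present-day TensorFlow install exposing `tf.lite.TFLiteConverter` and `tf.lite.Interpreter` (the 2018 release used different entry points), a model exported as a SavedModel, and a single input and output tensor. The function names and the `saved_model_dir` argument are hypothetical, and a mobile app would call the Java, Swift or C++ interpreter rather than Python.

```python
def convert_saved_model(saved_model_dir):
    """Convert a trained TensorFlow SavedModel into TensorFlow Lite flatbuffer bytes."""
    import tensorflow as tf  # imported lazily; assumes TensorFlow is installed

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    return converter.convert()


def run_inference(tflite_model, input_array):
    """Run one inference with the TensorFlow Lite interpreter.

    Assumes the model has exactly one input tensor and one output tensor.
    """
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()  # memory for all tensors is planned up front

    input_detail = interpreter.get_input_details()[0]
    output_detail = interpreter.get_output_details()[0]

    interpreter.set_tensor(
        input_detail["index"],
        np.asarray(input_array, dtype=input_detail["dtype"]),
    )
    interpreter.invoke()
    return interpreter.get_tensor(output_detail["index"])
```

A typical use would be to call `convert_saved_model` once at build time, ship the resulting bytes with the app as a `.tflite` file, and run the interpreter on the device.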

iOS developers also have the option of using Core ML instead.

And what they would do here is to take their trained TensorFlow model and convert it to Core ML using the TensorFlow to Core ML converter, and then use the converted model with the Core ML runtime.

So the two common questions that we get when we talk to developers about TensorFlow Lite are: is it small, and is it fast?

One of our fundamental design goals for TensorFlow Lite was to keep the memory and binary size small, and I'm happy to say that the size of our core interpreter is only 75 kilobytes.

And when you include all the supported ops, the size is 400 kilobytes.

So, first of all, we've been really careful about which dependencies we include.

Secondly, TensorFlow Lite uses FlatBuffers, which are far more memory efficient than protocol buffers are.

One other feature that I want to call out here in TensorFlow Lite is what we call selective registration, and that allows developers to only use the ops that their model needs, and thus they can keep the footprint small.

Now moving on to the second question, which is of speed.

So we made several design choices throughout the system to enable fast startup, low latency and high throughput.

So let's start with the model file format.

TensorFlow Lite uses FlatBuffers, like I said, and FlatBuffers is a cross-platform, efficient serialization library.

It was originally created at Google for game development and is now being used for other performance-sensitive applications.

The advantage of using FlatBuffers is that we can directly access the data without doing parsing or unparsing of the large files which contain weights.
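To make "direct access without parsing" concrete, here is a small standard-library sketch. It does not use the FlatBuffers library or the real TensorFlow Lite file layout; the byte layout below (a count followed by float32 values) is a hypothetical stand-in chosen only to show a value being read at a known offset without decoding the rest of the file.

```python
import struct

# Hypothetical toy layout, NOT the real FlatBuffers or TensorFlow Lite format:
#   bytes 0..3 : little-endian uint32 count of weights
#   bytes 4..  : that many little-endian float32 weights


def build_model_bytes(weights):
    header = struct.pack("<I", len(weights))
    body = struct.pack(f"<{len(weights)}f", *weights)
    return header + body


def read_weight(buffer, index):
    """Read one weight directly from its offset; no other value is decoded."""
    (count,) = struct.unpack_from("<I", buffer, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    (value,) = struct.unpack_from("<f", buffer, 4 + 4 * index)
    return value


model_bytes = build_model_bytes([0.5, -1.25, 3.0])
view = memoryview(model_bytes)  # a view over the same bytes, no copy is made
third_weight = read_weight(view, 2)
```

With a format like this, a large weights file can be memory-mapped and read in place, which is the property the talk attributes to FlatBuffers.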

Another thing that we do at the time of conversion is that we pre-fuse the activations and biases, and this leads to faster execution at runtime.

The TensorFlow Lite interpreter uses static memory and a static execution plan.

This leads to faster load times. Many of the kernels that TensorFlow Lite comes with have been specially optimized to run fast on ARM CPUs.
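The idea behind fusing biases and activations can be illustrated with a toy layer in plain Python. This is a sketch of the general technique, not TensorFlow Lite's converter or kernels; the function names and the numbers are assumptions made up for the example.

```python
def matmul_vec(weights, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def add_bias(values, bias):
    return [v + b for v, b in zip(values, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def unfused_layer(weights, bias, x):
    # Three separate ops, each producing an intermediate list.
    return relu(add_bias(matmul_vec(weights, x), bias))


def fused_layer(weights, bias, x):
    # One op: bias and activation are applied as each output is produced,
    # so no intermediate lists are allocated between steps.
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


weights = [[1.0, -2.0], [0.5, 0.5]]
bias = [0.5, -3.0]
x = [2.0, 1.0]

unfused_result = unfused_layer(weights, bias, x)
fused_result = fused_layer(weights, bias, x)
```

Both versions compute the same values; the fused one makes a single pass with fewer intermediate buffers, which is where the runtime saving comes from.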

Now let's talk about hardware acceleration.

As machine learning has grown in prominence, it has spurred quite a bit of innovation at the silicon layer.

And many hardware companies are investing in building custom chips, which can accelerate neural network processing.

GPUs and DSPs, which have been around for some time, are also now being increasingly used to do machine learning tasks.

TensorFlow Lite was designed to take advantage of hardware acceleration, whether it is through GPUs, DSPs or custom AI chips.

On Android, the recently released Android Neural Networks API is an abstraction layer which makes it easy for TensorFlow Lite to take advantage of the underlying acceleration.

The way this works is that hardware vendors write specialized drivers or custom acceleration code for their hardware platforms and integrate with the Android NN API.

TensorFlow Lite, in turn, integrates with the Android NN API via its internal delegation API.

A point to note here is that developers only need to integrate their apps with TensorFlow Lite.

TensorFlow Lite will take care of abstracting away the details of hardware acceleration from them.
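The delegation idea can be pictured with a toy dispatcher. Everything in this sketch is hypothetical (the class names, the two ops, the `supported_ops` set); it is not TensorFlow Lite's delegate API or the Android NN API, only an illustration of an interpreter handing supported ops to an accelerator and falling back to its own CPU kernels otherwise.

```python
class CpuKernels:
    """Built-in fallback that can run every op in this toy."""

    def run(self, op, a, b):
        if op == "add":
            return a + b
        if op == "mul":
            return a * b
        raise ValueError(f"unknown op: {op}")


class ToyAcceleratorDelegate:
    """Hypothetical accelerator that only knows how to multiply."""

    supported_ops = {"mul"}

    def __init__(self):
        self.handled = []  # record of the ops this delegate executed

    def run(self, op, a, b):
        self.handled.append(op)
        return a * b


class ToyInterpreter:
    def __init__(self, delegate=None):
        self.cpu = CpuKernels()
        self.delegate = delegate

    def run(self, op, a, b):
        # Hand the op to the delegate when it supports it, else use the CPU.
        if self.delegate is not None and op in self.delegate.supported_ops:
            return self.delegate.run(op, a, b)
        return self.cpu.run(op, a, b)


# The app only ever talks to the interpreter.
delegate = ToyAcceleratorDelegate()
interpreter = ToyInterpreter(delegate)
product = interpreter.run("mul", 3, 4)
total = interpreter.run("add", 3, 4)
```

The calling code is identical with or without a delegate, which mirrors the point that apps integrate only with the interpreter and not with the hardware.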

We are also working on building direct GPU acceleration in TensorFlow Lite. GPUs are widely available and used.

And, like I said before, they're now being increasingly used for doing machine learning tasks.

Similar to the NN API, developers only integrate with TensorFlow Lite if they want to take advantage of the GPU acceleration.

So the last bit on performance that I want to talk about is quantization.

And this is a good example of an optimization which cuts across several components in our system.

A simple way to think about it is that it refers to techniques to store numbers and to perform calculations on numbers in formats that are more compact than 32-bit floating point representations. And why is this important?
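As a concrete picture of a more compact format, the sketch below maps float values to 8-bit integers with a scale and a zero point, then maps them back. It is a generic affine quantization example under assumed parameters, not necessarily the exact scheme TensorFlow Lite uses; the weights and function names are made up.

```python
import struct


def choose_params(values, num_levels=256):
    # The represented range must include 0.0 so that zero maps to an integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (num_levels - 1) or 1.0  # avoid a zero scale
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point):
    # Round to the nearest level and clamp into the 8-bit range.
    return [min(255, max(0, round(v / scale) + zero_point)) for v in values]


def dequantize(quantized, scale, zero_point):
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
scale, zero_point = choose_params(weights)
quantized = quantize(weights, scale, zero_point)
restored = dequantize(quantized, scale, zero_point)

float_bytes = struct.pack(f"<{len(weights)}f", *weights)  # 4 bytes per value
quantized_bytes = bytes(quantized)                        # 1 byte per value
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each value takes one byte instead of four, at the cost of a rounding error that stays within one quantization step.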