Machine Learning on Your Device: The Options (Google I/O '19)

  • [MUSIC PLAYING]

  • DANIEL SITUNAYAKE: Hey, everybody.

  • So my name's Daniel.

  • LAURENCE MORONEY: And I'm Laurence.

  • DANIEL SITUNAYAKE: And we have an awesome talk

  • for you this afternoon.

  • So I've been super excited watching

  • the keynote, because there's just been so much stuff

  • that is relevant to what we're going

  • to be talking about today, which is running machine

  • learning on devices.

  • So we've seen the kind of amazing things

  • that you can do if you're running ML at the edge

  • and on device.

  • But there are a ton of options that developers have

  • for doing this type of stuff.

  • And it can be a little bit hard to navigate.

  • So we decided to put this session together

  • to give you an overview.

  • We're going to be showing you all the options,

  • walking you through some code, and showing you

  • some awesome demos.

  • So hopefully, you'll enjoy.

  • So first, we're going to walk through using TensorFlow

  • to train models and then export saved models, which

  • you can then convert to deploy on devices.

  • We're then going to see a number of different ways

  • that you can deploy to Android and iOS devices.

  • And finally, we're going to talk about

  • some new and super exciting hardware devices that you

  • can use to run your models.

  • So first of all, I want to give you

  • an overview of some different technologies and the device

  • types they each allow you to support.

  • So to begin with, we have ML Kit,

  • which is designed to make it super easy to deploy

  • ML inside of mobile apps.

  • We then have TensorFlow.js, which basically

  • lets you target any device that has a JavaScript interpreter,

  • whether that's in browser or through Node.js.

  • So that even supports embedded platforms.

  • And finally, TensorFlow Lite gives you

  • high performance inference across any device or embedded

  • platform, all the way from mobile phones

  • to microcontrollers.

  • So before we get any further, let's

  • talk a little bit about TensorFlow itself.

  • So TensorFlow is Google's tool chain for absolutely everything

  • to do with machine learning.

  • And as you can see, there are TensorFlow tools

  • for basically every part of the ML workflow,

  • from loading data through to building models

  • and then deploying them to devices and servers.

  • So for this section of the talk, we're

  • going to focus on building a model with TensorFlow

  • and then deploying it as a TensorFlow Lite model.

  • There are actually tons of ways to get up

  • and running with TensorFlow on device.

  • So the quickest way is to try out our demo apps and sample

  • code.

  • And we also have a big library of ready-to-use pretrained

  • models that you can drop into your apps.

  • You can also take these and retrain them

  • based on your own data using transfer learning.

  • You can, as you've seen this morning,

  • use Federated Learning to train models

  • based on distributed data across a pool of devices.

  • And finally, you can build models from scratch,

  • which is what Laurence is now going to show off.

  • LAURENCE MORONEY: Thank you, Daniel.

  • Quick question for everybody.

  • How many of you have ever built a machine learning model?

  • Oh, wow.

  • DANIEL SITUNAYAKE: Wow.

  • LAURENCE MORONEY: Oh, wow.

  • Big round of applause.

  • So hopefully, this isn't too basic for you,

  • what I'm going to be showing.

  • But I want to show just the process

  • of building a model and some of the steps

  • that you can then do to prepare that model

  • to run on the mobile devices that Daniel was talking about.

  • Can we switch to the laptop, please?

  • Can folks at the back read that code?

  • Just wave your hands if you can.

  • OK, good.

  • Wave them like this if you need it bigger.

  • OK, some do, or you just want to stretch.

  • Let's see.

  • How's that?

  • OK, cool.

  • So I'm just going to show some very basic TensorFlow

  • code, here.

  • And I wanted to show the simplest

  • possible neural network that I could.

  • So for those of you who've never built something in machine

  • learning or have never built a machine learning model,

  • the idea is like with a neural network,

  • you can do some basic pattern matching

  • from inputs to outputs.

  • We're at Google I/O, so I'm going

  • to talk about inputs and outputs a lot.

  • And in this case, I'm creating the simplest

  • possible neural network I can.

  • And this is a neural network with a single layer

  • and a single neuron in that neural network.

  • And that's this line of code, right here.

  • It's keras.layers.Dense, with units equal to 1

  • and input_shape equal to 1.

  • And I'm going to then train this neural network on some data,

  • and that's what you can see in the line

  • underneath-- the Xs and the Ys.

  • Now, there is a relationship between these data points.

  • Can anybody guess what that relationship is?

  • There's a clue in the 0 and 32.

  • AUDIENCE: Temperature.

  • LAURENCE MORONEY: Yeah, a temperature conversion, right?

  • So the idea is I could write code that's like 9 over 5

  • plus whatever, plus 32.

  • But I want to do it as a machine learn model, just

  • to give as an example.

  • So in this case, I'm going to create this model.

  • And with this model, I'm just training it

  • with six pairs of data.

  • And then what it will do is it will start then

  • trying to infer the relationship between these data points.

  • And then from that, going forward,

  • it's a super simple model to be able to do a temperature

  • conversion.

  • So how it's going to work is it's going to make a guess.

  • And this is how machine learning actually works.

  • It just makes a wild guess as to what the relationship

  • between these data points is.

  • And then it's got something called a loss function.

  • And what that loss function is going to do

  • is it's going to see how good or how bad that guess actually is.

  • And then based on the data from the guess

  • and the data from the loss function,

  • it then has an optimizer, which is this.

  • And what the optimizer does is it creates another guess,

  • and then it will measure that guess to see how well

  • or how badly it did.

  • It will create another guess, and it will measure that,

  • and so on, and so on, until I ask it to stop

  • or until it does it 500 times, which is what this line of code

  • is actually doing.
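
For readers following along without the workbook, here's a minimal sketch of the kind of model being described, assuming TensorFlow 2 with Keras. The single Dense layer, the six Celsius/Fahrenheit pairs, the 500 epochs, and the prediction for 100 match the talk; the specific data values and the Adam optimizer with mean squared error loss are assumptions, not necessarily what's in the original notebook.

```python
import numpy as np
import tensorflow as tf

# One layer with one neuron: a single input (Celsius), a single output (Fahrenheit).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1])
])

# The loss function measures how good or bad each guess is;
# the optimizer uses that to make the next guess.
model.compile(optimizer=tf.keras.optimizers.Adam(0.1), loss='mean_squared_error')

# Six pairs of data points (note the 0 -> 32 clue).
celsius    = np.array([-40.0, -10.0,  0.0,  8.0, 15.0, 22.0], dtype=float)
fahrenheit = np.array([-40.0,  14.0, 32.0, 46.4, 59.0, 71.6], dtype=float)

# Guess, measure, optimize -- repeated 500 times.
model.fit(celsius, fahrenheit, epochs=500, verbose=0)

# Roughly 211-and-something rather than exactly 212,
# because the model has only ever seen six data points.
print(model.predict(np.array([[100.0]])))
```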

  • So if I'm going to create this model quite simply,

  • we'll see it's going to train.

  • There was one that I created earlier,

  • so my workbook's taking a moment to get running.

  • My network connection's gone down.

  • Hang on.

  • Let me refresh and reload.

  • You love it when you dry run a demo, and it works great.

  • We get that warning.

  • I'll run that.

  • I'll do that.

  • And now, it starts training, hopefully.

  • There we go.

  • It's starting to train, now.

  • It's going through all these epochs.

  • So it's going to do that 500 times.

  • And then at the end of the 500 times,

  • it's going to have this trained model.

  • And then this trained model, I'm just

  • going to ask it to predict.

  • So for example, if I give it 100 degrees centigrade,

  • what's that going to be in Fahrenheit?

  • The real answer is 212, but it's going

  • to give me 211 and something, because this

  • isn't very accurate.

  • Because I've only trained it on six points of data.

  • So if you think about it, there's a linear relationship

  • between centigrade and Fahrenheit across those six points,

  • but the computer doesn't know that.

  • It doesn't know the relationship goes linearly forever.

  • It could go like this, or it could change.

  • So it's giving me a very high probability

  • that for 100 degrees centigrade, it would be 212 Fahrenheit.

  • And that comes out as 211 and something degrees Fahrenheit.

  • So I've just built a model.

  • And what we're going to take a look at next

  • is, how do I get that model to work on mobile?

  • Can we switch back to the slides, please?

  • So the process is pretty simple.

  • The idea is, using Keras or using

  • an estimator in TensorFlow,

  • you build a model.

  • You then save that model out in a file format

  • called SavedModel.

  • And in TensorFlow 2, we're standardizing on that file

  • format to make it easier for us to be able to go across

  • different types of runtimes, like JavaScript

  • in the web, TFX, or TensorFlow Lite.

  • By the way, the QR code on this slide is to the workbook

  • that I just showed a moment ago.

  • So if you want to experiment with that workbook

  • for yourself, if you're just learning, please go ahead

  • and do so.

  • It's a public URL, so feel free to have fun with it.

  • And I put a whole bunch of QR codes

  • in the rest of the slides.

  • Now, once you've done that, in TensorFlow Lite,

  • there's something called the TensorFlow Lite Converter.

  • And that will convert our SavedModel

  • into a TensorFlow Lite model.

  • So the process of converting means

  • it's going to shrink the model.

  • It's going to optimize the model for running on small devices,

  • for running on devices where battery life is

  • a concern, and things like that.

  • So out of that process, I get a TensorFlow Lite model,

  • which I can then run on different devices.

  • And here's the code to actually do that.

  • So we've got a little bit of a breaking change

  • between TensorFlow 1 and TensorFlow 2.

  • So in the workbook that was on that QR code,

  • I've put both pieces of code on how to create the SavedModel.

  • And then once you've done that, the third line from the bottom

  • here is the TF Lite Converter.

  • And all you have to do is say here's

  • the SavedModel directory.

  • Run the TF Lite Converter from SavedModel in that directory,

  • and it will generate a .tflite file for me.

  • And that .tflite file is what I can then use on mobile.
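
As a rough sketch of what those few lines look like in code, assuming TensorFlow 2 (the talk notes the TF1 API differs slightly) and using illustrative /tmp paths:

```python
import tensorflow as tf

# Export the trained Keras model in the SavedModel format.
saved_model_dir = '/tmp/saved_model'
tf.saved_model.save(model, saved_model_dir)

# Point the TensorFlow Lite converter at the SavedModel directory.
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
tflite_model = converter.convert()

# Write the flatbuffer out; the number of bytes written is the model size
# (612 bytes for the tiny model in the talk).
with open('/tmp/model.tflite', 'wb') as f:
    print(f.write(tflite_model))
```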

  • So let's take a look at that in action,

  • if we can switch back to the laptop.

  • So all I'm going to do within the same workbook

  • is I'm going to run that code that I just saw.

  • And I'm using TensorFlow 1.x and Colab here.

  • And we shall see that it actually

  • has saved out a model for me in this directory.

  • And I need that in the next piece of code,

  • because I have to tell it the directory that it got saved to.

  • So I'll just paste that in, and then I'll run this out.

  • And we can see the TF Lite Converter is what

  • will do the conversion for us.

  • So if I run that, it gives me the number 612.

  • Can anybody guess why it gives me the number 612?

  • It's not an HTTP code.

  • I thought it was that, at first, too.

  • That's actually just the size of the model.

  • So the model that I just trained off those six pieces of data,

  • when that got compiled down, it's a 612 byte model.

  • So if I go in there, you can see I saved it

  • in /tmp/model.tflite.

  • And in my Colab, if I go and look at the /tmp directory,

  • we'll see model.tflite is there.

  • And I could download that then to start using it

  • in my mobile apps if I like.
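
This step isn't shown in the talk, but before dropping the file into an app it can be useful to sanity-check it with the Python TensorFlow Lite interpreter. A small sketch, assuming TensorFlow 2 and the /tmp/model.tflite path from above:

```python
import numpy as np
import tensorflow as tf

# Load the converted model and allocate its tensors.
interpreter = tf.lite.Interpreter(model_path='/tmp/model.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Inputs and outputs are tensors, so a single float goes in as a 1x1 array.
interpreter.set_tensor(input_details[0]['index'],
                       np.array([[100.0]], dtype=np.float32))
interpreter.invoke()

# Should print roughly 211.3, matching the Android demo later in the talk.
print(interpreter.get_tensor(output_details[0]['index']))
```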

  • Can we switch back to the slides, please?

  • So now, we have the model.

  • We've trained the model.

  • Obviously, the models you're going to train

  • are hopefully a little bit more complicated than the one

  • that I did.

  • You've been able to convert that model to TF Lite.

  • And now, what can you do with that model,

  • particularly on mobile?

  • Well, there's three sets of options that I want to cover.

  • The first one, if you were at the developer keynote,

  • you probably saw ML Kit.

  • And ML Kit is super cool.

  • For me, in particular, it uses the Firebase programming API.

  • Any Firebase fans here?

  • Yeah, woo!

  • The Firebase API for programming,

  • I find particularly cool.

  • It's got a really nice asynchronous API.

  • And when you think about it, when I'm using a model,

  • I'm going to be passing data to the model.

  • The model is going to run some inference,

  • and it's going to send something back to me.

  • So it's perfect for Firebase, and it's

  • perfect for that asynchronous API that Firebase gives us.

  • If we don't want to use Firebase-- and remember,

  • Firebase ships with a bunch of models that work out

  • of the box for vision detection and some of the AutoML

  • stuff that we saw today.

  • But you can also ship your custom TF Lite model

  • into Firebase if you want.

  • But if you don't want to use Firebase, or maybe your model

  • is going to be deployed in a country where Firebase isn't

  • supported, or you want it to work completely offline,

  • and things like that, then the idea is you can still deploy

  • a model directly to your app.

  • And I'm going to show TensorFlow Lite for that,

  • getting low level and using TensorFlow Lite directly

  • instead of going through the ML Kit wrapper.

  • And then finally, there's the mobile browser.

  • So you can actually deploy a model.

  • You can convert it to JSON, and you

  • can deploy it to run, actually, in a mobile browser, which

  • I find pretty cool.

  • So first, let's take a look at ML Kit.

  • So ML Kit is Google's solution for Firebase developers

  • and for mobile developers who want

  • to have machine learning models running in their applications.

  • Any ML Kit users here, out of interest?

  • Oh, not many.

  • Wow.

  • Well, you're in for a treat if you haven't used it yet.

  • Go check out the Firebase booth, the Firebase sandbox.

  • They've got some really cool stuff that you can play with.

  • But just to show how it works, the idea

  • is that in the Firebase console, you can either

  • pick one of the preexisting models that Firebase gives you,

  • or you can upload the model that you just created.

  • So in Firebase, you've got the option

  • to use a custom model that you've uploaded-- here,

  • you can see one that I uploaded

  • a couple of weeks ago.

  • It's now in the Firebase console,

  • and I can use it within my Firebase app.

  • And I can use it alongside a lot of the other Firebase goodies

  • like analytics.

  • Or a really cool one is A/B testing.

  • So I can have two versions of my model.

  • I could A/B test to see which works best.

  • Those kinds of services are available to Firebase

  • developers, and when integrated with machine learning,

  • I find that pretty cool.

  • And then once I've done that, now,

  • when I start building my application,

  • I do get all of the goodness of the Firebase programming API.

  • So if this is on Android, the idea is with TensorFlow Lite,

  • there's a TensorFlow Lite runtime

  • object that we'll often call the interpreter.

  • And here, you can see, I'm just calling interpreter.run.

  • I'm passing it my inputs.

  • So in this case, for the centigrade to Fahrenheit

  • conversion, I'm just going to pass it a float.

  • And then in its onSuccessListener,

  • it's going to give me a callback when the model has

  • finished executing.

  • So it's really nice in the sense that it

  • can be very asynchronous.

  • If you have a really big model that

  • might take a long time to run, instead of you locking up

  • your UI thread, it's going to be working nice and asynchronously

  • through ML Kit.

  • So in my addOnSuccessListener, I'm adding a SuccessListener.

  • It's going to give me a callback with the results.

  • And then that result, I can parse

  • to get my output from the machine learning model.

  • And it's really as simple as that.

  • And in this case, I'm passing it in a float.

  • It's converting the temperature.

  • It's sending a float back to me.

  • And that's why my getOutput array is a float array

  • with a single element in it.

  • One thing you'll encounter a lot, if you haven't

  • worked in machine learning and haven't built

  • machine learning models before,

  • is that when you're passing data in,

  • you pass it in as tensors.

  • But when you are mapping those tensors to a high level

  • programming language, like Java or Kotlin,

  • you tend to use arrays.

  • And when it's passing stuff back to you,

  • it's passing back a tensor.

  • And again, they tend to map to arrays,

  • and that's why in the code here, you're seeing arrays.

  • So iOS.

  • Any iOS fans here?

  • Oh, a few.

  • Hey, nobody booed.

  • You said they would boo.

  • [CHUCKLING]

  • So in iOS, it also works.

  • So for example, again, I have my interpreter in iOS.

  • This is Swift code.

  • I'll call the .run method on my interpreter.

  • I'll pass it the inputs, and I will get the outputs back.

  • And again, in this very simple model,

  • I'm just getting a single value back.

  • So it's just my outputs at index 0 I'm going to read.

  • If you're doing something more complex,

  • your data in and your data out structures

  • are going to be a bit more complex than this.

  • But as Daniel mentioned earlier on,

  • we have a bunch of sample applications

  • that you can dissect to take a look at how they actually

  • do it.

  • So that's ML Kit.

  • And that's a rough look at how it

  • can work with the custom models that you build

  • and convert to run in TensorFlow Lite.

  • But let's take a look at the TensorFlow Lite runtime itself.

  • So now, if I'm building an Android application,

  • and I've built my model, and I don't

  • want to depend on an external service like the Firebase

  • service to deploy the model for me,

  • I want to bundle the model with my app.

  • And then, however the user gets the app, via the Play Store

  • or via other means, the model is a part of it.

  • Then it's very easy for me to do that.

  • So that .tflite file that I created earlier on, all I have

  • to do is put that in my assets folder in Android as an asset,

  • just like any other-- like any image, or any JPEG,

  • or any of those kind of things.

  • It's just an asset.

  • But the one thing that's really important,

  • and it's the number one bug that most people will

  • hit when they first start doing this,

  • is that when Android deploys your app to the device

  • to run it, it will zip up and compress

  • everything in the Assets folder.

  • The model will not work if it is compressed.

  • It has to be uncompressed.

  • So in your build.gradle, you just specify aaptOptions.

  • You say noCompress "tflite", and then it

  • won't compress the tflite file for you.

  • And then you'll be able to run it and do inference.

  • So many times, I've worked with people building their first TF

  • Lite application, and it failed when loading the model

  • into the interpreter.

  • They had no idea why.

  • And it's they've forgotten to put this line.

  • So if you only take one thing away from this talk,

  • take this slide away, because it will

  • solve a lot of your problems when

  • you get started with TF Lite.

  • Then of course, still in build.gradle,

  • all you have to do is, in your dependencies,

  • you add the implementation of the TensorFlow-lite runtime.

  • And what that's going to do is it's

  • going to give you the latest version of TensorFlow Lite,

  • and then that will give you the interpreter that you

  • can use.

  • And this QR code is a link.

  • I've put the full app that I'm going

  • to show in a moment on GitHub, so you can go and have a play

  • and hack around with it if you like.

  • So now, if I want to actually do inference--

  • so this is Kotlin code.

  • And there's a few things to take a look at here

  • on how you'll do inference and how you'll actually

  • be able to get your model up and running to begin with.

  • So first of all, there are two things that I'm declaring here.

  • Remember earlier, we tend to use the term interpreter

  • for the TF Lite runtime.

  • So I'm creating a TF Lite object, which

  • I'm going to call interpreter.

  • Sorry, I'm going to create an interpreter object, which

  • I'm going to call TF Lite.

  • And then I'm going to create a MappedByteBuffer object, which

  • is the TF Lite model.

  • Now earlier, remember, I said you put the TF Lite

  • model into your Assets folder.

  • How you read it out of the Assets folder

  • is as a MappedByteBuffer.

  • I'm not going to show the code for that in the slides,

  • but it's available on the download,

  • if you want have a look at it for yourself.

  • And then you're also going to need a TF Lite Options object.

  • And that TF Lite Options object is

  • used to set things like the number of threads

  • that you want it to execute on.

  • So now, to instantiate your model

  • so that you can start using it, it's as easy as this.

  • So first of all, I'm going to call a loadModelfile function.

  • That loadModelfile function is what

  • reads the TF Lite model out of the Assets folder

  • as a MappedByteBuffer.

  • And it gives me my MappedByteBuffer

  • called tflitemodel.

  • In my options, I'm going to say, for example,

  • I just want this to run on one thread.

  • And then when I instantiate my interpreter like this,

  • by giving it that MappedByteBuffer of the model

  • and giving it the options, I now have an interpreter

  • that I can run inference on in Android itself.

  • And what does the inference look like?

  • It will look something like this.

  • So remember earlier, when I mentioned

  • a neural network takes in a number of inputs as tensors,

  • and gives you a number of outputs as tensors.

  • Those tensors, in a higher level language like Kotlin or Java

  • or Swift, will map to arrays.

  • So even though I'm feeding in a single float,

  • I have to feed that in as an array.

  • So that's why here, my input value

  • is a float array with a single value in it.

  • So if I want to convert 100, for example,

  • that's going to be a float array with a single value containing

  • 100.

  • And that F is for float, not for Fahrenheit.

  • When I was rehearsing these slides before,

  • somebody was like, oh, how'd you put Fahrenheit

  • into code like that.

  • But it's a float.

  • It's not Fahrenheit.

  • And then when I'm reading, we have

  • to get a little low level here,

  • because the model's going to send me back a stream of bytes.

  • I know that those bytes map to a float, but Kotlin doesn't.

  • Java doesn't.

  • So those stream of bytes, I know they're mapping to a float.

  • And a float has 4 bytes, so I'm going to create a byte buffer.

  • I'm going to allocate 4 bytes to that byte buffer,

  • and I'm just going to set its order to be native order.

  • Because there's different orders like

  • big-endian and little-endian.

  • But when you're using TF Lite, always just use native order.
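
The code in the talk here is Kotlin, but the byte-order point itself is language-agnostic. Here's a small illustrative sketch in Python using the standard struct module; the 211.3127 value is just the example number from the demo.

```python
import struct

value = 211.3127  # illustrative float, like the model's output

little = struct.pack('<f', value)  # 4 bytes, little-endian
big    = struct.pack('>f', value)  # the same float, opposite byte order
print(little, big)

# '=' means native order, which plays the same role as ByteOrder.nativeOrder()
# on the Android ByteBuffer; packing and unpacking with mismatched orders
# would give you back a garbage float.
print(struct.unpack('=f', struct.pack('=f', value))[0])
```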

  • And then to do my inference, I call tflite.run.

  • I give it my input value.

  • I give it my output value.

  • It'll read from the input value.

  • It'll write to the output buffer.

  • And then from the output buffer, if I want to get my prediction,

  • it's written those 4 bytes.

  • I have to rewind them, and then I'm going to read a float.

  • And what Kotlin will do is say, OK, I'm

  • taking those 4 bytes out of that buffer.

  • And I'm going to give you back a float from them.

  • So that's how I would do an inference.

  • It seems very complex for a very simple task, like float in,

  • float out, but the structure is the same regardless

  • of how complex your input is and how complex your output is.

  • So while this might seem to be the 20 pound hammer for the one

  • pound nail, it's also the same hammer

  • when you have a 20 pound nail.

  • So that was Android. iOS is very similar.

  • So in iOS, all I have to do is I put my model in my application.

  • So I just put my TF Lite model.

  • It's an asset like any other.

  • And then in code, first of all, I create a Podfile.

  • And in my Podfile, I'll have a pod for TensorFlow Lite.

  • I've spoken at I/O for the last five years,

  • and this is my first time ever showing C++ code.

  • I'm kind of geeking out a little bit.

  • Right now, it supports Objective-C++.

  • We do have a Swift wrapper in some of our sample

  • applications, but the Swift wrapper, right now,

  • only works in a few scenarios.

  • We're working on generalizing that.

  • So for now, I'm just going to show C++ code.

  • Any C++ fans here?

  • Oh, wow.

  • More than I thought.

  • Nice.

  • So then in your C++ code, it's exactly the same model as I was

  • just showing.

  • So first of all, I'm going to create an interpreter,

  • and I'm going to call that interpreter.

  • And now, I'm going to create two buffers.

  • So these are buffers of unsigned ints.

  • One buffer is my input buffer that I call ibuffer.

  • The other buffer is my output buffer that I call obuffer.

  • And for both of these, I'm just going to tell the interpreter,

  • hey, use a typed_tensor for these.

  • So that's my input.

  • That's my output.

  • And when I tflite.run, it's going to read from the input,

  • write to the output.

  • Now, I have an output buffer, and I can just get my inference

  • back from that output buffer.

  • So that was a quick tour of TF Lite, how

  • you can build your model, save it as a TF Lite model,

  • and I forgot to show--

  • oh, no.

  • I did show, sorry, where you can actually download it as a TF Lite model.

  • But I can demo it now running on Android.

  • So if we can switch back to the laptop?

  • So I'm going to go to Android Studio.

  • And I've tried to make the font big enough

  • so we can all see it.

  • And let me just scroll that down a little bit.

  • So this was the Kotlin code that I showed a moment ago,

  • and the simplest possible application that I could build

  • is this one.

  • So it's a simple Android application.

  • It's got one button on it.

  • That button says do inference.

  • When you push that, it's hard-coded.

  • It'll pass 100 to the model and get back

  • the response from the model.

  • I have it in debug mode.

  • I have some breakpoints set.

  • So let's take a look at what happens.

  • So once I click Do Inference, I'm hitting this breakpoint now

  • in Android Studio.

  • And I've set up my input, inputVal,

  • and we can see it contains just 100.

  • And if I step over, my outputVal has been set up.

  • It's a direct byte buffer.

  • Right now, position is 0.

  • Its capacity is 4.

  • I'm going to set its order, and then I'm

  • going to pass the inputVal containing

  • 100, the outputVal, which is my empty 4 byte

  • buffer, to tflite.run.

  • Execute that, and the TF Lite interpreter has done its job,

  • and it's written back to my outputVal.

  • But I can't read that yet.

  • Remember earlier, the position was 0.

  • The limit was 4.

  • The capacity was 4.

  • It's written to it, now.

  • So that buffer is full.

  • So when I rewind, now, we can see

  • my position has gone back to 0.

  • So I know I can start reading from that buffer.

  • So I'm going to say outputVal.getFloat,

  • and we'll see the prediction that comes back is 211.31.

  • So that model has been wrapped by the TF Lite runtime.

  • I've given it the input buffer.

  • I've given it the output buffer.

  • I've executed it, and it's given me back that result.

  • And actually, there's one really cool Kotlin language

  • feature that I want to demonstrate, here.

  • I don't know if anybody has seen this before.

  • This might be the first time we've actually

  • ever shown this on stage.

  • But if I want to run this model again,

  • you'll notice that there are line numbers here.

  • All I have to do is type goto 50.

  • I'm seeing who's still awake.

  • Of course, it's gosub 50.

  • So that's just a very quick and very simple

  • example of how this would run in Android.

  • And again, that sample is online.

  • It's on GitHub.

  • I've put it on GitHub so that you can have a play with it.

  • All right, if we can switch back to the slides?

  • So the third of the options that I had mentioned-- first

  • was ML Kit.

  • Second was TF Lite.

  • The third of the options was then

  • to be able to use JavaScript and to be able to run

  • your model in a browser.

  • So TensorFlow.js is your friend.

  • So the idea is that with TensorFlow.js,

  • in your Python, when you're building the model,

  • you pip install a library called tensorflowjs.

  • And then that gives you a command

  • called tensorflowjs_converter.

  • With the tensorflowjs_converter, if you'd

  • saved that as a SavedModel, as we showed earlier on,

  • you just say, hey, my input format's a SavedModel.

  • Here's the directory the SavedModel is in.

  • Here's the directory I want you to write it to.

  • So now, once it's done that, it's

  • actually going to take that SavedModel

  • and convert that into a JSON object.
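
The converter being described is the tensorflowjs_converter command-line tool. As a hedged alternative sketch, the same pip package also exposes a Python API that converts a Keras model directly; the /tmp/linear output path is just illustrative.

```python
# Assumes: pip install tensorflowjs
import tensorflowjs as tfjs

# The CLI route described in the talk looks roughly like:
#   tensorflowjs_converter --input_format=tf_saved_model /tmp/saved_model /tmp/linear
# Here we convert the in-memory Keras model instead; this writes a model.json
# plus one or more binary weight files into the output directory.
tfjs.converters.save_keras_model(model, '/tmp/linear')
```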

  • So now, in a super, super simple web page-- and this QR code,

  • again, has that web page--

  • now, all I have to do is say, here's

  • the URL of that model.json.

  • And I will say const model = await tf.loadLayersModel,

  • giving it that URL.

  • So if you're using TensorFlow.js in your browser

  • with that script tag right at the top of the page, now,

  • that model is loaded from a JSON serialization.

  • And I can start running inference

  • on that model in the browser.

  • So here was how I would use it.

  • Again, I'm setting up my inputs.

  • And in JavaScript, you know earlier,

  • I was saying you pass in tensors,

  • and you get out tensors,

  • and a high level language

  • tends to wrap them in arrays?

  • TensorFlow.js actually gives you a tensor2d object,

  • and that's what I'm using here.

  • So the tensor2d object takes two parameters.

  • The first parameter is the array that you want to pass in,

  • and you can see here that array is just the value 10.

  • It's a single item array.

  • And then the second parameter is the shape of that array.

  • So here, the first parameter is a 10.

  • The second parameter is 1, 1.

  • And that's the shape of that array.

  • It's just a 1 by 1 array.

  • So once I've done that, and I have my input, now,

  • if I want to run inference using the model, all I have to do

  • is say model.predict(input).

  • And it will give me back my results.

  • In this case, I was alerting the results.

  • But in my demo, I'm actually going to write it.

  • So if we can switch back to the demo box?

  • And I have that super simple web page hosted on the web.

  • And I've put that model in there, and it's going to run.

  • This is a slightly different model.

  • I'll show training that model in a moment.

  • This was just the model where y equals 2x minus 1.

  • So I'm doing an inference where x equals 10.

  • And if x equals 10, y equals 2x minus 1 will give you 19.

  • And when I train the model on six items of data,

  • it says 18.97.

  • So again, all I do is, in Python, I can train the model.

  • With TensorFlow.js, I can then convert that model

  • to a JSON object.

  • And then in TensorFlow.js, I can instantiate a model off

  • of that JSON and then start doing predictions

  • with that model.

  • If we can switch back to the demo machine for a moment?

  • Oh, no.

  • I'm still on the demo machine, aren't I?

  • I can show that in action in a notebook.

  • I lost my notebook.

  • So this notebook is also available,

  • where I gave those QR codes.

  • And this notebook, again, is a very similar one

  • to the one I showed earlier on: a super, super simple neural

  • network, single layer with a single neuron.

  • I'm not going to step through all of it now.

  • But the idea is if you pip install tensorflowjs right now

  • in Google Colab, it will upgrade Google Colab from TensorFlow

  • 1.13 to TensorFlow 2.

  • So if you run through this and install that,

  • you'll see that happening.

  • And then once you have TensorFlow 2 on your machine,

  • then you can use the TensorFlow.js converter,

  • as shown here, giving it the input format

  • and giving it the SavedModel from the directory

  • as I'd done earlier on.

  • And it will write out to /tmp/linear.

  • The one thing to take note of, though,

  • if you are doing this yourself is that when it writes to that,

  • it won't just write the JSON file.

  • It also writes a binary file.

  • So when you upload the JSON file to the web server--

  • to be able to create a model off of that JSON file, make sure

  • the binary file is in the same directory as the JSON file.

  • Otherwise, the model you load from the JSON file

  • is going to give you some really weird results.

  • That's also the number one bug that I found when people

  • have been using TensorFlow.js.

  • It's that they will convert to the JSON file.

  • They'll upload it to their server.

  • They'll have no idea what that random binary file was,

  • and they're getting indeterminate results back

  • from that model.

  • So make sure when you do that, you get that model.

  • I don't know if I have one here that I prepared earlier

  • that I can show you what it looks like.

  • I don't.

  • It's empty, right now.

  • But when you run this and you write it out,

  • you'll see that the model.json and a binary file are there.

  • Make sure you upload both of them to use it.
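
A quick way to see both files, assuming the /tmp/linear output directory from earlier; the exact name of the binary weight file varies, and the one in the comment is just a typical example.

```python
import os

# model.json references the binary weight file(s), so the web server needs both.
print(sorted(os.listdir('/tmp/linear')))
# e.g. ['group1-shard1of1.bin', 'model.json']  (the shard name can vary)
```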

  • Can we switch back to the slides, please?

  • So that was a quick summary, where

  • we saw that a model you build using Python

  • can be saved as a SavedModel.

  • You can convert it to TensorFlow Lite and use it in ML Kit,

  • or use it directly in TensorFlow Lite itself.

  • Alternatively, you can convert it to a JSON file

  • with the TensorFlow.js converter

  • and then use that in JavaScript.

  • So that's the summary of being able to use

  • those models on mobile devices.

  • But now, Daniel is going to tell us

  • all about going beyond phones and the web.

  • So thank you, Daniel.

  • Thank you.

  • DANIEL SITUNAYAKE: Thank you, Laurence.

  • [APPLAUSE]

  • Awesome.

  • So like Laurence said, so far we've talked about phones.

  • But these aren't the only devices that we use every day.

  • So I'm going to talk about some new tools

  • that our team has designed to help developers use machine

  • learning everywhere.

  • So our homes and cities are filled

  • with devices that contain embedded computing power.

  • And in fact, every year, literally billions of devices

  • are manufactured that contain small but highly capable

  • computation devices called microcontrollers.

  • So microcontrollers are at the core

  • of most of our digital gadgets, everything

  • from the buttons on your microwave

  • through to the electronics controlling your car.

  • And our team started to ask, what

  • if developers could deploy machine learning

  • to all of these objects?

  • So at the TensorFlow Dev Summit, we

  • announced an experimental interpreter

  • that will run TensorFlow models on microcontrollers.

  • So this is actually a new frontier for AI.

  • We have super cheap hardware with super huge battery life

  • and no need for an internet connection,

  • because we're doing offline inference.

  • So this enables some incredible potential applications,

  • where AI can become truly personal while still preserving

  • privacy.

  • We want to make it ridiculously easy for developers

  • to build these new types of products,

  • so we've actually worked with SparkFun

  • to design a microcontroller development

  • board that you can buy today.

  • It's called the SparkFun Edge.

  • It's powered by an ultra-efficient ARM processor, and it's packed

  • with sensors and I/O ports.

  • So you can use it to prototype embedded machine learning code.

  • And we have example code available

  • that shows how you can run speech recognition

  • with a model that takes up less than 20 kilobytes of memory,

  • which is crazy.

  • So I'm now going to give you a quick demo of the device,

  • and I'll show you what some of this code

  • looks like for running inference.

  • So you should remember before we do this, all of this

  • is available on our website, along with documentation

  • and tutorials.

  • And the really cool thing is, while you're here at I/O,

  • you should head over to the Codelabs area.

  • And you can try hands on development

  • with the SparkFun Edge boards.

  • So let's switch over to the camera, here.

  • LAURENCE MORONEY: I think that actual image was bigger

  • than 20 kilobytes.

  • DANIEL SITUNAYAKE: Yeah, definitely.

  • It's kind of mind-blowing that you

  • can fit a speech model into such a small amount of memory.

  • So this is the device itself.

  • So it's just a little dev board.

  • I'm going to slide the battery in.

  • So the program we have here, basically,

  • is just running inference.

  • And every second of audio that comes in,

  • it's running through a little model that looks

  • for a couple of hot words.

  • You can see this light is flashing.

  • It flashes once every time inference is run.

  • So we're getting a pretty decent frame rate,

  • even though it's a tiny, low powered microcontroller

  • with a coin cell battery.

  • So what I'm going to do now is take my life in my hands

  • and try and get it to trigger with the hot words.

  • And hopefully, you'll see some lights flash.

  • Yes, yes, yes.

  • First time, not lucky.

  • Yes, yes, yes.

  • Yes, yes, yes.

  • So it's not working so great when we've got the AC going,

  • but you saw the lights lighting up there.

  • And I've basically got a really simple program

  • that looks at the confidence score

  • that we get from the model that the word yes was detected.

  • And the higher the confidence, the more lights appear.

  • So we got three lights there.

  • So it's pretty good.

  • Let's have a look at the code.

  • So if we can go back to the slides?

  • So all we do, basically, to make this work is

  • we've got our model, which is just a plain old TensorFlow

  • Lite model that you trained however you wanted to

  • with the rest of our TensorFlow tool chain.

  • And we have this model available as an array

  • of bytes within our app.

  • We're going to pull in some objects

  • that we're going to use to run the interpreter.

  • So first of all, we create a resolver,

  • which is able to pull in the TensorFlow ops

  • that we need to run the model.

  • We then set aside some memory that is allocated

  • for the working buffers that

  • are needed as we input data

  • and run some of the operations.

  • And then we build an interpreter object,

  • which we pass all this stuff into,

  • that is actually going to execute the model for us.

  • So the next thing we do is basically

  • generate some features that we're

  • going to pass into the model.

  • So we have some code not pictured

  • here, which takes audio from the microphones that

  • are on the board and transforms that into a spectrogram

  • that we then feed into the model.

  • Once we have done that, we invoke the model,

  • and we get an output.

  • So the output is just another tensor,

  • and we can look through that tensor

  • to find which of our classes was matched.

  • And hopefully, in this case, it was

  • the yes that showed up as the highest probability.

  • So all of this code is available online.

  • We have documentation that walks you through it.

  • And like I said, the device is available here, in I/O,

  • in the Codelabs area, if you'd like to try it yourself.

  • So tiny computers are great, but sometimes, you just

  • need more power.

  • So imagine you have a manufacturing plant that

  • is using computer vision to spot faulty parts on a fast moving

  • production line.

  • So we recently announced the Coral platform,

  • which provides hardware for accelerated

  • inference at the Edge.

  • So these are small devices still,

  • but they use something called the Edge

  • TPU to run machine learning models incredibly fast.

  • So one of our development boards here

  • can run image classification on several simultaneous video

  • streams at 60 frames per second.

  • So it's super awesome.

  • We have these devices available to check out

  • in the Codelabs area, as well.

  • And in addition, in the ML and AI sandbox,

  • there's a demo showing a use case:

  • spotting faulty parts in manufacturing.

  • So once again, it's super easy to run TensorFlow Lite models

  • on Coral devices.

  • And this example shows how you can load a model,

  • grab camera input, run inference, and annotate

  • an output image in just a few lines of code.

  • So all of this, again, is available

  • online on the Coral site.
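
The Coral example itself isn't reproduced in the transcript, so this is only a hedged sketch of the general pattern, assuming the tflite_runtime Python package with the Edge TPU delegate; the model filename, the frame.jpg input, and the uint8 classification output are placeholders rather than the real demo code.

```python
import numpy as np
from PIL import Image
from tflite_runtime.interpreter import Interpreter, load_delegate

# Load an Edge TPU-compiled classification model through the Edge TPU delegate.
interpreter = Interpreter(
    model_path='model_edgetpu.tflite',                        # placeholder name
    experimental_delegates=[load_delegate('libedgetpu.so.1')])
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
_, height, width, _ = input_details[0]['shape']

# One camera frame, resized to the model's expected input size.
frame = Image.open('frame.jpg').resize((width, height))      # placeholder frame
input_tensor = np.expand_dims(np.asarray(frame, dtype=np.uint8), axis=0)

interpreter.set_tensor(input_details[0]['index'], input_tensor)
interpreter.invoke()

# Highest-scoring class for this frame; a real app would map this to a label
# and annotate the output image.
scores = interpreter.get_tensor(output_details[0]['index'])[0]
print('top class index:', int(np.argmax(scores)))
```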

  • So we've shown a ton of exciting stuff today, and all of it

  • is available on TensorFlow.org and the Coral site, right now.

  • So you'll be able to find example code, example apps,

  • pretrained models, and everything

  • you need to get started with deploying to device.

  • And I've got some links up here for you.

  • But while you're here at I/O, there

  • are a ton of other opportunities to play with on device ML.

  • So we have a couple of sessions that I'd like to call out

  • here.

  • We have the TensorFlow Lite official talk, tomorrow,

  • which is going to go into a lot more depth around TensorFlow

  • Lite and the tools we have available for on device

  • inference and converting models.

  • And we also have a talk on What's

  • New in Android ML, which is this evening at 6:00 PM.

  • So you should definitely check both of those out.

  • And in the Codelabs area, we have a load of content.

  • So if you're just learning TensorFlow,

  • we have a six part course you can take to basically go end

  • to end from nothing to knowing what you're talking about.

  • And then we have a couple of Codelabs

  • you can use for on device ML, and I

  • think there's a Coral codelab, as well.

  • So thank you so much for showing up.

  • And I hope this has been exciting,

  • and you've got a glimpse of how you can do on device ML.

  • Like you saw in the keynote, there's

  • some amazing applications, and it's up to you

  • to build this amazing, new future.

  • So thank you so much for being here.

  • [MUSIC PLAYING]
