TensorFlow Extended (TFX) Overview and Pre-training Workflow (TF Dev Summit '19)

  • [MUSIC PLAYING]

  • CLEMENS MEWALD: My name is Clemens.

  • I'm the product lead for TensorFlow Extended,

  • the end-to-end machine learning platform

  • that we built for TensorFlow.

  • And we have a lot of exciting announcements,

  • so let's jump right in.

  • A lot of you may be familiar with this graph.

  • We published this in a paper in 2017.

  • And the main point that I usually make on this graph

  • is that there's more to machine learning than just the training

  • part.

  • In the middle, the trainer piece,

  • that's where you train your machine learning model.

  • But if you want to do machine learning in production

  • reliably and in a robust way, you actually

  • need all of these other components

  • before and after, and in parallel, to the training

  • algorithm.

  • And often I hear, sometimes from researchers, well,

  • I really only do research.

  • I only care about training the machine learning model

  • and I don't really need all of these upstream and downstream

  • things.

  • But what I would argue is that research often

  • leads to production.

  • And what we want to avoid is researchers

  • having to re-implement their hard work,

  • in a model that they've built, when they want to put

  • the model into production.

  • That's actually one of the main reasons

  • why we open sourced TensorFlow because we really

  • wanted the research community to build the models in a framework

  • that we can then use and actually move into production.

  • A second comment that I hear often

  • is, well, I only have a very small data set

  • that fits in a single machine.

  • And all of these tools are built to scale up

  • to hundreds of machines.

  • And I don't really need all of these heavy tools.

  • But what we've seen time and time again at Google

  • is that small data today becomes large data tomorrow.

  • And there's really no reason why you

  • would have to re-implement your entire stack just

  • because your data set grew.

  • So we really want to make sure that you

  • can use the same tools early on in your journey

  • so that the tools can actually grow with you and your product,

  • with the data, so that you can scale the exact same code

  • to hundreds of machines.

  • So we've built TensorFlow Extended as a platform

  • at Google, and it has had a profound impact

  • on how we do machine learning in production

  • and on becoming an AI-first company.

  • So TFX really powers some of our most important Alphabet

  • companies.

  • Of course, Google is just one of the Alphabet companies.

  • So TFX is used at six different Alphabet companies.

  • And within Google, it's really used

  • with all of the major products.

  • And also, all of the products that

  • don't have billions of users [INAUDIBLE] this slide.

  • And I've said before that we really

  • want to make TFX available to all of you

  • because we've seen the profound impact it

  • has had on our business.

  • And we're really excited to see what

  • you can do with the same tools in your companies.

  • So a year ago we talked about the libraries

  • that we had open sourced at that point in time.

  • So we talked about TensorFlow Transform, the training

  • libraries, Estimators and Keras, TensorFlow Model Analysis,

  • and TensorFlow Serving.

  • And I made the point that, back then, as today, all of these

  • are just libraries.

  • So they're low-level libraries that you still

  • have to use independently and stitch together

  • to actually make them work and train models for your own use cases.

  • Later that year, we added TensorFlow Data Validation.

  • So that made the picture a little more complete.

  • But we're still far away from actually being done yet.

  • However, it was extremely valuable to release

  • these libraries at that point in time

  • because some of our most important partners

  • externally have also had a profound impact with some

  • of these libraries.

  • So we've just heard from our friends at Airbnb.

  • They use TensorFlow Serving in that case study

  • that they mentioned.

  • Our friends at Twitter just published this fascinating blog

  • post of how they used TensorFlow to rank tweets

  • on their home timeline.

  • And they've used TensorFlow Model Analysis to analyze

  • that model on different segments of the data

  • and used TensorFlow Hub to share some of the word embeddings

  • that they've used for these models.

  • So coming back to this picture.

  • For those of you who've seen my talk last year,

  • I promised everyone that there will be more.

  • Because, again, this is only the partial platform.

  • It's far away from actually being an end-to-end platform.

  • It's just a set of libraries.

  • So today, for the very first time,

  • we're actually sharing the horizontal layers

  • that integrate all of these libraries

  • into one end-to-end platform, into one end-to-end product,

  • which is called TensorFlow Extended.

  • But first, we have to build components out

  • of these libraries.

  • So at the top of this slide, you see in orange, the libraries

  • that we've shared in the past.

  • And then in blue, you see the components

  • that we've built from these libraries.

  • So one observation to be made here is that, of course,

  • libraries are very low level and very flexible.

  • So with a single library, we can build many different components

  • that are part of a machine learning pipeline.

  • So in the example of TensorFlow Data Validation,

  • we used the same library to build

  • three different components.

  • And I will go into detail on each one of these components

  • later.

  • So what makes a component?

  • A component is no longer just a library.

  • It's a packaged binary or container

  • that can be run as part of a pipeline.

  • It has well-defined inputs and outputs.

  • In the case of Model Validation, it's

  • the last validated model, a new candidate model,

  • and the validation outcome.

  • And that's a well-defined interface

  • of each one of these components.

  • It has a well-defined configuration.

  • And, most importantly, it's one configuration model

  • for the entire pipeline.

  • So you configure a TFX pipeline end to end.

  • And some of you may have noticed,

  • because Model Validation needs the last validated model,

  • it actually needs some context.

  • It needs to know what was the last model that was validated.

  • So we need to add a metadata store that actually provides

  • this context, that keeps a record of all

  • of the previous runs so that some of these more advanced

  • capabilities can be enabled.

  • So how does this context get created?

  • Of course, in this case, the trainer produces new models.

  • Model Validator knows about the last validated model

  • and the new candidate model.

  • And then downstream from the Validator,

  • we take that new candidate model and the validation outcome.

  • And if the validation outcome is positive,

  • we push the model to the serving system.

  • If it's negative, we don't.

  • Because usually we don't want to push

  • a model that's worse than our previous model

  • into our serving system.
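
For reference, here is a hedged sketch of this validate-then-push step in the open-source TFX Python DSL. The constructor arguments roughly follow the open-source releases (in newer releases the blessing comes from the Evaluator component), example_gen and trainer stand for upstream components defined elsewhere in the pipeline, and the push destination path is made up.

      from tfx.components import ModelValidator, Pusher
      from tfx.proto import pusher_pb2

      # Compare the new candidate model against the last validated ("blessed") model.
      model_validator = ModelValidator(
          examples=example_gen.outputs['examples'],  # upstream components, defined elsewhere
          model=trainer.outputs['model'])

      # Push the model to the serving directory only if validation blessed it.
      pusher = Pusher(
          model=trainer.outputs['model'],
          model_blessing=model_validator.outputs['blessing'],
          push_destination=pusher_pb2.PushDestination(
              filesystem=pusher_pb2.PushDestination.Filesystem(
                  base_directory='/serving_model/taxi')))  # hypothetical path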

  • So the Metadata Store is new.

  • So let's discuss why we need this

  • and what the Metadata Store does.

  • First, when most people talk about machine learning

  • workflows and pipelines, they really

  • think about task dependency.

  • They think there's one component and when that's finished,

  • there's another component that runs.

  • However, all of you who actually do machine learning

  • in production know that we actually need data dependency,

  • because all of these components consume artifacts and create

  • artifacts.

  • And as the example of Model Validation has shown,

  • it's incredibly important to actually know

  • these dependencies.

  • So we need a system that's both task and data aware so

  • that each component has a history of all

  • of the previous runs and knows about all of the artifacts.

  • So what's in this Metadata Store?

  • Most importantly, type definitions

  • of artifacts and the properties.

  • So in our case, for TFX, it contains the definition

  • of all of the artifacts that are being consumed and produced

  • by our components and all of their properties.

  • And it's an extensible type system,

  • so you can add new types of artifacts,

  • if you add new components.

  • And you can add new properties to these artifacts,

  • if you need to track more properties of those.

  • Secondly, we keep a record of all of the execution

  • of the components.

  • And with that execution, we store

  • all of the input artifacts that went into the execution,

  • all of the output artifacts that were produced,

  • and all of the runtime configuration

  • of this component.

  • And, again, this is extensible.

  • So if you want to track things like the code snapshot that

  • was used to produce that component,

  • you can store it in the Metadata Store, as well.

  • So, putting these things together

  • allows us to do something we call lineage tracking

  • across all executions.

  • Because if you think about it, if you

  • know every execution, all of its inputs and all of its outputs,

  • you can piece together a story of how an artifact was created.

  • So we can actually, by looking at an artifact,

  • say what were all of the upstream executions

  • and artifacts that went into producing this artifact,

  • and what were all of the downstream runs

  • and downstream artifacts that were produced using

  • that artifact as an input?
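
To make this concrete, here is a small sketch using ML Metadata (MLMD), the library behind the TFX Metadata Store: register an artifact type, record an execution, and link the two with events; it is those input/output events that make lineage queries possible. The type names, properties, and paths below are illustrative, not the exact types TFX registers.

      from ml_metadata.metadata_store import metadata_store
      from ml_metadata.proto import metadata_store_pb2

      config = metadata_store_pb2.ConnectionConfig()
      config.sqlite.filename_uri = '/tmp/tfx_metadata.sqlite'  # hypothetical local store
      config.sqlite.connection_mode = 3  # READWRITE_OPENCREATE
      store = metadata_store.MetadataStore(config)

      # Extensible type system: define an artifact type and its properties.
      model_type = metadata_store_pb2.ArtifactType()
      model_type.name = 'Model'
      model_type.properties['version'] = metadata_store_pb2.INT
      model_type_id = store.put_artifact_type(model_type)

      # Record a concrete artifact of that type.
      model = metadata_store_pb2.Artifact()
      model.type_id = model_type_id
      model.uri = '/tmp/models/1'  # hypothetical export path
      model.properties['version'].int_value = 1
      [model_id] = store.put_artifacts([model])

      # Record a trainer execution and link it to the model it produced.
      trainer_type = metadata_store_pb2.ExecutionType()
      trainer_type.name = 'Trainer'
      run = metadata_store_pb2.Execution()
      run.type_id = store.put_execution_type(trainer_type)
      [run_id] = store.put_executions([run])

      event = metadata_store_pb2.Event()
      event.artifact_id = model_id
      event.execution_id = run_id
      event.type = metadata_store_pb2.Event.OUTPUT  # input artifacts use Event.INPUT
      store.put_events([event])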

  • Now, that's an extremely powerful capability,

  • so let me talk you through some of the examples of what

  • this enables.

  • The first one is a pretty straightforward one.

  • Let's say I want to list all of the training

  • runs that I've done in the past.

  • So in this case, I am interested in the trainer

  • and I want to see all of the training runs

  • that were recorded.

  • In this case, I had two training runs.

  • And I see all of the properties of these training runs.

  • This is pretty straightforward; nothing new to see here.

  • However, I just spoke about lineage.

  • We can visualize that lineage and all this information

  • that we have.

  • The first comment on this slide to make

  • is we're working on a better UI.

  • This is really just for demonstration purposes.

  • But if you look at the end of this graph to the right side,

  • you see the model export path.

  • This is the specific instance of a model that was created.

  • And as you can see, we see that the model

  • was created by the trainer.

  • And the trainer created this model

  • by consuming a Schema, Transform and Examples.

  • And, again, these are specific instances.

  • So the IDs there are not just numbering;

  • they refer to the Schema with ID number four and the Transform

  • with ID number five.

  • And for each one of those artifacts,

  • we also see how they were created upstream.

  • And this allows us to do this lineage tracking and going

  • forward and backward in our artifacts.

  • The narrative I used was walking back from the model but,

  • similarly, you could look at your training data

  • and say, what were all of the artifacts that were produced

  • using that training data?

  • This slide shows a visualization of the data distribution

  • that went into our model.

  • Now, at first glance, this may not

  • be something earth shattering because we've done this before.

  • We can compute statistics and we can visualize them.

  • But if we look at the code snippet,

  • we're not referring to data or statistics.

  • We're referring to a model.

  • So we say for this specific model,

  • show me the distribution of data that the model was trained on.

  • And we can do this because we have a track

  • record of all of the data and the statistics that

  • went into this model.

  • We can do a similar thing in the other direction

  • of saying for a specific model, show me the sliced metrics that

  • were produced downstream by TensorFlow Model Analysis,

  • and we can get this visualization.

  • Again, just by looking at a model

  • and not specifically pointing to the output of TensorFlow Model

  • Analysis.
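
For reference, a hedged sketch of what computing and rendering sliced metrics looks like when you call the TensorFlow Model Analysis library directly; the API shown follows older TFMA releases, and both paths are made up.

      import tensorflow_model_analysis as tfma

      # Evaluate the exported eval graph over a dataset, sliced by trip start hour.
      eval_result = tfma.run_model_analysis(
          eval_shared_model=tfma.default_eval_shared_model(
              eval_saved_model_path='/tmp/taxi/eval_model'),  # hypothetical path
          data_location='/tmp/taxi/eval_data*',               # hypothetical path
          slice_spec=[tfma.slicer.SingleSliceSpec(columns=['trip_start_hour'])])

      # Render the per-slice metrics in a notebook.
      tfma.view.render_slicing_metrics(eval_result, slicing_column='trip_start_hour')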

  • Of course, we know all of the models that were trained

  • and where all of the checkpoints lie

  • so we can start TensorBoard and point to some

  • of our historic runs.

  • So you can actually look at the TensorBoard

  • for all of the models that you've trained in the past.

  • Because we have a track record of all of the models

  • that you've trained, we can launch TensorBoard and point it

  • to two different directories.

  • So you can actually compare two models in the same TensorBoard

  • instance.

  • So this is really model tracking and experiment comparison

  • after the fact.

  • And we enable this by keeping a track record of all of this.

  • And, if we have multiple models, you

  • can also look at the data distribution

  • for multiple models.

  • So this usually helps with debugging a model.

  • If you train the same model twice, or on different data,

  • and it behaves differently, sometimes it

  • can pay off to look at whether the data distribution has

  • changed between the two different ones.

  • And it's hard to see in this graph,

  • but here we're actually overlaying

  • two distributions of the statistics

  • for one model and the other.

  • And you would see if there's a considerable drift

  • between those two.

  • So all of these are enabled by this lineage tracking

  • that I just mentioned.

  • Another set of use cases is visualizing previous runs

  • over time.

  • So if you train the same model over time, over new data,

  • we can give you a time series graph of all of the evaluation

  • metrics over time, and you can see

  • if your model improves or gets worse over time

  • as you retrain them.

  • Another very powerful use case is carrying over state

  • from previous models.

  • Because we know that you've trained the model in the past,

  • we can do something we call warm starting.

  • So we can re-initialize the model

  • with weights from a previous run.

  • And sometimes we want to re-initialize the entire model

  • or maybe just an embedding.

  • And in this way, we can continue training

  • from where we left off with a new data set.
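
A minimal sketch of warm starting with the tf.estimator API, assuming a previous run's checkpoint directory: you can warm start the whole model or restrict it to a variable pattern such as an embedding. The checkpoint path, feature columns, and pattern here are illustrative.

      import tensorflow as tf

      # Re-initialize only the embedding variables from a previous run's checkpoint.
      warm_start = tf.estimator.WarmStartSettings(
          ckpt_to_initialize_from='/tmp/previous_run',   # hypothetical checkpoint dir
          vars_to_warm_start='.*embedding.*')            # use '.*' for the whole model

      estimator = tf.estimator.DNNClassifier(
          feature_columns=feature_columns,  # assumed to be defined elsewhere
          hidden_units=[100, 50],
          warm_start_from=warm_start)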

  • And another very powerful application of this

  • is being able to reuse previously computed outputs.

  • A very common workflow is to iterate on the model

  • and basically iterate on your model architecture.

  • Now, if you have a pipeline that ingests data,

  • applies transformations to your data,

  • and then you train a model--

  • every time you make a small change to your model,

  • you don't want to recompute everything upstream.

  • There's no reason why you would have to re-ingest your data,

  • re-compute the transform just because you changed something

  • in your model.

  • Because we have a track record of all of the previous steps,

  • we can make a decision of saying your data hasn't changed,

  • your transform code hasn't changed,

  • so we will reuse the artifacts that were produced upstream.

  • And you can just iterate much, much faster on your model.

  • So this improves iteration speeds,

  • and it also saves compute because you're not

  • re-computing things that you've already computed in the past.

  • So now, we've talked about components quite a bit.

  • Now how do we actually orchestrate TFX pipelines?

  • First, every component has something we

  • call a driver and a publisher.

  • The driver's responsibility is to basically retrieve state

  • from the Metadata Store to inform

  • what work needs to be done.

  • So in the example of Model Validation,

  • the driver looks into the Metadata Store

  • to find the last validated model,

  • because that's the model that we need

  • to compare with the new model.

  • The publisher then basically keeps the record of everything

  • that went into this component, everything that was produced,

  • and all of the runtime configuration,

  • so that we can do that lineage tracking

  • that I mentioned earlier.

  • And in between sits the executor.

  • And the executor is blissfully unaware of all of this metadata

  • stuff, because it's extremely important for us

  • to keep that piece relatively simple.

  • Because if you want to change the code in one

  • of these components, if you want to change the training code,

  • you shouldn't have to worry about drivers and publishers.

  • You should just have to worry about the executor.

  • And it also makes it much, much easier

  • to write new components for the system.
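
As a rough sketch of what that separation looks like in code (the module path and class name follow the open-source TFX layout, but treat this as illustrative): a custom component's executor only implements Do(); resolving inputs from the Metadata Store and publishing outputs back to it is handled by the driver and publisher around it.

      from tfx.components.base import base_executor

      class MyExecutor(base_executor.BaseExecutor):
        """Executor for a hypothetical custom component."""

        def Do(self, input_dict, output_dict, exec_properties):
          # input_dict / output_dict map names to resolved artifacts;
          # exec_properties holds this run's configuration.
          # No metadata bookkeeping happens here.
          ...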

  • And then we have one shared configuration model

  • that sits on top that configures end-to-end TFX pipelines.

  • And let's just take a look at what that looks like.

  • As you can see, this is a Python DSL.

  • And, from top to bottom, you see that it

  • has an object for each one of these components.

  • From ExampleGen, StatisticsGen, and so on.

  • The trainer component, you can see, basically

  • receives its configuration, says that your inputs

  • come from the Transform output and the schema that

  • was inferred.
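
A hedged sketch of such a pipeline definition, roughly following the open-source TFX Python DSL; the exact constructor arguments vary between releases, and the data path, module file name, and step counts are made up.

      from tfx.components import (CsvExampleGen, ExampleValidator, SchemaGen,
                                  StatisticsGen, Trainer, Transform)
      from tfx.proto import trainer_pb2

      example_gen = CsvExampleGen(input_base='/data/chicago_taxi')  # hypothetical path
      statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])
      schema_gen = SchemaGen(statistics=statistics_gen.outputs['statistics'])
      example_validator = ExampleValidator(
          statistics=statistics_gen.outputs['statistics'],
          schema=schema_gen.outputs['schema'])
      transform = Transform(
          examples=example_gen.outputs['examples'],
          schema=schema_gen.outputs['schema'],
          module_file='taxi_utils.py')  # hypothetical user module with preprocessing_fn
      trainer = Trainer(
          module_file='taxi_utils.py',  # hypothetical user module with training code
          examples=transform.outputs['transformed_examples'],
          transform_graph=transform.outputs['transform_graph'],
          schema=schema_gen.outputs['schema'],
          train_args=trainer_pb2.TrainArgs(num_steps=10000),
          eval_args=trainer_pb2.EvalArgs(num_steps=5000))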

  • And let's just see what's inside of that trainer.

  • And that's really just TensorFlow code.

  • So in this case, as you can see, we just use an estimator.

  • And we use the estimator's train_and_evaluate method

  • to actually train this model.

  • And it takes an estimator.

  • And we just use one of our canned estimators, in this case.

  • So this is a wide and deep model that you can just

  • instantiate and return.

  • But what's important to highlight here

  • is that we don't have an opinion on what this code looks

  • like, it's just TensorFlow.

  • So anything that produces a SavedModel as an output

  • is fair game.

  • You can use a Keras model that produces the inference graph

  • or, if you choose to, you can go lower level

  • and use some of the lower-level APIs in TensorFlow.

  • As long as it produces a SavedModel in the right format

  • that can be used by TensorFlow Serving,

  • or the [? eval graph ?] that can be used in TensorFlow Model

  • Analysis, you can write any type of TensorFlow code you want.
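
A minimal sketch of the kind of module-file code this implies, assuming the canned wide-and-deep estimator mentioned above; the feature columns and hidden-unit sizes are placeholders the user's module would define.

      import tensorflow as tf

      def build_estimator(run_config, wide_columns, deep_columns):
        # Canned wide-and-deep model; anything that exports a SavedModel works here.
        return tf.estimator.DNNLinearCombinedClassifier(
            config=run_config,
            linear_feature_columns=wide_columns,
            dnn_feature_columns=deep_columns,
            dnn_hidden_units=[100, 70, 50, 25])

      # The trainer then drives training and evaluation with the estimator API:
      # tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)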

  • So, if you've noticed, we still haven't

  • talked about orchestration.

  • So we now have a configuration system, we have components,

  • and we have a metadata store.

  • And I know what some of you may be thinking right now.

  • Is he going to announce a new orchestration system?

  • And the good news is no--

  • at least not today.

  • Instead, we talked to a lot of our users, to a lot of you,

  • and unsurprisingly found out--

  • whoops.

  • Can we go back one slide?

  • Yup.

  • Unsurprisingly found out that there's

  • a significant installed base of orchestration

  • systems in your companies.

  • We just heard from Airbnb.

  • Of course, they developed Airflow.

  • And there's a lot of companies that use Kubeflow.

  • And there's a number of other orchestration systems.

  • So we made a deliberate choice to support

  • any number of orchestration systems

  • because we don't want to make you adopt

  • a different orchestration system just

  • to orchestrate TFX pipelines.

  • So the installed base was reason number one.

  • Reason number two is we really want

  • you to extend TFX pipelines.

  • What we publish is really just our opinionated version

  • of what a TFX pipeline looks like

  • and the components that we use at Google.

  • But we want to make it easier for you

  • to add new components before and after and in parallel

  • to customize the pipeline to your own use cases.

  • And all of these orchestration systems

  • are really made to be able to express arbitrary workflows.

  • And if you're already familiar with one of those orchestration

  • systems, you should be able to use them for your use case.

  • So here we show you two examples of what

  • that looks like with Airflow and Kubeflow pipelines.

  • So on the left you see that same TFX pipeline configured

  • that is executed on Airflow.

  • And there, in my example, we use this for a small data

  • set so we can iterate on it fast on a local machine.

  • So in the Chicago taxicab example we use 10,000 records.

  • And on the right side, you see the exact same pipeline

  • executed on Kubeflow pipelines, on Google Cloud

  • so that you can take advantage of Cloud Dataflow and Cloud ML

  • Engine and scale it up to the 100 million [INAUDIBLE]

  • in that data set.

  • What's important here is it's the same configuration,

  • it's the same components, so we run the same components

  • in both environments, and you can

  • choose how you want to orchestrate them

  • in your own favorite orchestration system.
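
As a rough illustration, the same pipeline object can be handed to different runners. The module paths follow the open-source TFX layout, create_pipeline is a hypothetical function returning a tfx Pipeline, and the Airflow configuration dictionary is illustrative and version-dependent.

      from tfx.orchestration.airflow.airflow_dag_runner import AirflowDagRunner
      from tfx.orchestration.kubeflow.kubeflow_dag_runner import KubeflowDagRunner

      from taxi_pipeline import create_pipeline  # hypothetical module returning a tfx Pipeline

      # Run locally on Apache Airflow, e.g. to iterate quickly on a small sample.
      airflow_dag = AirflowDagRunner({'schedule_interval': None}).run(create_pipeline())

      # Run the exact same pipeline on Kubeflow Pipelines, e.g. on Google Cloud.
      KubeflowDagRunner().run(create_pipeline())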

  • So this is what this looks like if it's put together.

  • TFX goes all the way from your raw data

  • to your deployment environment.

  • We discussed a shared configuration model

  • at the top, the metadata system that keeps track of all the

  • runs no matter how you orchestrate

  • those components, and then two ways

  • that we published of how to orchestrate them

  • with Airflow and with Kubeflow pipelines.

  • But, as mentioned, you can choose to orchestrate a TFX

  • pipeline in any way you want.

  • All of this is available now.

  • So you can go on GitHub, on github.com/tensorflow/tfx

  • to check out our code and see our new user guide

  • on tensorflow.org/tfx.

  • And I also want to point out that tomorrow we

  • have a workshop where you can get

  • hands-on experience with TensorFlow Extended,

  • from 12:00 to 2:00 PM.

  • And there's no prerequisites.

  • You don't even have to bring your own laptop.

  • So with this, we're going to jump into an end-to-end example

  • of how to actually go through the entire workflow

  • with the Chicago taxicab data set. And just

  • to set some context.

  • So the Chicago taxi data set is a record

  • of cab rides in Chicago for some period of time.

  • And it contains everything that you would expect.

  • It contains when the trip started, when it ended,

  • where it started and where it ended,

  • how much was paid for it, and how it was paid.

  • Now, some of these features need some transformation,

  • so latitude and longitude features need to be bucketized.

  • Usually it's a bad idea to do math

  • with geographical coordinates.

  • So we bucketize them and treat them as categorical features.

  • Vocab features, which are strings, need to be integerized

  • and some of the Dense Float features need to be normalized.

  • We feed them into a wide and deep model.

  • So, the Dense features we feed into the deep part of the model

  • and all of the others we use in the wide part.

  • And then the label that we're trying to predict

  • is a Boolean, which is if the tip is

  • larger than 20% of the fare.

  • So really what we're doing is we're

  • building a high tip predictor.

  • So just in case there are any cab drivers in the audience

  • or listening online, come find me later

  • and I can help you set this up for you.

  • I think it would be really beneficial to you

  • if you could predict if a cab ride gives a high tip or not.

  • So let's jump right in.

  • And we start with data validation and transformation.

  • So the first part of the TFX pipeline is ingesting data,

  • validating that data-- if it's OK--

  • and then transforming it such that it can

  • be fed into a TensorFlow graph.

  • So we start with ExampleGen. And the ExampleGen component really

  • just ingests data into a TFX pipeline.

  • So it takes as input your raw data.

  • We ship by default capabilities for CSV and TF Records.

  • But that's of course extensible as we

  • can ingest any type of data into these pipelines.

  • What's important is that, after this step,

  • the data is in a well-defined place where we can find it--

  • in a well-defined format because all

  • of our downstream components standardize on that format.

  • And it's split between training and eval.

  • So you've seen the configuration of all of these components

  • before.

  • It's very minimal configuration in most of the cases.

  • Next, to move onto data analysis and validation.

  • And I think a lot of you have a good intuition why

  • that is important.

  • Because, of course, machine learning is just

  • the process of taking data and learning models

  • that predict some field in your data.

  • And you're also aware that if you feed garbage in,

  • you get garbage out.

  • There will be no hope of a good machine learning model

  • if the data are wrong, or if the data have errors in them.

  • And this is even reinforced if you have continuous pipelines

  • that train on data that was produced by a bad model

  • and you're just reinforcing the same problem.

  • So first, what I would argue is that data understanding

  • is absolutely critical for model understanding.

  • There's no hope in understanding why

  • a model is mis-predicting something if you don't

  • understand what the data looked like

  • and if the data was OK that was actually fed into the model.

  • And the question you might ask as a cab

  • driver is why are my tip predictions bad in the morning

  • hours?

  • And for all of these questions that I'm highlighting here,

  • I'm going to try to answer them with the tools

  • that we have available in TFX later.

  • So I will come back to these questions.

  • Next, we really would like you to treat your data

  • as you treat your code.

  • There's a lot of care taken with code these days.

  • It's peer reviewed, it's checked into shared repositories,

  • it's version controlled, and so on.

  • And data really needs to be a first-class

  • citizen in these systems.

  • And with this question, what are our expected

  • values for our payment types, that's

  • really a question about the schema of your data.

  • And what we would argue is that the schema

  • needs to be treated with the same care

  • as you treat your code.

  • And catching errors early is absolutely critical.

  • Because I'm sure, as all of you know,

  • errors propagate through the system.

  • If your data are not OK, then everything else downstream

  • goes wrong as well.

  • And these errors are extremely hard to correct for or fix

  • if you catch them relatively late in the process.

  • So really catching those problems as early as possible

  • is absolutely critical.

  • So in the taxicab example, you would ask a question

  • like is this new company that I have in my data set a typo

  • or is it actually a real company which

  • is a natural evolution of my data set?

  • So let's see if we can answer some of these questions

  • with the tools we have available,

  • starting with Statistics.

  • So the StatisticsGen component takes in your data,

  • computes statistics.

  • The data can be training, eval data,

  • it can also be serving logs--

  • in which case, you can look at the skew between your training

  • and your serving data.

  • And the statistics really capture the shape of your data.

  • And the visualization components we

  • have draw your attention to things

  • that need your attention, such as if a feature is missing

  • most of the time, it's actually highlighted in red.

  • The configuration for this component is minimal, as well.
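
For a sense of what this looks like when you call the underlying TensorFlow Data Validation library directly, say in a notebook, here is a small sketch; the CSV paths are made up.

      import tensorflow_data_validation as tfdv

      train_stats = tfdv.generate_statistics_from_csv('/data/taxi/train.csv')  # hypothetical
      eval_stats = tfdv.generate_statistics_from_csv('/data/taxi/eval.csv')    # hypothetical

      # Overlay the two distributions to spot skew or drift between the data sets.
      tfdv.visualize_statistics(lhs_statistics=train_stats, rhs_statistics=eval_stats,
                                lhs_name='TRAIN', rhs_name='EVAL')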

  • And let me zoom into some of these visualizations.

  • And one of the questions that I posed earlier

  • was why are my tip predictions bad in the morning hours?

  • So one thing you could do is look at your data set

  • and see that for trip start hour, in the morning

  • hours between 2:00 AM and 6:00 AM,

  • you just don't have much data because there's not

  • that many taxi trips at that time.

  • And not having a lot of data in a specific area of your data

  • can mean that your model is not robust, or has higher variance.

  • And this could lead to worse predictions.

  • Next, we move on to SchemaGen. SchemaGen

  • takes as input the output of StatisticsGen,

  • and it infers a schema for you.

  • In the case of the Chicago taxicab example,

  • there's very few features, so you

  • could handwrite that schema.

  • Although, it would be hard to handwrite what you expect

  • the string values to look like.

  • But if you have thousands of features,

  • it's hard to actually handwrite that expectation.

  • So we infer that schema for you the first time we run.

  • And the schema really represents what you expect from your data,

  • and what good data looks like, and what

  • values your string features can take on, and so on.

  • Again, very minimal configuration.

  • And the question that we can answer,

  • now, is what are expected values for payment types?

  • And if you look here at the very bottom,

  • you see the field payment type can

  • take on cash, credit card, dispute, no charge, pcard,

  • and unknown.

  • So that's the expectation of my data

  • that's expressed in my schema.

  • And now the next time I run this,

  • and this field takes on a different value,

  • I will get an anomaly--

  • which comes from the ExampleValidator.

  • The ExampleValidator takes the statistics and the schema

  • as an input and produces an anomaly report.

  • Now, that anomaly report basically

  • tells you if your data are missing features,

  • if they have the wrong valency, if your distributions have

  • shifted for some of these features.

  • And it's important to highlight that the anomalies

  • report is human readable.

  • So you can look at it and understand what's going on.

  • But it's also machine readable.

  • So you can automatically make decisions

  • based on the anomalies and decide

  • not to train a model if you have anomalies in your data.

  • So the ExampleValidator just takes as input the statistics

  • and the schema.

  • And let me zoom into one of these anomaly reports.

  • Here you can see that the field company has taken

  • on unexpected string values.

  • That just means that these string values weren't

  • there in your schema before.

  • And that can be a natural evolution of your data.

  • The first time you run this, maybe

  • you just didn't see any trips from those taxi companies.

  • And by looking at it, you can say, well, all of these look

  • like they're normal taxicab companies.

  • So you can update your schema with this expectation.

  • Or if you saw a lot of scrambled text in here,

  • you would know that there's a problem in your data

  • that you would have to go and fix.
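
Continuing the sketch above with TensorFlow Data Validation, which backs SchemaGen and ExampleValidator: infer a schema from statistics, validate new statistics against it, and, if a new value is legitimate, update the schema's expectation. The company value added here is hypothetical.

      import tensorflow_data_validation as tfdv

      schema = tfdv.infer_schema(statistics=train_stats)  # first run: infer expectations
      anomalies = tfdv.validate_statistics(statistics=eval_stats, schema=schema)
      tfdv.display_anomalies(anomalies)  # human-readable; the proto is machine-readable

      # A legitimate new taxi company showed up: extend the schema's string domain.
      tfdv.get_domain(schema, 'company').value.append('Brand New Cab Co')  # hypothetical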

  • Moving on, we actually get to Transform.

  • And let me just recap the types of transformations

  • that we want to do.

  • I've highlighted them here in red--

  • in blue, sorry.

  • So we want to bucketize the longitude and latitude

  • features.

  • We want to convert the strings to ints, which

  • is also called integerizing.

  • And for the Dense features, we want to actually normalize them

  • to a mean of zero and a standard deviation of one.

  • Now, all of these transformations

  • require you to do a full pass of your data

  • to compute some statistics.

  • To bucketize, you need to figure out

  • the boundaries of the buckets.

  • To do a string to integer, you need

  • to see all of the string values that show up in your data.

  • And to scale to a Z-score, you need

  • to compute the mean and the standard deviation.

  • Now, this is exactly what we built TensorFlow Transform for.

  • TensorFlow Transform allows you to express

  • a pre-processing function of your data

  • that contains some of these transformations that require

  • a full pass of your data.

  • And it will then automatically run a data processing graph

  • to compute those statistics.

  • So in this case, you can see the orange boxes

  • are statistics that we require.

  • So for normalization, we require the mean

  • and the standard deviation.

  • And what TensorFlow Transform does

  • is it has a utility function that says scale to Z-score.

  • And it will then create a data processing graph for you

  • that computes the mean and the standard deviation

  • of your data, return the results,

  • and inject them as constants into your transformation graph.

  • So now that graph is a hermetic graph

  • that contains all of the information

  • that you need to actually apply your transformations.

  • And that graph can then be used in training

  • and in serving, guaranteeing that there's

  • no drift between them.

  • This basically eliminates the chances

  • of training serving skew by applying

  • the same transformations.

  • And at serving time, we just need to feed in the raw data,

  • and all the transformations are done as part of the TensorFlow

  • graph.

  • So what does that look like in the TFX pipeline?

  • The Transform component takes in data, schema--

  • the schema allows us to parse the data more easily--

  • and code.

  • In this case, this is the user-provided pre-processing

  • function.

  • And it produces the Transform graph, which I just mentioned,

  • which is a hermetic graph that applies

  • the transformations, that gets attached

  • to your training and your serving graph.

  • And it optionally can materialize the Transform data.

  • And that's a performance optimization:

  • when you want to feed hardware accelerators really

  • fast, it can sometimes pay off to materialize

  • some transformations before your training step.

  • So in this case, the configuration of the component

  • takes in a module file.

  • That's just the file where you configure

  • your pre-processing function.

  • And in this code snippet, the actual code

  • is not that important.

  • But what I want to highlight is--

  • the last line in this code snippet

  • is how we transform our label.

  • Because, of course, the label is a logic expression of saying,

  • is the tip greater than 20% of my fare?

  • And the reason why I want to highlight this

  • is because you don't need analyze phases for all

  • of your transformations.

  • So in cases where you don't need analysis phases,

  • the transformation is just a regular TensorFlow graph

  • that transforms the features.

  • However, to scale something to Z-score, to integerize strings,

  • and to bucketize a feature, you definitely

  • need analysis phases, and that's what Transform helps you with.

  • So this is the user code that you would write.

  • And TF Transform would create a data processing graph

  • and return the results and the transform graph

  • that you need to apply these transformations.
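
A hedged sketch of such a preprocessing function, using TensorFlow Transform analyzers for the full-pass transforms and plain TensorFlow ops for the label; the feature names follow the taxi example, while bucket and vocabulary settings are illustrative.

      import tensorflow as tf
      import tensorflow_transform as tft

      def preprocessing_fn(inputs):
        outputs = {}
        # Full-pass analyzers: bucket boundaries, vocabularies, mean/stddev.
        outputs['pickup_latitude_bucket'] = tft.bucketize(inputs['pickup_latitude'], 10)
        outputs['company_id'] = tft.compute_and_apply_vocabulary(inputs['company'])
        outputs['trip_miles_scaled'] = tft.scale_to_z_score(inputs['trip_miles'])
        # The label needs no analyze phase: it is just a TensorFlow expression.
        outputs['tips_label'] = tf.cast(
            inputs['tips'] > (inputs['fare'] * 0.2), tf.int64)
        return outputs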

  • So now that we're done with all of this,

  • we still haven't trained our machine learning model yet,

  • right?

  • But we've made sure that we know that our data is

  • in a place where we can find it.

  • We know it's in a format that we can understand.

  • We know it's split between training and eval.

  • We know that our data are good because we validated them.

  • And we know that we're applying transforms consistently

  • between training and serving.

  • Which brings us to the training step.

  • And this is where the magic happens, or so they say.

  • But, actually, it's not because the training step in TFX

  • is really just the TensorFlow graph and the TensorFlow

  • training step.

  • And the training component takes in the output of Transform,

  • as mentioned, which is the Transform

  • graph and, optionally, the materialized data, a schema,

  • and the training code that you provide.

  • And it creates, as output, TensorFlow models.

  • And those models are in the SavedModel format,

  • which is the standard serialized model

  • format in TensorFlow, which you've heard quite a bit

  • about this morning.

  • And in this case, actually, we produce two of them.

  • One is the inference graph, which

  • is used by TensorFlow Serving, and another one

  • is the eval graph, which contains

  • the metrics and the necessary annotations

  • to perform TensorFlow Model Analysis.

  • And so this is the configuration that you've seen earlier.

  • And, again, that the trainer takes in a module file.

  • And the code that's actually in that module file, again,

  • I'm just going to show you the same slide

  • again just to reiterate the point, is just TensorFlow.

  • So, in this case, it's the train and evaluate method

  • from estimators and a [? canned ?] estimator

  • that has been returned here.

  • But again, just to make sure you're

  • aware of this, any TensorFlow code here

  • that produces a SavedModel in the right format is fair game.

  • So all of this works really well.

  • So with this, we've now trained the TensorFlow model.

  • And now I'm going to hand it off to my colleague,

  • Christina, who's going to talk about model evaluation

  • and analysis.

  • [MUSIC PLAYING]
