Hi, I'm Robert Crowe,
and today I'm going to be talking about TensorFlow Extended also known as TFX,
and how it helps you put your amazing machine learning models
into production.
This is the final episode of our five-part series
on real-world machine learning in production.
We've covered a lot so far in episodes 1 through 4,
so if you haven't seen those yet, I'd really recommend watching them.
In today's episode, we'll be looking at an example
of how model understanding is critical for meeting your business goals.
Let's get started.
♪ (music) ♪
We've talked about how TFX and TensorFlow model analysis
let you do deep analysis of your model's performance.
Let's look at why that's important.
In this example we have an online retailer who is selling shoes.
They're using a model to predict click-through rates
and using those predictions to decide how much inventory to order
for each of their products.
Everything seems to be working great,
when suddenly, they discover that their model's AUC
and prediction accuracy for a particular part of their product line,
men's dress shoes,
have started getting much worse than before.
Now, how much inventory should they order for men's dress shoes?
If these are high-end dress shoes,
the cost of getting that wrong could be significant for their business.
That's why doing deep analysis of your model's performance,
not just once, but on an ongoing basis, is critical for your business.
TFX creates pipelines that enable that kind of ongoing deep analysis.
Remember that it's not just overall model performance.
Mispredictions on different parts of your data
do not all carry the same cost for your business.
The data that you have is almost never the data that you wish you had,
and your model's objectives, things like AUC,
are really just proxies for your actual business objectives,
things like knowing how much inventory to order.
Finally, the real world doesn't stand still,
so your data and business conditions are constantly changing.
That's why you need to continue to monitor and analyze
how your model reacts to changes.
One way to look at this is to think about a triangle
which we call the ML Insights Triangle.
We found that usually when there's a problem
with your model's performance for your business,
it's because an assumption was violated.
The question is, which one?
So, what are these assumptions?
First, has something about the realities of our business changed?
Maybe we have a new supplier or a new product has been released.
Maybe our customers' behavior has changed.
All of these can affect our business,
and how well our models perform for our business.
Have we started getting bad data?
Maybe a sensor has gone bad,
or a service endpoint started getting flaky,
or maybe a software update has broken something,
or maybe the feature set that we've been using
isn't working for the current business conditions,
or maybe the problem really is with our model.
Maybe we need to change the architecture
or create an ensemble with a rules-based system
or just retune the hyperparameters.
When things go wrong, you need to start investigating
to look for potential problems.
The place to start is always with your data
because if your data isn't right, nothing will be right.
Fortunately, TFX builds tools and processes
for investigating your data right into your pipelines
with the StatisticsGen, SchemaGen and ExampleValidator components
and the tools provided by TensorFlow Data Validation.
You should look for outliers and missing values in your data
and also look for changes in the distributions
for each of your features.
For example, seasonality and trend can affect your data over time,
and if you don't look for it, you might not be aware of it.
TensorFlow Data Validation or TFDV,
provides visualization tools like this for investigating your data
and making comparisons between the data you're seeing now
and the data you were seeing last week or last month.
These are really valuable when you are trying to dive into your data.
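TFDV implements these comparisons for you. Just to illustrate the underlying idea, here's a minimal pure-Python sketch, with made-up feature values, that flags drift in a categorical feature by comparing its distribution across two time windows using an L-infinity distance, the largest change in any single category's share:

```python
from collections import Counter

def distribution(values):
    """Normalize raw feature values into a probability distribution."""
    counts = Counter(values)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

def linf_drift(last_window, this_window):
    """L-infinity distance between two categorical distributions:
    the largest absolute change in any one category's share."""
    p, q = distribution(last_window), distribution(this_window)
    categories = set(p) | set(q)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)

# Hypothetical click logs for a `shoe_category` feature.
last_week = ["sneaker"] * 60 + ["dress"] * 30 + ["boot"] * 10
this_week = ["sneaker"] * 80 + ["dress"] * 5 + ["boot"] * 15

drift = linf_drift(last_week, this_week)
THRESHOLD = 0.1  # hypothetical per-feature drift threshold
if drift > THRESHOLD:
    print(f"drift detected: {drift:.2f}")  # dress shoes' share dropped by 0.25
```

In a real pipeline you'd let TFDV compute the statistics and thresholds per feature; this just shows why a simple week-over-week distribution comparison catches the "dress shoes suddenly look different" problem early.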
You could also look for particular combinations of features,
regions of your feature space where your data may be sparse.
Coverage of your feature space is important for model performance
and it will change over time as your data changes.
In regions where coverage is sparse,
you may need to focus on collecting examples
to fill in those spaces,
which might require creating new features or eliminating features
that aren't providing good predictive information.
This can often be a result of changes in your business conditions.
Perhaps a new shoe came on the market
or someone bought TV media that shifted CTRs from one brand to another.
Change is a constant in business and in life
and it's a constant for your data too.
Another way to investigate the problem
is to really dig into your model's performance.
Fortunately, TFX builds tools and processes
for doing deep analysis of your model's performance
right into your pipelines with the Evaluator component
and the tools provided by TensorFlow Model Analysis or TFMA.
It's really important to look at not just the top-level metrics for your model,
but your model performance on individual slices of your data.
What slices make sense?
Try to think about combinations of features,
regions of your feature space that define different parts of your data.
Look at edge cases and corner cases.
Look at important subsets of your data
or critical, but rare situations.
There is an art to understanding your data and how it reflects your business
and TFMA gives you tools like this
to explore and evolve your understanding of it.
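TFMA computes sliced metrics declaratively from a slicing spec; the sketch below, with a hypothetical evaluation set, just shows the core idea: group examples by a feature value and score each group separately, so a failing slice like men's dress shoes can't hide behind a healthy overall number.

```python
from collections import defaultdict

def sliced_accuracy(examples, slice_key):
    """Compute accuracy overall and per value of `slice_key`.
    Each example is a dict with 'label', 'prediction', and features."""
    buckets = defaultdict(list)
    for ex in examples:
        buckets["overall"].append(ex)
        buckets[ex[slice_key]].append(ex)
    return {
        name: sum(e["label"] == e["prediction"] for e in exs) / len(exs)
        for name, exs in buckets.items()
    }

# Hypothetical click-through labels: clicked (1) vs. not clicked (0).
examples = (
    [{"category": "sneakers", "label": 1, "prediction": 1}] * 45
    + [{"category": "sneakers", "label": 0, "prediction": 1}] * 5
    + [{"category": "dress", "label": 1, "prediction": 0}] * 20
    + [{"category": "dress", "label": 0, "prediction": 0}] * 30
)

metrics = sliced_accuracy(examples, "category")
# The overall number looks acceptable while the "dress" slice is much worse.
```

Here overall accuracy is 0.75, but slicing reveals sneakers at 0.90 and dress shoes at only 0.60, exactly the kind of gap a single top-level metric hides.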
We also make the What-If Tool
available for exploring and experimenting with your data and your model.
It's a great tool for doing what-if experiments
to see how your model responds to changes
and in the process developing a better understanding
of your model and your data.
The results it displays aren't exact
because it only works with samples of your data,
but it can give you approximate results that can point you in the right direction.
It works in both TensorBoard and Jupyter Notebooks
and pulls in data from ML Metadata,
so that we can compare the results we have today
with last week's or last month's.
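The What-If Tool itself is interactive, but the kind of experiment it supports can be sketched in a few lines: take an example, change one feature, and compare the model's two predictions. The toy scoring rule and feature names below are entirely hypothetical stand-ins for a real trained model.

```python
def toy_ctr_model(features):
    """Stand-in for a real model: a hand-written scoring rule."""
    score = 0.05
    if features["brand"] == "premium":
        score += 0.10
    if features["on_sale"]:
        score += 0.15
    return score

def what_if(features, feature_name, new_value):
    """Return (original, counterfactual) predictions after one feature edit."""
    edited = dict(features, **{feature_name: new_value})
    return toy_ctr_model(features), toy_ctr_model(edited)

example = {"brand": "premium", "on_sale": False}
before, after = what_if(example, "on_sale", True)
# Marking the shoe as on sale raises the predicted CTR from 0.15 to 0.30.
```

Running many such perturbations across your evaluation set is how you build intuition for which features actually drive your model's predictions.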
But remember, no model is 100% accurate all the time.
What matters is the cost to your business.
So to really understand the misprediction cost,
you need to join it with your business data
and calculate how much the inaccuracies in your model's objectives,
which are really just proxies for your business objectives,
end up costing you.
Without doing that, you have no way of knowing
if changes in your model's performance are a little problem
or a big problem or an emergency.
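Concretely, that join can be as simple as lining up each prediction error with a piece of business data, say a per-unit cost of over-ordering by product line, and summing. All the numbers below are made up for illustration.

```python
# Hypothetical per-unit cost of over-ordering, by product line.
OVERSTOCK_COST = {"dress": 40.0, "sneakers": 15.0}

def misprediction_cost(records):
    """Join model errors with business data. Each record carries the
    product line, predicted demand, and actual demand (in units)."""
    total = 0.0
    for r in records:
        over_ordered = max(0, r["predicted"] - r["actual"])
        total += over_ordered * OVERSTOCK_COST[r["line"]]
    return total

records = [
    {"line": "dress", "predicted": 500, "actual": 300},    # model overshot
    {"line": "sneakers", "predicted": 200, "actual": 220},  # undershot: no overstock
]
cost = misprediction_cost(records)  # 200 excess units * $40 = $8000.0
```

A number like this, tracked over time, is what turns "AUC dropped on one slice" into "this is costing us $8,000 per order cycle", which is the language the business actually runs on.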
So that's how TFX helps you manage your models,
manage your ML applications and manage your business.
TFX is the framework that Google and Alphabet companies use
for our production ML and now it's available for everyone to use.
For more information on TFX, visit us at tensorflow.org/tfx,
clone the repos on GitHub,
and don't forget to comment and like us below.
And thanks for watching.
♪ (music) ♪