Let's look in this video at the process of how supervised learning works.
A supervised learning algorithm takes a data set as input. Then what exactly does it do, and what does it output?
Let's find out in this video.
Recall that a training set in supervised learning includes both the input features, such as the size of the house, and also the output targets, such as the price of the house.
The output targets are the right answers that the model will learn from.
To train the model, you feed the training set, both the input features and the output targets to your learning algorithm.
Then your supervised learning algorithm will produce some function.
We'll write this function as lowercase f, where f stands for function.
Historically, this function used to be called a hypothesis, but I'm just going to call it a function f in this class.
The job of f is to take a new input x and output an estimate or prediction, which I'm going to call y-hat, and is written like the variable y, with this little hat symbol on top.
In machine learning, the convention is that y-hat is the estimate or the prediction for y.
The function f is called the model. x is called the input or the input feature, and the output of the model is the prediction y-hat.
The model's prediction is the estimated value of y.
When the symbol is just a letter y, then that refers to the target, which is the actual true value in the training set.
In contrast, y-hat is an estimate.
It may or may not be the actual true value.
For example, if you're helping your client to sell their house, the true price of the house is unknown until they sell it.
So your model f, given the size, outputs a price which is the estimate, that is, the prediction of what the true price will be.
Now, when we design a learning algorithm, a key question is, how are we going to represent the function f?
Or in other words, what is the math formula we're going to use to compute f?
For now, let's stick with f being a straight line.
So your function can be written as f subscript w comma b of x equals w times x plus b.
I'll define w and b soon, but for now, just know that w and b are numbers, and the values chosen for w and b will determine the prediction y-hat based on the input feature x.
So this f w b of x means f is a function that takes x as input, and depending on the values of w and b, f will output some value of a prediction y-hat.
As an alternative to writing this f w comma b of x, I'll sometimes just write f of x without explicitly including w and b in the subscript.
It's just a simpler notation, but means exactly the same thing as f w b of x.
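To make the notation concrete, here is a minimal sketch in Python, not part of the course materials; the values 200 and 100 chosen for w and b are arbitrary placeholders:

```python
def f_wb(x, w, b):
    """Linear model: returns the prediction y-hat = w * x + b."""
    return w * x + b

# With w = 200 and b = 100 (arbitrary values), a house of size 1.2
# (say, in thousands of square feet) gets the prediction 200 * 1.2 + 100.
y_hat = f_wb(1.2, w=200, b=100)
```

Changing w and b changes every prediction the model makes, which is exactly why the subscript names them.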
Let's plot the training set on the graph, where the input feature x is on the horizontal axis, and the output target y is on the vertical axis.
Remember, the algorithm learns from this data and generates a best fit line, like maybe this one here.
This straight line is the linear function f w b of x equals w times x plus b.
Or more simply, we can drop w and b and just write f of x equals wx plus b.
Here's what this function is doing.
It's making predictions for the value of y using a straight line function of x.
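As a rough illustration of what such a best fit line looks like in code, the sizes and prices below are made up, and the closed-form least-squares formula here simply stands in for whatever the learning algorithm does:

```python
# Toy training set: house sizes x and prices y (made-up numbers).
x_train = [1.0, 1.5, 2.0, 2.5, 3.0]
y_train = [300.0, 380.0, 480.0, 560.0, 640.0]

# Closed-form least-squares fit of a straight line f(x) = w * x + b.
n = len(x_train)
mean_x = sum(x_train) / n
mean_y = sum(y_train) / n
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_train, y_train)) \
    / sum((x - mean_x) ** 2 for x in x_train)
b = mean_y - w * mean_x

# Predictions y-hat along the fitted line for each training input.
y_hat = [w * x + b for x in x_train]
```

Each value in y_hat is the line's estimate of the price for the corresponding size, which may or may not match the target in y_train.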
So you may ask, why are we choosing a linear function where linear function is just a fancy term for a straight line, instead of some non-linear function like a curve or a parabola?
Well, sometimes you want to fit more complex non-linear functions as well, like a curve like this.
But since this linear function is relatively simple and easy to work with, let's use a line as a foundation that will eventually help you to get to more complex models that are non-linear.
This particular model has a name.
It's called linear regression.
More specifically, this is linear regression with one variable, where the phrase one variable means that there's a single input variable or feature x, namely the size of the house.
Another name for a linear model with one input variable is univariate linear regression, where uni means one in Latin and where variate means variable.
So univariate is just a fancy way of saying one variable.
In a later video, you'll also see a variation of regression where you want to make a prediction based not just on the size of a house, but on a bunch of other things that you may know about the house, such as the number of bedrooms and other features.
And by the way, when you're done with this video, there is another optional lab.
You don't need to write any code.
Just review it, run the code, and see what it does.
That will show you how to define in Python a straight line function.
And the lab will let you choose the values of w and b to try to fit the training data.
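In the same spirit as that optional lab, though this is only a sketch and not the lab's actual code, you can define the straight-line function in Python, pick w and b by hand, and compare the predictions y-hat against the targets y:

```python
def f_wb(x, w, b):
    """Straight-line model: y-hat = w * x + b."""
    return w * x + b

# Made-up training data: house sizes and prices.
x_train = [1.0, 2.0]
y_train = [300.0, 500.0]

# Try a hand-picked guess for w and b and inspect the fit.
w, b = 200.0, 100.0
for x, y in zip(x_train, y_train):
    print(f"size={x}: prediction y-hat={f_wb(x, w, b)}, target y={y}")
```

With this particular guess the line happens to pass through both training points; other choices of w and b would miss them, which is what you explore by adjusting the values.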
You don't have to do the lab if you don't want to, but I hope you play with it when you're done watching this video.
So that's linear regression.
In order for you to make this work, one of the most important things you have to do is construct a cost function.
The idea of a cost function is one of the most universal and important ideas in machine learning and is used in both linear regression and in training many of the most advanced AI models in the world.
So let's go on to the next video and take a look at how you can construct a cost function.