Machine learning is creating tremendous economic value today.
I think 99 percent of the economic value created by machine learning today comes from one type of machine learning, which is called supervised learning.
Let's take a look at what that means.
Supervised machine learning, or more commonly supervised learning, refers to algorithms that learn X to Y, or input to output, mappings.
The key characteristic of supervised learning is that you give your learning algorithm examples to learn from that include the right answers, where by right answer I mean the correct label Y for a given input X.
And it's by seeing correct pairs of input X and desired output label Y that the learning algorithm eventually learns to take just the input alone, without the output label, and give a reasonably accurate prediction or guess of the output.
Let's look at some examples.
If the input X is an email and the output Y is whether this email is spam or not spam, this gives you your spam filter.
Or if the input is an audio clip and the algorithm's job is to output the text transcript, then this is speech recognition.
Or if you want to input English and have it output the corresponding Spanish, Arabic, Hindi, Chinese, Japanese, or some other translation, then that's machine translation.
Or take online advertising, which is probably where the most lucrative form of supervised learning is used today.
Nearly all the large online ad platforms have a learning algorithm that inputs some information about an ad and some information about you and then tries to figure out if you will click on that ad or not.
Because every click is revenue for these large online ad platforms, showing you ads that you're even slightly more likely to click on actually drives a lot of revenue for these companies.
This is something that I once did a lot of work on.
Maybe not the most inspiring application, but it certainly has a significant economic impact in some companies today.
Or if you want to build a self-driving car, the learning algorithm would take as input an image and some information from other sensors, such as radar, and then try to output the position of, say, other cars so that your self-driving car can safely drive around the other cars.
Or take manufacturing.
I've actually done a lot of work in this sector at Landing AI.
You can have a learning algorithm take as input a picture of a manufactured product, say a cell phone that just rolled off the production line, and have the learning algorithm output whether or not there is a scratch, dent, or other defect in the product.
This is called visual inspection and is helping manufacturers reduce or prevent defects in their products.
In all of these applications, you would first train your model with examples of inputs X and the right answers, that is, the labels Y.
After the model has learned from these input-output or X and Y pairs, it can then take a brand new input X, something it's never seen before, and try to produce the appropriate corresponding output Y.
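The train-then-predict workflow just described can be sketched in a few lines of Python. This is only a toy illustration, not how production spam filters work: the example emails, the helper names `train` and `predict`, and the simple word-counting rule are all assumptions made up for this sketch. What matters is the shape of the process: learn from (X, Y) pairs, then predict Y for a new X.

```python
from collections import Counter

# Hypothetical labeled examples: each pair is (input x, label y).
# The emails and labels are invented for illustration only.
TRAINING_EXAMPLES = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]


def train(examples):
    """Learn from (x, y) pairs by counting how often each word
    appears under each label. A toy stand-in for a real learner."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in examples:
        word_counts[label].update(text.split())
    return word_counts


def predict(model, text):
    """Predict a label for a new input x, without being given y."""
    words = text.split()
    spam_score = sum(model["spam"][w] for w in words)
    not_spam_score = sum(model["not spam"][w] for w in words)
    # Ties fall back to "not spam" (an arbitrary choice for this sketch).
    return "spam" if spam_score > not_spam_score else "not spam"


model = train(TRAINING_EXAMPLES)
new_email = "claim your free prize"  # an input the model has never seen
print(predict(model, new_email))
```

Note that `predict` only ever receives the input text. The labels are used during training and nowhere else.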
Let's dive more deeply into one specific example.
Say you want to predict housing prices based on the size of a house.
You've collected some data, and say you plot the data, and it looks like this.
Here on the horizontal axis is the size of the house in square feet.
And yes, I live in the United States where we still use square feet.
I know most of the world uses square meters.
And here on the vertical axis is the price of the house in, say, thousands of dollars.
So with this data, let's say a friend wants to know what the price is for their 750 square foot house.
How can a learning algorithm help you?
One thing a learning algorithm might be able to do is, say, fit a straight line to the data.
And reading off the straight line, it looks like your friend's house could be sold for maybe about, I don't know, $150,000.
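One common way to fit a straight line is ordinary least squares. The lecture doesn't say which method produced the line on the slide, so treat that choice as an assumption. The data points below and the helper names `fit_line` and `predict_price` are likewise hypothetical, since the actual values on the plot aren't given. Sizes are in square feet and prices in thousands of dollars, matching the axes described above.

```python
# Made-up (size, price) data for illustration; not the data from the slide.
SIZES_SQFT = [500, 1000, 1500, 2000, 2500]
PRICES_K = [110, 190, 310, 390, 500]  # thousands of dollars


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_price(slope, intercept, size_sqft):
    """Read the price off the fitted line for a given house size."""
    return slope * size_sqft + intercept


slope, intercept = fit_line(SIZES_SQFT, PRICES_K)
estimate = predict_price(slope, intercept, 750)
print(f"Estimated price: about ${estimate:.0f}K")  # about $153K for this made-up data
```

With these invented numbers the line gives roughly $153,000 for a 750 square foot house, in the same ballpark as the figure read off the slide.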
But fitting a straight line isn't the only learning algorithm you can use.
There are others that could work better for this application.
For example, rather than fitting a straight line, you might decide that it's better to fit a curve, a function that's slightly more complex than a straight line.
If you do that and make a prediction here, then it looks like, well, your friend's house could be sold for closer to $200,000.
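As a rough sketch of what fitting a curve could look like, the code below fits a quadratic by least squares. The lecture doesn't specify what kind of curve is on the slide, so the quadratic form is an assumption, as are the data points and the helper names `solve_linear_system`, `fit_quadratic`, and `predict_quadratic`. Sizes are expressed in thousands of square feet to keep the arithmetic well scaled. Because the data is made up, the prediction won't match the $200,000 read off the slide.

```python
# Made-up data for illustration; not the data from the slide.
SIZES_K_SQFT = [0.5, 1.0, 1.5, 2.0, 2.5]  # thousands of square feet
PRICES_K = [120, 230, 300, 350, 380]      # thousands of dollars


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes the system has a unique solution."""
    n = len(b)
    # Augmented matrix so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x**2 via the normal equations.
    Needs at least three distinct x values."""
    powers = [sum(x ** k for x in xs) for k in range(5)]  # sums of x^0 .. x^4
    a = [[powers[i + j] for j in range(3)] for i in range(3)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    return solve_linear_system(a, b)


def predict_quadratic(coeffs, x):
    """Evaluate the fitted curve at x."""
    c0, c1, c2 = coeffs
    return c0 + c1 * x + c2 * x ** 2


coeffs = fit_quadratic(SIZES_K_SQFT, PRICES_K)
estimate = predict_quadratic(coeffs, 0.75)  # a 750 square foot house
print(f"Estimated price from the curve: about ${estimate:.0f}K")
```

The same data can give noticeably different predictions depending on whether a line or a curve is fitted, which is why the choice of function matters.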
One of the things you'll see later in this class is how you can decide whether to fit a straight line, a curve, or another function that is even more complex to the data.
Now, it doesn't seem appropriate to pick the one that gives your friend the best price.
But one thing you'll see is how to get an algorithm to systematically choose the most appropriate line or curve or other function to fit to this data.
What you've seen in this slide is an example of supervised learning, because we gave the algorithm a data set in which the so-called right answer, that is, the label or the correct price y, is given for every house on the plot.
And the task of the learning algorithm is to produce more of these right answers, specifically predicting what is the likely price for other houses like your friend's house.
That's why this is supervised learning.
To define a little bit more terminology, this housing price prediction is a particular type of supervised learning called regression.
And by regression, I mean we're trying to predict a number from infinitely many possible numbers, such as the house prices in our example, which could be 150,000 or 70,000 or 183,000 or any other number in between.
So that's supervised learning, learning input-output or x-to-y mappings.
And you saw in this video an example of regression, where the task is to predict a number.
But there's also a second major type of supervised learning problem called classification.
Let's take a look at what that means in the next video.