

  • This is a three. It's sloppily written and rendered at an extremely low resolution of 28 by 28 pixels.


  • But your brain has no trouble recognizing it as a three, and I want you to take a moment to appreciate how crazy it is that brains can do this so effortlessly.


  • I mean, this, this, and this are also recognizable as threes, even though the specific value of each pixel is very different from one image to the next.


  • The particular light-sensitive cells in your eye that are firing when you see this three are very different from the ones firing when you see this three.

  • But something in that crazy-smart visual cortex of yours resolves these as representing the same idea, while at the same time recognizing other images as their own distinct ideas.

  • But if I told you, hey, sit down and write for me a program that takes in a grid of 28 by 28 pixels like this and outputs a single number between 0 and 10 telling you what it thinks the digit is, well, the task goes from comically trivial to dauntingly difficult.

  • Unless you've been living under a rock, I think I hardly need to motivate the relevance and importance of machine learning and neural networks to the present and the future.


  • But what I want to do here is show you what a neural network actually is, assuming no background, and to help visualize what it's doing, not as a buzzword but as a piece of math.


  • My hope is just that you come away feeling like this structure itself is motivated, and feeling like you know what it means when you read or hear about a neural network quote-unquote "learning."


  • This video is just going to be devoted to the structure component of that, and the following one is going to tackle learning.


  • What we're going to do is put together a neural network that can learn to recognize handwritten digits.


  • This is a somewhat classic example for introducing the topic, and I'm happy to stick with the status quo here, because at the end of the two videos I want to point you to a couple of good resources where you can learn more, and where you can download the code that does this and play with it on your own computer.


  • There are many, many variants of neural networks, and in recent years there's been sort of a boom in research towards these variants.


  • But in these two introductory videos, you and I are just going to look at the simplest plain-vanilla form (a multilayer perceptron) with no added frills.


  • This is kind of a necessary prerequisite for understanding any of the more powerful modern variants, and trust me, it still has plenty of complexity for us to wrap our minds around.


  • But even in this simplest form, it can learn to recognize handwritten digits, which is a pretty cool thing for a computer to be able to do.


  • And at the same time, you'll see how it does fall short of a couple of hopes that we might have for it.


  • As the name suggests, neural networks are inspired by the brain. But let's break that down: what are the neurons, and in what sense are they linked together?


  • Right now, when I say neuron, all I want you to think about is a thing that holds a number: specifically, a number between 0 and 1. It's really not more than that.


  • For example, the network starts with a bunch of neurons corresponding to each of the 28 by 28 pixels of the input image, which is 784 neurons in total.

  • Each one of these holds a number that represents the grayscale value of the corresponding pixel, ranging from 0 for black pixels up to 1 for white pixels.
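As a concrete sketch (not from the video; the `image` grid and its 8-bit pixel range are assumptions), here is how those 784 input activations could be read off a 28 by 28 grid of pixel values:

```python
# Sketch: turning a 28x28 grayscale image into the 784 input activations.
# `image` is a hypothetical 28x28 grid of 8-bit pixel values (0 = black, 255 = white).

def input_activations(image):
    """Flatten the grid and rescale each pixel to a number between 0 and 1."""
    return [pixel / 255 for row in image for pixel in row]

# A toy all-black image with a single white pixel:
image = [[0] * 28 for _ in range(28)]
image[3][5] = 255

first_layer = input_activations(image)
print(len(first_layer))  # 784 neurons in total
print(max(first_layer))  # 1.0 for the white pixel
```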


  • This number inside the neuron is called its activation, and the image you might have in mind here is that each neuron is lit up when its activation is a high number.


  • So all of these 784 neurons make up the first layer of our network.

  • Now jumping over to the last layer: this has ten neurons, each representing one of the digits.

  • The activation in these neurons, again some number between zero and one, represents how much the system thinks that a given image corresponds with a given digit.

  • There's also a couple of layers in between called the hidden layers, which for the time being should just be a giant question mark for how on earth this process of recognizing digits is going to be handled.

  • In this network, I chose two hidden layers, each one with 16 neurons, and admittedly that's kind of an arbitrary choice.

  • To be honest, I chose two layers based on how I want to motivate the structure in just a moment, and 16? Well, that was just a nice number to fit on the screen.

  • In practice, there is a lot of room for experimenting with the specific structure here.


  • The way the network operates, activations in one layer determine the activations of the next layer.

  • And of course, the heart of the network as an information-processing mechanism comes down to exactly how those activations from one layer bring about activations in the next layer.

  • It's meant to be loosely analogous to how, in biological networks of neurons, some groups of neurons firing cause certain others to fire.

  • Now, the network I'm showing here has already been trained to recognize digits, and let me show you what I mean by that.

  • It means if you feed in an image, lighting up all 784 neurons of the input layer according to the brightness of each pixel in the image,

  • that pattern of activations causes some very specific pattern in the next layer, which causes some pattern in the one after it, which finally gives some pattern in the output layer.

  • The brightest neuron of that output layer is the network's choice, so to speak, for what digit this image represents.
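That layer-to-layer flow can be sketched in a few lines of Python. Everything here is an assumption for illustration: the weighted-sum-plus-bias-then-sigmoid rule is only developed later in the video, and the parameters below are random rather than trained, so the network's "choice" is meaningless; the point is just how activations propagate from the 784 inputs to the 10 outputs.

```python
import math
import random

def sigmoid(x):
    # Squishes any real number into the range (0, 1).
    return 1 / (1 + math.exp(-x))

def next_layer(activations, weights, biases):
    # Each neuron: sigmoid of its weighted sum of the previous layer, plus a bias.
    return [sigmoid(sum(w * a for w, a in zip(ws, activations)) + b)
            for ws, b in zip(weights, biases)]

def feed_forward(pixels, layers):
    activations = pixels
    for weights, biases in layers:
        activations = next_layer(activations, weights, biases)
    return activations

# Random, untrained parameters for the 784 -> 16 -> 16 -> 10 shape:
random.seed(0)
sizes = [784, 16, 16, 10]
layers = [([[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
           [0.0] * n_out)
          for n_in, n_out in zip(sizes, sizes[1:])]

output = feed_forward([0.5] * 784, layers)
choice = output.index(max(output))  # the "brightest" output neuron
print(len(output), choice)
```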


  • And before jumping into the math for how one layer influences the next, or how training works, let's just talk about why it's even reasonable to expect a layered structure like this to behave intelligently.

  • What are we expecting here? What is the best hope for what those middle layers might be doing?

  • Well, when you or I recognize digits, we piece together various components: a nine has a loop up top and a line on the right;

  • an 8 also has a loop up top, but it's paired with another loop down low;

  • a 4 basically breaks down into three specific lines, and things like that.


  • Now, in a perfect world, we might hope that each neuron in the second-to-last layer corresponds with one of these subcomponents,

  • so that any time you feed in an image with, say, a loop up top like a 9 or an 8, there's some specific neuron whose activation is going to be close to one.

  • And I don't mean this specific loop of pixels; the hope would be that any generally loopy pattern towards the top sets off this neuron.

  • That way, going from the third layer to the last one just requires learning which combination of subcomponents corresponds to which digits.


  • Of course, that just kicks the problem down the road, because how would you recognize these subcomponents, or even learn what the right subcomponents should be?

  • And I still haven't even talked about how one layer influences the next, but run with me on this one for a moment.


  • Recognizing a loop can also break down into subproblems. One reasonable way to do this would be to first recognize the various little edges that make it up.

  • Similarly, a long line like the kind you might see in the digits 1 or 4 or 7 is really just a long edge, or maybe you think of it as a certain pattern of several smaller edges.


  • So maybe our hope is that each neuron in the second layer of the network corresponds with the various relevant little edges.


  • Maybe when an image like this one comes in, it lights up all of the neurons associated with around eight to ten specific little edges,

  • which in turn lights up the neurons associated with the upper loop and a long vertical line, and those light up the neuron associated with a nine.


  • Whether or not this is what our final network actually does is another question, one that I'll come back to once we see how to train the network.


  • But this is a hope that we might have, a sort of goal with the layered structure like this.


  • Moreover, you can imagine how being able to detect edges and patterns like this would be really useful for other image-recognition tasks.


  • And even beyond image recognition, there are all sorts of intelligent things you might want to do that break down into layers of abstraction.


  • Parsing speech, for example, involves taking raw audio and picking out distinct sounds, which combine to make certain syllables, which combine to form words, which combine to make up phrases and more abstract thoughts, etc.


  • But getting back to how any of this actually works, picture yourself right now designing how exactly the activations in one layer might determine the activations in the next.


  • The goal is to have some mechanism that could conceivably combine pixels into edges, or edges into patterns, or patterns into digits.

  • And to zoom in on one very specific example, let's say the hope is for one particular neuron in the second layer to pick up on whether or not the image has an edge in this region here.


  • The question at hand is: what parameters should the network have? What dials and knobs should you be able to tweak so that it's expressive enough to potentially capture this pattern, or any other pixel pattern, or the pattern that several edges can make a loop, and other such things?


  • Well, what we'll do is assign a weight to each one of the connections between our neuron and the neurons from the first layer. These weights are just numbers.

  • Then take all those activations from the first layer and compute their weighted sum according to these weights.

  • I find it helpful to think of these weights as being organized into a little grid of their own,


  • and I'm going to use green pixels to indicate positive weights and red pixels to indicate negative weights, where the brightness of that pixel is some loose depiction of the weight's value.


  • Now, if we made the weights associated with almost all of the pixels zero, except for some positive weights in this region that we care about, then taking the weighted sum of all the pixel values really just amounts to adding up the values of the pixels just in the region that we care about.


  • And if you really want it to pick up on whether there's an edge here, what you might do is have some negative weights associated with the surrounding pixels.

  • Then the sum is largest when those middle pixels are bright but the surrounding pixels are darker.
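A one-dimensional toy version of that weight layout (all the numbers here are invented for illustration, not the video's actual weights): positive weights in the middle, negative weights on the surround, zero at the fringes.

```python
def weighted_sum(pixels, weights):
    """Multiply each pixel value by its weight and add everything up."""
    return sum(w * p for w, p in zip(weights, pixels))

# Invented weights: negative surround (the "red" pixels), positive center
# (the "green" pixels), zero elsewhere.
weights = [0, -1, -1, 2, 2, -1, -1, 0]

edge = [0.0, 0.1, 0.1, 0.9, 0.9, 0.1, 0.1, 0.0]  # bright middle, dark surround
gray = [0.5] * 8                                  # a flat gray patch

print(weighted_sum(edge, weights))  # large positive: the pattern is present
print(weighted_sum(gray, weights))  # zero: nothing edge-like here
```

The flat patch scores zero because the positive and negative weights cancel; only a bright-center, dark-surround pattern produces a large sum.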


  • When you compute a weighted sum like this, you might come out with any number, but for this network what we want is for activations to be some value between 0 and 1.

  • So a common thing to do is to pump this weighted sum into some function that squishes the real number line into the range between 0 and 1.


  • A common function that does this is called the sigmoid function, also known as a logistic curve.

  • Basically, very negative inputs end up close to zero, very positive inputs end up close to 1, and it just steadily increases around the input 0.

  • So the activation of the neuron here is basically a measure of how positive the relevant weighted sum is.
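In code, the sigmoid is one line (a sketch using Python's standard math module):

```python
import math

def sigmoid(x):
    """Logistic curve: squishes the whole real number line into (0, 1)."""
    return 1 / (1 + math.exp(-x))

print(sigmoid(-10))  # very negative input: close to 0
print(sigmoid(0))    # 0.5, in the steadily increasing middle
print(sigmoid(10))   # very positive input: close to 1
```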


  • But maybe it's not that you want the neuron to light up when the weighted sum is bigger than 0.

  • Maybe you only want it to be active when the sum is bigger than, say, 10. That is, you want some bias for it to be inactive.


  • What we'll do then is just add in some other number, like negative 10, to this weighted sum before plugging it through the sigmoid squishification function.

  • That additional number is called the bias.


  • So the weights tell you what pixel pattern this neuron in the second layer is picking up on, and the bias tells you how high the weighted sum needs to be before the neuron starts getting meaningfully active.
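Putting the pieces together, one neuron's activation is the sigmoid of its weighted sum plus its bias. The numbers below are invented, and only four inputs are used instead of 784, just to keep it readable:

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def activation(inputs, weights, bias):
    """One neuron: sigmoid of the weighted sum of its inputs, plus a bias."""
    return sigmoid(sum(w * a for w, a in zip(weights, inputs)) + bias)

inputs  = [1.0, 1.0, 0.9, 0.0]   # activations from the previous layer
weights = [5.0, 5.0, 5.0, -5.0]  # what pattern this neuron picks up on
bias    = -10.0                  # the weighted sum must clear about 10 to matter

print(activation(inputs, weights, bias))       # weighted sum 14.5 clears the bias
print(activation(inputs, weights, bias - 10))  # a stricter bias keeps it inactive
```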


  • And that is just one neuron.


  • Every other neuron in this layer is going to be connected to all 784 pixel neurons from the first layer, and each one of those 784 connections has its own weight associated with it.


  • Also, each one has some bias, some other number that you add on to the weighted sum before squishing it with the sigmoid.

  • And that's a lot to think about! With this hidden layer of 16 neurons, that's a total of 784 times 16 weights, along with 16 biases.


  • And all of that is just the connections from the first layer to the second. The connections between the other layers also have a bunch of weights and biases associated with them.


  • All said and done, this network has almost exactly 13,000 total weights and biases:

  • 13,000 knobs and dials that can be
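The "almost exactly 13,000" figure checks out with quick arithmetic for the 784 → 16 → 16 → 10 shape:

```python
# Weights: one per connection between consecutive layers.
# Biases: one per neuron outside the input layer.
sizes = [784, 16, 16, 10]
n_weights = sum(n_in * n_out for n_in, n_out in zip(sizes, sizes[1:]))
n_biases = sum(sizes[1:])
print(n_weights, n_biases, n_weights + n_biases)  # 12960 42 13002
```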