  • This is a three. It's sloppily written and rendered at an extremely low resolution of 28 by 28 pixels.

  • But your brain has no trouble recognizing it as a three, and I want you to take a moment to appreciate how crazy it is that brains can do this so effortlessly.

  • I mean, this, this, and this are also recognizable as threes, even though the specific pixel values are very different from one image to the next.

  • The particular light-sensitive cells in your eye that are firing when you see this three are very different from the ones firing when you see this three.

  • But something in that crazy-smart visual cortex of yours resolves these as representing the same idea, while at the same time recognizing other images as their own distinct ideas.

  • But if I told you, "Hey, sit down and write for me a program that takes in a grid of 28 by 28 pixels like this and outputs a single number between 0 and 10, telling you what it thinks the digit is," well, the task goes from comically trivial to dauntingly difficult.

  • Unless you've been living under a rock, I think I hardly need to motivate the relevance and importance of machine learning and neural networks to the present and to the future.

  • But what I want to do here is show you what a neural network actually is, assuming no background, and to help visualize what it's doing, not as a buzzword but as a piece of math.

  • My hope is just that you come away feeling like the structure itself is motivated, and feeling like you know what it means when you read or hear about a neural network, quote-unquote, "learning."

  • This video is just going to be devoted to the structure component of that, and the following one is going to tackle learning.

  • What we're going to do is put together a neural network that can learn to recognize handwritten digits.

  • This is a somewhat classic example for introducing the topic, and I'm happy to stick with the status quo here, because at the end of the two videos I want to point you to a couple of good resources where you can learn more, and where you can download the code that does this and play with it on your own computer.

  • There are many, many variants of neural networks, and in recent years there's been sort of a boom in research towards these variants, but in these two introductory videos you and I are just going to look at the simplest, plain-vanilla form with no added frills.

  • This is kind of a necessary prerequisite for understanding any of the more powerful modern variants, and trust me, it still has plenty of complexity for us to wrap our minds around.

  • But even in this simplest form it can learn to recognize handwritten digits, which is a pretty cool thing for a computer to be able to do.

  • And at the same time, you'll see how it does fall short of a couple of hopes that we might have for it.

  • As the name suggests, neural networks are inspired by the brain, but let's break that down: what are the neurons, and in what sense are they linked together?

  • Right now, when I say neuron, all I want you to think about is a thing that holds a number, specifically a number between 0 and 1. It's really not more than that.

  • For example, the network starts with a bunch of neurons corresponding to each of the 28 by 28 pixels of the input image, which is 784 neurons in total. Each one of these holds a number that represents the grayscale value of the corresponding pixel, ranging from 0 for black pixels up to 1 for white pixels.

  • This number inside the neuron is called its activation, and the image you might have in mind here is that each neuron is lit up when its activation is a high number.

  • So all of these 784 neurons make up the first layer of our network.
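
A minimal sketch of that first layer in NumPy, assuming a hypothetical 28 by 28 grayscale image with pixel values in the usual 0-255 range (the array `image` here is just placeholder data):

```python
import numpy as np

# Placeholder for a 28x28 grayscale image with integer pixel values in 0..255.
image = np.random.randint(0, 256, size=(28, 28))

# Flatten into 784 activations, scaled so black -> 0.0 and white -> 1.0.
activations = image.flatten() / 255.0

print(activations.shape)                     # (784,)
print(activations.min(), activations.max())  # values between 0 and 1
```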

  • Now jumping over to the last layer, this has ten neurons, each representing one of the digits.

  • The activation in these neurons, again some number that's between zero and one, represents how much the system thinks that a given image corresponds with a given digit.

  • There's also a couple of layers in between called the hidden layers, which for the time being should just be a giant question mark for how on earth this process of recognizing digits is going to be handled.

  • In this network I chose two hidden layers, each one with 16 neurons, and admittedly that's kind of an arbitrary choice.

  • To be honest, I chose two layers based on how I want to motivate the structure in just a moment, and 16, well, that was just a nice number to fit on the screen. In practice there is a lot of room to experiment with the specific structure here.

  • The way the network operates, activations in one layer determine the activations of the next layer.

  • And of course the heart of the network as an information-processing mechanism comes down to exactly how those activations from one layer bring about activations in the next layer.

  • It's meant to be loosely analogous to how, in biological networks of neurons, some groups of neurons firing cause certain others to fire.

  • Now, the network I'm showing here has already been trained to recognize digits, and let me show you what I mean by that.

  • It means if you feed in an image, lighting up all 784 neurons of the input layer according to the brightness of each pixel in the image, that pattern of activations causes some very specific pattern in the next layer, which causes some pattern in the one after it, which finally gives some pattern in the output layer.

  • And the brightest neuron of that output layer is the network's choice, so to speak, for what digit this image represents.

  • And before jumping into the math for how one layer influences the next, or how training works, let's just talk about why it's even reasonable to expect a layered structure like this to behave intelligently.

  • What are we expecting here? What is the best hope for what those middle layers might be doing?

  • Well, when you or I recognize digits, we piece together various components: a nine has a loop up top and a line on the right; an 8 also has a loop up top, but it's paired with another loop down low; a 4 basically breaks down into three specific lines; and things like that.

  • Now, in a perfect world, we might hope that each neuron in the second-to-last layer corresponds with one of these subcomponents.

  • That is, anytime you feed in an image with, say, a loop up top, like a 9 or an 8, there's some specific neuron whose activation is going to be close to one. And I don't mean this specific loop of pixels; the hope would be that any generally loopy pattern towards the top sets off this neuron.

  • That way, going from the third layer to the last one just requires learning which combination of subcomponents corresponds to which digits.

  • Of course, that just kicks the problem down the road, because how would you recognize these subcomponents, or even learn what the right subcomponents should be? And I still haven't even talked about how one layer influences the next, but run with me on this one for a moment.

  • Recognizing a loop can also break down into subproblems. One reasonable way to do this would be to first recognize the various little edges that make it up.

  • Similarly, a long line, like the kind you might see in the digits 1 or 4 or 7, is really just a long edge, or maybe you think of it as a certain pattern of several smaller edges.

  • So maybe our hope is that each neuron in the second layer of the network corresponds with the various relevant little edges.

  • Maybe when an image like this one comes in, it lights up all of the neurons associated with around eight to ten specific little edges, which in turn light up the neurons associated with the upper loop and a long vertical line, and those light up the neuron associated with a nine.

  • Whether or not this is what our final network actually does is another question, one that I'll come back to once we see how to train the network, but this is a hope that we might have, a sort of goal with the layered structure like this.

  • Moreover, you can imagine how being able to detect edges and patterns like this would be really useful for other image-recognition tasks.

  • And even beyond image recognition, there are all sorts of intelligent things you might want to do that break down into layers of abstraction.

  • Parsing speech, for example, involves taking raw audio and picking out distinct sounds, which combine to make certain syllables, which combine to form words, which combine to make up phrases and more abstract thoughts, etc.

  • But getting back to how any of this actually works, picture yourself right now designing how exactly the activations in one layer might determine the activations in the next.

  • The goal is to have some mechanism that could conceivably combine pixels into edges, or edges into patterns, or patterns into digits. And to zoom in on one very specific example, let's say the hope is for one particular neuron in the second layer to pick up on whether or not the image has an edge in this region here.

  • The question at hand is: what parameters should the network have? What dials and knobs should you be able to tweak so that it's expressive enough to potentially capture this pattern, or any other pixel pattern, or the pattern that several edges make a loop, and other such things?

  • Well, what we'll do is assign a weight to each one of the connections between our neuron and the neurons from the first layer. These weights are just numbers.

  • Then take all those activations from the first layer and compute their weighted sum according to these weights.

  • I find it helpful to think of these weights as being organized into a little grid of their own, and I'm going to use green pixels to indicate positive weights and red pixels to indicate negative weights, where the brightness of that pixel is some loose depiction of the weight's value.

  • Now, if we made the weights associated with almost all of the pixels zero, except for some positive weights in the region we care about, then taking the weighted sum of all the pixel values really just amounts to adding up the values of the pixels in that region.

  • And if you really wanted it to pick up on whether there's an edge here, what you might do is have some negative weights associated with the surrounding pixels. Then the sum is largest when those middle pixels are bright but the surrounding pixels are darker.

  • When you compute a weighted sum like this, you might come out with any number, but for this network what we want is for activations to be some value between 0 and 1.

  • So a common thing to do is to pump this weighted sum into some function that squishes the real number line into the range between 0 and 1.

  • A common function that does this is called the sigmoid function, also known as a logistic curve. Basically, very negative inputs end up close to zero, very positive inputs end up close to 1, and it just steadily increases around the input 0.

  • So the activation of the neuron here is basically a measure of how positive the relevant weighted sum is.
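
A small sketch of that squishing behavior, with a handful of arbitrary sample inputs just for illustration:

```python
import numpy as np

def sigmoid(x):
    """Logistic curve: squishes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Very negative inputs land near 0, very positive inputs land near 1,
# and the output increases steadily around an input of 0.
for x in [-10.0, -1.0, 0.0, 1.0, 10.0]:
    print(x, sigmoid(x))
```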

  • But maybe it's not that you want the neuron to light up when the weighted sum is bigger than 0. Maybe you only want it to be active when the sum is bigger than, say, 10. That is, you want some bias for it to be inactive.

  • What we'll do then is just add in some other number, like negative 10, to this weighted sum before plugging it through the sigmoid squishification function. That additional number is called the bias.

  • So the weights tell you what pixel pattern this neuron in the second layer is picking up on, and the bias tells you how high the weighted sum needs to be before the neuron starts getting meaningfully active.
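
Putting weights, bias, and sigmoid together, a single neuron's activation can be sketched as follows. This is only an illustration: the weights and previous-layer activations are random placeholders, and the `sigmoid` helper is the one shown above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# 784 activations from the previous layer (placeholder values in [0, 1]).
previous_activations = np.random.rand(784)

# One weight per connection into this neuron, plus a single bias.
weights = np.random.randn(784)
bias = -10.0  # e.g. require the weighted sum to exceed 10 before the neuron meaningfully activates

# Activation = sigmoid(weighted sum + bias), a number between 0 and 1.
activation = sigmoid(np.dot(weights, previous_activations) + bias)
print(activation)
```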

  • And that is just one neuron. Every other neuron in this layer is going to be connected to all 784 pixel neurons from the first layer, and each one of those 784 connections has its own weight associated with it.

  • Also, each one has some bias, some other number that you add on to the weighted sum before squishing it with the sigmoid.

  • That's a lot to think about! With this hidden layer of 16 neurons, that's a total of 784 times 16 weights, along with 16 biases.

  • And all of that is just the connections from the first layer to the second. The connections between the other layers also have a bunch of weights and biases associated with them.

  • All said and done, this network has almost exactly 13,000 total weights and biases: 13,000 knobs and dials that can be tweaked and turned to make this network behave in different ways.
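
As a quick check of that count, here is the arithmetic for the 784-16-16-10 layer sizes used in this network:

```python
# Weights: one per connection between consecutive layers.
# Biases: one per neuron in every layer after the input layer.
layers = [784, 16, 16, 10]

weights = sum(a * b for a, b in zip(layers, layers[1:]))  # 784*16 + 16*16 + 16*10 = 12960
biases = sum(layers[1:])                                  # 16 + 16 + 10 = 42

print(weights + biases)  # 13002, i.e. "almost exactly 13,000"
```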

  • So when we talk about learning, what that's referring to is getting the computer to find a valid setting for all of these many, many numbers so that it'll actually solve the problem at hand.

  • One thought experiment that is at once fun and kind of horrifying is to imagine sitting down and setting all of these weights and biases by hand, purposefully tweaking the numbers so that the second layer picks up on edges, the third layer picks up on patterns, etc.

  • I personally find this satisfying, rather than just reading the network as a total black box, because when the network doesn't perform the way you anticipate, if you've built up a little bit of a relationship with what those weights and biases actually mean, you have a starting place for experimenting with how to change the structure to improve.

  • Or, when the network does work, but not for the reasons you might expect, digging into what the weights and biases are doing is a good way to challenge your assumptions and really expose the full space of possible solutions.

  • By the way, the actual function here is a little cumbersome to write down, don't you think?

  • So let me show you a more notationally compact way that these connections are represented. This is how you'd see it if you choose to read up more about neural networks.

  • Organize all of the activations from one layer into a column, as a vector. Then organize all of the weights as a matrix, where each row of that matrix corresponds to the connections between one layer and a particular neuron in the next layer.

  • What that means is that taking the weighted sum of the activations in the first layer according to these weights corresponds to one of the terms in the matrix-vector product of everything we have on the left here.

  • By the way, so much of machine learning just comes down to having a good grasp of linear algebra, so for any of you who want a nice visual understanding of matrices and what matrix-vector multiplication means, take a look at the series I did on linear algebra, especially chapter three.

  • Back to our expression: instead of talking about adding the bias to each one of these values independently, we represent it by organizing all those biases into a vector and adding the entire vector to the previous matrix-vector product.

  • Then, as a final step, I'll wrap a sigmoid around the outside here, and what that's supposed to represent is that you're going to apply the sigmoid function to each specific component of the resulting vector inside.

  • So once you write down this weight matrix and these vectors as their own symbols, you can communicate the full transition of activations from one layer to the next in an extremely tight and neat little expression, and this makes the relevant code both a lot simpler and a lot faster, since many libraries optimize the heck out of matrix multiplication.
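
In symbols, that tight little expression is a' = sigmoid(W a + b). A minimal NumPy sketch of one such layer transition, using the 784-to-16 layer sizes from this network and random placeholder values:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

a = np.random.rand(784)       # activations of one layer, organized as a vector
W = np.random.randn(16, 784)  # one row of weights per neuron in the next layer
b = np.random.randn(16)       # one bias per neuron in the next layer

# The full transition from one layer to the next in a single expression.
a_next = sigmoid(W @ a + b)
print(a_next.shape)  # (16,)
```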

  • Remember how earlier I said these neurons are simply things that hold numbers? Well, of course, the specific numbers that they hold depend on the image you feed in.

  • So it's actually more accurate to think of each neuron as a function, one that takes in the outputs of all the neurons in the previous layer and spits out a number between zero and one.

  • Really, the entire network is just a function, one that takes in 784 numbers as an input and spits out ten numbers as an output.

  • It's an absurdly complicated function, one that involves thirteen thousand parameters in the form of these weights and biases that pick up on certain patterns, and which involves iterating many matrix-vector products and the sigmoid squishification function.
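
To make the "whole network is one function" idea concrete, here is a sketch of a forward pass through a 784-16-16-10 network. The weights and biases are random, untrained placeholders standing in for the learned ones, so its guesses are meaningless; it only illustrates the shape of the computation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Random, untrained parameters for a 784-16-16-10 network (placeholders only).
layer_sizes = [784, 16, 16, 10]
weights = [np.random.randn(n_out, n_in) for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
biases = [np.random.randn(n_out) for n_out in layer_sizes[1:]]

def network(pixels):
    """Take in 784 numbers, spit out 10: one matrix-vector product and sigmoid per layer."""
    a = pixels
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a

output = network(np.random.rand(784))
print(output.shape)            # (10,)
print(int(np.argmax(output)))  # the "brightest" output neuron, i.e. the network's guess
```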

  • But it's just a function nonetheless, and in a way it's kind of reassuring that it looks complicated. I mean, if it were any simpler, what hope would we have that it could take on the challenge of recognizing digits?

  • And how does it take on that challenge? How does this network learn the appropriate weights and biases just by looking at data?

  • That's what I'll show in the next video, and I'll also dig a little more into what this particular network we are seeing is really doing.

  • Now is the point where I suppose I should say subscribe to stay notified about when that video, or any new videos, come out, but realistically most of you don't actually receive notifications from YouTube, do you?

  • Maybe more honestly I should say subscribe so that the neural networks that underlie YouTube's recommendation algorithm are primed to believe that you want to see content from this channel get recommended to you. Anyway, stay posted for more.

  • Thank you very much to everyone supporting these videos on Patreon.

  • I've been a little slow to progress in the probability series this summer, but I'm jumping back into it after this project, so patrons, you can look out for updates there.

  • To close things off here, I have with me Lisha Li, who did her PhD work on the theoretical side of deep learning and who currently works at a venture capital firm called Amplify Partners, who kindly provided some of the funding for this video.

  • So Lisha, one thing I think we should quickly bring up is this sigmoid function. As I understand it, early networks used this to squish the relevant weighted sum into that interval between zero and one, you know, kind of motivated by this biological analogy of neurons either being inactive or active. (Lisha) - Exactly.

  • (3B1B) - But relatively few modern networks actually use sigmoid anymore. That's kind of old school, right? (Lisha) - Yeah, or rather, ReLU seems to be much easier to train. (3B1B) - And ReLU really stands for rectified linear unit.

  • (Lisha) - Yes, it's this kind of function where you're just taking a max of 0 and a, where a is given by what you were explaining in the video.

  • What this was sort of motivated from, I think, was partially a biological analogy with how neurons would either be activated or not: if it passes a certain threshold, it would be the identity function, but if it did not, then it would just not be activated, so it would be zero. So it's kind of a simplification.

  • Using sigmoids didn't help training, or it was very difficult to train at some point, and people just tried ReLU, and it happened to work very well for these incredibly deep neural networks. (3B1B) - All right, thank you, Lisha.

  • For background, Amplify Partners, an early-stage VC, invests in technical founders building the next generation of companies focused on the applications of AI.

  • If you or someone that you know has ever thought about starting a company someday, or if you're working on an early-stage one right now, the Amplify folks would love to hear from you.

  • They even set up a specific email for this video, 3blue1brown@amplifypartners.com, so feel free to reach out to them through that.
