Neural Networks and Deep Learning: Crash Course AI #3
  • Hi, I'm Jabril, and welcome to Crash Course AI!

  • In the supervised learning episode, we taught John Green-bot to learn using a perceptron,

  • a program that imitates one neuron.

  • But our brains make decisions with 100 billion neurons, which have trillions of connections

  • between them!

  • We can actually do a lot more with AI if we connect a bunch of perceptrons together, to

  • create what's called an artificial neural network.

  • Neural networks are better than other methods for certain tasks, like image recognition.

  • The secret to their success is their hidden layers, and they're mathematically very

  • elegant.

  • Both of these reasons are why neural networks are one of the most dominant machine learning

  • technologies used today.

  • [INTRO]

  • Not that long ago, a big challenge in AI was real-world image recognition, like recognizing

  • a dog from a cat, and a car from a plane from a boat.

  • Even though we do it every day, it's really hard for computers.

  • That's because computers are good at literal comparisons, like matching 0s and 1s, one

  • at a time.

  • It's easy for a computer to tell that these images are the same by matching the pixels.
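
(As a quick aside for readers: here's a minimal sketch of what that kind of literal comparison looks like in code. The tiny pixel arrays are made up for illustration.)

```python
import numpy as np

# Two tiny 2x2 grayscale "images" as arrays of pixel brightnesses.
img_a = np.array([[0, 255], [128, 64]])
img_b = np.array([[0, 255], [128, 64]])
img_c = np.array([[0, 255], [128, 65]])  # one pixel is different

# A computer can easily tell that img_a and img_b are the same
# by matching the pixels one at a time.
print(np.array_equal(img_a, img_b))  # True
# But a single changed pixel breaks the literal match, even though
# a person would call img_a and img_c "the same picture."
print(np.array_equal(img_a, img_c))  # False
```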

  • But before AI, a computer couldn't tell that these images are of the same dog, and

  • had no hope of telling that all of these different images are dogs.

  • So, a professor named Fei-Fei Li and a group of other machine learning and computer vision

  • researchers wanted to help the research community develop AI that could recognize images.

  • The first step was to create a huge public dataset of labeled real-world photos.

  • That way, computer scientists around the world could come up with and test different algorithms.

  • They called this dataset ImageNet.

  • It has 3.2 million labeled images, sorted into 5,247 nested categories of nouns.

  • Like, for example, the “dog” label is nested under “domestic animal,” which is nested

  • under “animal.”

  • Humans are the best at reliably labeling data.

  • But if one person did all this labeling, taking 10 seconds per label, without any sleep or

  • snack breaks, it would take them over a year!
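
(The arithmetic behind that “over a year” claim checks out; here's a rough sketch using the 3.2 million images mentioned above.)

```python
# Rough check of the "over a year" claim: 3.2 million labels at
# 10 seconds each, with no sleep or snack breaks.
images = 3_200_000
seconds = images * 10            # 32,000,000 seconds of labeling
days = seconds / (60 * 60 * 24)  # convert seconds to days
print(round(days))               # 370 -- just over a year
```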

  • So ImageNet used crowd-sourcing and leveraged the power of the Internet to cheaply spread

  • the work between thousands of people.

  • Once the data was in place, the researchers started an annual competition in 2010 to get

  • people to contribute their best solutions to image recognition.

  • Enter Alex Krizhevsky, who was a graduate student at the University of Toronto.

  • In 2012, he decided to apply a neural network to ImageNet, even though similar solutions

  • hadn't been successful in the past.

  • His neural network, called AlexNet, had a couple of innovations that set it apart.

  • He used a lot of hidden layers, which we'll get to in a minute.

  • He also used faster computation hardware to handle all the math that neural networks do.

  • AlexNet outperformed the next best approaches by over 10%.

  • It only got 3 out of every 20 images wrong.

  • In grade terms, it was getting a solid B while other techniques were scraping by with a low

  • C.

  • Since 2012, neural network solutions have taken over the annual competition, and the

  • results keep getting better and better.

  • Plus, AlexNet sparked an explosion of research into neural networks, which we started to

  • apply to lots of things beyond image recognition.

  • To understand how neural networks can be used for these classification problems, we have

  • to understand their architecture first.

  • All neural networks are made up of an input layer, an output layer, and any number of

  • hidden layers in between.

  • There are many different arrangements, but we'll use the classic multi-layer perceptron

  • as an example.

  • The input layer is where the neural network receives data represented as numbers.

  • Each input neuron represents a single feature, which is some characteristic of the data.

  • Features are straightforward if you're talking about something that's already a number,

  • like grams of sugar in a donut.

  • But, really, just about anything can be converted to a number.

  • Sounds can be represented as the amplitudes of the sound wave.

  • So each feature would have a number that represents the amplitude at a moment in time.

  • Words in a paragraph can be represented by how many times each word appears.

  • So each feature would have the frequency of one word.
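
(For readers who want something concrete, here's a minimal sketch of word-count features. The sample paragraph and vocabulary are made up for illustration.)

```python
from collections import Counter

# A made-up paragraph, just for illustration.
paragraph = "the dog chased the ball and the dog barked"
counts = Counter(paragraph.split())

# Each feature holds the frequency of one word from a fixed vocabulary.
vocabulary = ["dog", "ball", "cat", "the"]
features = [counts[word] for word in vocabulary]
print(features)  # [2, 1, 0, 3]
```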

  • Or, if we're trying to label an image of a dog, each feature would represent information

  • about a pixel.

  • So for a grayscale image, each feature would have a number representing how bright a pixel

  • is.

  • But for a color image, we can represent each pixel with three numbers: the amount of red,

  • green, and blue, which can be combined to make any color on your computer screen.
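
(And a small sketch of pixel features, with made-up 2-by-2 images just to show the feature counts.)

```python
import numpy as np

# Made-up 2x2 images, just to show how pixels become features.
gray = np.array([[0.1, 0.9],
                 [0.5, 0.3]])    # one brightness number per pixel

rng = np.random.default_rng(0)
color = rng.random((2, 2, 3))    # red, green, and blue per pixel

# The input layer just sees a flat list of numbers.
print(gray.flatten())            # 4 features for a 2x2 grayscale image
print(color.flatten().size)      # 12 features: 2 * 2 pixels * 3 channels
```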

  • Once the features have data, each one sends its number to every neuron in the next layer,

  • called the hidden layer.

  • Then, each hidden layer neuron mathematically combines all the numbers it gets.

  • The goal is to measure whether the input data has certain components.

  • For an image recognition problem, these components may be a certain color in the center, a curve

  • near the top, or even whether the image contains eyes, ears, or fur.

  • Instead of answering yes or no, like the simple perceptron from the previous episode, each

  • neuron in the hidden layer does some slightly more complicated math and outputs a number.

  • And then, each neuron sends its number to every neuron in the next layer, which could

  • be another hidden layer or the output layer.

  • The output layer is where the final hidden layer outputs are mathematically combined

  • to answer the problem.
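
(Here's a minimal sketch of that whole flow for a tiny network. The weights are random rather than trained, and bias terms are omitted to keep it short.)

```python
import numpy as np

def sigmoid(x):
    # Squishes any number into the range (0, 1).
    return 1 / (1 + np.exp(-x))

# Random (untrained) weights for a tiny network:
# 4 input features -> 3 hidden neurons -> 1 output neuron.
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 4))  # row i: hidden neuron i's weights
W_output = rng.normal(size=(1, 3))  # the output neuron's weights

features = np.array([0.1, 0.9, 0.5, 0.3])  # e.g. pixel brightnesses

# Every feature feeds every hidden neuron; every hidden neuron
# feeds the output neuron.
hidden = sigmoid(W_hidden @ features)
output = sigmoid(W_output @ hidden)
print(output)  # one number between 0 and 1
```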

  • So, let's say we're just trying to label an image as a dog.

  • We might have a single output neuron representing a single answer - that the image is of a dog

  • or not.

  • But if there are many answers, like, for example, if we're labeling images with many possible categories, we'll

  • need a lot of output neurons.

  • Each output neuron will correspond to the probability of one label -- like, for example,

  • dog, car, spaghetti, and more.

  • And then we can pick the answer with the highest probability.
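
(A small sketch of that last step. The raw output values are made up, and the softmax step is one common way to turn them into probabilities; the episode doesn't name a specific method.)

```python
import numpy as np

# Made-up raw output-layer values for three labels.
labels = ["dog", "car", "spaghetti"]
raw = np.array([2.1, 0.3, -1.0])

# Softmax turns the raw numbers into probabilities that sum to 1.
probs = np.exp(raw) / np.exp(raw).sum()
print(dict(zip(labels, probs.round(2))))  # {'dog': 0.83, 'car': 0.14, ...}

# Pick the answer with the highest probability.
print(labels[int(np.argmax(probs))])      # dog
```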

  • The key to neural networks -- and really all of AI -- is math.

  • And I get it.

  • A neural network kind of seems like a black box that does math and spits out an answer.

  • I mean, those middle layers are even called hidden layers!

  • But we can understand the gist of what's happening by working through an example.

  • Oh, John Green-bot?

  • Let's give John Green-bot a program with a neural network that's been trained to

  • recognize a dog in a grayscale photo.

  • When we show him this photo, first, every feature will contain a number between 0 and 1 corresponding

  • to the brightness of one pixel.

  • And it'll pass this information to the hidden layer.

  • Now, let's focus on one hidden layer neuron.

  • Since the neural network is already trained, this neuron has a mathematical formula to

  • look for a particular component in the image, like a specific curve in the center:

  • the curve at the top of the nose.

  • If this neuron is focused on this specific shape and spot, it may not really care what's

  • happening everywhere else.

  • So it would multiply, or weight, the pixel values from most of those features by 0 or close

  • to 0.

  • Because it's looking for bright pixels here, it would multiply these pixel values by a

  • positive weight.

  • But this curve is also defined by a darker part below.

  • So the neuron would multiply these pixel values by a negative weight.

  • This hidden neuron will add all the weighted pixel values from the input neurons and squish

  • the result so that it's between 0 and 1.

  • The final number basically represents this neuron's guess that a specific curve,

  • a.k.a. a dog nose, appeared in the image.
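
(Here's a toy version of that one neuron, with hand-picked rather than trained weights: positive where the curve should be bright, negative where it should be dark, and zero elsewhere.)

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# A 3x3 patch of pixel brightnesses: a bright band with darker pixels below.
patch = np.array([[0.9, 0.9, 0.9],
                  [0.1, 0.1, 0.1],
                  [0.5, 0.5, 0.5]])

# Hand-picked weights for a "bright above, dark below" curve detector:
# positive where it expects bright pixels, negative where it expects
# dark ones, and zero where it doesn't care.
weights = np.array([[ 2.0,  2.0,  2.0],
                    [-2.0, -2.0, -2.0],
                    [ 0.0,  0.0,  0.0]])

activation = sigmoid(np.sum(weights * patch))
print(round(activation, 2))  # 0.99 -- the neuron "sees" its curve
```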

  • Other hidden neurons are looking for other components, like, for example, a different

  • curve in another part of the image, or a fuzzy texture.

  • When all of these neurons pass their estimates onto the next hidden layer, those neurons

  • may be trained to look for more complex components.

  • Like, one hidden neuron may check whether there's a shape that might be a dog nose.

  • It probably doesn't care about data from previous layers that looked for furry textures,

  • so it weights those by 0 or close to 0.

  • But it may really care about neurons that looked for the “top of the nose” and

  • “bottom of the nose” and “nostrils.”

  • It weights those by large positive numbers.

  • Again, it would add up all the weighted values from the previous layer neurons, squish the

  • value to be between 0 and 1, and pass this to the next layer.

  • That's the gist of the math, but we're simplifying a bit.

  • It's important to know that neural networks don't actually understand ideas like “nose”

  • or “eyelid.”

  • Each neuron is doing a calculation on the data it's given and just flagging specific

  • patterns of light and dark.

  • After a few more hidden layers, we reach the output layer with one neuron!

  • So after one more weighted addition of the previous layer's data, which happens in

  • the output neuron, the network should have a good estimate of whether this image is a dog.

  • Which means, John Green-bot should have a decision.

  • John Green-bot: Output neuron value: 0.93.

  • Probability that this is a dog: 93%!

  • Hey, John Green-bot, nice job!

  • Thinking about how a neural network would process just one image makes it clearer why

  • AI needs fast computers.

  • Like I mentioned before, each pixel in a color image will be represented by 3 numbers --- how

  • much red, green, and blue it has.

  • So to process a 1000 by 1000 pixel image, which is only about the size of a small

  • 3 by 3 inch photo, a neural network needs to look at 3 million features!

  • AlexNet needed more than 60 million weights (its learned parameters) to achieve this, which is a ton of math and

  • could take a lot of time to compute.
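
(Rough numbers behind that claim. The 1,000-neuron fully connected layer below is hypothetical, just to show the scaling; AlexNet actually reduces part of this cost with convolutional layers that share weights.)

```python
# Rough numbers behind the cost of big images.
pixels = 1000 * 1000   # a 1000x1000 image
features = pixels * 3  # red, green, and blue per pixel
print(features)        # 3,000,000 input features

# If those features fed a (hypothetical) fully connected layer of
# just 1,000 hidden neurons, that alone would be 3 billion weights.
print(features * 1000)  # 3,000,000,000
```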

  • Which is something we should keep in mind when designing neural networks to solve problems.

  • People are really excited about using deeper neural networks, which are networks with more

  • hidden layers, to do deep learning.

  • Deep networks can combine input data in more complex ways to look for more complex components,

  • and solve trickier problems.

  • But we can't make all networks like a billion layers deep, because more hidden layers

  • means more math, which again would mean that we need faster computers.

  • Plus, as a network gets deeper, it gets harder for us to make sense of why it's giving

  • the answers it does.

  • Each neuron in the first hidden layer is looking for some specific component of the input data.

  • But in deeper layers, those components get more abstract, further from how humans would

  • describe the same data.

  • Now, this may not seem like a big deal, but if a neural network was used to deny our loan

  • request, for example, we'd want to know why.

  • Which features made the difference?

  • How were they weighted towards the final answer?

  • In many countries, we have the legal right to understand why these kinds of decisions

  • were made.

  • And neural networks are being used to make more and more decisions about our lives.

  • Most banks, for example, use neural networks to detect and prevent fraud.

  • Many cancer tests, like the Pap test for cervical cancer, use a neural network to look at an

  • image of cells under a microscope, and decide whether there's a risk of cancer.

  • And neural networks are how Alexa understands what song you're asking her to play and

  • how Facebook suggests tags for our photos.

  • Understanding how all this happens is really important to being a human in the world right

  • now, whether or not you want to build your own neural network.

  • So this was a lot of big-picture stuff, but the program we gave John Green-bot had already

  • been trained to recognize dogs.

  • The neurons already had algorithms that weighted inputs.

  • Next time, we'll talk about the learning process used by neural networks to get to

  • the right weights for every neuron, and why they need so much data to work well.

  • Crash Course AI is produced in association with PBS Digital Studios.

  • If you want to help keep all Crash Course free for everyone, forever, you can join our

  • community on Patreon.

  • And if you want to learn more about the math behind neural networks, check out this video

  • from Crash Course Statistics about them.
