[MUSIC PLAYING]

JEFF DEAN: I'm excited to be here today to tell you about how I see deep learning and how it can be used to solve some of the really challenging problems that the world is facing. I should point out that I'm presenting the work of many, many different people at Google. So this is a broad perspective of a lot of the research that we're doing; it's not purely my work.

So first, as I'm sure you've all noticed, machine learning is growing in importance. There's a lot more emphasis on machine learning research, and there are a lot more uses of machine learning. This is a graph showing how many arXiv papers-- arXiv is a preprint hosting service for all kinds of different research, and these are the subcategories of it related to machine learning. What you see is that, since 2009, the number of papers posted has been growing at a really fast exponential rate, actually faster than the Moore's Law growth rate of computational power that we got so nicely used to for 40 years, but which has now slowed down. So we've replaced the nice growth in computing performance with growth in people generating ideas, which is nice.

And deep learning is a particular form of machine learning. It's actually a rebranding, in some sense, of a very old set of ideas around creating artificial neural networks: collections of simple trainable mathematical units organized in layers, where the higher layers typically build higher levels of abstraction based on things that the lower layers are learning. And you can train these things end to end. The algorithms that underlie a lot of the work we're doing today were actually developed 35 or 40 years ago. In fact, my colleague Geoff Hinton just won the Turing Award this year, along with Yann LeCun and Yoshua Bengio, for a lot of the work they did over the past 30 or 40 years. So really, the ideas are not new.
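The idea described above -- simple trainable units organized in layers, trained end to end -- can be sketched in a few lines. This is a hypothetical toy illustration (a tiny two-layer network learning XOR with plain gradient descent), not any system discussed in the talk; all names and hyperparameters here are made up for the example.

```python
import numpy as np

# Toy sketch: simple trainable mathematical units organized in layers,
# trained end to end by gradient descent. Learns XOR.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer parameters: the "trainable units".
W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

lr = 1.0
losses = []
for step in range(2000):
    # Forward pass: the higher layer builds on what the lower layer computes.
    h = sigmoid(X @ W1 + b1)          # hidden layer
    p = sigmoid(h @ W2 + b2)          # output layer
    losses.append(np.mean((p - y) ** 2))

    # Backward pass: end-to-end training via the chain rule.
    dp = 2 * (p - y) / len(X) * p * (1 - p)
    dW2 = h.T @ dp; db2 = dp.sum(axis=0)
    dh = dp @ W2.T * h * (1 - h)
    dW1 = X.T @ dh; db1 = dh.sum(axis=0)

    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print("loss:", losses[0], "->", losses[-1])
```

The point of the sketch is only that nothing here is hand-engineered for XOR: the same forward/backward loop, scaled up enormously in data and compute, is what became practical about a decade ago.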
But what's changed is this: 30 or 40 years ago we got amazing results on toy-ish problems, but we didn't have the computational resources to make these approaches work on real, large-scale problems. Starting about eight or nine years ago, though, we started to have enough computation to really make these approaches work well.

So what are these things? Think of a neural net as something that can learn really complicated functions that map from input to output. Now that sounds kind of abstract -- you think of functions as something like y equals x squared -- but these functions can be very complicated and can learn from very raw forms of data. You can take the pixels of an image and train a neural net to predict what is in the image as a categorical label, like "that's a leopard." (That's one of my vacation photos.) From audio waveforms, you can learn to predict a transcript of what is being said: "How cold is it outside?" You can learn to take input in one language -- "Hello, how are you?" -- and predict the output being that sentence translated into another language. [SPEAKING FRENCH] You can even do more complicated things, like take the pixels of an image and create a caption that describes the image -- not just a category, but a simple sentence: "A cheetah lying on top of a car," which is kind of unusual anyway. Your prior for that should be pretty low.

And in the field of computer vision, we've made great strides thanks to neural nets. In 2011, the winning entry in the Stanford ImageNet contest, which is held every year, did not use neural nets. That was the last year the winning entry did not use neural nets. It got 26% error, and that won the contest. And this is not a trivial task: humans themselves have about 5% error, because you have to distinguish among 1,000 different categories of things -- given a picture of a dog, for example, you have to say which of 40 breeds it is. So it's not a completely trivial thing.
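The "pixels in, categorical label out" framing above can be made concrete with a deliberately tiny stand-in: a linear softmax classifier over synthetic 4x4 "images." This is an illustrative assumption on my part -- the real leopard example would use a deep convolutional network on real photos -- but the interface is the same: a learned function from raw pixels to a label.

```python
import numpy as np

# Hypothetical sketch of "pixels -> categorical label". Tiny synthetic
# bright/dark "images" and a linear softmax classifier stand in for a
# real photo and a deep convolutional network.
rng = np.random.default_rng(1)
LABELS = ["dark scene", "bright scene"]  # stand-ins for classes like "leopard"

def make_image(bright):
    """A 16-pixel image: mostly bright or mostly dark, plus noise."""
    base = 0.8 if bright else 0.2
    return np.clip(base + 0.1 * rng.normal(size=16), 0.0, 1.0)

X = np.stack([make_image(i % 2 == 1) for i in range(200)])
y = np.array([i % 2 for i in range(200)])

W = np.zeros((16, 2)); b = np.zeros(2)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

for step in range(300):
    p = softmax(X @ W + b)
    grad = (p - np.eye(2)[y]) / len(X)   # gradient of cross-entropy loss
    W -= 0.5 * (X.T @ grad)
    b -= 0.5 * grad.sum(axis=0)

def classify(image):
    """Map raw pixels to a categorical label, like "that's a leopard"."""
    return LABELS[int(np.argmax(image @ W + b))]

print(classify(make_image(bright=True)))
```

Swapping the linear map for a deep stack of layers, and the toy data for millions of labeled photos, is exactly the change that took ImageNet error from 26% to the results described next.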
And in 2016, for example, the winning entry got 3% error. So this is just a huge fundamental leap in computer vision: computers went from basically not being able to see in 2011 to now seeing pretty darn well. And that has huge ramifications for all kinds of things in the world -- not just computer science, but the application of machine learning and computing to perceiving the world around us.

OK. I'm going to frame the rest of this talk around a list: in 2008, the US National Academy of Engineering published this list of 14 grand engineering challenges for the 21st century. They got together a bunch of experts across lots of different domains, who collectively came up with these 14 things, which I think you'll agree are actually pretty challenging problems. And if we made progress on all of them, the world would be a healthier place, a safer place, with more scientific discovery. These are all important problems. Given the limited time, what I'm going to do is talk about the ones in boldface. We have projects in Google Research that are focused on all the ones listed in red, but I'm not going to talk about the other ones. So that's the tour of the rest of the talk. We'll just dive in, and off we go.

We start with restoring and improving urban infrastructure. The basic structure of our cities was designed quite some time ago, but there are some changes we're on the cusp of that are going to really dramatically change how we might want to design cities. In particular, autonomous vehicles are on the verge of commercial practicality. This is from our Waymo colleagues, part of Alphabet, who have been doing work in this space for almost a decade.
And the basic problem of an autonomous vehicle is that you have to perceive the world around you from raw sensory inputs -- things like lidar, cameras, radar, and other kinds of sensors. You want to build a model of the world and the objects around you, and understand what those objects are. Is that a pedestrian or a light pole? Is it a car that's moving? What is it? Then you also have to predict a short time from now -- where is that car going to be in one second? -- and make a set of decisions about what actions to take to accomplish your goals: get from A to B without having any trouble. And it's really thanks to deep-learning-based vision algorithms, and the fusing of all the sensor data, that we can actually build maps of the world like this, understand the environment around us, and have these things operate in the real world.

This is not some distant, far-off dream. Waymo is actually operating about 100 cars with passengers in the back seat and no safety drivers in the front seat in the Phoenix, Arizona area. So this gives a pretty strong sense that this is pretty close to reality. Now, Arizona is one of the easier self-driving car environments: it never rains, it's too hot so there aren't that many pedestrians, the streets are very wide, and the other drivers are very slow.
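The perceive / predict / decide loop described above can be sketched as a toy pipeline. Everything here is an illustrative assumption -- the class names, the constant-velocity predictor, and the brake-or-proceed rule are made up for the example and bear no relation to Waymo's actual system, which uses learned models at every stage.

```python
from dataclasses import dataclass

# Toy sketch of the driving loop: perceived objects -> short-horizon
# prediction -> a decision. All names and thresholds are illustrative.

@dataclass
class TrackedObject:
    kind: str     # e.g. "pedestrian", "car", "light pole"
    x: float      # position in meters, relative to our vehicle
    y: float
    vx: float     # estimated velocity, m/s
    vy: float

def predict_position(obj, dt=1.0):
    """Where will this object be in dt seconds? (constant-velocity model)"""
    return (obj.x + obj.vx * dt, obj.y + obj.vy * dt)

def plan(objects, dt=1.0, safety_radius=3.0):
    """Decide an action from the predicted positions of nearby objects."""
    for obj in objects:
        px, py = predict_position(obj, dt)
        if (px ** 2 + py ** 2) ** 0.5 < safety_radius:
            return "brake"
    return "proceed"

scene = [
    TrackedObject("light pole", x=10.0, y=5.0, vx=0.0, vy=0.0),
    TrackedObject("car", x=8.0, y=0.0, vx=-6.0, vy=0.0),  # closing on us
]
print(plan(scene))  # the car is predicted ~2 m away in 1 s, inside the radius
```

Note how the stationary light pole and the moving car demand different treatment: classifying *what* each object is (the perception step) is what makes the prediction step meaningful.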