  • So welcome everyone to CS231n.

  • I'm super excited to offer this class again

  • for the third time.

  • It seems that every time we offer this class

  • it's growing exponentially unlike most things in the world.

  • This is the third time we're teaching this class.

  • The first time we had 150 students.

  • Last year, we had 350 students, so it doubled.

  • This year we've doubled again to about 730 students

  • when I checked this morning.

  • So anyone who was not able to fit into the lecture hall

  • I apologize.

  • But, the videos will be up on the SCPD website

  • within about two hours.

  • So if you weren't able to come today,

  • then you can still check it out within a couple hours.

  • So this class CS231n is really about computer vision.

  • And, what is computer vision?

  • Computer vision is really the study of visual data.

  • Since there's so many people enrolled in this class,

  • I think I probably don't need to convince you

  • that this is an important problem,

  • but I'm still going to try to do that anyway.

  • The amount of visual data in our world

  • has really exploded to a ridiculous degree

  • in the last couple of years.

  • And, this is largely a result of the large number

  • of sensors in the world.

  • Probably most of us in this room

  • are carrying around smartphones,

  • and each smartphone has one, two,

  • or maybe even three cameras on it.

  • So I think on average there's even more cameras

  • in the world than there are people.

  • And, as a result of all of these sensors,

  • there's just a crazy large, massive amount

  • of visual data being produced out there in the world

  • each day.

  • So one statistic that I really like to kind of put

  • this in perspective is a 2015 study

  • from Cisco that estimated that by 2017,

  • which is where we are now, roughly 80%

  • of all traffic on the internet would be video.

  • This is not even counting all the images

  • and other types of visual data on the web.

  • But, just from a pure number of bits perspective,

  • the majority of bits flying around the internet

  • are actually visual data.

  • So it's really critical that we develop algorithms

  • that can utilize and understand this data.

  • However, there's a problem with visual data,

  • and that's that it's really hard to understand.

  • Sometimes we call visual data the dark matter

  • of the internet in analogy with dark matter in physics.

  • So for those of you who have heard of this in physics

  • before, dark matter accounts for some astonishingly large

  • fraction of the mass in the universe,

  • and we know about it due to the existence

  • of gravitational pulls on various celestial bodies

  • and what not, but we can't directly observe it.

  • And, visual data on the internet is much the same

  • where it comprises the majority of bits

  • flying around the internet, but it's very difficult

  • for algorithms to actually go in and understand

  • and see what exactly is comprising all the visual data

  • on the web.

  • Another statistic that I like is that of YouTube.

  • So roughly every second of clock time

  • that happens in the world, there's something like five hours

  • of video being uploaded to YouTube.

  • So if we just sit here and count,

  • one, two, three, now there's 15 more hours

  • of video on YouTube.

  • Google has a lot of employees, but there's no way

  • that they could ever have an employee sit down

  • and watch and understand and annotate every video.

  • So if they want to catalog and serve you

  • relevant videos and maybe monetize by putting ads

  • on those videos, it's really crucial that we develop

  • technologies that can dive in and automatically understand

  • the content of visual data.

  • So this field of computer vision is

  • truly an interdisciplinary field, and it touches

  • on many different areas of science

  • and engineering and technology.

  • So obviously, computer vision's the center of the universe,

  • but sort of as a constellation of fields

  • around computer vision, we touch on areas like physics

  • because we need to understand optics and image formation

  • and how images are actually physically formed.

  • We need to understand biology and psychology

  • to understand how animal brains physically see

  • and process visual information.

  • We of course draw a lot on computer science,

  • mathematics, and engineering as we actually strive

  • to build computer systems that implement

  • our computer vision algorithms.

  • So a little bit more about where I'm coming from

  • and about where the teaching staff of this course

  • is coming from.

  • Me and my co-instructor Serena are both PhD students

  • in the Stanford Vision Lab which is headed

  • by Professor Fei-Fei Li, and our lab really focuses

  • on machine learning and the computer science side

  • of things.

  • I work a little bit more on language and vision.

  • I've done some projects in that.

  • And, other folks in our group have worked

  • a little bit on the neuroscience and cognitive science

  • side of things.

  • So as a bit of introduction, you might be curious

  • about how this course relates to other courses at Stanford.

  • So we kind of assume a basic introductory understanding

  • of computer vision.

  • So if you're kind of an undergrad,

  • and you've never seen computer vision before,

  • maybe you should've taken CS131 which was offered

  • earlier this year by Fei-Fei and Juan Carlos Niebles.

  • There was a course taught last quarter

  • by Professor Chris Manning and Richard Socher

  • about the intersection of deep learning

  • and natural language processing.

  • And, I imagine a number of you may have taken that course

  • last quarter.

  • There'll be some overlap between this course and that,

  • but we're really focusing on the computer vision

  • side of things, and really focusing all of our motivation

  • in computer vision.

  • Also concurrently taught this quarter

  • is CS231a taught by Professor Silvio Savarese.

  • And CS231a is a more all-encompassing

  • computer vision course.

  • It's focusing on things like 3D reconstruction,

  • on matching and robotic vision,

  • and it's a bit more all-encompassing

  • with regards to vision than our course.

  • And, this course, CS231n, really focuses

  • on a particular class of algorithms revolving

  • around neural networks and especially convolutional

  • neural networks and their applications

  • to various visual recognition tasks.

  • Of course, there's also a number

  • of seminar courses that are taught,

  • and you'll have to check the syllabus

  • and course schedule for more details on those

  • 'cause they vary a bit each year.

  • So this lecture is normally given

  • by Professor Fei-Fei Li.

  • Unfortunately, she wasn't able to be here today,

  • so instead for the majority of the lecture

  • we're going to tag team a little bit.

  • She actually recorded a bit of pre-recorded audio

  • describing to you the history of computer vision

  • because this class is a computer vision course,

  • and it's very critical and important that you understand

  • the history and the context of all the existing work

  • that led us to these developments

  • of convolutional neural networks as we know them today.

  • I'll let virtual Fei-Fei take over

  • [laughing]

  • and give you a brief introduction to the history

  • of computer vision.

  • Okay, let's start with today's agenda. We have two topics to cover: one is a

  • brief history of computer vision, and the other is the overview of our course,

  • CS231n. So we'll start with a very brief history of where vision comes

  • from, when computer vision started, and where we are today. The

  • history of vision goes back many, many years, in fact about 543 million

  • years ago. What was life like during that time? Well, the earth was mostly water.

  • There were a few species of animals floating around in the ocean, and life

  • was very chill. Animals didn't move around much. They didn't have eyes or

  • anything. When food swam by, they grabbed it; if the food didn't swim