字幕列表 影片播放 列印英文字幕 [MUSIC PLAYING] DAVID MALAN: So odds are you know someone who is or perhaps are yourself a computer person. But what does that mean? Well I propose that a computer person is simply someone who's really good at computational thinking. Thinking methodically, or more formally, thinking algorithmically, whereby you take inputs to some problem, you solve that problem and produce output, but you do so step by step by step, along the way defining any of the terms or ideas that you need to use so that anyone else following along can understand exactly what it is you're doing. Computer science itself is really about computational thinking, then. It's not about programming, with which it's often conflated, although programming is a wonderfully useful tool with which you can solve problems, but computer science itself is about solving problems themselves. And it can be in any number of domains, whether it's software or hardware or artificial intelligence or machine learning or data science or beyond. That's the wonderful thing about computer science-- it's just so wonderfully applicable to other fields. It's all about equipping you with a mental model, so to speak, a set of concepts, as well as some practical skills via which you can bring to bear solutions to problems in other domains entirely. Now what, though, does it mean to solve a problem? I propose that we think about problem-solving quite simply as this. If you've got some problem to be solved, you have an input. And the goal is to solve that problem and come up with some solution, which we'll call simply our output. And in between those inputs and outputs hopefully are the step-by-step instructions via which we can solve that problem. Now how we're going to solve that problem we'll have to come back to. For now we'll consider it to be the proverbial black box. So we don't quite know or need to know right now just how it works, but that it can work, and we'll come back to this soon. But for now, we do need a way to represent these inputs, and we do need a way to represent our outputs. Right now I happen to be speaking English and you happen to be hearing English, and so we've sort of tacitly agreed that we'll represent our words using this particular language. Now computers tend not to use English, they have their own system, if you will, with what you might be familiar even if you don't quite speak it yourself, and indeed, there are different ways you can represent information. For instance, let's consider the simplest of problems-- for instance, counting the number of people in a room. I could do this old-school. I don't need computers or English, I can simply use my physical hand and say, I'm starting with zero people and now I'll count one, and then two, and then three, and then four, and then five, and then-- I'm out of fingers, but I can at least employ my other hand, maybe a couple of feet and get as high as 10, maybe even 20 using this physical system. This is pretty equivalent, actually, to just keeping score counting the old-school way with hash marks. Right now we've not counted anything, but as soon as I start drawing some lines, I might represent the number 1, or 2 people in the room, or 3, or 4. Or by convention I can draw a slash just to make super clear that this is now representing 5, and then I can do another five of those to count up the 10 and beyond. Now these systems of hashes, as well as this system of counting on my fingers, actually has a name. That of unary notation. Uno implying one, simply signifying that I only have one digit-- pun not intended-- via which I can count things. This takes the form of my fingers on my hand or these hash marks on the screen. But it doesn't get me very far, because of course on my one hand with five fingers, I can only count up to five total. And even on the board it feels pretty inefficient, because for every number I want to count higher, I have to draw yet another line on the screen, which just continue to accumulate. But suppose we were a little more clever about this and we thought about this problem from a different angle. Maybe I could take into account not just the number of fingers that I've raised, but the order in which I've raised them or the pattern via which I've permuted them rather than just taking into account how many of them are up. If I take that approach, maybe I can actually count even higher than five on just one hand before resorting to another hand or feet. Now how could I do that? Well instead of counting up 0 to 1 to 2 to 3 to 4 to 5, why do I instead start with some patterns instead? So I'll still start with 0, I'll still start with 1, but now for 2, I'm not just going to raise the second finger, I'm instead going to go from 0 to 1 to 2, putting down that first finger or thumb, simply representing 2 with this one finger. And when I'm ready to represent the number 3, I'll bring that finger back up. And so using just two fingers now have I represented four possible values-- 0, 1, 2, and 3. From 0 to 3, of course, is four total values, and so I've counted as high as 3 now using not three fingers, but only two. How, though, can I count higher than 3? Well, I just need to raise an additional finger. And so let's start at 0, to 1, to 2, to 3, to 4-- whoops-- to 5, to 6, and now to 7. And if it didn't hurt so much, I could continue counting as high as 8 and 9 and 10 and beyond by simply permuting my fingers in different ways. So how high can I now count on this one hand? Using unary notation with fingers just up or down, I can count as high as 5-- from 0 to 5. But with these same five fingers, if I instead use not unary, but let's call it binary notation where we're actually taking into account whether the fingers are up or down and the pattern thereof, well now it would seem that each of these fingers can be in one of two possible states or configurations. They can either be up or they can be down, or just one of them can be up or can be down, or just one other can be up or down. So if there's two possible states or configurations for each of these five fingers, that's like 2 times 2 times 2 times 2 times 2, or 2 to the power of 5 for five fingers, which is actually 32 possible patterns. Which is to say with my one human hand, now can I count using not unary, but binary from 0 to 31, which is 32 possible values. Now what's the goal at hand? It's quite simply to represent information. Because if we have inputs and outputs, we need to decide a priori how we're going to represent those values. Now it turns out in computers, binary is wonderfully well-suited. Because after all, what's the one physical input to any computer you have, whether it's your laptop or desktop or these days, even a phone in your pocket? Well odds are, at the end of the day-- or even perhaps earlier in the day-- you need to physically plug that device into the wall and actually recharge it. And somehow or other there's electricity or electrons flowing from the wall that are stored in the battery in your device if it has a battery, and that is the input that drives your entire machine's operation. And so if that's the only possible input, it's pretty straightforward to imagine that you might have your computer plugged in or not, power is flowing or not, you've got a charge in your battery or you don't. And so this binary world where something can be in two possible states-- on or off, plugged in or not-- is wonderfully conducive to now using binary in the context of computers to represent information. After all, if I want to represent a number, I might simply turn on the lights. And so my phone here has a light built in, and right now that light is off, but if I consider this then to be off, we'll call it a 0. And if I turn this flashlight now on, I can be said to be representing a 1. And so simply with a simple light bulb can I represent two possible values just like my finger can be up and down. But computers don't necessarily use lots of little light bulbs, they use something else that's called a transistor. A transistor is a tiny little switch, and that switch, of course, can be either on or off-- letting electricity in or not. And so when you have a computer, inside of which is a motherboard and CPU and bunches of other pieces of hardware, one of the underlying components ultimately are these things called transistors. And computers today have millions of them inside, each of which can be on or off-- so far more than my five fingers, and that's ultimately how a computer can go about representing information. So long as it has access to some physical supply of electricity can it use that electricity to turn these switches on or off in distinct patterns, thereby representing 0, 1, 2, 3, 4, all the way up to 31 or even millions or billions or beyond. And so computers use binary. Computers speak binary-- only 0's and 1's or off and on, because it's so wonderfully well-conducive to the physical world on which they're ultimately based. We humans, meanwhile, tend to communicate not only in English and other spoken languages, but in a numeric system called decimal. So decimal, dec meaning 10, has 10 digits at its disposal-- 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, whereas binary has just two-- 0 and 1-- and unary, you might say, has just one-- 1, the hash mark that I drew on the board. But if I only have 0's and 1's at my disposal, how can I possibly count to 2 or 3 or beyond? Well, we need some way of mapping these 0's and 1's to the more familiar numbers that we ourselves know. So how can we go about doing this? It's one thing to describe it in terms of patterns on your fingers, but again, a device just has switches that it can turn on and off. Well it turns out even if you don't yet speak binary or have never really thought about doing so, turns out that it's not all that dissimilar to the system of numbers with which we grew up. In fact, in grade school, you and I probably grew up learning how to represent numbers in rather the same way. For instance, consider this-- clearly the number 123. But why is that? It's not inherently the number 123. Really, this is just three symbols on the screen. We of course immediately recognize them as 1 and 2 and 3, and we infer from that, oh, this is the number 123, but why is that exactly? Well if you're like me, you probably grew up representing numbers in the following way, even if it's been quite some time since you thought about it like this. With the symbol here, 1, 2, and 3, if you're like me, you probably grew up thinking of this rightmost digit as being in the ones place, and this middle digit is being in the tens place, and this leftmost digit as being in the one hundredths place. And if it were a longer number, you'd have the one thousandths place and ten thousandths place and beyond. Now how do we get from 1, 2, 3 to 123? Well, the arithmetic we were taught says to do 100 times 1, and then 10 times 2, and then 1 times 3, and then add all three of those products together, giving us, of course, 100 plus 20 plus 3, which of course is 123. Now that actually doesn't feel like much progress, because we started at 123 and we got to 123, but that's not quite that. We actually started with the pattern of symbols-- 1, 2, 3, like a pattern of fingers, and ultimately ascribe mathematical meaning to those symbols as 123. And it turns out computers, even though they speak binary and therefore only have 0's and 1's at their disposal-- no 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9, they count in exactly the same way, and they represent inputs and outputs in fundamentally the same way. Consider this. Suppose that I have 3 bits at my disposal, 3 switches or light bulbs, if you will. And those light bulbs or switches are all initially off. I claim that I'll represent those states as 0, 0, 0. And together, those three 0's represent the decimal number we ourselves know is 0. Now why is that? Well, if here if we consider this to be the ones place as before, but the middle digit we won't consider being in the tens place, we're going to consider this as being in the twos place, and this leftmost digit as being in the fours place. Now why? Well in our decimal system, there's a reason that we have the ones place, the tens place, the hundredths place, and beyond-- those are actually all powers of 10. 10 to the 0, 10 to the 1, 10 to the 2, and beyond. Now in binary, bi meaning two, you only have two digits at your disposal, 0 and 1, and so instead of using powers of 10, we're going to use powers of 2. And indeed we have. 2 to the 0 is 1; 2 to the 1 is 2, 2 to the 2 is 4, and if we kept going, we'd get 8, 16, 32, 64, and beyond. And so this pattern of symbols or this pattern of light bulbs or this pattern of switches, all of which I'm proposing are off, simply represents now the number we humans know as 0. Why? Well we have 4 times 0 plus 2 times 0 plus 1 times 0, which of course gives me 0 plus 0 plus 0, for a total numeric value of 0. So in binary, 0, 0, 0 or all three switches off represents 0, and indeed, that's what we would expect. But we've instead we did this. Suppose that using binary, and therefore these same three places-- ones place, twos place, and fours place and beyond, we had a different pattern of symbols. How, for instance, might we represent 1? Well, if we had a pattern that's 0, 0, and 1 in binary, that's going to translate, of course, to the decimal number we all know and agree is the number 1. How do I represent the number 2? Well, if I start from scratch again, I'm going to represent the decimal number we know as 2 as a pattern that's 0, 1, and 0. A switch that's off, another that's on, another that's off. Why? 4 times 0 is 0, 2 times 1 is 2, 1 times 0 is 0, and so we're left with total of 2. And if we want to represent 3, I don't even need to change the value or state of those switches, I can simply turn this one from off to on, and now I'm representing the number 3. If I want to count up to 4, I can simply start fresh in the fours place, in the twos place, in the ones place here-- 1, 0, and 0. And then lastly, suppose that I want to count up even higher. 7 can simply be 1, 1, and 1, because that gives me a 4, a 2, and a 1. 4 plus 2 is 6, plus 1 is 7. But what if I want to represent the number 8? To represent the number 8, it would seem that I need a little more space. And so I might need to introduce the eights place so that I can have a 1, a 0, a 0, and a 0, eighth place being 2 to the 3, but to do this, I do indeed need more hardware. I need another switch, another light bulb, and another physical device inside of my computer. Now fortunately, our computers today have millions of these tiny transistors and switches, so it's not all that problematic to count as high as 8, but the implication is, that in order to represent larger and larger values, do we need more physical storage? And indeed, thematic and computer science is exactly that constraint. You can only do so much computationally, perhaps, if you have a finite amount of memory or a finite amount of hardware or switches. And we'll see, then, that there is non-trivial implications for how much data you can represent, and how many problems, therefore, you can solve. So that's it for numbers. Using just 0's and 1's can we count as high as we want and represent any number of values so long as we have enough bits at our disposal. But numbers only get us so far in computing. After all, it's not just Excel and calculators with which we use computers. It's of course to send information textually, and to use letters of the alphabet, and compose text messages and emails and documents and more, but how, then, do you represent letters if all you have at the end of the day are these 0's and 1's underneath the hood, so to speak? Well to do that, we all simply need to agree, just like we've agreed to speak in here English for now, on a particular system by which to represent letters of the alphabet. Now we don't have all that many choices at hand because if we only have 0's and 1's-- switches that can be turned on and off-- that doesn't give us terribly many options. But what we could do as humans is just collectively agree that you know what? We are going to use a certain pattern of 0's and 1's to represent capital A. And we're going to use a different pattern of 0's and 1's to represent capital B. A different pattern of 0's and 1's to represent C and D and all the way through Z. And you know what? If we care about lowercase letters, we'll use slightly different patterns of 0's and 1's to represent those as well. And this is exactly what humans did decades ago. We standardized on something called ASCII-- the American Standard Code for Information Interchange, which was the result of people agreeing that we are going to use-- you know what? The number 65 in decimal to represent capital A, and the number 66 to represent capital B, and the number 97 to represent lowercase a, and 98 to represent lowercase b, and everything before and beyond. And so this system, this code, ASCII, is simply a collective agreement that whenever you're in the context of a text-based program and not a numeric-based program, any patterns of bits, such as those that might represent 65 in a calculator, should instead be interpreted in the context of Microsoft Word or an SMS message or iMessage as actually representing a letter. So how bits are interpreted is entirely context-dependent. Depending on the program you have opened, the same pattern of bits might be interpreted in different ways-- as numbers or letters or even something else. So how, then, do we represent so many other symbols that aren't just A through Z? Well it turns out that the American Standard Code for Information Interchange was indeed fairly skewed toward American representation of letters, particularly English. But there are so many more characters with accents and other symbology, let alone punctuation marks, and in foreign languages, too, are there hundreds if not thousands of distinct characters that you need to represent ideally in order to send that text and write that document. But ASCII alone couldn't handle it. Because ASCII tended to use 7 or maybe in some contexts 8 bits total, and with 8 bits-- or binary digits, if you will-- can you represent how many possible values-- 2 times 2 times 2 times 2 times 2 times 2 times 2 times 2 for 8 possible bits, each of which can be a 0 or 1. That only gives you 256 possible letters. Now that's more than enough for 26 letters of the English alphabet, A through Z, both uppercase and lowercase, but once you start adding in punctuation and accents and other characters, you very quickly run out. And so the world produced Unicode instead, which doesn't just use 8 bits or 1 byte, if you will-- 1 byte simply meaning quite simply a bit, but instead introduced a variable length encoding of letters. And so some letters might use 8 bits or 1 byte. Other letters might use 2 bytes or 16 bits. Other letters might use 3 bytes or even 4. And via 4 possible bytes maximally-- 2 to the 32 bits-- can you actually represent 4 billion possible values, which is a huge number of values. But how does the system of encoding actually work? Let's consider by way of an example. In both ASCII and Unicode, ASCII now being a subset of Unicode insofar as it only uses 1 byte per letter, you might represent A with-- capital A with 65, capital B with 66, and so forth. And dot-dot-dot implies we've handled the rest as well and they all happen to be back-to-back-to-back. So suppose in the context of a text message you happen to receive a pattern of 0's and 1's that if you actually did the math in the ones place and the twos place and so forth, actually worked out to be the decimal number 72, 73, 33. Well what text message have you just received? Of course at the end of the day, computers only understand binary, and that information is sent from phone to phone over the internet-- more on that another time. But those 0's and 1's interpreted in the context of a text messaging program will be interpreted not as binary, not as decimal, but probably as ASCII or more generally Unicode. And so if you've received a text message that says, 72, 73, 33, well what might that spell? Well if we consider our excerpt of a chart here, that's 72 and 73, of course, translates to H and an I, and you wouldn't know it from that preceding chart, but 33 it turns out-- just represents a very excited exclamation point, because that, too, has a representation in ASCII and Unicode. Now underneath the hood are patterns of 0's and 1's, but we don't need to think about things at that level. Indeed, what we've done here by introducing ASCII, and in turn, Unicode, is introduce what's called an abstraction, which is a technique in computer science via which we can think about a problem more usefully at a higher level as opposed to the lowest level in which it's actually implemented. After all, it'd be much easier to receive a message that you view as H-I-! than actually as a pattern of 0's and 1's that you then have to think about and translate in your head to the actual message that was intended. So an abstraction is just a way of taking a fairly low level if not very complicated representation and thinking about it in a higher, more useful level that lends itself more readily to communication, or more generally to problem-solving. Now what about all of those other characters on the screen? After all, on a typical board might you have letters and numbers and punctuation, but also these accented characters, too. Well thanks to Unicode, you indeed have as many as 4 billion possibilities, and it's no surprise, perhaps, that proliferating on the internet these days are a little friendly faces called emojis that you might have received yourself in text messages, emails, or some other form. Now these emojis here, though they might look like smiley faces, are actually characters that companies like Google and Microsoft and Apple have simply decided to represent pictorially. Underneath the hood, anytime you receive one of these emojis, you're actually receiving a pattern of bits-- 0's and 1's that in the context of an email or text message or some other program, are being interpreted as and therefore displayed as a corresponding picture that Google and Microsoft and Apple and others have decided how to present visually. And indeed, different platforms present these same patterns of 0's and 1's differently depending on how the artist at those companies have decided to draw these emoji. Consider this one, for instance-- face with tears of joy is its formal name, but more technically, this is encoded by a very specific decimal number. Capital A is a 65, capital B is 66, what is this face with tears of joy? Well it is the number 128,514. After all, if you've got four billion possibilities, surely we could use something within that range to represent this face and any number of others. But underneath the hood, if you actually looked at a text message in which you received a face with tears of joy, what you've actually received is this pattern of 0's and 1's somehow encoded on your phone or your computer. And so in the context of a calculator or Microsoft Excel might we interpret some pattern of bits as numbers. And in the context of a text messaging program or Google Docs or Microsoft Word might we interpret that same pattern of bits as instead characters or even emoji. But what if we want to represent different types of media altogether? After all, the world of computing is not composed of just numbers and letters. There's also colors and images and videos and audio files and more, but at the end of the day, if we only have at our disposal 0's and 1's, how do we go about representing all of those richer media? Well as before, we humans just need to agree collectively to represent those media using 0's and 1's in some standard way. Perhaps one of the most familiar acronyms, even if you've never really thought about what it is that you might see on a computer, is this one-- RGB. It stands for red, green, and blue. And this refers to the process by which many computers represent colors on the screen. In fact, what is a color? Well, you might represent the color black and white simply by a single bit. 1 might represent black and 0 might represent white, or vice versa. We in that case only have what's called 1-bit color, whereby the bit is either on or off implying black or white or white or black. But with RGB, you actually use more bits than just one. Conventionally you would use 8 bits to represent red, 8 bits to represent green, and 8 bits to represent blue. So 24 bits total, the first 8 of which or first byte is red, second is green, third is blue. Now what does that mean? Well in order to represent a color, you simply decide, how much red do you want to add to the mix? How much green and how much blue? Combining essentially those wavelengths of light in order to see a disparate color on the screen. And if you think about red, green, and blue now as being single bytes each, that means you have a possible range of values of 0 to 255. After all, if a single byte is 8 bits, and with 8 bits, each of which can be a 0 or 1-- so 2 times 2 times 2 times 2 times 2 times 2 times 2 times 2 gives you 256 values. So if you start from 0, you can count as high as 255. Suppose that you had 72 as your amount of red for a color, and 73 for green, and 33 for blue. That is to say in the context of a text messaging program, this same pattern of bits-- 72, 73, 33 presented as decimal represented a textural message of HI!. But suppose you were to see that same pattern of bits instead in a photo-editing program like Photoshop, or in the context of opening an image on the screen. Well that same pattern of bits, or in turn, decimal numbers, just represents some amount of red, some amount of green, and some amount of blue. So 72 is a decent amount-- a medium amount of red out of 255 total values, 73 is a medium amount of green, and 33 is a little bit of blue. If you combine now all of these three values in your mind's eye by overlapping them, what you get is this shade of yellow. Which is to say that if you wanted to represent this particular shade of yellow would you use three bytes whose values are respectively 72, 73, and 33. And in the context of Photoshop or some other graphical program would that pattern be interpreted as and displayed as this color on the screen. Now this color is deliberately presented here as just 1 square-- 1 dot, if you will, or more properly, 1 pixel. Because what is an image? Well if you've ever looked really close on your computer screen or on your phone or even on your TV, you might actually see all of the thousands of dots that compose it. This is getting harder to do on today's modern hardware, because if you have something like a retina display, that means these dots or pixels or ever so tiny and ever so close together that it's actually hard now for the human eye to see them, but they are in fact there. Any image on the screen, any photo on the screen is really just a pattern of dots or pixels-- a grid of dots from left to right, top to bottom. And every one of those dots or pixels on the screen probably has 24 bits representing its particular color. Indeed, a picture is essentially just a pattern of color so small that you don't really see all of those dots, but you see the net effect of a beautiful photo or image or something else altogether. So consider how we might see these. When Apple or Google or Microsoft or any other company that supports emojis presents those emojis on the screen, we of course see them as images because Apple and Microsoft and Google have decided what images shall be displayed when it receives a certain pattern of bits. But how are they storing those images and how is your Mac or PC or phone displaying that image to you? Well it's hard to see where those bits are let alone the pixels in it at this particular size, but if I go ahead and zoom in and zoom in and zoom in a little more still, you can begin to see the individual dots or squares or pixels that compose even this one emoji. And so just to display this smiling face, this face with tears of joy, do you need 24 bits for this pixel, 24 bits for this pixel, 24 bits for this pixel, 24 bits for this pixel, and so on and so forth? So if you've ever noticed that when you download an image or download a photo or receive one in the mail, odds are it's not in the order of bytes. It's probably kilobytes for thousands of bytes, or megabytes for millions of bytes. How in the world does a simple photo have so many bytes or in turn bits? It's because every one of the dots in that image takes up some amount of space. So to represent yellow might we use a certain pattern of 0's and 1's or 3 bytes. To represent black or gray or any other number-- colors on the screen might we represent those using different patterns still. Now that's it for images, but what about videos? A video is really just a sequence of images being presented to you so quickly that you and your human eye don't notice that you're really just watching image after image after image. In fact, it's conventional in Hollywood and elsewhere to display as many as 24 images or frames per second, or perhaps even 30 frames or images per seconds. And so even though they are in fact distinct images, you are seeing them so quickly, and the characters on the screen are moving within each frame or image ever so slightly, that it creates ultimately the illusion of movement. And if you think back to childhood, you might have had one of those old-school flip books on which was drawn some kind of cartoon one frame or image at a time. And if you flipped through that flip book one page ever so quickly again and again and again and again, letting the physics of it all take its toll, you might see a character or the picture in the book actually appearing to move, but it's not. All of those pages are just images and all of those images are not moving, but when you see them so fast and so quickly in succession does it appear to create that same movement. So these things here, Animojis in Apple's world, are just videos ultimately that track your facial movements and such, but all they really are are a pattern of bits, each set of which represents an image. And each of those images is displayed to you on your phone so quickly that you see the illusion of movement. And so here again is an abstraction. What is a video? Well, a video is a collection of images. Well what's an image? An image is a collection of pixels. Well what's a pixel? A pixel is simply some number of bits representing some amount of red and green and blue. Well what is a bit? It's simply a representation digitally of something being present, like electricity or not. And so it's much more powerful now to be operating at this level of abstraction-- talking about videos for what they are, and not getting into the weeds of the lowest level that at the end of the day, it's really just some form of electricity that we call bits, that we in turn call pixels, that we in turn called images, that we in turn call videos. And we can operate in a number of these layers of abstraction in order to represent inputs to some problem. To create a film, to convert a film, to display a film might be just one of the problems to solve. All right, so we now have a way to represent information, whether it's binary, decimal, or some other approach altogether. So it's time to solve problems. But how do we go about solving problems? What is inside of this black box? Well that's where our so-called algorithms are-- step-by-step instructions for solving some problem. And to solve these problems, we simply need to decide first what problem to solve. Well let's consider a familiar one, like that of looking up someone's name and number in a phone book. Now these days, the phone book, of course, takes a more digital form, but at the end of the day, these two formats are pretty equivalent. After all, here in my hand is a phone book with maybe 1,000 pages, on which are bunches of names and numbers sorted alphabetically from A to Z. On my phone, meanwhile, while I might have to click an icon instead, do I have contacts that are similarly sorted from A to Z by first name or last name, and touching any one of them pulls up its number. So this one, though, is a little more physical for us to see the solution to a problem, like finding, say, Mike Smith. Mike Smith may or may not be in this phone book, but if he is, my goal quite simply is to find him. So let me try this. Let me try opening this book, and then one step at a time looking down for Mike Smith. And if I don't see him, move on to the next page. If I still don't see him, move on to the next page, and next page, and next page, and so forth, one page at a time. I propose that we consider first-- is this algorithm correct? Opening the phone book, looking down, turning page, and repeating. Well, at some point, if Mike is in this phone book, I am going to reach him, at which point I can call him. And if Mike Smith is not in this phone book, I'm eventually not going to reach him, but I am going to hit the end of the phone book, at which point I'll presumably stop. But it doesn't feel terribly efficient, however correct it may be. Why is it not efficient? Well I'm starting at the beginning of the phone book. I know it's alphabetical, and yet I'm turning one page at a time through the A's, through the B's, through the C's and beyond, just waiting and waiting until I finally reach Mike Smith. Well how might I do this better? Well, I learned in grade school not only how to count by ones, but also an account by twos. So instead of 1 page, 2 page, 3 page, 4, why don't instead do 2, 4, 6, 8, 10, 12-- flying through the phone book. It certainly sounds faster, and it is, but is that approach correct? Well, I propose that there could be a problem. I might just get unlucky, and once I do reach the S section of the phone book, what if by bad luck Mike Smith is sandwiched between two of those pages? And therefore I hit the T section and Z section and run out of pages? I might conclude incorrectly that Mike Smith is not, in fact, in this phone book. But do I have to throw out the algorithm altogether? Probably not, right? I could be a little clever here and improve this algorithm, maintaining its efficiency, but fixing its correctness. After all, if I do see a page on which there are last names starting with T, while I can stop and say, well we know Mike is not farther than this because Smith starts with S, so I can double-back hopefully just one page or few, and therefore fix what's otherwise an incorrect approach in this algorithm. In fact, I can be even smarter than that. As soon as I see someone's name that starts with S-N instead of S-M, I can similarly flip back one page and therefore ensure that I can go twice as fast through most of the phone book, and only once I hit that section do I have to double-back one or terribly few pages. So I get the overall performance gains, and I also solve the problem. But honestly, if you even use this technology anymore, odds are you don't start at the beginning turning one or two pages at a time. If you're like me, you probably more instinctively open up roughly to say the middle. You'll look down and you realize, oh, I haven't found Mike yet because I'm actually here in the M section. But what do you know about where you've just jumped? Well Mike Smith is indeed to the right here, because he's in the name starting with S. But if I'm in the M section, I do know something else. I know that Mike is not in the left-hand section of this book-- he is not among the A through M's. And so I can both literally and figuratively tear this problem in half, throw half of it away, thereby leaving myself with just, say, 500 pages out of 1,000. So whereas that first algorithm I took one byte at a time out of it-- one page at a time, the second algorithm I took two bytes at a time out of it-- two at a time. Well with this algorithm, I just took 500 bytes out of the problem all at once, thereby getting closer to the output to this problem much more quickly. What do I then do? Well, I simply am probably going to apply that same algorithm again and again and again-- dividing and conquering this phone book, if you will. Jumping roughly to the middle now, looking down-- dah, darn it! I ended up in the T section now, so I've gone too far, but I know Mike is not in the right-hand side of this book-- he's not among the T's through Z's. So I can again tear the problem in half, throw that half away, thereby leaving me with just a quarter of the original problem, say 250 pages down, down from 1,000, finding Mike ever so much more quickly than before. And if I repeat and I repeat and I repeat, each time actually looking down looking for Mike, I should hopefully find myself ultimately left with just a single page on which Mike either is or is not, at which point I can call him or quit. So just how much more quickly did I find Mike Smith via this algorithm than the first two? Well in a 1,000-page phone book, I might get unlucky and Mike's not even there, and so in the worst case, I might look in as many as 1,000 pages. In the second algorithm, I might also get unlucky and not ever find Mike because he's just not there, and therefore look at as many 500 pages. But in this third algorithm where I'm iteratively dividing and conquering again and again, splitting the problem in half so I go from a very large input to a much smaller to an even smaller problem still again and again, how many times might I split that phone in half? Well if I start off with roughly 1,000 pages and I split it in half, that gives me 500, 250, 125, and so forth. Turns out that I can split 1,000 pages roughly 10 times in total before I'm left with just that final page. And so consider the implications of that. First algorithm-- 1,000 steps, maybe. Second algorithm-- 500 steps, maybe. Third algorithm-- 10 steps, maybe, to find Mike Smith before I can call or decide he's not there. And that's a pretty powerful technique. And if we extrapolate from this relatively small example or phone book to maybe a much larger phone book, say a phone book that a bit nonsensically has 4 billion pages in it, well, how many steps might those three algorithms take? The first algorithm might take 4 billion. The second algorithm might take 2 billion. The third algorithm might take 4 billion divided by 2 divided by 2 divided by 2-- you can divide 4 billion and half roughly 32 times only, at which point you're left with just that one page. So 4 billion steps for the first algorithm, 2 billion steps for the second algorithm, but 32 steps for the third algorithm. And simply by harnessing this intuition that we all probably already have in formalizing it in more methodical form-- in algorithmic form, if you will, can we solve problems using some of the building blocks that we already have at our disposal. And indeed, problem-solving in computer science is all about implementing these algorithms and applying them to problems of interest to you. But how do we think about just how good this algorithm is? It sort of stands to reason that it has to be minimally correct. But how efficient is it? I've been talking in terms of absolute numbers. 1,000, 500, or 10, or 4 billion, 2 billion, and 32. But it would be nice to compare algorithms in a more general way so that I can use the same searching algorithm, if you will, to find not Mike Smith, but someone else. Or to search not a phone book, but Google or search engine more generally. And so computer scientists also have a more formal technique via which to describe the efficiency of algorithms. And we might consider these not even with numbers or formulas, per se, but with simple visual representations thereof. Here, for instance, is a simple plot. On the x or horizontal axis is going to be the size of the problem. How many pages are in the phone book, which we'll generally call n, where n is a number for a computer scientist. Much like a mathematician might go with x or y or z, computer scientists might tend to start counting with n. On the vertical or y-axis here, we'll say this is going to be the time to solve a problem, be it a phone book or something else. And that might be seconds or minutes or page turns or some other quantifiable unit of measure. So how do I go about describing that first algorithm where I turn one page at a time? Well I propose that it manifests a linear relationship, or n, or a line, a straight line on the chart. Why is it a straight line? Well, if Verizon or the local phone company, for instance, next year adds just one more page to that phone book, that means for me, it might take me one more unit of measure of time to find someone like Mike Smith, because we've gone from 1,000 to 1001 pages. So the slope of this line indeed can be straight or linear. That second algorithm now where I'm flying through the phone book twice as fast is really fundamentally the same algorithm or same shape. It just so happens that for a given size of the problem, it's not going to take this many seconds n, it's going to take me n over 2 seconds because I'm doing twice as much work at a time. And so this line, too, is linear or straight, but we might describe it in terms of n over 2, where n is the number of pages but I only have to look at half of them except for maybe that very last one if I double-back, and so its shape is fundamentally the same. So better, but not fundamentally better, it would seem. But that third and final algorithm has a fundamentally disparate shape, depicted here with our green curved line which has a logarithmic slope to it. So if n is the number of pages, and the algorithm in use is division and conquering, it turns out that has a logarithmic slope. Now what does that actually mean? Well it turns out that if Verizon adds one more page to the phone book next year, that is a drop in the bucket for that final algorithm where I'm just tearing the problem in half. But more powerfully, if Verizon doesn't just add one page, perhaps to neighboring towns merged together and form one massive phone book next year, going from 1,000 pages to 2,000 pages all in one big, fat book, well how many more steps does it take that third algorithm to search for anyone in it? Just one. Because with 2,000 pages, you can take 1,000-page byte out of that problem all in one fell swoop even though it's twice as big as before. And so even as the size of the problem gets bigger and bigger and bigger and even farther away, the amount of time it takes to solve a problem using that algorithm only increases ever so slightly. That's not a one-to-one or a one-to-two ratio, it's instead said to be logarithmic. And this, then, is a fundamentally different curve-- it's a fundamentally better algorithm in that the problem can grow exponentially large, and the amount of time it takes to solve it itself grows far less quickly than that. Now it's one thing to talk about all these algorithms. At some point we need to put them to paper, or more technically, program them into a computer. So how do we go about formalizing these step-by-step instructions in such a way that all of us can agree that they are correct, all of us can discuss their efficiency, and most importantly, all of us can program a device to execute these steps for us? Well let me propose that we first implement an algorithm like that of searching a phone book in what's called pseudocode. Pseudocode is not a formal programming language, it has no one definition. It generally means using short, succinct phrases, often in English, to describe your algorithm step-by-step-by-step in such a way that you yourself understand it, and anyone reading it can also understand it. Now along the way do you have to make certain assumptions, because you need to decide at what layer of abstraction to actually operate. And we'll take a look at where the opportunities there are, but let me propose that we implement this algorithm for searching for Mike Smith, for instance, in the following way. Now this is an algorithm, or really, a program, albeit written in pseudocode. And even though written in pseudocode or English, it manifests certain constructs that we're actually going to see in any number of today's popular programming languages, a programming language being an English-like language that humans have decided can be used in a certain way to tell a computer what to do. Now this pseudocode is just for us humans, but what are some of the fundamental constructs in it that we'll see in actual programming languages? Well first highlighted in yellow here are all of the verbs or actions that more technically we'll call functions. Statements that tell the computer, or in this case, human what to do. Pickup, open to, look at, call, open or open, or quit-- all of them calls to actions. In the context of a computer with these same types of verbs, we more traditionally call it functions. What else is laying inside of this program? Well these pictured here in yellow. If, else if, else if, else, well these represent what are called conditions or branches. Forks in the road, so to speak, via which you make a decision to go this way or this way or this way or that. But how do you decide down which road to go? You need what are called Boolean expressions. Named after mathematician Boole, these Boolean expressions are short phrases that either have yes or no answers, or true or false answers, or if you will, 1 or 0-- on or off answers. Smith is among names, Smith is earlier in book. Smith is later in book-- each of them could be asked effectively with a question mark to which the answer is yes or no, and based on that answer can you go down any one of those four roads. And then lastly is this-- go back to step 2 or go back to step 2. Highlighted in yellow here's an example of a loop, a deliberate cycle that gets you to do something again and again and again until some condition is no longer true and you eventually quit or call Mike instead. And so looping in a program allows you to write less code, but do something again and again and again just by referring back to some work that you've already done. Now these constructs, though we've seen them in the context of pseudocode and searching a phone book, can be found in any number of languages and programs, be it Python or C++ or Java. And so when it comes to programming a computer, or more generally solving a problem with a computer, we'll ultimately express ourselves in fundamentally the same way, with functions, conditions, and Boolean expressions, and loops, and many other constructs still, but we'll do it in a way that's methodical, we'll do it in a way wherein we've all agreed how to represent the inputs to that process and the outputs thereto, and ultimately use that representation and these layers of abstraction in order to solve our problems. But sometimes too much abstraction is not always a good thing. Sometimes it's both necessary and quite helpful to get into the weeds of the lower level implementation details, so to speak. And let's see if we can't see this together. Take out if you could a pen or pencil, as well as a sheet of paper, and in just a moment allow me to program you, if you will. I'll use a bit of verbal pseudocode to program you to draw a particular picture on that sheet of paper. The onus is on me to choose an appropriate level of abstraction so that you're on the same page, if you will, as I, so that ultimately what I program is what you implement correctly. Let's try. All right. First, go ahead if you would and draw a square. All right. And next to that square to the right, go ahead if you could and draw a circle. To the right of that circle, go ahead and draw a diagonal line from bottom left to top right. All right, and lastly, go ahead if you would and draw a triangle to the right of that line. All right. Now hopefully you're quite proud of what you just drew, and it's perfectly correct as you followed my instructions step-by-step. So surely what you drew looks like this? Well that's a square, to a circle to the right, a straight line from bottom left to top right, to the right of which is a triangle. Now with some probability, you probably didn't draw quite this. Fortunately there's no one there to see, but where might you have gone wrong or maybe where might I have gone wrong? Well I didn't exactly specify just how big that square should be, let alone where it should be on the page. And you might have quite reasonably drawn it right in the center-- maybe the top left or the top right. But if you drew in a place that I didn't intend, you might not have had enough room for that actual square unless you flipped perhaps the paper around. As for the circle, I said draw it to the right, but I didn't technically say that it should be just as wide or just as high as that square, so there might have been a disparity there. As for that straight line, I said from bottom left to top right. I knew that I meant from the bottom of the circle to the top right of what would then be the triangle, but maybe you started from here and all the way up to here. I was making an assumption, and that, perhaps, was not necessarily precise enough. And then lastly, the triangle. Pretty sure that I learned growing up that there's all sorts of different triangles. Some have right angles, some have acute angles or obtuse angles, or some triangles or even isosceles. Now I happen to intend this one and you might have drawn just that, but if you did, you probably got a bit lucky. And unfortunately when it comes to computing, you don't want to just get lucky when programming your computer, you want to actually ensure that what you write is what it does. And in fact, if you've ever seen on your Mac or PC a spinning beach ball or hourglass indicating that something is taking quite long, or the computer freezes or reboots, odds are that's because some programmer, now like myself, might not have been specific enough to the computer as to what to do in all circumstances, and so sometimes unforeseen behavior happens. The computer starts thinking forever. The computer reboots or crashes or freezes, which is simply the result of one or more lines of code-- not in pseudocode, but in Java or C++ or something else-- not having anticipated some user's input or some particular direction. So maybe it would help to operate not quite at this high of a level of abstraction. And indeed, all of these shapes are abstractions. A square, a circle, a line, and a triangle are abstractions in the sense that we've ascribed words to them that have some semantic meaning to us, but what really is a square? Well again, if we go back to grade school, a square is a rectangle, all of whose sides are of equal length and whose angles are all right angles. But very quickly, my own eyes start to glaze over when you get into the weeds of that implementation, and so all of us have agreed since to just call it a square. And same for circle and line and triangle, but even then, there are variations. So let's get a bit more into the weeds this time and let me take an approach with the second of two pictures, programming you this time at a much lower level of detail. Go ahead and take out another sheet of paper or flip over the one that you already have. Toward the top middle of that sheet of paper, roughly one inch from the top, go ahead and put down the tip of your pen or pencil. Keeping your pen or pencil applied to the paper, start to draw from the top middle of the paper down toward the left at roughly a 45 degree angle for two or so inches and then stop. Keeping the tip of your pen or pencil on the paper, go ahead and draw another line perhaps three inches straight down toward the bottom of the paper, stopping two or so inches shy of the bottom of the paper. Then, if you would, hook a right, this time moving at a 45-degree angle toward the bottom-middle of the paper, stopping ultimately one or so inches from the bottom of the paper. Then bear right yet again, this time going up at a 45-degree angle upward toward the middle of the paper, this time extending a couple of inches, and then stopping at the same height your previous line started at. Now draw a vertical line about three inches high, stopping exactly where two lines ago started. And if you're on the same page, no pun intended, go ahead and hook a left this time, going back a 45-degree angle toward the top-middle of the page, hopefully closing the loop, if you will, and connecting your very first dot to the last place you stopped. At this point, you hopefully have a shape on your page that has six total sides. What comes next? Go ahead and from the top-middle of the page follow the line you already drew down to the left at a 45-degree angle and stop where you made a corner before-- before you went straight down on the page. And instead of following that vertical line, go ahead and go down 45 degrees to the right a couple of inches, stopping once you're directly below the top-middle of the dot you initially drew. Then go ahead and do two things from the point you're now at. Go ahead and draw the opposite line up and to the right at a 45-degree angle, connecting that line to one of the corners you previously made. And then from that same initial point, draw a vertical line down to the bottom-middle of the page where you had another angle still. Lift up your pen or pencil and take pride in what you've just drawn, because it is most surely exactly this. Was it? I don't know, because that was quite painful for me to say in this program because I was operating at such a low level really with no abstractions other than line and point and angle. I was instead directing you much like a paintbrush on a page to go up and down and left and right, collectively creating what will be an abstraction that I would dare say call a cube, but a cube that I did not name by name. Had I said draw a cube, frankly we learned from last time that that could have opened up multiple possibilities. What should the angles be? What should the size be? What should the rotation there of it be? And so an abstraction alone might not have been enough if I wanted you to draw precisely this cube. But if I do want you to be ever so correct and consistent with what I had in mind on the screen here, I really did need to go down to such a low level, so to speak, that I was talking in terms of edges' length and angles therein. And even then, I might not have been as clear as I might have been-- more precise measurements and angles and such might have been ideal so that you ultimately end up precisely with the same thing. So if you veered off-track, not a problem, my fault more so than yours. But what's the takeaway here? Well at some point it would be nice not to have to write programs at such a low level again and again and again. Wouldn't it be nice if just one of us, the best artist among us could implement code-- write code-- that implements a cube, that implements a square, a circle, a line, and a triangle? But when using those drawing functions, if you will, that someone else has implemented, you should probably get into the habit of asking me additional questions. You want a square. Well how long should each edge be and where should its position be on the page or the canvas? Same goes for circle or line or triangle-- what are the angles and lengths and positioning be? And if you can parametrize your functions in that way-- in other words, write your functions in such a way that they themselves take input so that what a function does is at the end of the day produce output subject to those inputs, we have then solved a problem. Not necessarily the one problem at hand, but we've solved other people's problems. And indeed, that's what the world of programming ultimately is-- writing code to solve problems, and ideally solving those problems once and only once. And whether it's internally within your firm or your company, writing code in such a way that other people-- maybe yourself and your colleagues can then use it-- or in the case of open source software, anyone in the world can use it. So that in an ideal world, only one of us humans ever need write code in some language to draw a circle or cube or square or line or triangle, and everyone else in the world can stand on his or her shoulders, use that code to build their tool or product on top it. This, then, was computational thinking. You have inputs, you want outputs, and in between, you have algorithms. You just first need to decide how to represent those inputs and outputs, and you have to decide for yourself at what level of abstractions to work.
B1 中級 2019年律師行業CS50--計算思維篇 (CS50 for Lawyers 2019 - Computational Thinking) 1 0 林宜悉 發佈於 2021 年 01 月 14 日 更多分享 分享 收藏 回報 影片單字