Floating Point Numbers (Part 1: FP vs Fixed) - Computerphile

  • We were talking a few weeks ago about how we can add additional processors into a computer to do specialist tasks

  • One of the things we talked about was floating point processors

  • Now these days they're built directly onto your CPU

  • But you can still get some CPUs (some of the variants of the ARM CPU), and certainly if you go back in history,

  • Some of the other CPUs you can get didn't have a floating-point unit. They could do math but they could only do integer math

  • So they could add 1 and 2 but they couldn't add 1.5 and 2.5

  • Well you "could", but you had to write the software to handle the fractional part of the number and do the maths

  • and stick it back together in the right order. So I was wondering

  • What's the difference in speed that a floating-point unit would actually make?

  • So as I said most computers we have these days have floating-point units of some sort built-in

  • So I went back to one of my old ones and I decided to write a program to test it

  • So I wrote a very simple program which does a 3D spinning cube. So the program is relatively straightforward

  • It's got a series of eight points which it stores a representation of, and it does a series of matrix transformations on them

  • to get them into the screen coordinates, and then we draw the lines
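
The cube program itself isn't shown in full, so the snippet below is only a minimal sketch of the per-frame calculation being described: eight vertices, a rotation, and a projection into screen coordinates. The vertex list, rotation axis, and projection constants are all illustrative assumptions, not taken from the video.

```c
/* A rough sketch of the idea described above, NOT the actual Falcon program. */
#include <math.h>
#include <stdio.h>

typedef struct { float x, y, z; } Vec3;

/* the eight corners of the cube (illustrative values) */
static const Vec3 cube[8] = {
    {-1,-1,-1}, { 1,-1,-1}, { 1, 1,-1}, {-1, 1,-1},
    {-1,-1, 1}, { 1,-1, 1}, { 1, 1, 1}, {-1, 1, 1}
};

int main(void)
{
    float angle = 0.5f;                       /* rotation for this "frame"   */
    float c = cosf(angle), s = sinf(angle);

    for (int i = 0; i < 8; i++) {
        /* one of the matrix transformations: rotate about the Y axis */
        Vec3 p = cube[i];
        float rx =  c * p.x + s * p.z;
        float rz = -s * p.x + c * p.z;

        /* simple perspective projection into screen coordinates */
        float d  = 4.0f;                      /* distance from the viewer    */
        int   sx = (int)(160.0f + 100.0f * rx  / (rz + d));
        int   sy = (int)(120.0f + 100.0f * p.y / (rz + d));

        printf("vertex %d -> screen (%d, %d)\n", i, sx, sy);
    }
    /* a real program would now draw the 12 edges between these points */
    return 0;
}
```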

  • So I did this using floating-point maths, and the program's running here, and we can see it's

  • reasonably quick: it takes nought point nought four five (0.045) of a second to calculate where all the screen coordinates need to be for each frame

  • It sometimes varies, but that's in general what it takes. So I then went off onto a

  • popular auction site

  • beginning with the letter E, and

  • ordered myself a floating-point chip for the Falcon, and I then inserted it

  • into the machine, and I recompiled the program, this time to use

  • the floating-point unit. Now, the original version is doing floating-point maths

  • It's using the fractions

  • but it's all being done in software: machine code instructions for the 68030 chip in there

  • to calculate all those different floating-point numbers

  • So then we recompiled the program, this time to actually use the floating-point unit, and

  • this version runs about 4.5 times faster: it takes nought point

  • nought one (0.01) seconds, rather than nought point nought four five (0.045) seconds, to do exactly the same calculations. This is exactly the same source code

  • I just recompiled it using GCC to produce a version that used the floating-point

  • unit, and you can actually see that the graphics are slightly smoother and the time is much less
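
The actual build commands aren't shown in the video; on an m68k-targeting GCC the switch between software and hardware floating point is typically just a code-generation flag, so the commands would look something like the hypothetical lines below (the source file name and option choices are assumptions):

```c
/* Hypothetical build commands for an m68k-targeting GCC ("cube.c" is an
 * assumed file name, not taken from the video):
 *
 *   gcc -m68030 -msoft-float -O2 cube.c -o cube_soft    (floating point in software)
 *   gcc -m68030 -m68881      -O2 cube.c -o cube_fpu     (floating point on the FPU)
 *
 * The C source is identical in both cases; only the generated code changes.
 */
```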

  • So the fact we can speed it up by doing it in hardware perhaps isn't that surprising

  • There's lots of tasks where you could either implement it in software or implement it in hardware

  • and if you implement it in hardware it's often

  • faster to do so, so that's perhaps what we'd expect. But it's actually worth thinking about what's involved in adding up

  • floating-point numbers. Tom did a good video

  • right back at the beginning of Computerphile looking at how floating-point numbers are represented at a sort of high level, and it'll say, well,

  • it's nought point nine nine nine nine nine nine nine, and then after a while

  • you'll stop but actually when you get down to the sort of

  • level of the computer having to deal with them, and see how they're stored,

  • it's quite interesting to see how they're stored, and how manipulating them,

  • or writing software to do something simple like adding two numbers together,

  • actually ends up being quite an involved task compared to adding together two binary numbers

  • So let's look at how we add numbers together

  • So let's take a number, and to save time I've printed out the bits. So let's take the number...

  • let's say 42, because, well, everyone uses that. So we've got here the number 42, which is one zero

  • one zero one zero, and then we need to fill the rest of these with zeros; we'll ignore that for now

  • So that's bit nought over on the right-hand side, through to bit

  • one, two, three, and these of course are the same as the powers of 2

  • So 2 to the nought is one, 2 to the one is two, then four, eight,

  • just like we have powers of 10 when we do decimal numbers. So let's say we want to add together 42 and

  • 23, so I've got another binary number here, 23, with the same bits, and

  • We'll just basically do addition. So 0 plus 1, Sean, is

  • 1 good

  • yeah, ok, 1 plus 1 is

  • 0 and we have to carry the 1 0 plus 1 plus 1

  • Okay. Yeah 1 plus 0 plus 1

  • So yeah

  • We've added it up: this number's 42, this number's 23; 42 plus 23 is 65, and so we've produced

  • 65 in binary as a result

  • So adding up in binary is a relatively straightforward thing

  • What we do is we take, from the right, each pair of bits and add them together

  • We produce a sum bit, and on occasion

  • we also produce a carry bit, and then we add the carry on in the next column, just like we do

  • when we do decimal arithmetic
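
As a quick aside (not from the video), the column-by-column procedure just described is easy to write down in C: work from the right, produce a sum bit and a carry bit, and feed the carry into the next column.

```c
/* Ripple-carry addition of 42 and 23, one bit column at a time. */
#include <stdio.h>

int main(void)
{
    unsigned a = 42, b = 23, result = 0, carry = 0;

    for (int bit = 0; bit < 8; bit++) {
        unsigned x = (a >> bit) & 1;                  /* bit from 42          */
        unsigned y = (b >> bit) & 1;                  /* bit from 23          */
        unsigned sum = x ^ y ^ carry;                 /* the sum bit          */
        carry = (x & y) | (x & carry) | (y & carry);  /* the carry bit        */
        result |= sum << bit;
    }

    printf("%u + %u = %u\n", a, b, result);           /* prints 42 + 23 = 65  */
    return 0;
}
```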

  • And you can generate systems that represent

  • decimals, or 'bicimals'

  • I guess they'd be called, or fractional numbers,

  • using this. So you can use a system which is quite common:

  • it was used in Adobe Acrobat, and was used on iOS for 3D graphics at one point,

  • which is fixed-point numbers, where you say that, say, out of 32 bits, the top 16 bits are going to represent the sort of

  • integer part, and the bottom 16 bits are going to represent

  • the fractional part. And the basic way to think about that is you multiply every number by

  • 65,536, which shifts everything along, and then when you want to produce the final result you divide it all by

  • 65,536. Now the problem with fixed-point numbers is that they have a fixed scale

  • Fixed is key in the name. So for example, if we use

  • 32-bit fixed-point numbers, splitting into 16 bits and 16 bits, that's great: we can go up to

  • 65,000 or so on the integer part, but if we need to get to 70,000 we can't store it

  • Likewise we can go down to 1/65,536, but if we wanted to go to

  • 1/131,072, sort of thing, we can't, because we don't have that resolution. On occasion

  • we need the bits down here to represent very small quantities, and on occasion

  • we want them to represent very large quantities. For something like 3D graphics, or graphics in general,

  • fixed-point numbers can work well; for general-purpose things, they don't work that well
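
A minimal sketch (not from the video) of the 16.16 fixed-point scheme just described: every value is stored multiplied by 65,536, addition is plain integer addition, and multiplication needs one extra shift. The type and function names are made up for illustration.

```c
#include <stdio.h>
#include <stdint.h>

typedef int32_t fix16;                        /* 16.16 fixed-point value */

#define FIX_ONE 65536                         /* 1.0 in 16.16 */

static fix16  fix_from_double(double d) { return (fix16)(d * FIX_ONE); }
static double fix_to_double(fix16 f)    { return (double)f / FIX_ONE; }

/* Addition is plain integer addition; multiplication needs a shift back
 * down, because the 65,536 scale factor would otherwise be applied twice. */
static fix16 fix_mul(fix16 a, fix16 b) { return (fix16)(((int64_t)a * b) >> 16); }

int main(void)
{
    fix16 a = fix_from_double(1.5);
    fix16 b = fix_from_double(2.25);

    printf("1.5 + 2.25 = %f\n", fix_to_double(a + b));          /* 3.750000 */
    printf("1.5 * 2.25 = %f\n", fix_to_double(fix_mul(a, b)));  /* 3.375000 */

    /* The fixed scale is the catch: a signed 16.16 integer part tops out
     * around 32,767 (65,535 if unsigned), and 1/65,536 is the smallest
     * step, so values like 70,000 or 1/131,072 simply can't be stored. */
    return 0;
}
```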

  • So what people tend to do is they use floating-point numbers, which is, as Tom said, writing things in

  • scientific notation. So rather than writing

  • 1024, we write it as 1.024 times 10 to the 3, so we're using scientific notation

  • We can do the same in binary: rather than writing

  • 1 0 1 0 1

  • we can write 1 point

  • 0 1 0 1

  • times 2 (this time, rather than 10) to the... 1, 2, 3, 4... so we can write it as 2 to the 4

  • So what floating-point numbers do is they say, okay, rather than representing

  • numbers using a fixed number of bits for each part, we're going to represent them in scientific notation, effectively. We have the sort of

  • number that we're then going to multiply by 2 to the something, to shift it to the right point. To make things absolutely clear

  • I'm going to use

  • decimal numbers here to represent the 'times 2 to the 4', so I will cheat and use this here

  • but of course it would be

  • 1 0 to the 1 0 0 in binary. So I guess the question that remains is: how do we represent this in a computer?

  • We've got to change this notation, which we can write nicely on a piece of paper to represent the binary number

  • Multiplied by a power of 2, but how do we represent that in the computer?

  • What we need to do is take this and find an encoding which

  • Represents it as a series of bits that the computer can then deal with

  • So we're going to look at 32-bit floating-point numbers, mainly because the number of digits I have to fill in

  • becomes

  • relatively smaller to deal with than if we were doing 64-bit

  • We could have done 16-bit sort of things, but they all use the same scheme

  • It's just that the way they break it down changes slightly how many bits are assigned to each section
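
For reference (the video doesn't spell these out), the standard IEEE 754 splits for the three common sizes are:

```c
/* IEEE 754 binary formats: sign | exponent | stored mantissa bits
 *   16-bit (half):    1 |  5 | 10
 *   32-bit (float):   1 |  8 | 23
 *   64-bit (double):  1 | 11 | 52
 */
```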

  • So we've got our 32 bits and we need to represent

  • this number in there. We start off by splitting this up into

  • a few different things. So the first bit, or the most significant bit, in the number, the one on the left over here, is

  • the sign bit, and that says whether it's a positive number,

  • in which case it will be zero, or a negative number, in which case it will be one

  • So unlike two's complement

  • which David's looked at in the past (two's complement is equivalent to the one's complement with one

  • added to it), the sign is represented purely as: a zero means positive, a one means negative

  • We just have one bit representing that

  • They then say we're going to have eight bits, which we're going to use to represent the exponent, this bit here,

  • i.e. what power of 2, which gives us 255 or so

  • different powers of two we can use. We'll come back to how that's represented in a second. And then the rest of it

  • is used to represent the mantissa, as it's referred to. So the remaining 23 bits out of the 32 are used to represent

  • the remaining part of the number, okay

  • So we've got 23 bits to represent the number, which is then going to be multiplied by the power of 2 given by the 8-bit exponent

  • They said every single possible floating-point number you're gonna write down is going to have a 1 as its most significant digit

  • except 0. They say, OK, we'll treat 0 as a special case, and to represent 0 they just set all the bits to be zeros

  • So we know that this is going to be 1; well, we know it's 1, so we don't need to encode it

  • It's always going to be 1. So actually these

  • 23 bits here are the bits that come after the 1, so it's 1 point da-da-da

  • and so on, which are all the bits that come after here

  • So we sort of don't encode that bit, because we know it's there. One way to think about floating-point numbers

  • is they're a sort of lossy compression mechanism for

  • real

  • numbers: floating-point, real,

  • fractional numbers. Because we're taking a number which has some representation and we're compressing it into these bits

  • but we lose some information, and we can see that in a second

  • We'll run a little demo and we'll see that actually it can't represent all numbers, and

  • it's surprising sometimes which numbers it can't represent and which ones it can. So we can then start writing

  • numbers in this form, and to simplify things I've printed out a form like this

  • So if you want to write out the number one, it's one point naught naught naught naught

  • times 2 to the power of 0. So it's one point

  • naught, naught, naught, naught... naught

  • times 2 to the power of

  • 0, which is 1. So it's 1 times 1, which is 1. And of course the sign bit, because it's positive,

  • would be 0 to say that. So we could write that out as

  • the number. So we can start assigning these things to the different bits

  • We put a 0 there because it's positive, and the mantissa is all 0, so we just fill them up with

  • zeros, and that leaves us with these 8 bits here

  • We've got to represent 2 to the power of 0 now they could have decided to just put 0 in there

  • But then the number 1 would have exactly the same bit pattern

  • as the number zero, and they decided that's potentially not a good idea

  • So what they actually say we're going to do is we're going to take the power, which will go from minus 127 through to

  • 127, and then they add 127 onto it. So our exponent here, our power of 2, is 0

  • so 0

  • plus

  • 127 obviously is 127 so we encode

  • 127 into these remaining bits: 0 1 1 1 1 1 1 1

  • So to encode the number 1 like that we encode it into the binary representation

  • 0 for the sign bit, 0 1 1 1 1 1 1 1, i.e.

  • 127, for the exponent, and then, because we know that the one's already encoded, the rest of it becomes 0. This is a lossy system

  • We can encode some numbers, but we're only encoding 24 significant bits. Where they are within the number, the encoding changes,

  • but we're still only encoding 24 significant bits
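
A small sketch (not code from the video) that pulls a 32-bit float apart into those three fields; for 1.0f it prints sign 0, exponent 127 (so 2 to the power 0) and an all-zero mantissa, matching the encoding just worked through.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    float f = 1.0f;
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);           /* reinterpret the 32 raw bits */

    uint32_t sign     = bits >> 31;           /* 1 bit                       */
    uint32_t exponent = (bits >> 23) & 0xFF;  /* 8 bits, stored with +127    */
    uint32_t mantissa = bits & 0x7FFFFF;      /* 23 stored bits; the leading
                                                 1 is implicit, not stored   */

    printf("%f = 0x%08X  sign=%u  exponent=%u (i.e. 2^%d)  mantissa=0x%06X\n",
           f, (unsigned)bits, (unsigned)sign, (unsigned)exponent,
           (int)exponent - 127, (unsigned)mantissa);
    return 0;
}
```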

  • So let's just write a program

  • that takes the number 16,777,215, an integer number, and adds one to it

  • And we'll do this in a loop

  • We'll add one to the integer and add one to the float, and print out the values

  • So we think that we'll get one six seven seven seven two one six (16,777,216), then one six seven seven seven

  • two one seven (16,777,217), and we'll do this both with an integer variable, so a 32-bit integer, and also with a

  • 32-bit float. So we've got the program written here on the computer

  • So we set up a float, y; we set up the variable i to be

  • 16 million, 777

  • thousand, 215 (checking things in binary there), and we set y to equal i, so they both start off with the same value

  • We're then going to print them out, as the decimal and the floating-point values, and I'm also going to print out the hexadecimal

  • representations of the bits so we can see what's going on

  • We're then going to add 1 to the value of y and add 1 to the value of i

  • So we're going to increment them both
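
The full source isn't shown on screen, so the listing below is a reconstruction based on that description; the variable names i and y are as described, everything else is an assumption. It behaves exactly as described next.

```c
/* Reconstruction (assumed, not the original): a 32-bit int and a 32-bit
 * float both start at 16,777,215 and are incremented together, printing
 * decimal values and raw bit patterns. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

static uint32_t float_bits(float f)
{
    uint32_t b;
    memcpy(&b, &f, sizeof b);                 /* raw IEEE 754 bit pattern */
    return b;
}

int main(void)
{
    int32_t i = 16777215;                     /* 2^24 - 1                 */
    float   y = (float)i;                     /* starts at the same value */

    for (int step = 0; step < 3; step++) {
        printf("i = %d (0x%08X)    y = %f (0x%08X)\n",
               (int)i, (unsigned)i, y, (unsigned)float_bits(y));
        i = i + 1;                            /* the int keeps counting     */
        y = y + 1.0f;                         /* the float sticks at 2^24:  */
    }                                         /* 16777217 would need a 25th */
    return 0;                                 /* significant bit            */
}
```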

  • So let's run this program

  • No compilation mistakes, that's always a good sign, and let's run it. So we get

  • 16777215, and we get

  • 16777215 point nought nought nought, what we'd expect

  • Then 16,777,216, and the same there. So now we add one on it again, and we get, for the integer value,

  • 16777217, but the float still says 16777216 point nought; that's not right. Okay, so that's not right, what's going on there? Well, if we think about how we represent this

  • Let's think about the number one six seven seven seven two one six

  • That number is one times two to the 24, and I sort of tricked you by

  • generating it this way at the beginning: it's a one with lots of zeros after it, times two to the 24

  • We have only 23 bits to represent this bit in here. If

  • We want to add on

  • an extra bit

  • We would need 24 bits here. We've only got 23 we can't do it

  • So we can't represent it. If we added 2 each time, it would work fine. So actually, as we get to larger numbers

  • We still have the same number of significant bits

  • or significant digits

  • But we can't store certain values as well as we can with integers, so it's a lossy compression system

  • basically, we can store a large range of values for anything from

  • minus 2 to the power of 127 through to 2 to the power of

  • 127 or we can go very very low and have numbers as small as 2 to the minus 127

  • But we only have a certain amount of precision

  • So if we deal with very, very large numbers, we've still only got 23 bits

  • of precision, and if we deal with very, very small numbers, we've still got 23 bits' worth of precision, which is fine

  • We can cope with that because often when you're dealing with big numbers

  • You're not worried about the small fiddly decimal places, you're more worried about the significant figures. If you're measuring how far it is from the Earth

  • to Alpha Centauri in

  • millimeters plus or minus a few

  • Millimeters a few points of a millimeter isn't going to make much difference that sort of thing

  • So it's a compression, but it's a lossy system
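
As a closing aside (not in the video), C exposes these limits directly in <float.h>; the little sketch below prints the range and precision figures for a 32-bit float, which line up with the powers of two discussed above.

```c
#include <stdio.h>
#include <float.h>

int main(void)
{
    printf("largest float    : %e\n", FLT_MAX);      /* about 3.4e38         */
    printf("smallest normal  : %e\n", FLT_MIN);      /* about 1.2e-38        */
    printf("precision step   : %e\n", FLT_EPSILON);  /* 2^-23, about 1.2e-7  */
    printf("decimal digits   : %d\n", FLT_DIG);      /* 6 guaranteed digits  */
    return 0;
}
```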


  • For this video it's gonna mean writing zeros 23 times; maybe I should have done 16-bit numbers
