In a previous video, we looked at how CPUs can use caches to speed up accesses to memory.
So, the CPU has to fetch things from memory; it might be a bit of data, it might be an instruction
And it goes through the cache to try and access it.
And the cache keeps a local copy in fast memory to try and speed up the accesses
But what we didn't talk about is:
What does a CPU do with what it's fetched from memory

what is it actually doing and how does it process it?
So the CPU is fetching values from memory.
We'll ignore the cache for now, because it doesn't matter if the CPU has a cache or not
it's still gonna do roughly the same things
And we're also gonna look at very old CPUs
the sort of things that are in 8-bit machines
purely because they're simpler to deal with
and simpler to see what's going on
The same idea still applies to an ARM CPU today or an x86 chip
or whatever it is you've got in your machine.
Modern CPUs use what's called
the von Neumann architecture

and what this basically means is
that you have a CPU

and you have a block of memory.
And that memory is connected to the CPU
by two buses

Each is just a collection of several wires connecting the two
And again we're looking at old-fashioned machines.
On a modern machine it gets a bit more complicated

But the idea, the principle, is the same.
So we have an address bus
and the idea is that the CPU can generate a number in here in binary
to access any particular value in here.
So we say that the first one is at address 0
and we're gonna use a 6502 as an example
We'll say that the last one is at address 65535 in decimal, or FFFF in hexadecimal
So we can generate any of these numbers on 16 bits of this address bus
to access any of the individual bytes
in this memory
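
As a rough sketch of that arithmetic, assuming a 16-bit address bus as on the 6502 and modelling memory as a plain Python bytearray (the names here are made up for illustration):

```python
# Assumption: 16 address lines, as on the 6502 described in the transcript.
ADDRESS_BITS = 16

first_address = 0
last_address = 2**ADDRESS_BITS - 1   # highest number 16 wires can carry

# The same last address in decimal and in hexadecimal.
print(first_address, last_address, f"{last_address:04X}")

# One byte per address gives a 64 KiB address space.
memory = bytearray(last_address + 1)
print(len(memory))
```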

How do we get the data between the two?
Well we have another bus

which is called the data bus,
which connects the two together

Now the reason why this is a von Neumann machine
is because this memory can
contain both the program

i.e. the bytes that make up the instructions
that the CPU can execute

and the data
So the same block of memory
contains some bytes

which contain program instructions
some bytes which contain data
And the CPU, if you wanted it to, could
treat the program as data

or treat the data as program
Well if you do that then it would probably crash
So what we've got here is an old BBC Micro
using a 6502 CPU

and we're gonna just write a very, very simple
machine code program

that uses,
well, the operating system, just to
print out the letter C for Computerphile

So if you assemble it,
we're using hexadecimal

we've started our program at 084C
So that's the address,
where our program is being created

And our program is very simple
It loads one of the CPU's registers
which is just basically a temporary data store
that you can use

and this one is called the accumulator
with the ASCII code 67 which represents
a capital C

and then it says:
jump to the subroutine at this address

which will print out that particular character
And then we tell it we want to stop
so we gotta return

from subroutine.
And if we run this

and type in the address,
so we're at ... 84C

then you'll see that it prints out the letter C
and then we get a prompt
to carry on doing things

So our program,
we write it in assembly language

which we can understand as humans
-ish, LDA: Load Accumulator
JSR: Jump to subroutine

RTS: Return from subroutine
You get the idea once you've done it a few times
And the computer converts this
into a series of numbers, in binary

The CPU is working in binary but to make it easier to read we display it as hexadecimal
So our program becomes:
A9 43

20 EE FF

60

That's the program we've written
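
Here is one way to picture those bytes sitting in memory. It is only a sketch: the flat bytearray is an assumption, and the final byte 60 is the 6502 opcode for RTS. Note that 43 in hexadecimal is the same number as 67 in decimal, the ASCII code for C, and that the JSR address FFEE is stored low byte first.

```python
# Hypothetical model: 64 KiB of memory as one flat bytearray.
memory = bytearray(0x10000)

START = 0x084C  # load address used in the transcript
program = bytes([
    0xA9, 0x43,        # LDA #$43   accumulator = ASCII 'C' (67 decimal)
    0x20, 0xEE, 0xFF,  # JSR $FFEE  target address, low byte first
    0x60,              # RTS        return from subroutine
])
memory[START:START + len(program)] = program

# Show each byte next to the address it lives at.
for offset, byte in enumerate(program):
    print(f"{START + offset:04X}: {byte:02X}")

# Program and data share the memory: the operand byte is also just a character.
print(chr(memory[START + 1]))
```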
And the CPU, when it runs it
needs to fetch those bytes from memory

into the CPU
Now, how does it do that?
To get the first byte we need to
put the address: 084C on the address bus

and a bit later on, the memory will send back
the byte that represents the instruction: A9

Now, how does the CPU know where to get these instructions from?
Well, it's quite simple.
Inside the CPU

there is a register, which we call
the program counter, or PC on a 6502

or on something like an x86 machine it's
known as the instruction pointer.

And all that does is store the address
of the next instruction to execute

So when we were starting up here,
it would have 084C in it

That's the address of the instruction we want to execute
So when the CPU wants to fetch the
instruction it's gonna execute

It puts that address on the address bus
and the memory then sends the instruction
back to the CPU

So the first thing the CPU is
gonna do to run our program

is to fetch the instruction
and the way it does that is by
putting the address from

the program counter onto
the address bus

and then fetching the actual instruction
So the memory provides it,
but the CPU then reads that in

on its input on the data bus
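
A minimal sketch of that fetch step, assuming memory is a bytearray and treating one call to a made-up `read` function as one trip over the address and data buses:

```python
memory = bytearray(0x10000)   # hypothetical flat 64 KiB memory
memory[0x084C] = 0xA9         # first byte of the example program


def read(address):
    """Model one bus transaction: address goes out, one byte comes back."""
    return memory[address & 0xFFFF]


pc = 0x084C                   # program counter: address of the next instruction
opcode = read(pc)             # PC is put on the address bus, byte arrives on the data bus
pc = (pc + 1) & 0xFFFF        # PC now points at the following byte

print(f"fetched {opcode:02X}, PC is now {pc:04X}")
```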
Now it needs to fetch the whole
instruction that the CPU is gonna execute

and in the example we saw there
it was relatively straightforward

because the instruction was only
a byte long

Not all CPUs are that simple
Some CPUs will vary these things,
so this hardware can actually be quite complicated

so it needs to work out how long
the instruction is

So it could be as short as one byte
it could be as long on some CPUs
as 15 bytes

and you sometimes don't know how long it's gonna be until you've read a few of the bytes
So this hardware can be relatively trivial
So an ARM CPU makes it very, very simple
it says: all instructions are 32 bits long

So the Archimedes over there
can fetch the instruction very, very simply

32 bits
On something like an x86, it can be
any length up to 15 bytes or so

and so this becomes more complicated,
you have to sort of work out

what it is until you've got it
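
To make the length problem concrete, here is a small sketch for the 6502 example only. On the 6502 the first byte, the opcode, tells you how many bytes the whole instruction occupies; the table below covers just the three opcodes in our program and the function name is made up.

```python
# Total length in bytes (opcode plus operand) for the three 6502 opcodes
# used in the example program. A real decoder needs the full opcode table.
LENGTH = {
    0xA9: 2,  # LDA immediate: opcode + 1 operand byte
    0x20: 3,  # JSR absolute:  opcode + 2 address bytes
    0x60: 1,  # RTS:           opcode only
}


def split_instructions(code):
    """Cut a byte string into instructions by reading each opcode's length."""
    instructions = []
    i = 0
    while i < len(code):
        n = LENGTH[code[i]]          # we only know the length after reading the opcode
        instructions.append(code[i:i + n])
        i += n
    return instructions


program = bytes([0xA9, 0x43, 0x20, 0xEE, 0xFF, 0x60])
for instruction in split_instructions(program):
    print(instruction.hex(" ").upper())
```

A fixed-length design like the 32-bit ARM encoding mentioned above would not need the table at all: every instruction would simply be the next four bytes.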
But we fetch the instruction
So in the example we've got,
we've got A9 here

So we now need to work out what A9 does
Well, we need to decode it into
what we want the CPU to actually do

So we need to have another bit
of our CPU's hardware

which we're dedicating to
decoding the instruction

So we have a part of the CPU which is
fetching it

and part of the CPU which is then
decoding it

So it gets A9 into it:
So the A9 comes into the decode

And it says: Well okay, that's a load instruction.
So I need to fetch a value from memory
which was the 43
the ASCII code for the capital letter C
that we saw earlier

So we need to fetch something else
from memory

We need to access memory again,
and we need to work out what address

that's gonna be.
We also then need to,
once we've got that value,

update the right register
to store that value

So we've gotta do things in sequence.
So part of the Decode logic is to
take the single instruction byte,

or however long it is,
and work out what's the sequence that we need to drive the other bits of the CPU to do
And so that also means that we have
another bit of the CPU

which is the actual bit that does things,
which is gonna be all the logic
which actually executes instructions

So we start off by fetching it
and then once we've fetched it
we can start decoding it

and then we can execute it
And the decode logic is responsible for saying:
Put out the address of the place in memory
that you want to load the value from

and then store it,
once it's been loaded into the CPU

So you're doing things in order:
We have to fetch it first
and we can't decode it until we've fetched it
and we can't execute things
until we've decoded it
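
Putting fetch, decode and execute together, here is a toy interpreter for just the three instructions in our program. It is a sketch under several assumptions: memory is a flat bytearray, the routine at FFEE (OSWRCH, the BBC Micro's write-character call) is faked by appending to a list, and the first RTS simply ends the run.

```python
memory = bytearray(0x10000)           # hypothetical flat 64 KiB memory
memory[0x084C:0x0852] = bytes([0xA9, 0x43, 0x20, 0xEE, 0xFF, 0x60])
OSWRCH = 0xFFEE                       # BBC Micro "write character" entry point


def run(pc):
    """Run from pc until the top-level RTS; return what was 'printed'."""
    a = 0                             # the accumulator register
    screen = []                       # stand-in for the display
    while True:
        opcode = memory[pc]           # fetch: PC selects the byte
        pc += 1
        if opcode == 0xA9:            # decode: LDA immediate
            a = memory[pc]            # execute: second memory access
            pc += 1
        elif opcode == 0x20:          # decode: JSR absolute
            target = memory[pc] | (memory[pc + 1] << 8)
            pc += 2
            if target != OSWRCH:
                raise NotImplementedError(f"no routine at {target:04X}")
            screen.append(chr(a))     # shortcut: act as if the OS routine ran
        elif opcode == 0x60:          # decode: RTS ends this sketch
            return "".join(screen)
        else:
            raise ValueError(f"opcode {opcode:02X} not modelled")


print(run(0x084C))
```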

So, at any one time,
we'll probably find on a simple CPU

that quite a few of the bits of the
CPU wouldn't actually be doing anything

So, while we're fetching the value
from memory

to work out how we're gonna decode it
the decode and the execute logic
aren't doing anything

They're just sitting there, waiting for their turn
And then, when we decode it,
it's not fetching anything

and it's not executing anything
So we're sort of moving through these different
states one after the other

And that takes different amounts of time
If we're fetching 15 bytes it's gonna take longer than
if we're fetching one

decoding it might well be shorter
than if we're fetching something from memory,
'cos this is all inside the CPU

And the execution depends on
what's actually happening

So your CPU will work like this:
It will go through each phase,

then once it's done that,
it'll start on the next clock tick

all CPUs are synchronized to a clock,
which just keeps things moving in sequence
and you can build a CPU.
Something like the 6502 worked like that

But, as we said, lots of the CPU isn't actually
doing anything at any time

which is a bit wasteful of the resources
So is there another way you can do this?
And the answer is yes!
You can do what's called

a sort of pipelined model of a CPU
So what you do here is,
you still have the same 3 bits of the CPU

But you say: Okay, so we gotta fetch (f)
instruction one
In the next bit of time,
I'm gonna start decoding this one

So, I'm gonna start decoding instruction one
But I'm gonna say: I'm not using
the fetch logic here,

so I'm gonna have this start to get things ready
and, start to do things ahead of schedule
I'm also at the same time
gonna fetch instruction 2

So now I'm doing two things,
two bits of my CPU in use at the same time

I'm fetching the next instruction,
while decoding the first one

And once we've done decoding, I can start
executing the first instruction

So I execute that
But at the same time, I can start
decoding instruction 2

and hopefully,
I can start fetching instruction 3

So what? It is still taking the same
amount of time to execute that first instruction

So the beauty is when it
comes to executing instruction two

it completes exactly one
cycle after the other

rather than having to wait for it to go through
the fetch and decode and execute cycles

we can just execute it as soon as we've
finished instruction one

So each instruction still takes the
same amount of time

it's gonna take, say, three clock cycles
to go through the CPU

but because we've sort of pipelined it together
they actually appear to execute one after the other
so each one appears to finish one clock cycle
after the one before

And we could do this again
So we could start decoding

instruction 3 here
at the same time as we're executing instruction two
Now there can be problems
This works for some instructions,
but say this instruction

said "store this value in memory"
Now you've got a problem
You've only got one address bus
and one data bus

so you can only access or store
one thing in memory at a time

You can't execute a store instruction and fetch a value from memory at the same time
So you wouldn't be able to fetch it until the next clock cycle
So we fetch instruction four there
while executing instruction three
But we can't decode anything here
So in this clock cycle, we can
decode instruction four

and fetch instruction five
but we can't execute anything
We've got what's called a "bubble"
in our pipeline,

or pipeline stall
because at this point,
the design of the CPU doesn't let us

fetch an instruction
and execute an instruction at the same time
these are what are called "pipeline hazards"
that you can get when designing a pipelined CPU
because the design of the CPU
doesn't let you

do the things you need to
do at the same time.
So you have to

delay things, which means that
you get a bubble

So, you can't quite get up to
one instruction per cycle

But you can certainly get closer
than you could if you
just had everything

to do one instruction at a time.
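
A bubble can be shown with a small simulation. This is a sketch under stated assumptions, not a model of any real chip: three one-cycle stages, a single shared bus, and the rule that no instruction can be fetched in a cycle where a "store" is in the execute stage. Instructions are identified by their position in the list, counting from 0.

```python
def simulate(program):
    """program: list of "op" or "store". Returns one (fetch, decode, execute)
    tuple per clock cycle, holding instruction indexes or None for an idle stage."""
    rows = []
    f = d = e = None
    next_instr = 0
    retired = 0
    while retired < len(program):
        e, d = d, f                       # everything moves along one stage
        bus_busy = e is not None and program[e] == "store"
        if next_instr < len(program) and not bus_busy:
            f = next_instr                # the bus is free, so fetch the next one
            next_instr += 1
        else:
            f = None                      # nothing fetched: a bubble enters
        rows.append((f, d, e))
        if e is not None:
            retired += 1                  # the instruction in execute completes
    return rows


for row in simulate(["op", "op", "store", "op", "op"]):
    print(row)
```

With no stores, five instructions take seven cycles in this model; the one store adds an eighth, because the empty slot it creates travels down the pipeline behind it.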


Inside the CPU - Computerphile

Posted by dearjane on 30 December 2018