I think, as far as I know, it was Brian Kernighan and Dennis Ritchie who first introduced it to me. I don't know if it goes back earlier than that, but certainly in the C book - there it is: 'printf ("hello world\n")', you know, and the use of '\n' to denote a new line at the end of it, and all that. It's now really become a part of Comp Sci legend. The first thing you do when you show that you've mastered [or just begun] a new language, be it Python or whatever, you know: "Oh yes! Here's how to do 'Hello World'".

Of course, "Hello World" is a character-based challenge. And from what we now know about characters - in modern computers at least - being stored in addressable bytes - does it sort of follow, then, that "Hello World" would be somewhat easier [at low level] on a byte-based machine? Oh yes! It would be a lot easier on a byte-based machine. But there are other things as well. So as, perhaps, an illustration of just how horrible it could be - and given that we have done some stuff on EDSAC already - let's go and do that. If you haven't seen the other EDSAC stuff I think you'll be able to follow what I'm doing anyway. And you could always go back later and pick up some more background about EDSAC.

But when we were on this EDSAC simulator, the last time, we actually did run the program that Martin Campbell-Kelly supplies with it. And he got fed up of doing "Hello World". He said: "I'll just do a brief version that says 'HI'". We did that. Thanks to a combined programming effort now, by those in this room, I have here the new version, "HelloWorld_SR_DFB.txt". And there it is. It's quite a lot longer, of course, than the previous one was.

>> Sean: So, is each of those lines using a word, then?

>> DFB: Yes. EDSAC was designed around the most minimalist set of things. It was basically ... the story was ... if it's possible to do [it] with what we've got already, then don't start inventing new flavours of instructions.
So, all you've got here is - this is the stuff, of course, for setting up where the load point is, and where the relative offsets of these addresses are, relative to 64. The '@' symbol at the end [of an instruction] signals to David Wheeler's Initial Orders that what comes here is a relative address. So what it's saying is: letter O - not a zero - "output the character which you can find in the memory location 16 further on than 64". So, all these offsets: 16, 17, 18, 19, 20 are all relative to 64. So in actual fact, then, it turns out that address 80 holds the very first thing you want to output. And of course 16 on from 64 ... well, if 64 is here, this is where the actual data starts.

The 'ZF' and the things like that correspond to what are nowadays called assembler directives. It's not always the case that these things go one-for-one into occupying a word. Some of them are messages to the assembler. All the stuff up here is basically saying: "I want you to remember 64 and start locating everything relative to that".

>> Sean: Because if we look specifically at the line numbers on the left there, that wouldn't be the place you're trying to get to, right?

>> DFB: No, this stuff up here is what would probably be done in modern assemblers by saying something like "ORG = 64" [where ORG = "origin"]. In other words, that isn't a program instruction. It's telling you, the assembler: "Please start me [loading] at 64". And it's for your own [assembler] internal knowledge. It's not to be translated into a program instruction. So the ZF says "Stop" - stop execution. But in the meantime what we're expecting is that the thing that is 16 on from 64 will actually get us to here, for *F. What does *F do? * is a short code for saying "Put yourself in letter shift". Veterans of 5-hole paper tape will know - you've got to make sure that you're in letter shift to print meaningful messages.
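The relative-address arithmetic described here (offset 16 from load point 64 gives address 80) can be sketched in a few lines of Python. This is only a rough model, not how Initial Orders is actually coded: the function name `resolve` and the order-string format are hypothetical, and the only rule it assumes is the one stated in the transcript, namely that an order ending in '@' has the load point added to its address, while one ending in 'F' is left alone.

```python
import re


def resolve(order, base):
    """Return (function letter, absolute address) for a simplified order.

    Hypothetical model: 'O16@' means function O, address 16, relative to
    the load point; 'ZF' means function Z, address 0, not relocated.
    """
    match = re.fullmatch(r"(.)(\d*)([F@])", order)
    if match is None:
        raise ValueError(f"unrecognised order: {order!r}")
    letter, digits, terminator = match.groups()
    address = int(digits) if digits else 0
    if terminator == "@":
        # Relative address: add the load point (64 in the transcript).
        address += base
    return letter, address


if __name__ == "__main__":
    print(resolve("O16@", 64))   # load point 64, as set up by T64K
    print(resolve("O16@", 256))  # same order, program loaded higher up
```

Changing only the `base` argument moves every '@' address in step, which is the bulk relocation that Initial Orders takes care of; the offsets themselves (16, 17, 18, ...) still have to be counted by hand.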
The other possible shift is figure shift, and all hell breaks loose if you start forgetting to shift out [of that]. It's just like the shift key on a typewriter; that's where it comes from historically.

>> Sean: Can you use that as a very, very simplistic code?!

>> DFB: [laughing] Yes! Possibly! Anyway, so turn into letter shift and, look, this makes sense now! Can you see HF in one [single-length] word? F means: "This is a single-length word". Yeah, 18 bits. Actually, the op-code field - for those who've got the EDSAC tutorial - the op-code field is occupied by an H, but the O command will output these [bits in the op-code field] as if they were characters, and meant to be characters. They've got to be in the op-code field, but the O command says: "Look in the op-code field". Regard it not as a Baudot character - remember, Maurice Wilkes had invented EDSAC code, subtly different, but never mind. And so you end up coming to here and saying: "Oh! It's a letter H [that] I am to output" when this O instruction, with a relative address offset on it, is executed. And you go all the way. Look here: H-E-L-L-O.

What's the exclamation mark? Look it up in the EDSAC tutorial, as I had to do. That's the marker you put in if you want to force an explicit space between HELLO and WORLD. Which we did. And we finally ... what are @F and &F after the 'D' of "HELLO WORLD"? Well, let's take a guess. We're trying to be neat and tidy - make it look good - that's the code for "give me a carriage return; give me a line feed". And then we say "end of the whole thing; end execution". And this is a marker also to Initial Orders: "You can stop relocating this program for me. I'm done."

OK, so that - since it's on top now - Oh! - fingers crossed, Sean - what do we do? We do Start, don't we? We noticed that, way back up at the top [of the program], we put in a Stop, just to make sure. Because [puts on 'ironic' voice] with our incredible knowledge of EDSAC binary, Sean and I can see, straight away, [looks at oscilloscope display] that that, of course, is HELLO WORLD. I mean, we're not kidding. David Wheeler would know that it said HELLO WORLD. I'll tell you something else, Sean. After only half a day's familiarity with this, John von Neumann would know that that was HELLO WORLD! He'd find it so comfortable to remember the details of the binary. Y'know, I'm sure he would! I really do.

So, here we go then. Let's do a Single EP - a single instruction; Single Shot, it's sometimes called nowadays. Right! There we are! It's still blinking. We turned into letter shift with that click; next click, 'H'. Oh! Isn't this wonderful, Sean?! Aren't we demon programmers?! E-L-L-O-space. Yes! W-O-R-L-D - carriage return - line feed.

So, that was pretty painful! Although the T64K gives you relocatability - [e.g.] you could change that to be T256K, say, if you wanted to - [i.e.] shove the whole thing up memory and then maybe turn it into a subroutine? You want to push it somewhere else in memory. So, the bulk relocation, against the base address, is taken care of by Initial Orders, but you've still got to get the offsets right. And it's painful! It's utterly, utterly painful.

We're now gonna jump forward [in time] into safe byte-addressed territory, for handling characters, and [focus on] the 32-bit ARM chip, which we use for teaching assembler programming here [at Univ of Nottingham] to our first years [undergrads]. Yeah, it is a 32-bit word, broken up into four bytes, 8-bit bytes, which of course use ASCII, not IBM EBCDIC. Fine, so down at the assembler level for the ARM, then, what does the byte addressability give us, and what other things have happened between the EDSAC era and this era, where we're talking late 80s, 90s - this sort of thing? What else has happened to make this [ARM assembler] thing so much more compact, so much easier to understand and so much more flexible? Well, let's go through here, step by step.
Comments: anything after a semicolon is a comment. I've put a comment up at the top saying that, to put out the "Hello World", we've used the so-called software interrupts - the system calls - as provided by the University of Manchester's KoMoDo ARM development environment, which is what we use. So when we get to actually printing the character out, don't get worried by SWI; it means 'software interrupt', to ask the [KoMoDo] operating system to print something for me, or something like that.

So let's start up here. Programs on the ARM will cheerfully expect - if you don't tell them otherwise - that they will start executing at line 1 of your program, and go madly on. I put this data for "Hello World" up at the top of the listing. Not at the bottom, as I could have done. But the rule then is: I declare "Hello World" here as being a piece of text, and this DEFB here means '... just define a bunch of bytes'. And you put them in " quotes like you would in C. And even - taking over some of its story from C - it even allows you to ask for a newline to be put in there with \n. And the only difference is that, whereas C implicitly plugs its strings with a null character at the end, the ARM assembler doesn't do that for you. You must explicitly put in a null character at the end of your string - if that is your stop indicator.

But in order to stop the ARM chip executing "Hello World" as if it was bit-patterns for instructions - which you don't want - you want to jump past it. I've put in here, look, an unconditional branch to [the label] 'main'. Branch to 'main'. Aw! Now this is wonderful! You don't have to say branch to an absolute address, and be like David Wheeler and John von Neumann and have them all in your head; you just say: "Let's label it 'main'", and this thing called 'an assembler' will work out what 'main' means in terms of the address you want to jump to. Isn't that wonderful!
[In fantasy] von Neumann stares at you and says: "That's for the weak-brained who can't keep track of their addresses!" Y'know! Anyway, so, we branch to 'main' and the first thing it says, very self-evidently, really, is: "Get me the start address of the text string and put that start address into register 1 [r1]".

Next thing we notice - as long promised: modern CPUs [like ARM] have [typically] 15 or 16 general-purpose registers to make life bearable. EDSAC didn't - it only had the accumulator! And if you wanted other storage places, you had to start parking it in memory, in all sorts of horrible ways. So, that helps us straight away: r1 is going to be our so-called index register; it's going to start off by pointing at the address of 'H'. Now I don't know what the byte address of 'H' is. It might even be relatively zero here. It's the first thing that happens in this program. But whatever the actual byte address of 'H' is, it is now in register 1.

Here is the crux of the whole thing: LDRB [B=byte], "load into a register the byte specified as follows". Here I say r0; that's the register I want to load it into. But where does it come from? In square brackets: [r1]. That says: look in r1 and you will find the address of the start of that string. I don't want you to load the address into r0; I want you to load the character that is at that address into r0. It's [called] "indirection", and that is indicated by that square bracket, [i.e.] not putting the address that's in r1 into r0; I'm following the pointer from r1, saying: "Oh! That's the letter 'H' at the moment", and that's what I put into r0.

And here's the other cute thing at the end - wouldn't those pioneers have given the world for this - which is to say: "... and when you've done that, please, for next time around the loop, increment that r1 address by one". So, if it was pointing at 18, shall we say, to start with, it's 19 now, for next time around the loop. So you keep on going around that loop.
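As a rough illustration of that "load the byte r1 points at, then bump r1" step, here is a minimal Python sketch. It is a model only: the function name `ldrb_post_increment` and the use of a dictionary for registers are assumptions made for this example, and the assembler form quoted in the comment (ARM's post-indexed `LDRB r0, [r1], #1`) is inferred from the description rather than copied from the listing shown in the video.

```python
def ldrb_post_increment(memory, registers):
    """Model of a post-indexed byte load such as LDRB r0, [r1], #1.

    r1 holds an address (here, an index into `memory`). The byte found
    at that address goes into r0, and only then is r1 incremented.
    """
    registers["r0"] = memory[registers["r1"]]  # indirection: follow r1
    registers["r1"] += 1                       # ready for the next character


if __name__ == "__main__":
    # The string with its explicit null terminator, as in the DEFB line.
    memory = b"Hello World\n\0"
    registers = {"r0": 0, "r1": 0}  # r1 starts at the address of 'H'
    ldrb_post_increment(memory, registers)
    print(chr(registers["r0"]), registers["r1"])
```

The point of the sketch is that r0 receives the character, not the address, and that the pointer update comes for free in the same instruction.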
And here's the thing where you check whether you've hit the null character: "Compare the contents of register 0 - which would be a character contents - against literally 0", which is what the null character is. Now, is the answer "yes" or "no"? Is it equal, or not equal, to 0?

And here's another lovely thing about the ARM chip that Steve and I love dearly. This is the 32-bit ARM chip - I think in the 64-bit one they've [decided] it's not so important to do it nowadays. They have a thing in the 32-bit one called 'conditional execution', which can often save you using a branch instruction, which is relatively expensive in pipeline terms. So here we've got SWINE - which is wonderful! Software interrupt 0 says: "... punch out this character for me on the display, on the screen". But NE says: "... but do that only if the last thing you did didn't yield 'equal'" [so it's 'not equal']. Well, we're checking for the null character. So, as long as it wasn't the null character, it'll say: "No - I'm not equal to the null character". And you print it out, and out it comes, character by character.

After that, of course, you loop back to go around and print another character, remembering that the #1 has incremented your address pointer along that string. So you keep on going round here; you don't have to remember what [the] address 'loop' is. You don't know! [But] the assembler knows [and] it fixes it up for you. And then, right at the very end, the way to say "Stop execution - I've done it": SWI flavour 2, on this emulated environment, says "Stop it completely". The development of that from EDSAC? You think: "Oh! My golly, I am so pleased I've got that!"
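The whole loop (load, compare with zero, conditionally print, branch back, stop) can be modelled in Python as below. This is a hedged sketch rather than a transcription of the listing: the function name `print_string` is made up, the "zero flag" is reduced to a boolean, SWI 0 is modelled as appending to a list, SWI 2 as returning, and the branch back to 'loop' is assumed to be conditional on "not equal".

```python
def print_string(memory, start):
    """Walk a null-terminated byte string, modelling the ARM loop."""
    r1 = start                  # address of the first character
    output = []
    while True:
        r0 = memory[r1]         # load the byte that r1 points at ...
        r1 += 1                 # ... then post-increment the pointer
        equal = (r0 == 0)       # compare r0 against literal 0
        if not equal:           # SWINE 0: print only if 'not equal'
            output.append(chr(r0))
        if equal:               # the (assumed) conditional branch to
            break               # 'loop' is not taken on the null
    return "".join(output)      # SWI 2: stop execution


if __name__ == "__main__":
    # Explicit null terminator, because DEFB does not add one.
    print(print_string(b"Hello World\n\0", 0), end="")
```

Note that the null byte itself is loaded and compared but never printed, which is exactly what the NE condition on the software interrupt buys you.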
And Martin, the inventor of the EDSAC simulator here - I emailed him the other day and he came back to me and said: "Yes, the need for an index register was realized so quickly that that's why my [EDSAC] emulator is [only] early '49 to late-ish 1950, because in late 1950 David Wheeler and everybody said 'My golly, we need an index register!' And they built one in."

So, in a way then, this is what is happening. It's that the pioneers were using their early machinery to lead the way into saying: "What extra facilities do we need to make life tolerable for us?" Now, there is the hardware facility of having the index registers, and they've just become standard kit; afterwards, every other machine has index registers. But also what interests me is the role of a proper assembler. Initial Orders II is not a full-blown assembler. It helps you a little bit by turning decimal addresses into binary, but you have to remember that that letter A - that you put in the leading five bits - could be the character 'A', but if you're regarding this as an instruction, that's an ADD instruction. So Initial Orders II is relocating; doing a bit of binary translation; it's a single-pass process; it's wonderful!

The problem with assembler is it has to be a two-pass process. The trouble always is that if you jump back to labels you have already seen, you will know already what address that will be at. But it's when you jump forward. How do I know where the heck that label down there is gonna be [in address terms]? I don't even want to calculate it! I want the assembler to say: "Oh! I'm on location 186 now - how handy!" But then it can't fix up the addresses till it knows, and has counted its way through the program. So then it says: "Right, I will now output you a definitive thing - that you can put in through David Wheeler's Initial Orders II - because I've made it so much easier; because I've allowed labels."
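The forward-reference problem, and why it forces two passes, can be shown with a toy assembler in Python. Everything here is hypothetical: the `assemble` function, the tuple format `(label, operation, operand)`, the made-up operation names, and the simplifying assumption that every source line occupies exactly one address. The first pass only counts addresses and records where each label lands; the second pass replaces label operands with those addresses.

```python
def assemble(lines, origin=0):
    """Toy two-pass assembler: resolve label operands to addresses.

    Each line is (label or None, operation, operand or None), and each
    line is assumed to occupy exactly one address.
    """
    # Pass 1: count through the program, noting each label's address.
    symbols = {}
    address = origin
    for label, _operation, _operand in lines:
        if label is not None:
            symbols[label] = address
        address += 1

    # Pass 2: now that every label is known, fix up the operands.
    return [(operation, symbols.get(operand, operand))
            for _label, operation, operand in lines]


PROGRAM = [
    (None,   "B",     "main"),   # forward reference: 'main' not yet seen
    ("text", "DATA",  "Hello"),
    ("main", "LOAD",  "text"),   # backward reference
    ("loop", "PRINT", None),
    (None,   "B",     "loop"),   # backward reference
    (None,   "STOP",  None),
]

if __name__ == "__main__":
    for word in assemble(PROGRAM):
        print(word)
```

A single pass could handle the two backward references, but the very first line needs the address of 'main', which is only known once the assembler has counted its way past the data.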
One doesn't think of labels as being a structuring convention, and yet at this low level they are, in a way. Because this [label] is saying 'loop' - it starts here. Another label. Oh! It ends here. Please calculate the addresses of what's happening there and fix it up for me. And so you might say: "Well, all right, but didn't everybody say 'we must have assemblers; it's the modern way to do things'?" There were very mixed views about this. And I don't think EDSAC got an assembler until EDSAC 2 - when another friend of mine, David Hartley, did, I think, a macro-assembler for EDSAC 2 - not EDSAC 1.

Because there's a story here related to von Neumann as well. I don't know whether it was EDVAC, or his version of EDVAC that he had in his basement (called Johnniac?!). Apparently he really berated a grad student who wrote an assembler. [Invented quote] "Assemblers are for the weak-brained who cannot work out their own addresses! You do realize that in running this assembler of yours - punching out a paper tape - I'm behind you in the queue. I don't get my turn next! You come to me and say: 'Ah, but this is ready to load now, in the second phase, as absolute binary'. You're wasting time! If you're so weak-brained you can't program in absolute ..." [I'm putting words in his mouth!!]

But this was essentially it. He, no doubt, had dreams in Absolute Binary. There was no problem with John von Neumann about coping as close to binary as possible. He could keep it all in his head, and he would, I think, have found Initial Orders on EDSAC about, yes, nice and helpful. Single pass. Not slowing down things a lot. But an assembler! "You're wasting time on this machine! By doing assemblers." I mean, it really, really brings it home to those of us who always joked about, y'know: "Real Programmers use Assembler". The answer, certainly from John von Neumann - possibly even from David Wheeler, but he wouldn't have been as extreme as that - is: "Real Programmers use Absolute Binary!"
Hello World (Assemblers, Considered Harmful?!) - Computerphile. Posted by 林宜悉 on 14 January 2021.