  • What is going on, everybody? Welcome back to another unconventional neural networks tutorial. In this video, what we're gonna be doing is going over the results from far more complex mathematical models.

  • So up to this point, we've been using a character-level sequence-to-sequence model to do, at least, addition, which we found was 100% accurate.

  • Uh, when we did the inference testing, we noticed there were some weird inconsistencies there.

  • What was going on is that, with the chatbot, we have an additional scoring mechanism, like a rule-based scoring mechanism, that sits on top of the, uh, output.

  • Depending on beam width, we can actually have a number of outputs from the chatbot, and then we can pick which one we want to use.

  • And that's what we were doing with the addition as well.

  • But with the scoring mechanism, one of the main things with scoring is that we tend to like longer results and we penalize shorter ones.

  • So in the model where we were doing, like, five plus two or whatever, it was producing a very short result, and the network was kind of being scored negatively for that, and that's why we had that inconsistency.

  • Now, we could go through and make a different inference.py, but I don't really see any need.

  • This is truly just research phase.

  • And with inference.py, you're only doing, like, one at a time, whereas the output dev is truly 500 or 100 out-of-sample tests, or however many you wanna have.

  • So to me, that's the way to go anyways, when we're just doing research.

  • So that's why I've just decided to keep it.

  • But that's why we had that, like, strange inconsistency there with 100% accuracy, and then suddenly we're finding instances where it's not.
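
To make the scoring effect described above concrete, here is a minimal sketch. It is not the chatbot's actual scoring code: the function names, the weights, and the candidate list are all hypothetical. It only shows how a rule that favors longer replies can push a correct but very short answer below a wrong, longer one.

```python
def length_biased_score(answer: str, min_len: int = 4) -> float:
    """Hypothetical rule-based score: penalize short answers, reward length."""
    score = 0.0
    if len(answer) < min_len:
        score -= 2.0              # assumed penalty for very short replies
    score += 0.1 * len(answer)    # assumed small bonus per character
    return score


def pick_best(candidates):
    """Pick the candidate with the best combined (model + rule) score.

    candidates is a list of (text, model_log_prob) pairs, as a beam
    search might return them.
    """
    return max(candidates, key=lambda c: c[1] + length_biased_score(c[0]))[0]


# The model itself prefers "7" (higher log-probability), but the
# length rule flips the decision toward the longer, wrong answer.
candidates = [("7", -0.1), ("70000", -1.5)]
best = pick_best(candidates)
print(best)
```

With a scorer like this, batch evaluation that reads the model's top output directly can report 100% accuracy while interactive inference, which goes through the scorer, sometimes picks a different answer.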

  • Anyway, let's get into it.

  • So the first one that we have here is the multiple operators.

  • So, in case you forgot, we can go into data and check it out.

  • So we'll look at the "from" first.

  • We don't really need to look at the "to", but anyways, so this case is a subtraction,

  • multiplication,

  • we've got a division problem, and the reason why there are spaces here

  • is it's just being tokenized.

  • Hence the character-level sequence-to-sequence here.
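
As a rough illustration of why the spaces show up, character-level tokenization can be sketched like this. The helper names are made up for the example, and the real preprocessing in the project may differ.

```python
def tokenize(expr: str) -> str:
    """Split an expression into single characters separated by spaces."""
    return " ".join(ch for ch in expr if not ch.isspace())


def detokenize(tokens: str) -> str:
    """Undo tokenize() by dropping the separating spaces."""
    return tokens.replace(" ", "")


print(tokenize("12*3"))        # each digit and operator becomes its own token
print(detokenize("1 2 * 3"))
```

Every digit, operator, and decimal point is its own token, so the vocabulary stays tiny even though the numbers themselves are unbounded.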

  • Um, anyway, all the different variants and we're ready to rumble.

  • So I'm just going to minimize this and a couple things.

  • All the models, including the 100% accurate addition model, this model here, the multiple operators, and then finally, the much more complex math model,

  • all of them I'm gonna upload to Python Programming, and that's so you can kind of play with them.

  • I'll try to remember to upload them with the settings file so that you could run it, actually.

  • But if I forget, someone remind me and I'll throw the settings into it.

  • So this is the result, basically, right up here. It takes me forever to scroll this page.

  • Let me just do a tiny bit.

  • Hopefully it'll... there it goes. Um, rather than doing it via the Python files with the output dev,

  • what we did here is, this time, it's actually just tied into TensorBoard so that we can track these along the way.

  • So all of these charts here are over time, from start to the very end, of what the difference was. So obviously, especially with multiplication, um, the magnitude of being wrong on multiplication is very much different than the magnitude of probably being wrong with, like, addition or subtraction.

  • So we didn't want to just have a general "how absolute-valued off are we", um, necessarily.

  • Like, this is math total difference; we do track that as well.

  • But we're kind of curious, really, about these individually as well, just to see if they're all learning or what's going wrong.

  • So addition is one above.

  • There's so many plots here, though, that it is really, really laggy.

  • So I don't want to scroll up, but anyways, it's there.

  • Um, and then what we have here is the total difference, but then also the total accuracy.

  • So as you can see, it took a little bit, and suddenly it jumps up.

  • Probably the learning rate decreased there, um, and then eventually it just kind of leveled out despite learning rate decreases.

  • So, um, I just kind of stopped the model at this point.

  • Could keep going.

  • Maybe it was gonna learn something more, I'm not really sure, but I'm actually pretty impressed with a 45% accuracy.

  • Um, that's pretty good.

  • So, uh, anyways, that's that. I don't really want to harp too long on this one because this one is not as cool to me.

  • I'm pretty confident we could come up with some sort of model that would learn this 100%.

  • But what I really wanted to do was try far more complex types of math than just this.

  • Because if we could solve the far more complex math with some sort of model, then that should also solve this problem.

  • So let's move on to the far more complex variant here.

  • And let me just pull down our Paperspace.

  • So close that.

  • And I guess we'll just have to keep it this way.

  • So these are, like, all the charts, basically, for the more complex one.

  • Now, let me pull up an example.

  • Let me see if I can find... Here we go.

  • So let me just pull up, um, a test.from. In fact, I'm gonna pull up one

  • that's not tokenized, because it's just harder for us to read, so I'm just gonna go into new data instead. test.from... oh, man.

  • Is it Schoof?

  • Untie me, man.

  • Come on... there it goes. Anyway.

  • Okay, so this is These are the types of operations now that we're trying to solve.

  • So not only do you have multiple types of operators, we've got multiple lengths of sequences, although these are all five, interestingly enough, but at least this one's four long.

  • Um, and so is this. Anyway.

  • But we also have, like, parentheses in here and different operations in the parentheses.

  • We've got multiple parentheses.

  • You can see there's quite a bit of embedding going on in this one.

  • Uh, these are cool.

  • If we could solve this kind of math, that would be impressive.
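
For a sense of what this kind of training data involves, here is a small hypothetical generator for nested expressions. It is not the script used to build the dataset in the video: it assumes fully parenthesized expressions over integers 1 to 99, and it computes each answer while building the string, so no `eval` is needed to produce the target.

```python
import operator
import random

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def random_expr(rng: random.Random, depth: int = 2):
    """Return (expression_text, value) for a random nested expression."""
    if depth == 0 or rng.random() < 0.3:
        n = rng.randint(1, 99)
        return str(n), float(n)
    while True:
        left_text, left_val = random_expr(rng, depth - 1)
        right_text, right_val = random_expr(rng, depth - 1)
        symbol = rng.choice(list(OPS))
        if symbol == "/" and right_val == 0:
            continue  # avoid division by zero; draw a new pair
        value = OPS[symbol](left_val, right_val)
        return f"({left_text}{symbol}{right_text})", value


rng = random.Random(0)
samples = [random_expr(rng, depth=3) for _ in range(5)]
for text, value in samples:
    print(text, "=", value)
```

The expression text would be the model input and the value, written out as characters, the target, which is why the outputs can be long decimals.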

  • So, uh, that's the input, basically, to the model this time.

  • And obviously the output, uh, you know, the "to" file, is just the answer to these, which is not that interesting.

  • It's just the right answer.

  • This is that model.

  • After, I think we're on step, like, 700,000-something.

  • It looks like it's about, you know, maybe 750,000. Where are we at? 741. So 741,000, that was in steps, and we're tracking all the same things.

  • We're tracking the adds, the divs, the muls, the subs, and then the total difference.

  • Now, the total there is a little hard to see, and so is muls.

  • Ah, one option we have is to try to zoom in a little bit.

  • Um, the problem is

  • these values were just so freaking huge that they confuse the chart. Anyway, you can zoom in a little bit and start to kind of get an idea of what that graph looks like.

  • Also, the total difference one, I can try to zoom in.

  • It's so laggy; this thing's been up for so long.

  • So anyway, you can at least get the idea they are declining over time.

  • The total accuracy here is a mere 1.2%.

  • Now, in the grand scheme of things, with the fact that this is a character-level sequence-to-sequence model, in theory, it could be an infinite number of outputs.

  • Getting 1.2% accuracy is pretty exciting to me.

  • I actually didn't think that this would work at all.

  • I didn't think it would train at all.

  • Um, so actually 1.2%.

  • I'm pretty impressed.

  • The fact, too, that, um, the total difference is in steady decline, and it's definitely learning things.

  • Also, we can pull up some of the comparisons.

  • So I went ahead and just pulled in some of the comparing files from before, just so that we can kind of look at what the intended output was compared to what the actual output was.

  • So this is just a really basic file from one of the previous tutorials.

  • So it's just gonna load up output dev 5000.

  • So it's all 500 of them.

  • So we can see, this is after 5000 steps,

  • how it was doing. It had 0% accuracy on here.

  • Um, but interestingly enough, almost immediately, we actually see that it's not super far off like I mean, Well, okay, this one is actually pretty far off.

  • That's actually one of the worst ones I've seen.

  • To be honest, usually it's at least somewhere in the right order of magnitude.

  • But actually, a few of these are quite a bit off.

  • This is more along the lines of what I kind of expected the best of this model to be.

  • I just felt like this is very challenging for a neural network, to be able to learn all the intricacies of how to do addition, how to do division, the parentheses, and, like, combining all these operations. It is curious to me that it's able to do that.
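
One way to put a number on "somewhere in the right order of magnitude" is to compare the base-10 exponents of the expected and predicted values. The helper names and the sample numbers below are purely illustrative and are not taken from the model's output files.

```python
import math


def magnitude(x: float):
    """Base-10 order of magnitude of x, or None for zero."""
    if x == 0:
        return None
    return math.floor(math.log10(abs(x)))


def same_magnitude(expected: float, predicted: float) -> bool:
    """True when both values have the same base-10 exponent."""
    return magnitude(expected) == magnitude(predicted)


print(same_magnitude(2345.6, 2871.0))    # both in the thousands
print(same_magnitude(2345.6, 23456.0))   # off by a factor of ten
```

Counting how many of the 500 test answers land in the right order of magnitude gives a softer progress signal than exact-match accuracy, which stays near zero for a long time.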

  • Um, so anyway, that was only 5000 steps.

  • So as I showed you guys, we have, like, 700,000.

  • So, um, let's just jump to, like, 100,000 or whatever is close to that. 100,125.

  • I'll put this model up on pythonprogramming.net. It's 500 megabytes of, uh, model.

  • It's quite a large model, so take that into consideration.

  • But you could run it off a CPU.

  • You know, running live on a CPU is really no big deal.

  • It's mostly just training on a CPU that's a pain in the...

  • Okay, so let's run this one.

  • Okay, so here we have this one, still with 0% accuracy. We can see that this one, it just got way, way, way wrong.

  • But a lot of these are actually pretty close, and you've gotta kind of hand it to the model to an extent. Um, these are really long decimals.

  • So, yeah, but you can already tell that it is at least getting in the ballpark, you know, 170,600 versus 107,400. Over here... wait.

  • Well, I think it's, like two million, right?

  • Am I blind?

  • That's about two million something.

  • And then this one is again about two million something.

  • Um, so if this AI was on a multiple-choice test,

  • it'd probably do okay.

  • Um, and yes, so at least it's getting there.

  • It's getting closer to things.

  • Right.

  • Um, and then let's just go ahead and jump it all the way to the latest, which is 722,125. So 722... Make it nice and big here.

  • Okay, so in this case, this was 0.1... I forget if I actually rounded it.

  • No, not really rounded.

  • So at least in this case, it was only 1%.

  • And then, like the best it's ever got is like 1.2, not a huge difference.

  • Um, but anyway, we can already see here.

  • I forget which one it was, historically, that seemed to always get the magnitude wrong.

  • It was one of these shorter ones down here.

  • Well, this one looks pretty bad.

  • Anyway, we can continue to kind of, like, look at some of these, and you should just be able to see. But also, if we look back at the... well, I'll pull it back up here in a minute,

  • but the TensorBoard log of the differences and all that, like, it gets pretty good.

  • I mean, it's pretty shocking to me that over time it's able to slowly hone in.

  • Um, I'm not really sure, you know...

  • First of all, this model has been training for a week, at least, on Paperspace.

  • Um, so I probably need to stop it at some point.

  • It's a costly, costly endeavor, but I've already dropped the learning rate quite a bit.

  • I'm not really sure I want to keep dropping it, but at the same time,

  • it does appear to me to continue to keep learning.

  • And I'd really like to see how far we could take this model, so we'll see.

  • But I don't think that, like, I don't think we're going to suddenly jump up to above 5%.

  • So I think the next step is to come up with a superior model that will learn quicker than this one does.

  • So, like, in the first, let's say, 10,000 or 100,000 steps, has it made more progress than this model?

  • If so, I'll let that one continue.

  • We'll see where we can get it.

  • So, um, anyway, that's kind of my plan going forward.

  • But it takes such a long time to keep playing around with these models.

  • So I'm just gonna kind of keep playing with them.

  • I probably won't make an update video every time I poke around. Um, and in the meantime, we're gonna probably jump into a different topic entirely.

  • I'd like to play around with sound, so there's all kinds of fun things that we can do with generating sound. So that's kind of what I'm gonna be looking into next.

  • If you guys have any suggestions about what kind of model, size and shape and stuff that we should use here, by all means make the suggestion.

  • This is also a bidirectional neural network.

  • I don't know if that's actually beneficial for us to be doing, so

  • we could try to take that away.

  • Let me just pull up the settings for this one real quick.

  • Um, settings.

  • So yeah, the vocab size, 18.

  • Test size, 500.

  • These were the epochs that I set up here.

  • Um, yes.

  • So in this case, it's a 10 by 512, so quite the large model.

  • It might even be the case that 10 by 512 is too big.

  • So a lot of times you make the model too big.

  • It's just too challenging to actually learn.

  • So probably the first step I would do is actually shrink the size of the network and see if that helps us at all.

  • So that's probably the first thing that I'm gonna actually do.

  • But anyway, I'm actually pretty excited about this result.

  • I think that's really cool.

  • Ah, that it's capable of learning, um, a pretty complex...

  • There's not a direct, like, relation.

  • There's so much going on in the input to this, and that it can slowly actually get more and more accurate,

  • um, especially given the fact that we have so many inputs and so many plausible outputs, especially, like, the number of combinations that you could have here.

  • Um, it can't be brute force, right?

  • The fact that it can actually learn, um, this style of math is very intriguing.

  • That's cool.

  • I mean, I just I think that's really cool, and I think it would be super cool if you could get 100% accuracy or even, like, 50%.

  • I'd be like, ecstatic with finding a model that could get even, like, 50% accuracy on this kind of an equation.

  • That'd be pretty cool.

  • And then eventually, if I could get a good math model going, I'd like to start playing around with weaker forms of encryption,

  • just to see, one,

  • can a neural network reliably do encryption, like, make a hash or something that you could actually rely on?

  • Uh, and then also, could it break,

  • um, less strong forms of encryption or hashes? Anyway, that's it for now.

  • If you guys have suggestions, I doubt anybody else is running one of these models.

  • But if you do and you happen to find something that's a little more accurate.

  • Let me know.

  • Anyways, that's it for now.

  • I hope you guys have enjoyed.

  • Like I said, we're probably gonna get into sound and stuff like that

  • Next.

  • That way more people could hopefully follow along.

  • I just think this one is clearly a very challenging task for a neural network, whereas with sound, we can definitely get more into something that everybody can do on their computer as well.

  • So anyways, that's what we're gonna be doing in the coming tutorials.

  • Hope you guys are enjoying the series.