Name: 複雜數學結果--非常規神經網絡第12頁 (Complex Math Results - Unconventional Neural Networks p.12)
Uploaded: 2021-01-14T12:27:56.000Z
Duration: 16 min 26 s
Description: 【看影片學英語】數萬部 YouTube 影片，搭配英漢字典即點即查，輕鬆掌握單字發音與用法，長久累積看電影不必再看字幕。

現在進行式

Everybody, welcome back to another unconventional neural networks tutorial in this video, what we're gonna be doing is going over the results from arm or complex mathematical models.

過去簡單式

So up to this point, we've been using a character level sequence sequence model to do at least addition, which we found was 100% accurate.

Uh, when we did the inference testing, we noticed there was some weird inconsistencies there.

What was going on is what with the chat, but we have an additional scoring mechanism that's like a rule based scoring mechanism that sits on top of the, uh, output.

Depending on being with weaken, we can actually have a number of outputs from the Chapa, and then we can pick which one we want to use.

And that's what we were doing with the addition as well.

But with the scoring mechanism, one of the main things with scoring as we tend to like longer results and we penalize shorter ones.

So in the model where we were doing, like five plus two or whatever as producing a very short results, and the network was kind of being scored negatively for that, and that's why we had that inconsistency.

Now we could go through and make a difference.

I You're only doing, like, one at a time, whereas the output that is truly 500 or 100 out of sample tests or however many you wanna have.

So to me, that's that's the way to go anyways, when we're just doing research.

So that's why I've just decided to keep it.

But that's why we had that, like, strange and consistency there with 100% accuracy, and then suddenly we're finding instances where it's not.

So the 1st 1 that we have here is the multiple operators.

So if you in case you forget, we can go into data and check it out.

We don't really need to look at two, but anyways, so this case is a subtraction.

We got a division problem and the reason why they're spaces here.

Hence the character level sequence sequence here.

Um, anyway, all the different variants and we're ready to rumble.

So I'm just going to minimize this and a couple things.

All the models, including the 100% accurate addition model, uh, the this model here, the multiple operators and then finally, the much more complex math model.

All of them I'm gonna upload to python programming, and that's so you can kind of play with them.

I'll try to remember to upload them with the settings file so that you could run it, actually.

But if I forget, someone remind me and I'll throw the settings into it.

So this is the result, basically, right up here, it takes me forever to scroll this page.

Hopefully it'll there it goes, um, rather than doing it via the Python files with the output.

Dev, What we did here is this time it's actually just tied into tense aboard that so we can track these along the way.

So So all of these charts here are over time from start to the very end of what the difference was so obviously like with, especially with multiplication, um, the magnitude of being wrong on multiplication is very much different than the magnitude of probably being wrong with, like, addition or subtraction.

So we didn't want to just have a general.

How absolute valued off are We, um, necessarily like we did.

About these individually as well, just to see if they're all learning or what's going wrong.

There's so many plots here, though, that it is really, really Laghi.

So I don't want to scroll up, but anyways, it's a dare.

Um, and then what we have here is so the total difference, But then also the total accuracy.

So as you can see, I took a little bit, and suddenly it jumps up.

Probably learning rate decreased their, um, and then eventually just kind of leveled out despite learning rate decreases.

So, um, I just kind of stopped the model at this point.

Maybe it was gonna learn something, Maur I'm not really sure, but I'm actually pretty impressed with a 45% accuracy.

So, uh, anyways, so that's that I don't really want a harp too long on this one because this one is not as cool to me.

I'm pretty confident we could come up with some sort of model that would learn this 100%.

But what I really wanted to do was try farm or complex types of math than just this.

Because if we could solve the far more complex math with some sort of model, then that should also solve this problem.

So let's move on to the far more complex variant here.

And let me just pull down our paper space.

And I guess we'll just have to keep it this way.

So this is the these air, like all the charts, basically for the more complex one.

So let me just pull up, um, a test from so, in fact that this So I'm gonna pull up one.

That's not token eyes, because I just It's harder for us to read it, So I'm just gonna go into new data instead test from oh, man.

Okay, so this is These are the types of operations now that we're trying to solve.

So not only do you have multiple types of operators, we've got multiple lengths of sequences, although these air all five interestingly enough, but this one's there's a least this one's for long.

But we also have, like, parentheses in here and different operations in the print sees.

You can see there's quite a bit of embedding going on in this one.

If we could solve this kind of math, that would be impressive.

So, uh, so now that was that's the input, basically to the model this time.

And obviously the output of the, uh, you know that to file is just the answer to these, which is not that interesting.

After I think we're on step like 700,000 something.

It looks like it's about, you know, maybe 7 50,000 where we at 7 41 So 741 that was in steps, and we're tracking all the same things.

We're tracking the ads, the dibs, the mole's subs and then the total difference.

Now the total there is a little hard to CSOs moles.

Ah, one option we have is to try to zoom in a little bit.

These values were just so freaking huge that they confused the chart anyway, So you can zoom in a little bit and start to kind of get an idea of what that graph looks like.

Also, the total difference one I can try to zoom in.

So it's so Laghi with this thing's been up for so long.

So anyway, you can at least get the idea they are declining over time.

Now, in the grand scheme of things with the fact that this is a character level secrets sequence, in theory, it could be an infinite number of outputs.

Getting 1.2% accuracy is pretty exciting to me.

I actually didn't think that this would work at all.