LAURENCE MORONEY: Hi, and welcome to episode 5 of our Natural Language Processing with TensorFlow series. In this video, we're going to take a look at how to manage the understanding of context in language across longer sentences, where the impact of a word early in the sentence can determine the meaning and semantics of the end of the sentence. We'll use something called an LSTM, or Long Short-Term Memory, to achieve this.

So, for example, if we're predicting text and the text looks like this-- today has a beautiful blue something-- it's easy to predict that the next word is probably sky, because we have a lot of context close to the word, most notably the word blue. But what about a sentence like this one-- I lived in Ireland, so I learned how to speak something? How do we predict the something? The correct answer, of course, is Gaelic, not Irish, but that's close enough. You and I can interpret that, but how do we do it? What's the keyword that determines this answer? Of course, it's the word Ireland, because in this case, the country determines the language. But that word is very far back in the sentence, so when using a recurrent neural network, this might be hard to achieve. Remember, the recurrent neural networks we've been looking at are a bit like this, where there's a neuron that can learn something and then pass context to the next timestep. But over a long distance, this context can be greatly diluted, and we might not be able to see how meanings in faraway words dictate overall meaning.

The LSTM architecture might help here, because it introduces something called a cell state, which is a context that can be maintained across many timesteps, and which can bring meaning from the beginning of the sentence to bear. It can learn that Ireland denotes Gaelic as the language. What's fascinating is that it can also be bidirectional, where later words in the sentence can also provide context to earlier ones, so that we can learn the semantics of the sentence more accurately. I won't go into the specifics of LSTMs in this video, but if you want to learn how they work in depth, the Deep Learning Specialization from DeepLearning.AI is a great place to go.

So we've seen in theory how they work. But what does this look like in code? Let's dive in and take a look. Let's consider how we would use an LSTM in a classifier like the sarcasm classifier we saw in an earlier video. It's really quite simple. We first define that we want an LSTM-style layer. This takes a numeric parameter for the number of hidden nodes within it, which is also the dimensionality of the output space from the layer. If you want it to be bidirectional, you can then wrap this layer in a Bidirectional layer like this, and you're good to go. Remember that this will look at your sentence forwards and backwards, learn the best parameters for each direction, and then merge them. It might not always be best for your scenario, but it is worth experimenting with.
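As a concrete illustration, here is a minimal sketch of the kind of model being described, written with the Keras Sequential API. The vocabulary size, embedding dimension, sequence length, and the 24-unit dense layer are placeholder choices rather than figures confirmed by the video, so adjust them to match your own tokenized data.

```python
import tensorflow as tf

# Placeholder hyperparameters -- illustrative values, not the exact ones from the video.
vocab_size = 10000
embedding_dim = 64
max_length = 120

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_length,)),
    # Map integer word indices to dense vectors.
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    # An LSTM with 64 hidden units, wrapped in Bidirectional so the sequence
    # is read forwards and backwards and the two results are merged.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(24, activation='relu'),
    # Single sigmoid output for a binary label such as sarcastic / not sarcastic.
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
```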
LSTMs can use a lot of parameters, as a quick look at this model summary will show you. Note that the output of the LSTM layer is 128, because we're using a bidirectional wrapper around 64 units-- 64 in each direction. You can, of course, also stack LSTM layers so that the outputs of one layer get fed into the next, a lot like with dense layers. Just be sure to set return_sequences to true on every layer that feeds another one. So in a case like this, where we have two, the first should have it. If you have three LSTM layers stacked, the first two should have it, and so on. And a summary of this model will show the extra parameters that the extra LSTMs give.
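Here is a sketch of how that stacking might look, reusing the placeholder values from the snippet above. The width of the second LSTM is just an illustrative choice; the key detail is return_sequences=True on the first LSTM, so it passes one output vector per timestep to the layer that follows, rather than only its final output.

```python
stacked_model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_length,)),
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    # The first LSTM must return the full output sequence so the next LSTM
    # receives a vector for every timestep.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    # The last LSTM in the stack only needs to return its final output.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(24, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

stacked_model.summary()
```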
So now you've seen the basics of recurrent neural networks, including long short-term memory ones. You've also seen the steps in pre-processing text for training a neural network. In the next video, you'll put all of this together and start with a very simple neural network for predicting, and thus creating, original text. I'll see you there. And for more videos on AI and TensorFlow, don't forget to hit that Subscribe button. [MUSIC PLAYING]