LAURENCE MORONEY: Hi, and welcome to this episode in "Natural Language Processing, Zero to Hero" with TensorFlow. In the previous videos in this series, you saw how to tokenize text and how to use sequences of tokens to train a neural network. In particular, you saw how to create a neural network that classified text by sentiment. And in this case, you trained a classifier on sarcasm headlines.

But the question I'm often asked next when it comes to text is, what about generating text? Can a neural network create text based on the corpus that it's trained on, and can we get an AI to write poetry? Well, the answer to this is yes. And over the next few videos, I'll show you a simple example of how you can achieve this.

Before we can do that, though, an important concept that you'll need to understand is recurrent neural networks. This type of neural network takes the sequence of the data into account when it's learning. So for example, in the case of the classifier for text that we just saw, the order in which the words appear in the sentence doesn't really matter. What determined the sentiment was the vector that resulted from adding up all of the individual vectors for the individual words. The direction of that vector roughly gave us the sentiment.

But if we're going to generate text, the order does matter. For example, consider this sentence: "Today the weather is gorgeous, and I see a beautiful blue"-- something. If you were trying to predict the next word-- and the concept of creating text really boils down to predicting the next word-- you'd probably say "sky," because that comes after "beautiful" and "blue," and the context is the weather, which we saw earlier in the sentence.

So how do we fit this to neural networks? Let's take a look at what's involved in changing from sequence-less data to sequential data. Neural networks for classification or regression tend to look like this. It's kind of like a function that you feed in data and labels, and it infers the rules that fit the data to the labels. You could also express it as f(data, labels) = rules. But there's no sequence inherent in this.

So let's take a look at some numeric sequences and explore their anatomy. Here's a very famous one, called the Fibonacci sequence. To describe the rules that make this sequence, let's describe the numbers using a variable. So for example, we can say n0 for the first number, n1 for the next, and so on. The rule that then defines the sequence is that any number in the sequence is the sum of the two numbers before it. So if we start with 1 and 2, the next number is 1 plus 2, which is 3. The next number is 5, which is 2 plus 3, and so on.

We could also visualize it like this on a computation graph (there's a short code sketch of the same idea below). If the function is plus, we feed in 1 and 2 to get 3. We also pass this answer, along with the second parameter, which in this case was 2, on to the next computation. This gives us 2 plus 3, which is 5. This gets fed into the next computation along with its second parameter, so 3 and 5 are added to get 8, and so on.

So every number is, in essence, contextualized into every other number. We started with 1 and added it to 2 to get 3. The 1 and the 3 still exist. And when the 3 is added to 2 again, we get 5. That 1 still continues to exist throughout the series. Thus, a numeric value can recur throughout the life of the series. And this is the basis of the concept of a recurrent neural network. Let's take a look at this type of network in a little more detail.
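To make that carried-forward value concrete, here is a tiny Python sketch (my own illustration, not code from the video) of the computation graph just described: each step adds the latest result to the value passed along from the previous step, so the earliest numbers keep contributing to every later one.

```python
# A tiny sketch of the computation graph described above: each "plus" step
# adds the value carried forward from the previous step to the latest result,
# then both are passed on to the next step.

def fibonacci_steps(first, second, count):
    """Yield `count` values of the sequence seeded with `first` and `second`."""
    carried, current = first, second                     # e.g. 1 and 2
    for _ in range(count):
        carried, current = current, carried + current    # 1+2=3, 2+3=5, 3+5=8, ...
        yield current

print(list(fibonacci_steps(1, 2, 6)))                    # [3, 5, 8, 13, 21, 34]
```

Running it with the seeds 1 and 2 prints 3, 5, 8, 13, and so on, which is exactly the chain of additions walked through above.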
Typically, a recurrent neuron is drawn like this. There's a function that gets an input value and produces an output value. In addition to the output, it also produces another feed-forward value that gets passed to the next neuron. So a bunch of them together can look like this (there's a short code sketch of this loop at the end of this section). Reading from left to right, we can feed x0 into the neuron, and it calculates a result, y0, as well as a value that gets passed to the next neuron. That neuron gets x1 along with the fed-forward value from the previous neuron and calculates y1. Its output is combined with x2 to get y2 and a feed-forward value for the next neuron, and so on. Thus, the sequence is encoded into the outputs, a little bit like the Fibonacci sequence. This recurrence of data gives us the name recurrent neural networks.

So that's all very well, but you may have spotted a little catch in how this could work with natural language processing. A simple RNN like the one I've just shown is a bit like the Fibonacci sequence in that the sequence can be very strong, but it weakens as the context spreads. The number at position 1 has very little impact on the number at position 100, for example. It's there, but it's tiny. And that could be useful for predicting text where the signal that determines the text is close by-- for example, the "beautiful blue" something that we mentioned earlier. It's easy for us to see that "sky" is the next word.

But what about a sentence like this? "I lived in Ireland, so they taught me how to speak"-- something. Now, you might think it's "Irish," but the correct answer is "Gaelic." But think about how you predicted that word. The key word that dictated it was much further back in the sentence, and it's the word "Ireland." If we were only predicting based on the words that are close to the desired one, we'd miss that completely, and we'd get a bad prediction.

The key there is to go beyond the very short-term memory of a recurrent neural network to a longer short-term memory, using a network type not surprisingly called long short-term memory, or LSTM. You'll see that in the next video, so don't forget to hit that Subscribe button for more great episodes of "Coding TensorFlow at Home." [MUSIC PLAYING]
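As a postscript to the recurrent-neuron walkthrough above, here is a minimal, hand-rolled sketch of that loop in NumPy (my own illustration, not code from the video; the weight shapes and the tanh activation are assumptions). The point is just that each step combines the current input x_t with the state carried forward from the previous step, then passes the new state on, like the feed-forward value in the diagram.

```python
# A hand-unrolled "simple RNN" cell: the state carried between steps plays the
# role of the feed-forward value passed from neuron to neuron in the diagram.
import numpy as np

rng = np.random.default_rng(0)

input_dim, state_dim = 4, 3
W_x = rng.normal(size=(state_dim, input_dim)) * 0.1   # input -> state weights
W_h = rng.normal(size=(state_dim, state_dim)) * 0.1   # state -> state weights
b = np.zeros(state_dim)

def simple_rnn(sequence):
    """Run the cell over a list of input vectors x0, x1, x2, ..."""
    state = np.zeros(state_dim)          # the value fed forward between steps
    outputs = []
    for x_t in sequence:
        state = np.tanh(W_x @ x_t + W_h @ state + b)
        outputs.append(state)            # y0, y1, y2, ... (here the output is the state)
    return outputs

sequence = [rng.normal(size=input_dim) for _ in range(5)]
for t, y_t in enumerate(simple_rnn(sequence)):
    print(f"y{t} =", np.round(y_t, 3))
```

In practice you wouldn't write this loop yourself: Keras layers such as tf.keras.layers.SimpleRNN and tf.keras.layers.LSTM implement the recurrence as trainable layers, and the LSTM variant, with its longer memory, is the one the next video turns to.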