ML with Recurrent Neural Networks (NLP Zero to Hero - Part 4)

  • LAURENCE MORONEY: Hi, and welcome

  • to this episode in "Natural Language Processing, Zero

  • to Hero" with TensorFlow.

  • In the previous videos in this series,

  • you saw how to tokenize text and how to use sequences of tokens

  • to train a neural network.

  • In particular, you saw how to create a neural network that

  • classified text by sentiment.

  • And in this case, you trained a classifier

  • on sarcasm headlines.
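
As a refresher, here is a minimal sketch of that tokenization and padding step from the earlier videos, using the Keras Tokenizer and pad_sequences APIs; the example sentences and vocabulary size below are placeholders, not the course's actual dataset.

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Placeholder headlines standing in for the sarcasm dataset.
sentences = [
    "granny starting to fear spiders in the garden might be real",
    "the weather today is bright and sunny",
]

tokenizer = Tokenizer(num_words=100, oov_token="<OOV>")
tokenizer.fit_on_texts(sentences)                     # build the word index
sequences = tokenizer.texts_to_sequences(sentences)   # words -> token IDs
padded = pad_sequences(sequences, padding="post")     # pad to equal length
print(padded)
```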

  • But the next step I'm often asked when it comes to text

  • is, what about generating text?

  • Can a neural network create text based on the corpus

  • that it's trained on, and can we get an AI to write poetry?

  • Well, the answer to this is yes.

  • And over the next few videos, I'll

  • show you a simple example of how you can achieve this.

  • Before we can do that, though, an important concept

  • that you'll need to understand is recurrent neural networks.

  • This type of neural network takes the sequence of data

  • into account when it's learning.

  • So for example, in the case of a classifier for text

  • that we just saw, the order in which

  • the words appear in the sentence doesn't really matter.

  • What determined the sentiment was the vector that

  • resulted from adding up all of the individual vectors

  • for the individual words.

  • The direction of that vector roughly gave us the sentiment.
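
That order-insensitive classifier can be sketched roughly like this, assuming the Embedding plus GlobalAveragePooling1D model from the sarcasm video; the vocabulary size and embedding dimension here are placeholder values.

```python
import tensorflow as tf

vocab_size, embedding_dim = 10000, 16  # placeholder hyperparameters

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    # Averaging the word vectors collapses the sentence into a single
    # vector, so the order of the words no longer matters.
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(24, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam",
              metrics=["accuracy"])
```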

  • But if we're going to generate text, the order does matter.

  • For example, consider this sentence.

  • "Today the weather is gorgeous, and I see a beautiful blue"--

  • something.

  • If you were trying to predict the next word--

  • and the concept of creating text really

  • boils down to predicting the next word--

  • you'd probably say, "sky," because that

  • comes after "beautiful" and "blue,"

  • and the context is the weather, which

  • we saw earlier in the sentence.

  • So how do we fit this to neural networks?

  • Let's take a look at what's involved

  • in changing from sequence-less data to sequential data.

  • Neural networks for classification or regression

  • tend to look like this.

  • It's kind of like a function that you

  • feed in data and labels, and it infers the rules that

  • fit the data to the labels.

  • But you could also express it like this.

  • The f of data and labels equals the rules.

  • But there's no sequence inherent in this.
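
As a concrete but hedged illustration of that "rules = f(data, labels)" idea, here is the classic single-neuron example in Keras; the y = 2x - 1 data is an assumption borrowed from the start of this series, not something stated here, and there is indeed no sequence in it.

```python
import numpy as np
import tensorflow as tf

# Data and labels that happen to follow the rule y = 2x - 1 (assumed example).
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]).reshape(-1, 1)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]).reshape(-1, 1)

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mean_squared_error")
model.fit(xs, ys, epochs=500, verbose=0)   # infer the rule from data + labels

print(model.predict(np.array([[10.0]])))   # close to 19
```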

  • So let's take a look at some numeric sequences

  • and explore the anatomy of them.

  • And here's a very famous one called the Fibonacci sequence.

  • To describe the rules that make this sequence,

  • let's describe the numbers using a variable.

  • So for example, we can say n0 for the first number, n1

  • for the next, and so on.

  • And the rule that then defines the sequence

  • is that any number in the sequence

  • is the sum of the two numbers before it.

  • So if we start with 1 and 2, the next number

  • is 1 plus 2, which is 3.

  • The next number is 5, which is 2 plus 3, and so on.

  • We could also try to visualize it like this on a computation

  • graph.

  • If the function is plus, we feed in 1 and 2 to get 3.

  • We also pass this answer and the second parameter,

  • which in this case was 2, onto the next computation.

  • This gives us 2 plus 3, which is 5.

  • This gets fed into the next computation

  • along with the second parameter, so 5 and 3 get added to get 8,

  • and so on.
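
In code, that running computation might be sketched like this; the starting values 1 and 2 match the example above.

```python
def fibonacci_like(first, second, count):
    """Each new number is the sum of the two numbers before it."""
    sequence = [first, second]
    for _ in range(count - 2):
        sequence.append(sequence[-2] + sequence[-1])
    return sequence

print(fibonacci_like(1, 2, 8))  # [1, 2, 3, 5, 8, 13, 21, 34]
```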

  • So every number is in essence contextualized

  • into every other number.

  • We started with 1, and added it to 2 to get 3.

  • The 1 and the 3 still exist.

  • And when added to 2 again, we get 5.

  • That 1 still continues to exist throughout the series.

  • Thus, a numeric value can recur throughout the life

  • of the series.

  • And this is the basis of the concept

  • of a recurrent neural network.

  • Let's take a look at this type of network

  • in a little more detail.

  • Typically, a recurrent neuron is drawn like this.

  • There's a function that gets an input value that

  • produces an output value.

  • In addition to the output, it also

  • produces another feed-forward value that

  • gets passed to the next neuron.

  • So a bunch of them together can look like this.

  • And reading from left to right, we can feed x0 into the neuron,

  • and it calculates a result, y0, as well as a value that

  • gets passed to the next neuron.

  • That gets x1 along with the fed-forward value

  • from the previous neuron and calculates y1.

  • And its output is combined with x2

  • to get y2 and a feed-forward value to the next neuron,

  • and so on.

  • Thus, sequence is encoded into the outputs, a little bit

  • like the Fibonacci sequence.

  • This recurrence of data gives us the name

  • recurrent neural networks.
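
Here is a minimal sketch of that idea using Keras's SimpleRNN layer; the shapes are illustrative placeholders. With return_sequences=True you get an output for every step (y0, y1, y2, and so on), while internally each step also feeds its hidden state forward to the next one.

```python
import numpy as np
import tensorflow as tf

# One batch containing a single sequence of 4 time steps,
# each step a 3-value vector (placeholder shapes).
x = np.random.rand(1, 4, 3).astype("float32")

rnn = tf.keras.layers.SimpleRNN(5, return_sequences=True)
y = rnn(x)
print(y.shape)  # (1, 4, 5): one output vector per time step
```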

  • So that's all very well.

  • And you may have spotted a little catch

  • in how this could work with natural language processing.

  • A simple RNN like the one that I've just shown

  • is a bit like the Fibonacci sequence

  • in that the sequence can be very strong,

  • but it weakens as the context spreads.

  • The number at position 1 has very little impact

  • on the number at position 100, for example.

  • It's there, but it's tiny.

  • And that could be useful for predicting text

  • where the signal to determine the text

  • is close by, for example, the beautiful blue something

  • that we mentioned earlier.

  • It's easy for us to see that "sky" is the next word.

  • But what about a sentence like this?

  • "I lived in Ireland, so they taught me how to speak"--

  • something.

  • Now, you might think it's "Irish,"

  • but the correct answer is "Gaelic."

  • But think about how you predicted that word.

  • The key word that dictated it was much further back

  • in the sentence, and it's the word "Ireland."

  • If we were only predicting based on the words that

  • are close to the desired one, we'd miss that completely,

  • and we'd get a bad prediction.

  • The key there is to go beyond the very short-term memory

  • of a recurrent neural network with a longer short-term memory

  • and a network type not surprisingly

  • called long short-term memory, or LSTM.
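
As a preview, and purely as a hedged sketch, a model using such a layer in Keras might look like this; the layer sizes are placeholders, and the details are covered in the next video.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(10000, 16),
    # An LSTM layer can carry context across many time steps, so a word
    # like "Ireland" can still influence a prediction much later on.
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.summary()
```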

  • You'll see that in the next video,

  • so don't forget to hit that Subscribe

  • button for more great episodes of "Coding TensorFlow at Home."

  • [MUSIC PLAYING]
