“There’s a high chance we’ll be surprised again by AI” | Unconfuse Me with Bill Gates

  • Are you surprised at the advances

  • that have come in the last several years?

  • Oh, yes, definitely. I didn’t imagine

  • it would become this impressive.

  • What’s strange to me

  • is that we create these models,

  • but we don’t really understand

  • how the knowledge is encoded.

  • To see what’s in there,

  • it’s almost like a black box,

  • although we see the innards,

  • and so understanding why it does so well,

  • or so poorly, we’re still pretty naive.

  • One thing I’m really excited about

  • is our lack of understanding

  • of both types of intelligence,

  • artificial and human intelligence.

  • It really opens new intellectual problems.

  • There’s something odd

  • about how these large language models,

  • that we often call LLMs,

  • acquire knowledge in such an opaque way.

  • They can perform some tests extremely well,

  • while surprising us

  • with silly mistakes somewhere else.

  • It’s been interesting that,

  • even when it makes mistakes,

  • sometimes if you just

  • change the prompt a little bit,

  • then all of a sudden, it gets it right.

  • So even that boundary is somewhat fuzzy,

  • as people play around.

  • Totally.

  • Quote-unquote "prompt engineering"

  • became a bit of a black art

  • where some people say that you have to

  • really motivate the transformers

  • in the way that you motivate humans.

  • One custom instruction that I found online

  • was supposed to be about

  • how you first tell LLMs,

  • “you are brilliant at reasoning,

  • you really think carefully,”

  • then somehow the performance is better,

  • which is quite fascinating.
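To make this concrete, here is a minimal sketch of that kind of “motivational” prompting, assuming the OpenAI Python client (openai >= 1.0); the model name and the instruction wording are illustrative placeholders, not the specific custom instruction mentioned above:

```python
# A minimal sketch of "motivational" prompt engineering as described above.
# Assumptions: the official OpenAI Python client (openai >= 1.0); the model
# name and the system-message wording are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = ("A bat and a ball cost $1.10 in total. The bat costs $1.00 "
            "more than the ball. How much does the ball cost?")

# Plain prompt: just ask the question.
plain = [{"role": "user", "content": question}]

# "Motivated" prompt: prepend a system message telling the model it is
# brilliant at reasoning and should think carefully.
motivated = [
    {"role": "system", "content": "You are brilliant at reasoning. "
                                  "Think step by step and check your work carefully."},
    {"role": "user", "content": question},
]

for label, messages in [("plain", plain), ("motivated", motivated)]:
    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    print(label, "->", reply.choices[0].message.content)
```

The only difference between the two requests is the system message; gains from this kind of wording are anecdotal and vary by model and task, which is exactly why it reads as a black art.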

  • But I find two very divisive reactions

  • to the different results that you can get

  • from prompt engineering.

  • On one side, there are people

  • who tend to focus primarily

  • on the success case.

  • So long as there is one answer

  • that is correct, it means

  • the transformers, or LLMs,

  • do know the correct answer;

  • it’s your fault that

  • you didn’t ask nicely enough.

  • Whereas there is the other side,

  • the people who tend to focus

  • a lot more on the failure cases,

  • and therefore conclude that nothing works.

  • Both are some sort of extremes.

  • The answer may be

  • somewhere in between,

  • but this does reveal

  • surprising aspects of this thing. Why?

  • Why does it make

  • these kinds of mistakes at all?

  • We saw a dramatic improvement

  • from the models the size of GPT-3

  • going up to the size of GPT-4.

  • I thought of GPT-3 as kind of a funny toy,

  • almost like a random sentence generator

  • that I wrote 30 years ago.

  • It was better than that,

  • but I didn’t see it as that useful.

  • I was shocked that GPT-4,

  • used in the right way,

  • can be pretty powerful.

  • If we go up in scale,

  • say another factor of 10 or 20 above GPT-4,

  • will that be a dramatic improvement,

  • or a very modest improvement?

  • I guess it’s pretty unclear.

  • Good question, Bill.

  • I honestly don’t know

  • what to think about it.

  • There’s uncertainty,

  • is what I’m trying to say.

  • I feel there’s a high chance

  • that we’ll be surprised again,

  • by an increase in capabilities.

  • And then we will also be really surprised

  • by some strange failure modes.

  • More and more, I suspect that

  • the evaluation will become harder,

  • because people tend to have a bias

  • towards believing the success case.

  • We do have cognitive biases in the way that

  • we interact with these machines.

  • They are more likely to be adapted

  • to those familiar cases,

  • but then when you really start trusting them,

  • they might betray you

  • with unexpected failures.

  • Interesting time, really.

  • One domain where it’s almost counterintuitive

  • that it’s not as good is mathematics.

  • You almost have to laugh that

  • something like a simple Sudoku puzzle

  • is one of the things that it can’t figure out,

  • whereas even humans can do that.
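For contrast, the systematic search that cracks a Sudoku is easy to state explicitly as an algorithm. Here is a minimal backtracking solver sketch in Python, using an illustrative 9x9 list-of-lists grid with 0 for empty cells:

```python
# A minimal backtracking Sudoku solver: the kind of explicit, systematic
# search that is trivial to write down yet hard for LLMs to carry out
# reliably step by step. The grid encoding (0 = empty) is illustrative.

def valid(grid, r, c, v):
    """Check whether digit v can be placed at row r, column c."""
    if v in grid[r]:                                # row clash
        return False
    if any(grid[i][c] == v for i in range(9)):      # column clash
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)             # top-left of the 3x3 box
    return all(grid[br + i][bc + j] != v
               for i in range(3) for j in range(3))

def solve(grid):
    """Fill the grid in place; return True if a solution was found."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:                     # next empty cell
                for v in range(1, 10):
                    if valid(grid, r, c, v):
                        grid[r][c] = v
                        if solve(grid):             # recurse on the rest
                            return True
                        grid[r][c] = 0              # undo: dead end
                return False                        # no digit fits: backtrack
    return True                                     # no empty cells left
```

solve() tries each digit in the next empty cell and undoes the choice on a dead end; it is exactly this kind of exhaustive, stateful search that current models fail to execute reliably in-context.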

  • Yes, it’s reasoning in general,

  • which humans are capable of,

  • that models like ChatGPT

  • are not as reliable at right now.

  • The reaction to that

  • in the current scientific community,

  • is a bit divisive.

  • On one hand, people might believe

  • that with more scale,

  • the problems will all go away.

  • Then there’s the other camp

  • who tend to believe that, wait a minute,

  • there’s a fundamental limit to it,

  • and there should be better, different ways

  • of doing it that are much more efficient.

  • I tend to believe the latter.

  • Anything that requires symbolic reasoning

  • can be a little bit brittle.

  • Anything that requires

  • factual knowledge can be brittle.

  • It’s not a surprise when you actually look at

  • the simple equation that we optimize

  • for training these large language models

  • because, really, there’s no reason why

  • suddenly such capability should pop out.
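For reference, the “simple equation” in question is, in standard formulations, just the next-token cross-entropy objective; written out, nothing in it explicitly demands reasoning or factual correctness:

```latex
% Standard autoregressive language-modeling loss: for a token sequence
% x_1 .. x_T, minimize the negative log-likelihood of each token given
% all the tokens before it.
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_1, \ldots, x_{t-1}\right)
```

Minimizing this loss only rewards predicting the training text; any reasoning or factual ability that emerges is a by-product rather than an explicit objective, which is the point being made here.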

  • I wonder if the future architecture may have

  • more of a self-understanding

  • of reusing knowledge in a much richer way

  • than just this forward-chaining

  • set of multiplications.
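That “forward-chaining set of multiplications” can be made concrete: the core of a transformer layer is a fixed pipeline of matrix products. Here is a minimal single-head causal self-attention pass in Python with NumPy, where the dimensions and random weights are illustrative only:

```python
# A minimal single-head causal self-attention forward pass: the
# "forward-chaining set of multiplications" referred to above.
# Dimensions and random weights are illustrative; real models stack
# many heads and layers and add MLPs, normalization, etc.
import numpy as np

rng = np.random.default_rng(0)
T, d = 8, 16                       # sequence length, model width
x = rng.normal(size=(T, d))        # token embeddings

# Three learned projections; here just random matrices for illustration.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv   # queries, keys, values

scores = q @ k.T / np.sqrt(d)      # pairwise attention logits
mask = np.triu(np.ones((T, T), dtype=bool), 1)
scores[mask] = -np.inf             # causal mask: no attending to the future

# Row-wise softmax turns logits into attention weights.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

out = weights @ v                  # weighted sum of values
print(out.shape)                   # (8, 16): same shape, fed to the next layer
```

Every step is a feed-forward multiplication; there is no explicit mechanism for storing or reusing an abstraction, which is the limitation being wondered about here.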

  • Yes, right now the transformers, like GPT-4,

  • can look at such a large amount of context.

  • They’re able to remember so many words

  • that were spoken just now.

  • Whereas humans, you and I,

  • we both have a very small working memory.

  • The moment we hear

  • new sentences from each other,

  • we kind of forget exactly

  • what you said earlier,

  • but we remember the abstract of it.

  • We have this amazing capability

  • of abstracting away instantaneously

  • and have such a small working memory,

  • whereas right now GPT-4

  • has enormous working memory,

  • so much bigger than us.

  • But I think that’s actually the bottleneck,

  • in some sense,

  • hurting the way that it’s learning,

  • because it’s just relying on the patterns,

  • a surface-level overlay of patterns,

  • as opposed to trying to abstract away

  • the true concepts underneath any text.

  • Subscribe to “Unconfuse Me” wherever you listen to podcasts.
