Automatically generated by AI
  • We're starting a series of new models with the new name o1, and this is to highlight the fact that you might feel different when you use o1 as compared to previous models such as GPT-4o.

  • So, as others will explain later, o1 is a reasoning model, so it will think more before answering your question.

  • We are releasing two models: o1-preview, which is to preview what's coming for o1, and o1-mini, which is a smaller and faster model that is trained with a similar framework as o1.

  • So, we hope you like our new naming scheme, o1.

  • So what is reasoning anyway?

  • So one way of thinking about reasoning is that there are times when we ask questions and we need answers immediately because they're simple questions.

  • For example, if you ask what's the capital of Italy, you know the answer is Rome and you don't really have to think about it much.

  • But if you wonder about a complex puzzle, or you want to write a really good business plan, or you want to write a novel, you probably want to think about it for a while.

  • And the more you think about it, the better the outcome.

  • So reasoning is the ability to turn thinking time into better outcomes, whatever the task you're doing.

  • It's been going on for a long time, but I think what's really cool about research is that there's that aha moment.

  • There's that particular point in time where something surprising happens and things really click together.

  • Are there any times for you all when you had that aha moment?

  • There was a first moment when the model was hot off the press.

  • We started talking to the model and people were like, "Wow, this model is really great," and started doing something like that.

  • And I think that there was a certain moment in our training process where we put more compute into RL than before and first trained a model generating coherent chains of thought.

  • And we saw, wow, this looks like something meaningfully different than before.

  • And I think for me, this is the moment.

  • I think, related to that, when we think about training a model for reasoning, one thing that immediately jumps to mind is you could have humans write out their thought process and train on that.

  • One aha moment for me was when we saw that if you train the model using RL to generate and hone its own chain of thought, it can do even better than having humans write chains of thought for it.

  • And that was an aha moment that you could really scale this and explore models reasoning that way.

  • For a lot of the time that I've been here, we've been trying to make the models better at solving math problems, as an example.

  • And we've put a lot of work into this.

  • And we've come up with a lot of different methods.

  • But one thing that I kept seeing: every time I would read these outputs from the models, I'd always be so frustrated that the model just would never seem to question what was wrong, or when it was making mistakes, or things like that.

  • But with one of these early o1 models, when we trained it and we actually started talking to it, we started asking it these questions, and it was scoring higher on these math tests we were giving it, and we could look at how it was reasoning.

  • And you could just see that it started to question itself and have really interesting reflection.

  • And that was a moment for me where I was like, "Wow, we've uncovered something different."

  • This is going to be something new.

  • And it was just one of these coming-together moments that was really powerful.

  • Thank you, and congrats on releasing this.
