
Subtitles auto-generated by AI
  • (upbeat music)

  • - There has been a lot of news about ChatGPT lately,

  • like people using ChatGPT to write essays,

  • ChatGPT hitting a hundred million users,

  • Google launching Bard to compete against ChatGPT,

  • and Microsoft integrating ChatGPT

  • into all their products, and also the viral sensation

  • of CatGPT, where it can answer all of your queries,

  • but as a cat, meow, meow, meow, meow, meow, meow.

  • ChatGPT, if you don't know already, it's a chatbot

  • by OpenAI where you can ask it many things.

  • For example, explaining complex topics,

  • like explain why I'm a disappointment to my parents,

  • or ask it more technical questions like,

  • how do I inherit more money than my brother from my parents?

  • A lot of people are using it to write essays, draft emails,

  • and even write code.

  • So I tried it myself, of course, as a YouTuber obviously,

  • my first question to it was, who is Joma Tech?

  • And it answered...

  • Are you fucking--

  • You know, ChatGPT has a lot of limitations,

  • like here we ask it to name colors

  • that don't have the letter E in them,

  • and this is what it gave us.

  • Orang, yllow, red, that's clearly wrong.

  • In all seriousness,

  • this is to demonstrate how ChatGPT works.

  • It's a pre-trained large language model,

  • meaning it was trained on text data

  • from the internet until the end of 2021.

  • So it won't know anything

  • about things that happened recently.

  • It doesn't have access to the internet.

  • It'll only predict the answer based

  • on what it has consumed already,

  • and the way it answers your question is

  • by predicting each word that comes next.
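
That "predict the next word" loop can be sketched with a toy example. This is not how GPT works internally (GPT uses a neural network over tokens, not a word-frequency table); it's a minimal bigram sketch of autoregressive generation, with a made-up mini corpus:

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; real models train on terabytes of web text.
corpus = ("chatgpt is a language model . a language model predicts "
          "the next word . the next word completes your sentence .").split()

# Count how often each word follows each other word (a bigram table).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def generate(start, n):
    """Autoregressive loop: repeatedly predict the most likely next word."""
    words = [start]
    for _ in range(n):
        candidates = follows[words[-1]].most_common(1)
        if not candidates:
            break
        words.append(candidates[0][0])
    return " ".join(words)

print(generate("a", 2))  # "a language model"
```

The model never "looks up" an answer; it just keeps extending the text with whatever word is most likely given what came before, which is also why it confidently produces wrong answers.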

  • For example, if you ask GPT who Bard is,

  • it's not going to know.

  • You might ask, Joma, didn't your channel launch in 2017,

  • and ChatGPT was trained on internet data until 2021,

  • yet it doesn't know who you are?

  • Yeah, so there's actually a technical reason,

  • and fuck you.

  • Recently ChatGPT hit a hundred million users.

  • It launched November 30th, 2022,

  • and this article came out February 3rd, 2023.

  • So it took two months to hit a hundred million users.

  • Who are these users and what are they doing with ChatGPT?

  • Well, it's pretty obvious, they're cheating with it.

  • Everybody's cheating, so much so that

  • some school districts have banned access to ChatGPT.

  • If it can write essays, then it can pass exams.

  • ChatGPT was able to pass exams from law school,

  • business school, and medical school.

  • Three prestigious industries.

  • Now, this is why I went into coding,

  • because I always thought that law school,

  • business school, and medical school

  • were too much about memorization

  • and you're bound to get replaced,

  • it just wasn't intellectual enough, you know?

  • All right, well,

  • I guess engineering is getting replaced, too.

  • ChatGPT passes the Google coding interview,

  • which is known to be hard, but I guess not.

  • But note that it is for an L3 engineer,

  • which means it's entry level; for those not in tech,

  • there's no L2 and L1, it starts at L3,

  • but this does raise questions about ChatGPT's ability

  • to change engineering jobs going forward,

  • and we're already seeing the change,

  • as Amazon employees are already using ChatGPT

  • for coding, even though immediately after,

  • they were told to stop, with a warning not

  • to share confidential information with ChatGPT.

  • What's happening is they're feeding ChatGPT

  • internal documents, which are confidential,

  • but OpenAI stores all that data.

  • You know, it reminds me of when I used to intern

  • at Microsoft and they didn't let us use Google

  • for searches, because they thought Google might spy on us.

  • I was like, relax, I'm an intern.

  • I'm not working on anything important.

  • In fact, I actually wasn't working at all.

  • You know, I was playing Overwatch all day,

  • but yeah, anyways, they forced us to use Bing for searches.

  • One thing that's being underreported

  • in mainstream media is the success of GitHub Copilot.

  • It's probably the most useful

  • and most well-executed AI product currently out there.

  • Have I used it?

  • No, I haven't coded in forever.

  • Now, here's how it works.

  • The moment you write your code,

  • it's like autocomplete on steroids; like in this example,

  • it helps you write the whole drawScatterplot function,

  • and it knows how to use the D3 library correctly.

  • Another example here, you can write a comment

  • explaining what you want your function to do

  • and it'll write the code for you.

  • Sometimes even the name

  • of the function will give it enough information

  • to write the rest of the code for you.

  • It's very powerful

  • because it's able to take your whole code base as context

  • and, with that, make more accurate predictions.

  • For example, if you're building a trading bot

  • and you write the function get_tech_stock_prices,

  • it'll suggest, hey, I know you're going

  • through a rough time,

  • but building a trading bot is not going

  • to fix your insecurities, and maybe you should just accept

  • that you'll be a disappointment for the rest of your life.

  • Okay.

  • How did all of this happen?

  • Why is AI so good suddenly?

  • The answer is the transformer model,

  • which caused a paradigm shift

  • in how we build large language models, or LLMs.

  • By the way, this diagram means nothing to me.

  • It makes me look smart, so that's why I put it on there.

  • Before transformers,

  • the best natural language processing systems used RNNs,

  • and then LSTMs,

  • but then Google Brain published a paper

  • in 2017 called "Attention Is All You Need,"

  • which is also my life's motto, because I'm a narcissist.

  • The paper proposes a simple neural network model

  • they call the transformer, which is based

  • on the self-attention mechanism,

  • which I don't fully understand, so I'll pretend

  • like I don't have time to explain it,

  • but I also know that it allows for more parallelization,

  • which means you can throw more hardware,

  • more GPUs, at it to make your training go faster,

  • and that's when things got crazy.
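
The self-attention step he's skipping over can be sketched in plain Python. This is a deliberately simplified single-head version with no learned weight matrices (real transformers project queries, keys, and values through trained parameters); the point it shows is that every token's output is computed from all tokens at once, so the rows are independent and can run in parallel, unlike an RNN's step-by-step loop:

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens."""
    d = len(tokens[0])
    output = []
    for q in tokens:  # each row is independent -> parallelizable on GPUs
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights = softmax(scores)  # how much this token attends to each token
        # Output is a weighted average of all token vectors.
        output.append([sum(w * v[i] for w, v in zip(weights, tokens))
                       for i in range(d)])
    return output
```

Each output vector is a blend of every input vector, weighted by dot-product similarity, and no step has to wait for the previous token to finish.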

  • They kept adding more data and also added more parameters,

  • and the model just got better.

  • So what did we do?

  • We made bigger models with more parameters

  • and shoved a shit ton of data into them.

  • Sorry, I'm trying my best here to make the model bigger.

  • All right, fuck it.

  • Anyway, that gave us ready-to-use

  • pre-trained transformer models like Google's BERT

  • and OpenAI's GPT, the generative pre-trained transformer.

  • They crawled the whole web to get text data

  • from sources like Wikipedia and Reddit.

  • This graph shows you how many parameters each model has.

  • So as you can see, we've been increasing the number

  • of parameters exponentially.
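
The exponential growth he's pointing at on the graph is easy to see from the approximate published parameter counts of the GPT series:

```python
# Approximate published parameter counts for OpenAI's GPT series.
params = {
    "GPT-1": 117e6,   # ~117 million (2018)
    "GPT-2": 1.5e9,   # ~1.5 billion (2019)
    "GPT-3": 175e9,   # ~175 billion (2020)
}

# Each generation is one to two orders of magnitude larger than the last.
print(params["GPT-2"] / params["GPT-1"])  # roughly 13x
print(params["GPT-3"] / params["GPT-2"])  # roughly 117x
```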

  • So OpenAI kept improving their GPT model,

  • like how Goku kept becoming stronger each time

  • he reached a new Super Saiyan form.

  • While editing this,

  • I realized how unhelpful the "Dragon Ball" analogy was.

  • So I want to try again.

  • To recap, the transformer was the model architecture,

  • a type of neural network.

  • Other types of models would be RNNs and LSTMs.

  • Compared to RNNs, transformers don't need

  • to process words one by one,

  • so they're way more efficient at training with lots of data.

  • OpenAI used the transformer model and pre-trained it

  • by feeding it a bunch of data from the internet,

  • and they called that pre-trained model GPT-1.