  • When Mark Zuckerberg isn't wakesurfing wearing a tuxedo and a puka shell necklace at his Lake Tahoe mansion crushing Coors' yellow bellies and waving the American flag, he clocks into work with a sunburn to battle Google and OpenAI for artificial intelligence supremacy.

  • Yesterday, Meta released its biggest and baddest large language model ever, which also happens to be free and arguably open source.

  • It took months to train on 16,000 Nvidia H100 GPUs, which likely cost hundreds of millions of dollars and used enough electricity to power a small country.

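For a sense of scale, some napkin math (assuming roughly 700 W per H100 and a ~90-day run; both are assumptions, not Meta's published figures):

```python
# Back-of-envelope power math; 700 W per GPU and 90 days of wall-clock
# training are assumptions, not Meta's published numbers.
gpus = 16_000
watts_per_gpu = 700
days = 90

megawatts = gpus * watts_per_gpu / 1e6
gigawatt_hours = megawatts * 24 * days / 1_000

print(f"~{megawatts:.1f} MW sustained draw")     # ~11.2 MW
print(f"~{gigawatt_hours:.0f} GWh for the run")  # ~24 GWh
```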

  • But the end result is a massive 405 billion parameter model with a 128,000 token context length, which according to benchmarks is mostly superior to OpenAI's GPT-4o and even beats Claude 3.5 Sonnet on some key benchmarks.
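
For reference, a token is roughly a short word fragment. A quick way to see what a 128,000-token window is counting, using OpenAI's tiktoken library as a stand-in (Llama 3.1 ships its own tokenizer, so exact counts will differ):

```python
# Token counting with tiktoken as a stand-in tokenizer; Llama 3.1 uses
# its own tokenizer, so real counts differ slightly.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Meta released its biggest and baddest large language model ever."
print(len(enc.encode(text)))  # ~12 tokens; the context window holds 128,000
```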

  • But benchmarks lie, and the only way to find out if a new model is any good is to vibe with it.

  • In today's video, we'll try out Llama 3.1 Heavy and find out if it actually doesn't suck like most Meta products.

  • It is July 24th, 2024, and you're watching The Code Report.

  • AI hype has died down a lot recently, and it's been almost a week since I've mentioned it in a video, which I'm extremely proud of.

  • But Llama 3.1 is a model that cannot be ignored.

  • It comes in three sizes, 8B, 70B, and 405B, where B refers to billions of parameters, or the variables that the model can use to make predictions.

  • In general, more parameters can capture more complex patterns, but more parameters doesn't always mean that the model is better.

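To make "parameters" concrete, here is a tiny PyTorch sketch that counts the learnable weights in a toy model; the 8B/70B/405B labels count exactly this quantity:

```python
# Every weight and bias in a model is a parameter; B = billions of them.
import torch.nn as nn

toy_model = nn.Sequential(
    nn.Linear(4096, 4096),  # 4096*4096 weights + 4096 biases
    nn.ReLU(),
    nn.Linear(4096, 4096),
)

n_params = sum(p.numel() for p in toy_model.parameters())
print(f"{n_params:,}")  # 33,562,624 -- about 0.03B; Llama 3.1 tops out at 405B
```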

  • GPT-4 has been rumored to have over 1 trillion parameters, but we don't really know. The cool thing about Llama is that it's open source.

  • Well, kind of.

  • You can make money off of it, as long as your app doesn't have 700 million monthly active users, in which case you need to request a license from Meta.

  • What's not open source is the training data, which might include your blog, your GitHub repos, all your Facebook posts from 2006, and maybe even your WhatsApp messages.

  • What's interesting is that we can take a look at the actual code used to train this model, which is only 300 lines of Python and PyTorch, along with a library called FairScale to distribute training across multiple GPUs.

  • It's a relatively simple decoder-only transformer, as opposed to the mixture-of-experts approach used in a lot of other big models, like its biggest open source rival, Mixtral.

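As a rough sketch of what "decoder-only transformer" means (an illustration, not Meta's actual training code), each layer is just causal self-attention followed by a feed-forward network:

```python
# A minimal decoder-only transformer block in PyTorch; an architectural
# sketch, not the real Llama code (which uses RMSNorm, RoPE, and FairScale).
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: True entries are blocked, so each token only
        # attends to itself and earlier tokens.
        seq = x.size(1)
        mask = torch.triu(torch.ones(seq, seq, dtype=torch.bool), diagonal=1)
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out
        return x + self.ffn(self.norm2(x))

x = torch.randn(1, 16, 512)     # (batch, sequence, embedding)
print(DecoderBlock()(x).shape)  # torch.Size([1, 16, 512])
```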

  • Most importantly though, the model weights are open, and that's a huge win for developers building AI-powered apps.

  • Now you don't have to pay a bunch of money to use the GPT-4 API, and instead can self-host your own model and pay a cloud provider a bunch of money to rent some GPUs.

  • The big model would not be cheap to self-host.

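Some napkin math on why, assuming two bytes per parameter at 16-bit precision:

```python
# Why 405B parameters will not fit on a consumer GPU (which tops out
# around 24 GB of VRAM on an RTX 4090).
params = 405e9
print(params * 2 / 1e9)    # 810.0 GB at 16-bit precision
print(params * 0.5 / 1e9)  # 202.5 GB at 4-bit quantization -- in the
                           # ballpark of the ~230 GB download mentioned next
```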

  • I used Ollama to download it and run it locally, but the weights weigh 230 gigabytes, and even with an RTX 4090, I wasn't able to ride this llama.
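
The smaller models do run this way; a minimal sketch with the ollama Python client (`pip install ollama`), assuming the default 8B tag has already been pulled with `ollama pull llama3.1`:

```python
# Chatting with a locally hosted Llama 3.1 through the ollama Python
# client; the default "llama3.1" tag is the 8B model, not the 405B.
import ollama

response = ollama.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "Can you ride a llama?"}],
)
print(response["message"]["content"])
```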

  • The good news though is that you can try it for free on platforms like Meta.ai, Groq, or Nvidia's Playground.

  • Now the initial feedback from random weirdos on the internet is that Big Llama is somewhat disappointing, while the smaller Llamas are quite impressive.

  • But the real power of Llama is that it can be fine-tuned with custom data, and in the near future, we'll have some amazing uncensored fine-tuned models like Dolphin.
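
A hedged sketch of what that fine-tuning typically looks like with Hugging Face transformers plus peft (the model id and target module names are assumptions based on the public 8B checkpoint, and the dataset and training loop still need wiring up):

```python
# LoRA fine-tuning setup for Llama 3.1 8B; a sketch under assumptions,
# not a complete training script.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Meta-Llama-3.1-8B"  # gated repo; requires accepting the license
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

lora = LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # trains well under 1% of the 8B weights
```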

  • My favorite test for a new LLM is to ask it to build a Svelte 5 web application with runes, which is a new yet-to-be-released feature.

  • The only model I've seen do this correctly in a single shot is Claude 3.5 Sonnet, and Llama 405B failed pretty miserably, and seems to have no awareness of this feature.

  • Overall though, it is pretty decent at coding, but still clearly behind Claude.

  • I also had it do some creative writing and poetry, and overall it's pretty good, just not the best I've ever seen.

  • If we take a minute to reflect though, what's crazy is that we have multiple different companies that have trained massive models with massive computers, and they're all plateauing at the same level of capability.

  • OpenAI was the first to make a huge leap from GPT-3 to GPT-4, but since then, it's only been small incremental gains.

  • Last year, Sam Altman practically begged the government to regulate AI to protect humanity, but a year later, we still haven't seen the apocalyptic Skynet human extinction event that they promised us.

  • I mean, AI still hasn't even replaced programmers.

  • It's like that time airplanes went from propellers to jet engines, but the advancement to lightspeed engines never happened.

  • When talking about LLMs, artificial superintelligence is still nowhere in sight, except in the imagination of the Silicon Valley mafia.

  • It feels wrong to say this, but Meta is really the only big tech company keeping it real in the AI space.

  • I'm sure there's an evil ulterior motive hidden in there somewhere, but Llama is one small step for man, one giant leap for Zuckerberg's redemption arc.

  • This has been The Code Report, thanks for watching, and I will see you in the next one.
