  • You're called to create a post-apocalyptic giraffe astronaut.

    您的創作指令是:末世太空長頸鹿。

  • Generated.

    產生。

  • Genghis Khan playing a guitar solo, pixel art.

    成吉思汗獨奏吉他,像素圖。

  • Generated.

    產生。

  • A man holding a delicious apple...

    男人拿著美味的蘋果…

  • Ah... What's with his hands?

    額…他的手是怎樣?

  • Why can't AI art make hands?

    AI 為什麼畫不好手?

  • It doesn't matter what AI art model you use.

    不管你用的是哪個 AI 繪圖工具,

  • If you have a man holding a delicious apple, his hands will look weird holding it.

    如果你輸入男人拿著美味的蘋果,他的手都會看起來很奇怪。

  • Why is this so hard?

    為什麼畫不好呢?

  • Seems easy enough, right?

    看起來挺簡單的,不是嗎?

  • We've got this weird situation where AI art can instantly make...

    這種情況很奇怪,人工智能能立即產生出…

  • Abraham Lincoln dressed like glam David Bowie.

    亞伯拉罕林肯穿得像華麗的大衛鮑伊,

  • But struggles with a woman holding a cell phone.

    但卻畫不好拿著手機的女人。

  • This isn't just a weird glitch.

    這不僅只是個奇怪的小漏洞。

  • The struggle of AI art with hands can actually teach you something bigger about how AI art works.

    AI 繪圖碰上的這個難題其實能帶我們領略更深的層面,了解此技術的運作方式。

  • I mean, what is so hard about this?

    我是說,這有什麽好難的?

  • I asked an artist who has taught thousands of people how to draw hands from imagination.

    我請教了一位教過成千上萬人如何憑想像畫手的藝術家。

  • Before someone becomes or starts training to be an artist, like officially training.

    在某人成為畫家或開始接受畫家訓練前,我是指正式培訓,

  • It's pattern recognition.

    得先識別模式。

  • You just grow up seeing a whole bunch of hands...

    從小到大,我們看過很多手…

  • and you start knowing what hands look like.

    從中知道了手該長怎樣。

  • You learn how things look by living in the world and recognizing patterns.

    人類透過從生活中識別模式來了解事物外觀。

  • An AI is similar, but has key differences.

    人工智能也一樣,但有幾處關鍵的不同。

  • Imagine an AI is like you,

    想像 AI 和你一樣,

  • but trapped in a museum from birth.

    但它從出生就被困在博物館中。

  • All the machine has to learn from are the pictures...

    這些機器只能從圖片…

  • and the little placards on the side.

    及一旁的解說標示學習。

  • Apple: A red apple on a brown table.

    蘋果:一顆紅蘋果在褐色桌面上。

  • That's like the images it sees from the web and the descriptions that go with them.

    這就像 AI 從網路上看到的圖像,以及其附帶的描述。

  • It's similar to how you learn, but locked in that museum.

    AI 的學習方式跟人很像,只是它被關在一間博物館裡。

  • If you want to understand an apple you can rotate it in your hand.

    如果想了解一顆蘋果,你可以在手中轉它。

  • You can watch it whenever you want.

    你可以隨時觀察它。

  • If AI wants to understand an apple,

    如果 AI 想了解一顆蘋果,

  • it has to find another picture of an apple in the museum.

    它得在博物館裡找到另一張蘋果的照片。

  • Pattern recognition has allowed AI and people to draw decent apples,

    模式識別讓人工智能和人類都能畫出像樣的蘋果,

  • but the processes differ.

    但過程不同。

  • You start training to become an artist,

    開始受訓成為一名畫家時,

  • and now you're like, okay, now I have to learn the rules.

    你會想,好吧,我得先學習規則。

  • And that's where it becomes very different from how AI is learning.

    而這就是人類與人工智能學習方式不同的地方。

  • Artists, in order to draw something complicated,

    畫家想畫複雜的東西時,

  • we tend to simplify things into basic forms.

    我們傾向於將事物簡化到基本形式。

  • And so when you look at a hand,

    觀察我們的手,

  • you pretty much have the big blocky part of the palm, right?

    基本構成是一大塊手掌,是吧?

  • You have the front, you have the back,

    有手心手背,

  • and then you have the thickness.

    然後手掌是有厚度的。

  • So you can pretty much just make that into like a square with some thickness to it.

    所以基本上是一個有厚度的方形。

  • Then an artist can add all the style and texture and detail they want.

    然後畫家可以再加上他們想要的樣式、紋理和細節。

  • AI works differently.

    人工智能的運作不同。

  • Look at this hand.

    看看這隻手。

  • The shapes are bizarre,

    形狀很奇怪,

  • but the AI has done a great job showing the light and texture here.

    但是 AI 在光線和紋理方面處理得很好。

  • Remember, the AI knows how things look,

    記住,AI 知道東西的樣子,

  • but not how they work.

    但不知道它們怎麼運作的。

  • So these patterns in pixels are easy to understand.

    所以這些以像素為單位的圖很容易理解。

  • It never learned, however, that fingers don't really bend like this.

    然而,AI 不知道手指並不會這樣彎曲。

  • It doesn't simplify the forms.

    人工智能不會將事物簡化分析。

  • Remember, it's trapped in the museum.

    別忘了,它被關在「博物館」中。

  • So it is just trying to guess where hand-like pixels should be

    所以它只是在猜這些像手的像素該擺在哪,

  • without knowing how hands work like we do.

    而不知道手是如何運作的,不像我們。

  • But listen, I find this kind of dissatisfying.

    但聽著,我對這個答案並不滿意。

  • I mean, I'm basically just saying that AI can't draw hands because it's not a person.

    我基本上只是在說 AI 手畫得很爛,因為它不是人。

  • But AI also doesn't know anything about construction,

    但人工智能對建築也一無所知,

  • and it can still make a beautiful skyscraper in New York City.

    它照樣可以畫出一棟在紐約的漂亮摩天大樓。

  • So to understand this better,

    所以為了更好地了解這一點,

  • I spoke to two people who have worked with generative art models.

    我採訪了兩位研究 AI 繪圖的人。

  • Yilun Du is a grad student whose heart is in robotics.

    Yilun Du 是名專攻機器人的研究生。

  • But, you know, AI art is like a big deal now.

    但你知道的,AI 繪畫現在是頭等大事。

  • So, he got pulled into it.

    於是,他也投身其中。

  • Because of how popular these models have been in generative art...

    這些繪圖產生器十分流行…

  • I've also been working on that.

    所以我正在研究這塊。

  • And I talked to Roy Shilkrot,

    我採訪了 Roy Shilkrot,

  • who has a super varied resume,

    他經歷豐富,

  • but has been teaching about generative art since 2018.

    自 2018 年來一直在教授關於生成繪圖的知識。

  • Good students that come in that are trying to break those models and take them to the next level.

    進入此領域的優秀學生們一直試圖做出技術突破,想提升 AI 繪圖水平。

  • Talking to them helped me figure out three big reasons.

    與他們談過後,我找出了三個主要原因。

  • Not every reason,

    不是所有原因,

  • but three big reasons that hands are tough for AI art models.

    但是 AI 繪圖畫不好手的三大原因。

  • The data size and quality,

    數據的大小和畫質,

  • the way hands act,

    手的動作,

  • and the low margin for error.

    和誤差容忍度低。

  • For the data size, let's go back to the museum idea.

    關於數據大小,讓我們先回到博物館那個比喻。

  • The museum the robot hangs out in,

    機器人在的那個博物館,

  • it has a ton of rooms dedicated to faces,

    有大量容納臉部的空間,

  • but not so many rooms for hands.

    但沒那麼多空間給手。

  • That means it has less to learn from.

    這代表了 AI 能學習的手部資訊較少。

  • Just as an example, an available dataset like Flickr HQ has 70,000 faces.

    舉個例子,像 Flickr HQ 這種數據庫有七萬張臉孔資料。

  • 70,000

    七萬張。

  • And this popular one annotates 200,000 pics of celebrity faces...

    而這個熱門數據庫有二十萬張名人臉部照…

  • for lots of details, like eyeglasses or pointy noses.

    包含很多細節,如眼鏡或尖挺的鼻子。

  • There are a ton of great hand datasets that can really understand hands,

    其實是有大量的手部圖庫可以幫助 AI 理解手部的,

  • like this one with 11,000 hands.

    像這個有一萬一千張手部圖。

  • But these may not have been used to train the AI that makes art.

    但這些可能沒被用來訓練 AI 繪圖。

  • That data scarcity combines with the quality and complexity of the data.

    資料稀缺加上畫質和手部複雜性等問題,

  • Hands data in the art museum isn't yet annotated to show how they work,

    「博物館」中的手部資料還沒辦法展示出它們是如何運作的,

  • like the celebrities' pointy noses.

    不像名人尖挺的鼻子。

  • What they say is...

    指令是這麼說的…

  • there is an image and there is a person in the image and that person is holding an umbrella.

    圖中有個人,那個人拿著把傘。

  • You don't give the machine a lot of clues,

    給機器的線索不夠多。

  • saying this is a person holding the umbrella.

    應該要說,有個人撐著傘。

  • The thumb is going from one side of the handle and the fingers are curled,

    拇指從手柄的一側伸出,其他手指捲曲,

  • and then the thumb is covering the index finger, but not the other ones.

    拇指覆蓋住食指,但不會蓋到其他手指。

  • All that is made worse because hands do lots of things compared to, say... faces.

    這讓狀況變得更糟了,因為與面部相比,手可以做很多動作。

  • So there's a pretty common like portrait photo face.

    人像照片很常見。

  • There are a lot of these photos online,

    網路上有很多這樣的照片,

  • and the thing is everything is very well centered, right?

    臉部很好定中心,是吧?

  • Like eyes are always around here.

    眼睛永遠都是在這附近。

  • Like there's always this order.

    順序永遠是這樣的。

  • That's not true of hands,

    而手不是這樣,

  • which can do this and this and this.

    手可以這樣,這樣,還可以這樣。

  • I swear I'm sober right now.

    我沒醉,我發誓。

  • Stan mentioned this, too.

    Stan 也有提到這個。

  • How many fingers do you see right now?

    你現在看到幾隻手指?

  • Like... two or three.

    兩隻或三隻。

  • Like it doesn't know there's five

    AI 不會知道有五隻手指,

  • cuz sometimes there's two, sometimes there's three,

    因為有時候圖中的數量是兩隻或三隻,

  • sometimes four, sometimes five.

    有時候是四或五隻。

  • You can see these problems with AI hands,

    這種問題出現在手部繪圖上,

  • but the jankiness is all over AI art.

    但其實 AI 繪畫很多地方都有這種紕漏。

  • Just look at horses.

    看看馬就知道。

  • You can also have like three legs, five legs, six legs.

    可能會出現三條腿、五條腿、六條腿。

  • The model does not learn to explain this because there's too much diversity

    人工智能學不會這點,因為資料太多樣了,

  • and it doesn't have as much bias as we do.

    而且人工智能不像人有那麼多偏見。

  • Okay. Did you hear that last part he said?

    好,你有聽到他說的最後一部分嗎?

  • Good, because it's really important.

    很好,因為這很重要。

  • It doesn't have as much bias as we do.

    人工智能不像人有那麼多偏見。

  • We care a lot about hands and need them to be perfect.

    我們非常注重手,我們要求完美無缺。

  • There is a low margin for error.

    容錯率很低。

  • But because the model doesn't understand hands,

    但因為人工智能不懂手,

  • hasn't seen many and because hands act weird...

    沒有足夠手部資訊,而且手能做很多奇怪的動作…

  • it makes pictures that are like hands it's seen in the museum,

    AI 會畫出長得像它在資料庫裡看到的手,

  • but not an exact hand.

    但不是真的手的圖。

  • That's good enough for a ton of stuff, but not hands.

    這對很多東西來說已夠了,但對手來說不夠好。

  • Here, let me give you some examples.

    我給你看看一些例子。

  • Come over here.

    過來。

  • So, I typed "make me a person with exactly five freckles".

    我輸入「畫恰好有五個雀斑的人」。

  • So this one's from DALL-E 2,

    這張是 DALL-E 2 產生的,

  • this one is from Stable Diffusion,

    這張是 Stable Diffusion,

  • and this one is from Midjourney.

    然後這張是 Midjourney 畫的。

  • So it's like, you know, great job.

    是畫的很好沒錯。

  • You've got, you know, a red haired person.

    它產生出了一個紅髮的人,

  • They're more likely to have freckles.

    他們比較可能有雀斑,

  • But there are not exactly five freckles here.

    但這並不符合「恰好五個雀斑」。

  • Here that doesn't really matter because we see a freckly face.

    在這裡這並不重要,因為我們看到了一張長滿雀斑的臉。

  • But hands require higher standards.

    但我們對手的要求更高。

  • Look at our apple-holding man again.

    再看一次我們拿著蘋果的男人。

  • I made 3 other variations.

    我另外做了三個版本。

  • The hands are all weird, but don't look at them right now.

    手都很怪,但先不要看手。

  • It changed the shirt stripes, the buttons, the apple style...

    AI 改變了襯衫的條紋、鈕扣、蘋果的樣子…

  • None of that matters because it's stripe-like

    這些都不重要,因為圖案都是條狀的,

  • and button-like and apple-like.

    都是鈕扣,蘋果也都是蘋果。

  • But hand-like isn't good enough.

    但「長得像手」是不夠的。

  • I came away from this thinking a couple of things.

    這讓我不禁思考了幾件事。

  • AI art is basically bad at art.

    AI 繪畫基本上畫得不好,

  • We're just able to see it with hands.

    只是我們只注意到手而已。

  • And B, it's never going to get any better.

    還有,AI 繪畫是不可能畫得好的。

  • But both of those things are a bit wrong.

    但以上兩點都不全對。

  • I will say that the newest AI art generator to come out at the time of this video is Midjourney version 5

    這支影片出時最新發佈的 AI 繪圖器應該是 Midjourney 5.0。

  • and they made some progress with hands for sure,

    而他們在手部繪畫已經有所進步,

  • but it's not totally fixed yet.

    但還沒完全改善。

  • Don't tell the AI to hold an umbrella.

    別叫 AI 畫拿雨傘的樣子。

  • I think they're, like, spending lots of time on some things that you appreciate,

    我認為他們花很多功夫在人們會欣賞的事上,

  • which is why you like the images, and a lot of stuff that you don't actually even notice.

    這就是你會喜歡這些圖的原因,實際上很多部分是你沒注意到的。

  • I think that for a lot of natural scenery or something like that,

    我覺得像自然風景之類的圖,

  • I feel like the model might be better at that than people.

    AI 可能畫得比人類好。

  • And they are working on two things.

    他們正在做兩件事。

  • First, they have the AI look at a ton more pictures,

    首先,他們讓 AI 看更多圖片,

  • which requires more computing power.

    這需要更多的「算力」。

  • They're trying to solve that on a big scale

    他們也正試圖大規模解決這個問題,

  • because if you want to train on more than a handful of images...

    因為如果你不想只用幾張圖片…

  • if you want to train on more than 100 images

    如果你想用超過 100 張圖片訓練,

  • this would take tremendous resources from you to retrain the model itself.

    這將佔用大量資源來重新訓練人工智能。

  • The other solution might be to invite more people into the museum.

    另一個解決方案可能是邀請更多人進入「博物館」。

  • There's an interesting analog.

    有個有趣的相似情況。

  • So like, have you heard of like ChatGPT?

    你有聽過 ChatGPT 嗎?

  • The big difference was that it basically used human feedback.

    最大的區別在於它基本上用了人工反饋。

  • So like they generated many, many sentences

    它會產生很多句子,

  • and asked people to rate which ones are good and which ones are not good.

    然後讓用戶評價哪些好,哪些不好。

  • They basically fine-tuned the model

    基本上,這能幫助微調。

  • so that it would generate sentences that are convincing to people.

    這樣 AI 就能產生出對人們來說合理的句子。

  • I guess it would require a lot of engineering to get people to label so much data.

    我想這需要大量工程才能讓人們標記如此多數據。

  • But I think if we could just get, like, people to rank how good the images generated by these models are

    但我認為,如果我們能讓用戶評價 AI 生成的圖,

  • then, like, a lot of these issues will go away, actually.

    那很多問題都會消失。

  • Because they're just training the models to do what people like.

    因為這樣就是在訓練 AI 照著人們喜好做事。

  • It's not just the hand,

    不只是手,

  • teeth and abs,

    牙齒、腹肌都是。

  • anything where there's like a pattern, a large amount of something,

    任何有規律,有一定數目的東西,

  • it doesn't know the rule of "there are this many"

    AI 都不知道該有幾個,

  • because it's trained on different amounts.

    因為用於訓練它的資料中量都不同。
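
The last point, that the model never learns a rule like "there are five" because its training images show different amounts, can be sketched as a toy simulation. This is only an illustration, not how any real image model works: the weights in `VISIBLE_FINGER_WEIGHTS` are made-up numbers, and `sample_finger_count` and `simulate` are hypothetical helpers. The sketch assumes a "model" that picks a finger count purely by how often it saw each count in photos.

```python
import random
from collections import Counter

# Hypothetical share of training photos showing each number of visible fingers.
# These weights are invented for illustration; they are not measured from any dataset.
VISIBLE_FINGER_WEIGHTS = {2: 0.20, 3: 0.20, 4: 0.25, 5: 0.35}


def sample_finger_count(rng):
    """Pick a finger count by frequency seen, with no rule that hands have five."""
    counts = list(VISIBLE_FINGER_WEIGHTS)
    weights = list(VISIBLE_FINGER_WEIGHTS.values())
    return rng.choices(counts, weights=weights, k=1)[0]


def simulate(n_images, seed=0):
    """Tally the finger counts across n_images toy 'generated' hands."""
    rng = random.Random(seed)
    return Counter(sample_finger_count(rng) for _ in range(n_images))


if __name__ == "__main__":
    tally = simulate(1000)
    for fingers in sorted(tally):
        print(f"{fingers} fingers: {tally[fingers]} of 1000 generated hands")
```

Because the toy sampler only mirrors the counts it has seen, it produces two-, three-, and four-finger hands alongside five-finger ones, which is the pattern-without-rule behavior described in the interview.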
