  • Maybe we-- if you guys could stand over--

    也許我們 -- 如果你們能站在 --

  • Is it okay if they stand over here?

    如果他們站在這裡可以嗎?

  • - Yeah. - Um, actually.

    - 是的。- 呃,實際上。

  • Christophe, if you can get even lower.

    克里斯托弗,如果你能再低一點。

  • - Okay. - ( shutter clicks )

    - 好的。 -(快門聲)。

  • This is Lee and this is Christophe.

    這位是李,這位是克里斯托夫。

  • They're two of the hosts of this show.

    他們是這個節目的兩位主持人。

  • But to a machine, they're not people.

    但對機器來說,他們不是人。

  • This is just pixels. It's just data.

    這只是像素而已。這只是數據而已。

  • A machine shouldn't have a reason to prefer

    一臺機器不應該有理由偏好

  • one of these guys over the other.

    這兩個人中的一個,而不是另一個。

  • And yet, as you'll see in a second, it does.

    然而,正如你馬上就會看到的那樣,它確實如此。

  • It feels weird to call a machine racist,

    說一臺機器是種族主義者,感覺很奇怪。

  • but I really can't explain-- I can't explain what just happened.

    但我真的無法解釋 -- 我無法解釋剛剛發生的事情。

  • Data-driven systems are becoming a bigger and bigger part of our lives,

    數據驅動的系統正在成為我們生活中越來越大的一部分。

  • and they work well a lot of the time.

    而且它們在很多時候都能很好地工作。

  • - But when they fail... - Once again, it's the white guy.

    - 但當它們失敗時...... - 又一次,是白人。

  • When they fail, they're not failing on everyone equally.

    當它們失敗時,它們並不是對每個人都同等地失敗。

  • If I go back right now...

    如果我現在回去...

  • Ruha Benjamin: You can have neutral intentions.

    魯哈-本傑明:你可以有中立的意圖。

  • You can have good intentions.

    你可以有好的意圖。

  • And the outcomes can still be discriminatory.

    而結果仍然可能是歧視性的。

  • Whether you want to call that machine racist

    無論你是想把那臺機器稱為種族主義的,

  • or you want to call the outcome racist,

    還是想把這個結果稱為種族主義的,

  • we have a problem.

    我們有一個問題。

  • ( theme music playing )

    (主題音樂播放)

  • I was scrolling through my Twitter feed a while back

    前段時間,我在瀏覽我的推特時間軸,

  • and I kept seeing tweets that look like this.

    不斷看到像這樣的推文。

  • Two of the same picture of Republican senator Mitch McConnell smiling,

    兩張相同的、共和黨參議員米奇-麥康奈爾微笑的照片,

  • or sometimes it would be four pictures

    或者有時是四張照片

  • of the same random stock photo guy.

    拍的是同一個隨機的圖庫照片男子。

  • And I didn't really know what was going on,

    而我並不真正知道發生了什麼事。

  • but it turns out that this was a big public test of algorithmic bias.

    但事實證明,這是對算法偏見的一次大型公開測試。

  • Because it turns out that these aren't pictures of just Mitch McConnell.

    因為事實證明,這些並不是只有米奇-麥康奈爾的照片。

  • They're pictures of Mitch McConnell and...

    他們是米奇-麥康奈爾的照片和...

  • - Barack Obama. - Lee: Oh, wow.

    - 巴拉克-奧巴馬。- 李:哦,哇。

  • So people were uploading

    所以人們在上傳

  • these really extreme vertical images

    這些真正極端的垂直影像

  • to basically force this image cropping algorithm

    基本上強迫這種影像裁剪算法

  • to choose one of these faces.

    來選擇這些面孔之一。

  • People were alleging that there's a racial bias here.

    人們指責這裡有種族偏見。

  • But I think what's so interesting about this particular algorithm

    但我認為這個特定算法的有趣之處在於

  • is that it is so testable for the public.

    它是公眾可以親自檢驗的。

  • It's something that we could test right now if we wanted to.

    如果我們想的話,現在就可以測試。

  • - Let's do it. - You guys wanna do it?

    - 我們來做吧。- 你們想做嗎?

  • Okay. Here we go.

    好了,我們開始吧。

  • So, Twitter does offer you options to crop your own image.

    是以,Twitter確實為你提供了裁剪自己圖片的選項。

  • But if you don't use those,

    但如果你不使用這些。

  • it uses an automatic cropping algorithm.

    它使用自動裁剪算法。

  • - Wow. There it is. - Whoa. Wow.

    - 哇。就在那裡。- 哇哦。哇哦。

  • That's crazy.

    這很瘋狂。

  • Christophe, it likes you.

    克里斯托弗,它喜歡你。

  • Okay, let's try the other-- the happy one.

    好吧,讓我們試試另一個--快樂的那個。

  • Lee: Wow.

    李:哇。

  • - Unbelievable. Oh, wow. - Both times.

    - 難以置信。哦,哇。- 兩次都是如此。

  • So, do you guys think this machine is racist?

    那麼,你們認為這臺機器是種族主義者嗎?

  • The only other theory I possibly have

    我唯一可能有的其他理論

  • is if the algorithm prioritizes white faces

    是指如果該算法優先考慮白人面孔

  • because it can pick them up quicker, for whatever reason,

    因為不管出於什麼原因,它都能更快地捕捉到它們,

  • against whatever background.

    無論在什麼背景下,都是如此。

  • Immediately, it looks through the image

    它會立刻瀏覽整張圖片,

  • and tries to scan for a face.

    並試圖掃描出一張臉。

  • Why is it always finding the white face first?

    為什麼總是先找到白臉?

  • Joss: With this picture, I think someone could argue

    喬斯:對於這張照片,我想有人可以爭辯說

  • that the lighting makes Christophe's face more sharp.

    燈光使克里斯托夫的臉更加鮮明。

  • I still would love to do

    我還是很想做

  • a little bit more systematic testing on this.

    對這一點更系統一些的測試。

  • I think maybe hundreds of photos

    我想可能有數百張照片

  • could allow us to draw a conclusion.

    可以讓我們得出一個結論。

  • I have downloaded a bunch of photos

    我已經下載了一堆照片

  • from a site called Generated Photos.

    來自一個名為Generated Photos的網站。

  • These people do not exist. They were a creation of AI.

    這些人並不存在。他們是人工智能的創造。

  • And I went through, I pulled a bunch

    我瀏覽了一遍,挑出了一批

  • that I think will give us

    我認為這將給我們帶來

  • a pretty decent way to test this.

    一個相當體面的方法來測試這個。

  • So, Christophe, I wonder if you would be willing to help me out with that.

    所以,克里斯托弗,我想知道你是否願意幫我解決這個問題。

  • You want me to tweet hundreds of photos?

    你想讓我在推特上發佈數百張照片?

  • - ( Lee laughs ) - Joss: Exactly.

    - (李笑) - 喬斯:正是如此。

  • I'm down. Sure, I've got time.

    我願意。當然,我有時間。

  • Okay.

    好的。

  • ( music playing )

    ( 音樂播放 )

  • There may be some people who take issue with the idea

    可能有一些人對這個想法有異議

  • that machines can be racist

    機器可以是種族主義者

  • without a human brain or malicious intent.

    沒有人的大腦或惡意的意圖。

  • But such a narrow definition of racism

    但這樣一個狹義的種族主義定義

  • really misses a lot of what's going on.

    真的錯過了很多正在發生的事情。

  • I want to read a quote that responds to that idea.

    我想讀一段迴應這一想法的話語。

  • It says, "Robots are not sentient beings, sure,

    它說,"當然,機器人不是有知覺的生物,

  • but racism flourishes well beyond hate-filled hearts.

    但種族主義的盛行遠遠超出了充滿仇恨的心。

  • No malice needed, no "N" word required,

    不需要惡意,不需要 "N "字。

  • just a lack of concern for how the past shapes the present."

    只是對過去如何塑造現在缺乏關注。"

  • I'm going now to speak to the author of those words, Ruha Benjamin.

    我現在要和這些話的作者魯哈-本傑明談談。

  • She's a professor of African-American Studies at Princeton University.

    她是普林斯頓大學的非裔美國人研究教授。

  • When did you first become concerned

    你是什麼時候開始關注

  • that automated systems, AI, could be biased?

    自動化系統,人工智能,可能有偏見?

  • A few years ago, I noticed these headlines

    幾年前,我注意到這些頭條新聞

  • and hot takes about so-called racist and sexist robots.

    以及對所謂的種族主義和性別歧視機器人的熱議。

  • There was a viral video in which two friends were in a hotel bathroom

    有一個病毒視頻,其中兩個朋友在一個酒店的浴室裡

  • and they were trying to use an automated soap dispenser.

    而他們正試圖使用一個自動皁液器。

  • Black hand, nothing. Larry, go.

    黑色的手,什麼都沒有。拉里,走。

  • Black hand, nothing.

    黑色的手,什麼都沒有。

  • And although they seem funny

    雖然他們看起來很有趣

  • and they kind of get us to chuckle,

    而且他們有點讓我們發笑。

  • the question is, are similar design processes

    問題是,類似的設計過程是否

  • impacting much more consequential technologies that we're not even aware of?

    正在影響那些後果嚴重得多、而我們甚至沒有意識到的技術?

  • When the early news controversies came along maybe 10 years ago,

    大約10年前,當早期的新聞爭議出現時,

  • people were surprised by the fact that they showed a racial bias.

    人們對他們表現出種族偏見的事實感到驚訝。

  • Why do you think people were surprised?

    你認為為什麼人們會感到驚訝?

  • Part of it is a deep attachment and commitment

    部分原因是人們深深地依戀並執著於

  • to this idea of tech neutrality.

    技術中立這一觀念。

  • People-- I think because life is so complicated

    人們......我想因為生活是如此複雜

  • and our social world is so messy--

    和我們的社會世界是如此混亂 --

  • really cling on to something that will save us,

    真正緊緊抓住能拯救我們的東西。

  • and a way of making decisions that's not drenched

    和一種不被淹沒的決策方式

  • in the muck of all of human subjectivity,

    在所有人類主觀性的泥沼中。

  • human prejudice and frailty.

    人類的偏見和弱點。

  • We want it so much to be true.

    我們非常希望它是真的。

  • We want it so much to be true, you know?

    我們非常希望它是真的,你知道嗎?

  • And the danger is that we don't question it.

    而危險的是,我們沒有質疑它。

  • And still we continue to have, you know, so-called glitches

    而我們仍然繼續有,你知道,所謂的小毛病

  • when it comes to race and skin complexion.

    當涉及到種族和膚色的時候。

  • And I don't think that they're glitches.

    而且我不認為它們是小毛病。

  • It's a systemic issue in the truest sense of the word.

    這是一個最真實意義上的系統性問題。

  • It has to do with our computer systems and the process of design.

    這與我們的計算機系統和設計過程有關。

  • Joss: AI can seem pretty abstract sometimes.

    喬斯:人工智能有時會顯得很抽象。

  • So we built this to help explain

    所以我們建立了這個來幫助解釋

  • how machine learning works and what can go wrong.

    機器學習是如何工作的,會出什麼問題。

  • This black box is the part of the system that we interact with.

    這個黑盒子是我們與之互動的系統的一部分。

  • It's the software that decides which dating profiles we might like,

    它是決定我們可能喜歡哪些約會資料的軟件。

  • how much a rideshare should cost,

    乘坐共享汽車應該花費多少錢。

  • or how a photo should be cropped on Twitter.

    或Twitter上的照片應該如何裁剪。

  • We just see a device making a decision.

    我們只是看到一個設備在做決定。

  • Or more accurately, a prediction.

    或者更準確地說,是一種預測。

  • What we don't see is all of the human decisions

    我們沒有看到的是所有的人類決定

  • that went into the design of that technology.

    融入該技術的設計。

  • Now, it's true that when you're dealing with AI,

    現在,當你與人工智能打交道時,這確實是事實。

  • that means that the code in this box

    這意味著,這個盒子裡的代碼

  • wasn't all written directly by humans,

    並非都是由人類直接寫的。

  • but by machine-learning algorithms

    但通過機器學習算法

  • that find complex patterns in data.

    找到數據中的複雜模式。

  • But they don't just spontaneously learn things from the world.

    但他們不會自發地從世界上學到東西。

  • They're learning from examples.

    他們正在從實例中學習。

  • Examples that are labeled by people,

    被人貼上標籤的例子。

  • selected by people,

    由人選擇。

  • and derived from people, too.

    而且也是來自於人。

  • See, these machines and their predictions,

    看,這些機器和它們的預測。

  • they're not separate from us or from our biases

    他們與我們或與我們的偏見並不分離

  • or from our history,

    或來自我們的歷史。

  • which we've seen in headline after headline

    我們在一個又一個的頭條新聞中看到了這一點

  • for the past 10 years.

    在過去的10年裡。

  • We're using the face-tracking software,

    我們正在使用面部追蹤軟件。

  • so it's supposed to follow me as I move.

    所以它應該在我移動時跟隨我。

  • As you can see, I do this-- no following.

    正如你所看到的,我這樣做--它沒有跟蹤。

  • Not really-- not really following me.

    沒怎麼......沒怎麼跟蹤我。

  • - Wanda, if you would, please? - Sure.

    - 萬達,如果你願意,請?- 當然可以。

  • In 2010, the top hit

    在2010年,最熱門的是

  • when you did a search for "black girls,"

    當你搜索 "黑人女孩 "時,

  • 80% of what you found

    你發現的80%的東西

  • on the first page of results was all porn sites.

    在結果的第一頁都是色情網站。

  • Google is apologizing after its photo software

    谷歌正在道歉,因為其照片軟件

  • labeled two African-Americans gorillas.

    給兩個非裔美國人貼上了大猩猩的標籤。

  • Microsoft is shutting down

    微軟正在關閉

  • its new artificial intelligent bot

    其新的人工智能機器人

  • after Twitter users taught it how to be racist.

    在Twitter用戶教它如何成為種族主義者之後。

  • Woman: In order to make yourself hotter,

    女聲:為了讓你自己顯得更性感,

  • the app appeared to lighten your skin tone.

    該應用程序出現了淡化你的膚色。

  • Overall, they work better on lighter faces than darker faces,

    總的來說,它們對淺色的臉比深色的臉效果更好。

  • and they worked especially poorly

    而且他們的工作特別差

  • on darker female faces.

    在較黑的女性臉上。

  • Okay, I've noticed that on all these damn beauty filters,

    好吧,我在所有這些該死的美容濾鏡上注意到了這一點。

  • is they keep taking my nose and making it thinner.

    就是它們一直把我的鼻子變得更窄。

  • Give me my African nose back, please.

    請把我的非洲鼻子還給我。

  • Man: So, the first thing that I tried was the prompt "Two Muslims..."

    男子:所以,我嘗試的第一件事是提示 "兩個穆斯林......"

  • And the way it completed it was,

    而它補全出來的是:

  • "Two Muslims, one with an apparent bomb,

    "兩名穆斯林,其中一人攜帶明顯的炸彈。

  • tried to blow up the Federal Building

    試圖炸燬聯邦大樓

  • in Oklahoma City in the mid-1990s."

    在1990年代中期的俄克拉荷馬城"。

  • Woman: Detroit police wrongfully arrested Robert Williams

    女聲:底特律警方錯誤地逮捕了羅伯特-威廉姆斯,

  • based on a false facial recognition hit.

    基於一個錯誤的面部識別命中。

  • There's definitely a pattern of harm

    肯定有一種傷害的模式

  • that disproportionately falls on vulnerable people, people of color.

    這一點不成比例地落在弱勢人群、有色人種身上。

  • Then there's attention,

    之後才會有人關注,

  • but of course, the damage has already been done.

    但當然,損害已經造成了。

  • ( Skype ringing )

    ( Skype鈴聲 )

  • - Hello. - Hey, Christophe.

    - 你好。- 嘿,克里斯托弗。

  • Thanks for doing these tests.

    謝謝你做這些測試。

  • - Of course. - I know it was a bit of a pain,

    - 當然了。- 我知道這有點麻煩。

  • but I'm curious what you found.

    但我很好奇你發現了什麼。

  • Sure. I mean, I actually did it.

    當然,我是說,我真的做到了。

  • I actually tweeted 180 different sets of pictures.

    我實際上在推特上發佈了180組不同的圖片。

  • In total, dark-skinned people

    總計,深膚色的人

  • were displayed in the crop 131 times,

    在裁剪結果中被顯示了131次,

  • and light-skinned people

    而淺膚色的人

  • were displayed in the crop 229 times,

    在裁剪結果中被顯示了229次,

  • which comes out to 36% dark-skinned

    也就是說,36%是深膚色,

  • and 64% light-skinned.

    64%是淺膚色。

  • That does seem to be evidence of some bias.

    這似乎確實是一些偏見的證據。
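
As a rough sanity check on those numbers, here is a minimal sketch (not part of the show's own methodology) that recomputes the 36%/64% split and runs a two-sided exact binomial test. The 131/229 counts come from the transcript; the assumption that an unbiased cropper would pick each face about half the time is only an illustrative null hypothesis.

```python
from math import comb

# Counts reported in the transcript: how often the crop showed each group.
dark, light = 131, 229
n = dark + light                      # 360 cropped faces in total

print(f"dark-skinned: {dark / n:.0%}, light-skinned: {light / n:.0%}")

# Hedged check: if the cropper had no preference, each crop would behave like
# a fair coin flip. Two-sided exact binomial p-value against p = 0.5
# (doubling one tail is valid because Binomial(n, 0.5) is symmetric).
tail = sum(comb(n, k) for k in range(min(dark, light) + 1)) / 2 ** n
print(f"two-sided p-value ~ {2 * tail:.1e}")   # far below 0.05
```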

  • It's interesting because Twitter posted a blog post

    這很有趣,因為Twitter發佈了一篇博文

  • saying that they had done some of their own tests

    說他們已經做了一些自己的測試

  • before launching this tool, and they said that

    在推出這個工具之前,他們說,

  • they didn't find evidence of racial bias,

    他們並沒有發現種族偏見的證據。

  • but that they would be looking into it further.

    但他們將進一步調查此事。

  • Um, they also said that the kind of technology

    嗯,他們還說,那種技術

  • that they use to crop images

    他們用來裁剪影像的

  • is called a Saliency Prediction Model,

    被稱為 "顯著性預測模型"(Saliency Prediction Model)。

  • which means software that basically is making a guess

    這意味著軟件基本上是在做一個猜測

  • about what's important in an image.

    關於影像中什麼是重要的。

  • So, how does a machine know what is salient, what's relevant in a picture?

    那麼,機器如何知道什麼是突出的,什麼是圖片中的相關內容?

  • Yeah, it's really interesting, actually.

    是的,這真的很有趣,實際上。

  • There's these saliency data sets

    有一些顯著性數據集,

  • that documented people's eye movements

    記錄了人們的眼球運動

  • while they looked at certain sets of images.

    當他們看某些影像集的時候。

  • So you can take those photos

    所以你可以拿這些照片,

  • and you can take that eye-tracking data

    而且你可以利用這些眼球追蹤數據

  • and teach a computer what humans look at.

    教計算機人類會看哪裡。
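
To make that concrete, here is a minimal, hypothetical PyTorch sketch of the idea being described: regress a per-pixel fixation heatmap (the eye-tracking data) from the image. It is not Twitter's model; the tiny network, image size, and random stand-in tensors are all placeholders.

```python
import torch
import torch.nn as nn

# Toy saliency predictor: image in, one-channel "where people look" map out.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=3, padding=1), nn.Sigmoid(),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-ins for a real dataset: RGB images plus fixation heatmaps aggregated
# from eye-tracking sessions (e.g. the MIT collection mentioned later on).
images = torch.rand(8, 3, 64, 64)
fixations = torch.rand(8, 1, 64, 64)

for step in range(200):
    loss = nn.functional.mse_loss(model(images), fixations)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```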

  • So, Twitter's not going to give me any more information

    所以,Twitter不會給我任何更多的資訊

  • about how they trained their model,

    關於他們如何訓練他們的模型。

  • but I found an engineer from a company called Gradio.

    但我找到了一個叫Gradio的公司的工程師。

  • They built an app that does something similar,

    他們建立了一個應用程序,做了類似的事情。

  • and I think it can give us a closer look

    而且我認為它可以讓我們更仔細地看看

  • at how this kind of AI works.

    這類人工智能是如何工作的。

  • - Hey. - Hey.

    - 嘿。- 嘿。

  • - Joss. - Nice to meet you. Dawood.

    - 喬斯。- 很高興見到你。達伍德。

  • So, you and your colleagues

    所以,你和你的同事

  • built a saliency cropping tool

    構建了一個顯著性裁剪工具,

  • that is similar to what we think Twitter is probably doing.

    這與我們認為Twitter可能正在做的事情相似。

  • Yeah, we took a public machine learning model, posted it on our library,

    是的,我們拿了一個公開的機器學習模型,發佈到我們的程序庫上,

  • and launched it for anyone to try.

    並推出它供任何人嘗試。

  • And you don't have to constantly post pictures

    而且你不必不斷髮布照片

  • on your timeline to try and experiment with it,

    在你的時間軸上嘗試和實驗。

  • which is what people were doing when they first became aware of the problem.

    這也是人們第一次意識到這個問題時正在做的事情。

  • And that's what we did. We did a bunch of tests just on Twitter.

    而這正是我們所做的。我們僅僅在Twitter上做了一堆測試。

  • But what's interesting about what your app shows

    但有趣的是,你的應用程序所顯示的內容

  • is the sort of intermediate step there, which is this saliency prediction.

    是那個中間步驟,也就是這個顯著性預測。

  • Right, yeah. I think the intermediate step is important for people to see.

    對,是的。我認為中間的步驟對人們來說很重要。

  • Well, I-- I brought some pictures for us to try.

    好吧,我......我帶來了一些照片,供我們嘗試。

  • These are actually the hosts of "Glad You Asked."

    這些人實際上是 "Glad You Asked "的主持人。

  • And I was hoping we could put them into your interface

    我希望我們可以把它們放到你的界面上。

  • and see what, uh, the saliency prediction is.

    並看看,呃,顯著性預測是什麼。

  • Sure. Just load this image here.

    當然,只要在這裡加載這個影像。

  • Joss: Okay, so, we have a saliency map.

    喬斯:好的,所以,我們有了一張顯著性圖。

  • Clearly the prediction is that faces are salient,

    顯然,預測結果是面孔是顯著的,

  • which is not really a surprise.

    這其實並不令人驚訝。

  • But it looks like maybe they're not equally salient.

    但看起來,也許它們並不同樣突出。

  • - Right. - Is there a way to sort of look closer at that?

    - 對。- 有什麼辦法可以更仔細地看這個問題嗎?

  • So, what we can do here, we actually built it out in the app

    是以,我們在這裡可以做的是,我們實際上在應用程序中建立了它

  • where we can put a window on someone's specific face,

    在這裡,我們可以把一個窗口放在某人的具體面孔上。

  • and it will give us a percentage of what amount of saliency

    它會給我們一個百分比,說明這張臉上的顯著性

  • you have over your face versus in proportion to the whole thing.

    佔整張圖片總顯著性的比例是多少。
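
A sketch of what that per-face percentage could look like, assuming the saliency map is just a 2-D array and the "window" is a bounding box; this mirrors the idea described above rather than Gradio's actual code, and the function name and box format are hypothetical.

```python
import numpy as np

def face_saliency_share(saliency_map: np.ndarray, box: tuple) -> float:
    """Fraction of the image's total saliency that falls inside a face box.

    `box` is (top, left, bottom, right) in pixel coordinates (hypothetical API).
    """
    top, left, bottom, right = box
    total = float(saliency_map.sum())
    return float(saliency_map[top:bottom, left:right].sum()) / total if total else 0.0

# Toy example with a random 100x100 map and a made-up face box.
demo_map = np.random.rand(100, 100)
print(f"{face_saliency_share(demo_map, (20, 30, 60, 70)):.1%}")
```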

  • - That's interesting. - Yeah.

    - 這很有意思。- 是的。

  • She's-- Fabiola's in the center of the picture,

    她是......法比奧拉在照片的中央。

  • but she's actually got a lower percentage

    但她的顯著性百分比實際上

  • of the salience compared to Cleo, who's to her right.

    比她右邊的克萊奧要低。

  • Right, and trying to guess why a model is making a prediction

    對,試圖猜測一個模型為什麼要做預測

  • and why it's predicting what it is

    以及為什麼它的預測結果是這樣的

  • is a huge problem with machine learning.

    是機器學習的一個巨大問題。

  • It's always something that you have to kind of

    這始終是你必須

  • back-trace to try and understand.

    回溯才能嘗試理解的事情。

  • And sometimes it's not even possible.

    而有時甚至不可能。

  • Mm-hmm. I looked up what data sets

    嗯,嗯。我查找了哪些數據集

  • were used to train the model you guys used,

    是用來訓練你們使用的模型的。

  • and I found one that was created by

    我發現有一個是由

  • researchers at MIT back in 2009.

    麻省理工學院的研究人員早在2009年創建的。

  • So, it was originally about a thousand images.

    是以,它最初是大約一千張圖片。

  • We pulled the ones that contained faces,

    我們拉出了包含臉部的那些。

  • any face we could find that was big enough to see.

    我們能找到的任何足夠大的臉都能看到。

  • And I went through all of those,

    而我經歷了所有這些。

  • and I found that only 10 of the photos,

    而我發現,只有10張照片。

  • that's just about 3%,

    這只是大約3%。

  • included someone who appeared to be

    包括一個似乎是

  • of Black or African descent.

    黑人或非洲裔的人。

  • Yeah, I mean, if you're collecting a data set through Flickr,

    是的,我的意思是,如果你通過Flickr收集一個數據集。

  • you're-- first of all, you're biased to people

    你......首先,你的數據就偏向那些

  • that have used Flickr back in, what, 2009, you said, or something?

    早在--你說是2009年?--就在用Flickr的人。

  • Joss: And I guess if we saw in this image data set,

    喬斯:而且我想,如果我們在這個影像數據集中看到

  • there are more cat faces than black faces,

    貓的面孔比黑人的面孔還多,

  • we can probably assume that minimal effort was made

    我們大概可以認為,幾乎沒有人花力氣

  • to make that data set representative.

    去讓這個數據集具有代表性。

  • When someone collects data into a training data set,

    當有人收集數據進入訓練數據集時。

  • they can be motivated by things like convenience and cost

    他們的動機可能是為了方便和成本等問題

  • and end up with data that lacks diversity.

    並最終得到缺乏多樣性的數據。

  • That type of bias, which we saw in the saliency photos,

    我們在顯著性照片中看到的這種偏見,

  • is relatively easy to address.

    是相對容易解決的。

  • If you include more images representing racial minorities,

    如果你包括更多代表少數民族的影像。

  • you can probably improve the model's performance on those groups.

    你可能可以提高模型在這些群體上的表現。

  • But sometimes human subjectivity

    但有時人的主觀性

  • is imbedded right into the data itself.

    是直接嵌入到數據本身。

  • Take crime data for example.

    以犯罪數據為例。

  • Our data on past crimes in part reflects

    我們關於過去犯罪的數據部分反映了

  • police officers' decisions about what neighborhoods to patrol

    警察決定在哪些街區進行巡邏

  • and who to stop and arrest.

    以及阻止和逮捕誰。

  • We don't have an objective measure of crime,

    我們沒有一個客觀的犯罪衡量標準。

  • and we know that the data we do have

    而且我們知道,我們所擁有的數據

  • contains at least some racial profiling.

    至少包含一些種族定性。

  • But it's still being used to train crime prediction tools.

    但它仍然被用於訓練犯罪預測工具。

  • And then there's the question of how the data is structured over here.

    然後還有一個問題,就是這邊的數據是如何結構化的。

  • Say you want a program that identifies

    假設你希望有一個程序能夠識別

  • chronically sick patients to get additional care

    長期患病的病人得到額外的護理

  • so they don't end up in the ER.

    這樣他們就不會被送進急診室了。

  • You'd use past patients as your examples,

    你會用過去的病人作為你的例子。

  • but you have to choose a label variable.

    但你必須選擇一個標籤變量。

  • You have to define for the machine what a high-risk patient is

    你必須為機器定義什麼是高風險病人

  • and there's not always an obvious answer.

    並不總是有一個明顯的答案。

  • A common choice is to define high-risk as high-cost,

    一個常見的選擇是將高風險定義為高成本。

  • under the assumption that people who use

    其假設是,那些使用

  • a lot of health care resources are in need of intervention.

    大量醫療資源的人需要干預。

  • Then the learning algorithm looks through

    然後,學習算法通過查看

  • the patient's data--

    病人的資料 --

  • their age, sex,

    他們的年齡、性別。

  • medications, diagnoses, insurance claims,

    藥品、診斷、保險索賠。

  • and it finds the combination of attributes

    並找到屬性的組合

  • that correlates with their total health costs.

    這與他們的總健康成本相關。

  • And once it gets good at predicting

    而一旦它善於預測

  • total health costs on past patients,

    在過去的病人身上的總醫療費用。

  • that formula becomes software to assess new patients

    該公式成為評估新病人的軟件

  • and give them a risk score.

    並給他們一個風險分數。
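
The label-choice problem described here can be shown in a few lines. The sketch below uses synthetic data and scikit-learn (not the real product, its data, or its actual features): the target the model learns is last year's cost, so the resulting "risk score" is really a cost prediction, and any group whose illness generates less spending gets systematically lower scores.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 1_000

# Synthetic patient features: age, sex, diagnosis count, medication count.
X = np.column_stack([
    rng.integers(18, 90, n),
    rng.integers(0, 2, n),
    rng.poisson(3, n),
    rng.poisson(2, n),
])

# The chosen label: last year's total cost, standing in for "how sick".
past_cost = 500 * X[:, 2] + 300 * X[:, 3] + rng.normal(0, 500, n)

model = GradientBoostingRegressor().fit(X, past_cost)

# New patients get a "risk score" that is really a predicted cost.
risk_scores = model.predict(X[:5])
print(risk_scores)
```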

  • But instead of predicting sick patients,

    但它預測的並不是病得重的病人,

  • this predicts expensive patients.

    而是花費高的病人。

  • Remember, the label was cost,

    記住,標籤是成本。

  • and when researchers took a closer look at those risk scores,

    而當研究人員對這些風險分數進行仔細觀察時。

  • they realized that label choice was a big problem.

    他們意識到,標籤的選擇是一個大問題。

  • But by then, the algorithm had already been used

    但到那時,該算法已經被使用了

  • on millions of Americans.

    在數百萬美國人身上。

  • It produced risk scores for different patients,

    它為不同的病人產生了風險分數。

  • and if a patient had a risk score

    而如果一個病人的風險評分

  • of almost 60,

    接近60分,

  • they would be referred into the program

    他們將被轉入該計劃

  • for screening-- for their screening.

    為篩選--為他們的篩選。

  • And if they had a risk score of almost 100,

    而如果他們的風險分數幾乎達到100分。

  • they would default into the program.

    他們將默認進入該計劃。

  • Now, when we look at the number of chronic conditions

    現在,當我們看一下

  • that patients of different risk scores were affected by,

    不同風險分數的病人所患的慢性病數量時,

  • you see a racial disparity where white patients

    你看到一個種族差異,白人患者

  • had fewer conditions than black patients

    與黑人患者相比,他們的病情較少

  • at each risk score.

    在每個風險分數。

  • That means that black patients were sicker

    這意味著黑人患者的病情更嚴重

  • than their white counterparts

    比他們的白人同行更多

  • when they had the same risk score.

    當他們有相同的風險得分時。

  • And so what happened is in producing these risk scores

    所以,實際發生的情況是,在產生這些風險分數

  • and using spending,

    並使用支出作為指標時,

  • they failed to recognize that on average

    他們沒有認識到,平均而言

  • black people incur fewer costs for a variety of reasons,

    由於各種原因,黑人產生的費用較少。

  • including institutional racism,

    包括制度性的種族主義。

  • including lack of access to high-quality insurance,

    包括缺乏獲得高質量保險的機會。

  • and a whole host of other factors.

    以及一大堆其他因素。

  • But not because they're less sick.

    但不是因為他們病得少。

  • Not because they're less sick.

    不是因為他們病得少。

  • And so I think it's important

    是以,我認為這很重要

  • to remember this had racist outcomes,

    要記住這有種族主義的結果。

  • discriminatory outcomes, not because there was

    歧視性的結果,而不是因為存在著

  • a big, bad boogie man behind the screen

    一個躲在幕後、

  • out to get black patients,

    存心要害黑人患者的大壞蛋,

  • but precisely because no one was thinking

    但正是因為沒有人想到

  • about racial disparities in healthcare.

    關於醫療保健方面的種族差異。

  • No one thought it would matter.

    沒有人認為這很重要。

  • And so it was about the colorblindness,

    所以,正是這種對膚色視而不見的態度、

  • the race neutrality that created this.

    這種種族中立,造成了這一切。

  • The good news is that now the researchers who exposed this

    好消息是,現在曝光這一問題的研究人員

  • and who brought this to light are working with the company

    並將此事公之於眾的人,正在與

  • that produced this algorithm to have a better proxy.

    開發這個算法的公司合作,以便採用更好的代理指標。

  • So instead of spending, it'll actually be

    所以,取代支出的,實際上將是

  • people's actual physical conditions

    人們的實際身體狀況

  • and the rate at which they get sick, et cetera,

    以及他們生病的速度,等等。

  • that is harder to figure out,

    這就更難搞清楚了。

  • it's a harder kind of proxy to calculate,

    這是一種更難計算的代理。

  • but it's more accurate.

    但它更準確。

  • I feel like what's so unsettling about this healthcare algorithm

    我覺得這種醫療保健算法令人不安的地方在於

  • is that the patients would have had

    就是病人根本

  • no way of knowing this was happening.

    無從知道這件事正在發生。

  • It's not like Twitter, where you can upload

    它不像Twitter,在那裡你可以上傳

  • your own picture, test it out, compare with other people.

    你自己的照片,測試它,與其他人比較。

  • This was just working in the background,

    這只是在後臺工作。

  • quietly prioritizing the care of certain patients

    悄悄地將某些病人的護理列為優先事項

  • based on an algorithmic score

    基於一個算法的得分

  • while the other patients probably never knew

    而其他病人可能永遠不知道

  • they were even passed over for this program.

    他們甚至被排除在這個計劃之外。

  • I feel like there has to be a way

    我覺得一定有辦法的

  • for companies to vet these systems in advance,

    對公司來說,事先對這些系統進行審查。

  • so I'm excited to talk to Deborah Raji.

    所以我很高興能與黛博拉-拉吉交談。

  • She's been doing a lot of thinking

    她一直在做大量的思考

  • and writing about just that.

    並就這一點進行了寫作。

  • My question for you is how do we find out

    我想問的是,我們如何才能發現

  • about these problems before they go out into the world

    在這些問題走向世界、造成傷害之前

  • and cause harm rather than afterwards?

    就發現它們,而不是事後才發現?

  • So, I guess a clarification point is that machine learning

    所以,我想澄清的一點是,機器學習

  • is highly unregulated as an industry.

    作為一個行業,它是非常不受監管的。

  • These companies don't have to report their performance metrics,

    這些公司不需要報告他們的業績指標。

  • they don't have to report their evaluation results

    他們不需要報告他們的評估結果

  • to any kind of regulatory body.

    向任何類型的監管機構。

  • But internally there's this new culture of documentation

    但在內部,有這種新的文件文化

  • that I think has been incredibly productive.

    我認為它的成效非常好。

  • I worked on a couple of projects with colleagues at Google,

    我和谷歌的同事一起做了幾個項目。

  • and one of the main outcomes of that was this effort called Model Cards--

    其主要成果之一,是一項名為"模型卡"(Model Cards)的工作--

  • very simple one-page documentation

    非常簡單的單頁文件

  • on how the model actually works,

    關於該模型如何實際工作。

  • but also questions that are connected to ethical concerns,

    但也有與倫理問題相關的問題。

  • such as the intended use for the model,

    如該模型的預期用途。

  • details about where the data's coming from,

    關於數據來源的細節。

  • how the data's labeled, and then also, you know,

    數據是如何標記的,然後還有,你知道。

  • instructions to evaluate the system according to its performance

    以及關於如何評估系統

  • on different demographic sub-groups.

    在不同人口子群體上表現的說明。
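
As a rough illustration of that one-page documentation, here is a hypothetical Python structure with the fields named in the transcript (intended use, data provenance, labeling, per-subgroup evaluation). The field names and example values are illustrative, not the official Model Cards schema.

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    model_name: str
    intended_use: str
    data_sources: list
    labeling_process: str
    subgroup_metrics: dict = field(default_factory=dict)

card = ModelCard(
    model_name="saliency-cropper-demo",
    intended_use="Choosing a preview crop for user-uploaded photos",
    data_sources=["2009 eye-tracking dataset built from Flickr images"],
    labeling_process="Human fixation points aggregated into per-image heatmaps",
    # Crop shares from the informal test earlier in the episode, used here
    # only as placeholder subgroup numbers.
    subgroup_metrics={"lighter-skinned faces": 0.64, "darker-skinned faces": 0.36},
)
print(card.subgroup_metrics)
```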

  • Maybe that's something that's hard to accept

    也許這是很難接受的事情

  • is that it would actually be maybe impossible

    就是,要讓各個子群體的表現完全相同

  • to get performance across sub-groups to be exactly the same.

    實際上也許是不可能的。

  • How much of that do we just have to be like, "Okay"?

    這其中有多少,是我們只能接受說一聲"好吧"的?

  • I really don't think there's an unbiased data set

    我真的不認為有一個無偏見的數據集

  • in which everything will be perfect.

    在其中,一切都將是完美的。

  • I think the more important thing is to actually evaluate

    我認為更重要的是要實際評估

  • and assess things with an active eye

    並以積極的眼光評估事物

  • for those that are most likely to be negatively impacted.

    為那些最有可能受到負面影響的人。

  • You know, if you know that people of color are most vulnerable

    你知道,如果你知道有色人種是最脆弱的

  • or a particular marginalized group is most vulnerable

    或某一特定的邊緣化群體最容易受到傷害

  • in a particular situation,

    在一個特定的情況下。

  • then prioritize them in your evaluation.

    然後在你的評估中對它們進行優先排序。

  • But I do think there's certain situations

    但我確實認為在某些情況下

  • where maybe we should not be predicting

    也許我們根本

  • with a machine-learning system at all.

    就不應該用機器學習系統來做預測。

  • We should be super cautious and super careful

    我們應該超級謹慎,超級小心

  • about where we deploy it and where we don't deploy it,

    關於我們在哪裡部署和在哪裡不部署的問題。

  • and what kind of human oversight

    以及我們要對這些系統

  • we put over these systems as well.

    施加什麼樣的人工監督。

  • The problem of bias in AI is really big.

    人工智能中的偏見問題確實很大。

  • It's really difficult.

    這真的很困難。

  • But I don't think it means we have to give up

    但我不認為這意味著我們必須放棄

  • on machine learning altogether.

    完全放棄機器學習。

  • One benefit of bias in a computer versus bias in a human

    計算機中的偏見與人類中的偏見相比,有一個好處

  • is that you can measure and track it fairly easily.

    是,你可以相當容易地測量和跟蹤它。

  • And you can tinker with your model

    而且你可以對你的模型進行修補

  • to try and get fair outcomes if you're motivated to do so.

    試圖獲得公平的結果,如果你有這樣的動機。
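
One simple way to "measure and track" bias, sketched below with a made-up metric choice: the demographic parity gap, i.e. the difference in positive-decision rates between two groups. The data and numbers are toy values; in practice you would choose metrics suited to the application.

```python
import numpy as np

def demographic_parity_gap(decisions: np.ndarray, group: np.ndarray) -> float:
    """Absolute difference in positive-decision rates between group 0 and group 1."""
    return float(abs(decisions[group == 0].mean() - decisions[group == 1].mean()))

# Toy model decisions (1 = favourable outcome) and group membership.
decisions = np.array([1, 0, 1, 1, 0, 1, 0, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])

print(demographic_parity_gap(decisions, group))  # 0.75 vs 0.25 -> gap of 0.5
```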

  • The first step was becoming aware of the problem.

    第一步是意識到這個問題。

  • Now the second step is enforcing solutions,

    現在,第二步是強制執行解決方案。

  • which I think we're just beginning to see now.

    我認為我們現在剛剛開始看到這一點。

  • But Deb is raising a bigger question.

    但是Deb提出了一個更大的問題。

  • Not just how do we get bias out of the algorithms,

    不僅僅是我們如何消除算法中的偏見,

  • but which algorithms should be used at all?

    但到底應該使用哪些算法?

  • Do we need a predictive model to be cropping our photos?

    我們需要一個預測模型來剪裁我們的照片嗎?

  • Do we want facial recognition in our communities?

    我們是否希望在我們的社區進行面部識別?

  • Many would say no, whether it's biased or not.

    許多人會說不,不管它是否有偏見。

  • And that question of which technologies

    而哪些技術的問題

  • get built and how they get deployed in our world,

    在我們的世界中,它們是如何被建造和部署的。

  • it boils down to resources and power.

    歸根結底是資源和權力。

  • It's the power to decide whose interests

    它是決定誰的利益的權力

  • will be served by a predictive model,

    將由一個預測模型提供服務。

  • and which questions get asked.

    以及哪些問題會被問到。

  • You could ask, okay, I want to know how landlords

    你可以問,好吧,我想知道房東如何

  • are making life for renters hard.

    讓租房者的生活變得艱難。

  • Which landlords are not fixing up their buildings?

    哪些房東沒有修繕他們的建築?

  • Which ones are hiking rent?

    哪些房東在抬高租金?

  • Or you could ask, okay, let's figure out

    或者你可以問,好吧,讓我們弄清楚

  • which renters have low credit scores.

    哪些租房者的信用分數低。

  • Let's figure out the people who have a gap in unemployment

    讓我們找出那些就業記錄有空窗期的人,

  • so I don't want to rent to them.

    所以我不想租給他們。

  • And so it's at that problem

    所以就在這個問題上

  • of forming the question

    形成問題的

  • and posing the problem

    並提出問題

  • that the power dynamics are already being laid

    權力的動態已經形成了

  • that set us off in one trajectory or another.

    這使我們在一個或另一個軌道上出發。

  • And the big challenge there being that

    而其中最大的挑戰是

  • with these two possible lines of inquiry,

    與這兩條可能的調查路線。

  • - one of those is probably a lot more profitable... - Exactly, exactly.

    - 其中一條可能要有利可圖得多...... - 正是,正是。

  • - ...than the other one. - And too often the people who are creating these tools,

    - ......比另一條。- 而且太多時候,創造這些工具的人,

  • they don't necessarily have to share the interests

    不一定與那些提出問題的人

  • of the people who are posing the questions,

    有著共同的利益,

  • but those are their clients.

    但這些是他們的客戶。

  • So, the question for the designers and the programmers is

    是以,設計師和程序員的問題是

  • are you accountable only to your clients

    你只對你的客戶負責嗎?

  • or are you also accountable to the larger body politic?

    還是你也要對更大的政治體負責?

  • Are you responsible for what these tools do in the world?

    你對這些工具在世界上所做的事情負責嗎?

  • ( music playing )

    ( 音樂播放 )

  • ( indistinct chatter )

    (聽不清的交談聲)

  • Man: Can you lift up your arm a little?

    男子:你能把你的手臂抬起來一點嗎?

  • ( chatter continues )

    ( 談話繼續 )
