I'd like to talk today about a powerful and fundamental aspect of who we are: our voice. Each one of us has a unique voiceprint that reflects our age, our size, even our lifestyle and personality. In the words of the poet Longfellow, "the human voice is the organ of the soul." As a speech scientist, I'm fascinated by how the voice is produced, and I have an idea for how it can be engineered. That's what I'd like to share with you.

I'm going to start by playing you a sample of a voice that you may recognize.

(Recording) Stephen Hawking: "I would have thought it was fairly obvious what I meant."

Rupal Patel: That was the voice of Professor Stephen Hawking. What you may not know is that same voice may also be used by this little girl who is unable to speak because of a neurological condition. In fact, all of these individuals may be using the same voice, and that's because there's only a few options available. In the U.S. alone, there are 2.5 million Americans who are unable to speak, and many of whom use computerized devices to communicate. Now that's millions of people worldwide who are using generic voices, including Professor Hawking, who uses an American-accented voice.
This lack of individuation of the synthetic voice really hit home when I was at an assistive technology conference a few years ago, and I recall walking into an exhibit hall and seeing a little girl and a grown man having a conversation using their devices, different devices, but the same voice. And I looked around and I saw this happening all around me, literally hundreds of individuals using a handful of voices, voices that didn't fit their bodies or their personalities. We wouldn't dream of fitting a little girl with the prosthetic limb of a grown man. So why then the same prosthetic voice? It really struck me, and I wanted to do something about this.

I'm going to play you now a sample of someone who has, two people actually, who have severe speech disorders. I want you to take a listen to how they sound. They're saying the same utterance.

(First voice)

(Second voice)

You probably didn't understand what they said, but I hope that you heard their unique vocal identities. So what I wanted to do next is, I wanted to find out how we could harness these residual vocal abilities and build a technology that could be customized for them, voices that could be customized for them.

So I reached out to my collaborator, Tim Bunnell. Dr. Bunnell is an expert in speech synthesis, and what he'd been doing is building personalized voices for people by putting together pre-recorded samples of their voice and reconstructing a voice for them. These are people who had lost their voice later in life.
We didn't have the luxury of pre-recorded samples of speech for those born with speech disorder. But I thought, there had to be a way to reverse engineer a voice from whatever little is left over. So we decided to do exactly that. We set out with a little bit of funding from the National Science Foundation, to create custom-crafted voices that captured their unique vocal identities. We call this project VocaliD, or vocal I.D., for vocal identity.

Now before I get into the details of how the voice is made and let you listen to it, I need to give you a real quick speech science lesson. Okay?

So first, we know that the voice is changing dramatically over the course of development. Children sound different from teens who sound different from adults. We've all experienced this.

Fact number two is that speech is a combination of the source, which is the vibrations generated by your voice box, which are then pushed through the rest of the vocal tract. These are the chambers of your head and neck that vibrate, and they actually filter that source sound to produce consonants and vowels. So the combination of source and filter is how we produce speech. And that happens in one individual.
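The source-and-filter description above can be sketched in a few lines of Python. This is only a toy illustration, not the method used in the talk: it assumes the source can be stood in for by an impulse train at a chosen pitch and the filter by a cascade of two-pole resonators. The function names, the 16 kHz sample rate, and the pitch and resonance values are all hypothetical placeholders, not measurements.

```python
import math

def glottal_source(f0_hz, duration_s, sample_rate=16000):
    """Impulse train standing in for the voice-box vibration (the 'source')."""
    n = round(duration_s * sample_rate)
    period = round(sample_rate / f0_hz)  # samples between pulses
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]

def resonator(signal, center_hz, bandwidth_hz, sample_rate=16000):
    """Two-pole resonator standing in for one vocal-tract resonance (the 'filter')."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, < 1 so stable
    theta = 2 * math.pi * center_hz / sample_rate        # pole angle
    a1, a2 = 2 * r * math.cos(theta), -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2  # y[n] = x[n] + a1*y[n-1] + a2*y[n-2]
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize(f0_hz, resonances, duration_s=0.05, sample_rate=16000):
    """Push the source through each resonance in turn."""
    signal = glottal_source(f0_hz, duration_s, sample_rate)
    for center_hz, bandwidth_hz in resonances:
        signal = resonator(signal, center_hz, bandwidth_hz, sample_rate)
    return signal

# Hypothetical numbers: one speaker's pitch combined with another speaker's resonances.
pitch_hz = 220.0
resonances = [(730.0, 90.0), (1090.0, 110.0)]
samples = synthesize(pitch_hz, resonances)
```

Changing `pitch_hz` alters only the source, and changing `resonances` alters only the filter, which is the separation the talk relies on.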
Now I told you earlier that I'd spent a good part of my career understanding and studying the source characteristics of people with severe speech disorder, and what I've found is that even though their filters were impaired, they were able to modulate their source: the pitch, the loudness, the tempo of their voice. These are called prosody, and I've been documenting for years that the prosodic abilities of these individuals are preserved.

So when I realized that those same cues are also important for speaker identity, I had this idea. Why don't we take the source from the person we want the voice to sound like, because it's preserved, and borrow the filter from someone about the same age and size, because they can articulate speech, and then mix them? Because when we mix them, we can get a voice that's as clear as our surrogate talker (that's the person we borrowed the filter from) and is similar in identity to our target talker. It's that simple. That's the science behind what we're doing.

So once you have that in mind, how do you go about building this voice? Well, you have to find someone who is willing to be a surrogate. It's not such an ominous thing. Being a surrogate donor only requires you to say a few hundred to a few thousand utterances. The process goes something like this.

(Video) Voice: Things happen in pairs. I love to sleep. The sky is blue without clouds.
RP: Now she's going to go on like this for about three to four hours, and the idea is not for her to say everything that the target is going to want to say, but the idea is to cover all the different combinations of the sounds that occur in the language. The more speech you have, the better sounding voice you're going to have.

Once you have those recordings, what we need to do is we have to parse these recordings into little snippets of speech, one- or two-sound combinations, sometimes even whole words that start populating a dataset or a database. We're going to call this database a voice bank. Now the power of the voice bank is that from this voice bank, we can now say any new utterance, like, "I love chocolate" (everyone needs to be able to say that), fish through that database and find all the segments necessary to say that utterance.

(Video) Voice: I love chocolate.

RP: So that's speech synthesis. It's called concatenative synthesis, and that's what we're using. That's not the novel part. What's novel is how we make it sound like this young woman.

This is Samantha. I met her when she was nine, and since then, my team and I have been trying to build her a personalized voice. We first had to find a surrogate donor, and then we had to have Samantha produce some utterances. What she can produce are mostly vowel-like sounds, but that's enough for us to extract her source characteristics. What happens next is best described by my daughter's analogy. She's six.
She calls it mixing colors to paint voices. It's beautiful. It's exactly that. Samantha's voice is like a concentrated sample of red food dye which we can infuse into the recordings of her surrogate to get a pink voice just like this.

(Video) Samantha: Aaaaaah.

RP: So now, Samantha can say this.

(Video) Samantha: This voice is only for me. I can't wait to use my new voice with my friends.

RP: Thank you. (Applause)

I'll never forget the gentle smile that spread across her face when she heard that voice for the first time. Now there's millions of people around the world like Samantha, millions, and we've only begun to scratch the surface. What we've done so far is we have a few surrogate talkers from around the U.S. who have donated their voices, and we have been using those to build our first few personalized voices. But there's so much more work to be done. For Samantha, her surrogate came from somewhere in the Midwest, a stranger who gave her the gift of voice.

And as a scientist, I'm so excited to take this work out of the laboratory and finally into the real world so it can have real-world impact. What I want to share with you next is how I envision taking this work to that next level. I imagine a whole world of surrogate donors from all walks of life, different sizes, different ages, coming together in this voice drive to give people voices that are as colorful as their personalities.
To do that as a first step, we've put together this website, VocaliD.org, as a way to bring together those who want to join us as voice donors, as expertise donors, in whatever way to make this vision a reality.

They say that giving blood can save lives. Well, giving your voice can change lives. All we need is a few hours of speech from our surrogate talker, and as little as a vowel from our target talker, to create a unique vocal identity.

So that's the science behind what we're doing. I want to end by circling back to the human side that is really the inspiration for this work. About five years ago, we built our very first voice for a little boy named William. When his mom first heard this voice, she said, "This is what William would have sounded like had he been able to speak." And then I saw William typing a message on his device. I wondered, what was he thinking? Imagine carrying around someone else's voice for nine years and finally finding your own voice. Imagine that. This is what William said: "Never heard me before."

Thank you.

(Applause)
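The voice-bank lookup described earlier in the talk, where a new utterance is assembled from previously recorded snippets, can be sketched roughly as follows. This is a simplified illustration under stated assumptions: it indexes whole words rather than the one- or two-sound combinations the talk mentions, it uses string labels in place of audio clips, and the function names and the fourth recording ("chocolate is sweet") are hypothetical additions so that the example utterance is covered.

```python
def build_voice_bank(transcripts):
    """Index each recorded word by a label naming the recording and position it came from.

    Labels such as 'rec1_word0' are placeholders for real audio snippets.
    """
    bank = {}
    for rec_idx, transcript in enumerate(transcripts):
        for pos, word in enumerate(transcript.lower().split()):
            # Keep the first snippet seen for each word.
            bank.setdefault(word, f"rec{rec_idx}_word{pos}")
    return bank

def concatenate(utterance, bank):
    """Return the ordered snippet labels needed to say a new utterance."""
    words = utterance.lower().split()
    missing = [w for w in words if w not in bank]
    if missing:
        raise KeyError(f"no recorded snippet for: {missing}")
    return [bank[w] for w in words]

# The first three sentences are the ones read in the talk; the fourth is hypothetical.
recordings = [
    "things happen in pairs",
    "i love to sleep",
    "the sky is blue without clouds",
    "chocolate is sweet",
]
bank = build_voice_bank(recordings)
plan = concatenate("I love chocolate", bank)
print(plan)
```

A word-level index like this fails on any word that was never recorded, which is one reason for recording sound combinations instead: a finite set of them can cover new words.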
[TED] Rupal Patel: Synthetic voices, as unique as fingerprints. Posted by Penguin on January 14, 2021.