  • I'd like to talk today

    我今天想給大家介紹

  • about a powerful and fundamental aspect

    一個對我們身份有重要影響的因素

  • of who we are: our voice.

    那就是:聲音

  • Each one of us has a unique voiceprint

    我們每一個人都有獨特的音印

  • that reflects our age, our size,

    它反映了我們的年紀, 體型,

  • even our lifestyle and personality.

    甚至我們的性格與生活習慣

  • In the words of the poet Longfellow,

    以詩人亨利·沃茲沃思·朗費羅的話說:

  • "the human voice is the organ of the soul."

    "人類的聲音就是靈魂的器官."

  • As a speech scientist, I'm fascinated

    做為一個語言科學家, 我對聲音產生的過程

  • by how the voice is produced,

    有著濃厚的興趣,

  • and I have an idea for how it can be engineered.

    我對如何來設計與建造聲音 有一個新的看法

  • That's what I'd like to share with you.

    我想和大家分享的這個看法

  • I'm going to start by playing you a sample

    先給大家放一個實例

  • of a voice that you may recognize.

    你們也許認得這個聲音

  • (Recording) Stephen Hawking: "I would have thought

    (錄音) 史蒂芬‧霍金:"我以為我說的話

  • it was fairly obvious what I meant."

    還是比較清楚的"

  • Rupal Patel: That was the voice

    這個錄音裡的聲音

  • of Professor Stephen Hawking.

    是來自史蒂芬‧霍金教授

  • What you may not know is that same voice

    但是你也許不知道同一個聲音

  • may also be used by this little girl

    也可能被這個小女孩使用

  • who is unable to speak

    她因為神經的問題

  • because of a neurological condition.

    而無法說話

  • In fact, all of these individuals

    事實上, 所有這些人

  • may be using the same voice,

    都可能用著同一個聲音,

  • and that's because there's only a few options available.

    因為目前可用的聲音只有幾個

  • In the U.S. alone, there are 2.5 million Americans

    僅在美國就有250萬人

  • who are unable to speak,

    無法通過語言溝通,

  • and many of whom use computerized devices

    他們大多數

  • to communicate.

    使用電子設備來溝通

  • Now that's millions of people worldwide

    這意味著全世界有數百萬的人

  • who are using generic voices,

    都用著同樣的聲音,

  • including Professor Hawking,

    其中包括了霍金教授,

  • who uses an American-accented voice.

    他用的是帶有美式口音的聲音

  • This lack of individuation of the synthetic voice

    這種人工聲音缺少的個體性

  • really hit home

    讓我非常的驚訝,

  • when I was at an assistive technology conference

    當我幾年前

  • a few years ago,

    在一個輔具科技會議上,

  • and I recall walking into an exhibit hall

    我記得走進一個展覽廳

  • and seeing a little girl and a grown man

    看見一個小女孩和一個成年男子

  • having a conversation using their devices,

    通過他們的設備談話,

  • different devices, but the same voice.

    雖然設備不同, 但聲音卻是一樣的

  • And I looked around and I saw this happening

    我望了望四周,發現

  • all around me, literally hundreds of individuals

    周圍有幾百個人

  • using a handful of voices,

    使用的聲音卻只有幾種

  • voices that didn't fit their bodies

    都不符合他們的身體

  • or their personalities.

    或是性格.

  • We wouldn't dream of fitting a little girl

    我們不會考慮給一個小女孩裝上

  • with the prosthetic limb of a grown man.

    一個成年男子的假肢

  • So why then the same prosthetic voice?

    那為甚麼要給她一個 不屬於自己的聲音呢?

  • It really struck me,

    我因為感觸很深,

  • and I wanted to do something about this.

    所以決定對此做些甚麼

  • I'm going to play you now a sample

    接下來我要播放的例子

  • of someone who has, two people actually,

    是兩個人,

  • who have severe speech disorders.

    他們都有嚴重的語言障礙

  • I want you to take a listen to how they sound.

    我希望大家聽聽看他們的聲音

  • They're saying the same utterance.

    二人說的是一樣的話

  • (First voice)

    (聲音一)

  • (Second voice)

    (聲音二)

  • You probably didn't understand what they said,

    你們也許沒聽懂他們的話,

  • but I hope that you heard

    但我希望你們注意到了

  • their unique vocal identities.

    他們聲音中的獨特性

  • So what I wanted to do next is,

    我接下來要做的是,

  • I wanted to find out how we could harness

    找到一個方法來

  • these residual vocal abilities

    利用這些剩餘的聲音特性

  • and build a technology

    來發明一套科技

  • that could be customized for them,

    專為他們設計

  • voices that could be customized for them.

    將他們的聲音個性化,

  • So I reached out to my collaborator, Tim Bunnell.

    我找到了我的合作人, 蒂姆·布涅爾

  • Dr. Bunnell is an expert in speech synthesis,

    布涅爾博士是智能語音方面的專家,

  • and what he'd been doing is building

    他一直都在為

  • personalized voices for people

    他人設計個性化的語音

  • by putting together

    方法是通過收集

  • pre-recorded samples of their voice

    這些人之前的聲音錄音

  • and reconstructing a voice for them.

    然後再為他們重建一種聲音

  • These are people who had lost their voice

    但是布涅爾博士的這些研究對象

  • later in life.

    遇到的問題是後天性語言障礙

  • We didn't have the luxury

    我們這次的研究沒有這個福利

  • of pre-recorded samples of speech

    對這些先天帶有語言障礙的人

  • for those born with speech disorder.

    我們沒有事先錄製好的聲音樣品

  • But I thought, there had to be a way

    但是我想了想, 一定有一個方法

  • to reverse engineer a voice

    可以從僅有的所剩中

  • from whatever little is left over.

    將聲音逆向製作出來

  • So we decided to do exactly that.

    所以我們決定就這樣做

  • We set out with a little bit of funding from the National Science Foundation,

    我們從國家科學基金會獲得了一些資金,

  • to create custom-crafted voices that captured

    用以建造一套可以抓住他們

  • their unique vocal identities.

    聲音特性的個體化語音

  • We call this project VocaliD, or vocal I.D.,

    我們將該專案稱作VocaliD, 或是vocal I.D.,

  • for vocal identity.

    作為語音身份(Vocal Identity)的簡寫

  • Now before I get into the details of how

    在我向大家播放

  • the voice is made and let you listen to it,

    和介紹如何製作這個聲音之前,

  • I need to give you a real quick speech science lesson. Okay?

    我需要先給大家上一堂 語言科學課, 好嗎?

  • So first, we know that the voice is changing

    首先,我們需要了解聲音

  • dramatically over the course of development.

    在成長的過程中會發生巨大的變化

  • Children sound different from teens

    兒童和青少年聽起來會不同

  • who sound different from adults.

    而青少年和成年人之間也是

  • We've all experienced this.

    我們都曾經歷過這些語言變化階段

  • Fact number two is that speech

    事實二,是語言的產生

  • is a combination of the source,

    結合了聲源,

  • which is the vibrations generated by your voice box,

    也就是你喉頭產生的振動,

  • which are then pushed through

    這種顫動接著

  • the rest of the vocal tract.

    會貫穿整個聲腔

  • These are the chambers of your head and neck

    圖像顯示的是頭和脖子的內部

  • that vibrate,

    它們會顫動,

  • and they actually filter that source sound

    它們其實是對聲源的聲音進行過濾

  • to produce consonants and vowels.

    來產生子音和母音

  • So the combination of source and filter

    所以聲音的來源和過濾過程加在一起

  • is how we produce speech.

    就是我們產生聲音的方法
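
The source-and-filter idea above can be sketched in a few lines of Python. This is a textbook simplification, not the method used in the project: the source is a bare pulse train, the filter is a cascade of two-pole resonators, and the names `impulse_train`, `resonator`, `synthesize` and the formant values are assumptions made for illustration.

```python
import math

def impulse_train(f0_hz, sample_rate, n_samples):
    """Source: one unit pulse per cycle, a crude stand-in for voice-box vibration."""
    period = int(round(sample_rate / f0_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Filter: a two-pole resonator that emphasizes energy near one formant."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize(f0_hz, formants, sample_rate=16000, n_samples=1600):
    """Push the source through each formant resonator in turn."""
    wave = impulse_train(f0_hz, sample_rate, n_samples)
    for freq_hz, bandwidth_hz in formants:
        wave = resonator(wave, freq_hz, bandwidth_hz, sample_rate)
    return wave

# Illustrative (assumed) formant frequencies and bandwidths for an "ah"-like vowel.
AH_FORMANTS = [(700.0, 130.0), (1220.0, 70.0), (2600.0, 160.0)]

low_voice = synthesize(110.0, AH_FORMANTS)   # same filter, lower-pitched source
high_voice = synthesize(220.0, AH_FORMANTS)  # same filter, higher-pitched source
```

Both calls share one filter and differ only in the source, which is the separation the talk relies on.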

  • And that happens in one individual.

    這是一個人身上發生的過程

  • Now I told you earlier that I'd spent

    我之前告訴過大家

  • a good part of my career

    我職業生涯的大部分時間

  • understanding and studying

    都用來研究和學習

  • the source characteristics of people

    有嚴重語音障礙人士的

  • with severe speech disorder,

    聲音源的特徵,

  • and what I've found

    我發現

  • is that even though their filters were impaired,

    雖然他們的過濾器官已遭到損壞,

  • they were able to modulate their source:

    他們可以調製自己的聲音來源:

  • the pitch, the loudness, the tempo of their voice.

    包括高低度, 大小, 以及速度

  • These are called prosody, and I've been documenting for years

    這些被稱之為音律,

  • that the prosodic abilities of these individuals

    我用了多年的時間 來紀錄這些人是如何

  • are preserved.

    維持自己音律的能力
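
Two of the prosodic cues named above, pitch and loudness, can be measured from a waveform with very little code. The sketch below is only an illustration: it assumes a clean, steady signal, uses plain autocorrelation for pitch and root-mean-square level for loudness, and leaves tempo out. The function names and the test tone are hypothetical.

```python
import math

def rms_loudness(samples):
    """Loudness proxy: root-mean-square amplitude of the signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def estimate_pitch_hz(samples, sample_rate, min_hz=60.0, max_hz=400.0):
    """Pitch proxy: the lag with the strongest autocorrelation in the search range."""
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# A synthetic 200 Hz tone stands in for a sustained vowel-like sound.
SAMPLE_RATE = 8000
tone = [0.5 * math.sin(2.0 * math.pi * 200.0 * i / SAMPLE_RATE)
        for i in range(800)]

pitch = estimate_pitch_hz(tone, SAMPLE_RATE)
loudness = rms_loudness(tone)
```

Real recordings of disordered speech are far noisier than this tone, so a practical system would need a more robust pitch tracker.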

  • So when I realized that those same cues

    當我認識到同樣的線索

  • are also important for speaker identity,

    對說話人的身份同樣重要的時候,

  • I had this idea.

    我有了一個想法

  • Why don't we take the source

    為什麼我們不找一個 聲音是我們所需要的人,

  • from the person we want the voice to sound like,

    從他那採集聲音源

  • because it's preserved,

    因為它已被保留,

  • and borrow the filter

    然後再找一個有著相似年紀和體型的人

  • from someone about the same age and size,

    從他那借用過濾器,

  • because they can articulate speech,

    因為他們能清晰地說話,

  • and then mix them?

    然後將二者混合?

  • Because when we mix them,

    因為當我們將它們混合的時候,

  • we can get a voice that's as clear

    我們得到的聲音將會和

  • as our surrogate talker --

    那個代替說話者一樣清楚

  • that's the person we borrowed the filter from --

    代替說話者就是我們借用過濾器的人

  • and is similar in identity to our target talker.

    而產生的語音和我們 目標說話者有相似的辨認度

  • It's that simple.

    就這麼簡單
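
The mixing step can be pictured as a simple data operation: keep the target talker's source, take the surrogate's filter. The Python sketch below only models that bookkeeping. The `Source`, `Talker` and `BlendedVoice` types, the `blend` function and all numeric values are hypothetical, and real voice blending works on audio, not on a handful of numbers.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Source:
    """Prosodic source characteristics: pitch, loudness, tempo."""
    pitch_hz: float
    loudness_db: float
    tempo_syllables_per_sec: float

@dataclass(frozen=True)
class Talker:
    name: str
    source: Source
    formants: Tuple[Tuple[float, float], ...]  # filter: (frequency, bandwidth) pairs
    can_articulate: bool

@dataclass(frozen=True)
class BlendedVoice:
    source: Source
    formants: Tuple[Tuple[float, float], ...]

def blend(target: Talker, surrogate: Talker) -> BlendedVoice:
    """Take the source from the target talker and the filter from the surrogate."""
    if not surrogate.can_articulate:
        raise ValueError("the surrogate must be able to articulate speech")
    return BlendedVoice(source=target.source, formants=surrogate.formants)

# Made-up values for illustration only.
target = Talker("target", Source(210.0, 62.0, 2.5),
                formants=((900.0, 150.0),), can_articulate=False)
surrogate = Talker("surrogate", Source(180.0, 65.0, 4.0),
                   formants=((700.0, 130.0), (1220.0, 70.0)), can_articulate=True)

voice = blend(target, surrogate)
```

The result carries the target's pitch, loudness and tempo with the surrogate's articulation, matching the "clear as the surrogate, similar in identity to the target" description.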

  • That's the science behind what we're doing.

    這就是我們這項研究背後的科學

  • So once you have that in mind,

    有了這個想法以後,

  • how do you go about building this voice?

    應該怎麼來製造這個聲音呢?

  • Well, you have to find someone

    首先,你必須找一個

  • who is willing to be a surrogate.

    願意當這個代替者的人

  • It's not such an ominous thing.

    這個任務也不是太糟糕

  • Being a surrogate donor

    當一個聲音捐贈者

  • only requires you to say a few hundred

    只要求你閱讀幾百

  • to a few thousand utterances.

    到幾千句話.

  • The process goes something like this.

    以下是過程

  • (Video) Voice: Things happen in pairs.

    (錄影)聲音: 事情成雙成對地發生

  • I love to sleep.

    我愛睡覺

  • The sky is blue without clouds.

    天空藍色無雲

  • RP: Now she's going to go on like this

    演講者: 她接下來的3-4個小時

  • for about three to four hours,

    都會繼續閱讀,

  • and the idea is not for her to say everything

    目的是不要讓她說

  • that the target is going to want to say,

    所有目標說話者要說的話

  • but the idea is to cover all the different combinations

    真正的目的是要概括所有

  • of the sounds that occur in the language.

    在語言中可能發生的組合

  • The more speech you have,

    你說的話越多,

  • the better sounding voice you're going to have.

    你的聲音就會聽起來更好

  • Once you have those recordings,

    當錄音完成後,

  • what we need to do

    我們接下來

  • is we have to parse these recordings

    要對這些錄音進行切分

  • into little snippets of speech,

    將它們分段,

  • one- or two-sound combinations,

    大概1-2個音的組合,

  • sometimes even whole words

    有時候也會是那些

  • that start populating a dataset or a database.

    填入數據集或是數據庫的完整單字

  • We're going to call this database a voice bank.

    我們將這個數據庫稱之為聲音銀行

  • Now the power of the voice bank

    聲音銀行的力量

  • is that from this voice bank,

    使我們通過它

  • we can now say any new utterance,

    可以說出任何新的語句,

  • like, "I love chocolate" --

    比如說, "我喜歡巧克力"

  • everyone needs to be able to say that --

    所有人都需要說這類的話的能力

  • fish through that database

    搜尋數據庫

  • and find all the segments necessary

    找到必須的部分

  • to say that utterance.

    來完成這個語句

  • (Video) Voice: I love chocolate.

    (錄影)聲音: 我喜歡巧克力

  • RP: So that's speech synthesis.

    演講人: 這是一個人工聲音

  • It's called concatenative synthesis, and that's what we're using.

    我們將其稱之為連環整合 我們使用的就是這個方法
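
A voice bank lookup of this kind can be sketched as follows. This is a minimal illustration of concatenative synthesis, not the project's code: the bank is a plain dictionary, snippets are tiny lists of made-up sample values, the phoneme symbols are an assumed ARPAbet-style spelling, and selection is a greedy longest-match that prefers bigger units.

```python
from typing import Dict, List, Sequence, Tuple

Unit = Tuple[str, ...]  # one or more sounds recorded together

def select_units(phonemes: Sequence[str],
                 bank: Dict[Unit, List[float]],
                 max_len: int = 3) -> List[Unit]:
    """Walk the utterance left to right, taking the longest snippet in the bank."""
    units: List[Unit] = []
    i = 0
    while i < len(phonemes):
        for size in range(min(max_len, len(phonemes) - i), 0, -1):
            candidate = tuple(phonemes[i:i + size])
            if candidate in bank:
                units.append(candidate)
                i += size
                break
        else:
            raise KeyError(f"no recorded snippet covers {phonemes[i]!r}")
    return units

def concatenate(units: Sequence[Unit], bank: Dict[Unit, List[float]]) -> List[float]:
    """Join the chosen snippets end to end into one waveform."""
    wave: List[float] = []
    for unit in units:
        wave.extend(bank[unit])
    return wave

# Toy voice bank: keys are sound combinations, values are placeholder samples.
voice_bank: Dict[Unit, List[float]] = {
    ("AY",): [0.1],
    ("L", "AH", "V"): [0.2, 0.3],
    ("CH", "AO"): [0.4],
    ("K",): [0.5],
    ("L", "AH"): [0.6],
    ("T",): [0.7],
}

# "I love chocolate" in the assumed phoneme spelling.
utterance = ["AY", "L", "AH", "V", "CH", "AO", "K", "L", "AH", "T"]
chosen = select_units(utterance, voice_bank)
waveform = concatenate(chosen, voice_bank)
```

Greedy matching is the simplest choice here. It can fail on an utterance that a different split would cover, and it ignores how smoothly neighboring snippets join, which a real synthesizer has to handle.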

  • That's not the novel part.

    這不是新奇的部分

  • What's novel is how we make it sound

    它新奇之處是我們使它

  • like this young woman.

    聽起來就像是這個年輕女士的聲音

  • This is Samantha.

    她是珊曼莎

  • I met her when she was nine,

    在她9歲時, 我第一次見到她

  • and since then, my team and I

    在那之後, 我和我的團隊

  • have been trying to build her a personalized voice.

    一直設法為她製造一款個性化的聲音

  • We first had to find a surrogate donor,

    我們首先需要一個捐贈者,

  • and then we had to have Samantha

    然後我們會讓珊曼莎

  • produce some utterances.

    發一些音

  • What she can produce are mostly vowel-like sounds,

    雖然她所發出的音大部分都類似母音,

  • but that's enough for us to extract

    但我們用這些已足夠

  • her source characteristics.

    來取得她聲音根源的特性

  • What happens next is best described

    接下來所發生的事

  • by my daughter's analogy. She's six.

    用我女兒的比喻來描述再合適不過, 她6歲

  • She calls it mixing colors to paint voices.

    她說這是混合顏色來畫聲音

  • It's beautiful. It's exactly that.

    很漂亮, 就是這樣

  • Samantha's voice is like a concentrated sample

    珊曼莎的聲音就像是紅色食用色素

  • of red food dye which we can infuse

    的濃縮樣品

  • into the recordings of her surrogate

    我們可以將它注入到她代替者的錄音裡

  • to get a pink voice just like this.

    然後取得一個像這樣的粉色聲音

  • (Video) Samantha: Aaaaaah.

    (錄影)珊曼莎:啊.....

  • RP: So now, Samantha can say this.

    現在, 珊曼莎可以說這個

  • (Video) Samantha: This voice is only for me.

    (錄影)珊曼莎: 這個聲音是我的專屬

  • I can't wait to use my new voice with my friends.

    我等不及與我朋友們分享我的聲音

  • RP: Thank you. (Applause)

    謝謝

  • I'll never forget the gentle smile

    我永遠都不會忘記

  • that spread across her face

    當她第一次聽到自己的聲音時

  • when she heard that voice for the first time.

    佈滿在她臉上那輕柔的微笑

  • Now there's millions of people

    目前世界上

  • around the world like Samantha, millions,

    有好幾百萬像珊曼莎的人, 幾百萬,

  • and we've only begun to scratch the surface.

    而我們的工作才剛剛開始

  • What we've done so far is we have

    我們目前只有

  • a few surrogate talkers from around the U.S.

    幾個來自美國的語言代替者

  • who have donated their voices,

    捐贈了他們的聲音,

  • and we have been using those

    我們使用了他們的捐贈

  • to build our first few personalized voices.

    來建造我們第一批個性化的聲音

  • But there's so much more work to be done.

    但還有更多的工作要完成

  • For Samantha, her surrogate

    對珊曼莎而言, 她的代替者

  • came from somewhere in the Midwest, a stranger

    是來自美國中西部, 一個陌生人

  • who gave her the gift of voice.

    送給了她一個聲音禮物

  • And as a scientist, I'm so excited

    作為一個科學家, 我很開心

  • to take this work out of the laboratory

    能將這個研究從實驗室

  • and finally into the real world

    帶到現實的世界

  • so it can have real-world impact.

    讓它產生一個實際的影響

  • What I want to share with you next

    我接下來想跟大家分享

  • is how I envision taking this work

    我如何想像讓這項研究

  • to that next level.

    進入下一個階段

  • I imagine a whole world of surrogate donors

    我想像著一個充滿了聲音捐贈者的世界

  • from all walks of life, different sizes, different ages,

    他們來自各行各業, 有著不同的體型和年齡,

  • coming together in this voice drive

    一起聚集到這個聲音活動

  • to give people voices

    給其他人提供的聲音

  • that are as colorful as their personalities.

    就像他們個性一樣多姿多采

  • To do that as a first step,

    我們的第一個步驟,

  • we've put together this website, VocaliD.org,

    是建立這個網站, VocaliD.org,

  • as a way to bring together those

    通過這個網站將

  • who want to join us as voice donors,

    那些願意捐贈聲音的,

  • as expertise donors,

    願意提供意見的,

  • in whatever way to make this vision a reality.

    還有想提供其它幫助的人聚集到一起

  • They say that giving blood can save lives.

    有人說捐血可以救人

  • Well, giving your voice can change lives.

    那麼捐聲音就可以改變他人的生活

  • All we need is a few hours of speech

    從我們的代替說話者那裡

  • from our surrogate talker,

    我們只需要幾個小時的語音,

  • and as little as a vowel from our target talker,

    然後再從我們的目標說話者那裡取得幾個母音,

  • to create a unique vocal identity.

    就可以建立出一個獨特的聲音身份

  • So that's the science behind what we're doing.

    這就是我們研究背後的科學

  • I want to end by circling back to the human side

    結尾我想再次強調人為因素

  • that is really the inspiration for this work.

    因為它才是這項研究的啟發

  • About five years ago, we built our very first voice

    大約在5年前, 我們為一個名為威廉的小男孩

  • for a little boy named William.

    製造了第一個聲音

  • When his mom first heard this voice,

    當他的媽媽第一次聽到兒子的聲音時,

  • she said, "This is what William

    她說, "如果威廉可以說話,

  • would have sounded like

    那他的聲音

  • had he been able to speak."

    一定和這個一模一樣."

  • And then I saw William typing a message

    我們然後看到威廉在他的設備上

  • on his device.

    打一條訊息

  • I wondered, what was he thinking?

    我猜想他在想什麼?

  • Imagine carrying around someone else's voice

    試想一下借用了他人的聲音