  • Hello, my name is Christian Rudder,

    大家好,我的名字叫 Christian Rudder

  • and I was one of the founders of OK Cupid.

    我是 OK Cupid 的創辦者之一

  • It's now one of the biggest dating sites in the United States.

    現在它是美國最大的交友網站之一

  • Like almost everyone at the site,

    跟這網站的其它負責人一樣

  • I was a math major, and, as you might expect,

    我主修數學,而就如你所預期的

  • we're known for the analytic approach

    我們較為人知的是

  • we have taken to love.

    用分析方式研究戀愛行為

  • We call it our matching algorithm.

    我們把它叫做「速配演算法」

  • Basically OK Cupid's matching algorithm helps us decide whether two people should go on a date.

    基本上,OK Cupid 的速配演算法幫助我們決定某兩個人該不該去約會

  • We built our entire business around it.

    這是我們事業的技術核心

  • Now, algorithm is a fancy word,

    演算法聽起來很花俏

  • and people like to drop it like it's this big thing,

    而人們放棄搞懂因為它太複雜了

  • but, really, an algorithm is just a systematic,

    但說真的,演算法只是一個有系統的

  • step-by-step way to solve a problem.

    一步一步解決問題的方法

  • It doesn't have to be fancy at all.

    完全不必要是花俏的

  • Here, in this lesson, I'm going to explain

    這個課程裡,我將會解釋

  • how we arrived at our particular algorithm

    我們是怎麼設計我們的演算法

  • so you can see how it's done.

    來讓你理解它是如何運作的

  • Now, why are algorithms even important?

    為什麼演算法如此重要

  • Why does this lesson even exist?

    又為什麼要有這個課程

  • Well, notice one very significant phrase I used above:

    這個,請注意我剛剛用的那個非常具暗示性的詞彙:

  • they are a "step-by-step" way to solve a problem,

    演算法是「一步一步」解決問題的方法

  • and, as you probably know,

    而就像你可能知道的

  • computers excel at step-by-step processes.

    電腦很擅長做這種一步一步規劃好的程序

  • A computer without an algorithm

    一臺沒有演算法的電腦

  • is basically an expensive paperweight.

    基本上只是一個很貴的紙鎮而已

  • And since computers are such a pervasive part of everyday life, algorithms are everywhere.

    由於電腦在日常生活中已經非常普及,所以演算法也是無所不在

  • The math behind OK Cupid's matching algorithm

    而 OK Cupid 演算法背後的數學

  • is surprisingly simple.

    其實異常地簡單

  • It's just some addition,

    只是一些加法

  • multiplication,

    乘法

  • a little bit of square roots.

    還有一些些開根號

  • The tricky part in designing it, though,

    然而要設計它比較麻煩的部份

  • was figuring out how to take something mysterious,

    反而是要想辦法把一些神秘的東西

  • human attraction,

    例如人類的吸引力

  • and break it into components that a computer can work with.

    把它變成電腦可以運算的東西

  • Well, the first thing we needed to match people up was data,

    那麼,要將人們配對,我們首先需要的是數據

  • something for the algorithm to work with.

    也就是要讓演算法能夠計算的東西

  • The best way to get data quickly from people

    要快速取得人們資料的最好方法

  • is to just ask for it.

    就是直接問他

  • So, we decided that OK Cupid should ask users questions,

    所以,我們決定 OK Cupid 應該要問使用者一些問題

  • stuff like, "Do you want to have kids one day?"

    像是:「你未來希望有小孩嗎?」

  • and "How often do you brush your teeth?",

    還有「你多常刷牙?」

  • "Do you like scary movies?"

    「你喜歡恐怖片嗎?」

  • and big stuff like "Do you believe in God?"

    以及較大的問題,像是「你相信神嗎?」

  • Now, a lot of the questions are good

    而很多問題都有助於

  • for matching like with like,

    將有相同喜好的人配在一起

  • that is when both people answer the same way.

    這是當雙方都回答了相同答案的情況

  • For example, two people who are both into scary movies are probably a better match than one person who is and one person who isn't.

    舉例來說,兩個都喜歡恐怖片的人也許就是不錯的配對,這比起將喜歡和不喜歡的人配在一起好

  • But what about a question like,

    但如果是像這樣的問題:

  • "Do you like to be the center of attention?"

    「你喜歡成為眾人的焦點嗎?」

  • If both people in a relationship are saying yes to this,

    如果一對情侶的兩個人都說「喜歡」

  • then they are going to have massive problems.

    那麼他們就有大問題了

  • We realized this early on,

    我們很早就知道這點

  • and so we decided we needed

    所以我們決定

  • a bit more data from each question.

    每個問題都需要再多一點資訊

  • We had to ask people to specify not only their own answer,

    我們要求使用者不只回答問題本身

  • but the answer they wanted from someone else.

    同時也回答他們對別人的期望

  • That worked really well,

    這效果真的很好

  • but we needed one more dimension.

    但我們還須要另一個衡量的維度

  • Some questions tell you more about a person than others.

    有一些問題比起其它問題,更能提供一個人的個性

  • For example, a question about politics, something like,

    比如說政治議題如:

  • "Which is worse: book burning or flag burning?"

    「哪一個比較糟:燒書或是燒國旗?」

  • might reveal more about someone than their taste in movies.

    比起對電影的品味,這可能透露更多這個人的個性

  • And it doesn't make sense to weigh all things equally,

    而每個人看事情的輕重大小都不同

  • so we added one final data point.

    所以我們加入了最後一個資料點

  • For everything that OK Cupid asks you,

    每一個 OK Cupid 問你的問題

  • you have a chance to tell us

    你都可以告訴我們

  • the role it plays in your life,

    它在你生活中扮演的角色

  • and this ranges from irrelevant to mandatory.

    而選項是從「不相關」到「極重要」

  • So now, for every question,

    所以現在,每一個問題

  • we have three things for our algorithm:

    我們都有三筆資訊可以給我們的演算法:

  • first, your answer;

    第一,你的答案

  • second, how you want someone else,

    第二,你對別人答案的期望

  • your potential match,

    也就是對於可能會跟你配對的人

  • to answer;

    的答案的期望

  • and three, how important the question is to you at all.

    第三,這問題究竟對你有多重要
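The three inputs per question could be held in a small record like the one below. This is only an illustrative sketch: the `Response` class, its field names, and the sample values are assumptions made for this example, not OK Cupid's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    """One person's input for one question (hypothetical structure)."""
    answer: str      # first: your own answer
    wanted: str      # second: the answer you want from a potential match
    importance: str  # third: how important the question is to you


# Made-up sample: someone who likes the spotlight and wants a partner who does not.
sample = Response(answer="yes", wanted="no", importance="somewhat important")
print(sample)
```

Keeping `answer` and `wanted` separate is what lets a question like "Do you like to be the center of attention?" match opposites rather than like with like.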

  • With all this information,

    有了全部的這些資訊

  • OK Cupid can figure out how well two people will get along.

    OK Cupid 就可以算出這兩個人相處會多融洽

  • The algorithm crunches the numbers and gives us a result.

    演算法會在數值運算後,給我們一個答案

  • As a practical example,

    舉一個實際的例子

  • let's look at how we'd match you with another person,

    讓我們來看看你和另一個人有多速配

  • let's call him, "B".

    姑且叫他 B

  • Your match percentage with B is based on

    你和 B 的速配指數是基於

  • questions you've both answered.

    你們雙方都回答過的問題

  • Let's call that set of common questions, "s".

    這些問題的集合叫做 s

  • As a very simple example, we use a small set "s"

    舉一個非常簡單的例子, 我們用很小的集合 s

  • with just two questions in common

    只包含兩個雙方都回答過的問題

  • and compute a match from that.

    然後由它算出速配程度

  • Here are our two example questions.

    舉例來說,他們回答了這兩個問題

  • The first one, let's say, is, "How messy are you?"

    第一個,比如說:「你有多不愛乾淨?」

  • and the answer possibilities are

    而可能的答案是

  • very messy,

    「很髒亂」

  • average,

    「普通」

  • and very organized.

    及「很愛乾淨」

  • And let's say you answered "very organized,"

    假設你的答案是「很愛乾淨」

  • and you'd like someone else to answer "very organized,"

    而你期望別人也回答「很愛乾淨」

  • and the question is very important to you.

    並且這問題對你來說「非常重要」

  • Basically you are a neat freak.

    基本上你有潔癖

  • You're neat,

    你愛乾淨

  • you want someone else to be neat,

    你也希望別人愛乾淨

  • and that's it.

    這是你的結果

  • And let's say B is a little bit different.

    又假設 B 的回答有點不一樣

  • He answered very organized for himself,

    他回答自己「很愛乾淨」

  • but average is OK with him

    但別人回答「普通」

  • as an answer from someone else,

    對他來說就可以了

  • and the question is only a little important to him.

    而且這問題對他只有「些許重要」

  • Let's look at the second question,

    接著我們來看第二個問題

  • it's the one from our previous example:

    是我們先前說過的例子:

  • "Do you like to be the center of attention?"

    「你喜歡成為眾人的焦點嗎?」

  • The answers are just yes and no.

    而答案只有「是」或「否」

  • Now you've answered "no,"

    假設你的答案是「否」

  • how you want someone else to answer is "no,"

    而你希望對方回答「否」

  • and the question is only a little important to you.

    並且這問題對你只有「些許重要」

  • Now B, he's answered "yes,"

    換 B,他回答「是」

  • he wants someone else to answer "no,"

    而他希望對方回答「否」

  • because he wants the spotlight on him,

    因為他希望焦點是在他身上

  • and the question is somewhat important to him.

    而這問題對他「蠻重要的」

  • So, let's try to compute all of this.

    好,讓我們試著來算看看

  • Our first step is,

    我們的第一個步驟是

  • since we use computers to do this,

    因為要用電腦計算

  • we need to assign numerical values

    我們必須對不同答案如「蠻重要的」和「非常重要」

  • to ideas like "somewhat important" and "very important"

    賦予相對應的數字

  • because computers need everything in numbers.

    因為電腦必須透過數字才能運算

  • We at OK Cupid decided on the following scale:

    在 OK Cupid 裡我們訂定了這樣的量表:

  • irrelevant is worth 0,

    「不相關」是 0

  • a little important is worth 1,

    「些許重要」是 1

  • somewhat important is worth 10,

    「蠻重要的」是 10

  • very important is 50,

    「非常重要」是 50

  • and absolutely mandatory is 250.

    而「極重要」是 250
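The scale above is easy to write down as a lookup table. The point values are the ones just listed; the name `IMPORTANCE_POINTS` and the helper `points_for` are assumptions made for this sketch.

```python
# Importance labels mapped to the point values quoted in the talk.
IMPORTANCE_POINTS = {
    "irrelevant": 0,
    "a little important": 1,
    "somewhat important": 10,
    "very important": 50,
    "mandatory": 250,
}


def points_for(label: str) -> int:
    """Return the points for an importance label (hypothetical helper)."""
    return IMPORTANCE_POINTS[label]


# The scale is far from linear: one mandatory question
# carries as much weight as five very important ones.
print(points_for("mandatory") // points_for("very important"))  # 5
```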

  • Next, the algorithm makes two simple calculations.

    接著,演算法會進行兩個簡單的運算

  • The first is how much did B's answers satisfy you,

    第一是 B 的答案有多符合你的期望

  • that is, how many possible points did B score on your scale?

    也就是,B 在你的量表上會得到幾分?

  • Well, you indicated that B's answer

    嗯,你在第一個愛乾淨的問題中

  • to the first question about messiness

    表示 B 的答案

  • was very important to you.

    對你非常重要

  • It's worth 50 points and B got that right.

    它佔 50 分而 B 正好符合

  • The second question is worth only 1

    而第二個問題只佔 1 分

  • because you said it was only a little important,

    因為你說它只有些許重要

  • and B got that wrong.

    而 B 答得不對

  • So B's answers were 50 out of 51 possible points.

    所以 B 的答案在總分 51 分裡得到 50 分

  • That's 98% satisfactory.

    這樣是 98% 的滿意度

  • It's pretty good.

    相當不錯
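This first calculation can be sketched in a few lines. The function name `satisfaction`, the dictionary layout, and the question keys are assumptions made for the example. The zero-points guard is also an assumption, since the talk does not cover that case.

```python
def satisfaction(wanted, weights, their_answers):
    """Share of the possible points the other person earns on your scale.

    wanted:        question -> the answer you want from a match
    weights:       question -> importance points you assigned
    their_answers: question -> the other person's own answer
    """
    possible = sum(weights.values())
    if possible == 0:
        # Assumption: if nothing matters to you, report no signal.
        return 0.0
    earned = sum(
        points
        for question, points in weights.items()
        if their_answers[question] == wanted[question]
    )
    return earned / possible


# Your side of the example: what you want and how much it matters.
you_want = {"messy": "very organized", "spotlight": "no"}
you_weights = {"messy": 50, "spotlight": 1}  # very important, a little important
b_answers = {"messy": "very organized", "spotlight": "yes"}

score = satisfaction(you_want, you_weights, b_answers)
print(f"{score:.0%}")  # 98%
```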

  • And the second question the algorithm looks at

    而演算法第二步要做的是

  • is how much did you satisfy B.

    你有多符合 B 的要求

  • Well, B placed 1 point on your answer

    嗯,B 認為你對整潔問題

  • to the messiness question

    的答案佔 1 分

  • and 10 on your answer to the second.

    而第二個問題的答案佔 10 分

  • Of those, 11, that's 1 plus 10,

    總共是 11 分,也就是 1 + 10

  • you earned 10,

    你得到 10 分

  • because you satisfied B on the second question.

    你們雙方在第二個問題符合兩方的條件

  • So your answers were 10 out of 11

    所以你的答案是 11 分裡得 10 分

  • equals 91% satisfactory to B.

    也就是對於 B 來說 91% 的滿意度

  • That's not bad.

    也是不錯
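Written out as a worked equation, the same arithmetic looks like this. You miss the 1 point for messiness because B wanted "average", and you earn the 10 points for the second question. The label on the left-hand side is just a name chosen for this sketch, and `\text` assumes the amsmath package.

```latex
\[
\text{satisfaction of B}
  = \frac{\text{points you earned}}{\text{points possible}}
  = \frac{0 + 10}{1 + 10}
  = \frac{10}{11}
  \approx 0.909
  \approx 91\%
\]
```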

  • The final step is to take these two match percentages

    而最後一步, 是把這兩個數字

  • and get one number for the both of you.

    變成你們兩個速配指數

  • To do this, the algorithm multiplies your scores,

    要完成這件事, 演算法會把你們的分數乘起來

  • then takes the nth root,

    然後開 n 次方根

  • where n is the number of questions.

    這裡 n 是問題的數目

  • Because the size of s, which is the number of questions,

    因為在我們例子的 s 裡

  • in this sample, is only 2,

    問題數只有 2

  • we have match percentage equals

    我們就算出速配指數

  • the square root of 98% times 91%.

    是 98% 乘 91% 的開根號

  • That equals 94%.

    也就是 94%

  • That 94% is your match percentage with B.

    這 94% 就是你和 B 的速配指數
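The final step for this example can be checked with a short sketch. It implements the geometric mean of the two satisfaction scores, which is the square root of their product. That matches the talk's description here because the example has n = 2 questions. The function name `match_percentage` is an assumption, and the margin-of-error correction mentioned later in the talk is not included.

```python
import math


def match_percentage(sat_a: float, sat_b: float) -> float:
    """Geometric mean of two satisfaction scores, each between 0 and 1."""
    return math.sqrt(sat_a * sat_b)


b_satisfies_you = 50 / 51  # about 98%
you_satisfy_b = 10 / 11    # about 91%

match = match_percentage(b_satisfies_you, you_satisfy_b)
print(f"{match:.0%}")  # 94%
```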

  • It's a mathematical expression

    這是我們基於對於你們的了解

  • of how happy you'd be with each other

    透過數學算式

  • based on what we know.

    來表現出你們在一起會多快樂的方式

  • Now, why does the algorithm multiply as opposed to, say,

    而,為什麼演算法要用相乘的

  • average the two match scores together

    而不用相加的

  • and do the square-root business?

    並且要取平方根呢?

  • In general, this formula is called the geometric mean,

    一般來說,這個公式叫作幾何平均數

  • which is a great way to combine values

    它是將範圍很廣

  • that have wide ranges

    並表達不同特性的數據合在一起的

  • and represent very different properties.

    一種很棒的方法

  • In other words, it's perfect for romantic matching.

    也就是說,它對浪漫的配對來說是很完美的

  • You've got wide ranges

    你會有很廣的數據

  • and you've got tons of different data points,

    你也許多不一樣的資訊

  • like I said, about movies,

    比如說,關於電影

  • about politics,

    關於政治

  • about religion,

    關於信仰

  • about everything.

    關於所有事

  • Intuitively, too, this makes sense.

    直覺來說,這也合理

  • Two people satisfying each other 50%

    兩個人互相有 50% 的滿意度

  • should be a better match

    應該會比

  • than two others who satisfy 0 and 100,

    一人是 0% 另一人是 100% 來得好

  • because affection needs to be mutual.

    因為感情是互相的
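A quick comparison shows why multiplying behaves differently from averaging in this case. The two pairs below are the ones just described. The function names are chosen for this sketch.

```python
import math


def arithmetic_mean(a: float, b: float) -> float:
    return (a + b) / 2


def geometric_mean(a: float, b: float) -> float:
    return math.sqrt(a * b)


pairs = {
    "mutual (50% and 50%)": (0.5, 0.5),
    "one-sided (0% and 100%)": (0.0, 1.0),
}

for label, (a, b) in pairs.items():
    print(label, arithmetic_mean(a, b), geometric_mean(a, b))
# Averaging scores both pairs at 0.5.
# The geometric mean gives 0.5 to the mutual pair and 0.0 to the one-sided pair.
```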

  • After adding a little correction for margin of error,

    再加上一些邊界錯誤的修正

  • in the case when we have a very small number of questions,

    就是說當問題數很少的時候的修正

  • like we do in this example,

    像是我們這個例子

  • we're good to go.

    我們就完成了

  • Any time OK Cupid matches two people,

    每一次 OK Cupid 在幫兩人配對時

  • it goes through the steps we just outlined.

    都經過了我們所講的那些步驟

  • First it collects data about your answers,

    首先從你的答案收集資訊

  • then it compares your choices and preferences

    然後用簡潔的數學方法

  • to other people in simple, mathematical ways.

    來將你和其它人的偏好作比較

  • This, the ability to take real world phenomena and make them something a microchip can understand, is, I think, the most important skill anyone can have these days.

    這樣把真實世界的現象變成微晶片能運作的一種能力,我認為是我們現今可以擁有最重要的技能

  • Like you use sentences to tell a story to a person,

    就像是你用句子來向別人說故事一樣

  • you use algorithms to tell a story to a computer.

    你會用演算法來對電腦說故事

  • If you learn the language,

    如果你學會這種語言

  • you can go out and tell your stories.

    你就可以把你的故事告訴別人

  • I hope this will help you do that.

    這就是我希望幫助你達成的事情
