  • Every YouTube video has a unique ID.

    每一部 YouTube 的影片都有自己的代號。

  • It's up in the URL: a string of eleven characters

    就在網址裡面:一列共11位的字元,

  • that uniquely identifies which video you want.

    用來顯示出你要的影片。

  • Now, YouTube has millions and millions of videos.

    時至今日,YouTube 上有數以百萬計的影片。

  • The last stats that they released said they have

    最近有統計指出在 YouTube

  • 400 hours of video being uploaded every minute.

    每分鐘有超過400小時的影片被上傳

  • So: are they ever going to run out of those IDs?

    那麼問題來了:「代碼會有用盡的一天嗎?」

  • Well, to find out, let's talk about counting systems.

    嗯,要想知道這個答案,我們先來談談記數系統

  • People count in Base 10. 0 to 9.

    我們人類數數以 10 為底,從 0 到 9。

  • That'll be, hopefully, familiar to you.

    關於這點,希望你不會感到陌生

  • Computers count in base 2, in binary,

    電腦 (計算機) 則是以 2 為底的 2 進位,

  • but that's difficult for humans to read,

    但那對我們人類閱讀上來說有困難

  • it gets too long to write really, really quickly,

    因為要在短時間內寫下來實在是太長了

  • so often computers will display it in base 16, hexadecimal.

    所以來說一般在電腦 (計算機) 上是以 16為底的 16 進位

  • You have 0 to 9, and then A to F,

    你會從 0 算到 9 再來是 A 到 F (10 ~ 15),

  • and then you start adding to the next column.

    之後你再進到下一位。

  • Humans can't understand that easily,

    對於 16 進位人類無法一看就懂,

  • but it's efficient if we have to type it in somewhere,

    但若是要記下來的話還是可以的,

  • and 16 - 2 to the power of 4 - is also easy for computers to deal with.

    再說16,也就是 2 的 4 次方,對電腦 (計算機) 來說也是很容易處理的。
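
A minimal sketch of the point above, showing one number written in base 10, base 2, and base 16. The value 2017 is an arbitrary example and does not come from the video.

```python
# Show the same number written in base 10, base 2, and base 16.
n = 2017  # arbitrary example value (an assumption, not from the video)

decimal_text = str(n)          # base 10
binary_text = format(n, "b")   # base 2
hex_text = format(n, "X")      # base 16, upper-case digits A to F

print(decimal_text, binary_text, hex_text)  # prints: 2017 11111100001 7E1

# Both strings describe the same value. Because 16 == 2 ** 4, each hex digit
# stands for exactly four binary digits, which is why hex is so much shorter.
assert int(binary_text, 2) == int(hex_text, 16) == n
```

Eleven binary digits shrink to three hexadecimal ones, which is the "efficient to type" trade-off described above.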

  • So how about Base 64?

    那麼 64 進位呢?

  • That'd be a ridiculous counting system, right? Except.

    你想說那是個很愚蠢的記數方式,是吧?然而,

  • 64 is another one of those easy numbers for computers,

    64 對電腦 (計算機) 來說只不過是另一個簡單的數字罷了

  • it is 2 to the power of 6.

    它 (64) 是 2 的 6 次方

  • And humans can get to 64 very easily:

    人類想要湊出 64 也並非難事:

  • 0 to 9, then capital letters A to Z,

    從 0 到 9,然後大寫的 A 到 Z

  • then small letters a to z, and two other characters.

    再加上小寫的 a 到 z,最後再加兩個其他符號。

  • Most Base 64 uses slash and plus,

    大部分的時候是用 / 和 + (斜線和加號)

  • but they don't work so well in URLs,

    但在網址中他們有別的功能,

  • so YouTube uses hyphen and underscore.

    所以 YouTube 改用 - 和 _ (連字號和底線)
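
The 64-symbol alphabet described above can be sketched as follows. The symbol order here (digits, capitals, small letters, then hyphen and underscore) simply follows the order spoken in the video; it is an assumption for illustration, and standard Base64 orders its symbols differently. The helper names `to_base64_digits` and `from_base64_digits` are hypothetical.

```python
import string

# 10 digits + 26 capitals + 26 small letters + 2 URL-friendly extras = 64 symbols.
# The ordering is an assumption that follows the video's wording.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase + "-_"


def to_base64_digits(n: int) -> str:
    """Write a non-negative integer using the 64-symbol alphabet above."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, 64)   # peel off the lowest base-64 digit
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))   # most significant digit first


def from_base64_digits(text: str) -> int:
    """Inverse of to_base64_digits."""
    value = 0
    for ch in text:
        value = value * 64 + ALPHABET.index(ch)
    return value


print(to_base64_digits(63))  # prints: _
print(to_base64_digits(64))  # prints: 10
```

Just as 9 rolls over to 10 in base 10, the last symbol rolls over to "10" here, which is the "start adding to the next column" step applied to base 64.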

  • That YouTube URL, that unique ID,

    我們說的 YouTube 的網址,那串獨一無二的代碼

  • is really just a random number in base 64.

    就只是一串以 64 為底的亂碼罷了。

  • They could have picked base 10 or base 16,

    他們 (YouTube) 大可選用以 10 為底或是以 16 為底,

  • but they didn't: they went with 64,

    但他們 (YouTube) 沒有:他們選擇以 64 為底,

  • because it will let you cram a huge number into a small space

    如此一來便可將一個巨大的數字壓縮成短短的幾位數

  • and still make it vaguely human readable.

    並且讓人類在某種程度上是可閱讀的

  • Author and programmer Sam Hughes, by the way,

    順帶一提,身為作家和工程師的 Sam Hughes

  • pushed this to the limit, and invented Base 65,536,

    將此一方法發揮到了極致,創造了 65536 進位,

  • which includes basically every character from every language.

    這種進位法也囊括了幾乎所有語言的所有字符。

  • It is ridiculous and unnecessary,

    這無疑是可笑且無用的,

  • but when has that ever stopped programmers?

    但工程師們從來就是對此一行為樂此不疲。

  • So why didn't YouTube just start counting at 1 and work up?

    那麼為什麼 YouTube 不從 1 開始然後往上加就好了呢?

  • Well, first, they would have to synchronise their counting

    這麼說吧,若是如此他們首先必須要

  • between all the servers handling the video uploads,

    先將負責上傳影片的伺服器同步,或是

  • or they'd have to assign each server a block of numbers.

    他們必須要先劃分清楚每一台伺服器負責的數字區間。

  • Either way, there's a lot of tracking to do,

    但無論以上哪種方法,都要不斷的檢查、

  • a lot of making sure that it's never duplicated.

    重複確認影片的編號絕對不會重複。

  • Instead, they just generate a random number for each video,

    相反地,現在他們只要為每一部影片產出一組亂數,

  • see if it's already taken, and if not, use it.

    並確認該組亂數沒有被使用就夠了。
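
The generate-then-check approach described above might look roughly like this. It is a sketch under stated assumptions, not YouTube's actual implementation: the in-memory set `taken` stands in for whatever shared storage a real service would use, and the function name `new_video_id` is hypothetical.

```python
import secrets

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-_"
ID_LENGTH = 11


def new_video_id(taken: set[str]) -> str:
    """Pick random 11-character IDs until one is unused, then reserve it."""
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(ID_LENGTH))
        if candidate not in taken:   # "see if it's already taken"
            taken.add(candidate)     # "and if not, use it"
            return candidate


taken_ids: set[str] = set()
print(new_video_id(taken_ids))
```

With so many possible IDs, a collision is extremely unlikely, so the loop almost always finishes on the first try. A real multi-server system would also need the check and the reservation to happen as one atomic step.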

  • And secondly, it is a really, really bad idea

    第二點,在網址中使用單純的

  • to just count 1, 2, 3 and so on in URLs.

    1, 2, 3 等等的數字絕對不是一個好主意。

  • Incremental counters, as they're called, can be a big security flaw:

    使用遞增記數的話,會產生非常嚴重的安全漏洞:

  • if you see video 283 up there, then you might wonder:

    如果你看見了一個編號 283 的影片,你一定會想:

  • what's video 284? Or video 282?

    編號 284 的影片是什麼?那 282 呢?

  • It's easy to enumerate, as it's called,

    這種編排法是很容易被類推的,

  • to run through the entire list.

    換句話說:很容易被看光光。

  • YouTube Unlisted videos, the ones that don't appear publicly

    用遞增記數的話就做不到影片非公開了(YouTube 上的非公開影片,

  • but that you can send the link to people, those wouldn't work.

    雖說是非公開,但你仍然可以把網址發給你的朋友。)

  • And by the way? Lots of badly designed sites do use incremental counters.

    喔對了,有不少設計不良的網站就是用這種遞增記數的

  • And it is a terrible idea.

    真的是糟透了。

  • It might tell your competitors exactly how many customers you have,

    這就好像在告訴你的競爭對手說你有多少使用者,

  • 'cos they can just count them.

    他們只要稍微數一下就知道了。

  • It might let people download all your records easily,

    也很容易讓有心人下載你所有的記錄,

  • 'cos they can just run through them.

    只要他們把數字都試了個遍。

  • And in one site that someone in Florida emailed me about this week,

    有個在佛羅里達州的人跟我說他知道有個網站,

  • it lets you look at other people's personal details.

    甚至讓你有辦法查看其他人的個人資料。

  • Don't use incremental counters if you're building a web site. Use a random number.

    如果你要架設網站的話請記住要用『亂數』, 『不、要、用、遞、增、記、數』
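
For a site of your own, one way to follow this advice in Python is the standard library's `secrets.token_urlsafe`, which returns a random string in the URL-safe Base64 alphabet. This is a sketch only: the function name `random_public_id` and the choice of 8 random bytes are assumptions for illustration.

```python
import secrets


def random_public_id(nbytes: int = 8) -> str:
    """Return a hard-to-guess identifier that is safe to put in a URL."""
    # 8 random bytes are Base64-encoded with the '-' and '_' alphabet and
    # the padding stripped, which yields an 11-character string.
    return secrets.token_urlsafe(nbytes)


record_id = random_public_id()
print(record_id, len(record_id))
```

Note that 8 bytes give 2^64 possible values, somewhat fewer than the 64^11 strings that 11 characters could hold. A random ID prevents easy enumeration, but it does not replace proper access checks on private records.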

  • Which brings me to the question:

    回到我們的問題:

  • just how big are the numbers that YouTube uses?

    YouTube 到底用了多大的數字?

  • Well, let's work it out.

    我們來算一下吧。

  • One character of base 64 lets you have 64 ID numbers.

    64 進位的一位數可以有 64 個不同的 ID

  • Two characters? That's 64 by 64, or 4,096.

    兩位數?就是 64 x 64 = 4096

  • Three characters? 64 times 64 times 64 -- or 64 to the power of 3.

    三位?64 x 64 x64 或是 64 的三次方。

  • That is already more than a quarter of a million.

    而這個數字已經超過 25 萬了

  • And if we go to four? Well, now we're above 16 million.

    四位數呢?那就是大於 1600 萬了

  • If you use Base 64, then you can assign an ID number

    如果用 64 進位的話,就可以生產出足以

  • to everyone who lives in London down there twice over,

    超出倫敦居住人口數兩倍的影片代碼了,

  • and you'll only need four characters.

    而這,僅僅只用了四、位、數。

  • This gets big fast. We can keep on doing this,

    這數字成長的飛快。若我們繼續下去的話

  • and by seven characters we're already at over four trillion.

    當我們用了七位數的時候就會得到超過 四兆 組的代碼。
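
The growth described above is just repeated multiplication by 64, which can be checked directly:

```python
# Number of distinct IDs available for a given number of base-64 characters.
id_space = {length: 64 ** length for length in (1, 2, 3, 4, 7)}

for length, count in id_space.items():
    print(f"{length} characters: {count:,} IDs")

# prints:
# 1 characters: 64 IDs
# 2 characters: 4,096 IDs
# 3 characters: 262,144 IDs
# 4 characters: 16,777,216 IDs
# 7 characters: 4,398,046,511,104 IDs
```

Seven characters already give about 4.4 trillion IDs, since 64^7 = 2^42.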

  • Now, I assume that YouTube checks through a dictionary,

    呃...在這邊我希望 YouTube 有先檢查過,

  • and doesn't allow any actual words to appear up there --

    確保上面那串代碼不會有「單字」的出現

  • particularly anything rude.

    尤其是那些不好的字眼

  • But that is going to be a tiny minority of the URLs,

    當然那只佔了眾多網址中一小部分罷了,

  • so for our purposes, we can pretty much just ignore that.

    所以在這邊我們可以暫時先忽略那些

  • At YouTube's 11 characters, we are at 73 quintillion 786 quadrillion

    而 YouTube 用 11 位數產生的代碼可供 7378 京 (京:10^16)

  • 976 trillion 294 billion 838 million

    6976 兆 2948 億

  • 206 thousand and 464 videos.

    3820 萬 6464 部影片使用

  • That's enough for every single human on planet Earth

    而這足以讓這顆星球上的 每、一、個、人

  • to upload a video every minute for around 18,000 years.

    以每分鐘上傳一部影片的速度,持續大約18000年
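
The 18,000-year figure can be roughly reproduced with a short calculation. The world population used here (7.4 billion) is an assumption and is not stated in the video, so the result is only a ballpark check.

```python
total_ids = 64 ** 11
print(f"{total_ids:,}")  # prints: 73,786,976,294,838,206,464

world_population = 7_400_000_000       # assumption: rough estimate, not from the video
minutes_per_year = 60 * 24 * 365.25    # 525,960 minutes

# Everyone uploads one video per minute until the IDs run out.
minutes_until_exhausted = total_ids / world_population
years = minutes_until_exhausted / minutes_per_year

print(f"about {years:,.0f} years")
```

With this population estimate the answer comes out a little under 19,000 years, consistent with "around 18,000" for a slightly larger population.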

  • YouTube planned ahead.

    YouTube 在這方面領先群雄啊

  • Can they run out of URLs? Technically, yes.

    他們的網址會有用完的一天嗎? 理論上來說,會。

  • Practically? No. And if they did?

    實際上呢?並不會。 但如果真的用完的話呢?

  • They could just add one more character.

    他們只需要再進一位就好啦(11--->12)

  • [Translating these subtitles? Add your name here!]

    翻譯:Andy Lin

  • Ha! One take! One take! Yes!

    哈!一鏡到底!一鏡到底啦!YES!
