Automatically generated by AI
  • It's not just you.

    不只是你。

  • Captcha, the online test to tell whether you're a human or a robot, has been getting harder.

    驗證碼是一種用來判斷你是人類還是機器人的在線測試,它的難度越來越大。

  • This is one of the guys who invented it.

    他就是發明者之一。

  • Oh yeah, I fail them all the time.

    哦,是的,我經常失敗。

  • I never know how much to say there's a traffic light if it's only like a tiny little corner of it.

    我從來都不知道該說有多少紅綠燈,如果它只是像一個很小的角落。

  • The problem is that Captcha was designed to keep malicious bots out of certain websites.

    問題在於,驗證碼的設計是為了防止惡意機器人進入某些網站。

  • But every time you've solved a test, you've actually made those bots smarter.

    但每次你解決了一個測試,實際上都讓這些機器人變得更聰明瞭。

  • As more and more data was fed into these perceptual systems, they simply got a lot better at solving the perceptual tasks.

    隨著越來越多的數據被輸入到這些感知系統中,它們在解決感知任務方面的能力也變得越來越強。

  • So what does it take to design a puzzle that can outsmart a bot but still be solved by any human?

    那麼,怎樣才能設計出既能勝過機器人,又能被人類解開的謎題呢?

  • This is the tech behind Captcha.

    這就是驗證碼背後的技術。

  • So Captcha was first used around the same time that Yahoo began giving out free email addresses.

    是以,在雅虎開始提供免費電子郵箱的同時,驗證碼也被首次使用。

  • This was the year 2000.

    那是 2000 年。

  • There were people who were writing programs to abuse different web services.

    有人在編寫濫用不同網絡服務的程序。

  • And there was no easy way to stop them.

    要阻止他們並不容易。

  • And the idea was there should be a test that is really quick that humans can pass but computers could not.

    我們的想法是,應該有一種非常快速的測試,人類可以通過,但計算機卻無法通過。

  • Luis's test looked like this.

    路易斯的測試是這樣的

  • A string of letters, slightly warped and distorted, that the end user had to input into a text field.

    最終用戶必須在文本資料欄中輸入一串略微扭曲變形的字母。

  • He called it Captcha, which stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart.

    他稱之為 Captcha,即 "區分計算機和人類的完全自動化公共圖靈測試"。

  • And he took advantage of the fact that computers at the time weren't very good at something called OCR, optical character recognition.

    他利用了當時電腦還不擅長 OCR(光學字符識別)這一事實。

  • The way it works is a simple game of pairs.

    它的運作方式是一個簡單的配對遊戲。

  • The program scans the image and compares any shapes that emerge, like this letter P, against a database of different letters in different fonts.

    程序會掃描影像,並將出現的任何形狀(如字母 P)與不同字體的不同字母數據庫進行比較。

  • The second it finds a match, it has identified the letter.

    一旦找到匹配,它就識別出了字母。

  • But if that P was warped, overlapping another letter, or had marks through it, then the program will struggle to find anything in its database that matches and won't be able to identify the letter.

    但是,如果這個 P 變形了,與另一個字母重疊,或者有標記穿過,那麼程序就很難在數據庫中找到匹配的內容,也就無法識別這個字母。

  • Simple, but enough to keep out the malicious bots.

    雖然簡單,但足以將惡意機器人拒之門外。
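The "game of pairs" described above can be sketched as naive template matching. This is a minimal illustration, not the OCR software of the time: the 5x5 bitmaps, the `TEMPLATES` table, the `similarity` and `identify` helpers, and the 0.9 threshold are all assumptions made up for the example.

```python
# Hypothetical 5x5 glyph bitmaps standing in for a database of letters.
TEMPLATES = {
    "P": ["####.",
          "#...#",
          "####.",
          "#....",
          "#...."],
    "L": ["#....",
          "#....",
          "#....",
          "#....",
          "#####"],
}


def similarity(a, b):
    """Fraction of pixels on which two equally sized bitmaps agree."""
    cells = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(x == y for x, y in cells) / len(cells)


def identify(glyph, threshold=0.9):
    """Return the best-matching letter, or None if nothing is close enough."""
    best_letter, best_score = None, 0.0
    for letter, template in TEMPLATES.items():
        score = similarity(glyph, template)
        if score > best_score:
            best_letter, best_score = letter, score
    return best_letter if best_score >= threshold else None


clean_p = TEMPLATES["P"]
# A sheared "P" with a stray mark: still readable to a person,
# but it no longer lines up with any stored template.
warped_p = [".####",
            "#..#.",
            ".###.",
            "..#..",
            "#...."]

print(identify(clean_p))
print(identify(warped_p))
```

The clean glyph matches its template exactly, while the warped one falls below the threshold for every stored letter, which is the weakness the distorted text exploited.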

  • The technology behind them was relatively primitive when it came to having access to what a letter looked like.

    在掌握字母的外觀方面,它們背後的技術相對原始。

  • But that advantage wouldn't last long.

    但這種優勢不會持續太久。

  • Here's Luis from 2010.

    這是路易斯 2010 年的照片。

  • In 2007, ReCaptcha was launched, an updated version that also helped to scan books.

    2007 年,ReCaptcha 推出了更新版,也有助於掃描書籍。

  • At the time, groups like the New York Times and the Internet Archive were in the process of trying to digitize old literature.

    當時,《紐約時報》和互聯網檔案館等組織正在嘗試將舊文獻數字化。

  • Since optical character recognition wasn't very accurate at the time, the digitization had errors.

    由於當時光學字符識別還不是很準確,所以數字化過程中會出現錯誤。

  • So, to try and help resolve those errors, ReCaptcha began showing two words from a scanned document.

    是以,為了幫助解決這些錯誤,ReCaptcha 開始顯示掃描文檔中的兩個單詞。

  • One was a control word, a word the computer knew, but the other was a word it couldn't quite identify.

    其中一個是控制詞,計算機知道這個詞,但另一個詞它卻無法識別。

  • In answering the Captcha, half of your answer was being used to pass the test, but the other half was used to tell the computer what word it was looking at, helping to improve OCR.

    在回答 "驗證碼 "時,您的一半答案是用來通過測試的,而另一半答案則是用來告訴計算機它正在查看的單詞,從而幫助改進 OCR。

  • Not only are you authenticating yourself as a human, but in addition you're helping to digitize books and newspapers.

    你不僅是在證明自己的人類身份,而且還在幫助將書籍和報紙數字化。
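The two-word scheme described above can be sketched roughly as follows. This is an assumed simplification, not ReCaptcha's actual implementation: the function names, the case-insensitive comparison, and the `min_votes` agreement threshold are hypothetical.

```python
from collections import Counter


def check_answer(control_word, typed_control, typed_unknown, votes):
    """Pass the user only if the control word is right. If it is,
    count their reading of the unknown word as one vote."""
    passed = typed_control.strip().lower() == control_word.lower()
    if passed:
        votes[typed_unknown.strip().lower()] += 1
    return passed


def consensus(votes, min_votes=3):
    """Return the most common reading once enough users agree, else None."""
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


votes = Counter()
check_answer("morning", "morning", "upon", votes)   # passes, vote counted
check_answer("morning", "mourning", "open", votes)  # fails, vote ignored
check_answer("morning", "Morning", "upon", votes)   # passes, vote counted
check_answer("morning", "morning ", "Upon", votes)  # passes, vote counted
print(consensus(votes))
```

Only answers from users who got the known word right are trusted for the unknown word, which is how half of each answer could feed the digitization effort.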

  • The technology was so efficient that Google acquired it in 2009.

    這項技術非常高效,谷歌在 2009 年收購了它。

  • But a new problem emerged.

    但新的問題出現了。

  • Because Captchas relied on machines being bad at reading.

    因為 Captchas 依靠的是機器不擅長閱讀。

  • And we just taught them to be very good at it.

    我們只是教他們如何做得很好。

  • Eventually, the bots were able to get Captchas right more frequently than humans.

    最終,機器人能夠比人類更頻繁地完成 Captchas。

  • The test needed to evolve.

    測試需要發展。

  • So in 2012, Google deployed this image-based Captcha.

    是以,2012 年,谷歌部署了這種基於影像的驗證碼。

  • There was a switch to go from distorted characters to the harder problem of distinguishing certain things in images, where you have to pick all the ones that have a traffic light or a bicycle or whatever.

    從扭曲的字元轉換到更難的在影像中分辨特定事物的問題,你必須挑選出所有有紅綠燈或自行車或其他東西的影像。

  • Interestingly, when it comes to boxes like this, where it's not exactly clear if this counts as a traffic light or not, there isn't actually a right or wrong answer.

    有趣的是,當涉及到這樣的方框時,我們並不清楚這算不算紅綠燈,實際上並沒有一個對或錯的答案。

  • The way to get it right is just what the majority of the human population says.

    正確的方法就是大多數人所說的那樣。

  • But for machines, this was a huge new challenge.

    但對機器來說,這是一個巨大的新挑戰。

  • After all, they'd only just learned how to read.

    畢竟,他們才剛剛學會如何閱讀。

  • See, with text-based Captchas, the machines only had to identify a limited range of variables (letters and numbers) from a black and white background.

    要知道,使用基於文本的 Captchas 時,機器只需從黑白背景中識別出有限的變量(字母和數字)。

  • But now with image Captchas, they had to be able to identify anything and spot an object in a very busy background.

    但現在有了影像 Captchas,他們必須能夠識別任何東西,並在非常繁忙的背景中發現一個物體。

  • However, computer vision was just around the corner.

    然而,計算機視覺技術即將登場。

  • Whereas you and I would process the stream of perceptual input, the computer vision system is basically taking the pixels and processes strings of pixels (vectorised, we would say), and as it processes these images, it's picking up patterns in the images, it's picking up patterns in the pixels.

    你和我處理的是感知輸入流,而計算機視覺系統處理的基本上是像素和像素串--我們可以說是矢量化--在處理這些影像時,它在影像中捕捉模式,在像素中捕捉模式。

  • Now for any computer vision system to perform well, it needs a lot of labelled data.

    現在,任何計算機視覺系統要想表現出色,都需要大量的標記數據。

  • It's being trained to identify cars by having many, many, many images of cars.

    它正在接受訓練,通過許多許多的汽車影像來識別汽車。

  • The problem was that image-based Captcha was essentially a data labelling task.

    問題在於,基於影像的驗證碼本質上是一項數據標註任務。

  • By solving it, you were once again generating data that could help a bot defeat it.

    通過解決這個問題,你再次生成了可以幫助機器人打敗它的數據。

  • The systems just kept getting better and better because they kept getting bigger and bigger.

    這些系統越來越好,因為它們越來越大。

  • A new approach was needed.

    我們需要一種新的方法。

  • So in 2014, Google launched the NoCaptcha.

    是以,2014 年,谷歌推出了 NoCaptcha。

  • One simple box.

    一個簡單的盒子

  • But this version of Captcha wasn't looking at whether you clicked the box.

    但這個版本的驗證碼並不看你是否點擊了方框。

  • It was looking at how you clicked it.

    它在看你如何點擊它。

  • And how you interacted with the rest of the internet.

    以及你如何與互聯網的其他部分進行互動。

  • See, if you write code to make an object move to a certain point, like a cursor, the simplest version will make it move in a straight line at a constant speed, like a robot.

    你看,如果你編寫代碼讓一個物體移動到某一點,比如遊標,最簡單的版本就是讓它像機器人一樣以恆定的速度直線移動。

  • But humans naturally aren't that accurate.

    但人類自然不會那麼準確。

  • We overshoot.

    我們會移過頭。

  • We don't move in a perfectly straight line.

    我們並不是直線前進的。

  • And that is what this version of Captcha was looking for.

    而這正是這個版本的驗證碼所追求的。

  • Human flaws.

    人性的缺陷
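The contrast between a scripted cursor and a human one can be illustrated with two simple measurements: how straight the path is and how steady the speed is. This is a toy heuristic under stated assumptions, not how NoCaptcha actually scores users; the `(x, y, t)` sample format, the function names, and both tolerances are invented for the example.

```python
import math


def path_features(points):
    """points: list of (x, y, t) samples with strictly increasing t.
    Returns (straightness, speed_variation)."""
    pairs = list(zip(points, points[1:]))
    steps = [math.dist(p[:2], q[:2]) for p, q in pairs]
    travelled = sum(steps)
    direct = math.dist(points[0][:2], points[-1][:2])
    # 1.0 means a perfectly straight path; lower means detours or overshoot.
    straightness = direct / travelled if travelled else 1.0
    speeds = [d / (q[2] - p[2]) for d, (p, q) in zip(steps, pairs)]
    mean = sum(speeds) / len(speeds)
    spread = math.sqrt(sum((s - mean) ** 2 for s in speeds) / len(speeds))
    # Standard deviation relative to the mean; 0.0 means constant speed.
    variation = spread / mean if mean else 0.0
    return straightness, variation


def looks_scripted(points, straight_tol=0.999, speed_tol=0.01):
    """Flag paths that are almost perfectly straight at constant speed."""
    straightness, variation = path_features(points)
    return straightness >= straight_tol and variation <= speed_tol


# Simplest possible script: straight line, constant speed.
robot = [(i * 10.0, i * 5.0, i * 0.01) for i in range(11)]
# Hand-made human-like path: uneven speed, overshoots, then corrects.
human = [(0, 0, 0.0), (30, 8, 0.05), (70, 30, 0.09),
         (108, 52, 0.15), (103, 49, 0.22), (100, 50, 0.30)]

print(looks_scripted(robot), looks_scripted(human))
```

A bot can of course add artificial wobble to its path, which is the point the transcript makes later about algorithms learning to mimic human flaws.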

  • It was also monitoring things like your internet history and your typing speed.

    它還監控你的上網記錄和打字速度等。

  • If the history was just a string of repeated attacks on the same website, you're probably a bot.

    如果歷史記錄只是一連串對同一網站的重複攻擊,那麼你很可能是一個機器人。

  • But if you stopped halfway through the day to browse for shoes or look at cat videos, you might just be human.

    但如果你中途停下來逛逛鞋店或看看貓咪視頻,那你可能只是個普通人。

  • By 2018, Google had done away with the tick box entirely and launched ReCaptcha v3, based solely on that hidden data.

    到 2018 年,谷歌完全取消了勾選框,並推出了 ReCaptcha v3,完全基於這些隱藏數據。

  • But even flawed human characteristics is something that a smart algorithm can eventually learn to mimic.

    但是,即使是有缺陷的人類特徵,智能算法最終也能學會模仿。

  • So are the bots always destined to win?

    那麼,機器人總是註定要贏嗎?

  • Well, perhaps.

    也許吧。

  • See, despite other companies coming up with new and inventive tests, there's a quirk at the very heart of Captcha that means it's likely the bots will always be able to win eventually.

    你看,儘管其他公司也提出了新穎的測試方法,但驗證碼的核心有一個怪圈,這意味著機器人很可能最終總是能夠獲勝。

  • And to understand why, you need to go back to before the internet was invented.

    要了解其中的原因,你需要追溯到互聯網發明之前。

  • To the man who Captcha is named after, Alan Turing.

    追溯到驗證碼名稱的來源人物:艾倫·圖靈。

  • He's considered one of the founding fathers of AI after he penned this paper, Computing Machinery and Intelligence.

    在撰寫了這篇論文《計算機器與智能》之後,他被認為是人工智能的奠基人之一。

  • In it, he describes a method to test whether you're talking to a human or a machine.

    在這篇文章中,他描述了一種測試你是在和人類還是機器對話的方法。

  • We had a questioner right behind the screen and trying to differentiate between a human and a computer machine giving answers.

    我們讓一位提問者站在螢幕後面,試圖區分給出答案的是人還是電腦。

  • This now has famously become known as the Turing test.

    這就是著名的圖靈測試。

  • The problem with Captcha is that the questioner isn't human.

    驗證碼的問題在於提問者不是人類。

  • It's a computer, and therefore any information that's input into the computer has the potential to be used by AI models to train Captcha-defeating bots.

    它是一臺電腦,是以輸入電腦的任何資訊都有可能被人工智能模型用來訓練驗證碼破解機器人。

  • Even Luis's original paper on Captchas says that the technology will act as a security measure for websites and advance the field of AI.

    就連路易斯關於 Captchas 的原始論文也說,這項技術將作為網站的安全措施,並推動人工智能領域的發展。

  • You have the research world almost continually catching up and surpassing the Captcha world.

    研究領域幾乎一直在追趕並超越驗證碼領域。

  • But the solution may be to take the test out of the computer and into the real world.

    但解決的辦法可能是讓測試走出電腦,進入現實世界。

  • We all have mobile phones and they have so many different sensors within them.

    我們都有手機,手機裡有許多不同的傳感器。

  • Being able to tilt the phone on instruction or being able to take a few steps in one direction or another using the sensors in the phone.

    能夠根據指令傾斜手機,或利用手機中的傳感器朝一個方向或另一個方向走幾步。

  • But in order for us to truly prove that we're human online, we may have to find an entirely new approach.

    但是,為了真正證明我們在網上也是人,我們可能必須找到一種全新的方法。

  • I think the question now is not, are we doomed?

    我認為現在的問題不是:我們註定要失敗嗎?

  • It's how do we seize control again?

    問題是,我們如何才能再次掌握控制權?

  • I would not be designing a Captcha today.

    我今天就不會設計驗證碼。

  • I think that's a losing battle.

    我認為這是一場失敗的戰鬥。

  • This is too hard.

    這太難了。
