  • Six thousand miles of road,

    六千英里公路,

  • 600 miles of subway track,

    六百英里地鐵路線,

  • 400 miles of bike lanes

    四百英里腳踏車專用道,

  • and a half a mile of tram track,

    半英里的有軌電車專用道

  • if you've ever been to Roosevelt Island.

    僅在羅斯福島。

  • These are the numbers that make up the infrastructure of New York City.

    這些數字構成了紐約市的基建。

  • These are the statistics of our infrastructure.

    這些基建的統計數字,

  • They're the kind of numbers you can find released in reports by city agencies.

    都可以在市政機關公佈的報告中找到。

  • For example, the Department of Transportation will probably tell you

    譬如,交通部門可能會告訴你,

  • how many miles of road they maintain.

    他們維護著多少英里的道路。

  • The MTA will boast how many miles of subway track there are.

    MTA(紐約交通運輸管理局)會自誇 他們掌管著多少英里捷運。

  • Most city agencies give us statistics.

    多數的市政機關都在公佈統計數據。

  • This is from a report this year

    這是今年計程車與轎車委員會發佈的報告,

  • from the Taxi and Limousine Commission,

    我們從中知道紐約市運營著 大約一萬三千五百輛計程車。

  • where we learn that there's about 13,500 taxis here in New York City.

    很有趣,是嗎?

  • Pretty interesting, right?

    但你有否想過這些數據來自哪裡?

  • But did you ever think about where these numbers came from?

    既然有這些數字存在, 那肯定是因為在市政機關的某個人

  • Because for these numbers to exist, someone at the city agency

    想過:嗯......這個數字可能有人會想知道。

  • had to stop and say, hmm, here's a number that somebody might want to know.

    這個數字是市民們想知道的。

  • Here's a number that our citizens want to know.

    所以他們找回那些原始數據,

  • So they go back to their raw data,

    他們計數、相加、計算,

  • they count, they add, they calculate,

    然後把得出的結果寫進報告中,

  • and then they put out reports,

    所以那些報告中會有這樣的數字。

  • and those reports will have numbers like this.

    那麼問題來了:他們怎麼會知道 我們的問題都是什麼?

  • The problem is, how do they know all of our questions?

    我們有很多問題。

  • We have lots of questions.

    事實上,可以說我們有無窮無盡的問題

  • In fact, in some ways there's literally an infinite number of questions

    有關我們這座城市。

  • that we can ask about our city.

    市政機關可無法跟得上(我們的節奏)。

  • The agencies can never keep up.

    現有模式並不具有實效,我覺得 我們的政策制定者也知道這點,

  • So the paradigm isn't exactly working, and I think our policymakers realize that,

    因為在2012年,彭博市長 簽署了一個法令,他稱之為

  • because in 2012, Mayor Bloomberg signed into law what he called

    全美最具雄心和綜合性的 開放數據立法。

  • the most ambitious and comprehensive open data legislation in the country.

    從各種意義上來說,他是對的。

  • In a lot of ways, he's right.

    在過去兩年中,市政有1000個數據庫

  • In the last two years, the city has released 1,000 datasets

    放在我們的開放數據門戶網站上,

  • on our open data portal,

    還是蠻驚人的。

  • and it's pretty awesome.

    我們來檢視這些數據,

  • So you go and look at data like this,

    除了數數計程車的數量,

  • and instead of just counting the number of cabs,

    我們也能開始問不一樣的問題了。

  • we can start to ask different questions.

    我有一個問題:

  • So I had a question.

    紐約市的交通高峰在什麼時候?

  • When's rush hour in New York City?

    這簡直煩人。高峰到底是什麼時候?

  • It can be pretty bothersome. When is rush hour exactly?

    我想到,這些計程車可不僅僅是個數字,

  • And I thought to myself, these cabs aren't just numbers,

    它們可以是開遍全市道路的GPS記錄儀,

  • these are GPS recorders driving around in our city streets

    記錄著乘客的每一次車程。

  • recording each and every ride they take.

    數據是現成的。我檢視它們,

  • There's data there, and I looked at that data,

    並制出一張圖表,標出 一天中紐約市計程車的平均時速。

  • and I made a plot of the average speed of taxis in New York City throughout the day.

    大家可以看到, 從半夜到凌晨五點十八分,

  • You can see that from about midnight to around 5:18 in the morning,

    時速一直在增加,然後到了拐點,

  • speed increases, and at that point, things turn around,

    時速逐漸下降,在早間的八點三十五分,

  • and they get slower and slower and slower until about 8:35 in the morning,

    時速降到十一英里半。

  • when they end up at around 11 and a half miles per hour.

    運營中計程車的平均時速 保持在十一英里半,

  • The average taxi is going 11 and a half miles per hour on our city streets,

    結果沒有變化,

  • and it turns out it stays that way

    整天都是如此。

  • for the entire day.

    (笑聲)

  • (Laughter)
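A note on how a plot like this might be computed: the sketch below groups trips by pickup hour and divides total miles by total hours driven. The record layout (pickup time, dropoff time, miles), the choice of a distance-weighted average, and the sample trips are all assumptions made for illustration; the talk does not say how the speeds were actually derived.

```python
from collections import defaultdict
from datetime import datetime


def average_speed_by_hour(trips):
    """Return {pickup_hour: miles per hour} for (pickup, dropoff, miles) records.

    Timestamps are ISO 8601 strings. The average is total miles divided by
    total driving time for trips that started in that hour (an assumption).
    """
    miles_by_hour = defaultdict(float)
    time_by_hour = defaultdict(float)
    for pickup, dropoff, miles in trips:
        start = datetime.fromisoformat(pickup)
        end = datetime.fromisoformat(dropoff)
        duration_hours = (end - start).total_seconds() / 3600
        if duration_hours <= 0:
            continue  # skip records with a zero or negative duration
        miles_by_hour[start.hour] += miles
        time_by_hour[start.hour] += duration_hours
    return {h: miles_by_hour[h] / time_by_hour[h] for h in sorted(miles_by_hour)}


# Made-up trips, not taken from the real taxi data.
sample_trips = [
    ("2013-01-01T05:00:00", "2013-01-01T05:10:00", 4.0),
    ("2013-01-01T09:00:00", "2013-01-01T09:30:00", 5.0),
    ("2013-01-01T09:15:00", "2013-01-01T09:45:00", 6.0),
]

for hour, mph in average_speed_by_hour(sample_trips).items():
    print(f"{hour:02d}:00  {mph:.1f} mph")
```

Bucketing by the minute instead of the hour would give the finer resolution the talk mentions (5:18, 8:35), at the cost of noisier averages.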

    我告訴自己,紐約市並不存在高峰時段,

  • So I said to myself, I guess there's no rush hour in New York City.

    而是全天都高峰。

  • There's just a rush day.

    這是個有意義的結論,原因有幾點。

  • Makes sense. And this is important for a couple of reasons.

    如果你是做交通規劃的, 知道這個結論會有意義。

  • If you're a transportation planner, this might be pretty interesting to know.

    如果你要快速到達某地,

  • But if you want to get somewhere quickly,

    只要把鬧鐘定在凌晨四點四十五分就行了。

  • you now know to set your alarm for 4:45 in the morning and you're all set.

    紐約嘛!

  • New York, right?

    但這個數據背後還有故事。

  • But there's a story behind this data.

    這個數據並不真的是現成的。

  • This data wasn't just available, it turns out.

    你需要做一個「信息自由法案申請」,

  • It actually came from something called a Freedom of Information Law Request,

    也叫「FOIL申請」。

  • or a FOIL Request.

    你可以在計程車和轎車委員會的網站上 找到相關申請表。

  • This is a form you can find on the Taxi and Limousine Commission website.

    如果要獲得這些數據, 你要弄到這張申請表,

  • In order to access this data, you need to go get this form,

    填好上交,受理人員屆時會通知你。

  • fill it out, and they will notify you,

    一個叫克里斯▪旺的人就這樣做了。

  • and a guy named Chris Whong did exactly that.

    克里斯來到委員會,工作人員告訴他

  • Chris went down, and they told him,

    「帶個全新的硬盤來辦公室,

  • "Just bring a brand new hard drive down to our office,

    我們會把相關數據拷貝給你, 過五小時來拿。」

  • leave it here for five hours, we'll copy the data and you take it back."

    這就是拿到數據的經過。

  • And that's where this data came from.

    克里斯想公開這些數據,

  • Now, Chris is the kind of guy who wants to make the data public,

    於是放到網路上供所有人使用, 所以我才能做出這張圖。

  • and so it ended up online for all to use, and that's where this graph came from.

    這一切——這些GPS記錄儀真是酷。

  • And the fact that it exists is amazing. These GPS recorders -- really cool.

    但是,市民要攜帶自己的移動硬盤

  • But the fact that we have citizens walking around with hard drives

    踏遍市政機關, 然後通過自己的努力公開,這件事——

  • picking up data from city agencies to make it public --

    政府數據可以說是公開的, 普通市民能得到它,

  • it was already kind of public, you could get to it,

    但這只是名義上的「公開」, 並不是真正的公開。

  • but it was "public," it wasn't public.

    我們的城市可以做得更好。

  • And we can do better than that as a city.

    我們不需要費力帶著移動硬盤到處跑。

  • We don't need our citizens walking around with hard drives.

    並不是每一個數據庫都需要FOIL申請。

  • Now, not every dataset is behind a FOIL Request.

    我做的這張地圖標出了紐約市最危險的路口,

  • Here is a map I made with the most dangerous intersections in New York City

    來源是腳踏車騎行者的交通事故數據。

  • based on cyclist accidents.

    紅色區域更危險,

  • So the red areas are more dangerous.

    圖上顯示,首先,曼哈頓的東側,

  • And what it shows is first the East Side of Manhattan,

    特別是曼哈頓的下城區域, 腳踏車事故更多。

  • especially in the lower area of Manhattan, has more cyclist accidents.

    這可能是因為,

  • That might make sense

    在這裡有更多的騎行者從大橋下來。

  • because there are more cyclists coming off the bridges there.

    圖上還有其他的熱點區域值得研究。

  • But there's other hotspots worth studying.

    威廉姆斯堡、皇后區的羅斯福大道,

  • There's Williamsburg. There's Roosevelt Avenue in Queens.

    這些資訊才是Vision Zero項目所需要的。

  • And this is exactly the kind of data we need for Vision Zero.

    這正是我們要找的東西。

  • This is exactly what we're looking for.

    這個數據背後也有個故事。

  • But there's a story behind this data as well.

    這個數據並不是現成的。

  • This data didn't just appear.

    有多少人知道這個符號?

  • How many of you guys know this logo?

    我看到有人點頭了。

  • Yeah, I see some shakes.

    你們有沒有試過從PDF文檔中 拷貝和黏貼數據,

  • Have you ever tried to copy and paste data out of a PDF

    並據此作出結論呢?

  • and make sense of it?

    我看到更多人點頭了。

  • I see more shakes.

    試圖拷貝粘貼的人 比認識這個標誌的人更多,真有趣。

  • More of you tried copying and pasting than knew the logo. I like that.

    你們剛剛看到的數據是做在PDF裡的。

  • So what happened is, the data that you just saw was actually on a PDF.

    事實上,是成千上萬頁的PDF文檔,

  • In fact, hundreds and hundreds and hundreds of pages of PDF

    由我們的紐約警署發佈。

  • put out by our very own NYPD,

    如果你想享用這些數據, 你要不就持續

  • and in order to access it, you would either have to copy and paste

    做複製黏貼的動作,花掉成千上萬小時,

  • for hundreds and hundreds of hours,

    要不就像約翰▪克勞斯一樣。

  • or you could be John Krauss.

    約翰▪克勞斯

  • John Krauss was like,

    可不想重複地去複製黏貼, 他寫了一個程式。

  • I'm not going to copy and paste this data. I'm going to write a program.

    這個程序叫做 「紐約警署交通事故數據OK蹦」,

  • It's called the NYPD Crash Data Band-Aid,

    它能到紐約警署的網站下載PDF文檔,

  • and it goes to the NYPD's website and it would download PDFs.

    每天它都去搜索; 如果找到一個PDF文檔,就下載下來,

  • Every day it would search; if it found a PDF, it would download it

    然後運行某個PDF解碼的程式,

  • and then it would run some PDF-scraping program,

    把其中的文字信息提取出來,

  • and out would come the text,

    其中的訊息會發佈在網路上, 人們就可以製作這些地圖。

  • and it would go on the Internet, and then people could make maps like that.
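The last step of that pipeline, turning extracted text into rows, can be sketched roughly as follows. This is not John Krauss's program: the page layout, the column names, and the idea that columns are separated by runs of two or more spaces are assumptions chosen for illustration, and the download and PDF-to-text steps are left out entirely.

```python
import csv
import io
import re


def text_to_rows(page_text, expected_columns):
    """Split each line on runs of 2+ spaces and keep lines with the right width."""
    rows = []
    for line in page_text.splitlines():
        cells = re.split(r"\s{2,}", line.strip())
        if len(cells) == expected_columns:
            rows.append(cells)  # page titles and blank lines are dropped
    return rows


def rows_to_csv(header, rows):
    """Serialize the rows as CSV text so they can be published or mapped."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical text, shaped like a table dumped from a PDF page.
page_text = """
Hypothetical Collision Report          Page 1
MAIN ST and 1 AVE    3    1
BROAD ST and 2 AVE    5    0
"""

rows = text_to_rows(page_text, expected_columns=3)
print(rows_to_csv(["intersection", "collisions", "cyclists_injured"], rows))
```

Real PDF text is rarely this tidy, which is part of why the talk argues for releasing the underlying data instead.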

    這些數據就在那兒,我們都能得到——

  • And the fact that the data's here, the fact that we have access to it --

    每一個交通事故就是一行數據。

  • Every accident, by the way, is a row in this table.

    你們可以想像有多少PDF需要轉碼。

  • You can imagine how many PDFs that is.

    ——我們能看到這些數據固然好,

  • The fact that we have access to that is great,

    但能不能不要弄成PDF格式的,

  • but let's not release it in PDF form,

    不然市民們就得去寫PDF解碼的程式,

  • because then we're having our citizens write PDF scrapers.

    這對市民的時間來說是一種浪費,

  • It's not the best use of our citizens' time,

    而我們的城市能做的更好。

  • and we as a city can do better than that.

    有個好消息,白思豪市長的班底

  • Now, the good news is that the de Blasio administration

    在幾個月前公開了這份數據,

  • actually recently released this data a few months ago,

    所以我們能直接享用這些數據,

  • and so now we can actually have access to it,

    然而還有很多數據是PDF格式的。

  • but there's a lot of data still entombed in PDF.

    譬如,我們的罪案數據目前只有PDF格式的。

  • For example, our crime data is still only available in PDF.

    除了罪案數據,市政預算也是如此。

  • And not just our crime data, our own city budget.

    目前我們的市政預算只有PDF格式的。

  • Our city budget is only readable right now in PDF form.

    不僅是我們無法分析這些數字,

  • And it's not just us that can't analyze it --

    那些為市政預算投票的立法委員們

  • our own legislators who vote for the budget

    也只能拿到PDF版本的數字。

  • also only get it in PDF.

    所以我們的立法委員是無法分析 他們要為之投票的市政預算的。

  • So our legislators cannot analyze the budget that they are voting for.

    我認為我們的城市還能做得更好。

  • And I think as a city we can do a little better than that as well.

    很多數據已經不躲在PDF中了。

  • Now, there's a lot of data that's not hidden in PDFs.

    這裡有一幅地圖可以作為例證,

  • This is an example of a map I made,

    標示了紐約市最骯髒的水路。

  • and this is the dirtiest waterways in New York City.

    我是如何衡量「骯髒」的呢?

  • Now, how do I measure dirty?

    這裡有些奇怪,

  • Well, it's kind of a little weird,

    我衡量的是糞便大腸菌群的水平,

  • but I looked at the level of fecal coliform,

    這是水路中糞便物質的一種衡量指標。

  • which is a measurement of fecal matter in each of our waterways.

    圓圈越大,水就越髒,

  • The larger the circle, the dirtier the water,

    所以圖上的大圓圈代表髒水, 小圓圈代表乾淨的水。

  • so the large circles are dirty water, the small circles are cleaner.

    大家看到的是內河水道。

  • What you see is inland waterways.

    這裡有紐約市過去五年採樣的所有數據。

  • This is all data that was sampled by the city over the last five years.

    內河水道總的來說變髒了。

  • And inland waterways are, in general, dirtier.

    這個結論挺合理的,對嗎?

  • That makes sense, right?

    大圓圈代表髒水。 我從中學到了幾件事情。

  • And the bigger circles are dirty. And I learned a few things from this.

    第一:千萬別在任何叫做「xx溪」 或「xx運河」的地方游泳。

  • Number one: Never swim in anything that ends in "creek" or "canal."

    但是第二:紐約市最髒的水路,

  • But number two: I also found the dirtiest waterway in New York City,

    只看(糞便大腸菌群)這個唯一的指標,

  • by this measure, one measure.

    在康尼島溪,幸好不是你們游泳的康尼島。

  • In Coney Island Creek, which is not the Coney Island you swim in, luckily.

    那在島的另一面。

  • It's on the other side.

    但在康尼島溪中, 過去五年的採樣中有94%

  • But Coney Island Creek, 94 percent of samples taken over the last five years

    含有超標的糞便含量,

  • have had fecal levels so high

    以至於達到州法律禁止游泳的水平。

  • that it would be against state law to swim in the water.
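A figure like "94 percent of samples" is just the share of readings above a limit. The sketch below shows that arithmetic; the readings and the limit are invented for illustration, and the actual state threshold is not given in the talk.

```python
def share_above_limit(samples, limit):
    """Return the fraction of samples whose value is strictly above the limit."""
    if not samples:
        raise ValueError("need at least one sample")
    over = sum(1 for value in samples if value > limit)
    return over / len(samples)


# Hypothetical readings and a hypothetical limit, not real measurements.
sample_counts = [120, 450, 80, 900, 300]
HYPOTHETICAL_LIMIT = 200

share = share_above_limit(sample_counts, HYPOTHETICAL_LIMIT)
print(f"{share:.0%} of samples exceed the limit")
```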

    這種類型的事實

  • And this is not the kind of fact that you're going to see

    你可不會在市政報告中看到,不是嗎?

  • boasted in a city report, right?

    這也不會登上紐約市政府網站的頭條。

  • It's not going to be the front page on nyc.gov.

    我們肯定不會看到的,

  • You're not going to see it there,

    但能看到這些數據真是不錯。

  • but the fact that we can get to that data is awesome.

    同樣,拿到這些數據並不容易,

  • But once again, it wasn't super easy,

    因為它們並不在公開數據門戶網站上。

  • because this data was not on the open data portal.

    如果你看公開數據的門戶網站,

  • If you were to go to the open data portal,

    你只能看到其中一些片段, 只有一年內或幾個月的數據。

  • you'd see just a snippet of it, a year or a few months.

    這些數據其實是在環境保護部門的網站上。

  • It was actually on the Department of Environmental Protection's website.

    每一個鏈接都是一個Excel文件, 而每個Excel文件都是不一樣的。

  • And each one of these links is an Excel sheet, and each Excel sheet is different.

    每一個表頭都不同: 需要複製、黏貼、還有重新整理。

  • Every heading is different: you copy, paste, reorganize.

    一旦完成你就能做出這些地圖, 但我要再次重申,

  • When you do you can make maps and that's great, but once again,

    我們的城市能做的更好, 我們可以標準化。

  • we can do better than that as a city, we can normalize things.
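The "copy, paste, reorganize" step can be partly automated by mapping each sheet's headings onto one agreed set of names. The alias table below is hypothetical; the real headings in the Department of Environmental Protection sheets are not listed in the talk.

```python
# Hypothetical aliases: every variant heading maps to one canonical name.
HEADER_ALIASES = {
    "site": "site",
    "station": "site",
    "sampling location": "site",
    "date": "date",
    "sample date": "date",
    "fecal coliform": "fecal_coliform",
    "fecal coliform (#/100 ml)": "fecal_coliform",
}


def normalize_header(name):
    """Lowercase, collapse whitespace, then look the heading up in the alias table."""
    key = " ".join(name.strip().lower().split())
    return HEADER_ALIASES.get(key, key)  # unknown headings pass through cleaned


def normalize_sheet(rows):
    """rows is a list of dicts (one per spreadsheet row) keyed by raw headings."""
    return [{normalize_header(k): v for k, v in row.items()} for row in rows]


sheet_a = [{"Station": "CIC1", "Sample  Date": "2013-06-01", "Fecal Coliform": 900}]
sheet_b = [{"Site": "CIC1", "Date": "2014-06-01", "Fecal coliform (#/100 mL)": 450}]

combined = normalize_sheet(sheet_a) + normalize_sheet(sheet_b)
print(combined)
```

Once every sheet shares the same keys, the rows can be concatenated and mapped without per-file handling.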

    我們正在改善,因為有個 索克拉塔公司建立的網站

  • And we're getting there, because there's this website that Socrata makes

    叫做「紐約市公開數據門戶」。

  • called the Open Data Portal NYC.

    這裡,1100個數據庫

  • This is where 1,100 datasets that don't suffer

    都不存在標準化的問題,

  • from the things I just told you live,

    而且(這些無縫連接的數據庫)數字還在增加。

  • and that number is growing, and that's great.

    你可以下載任一格式的數據: CSV、PDF或Excel文件都可以。

  • You can download data in any format, be it CSV or PDF or Excel document.

    按你自己的需求來下載。

  • Whatever you want, you can download the data that way.

    但問題又來了,

  • The problem is, once you do,

    你會發現不同的機構 用不同的代碼來表示地址。

  • you will find that each agency codes their addresses differently.

    有街道名、有路口名、

  • So one is street name, intersection street,

    行政區、地址、建築物、建築物地址等等。

  • street, borough, address, building, building address.

    所以,即使有這個門戶網站的幫助,

  • So once again, you're spending time, even when we have this portal,

    你還得花時間來標準化地址這塊的數據。

  • you're spending time normalizing our address fields.
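Normalizing address fields usually means folding each agency's layout into one shape. The sketch below uses the field names mentioned above (street name, intersection street, street, borough, address, building, building address), but which agency uses which, and the output format, are assumptions for illustration.

```python
def first_present(record, *keys):
    """Return the first non-empty value among the given keys, stripped, else ''."""
    for key in keys:
        value = record.get(key)
        if value:
            return str(value).strip()
    return ""


def normalize_address(record):
    """Fold differently named address fields into {'address', 'borough'}."""
    street = first_present(record, "street name", "street")
    cross = first_present(record, "intersection street")
    full = first_present(record, "building address", "address")
    building = first_present(record, "building")

    if full:
        line = full
    elif building and street:
        line = f"{building} {street}"
    elif street and cross:
        line = f"{street} & {cross}"
    else:
        line = street

    return {
        "address": " ".join(line.upper().split()),  # uppercase, single spaces
        "borough": first_present(record, "borough").upper(),
    }


print(normalize_address({"street name": "Main St", "intersection street": "1 Ave", "borough": "Queens"}))
print(normalize_address({"building": "12", "street": "main  st", "borough": "queens"}))
```

A real version would also need to reconcile abbreviations (St vs. Street) and borough codes, which is exactly the work the talk suggests the city should do once, centrally.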

    這也不是有效利用市民時間的方法。

  • And that's not the best use of our citizens' time.

    我們的城市能做得更好。

  • We can do better than that as a city.

    我們可以對地址進行標準化,

  • We can standardize our addresses,

    如果做到了, 我們就能做出更多這樣的地圖。

  • and if we do, we can get more maps like this.

    這是紐約市消防龍頭的地圖,

  • This is a map of fire hydrants in New York City,

    但不僅於此。

  • but not just any fire hydrants.

    這些是前250個吃到最多違章停車罰單的 消防栓位置圖。

  • These are the top 250 grossing fire hydrants in terms of parking tickets.

    (笑聲)

  • (Laughter)
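Ranking hydrants by ticket revenue comes down to summing fines per location and taking the largest totals. The sketch below assumes each ticket is a (location, fine) pair; the hydrant labels and fine amounts are made up, and the real ticket data has a different, richer layout.

```python
from collections import Counter


def top_grossing(tickets, n):
    """Sum fines per location and return the n largest (location, total) pairs."""
    totals = Counter()
    for location, fine in tickets:
        totals[location] += fine
    return totals.most_common(n)


# Invented tickets: (hydrant label, fine in dollars).
sample_tickets = [
    ("hydrant-1", 115),
    ("hydrant-2", 115),
    ("hydrant-1", 115),
    ("hydrant-3", 115),
    ("hydrant-1", 115),
    ("hydrant-2", 115),
]

for location, total in top_grossing(sample_tickets, 2):
    print(location, total)
```

Calling `top_grossing(tickets, 250)` on a full year of tickets would give a list comparable to the one behind this map, assuming tickets can be matched to a hydrant location in the first place.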

    我從圖中學到了幾件事, 我也真的喜歡這張圖。

  • So I learned a few things from this map, and I really like this map.

    第一:別在上東區停車。

  • Number one, just don't park on the Upper East Side.

    千萬別停。因為不管停哪裡都會吃罰單。

  • Just don't. It doesn't matter where you park, you will get a hydrant ticket.

    第二:我找出了全紐約市最最容易 吃到違章停車罰單的兩個消防栓的位置,

  • Number two, I found the two highest grossing hydrants in all of New York City,

    兩個都在下東區,

  • and they're on the Lower East Side,

    每年能在罰單上創收五萬五千多美金。

  • and they were bringing in over 55,000 dollars a year in parking tickets.

    我注意到這點,覺得有些奇怪,

  • And that seemed a little strange to me when I noticed it,

    於是深入挖掘了一下原因, 結果發現消防栓

  • so I did a little digging and it turns out what you had is a hydrant

    都有一個叫做控制擴展的區域,

  • and then something called a curb extension,

    是約有七英呎的一塊地方,可以走路,

  • which is like a seven-foot space to walk on,

    然後是一個停車位。

  • and then a parking spot.

    所以車開過來,司機發現消防栓,

  • And so these cars came along, and the hydrant --

    想“還有一段距離,這裡沒問題的”,

  • "It's all the way over there, I'm fine,"

    何況地上還有一個畫得美美的停車位,

  • and there was actually a parking spot painted there beautifully for them.

    司機停好車,但紐約警署不同意這種配置,

  • They would park there, and the NYPD disagreed with this designation

    開出了罰單。

  • and would ticket them.

    可不只是我本人吃了罰單,

  • And it wasn't just me who found a parking ticket.

    這是谷歌街景拍到的一輛過路車,

  • This is the Google Street View car driving by

    也吃了同樣的一張罰單。

  • finding the same parking ticket.

    於是我把這件事發到自己的部落格上 以及“I Quant NY”上,

  • So I wrote about this on my blog, on I Quant NY, and the DOT responded,

    結果交通部門回復如下:

  • and they said,

    “交通部並未就此地點收到相關投訴,

  • "While the DOT has not received any complaints about this location,

    我們會重新檢視道路標誌, 並做出適當的改善措施。”

  • we will review the roadway markings and make any appropriate alterations."

    我暗自想:多麼官腔!

  • And I thought to myself, typical government response,

    好吧,我該幹嘛幹嘛去了。

  • all right, moved on with my life.

    然而,幾週時間過去, 發生了意料之外的事情。

  • But then, a few weeks later, something incredible happened.

    停車位重新畫了,

  • They repainted the spot,

    那一瞬間我覺得能看到公開數據的未來。

  • and for a second I thought I saw the future of open data,

    大家想想這件事,

  • because think about what happened here.

    過去五年,這個讓人困惑的停車位 一直讓人吃罰單,

  • For five years, this spot was being ticketed, and it was confusing,

    但某一天,一位市民發現了問題 報告市政機關,又過了幾週時間,

  • and then a citizen found something, they told the city, and within a few weeks

    問題車位被修正了。

  • the problem was fixed.

    太不可思議了。很多人認為 公開數據讓市民變成政府的監視者,

  • It's amazing. And a lot of people see open data as being a watchdog.

    並非如此,它實則讓人們成為了合作夥伴。

  • It's not, it's about being a partner.

    市民能夠有底氣成為政府更好的合作夥伴,

  • We can empower our citizens to be better partners for government,

    這並不難。

  • and it's not that hard.

    我們只需要作出一些改變。

  • All we need are a few changes.

    如果我們在申請FOIL信息自由法案數據,

  • If you're FOILing data,

    如果你看到自己申請的數據已經被反覆申請,

  • if you're seeing your data being FOILed over and over again,

    讓我們直接向公眾公開, 因為反覆申請就是需要公開的一种信號。

  • let's release it to the public, that's a sign that it should be made public.

    如果某個政府機關正在發佈PDF數據,

  • And if you're a government agency releasing a PDF,

    讓我們通過法案 要求他們發佈隱藏的數據,

  • let's pass legislation that requires you to post it with the underlying data,

    因為這些數據必定有來源。

  • because that data is coming from somewhere.

    我不知道從哪兒,但肯定有來源,