7 Database Paradigms

My dad used to tell me, "Use the right tool for the job, and not vice versa."
When it comes to app development, choosing the right database is one of the most important decisions you'll ever make.
In today's video, we'll look at seven different database paradigms, along with some databases you've probably heard of and some others that you haven't.
We'll look at how they work from a technical perspective, but most importantly, we'll try to understand what they're best used for, because my dad also used to tell me, "Don't bring a knife to a gunfight."
If you're new here, like and subscribe, and also check out my new top 7 playlist for more videos like this.
Our list will start with the simplest type of database and gradually become more complex as we get to number seven.
And that brings us to our first paradigm, the key-value database.
Popular databases in this space include Redis, Memcached, and etcd.
The database itself is structured almost like a JavaScript object or Python dictionary.
You have a set of keys, where every key is unique and points to some value.
In Redis, for example, we can read and write data using commands.
We use the SET command, followed by a key and a value, to write data, then the GET command to retrieve that data in the future.
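
To make the set-then-read idea concrete, here is a minimal Python sketch of a key-value store. It is a toy built on a dictionary, not how Redis is actually implemented, and the class name and the example key are made up for illustration.

```python
# Toy in-memory key-value store; an illustration, not how Redis works internally.
class KeyValueStore:
    def __init__(self):
        self._data = {}  # every key is unique and points to exactly one value

    def set(self, key, value):
        """Write a value, overwriting whatever the key pointed to before."""
        self._data[key] = value

    def get(self, key):
        """Return the stored value, or None when the key is missing."""
        return self._data.get(key)


store = KeyValueStore()
store.set("user:23:bio", "I like turtles")   # hypothetical key and value
print(store.get("user:23:bio"))
print(store.get("user:99:bio"))              # a key that was never written
```

Notice there is no way to ask "which users like turtles?" here. You can only go from a key to its value, which is exactly the limitation that comes up next.
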
In the case of Redis and Memcached, all of the data is held in the machine's memory, as opposed to most other databases, which keep all their data on disk.
This limits the amount of data you can store; however, it makes the database extremely fast, because it doesn't require a round trip to the disk for every operation.
In addition, it doesn't perform queries, joins, or anything like that, so your data modeling options are very limited, but again, it's extremely fast, like sub-millisecond fast.
You wouldn't want to use a key-value store for your main app data.
Most often, they're used as a cache to reduce data latency.
Apps like Twitter, GitHub, and Snapchat all use Redis for real-time delivery of their data.
There are other use cases beyond caching, like message queues, pub/sub, and gaming leaderboards, but more often than not, key-value databases are used as a cache on top of some other persistent data layer.
Now, a database that only supports key-value pairs is obviously pretty limited, and that brings us to the wide-column database.
Popular options in this family include Cassandra and HBase.
A wide-column database is like you took a key-value database and added a second dimension to it.
At the outer layer, you have a keyspace, which holds one or more column families, and each column family holds a set of ordered rows.
This makes it possible to group related data together, but unlike a relational database, it doesn't have a schema, so it can handle unstructured data.
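
One way to picture that layout is as nested dictionaries: keyspace, then column family, then ordered rows, then columns. The Python sketch below is only a mental model, not Cassandra's or HBase's storage format, and the column family name and row keys are invented.

```python
# Toy model of the layout: keyspace -> column family -> ordered rows -> columns.
keyspace = {
    "sensor_readings": {  # a column family (hypothetical name)
        "device-1|2024-01-01T00:00": {"temp_c": 21.5},
        "device-1|2024-01-01T00:05": {"temp_c": 21.7, "humidity": 40},
        "device-2|2024-01-01T00:00": {"battery": 0.93},
    }
}


def rows_with_prefix(column_family, prefix):
    """Return the rows whose key starts with prefix, in key order."""
    rows = keyspace[column_family]
    return [(key, rows[key]) for key in sorted(rows) if key.startswith(prefix)]


for key, columns in rows_with_prefix("sensor_readings", "device-1|"):
    print(key, columns)
```

Each row carries its own set of columns, which is the "no schema" part, and keeping rows ordered by key is what makes range reads over something like a time window cheap.
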
This is nice for developers, because you get a query language called CQL that's very similar to SQL, although it's much more limited and you can't do joins.
However, it's much easier to scale up and replicate data across multiple nodes.
Unlike an SQL database, it's decentralized and can scale horizontally.
A popular use case is scaling a large amount of time-series data, like records from an IoT device, weather sensors, or in the case of Netflix, a history of the different shows you've watched.
It's often used in situations where you have frequent writes, but infrequent updates and reads.
It's not going to be your primary app database.
For that, you'll need something more general purpose, like a document-oriented database.
Popular options include Firestore, DynamoDB, CouchDB, and a few others.
In this paradigm, you have documents, where each document is a container for key-value pairs.
They're unstructured and don't require a schema.
Then the documents are grouped together in collections.
Fields within a collection can be indexed, and collections can be organized into a logical hierarchy, allowing you to model and retrieve relational data to a pretty significant degree.
They don't support joins, so instead of normalizing your data into a bunch of small parts, you're encouraged to embed the data into a single document.
This creates a tradeoff where reads from a front-end application are much faster; however, writing or updating data tends to be more complex.
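
Here is a small Python sketch of that tradeoff, using a dictionary as a stand-in for a collection. The document shape, IDs, and function names are assumptions for illustration and don't come from any particular database's API.

```python
# An embedded document: one read returns the post together with its comments.
posts = {  # a collection, keyed by document ID
    "post-1": {
        "title": "7 Database Paradigms",
        "author": {"id": "user-1", "name": "Ada"},  # copied in, not referenced
        "comments": [
            {"author_name": "Ada", "text": "First!"},
            {"author_name": "Grace", "text": "Nice overview."},
        ],
    }
}


def read_post(post_id):
    """The read side: a single lookup, no join needed."""
    return posts[post_id]


def rename_user(user_id, old_name, new_name):
    """The write side: every embedded copy of the name has to be updated."""
    updated = 0
    for post in posts.values():
        if post["author"]["id"] == user_id:
            post["author"]["name"] = new_name
            updated += 1
        for comment in post["comments"]:
            if comment["author_name"] == old_name:
                comment["author_name"] = new_name
                updated += 1
    return updated


print(read_post("post-1")["comments"][1]["text"])
print(rename_user("user-1", "Ada", "Ada L."))  # number of copies touched
```

Reading is one lookup, but a simple rename has to hunt down every place the name was copied.
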
Document databases are far more general purpose than the other options we've looked at so far.
From a developer perspective, they're very easy to use.
They're often suitable for mobile games, IoT, content management, and many other use cases.
If you're not exactly sure how your data is structured at this point, a document database is probably the best place to start.
Where they generally fall short is when you have a lot of disconnected but related data that is updated often, like a social app that has many users who have many friends who have many comments that have many likes, and you want to see all the comments that your friends like.
Data like this needs to be joined, and that's not easily done in a document database at scale.
Luckily, we have this thing that's been around forever called the relational database.
You're likely familiar with this type of database, with flavors like MySQL, Postgres, SQL Server, and many others.
They've been around for nearly 50 years and continue to be one of the most popular types of databases in today's world.
They were originally conceived by a computer scientist named Ted Codd.
He worked for IBM and spent years working out his theories on relational data modeling.
You can read his original paper online, and most of it goes way over my head, but you can appreciate the amount of math and science that went into the development of relational databases, and that's very likely why they remain so popular today.
A few years later, this would inspire the development of SQL, or Structured Query Language, or "sequel" if you prefer.
It's a special type of programming language called a query language that allows you to access and write data in the database.
Okay, but what do we actually mean when we say relational database?
Well, imagine you have a facility that builds airplanes.
The facility is your database, and in that database you might have different warehouses that hold different parts, like engines, wheels, and so on.
Each warehouse is like a database table for holding a certain type of part.
Each individual part has a serial number to uniquely identify it, and you can think of an individual part as a row in a table.
So now that we have all these parts separated into different warehouses, how do we build an airplane?
That's where relationships come in.
We can build an airplane by referencing the unique ID of the different parts that go into it.
Notice how each airplane has its own unique ID.
This is known as its primary key. It then defines its various parts by referencing their IDs.
These are known as foreign keys because they reference data in a different table.
Now if we want to join all this data together, we can run a query to do that.
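
The airplane analogy can be sketched with Python's built-in sqlite3 module. The table and column names below are made up to match the analogy; the point is just to show a primary key, two foreign keys, and the join that stitches the rows back together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE engines (id INTEGER PRIMARY KEY, model TEXT);
    CREATE TABLE wheels  (id INTEGER PRIMARY KEY, size_cm INTEGER);
    CREATE TABLE airplanes (
        id        INTEGER PRIMARY KEY,               -- the airplane's primary key
        engine_id INTEGER REFERENCES engines(id),    -- foreign key into engines
        wheel_id  INTEGER REFERENCES wheels(id)      -- foreign key into wheels
    );
    INSERT INTO engines VALUES (1, 'turbofan');
    INSERT INTO wheels VALUES (7, 120);
    INSERT INTO airplanes VALUES (100, 1, 7);
""")

# Join the parts back together by following the foreign keys.
row = conn.execute("""
    SELECT airplanes.id, engines.model, wheels.size_cm
    FROM airplanes
    JOIN engines ON engines.id = airplanes.engine_id
    JOIN wheels  ON wheels.id  = airplanes.wheel_id
""").fetchone()
print(row)
```

Each fact lives in exactly one table, and the airplane row only stores IDs that point at the parts.
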
So the main takeaway here is that an SQL database organizes data in its smallest normalized form.
However, a potential drawback here is that it requires a schema.
If you don't know the right data shape up front, they can be a little harder to work with.
SQL databases are also ACID compliant, which means whenever there's a transaction in the database, data validity is guaranteed even if there are network or hardware failures.
That's essential for things like banks and financial institutions, but it makes this type of database inherently more difficult to scale.
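
To give a feel for what a transaction buys you, here is a small sqlite3 sketch of the atomicity part of ACID: a bank transfer either applies both updates or neither. The accounts table and the transfer function are hypothetical, and a simulated constraint failure stands in for the nastier failures mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(sender, receiver, amount):
    """Move money in one transaction; returns True only if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:  # the CHECK constraint rejected an overdraft
        return False


print(transfer("alice", "bob", 500))  # alice only has 100
print(conn.execute("SELECT name, balance FROM accounts ORDER BY name").fetchall())
```

The credit to bob runs first and succeeds, but because the debit fails, the whole transaction is rolled back and no money appears out of thin air.
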
However, it's worth noting that there are modern SQL databases like CockroachDB that are specifically designed to operate at scale.
In any case, relational databases remain the most popular type of database in production today.
But what if, instead of modeling a relationship in a schema, we just treated the relationship itself as data?
Enter the graph database, where your data is represented as nodes and the relationships between them as edges.
Popular options in this space include Neo4j and Dgraph.
Let's imagine we want to set up a many-to-many relationship in an SQL database.
We do that by setting up a join table with the foreign keys that define the relationship.
In a graph database, we don't need this middleman table.
We just define an edge and connect it to the other record.
We can now query this data with a statement that's much more concise and readable.
In addition, we can achieve much better performance, especially on larger datasets.
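
Here is a toy Python sketch of "the relationship itself is data": every relationship is stored as one edge, and a query is just a walk along edges. The node names and edge labels are invented, and a real graph database would index its edges rather than scan them all like this does, so treat it as a picture of the data model and not of the performance.

```python
# Toy graph: each relationship is stored as one (source, label, target) edge.
edges = {
    ("alice", "FRIEND", "bob"),
    ("alice", "FRIEND", "carol"),
    ("bob", "LIKES", "comment-1"),
    ("carol", "LIKES", "comment-2"),
    ("dave", "LIKES", "comment-3"),
}


def neighbors(node, label):
    """Follow every outgoing edge with the given label."""
    return {
        target
        for source, edge_label, target in edges
        if source == node and edge_label == label
    }


def comments_liked_by_friends(user):
    """Two-hop traversal: user -FRIEND-> friend -LIKES-> comment."""
    liked = set()
    for friend in neighbors(user, "FRIEND"):
        liked |= neighbors(friend, "LIKES")
    return sorted(liked)


print(comments_liked_by_friends("alice"))
```

There is no join table anywhere; adding a new kind of relationship is just adding edges with a new label.
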
Graph databases can be a great alternative to SQL, especially if you're running a lot of joins and performance is taking a hit because of that.
They're often used for fraud detection in finance, for building internal knowledge graphs within companies, and to power recommendation engines like the one used by Airbnb.
Now let's imagine you want to build something like Google.
A user provides a small amount of text, then your database needs to return the most relevant results, ranked in the proper order, from a huge amount of data.
For that, you're going to want a full-text search engine.
Most of the databases in this space, like Solr and Elasticsearch, are built on top of the Apache Lucene project, which has been around since 1999.
In addition, we have cloud-based options like Algolia, and my new personal favorite, MeiliSearch, a Rust-based full-text search engine.
If you want to check it out, I have a full tutorial on Fireship.io for pro members.
From a developer perspective, they work very similarly to a document-oriented database.
You start with an index, then you add a bunch of data objects to it.
The difference is that under the hood, the search database will analyze all of the text in each document and create an index of the searchable terms.
So essentially, it works just like the index that you would find in a textbook.
When a user performs a search, it only has to scan the index as opposed to every document in the database, and that makes it very fast even on large datasets.
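
That textbook-style index is usually called an inverted index. Below is a bare-bones Python sketch of one, assuming whitespace tokenization and exact term matching; the sample documents are made up, and real engines layer analysis, ranking, and typo tolerance on top of this.

```python
from collections import defaultdict

documents = {
    1: "Redis is a key-value database",
    2: "Cassandra is a wide-column database",
    3: "Neo4j stores nodes and edges",
}

# Build the inverted index once: term -> set of IDs of documents containing it.
index = defaultdict(set)
for doc_id, text in documents.items():
    for term in text.lower().split():
        index[term].add(doc_id)


def search(query):
    """Return IDs of documents containing every query term, using only the index."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set(index.get(terms[0], set()))
    for term in terms[1:]:
        matches &= index.get(term, set())
    return sorted(matches)


print(search("database"))
print(search("wide-column database"))
```

The work of reading every document happens once, at indexing time, so a search is just a few set lookups.
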
The database can also run a variety of different algorithms to rank those results, filter out irrelevant hits, handle typos, and so on.
This does add a lot of overhead, and they can be expensive to run at scale, but at the same time, they can add a ton of value to the user experience if you're building something like a type-ahead search box.
And with that, we've reached number seven, the multi-model database, which in my opinion is the most exciting paradigm on this list.
There are a few different options out there, but the database I want to focus on here is FaunaDB, which is very different from anything else we've looked at so far.
If you're a front-end developer, all you really care about is the data that you consume in the front-end application.
You just want some JSON.
You don't want to have to think about data modeling, schemas, replication, shards, or anything like that.
With FaunaDB, you describe how you want to access your data using GraphQL.
In this example, we have a user model and a post model, where a user can have many posts.
If we upload our GraphQL schema to Fauna, it automatically creates collections where we can store data, and an index to query the data.
Behind the scenes, it's figuring out how to take advantage of multiple database paradigms, like graph, relational, and document, and determining how to best use these paradigms based on the GraphQL code you provided.
You create data by adding documents to collections just like you would with a document database, but you're not stuck with the inherent limitations when it comes to data modeling.
On top of that, it's ACID compliant, extremely fast, and you never have to worry about provisioning the actual infrastructure.
You just decide how you want to consume your data, and you let the cloud figure everything else out for you.
I'm going to go ahead and wrap things up there.
We didn't cover every single database paradigm.
There are a few others that you might want to know about, like time-series databases and data warehouses.
And if you want to learn advanced data modeling concepts, consider becoming a pro member at Fireship.io and taking the Firestore data modeling course.
Thanks for watching, and I will see you in the next one.