Placeholder Image

字幕列表 影片播放

  • What

  • is the first thing that comes to your mind when you hear the worddatabase”?

  • For many people, this question is more challenging than it might seem at first.

  • An answer like “a big file where much information is storedis not satisfactory and would

  • not please potential employers.

  • You should remember there are two main types of databasesrelational and non-relational.

  • The former will be the focus of this course, while the latter regards more complex systems.

  • Although understanding non-relational databases requires a serious mathematical and programming

  • background, some of the logic applied in its coding is the same as SQL.

  • Likewise, relational databases have a few advantages on their own.

  • A small bit of theory will explain why they are still the preferred choice in many companies

  • and institutions.

  • Databasesmain goal is to organize huge amounts of data that can be quickly retrieved

  • upon usersrequest.

  • Therefore, they must be compact, well-structured, and efficient in terms of speed of data extraction.

  • Today, people need such extra efficiency because data occupies memory space andthe bigger

  • its size, the more sluggish the database is and the slower the retrieval process becomes.

  • If we have a database containing one multi-million-row table, with many columns, then every time

  • a request has been received, the server must load all the records, with all fields, and

  • it would take too much time for a task to be completed.

  • Don’t forget every symbol is a container of information and requires bytes of storage

  • space.

  • Hence, loading that much data will not be an easy job for the computer.

  • So, what allows us to contain so much data on the server, but lets us efficiently use

  • only the portions we need for our analysis?

  • The secret lies behind the use of mathematical logic originating from relational algebra.

  • Please, don’t worry - we will not bother you with math.

  • Imagine each table with data is represented by a transparent circle that contains all

  • the data values of the table, categorized by columns or, as we will often call them,

  • fields.

  • Now, if our database consisted of only one table, a giant circle would represent the

  • entire database, something like this huge table from our fictional example with the

  • Salesdatabase.

  • And when we need a piece of information from the database, for example if we wish to see

  • who has bought something on a certain date, we will have to lift this whole big circle

  • and then search for what we need.

  • This challenge seems vague and the process of data extraction will not be efficient.

  • See what can happen if we split the circle into 3 smaller circles, just as we did with

  • theSalesdatabase.

  • One circle will stand for theSalestable, the other forCustomers”, and

  • the last one forItems”.

  • There are various theoretical combinations between 3 or more circles, but in our database,

  • we have the following model.

  • SalesandCustomershave the samecustomer IDcolumn, andSales

  • andItemshave the sameitem codecolumn.

  • This way, we can see the circles overlap as they have common fields.

  • So, if we’d like to extract the same information, the names of the customers who have purchased

  • something on a given date, we will need only thedate of purchasecolumn from the

  • Salestable and thefirstandlast namefrom theCustomerstable.

  • So, to satisfy this request, we will not need to lift the third circle from our database,

  • Items”.

  • This way, we can save energy or, more technically, increase efficiency.

  • Less data, represented as only two of the three circles, will be involved in this operation.

  • The mathematical trick lies in relating the tables to one another.

  • Relationships were formed namely through these common fields.

  • Anyway, I am sure that now you understand why we use the term relational databases.

  • Some professionals may refer to the tables, or the circles in our plot, as relations because,

  • theoretically, they are the smallest units in the entire system that can carry integral

  • logical meaning.

  • Likewise, the three circles are all part of ourSalesdatabase.

  • When we combine the database and its existing relations, we obtain the famous term relational

  • database management system, frequently abbreviated as RDBMS.

  • ? We hope this theoretical illustration makes

  • things clearer.

  • SQL is designed for managing relational database management systems and can do that by creating

  • relations between the different tables in a database.

  • To complete our theoretical preparation, we will need to learn more database terminology.

  • This is a necessary step that will help us when we start coding in SQL later.

  • Let’s go through the entire process of creating a database.

  • Assume our database containing customer sales data has not been set up yet, ok?

  • So, imagine you are the shop owner and you realize you have been selling goods quite

  • well recently, and you have more than a million rows of data.

  • What do you need, then?

  • A database!

  • But you know nothing about databases.

  • Who do you call, an SQL specialist?

  • No.

  • You need a database designer.

  • She will be in charge of deciding how to organize the data into tables and how to set up the

  • relations between these tables.

  • This step is crucial.

  • If the database design is not perfect from the beginning, your system will be difficult

  • to work with and wouldn’t facilitate your business needs; you will have to start over

  • again.

  • Considering the time and data (…and money!) involved in the process, you want to avoid

  • going back to point 0.

  • What do database designers actually do?

  • They plot the entire database system on a canvas using a visualization tool.

  • There are two main ways to do that.

  • One is drawing an Entity-Relationship diagram, an ER diagram for brief.

  • It looks like thisand, as its name suggests, the different figures represent different

  • data entities and the specific relationships we have between entities.

  • The connections between tables are indicated with lines.

  • This way of representing databases is powerful and professional, but it is complicated.

  • We will not focus in-depth on ER diagrams in this course, but you should know they exist

  • and refer to the process of database design.

  • Another form of representation of a database is the relational schema.

  • This is an existing idea of how the database must be organized.

  • It is useful when you are certain of the structure and organization of the database you would

  • like to create.

  • More precisely, a relational schema would look like this.

  • It represents a table in the shape of a rectangle.

  • The name of the table is at the top of the rectangle.

  • The column names are listed below.

  • All relational schemas in a database form the database schema.

  • You can also see lines indicating how tables are related in the database.

  • To this moment, it has been ideas, planning, abstract thinking, and design.

  • At this stage, it would be correct to say SQL can be used to set up the database physically,

  • as opposed to contriving it abstractly.

  • Then, you can enjoy the advantages of data manipulation.

  • It will allow you to use your dataset to extract business insights that aim to improve the

  • performance and efficiency of the business you are working for.

  • This process is interesting, and, practically, the main part of the course will be related

  • to that.

  • Rememberwell thought-out databases that are carefully designed and created are crucial

  • prerequisites for data manipulation.

  • If we have done good work with these steps, we could write effective queries and navigate

  • in a database rather quickly.

  • You will often hear the term database management.

  • It comprises all these steps a business undertakes to design, create, and manipulate its databases

  • successfully.

  • Finally, database administration is the most frequently encountered job amongst all.

  • A database administrator is the person providing daily care and maintenance of a database.

  • Her scope of responsibilities is narrower regarding the ones carried out by a database

  • manager, but she is still indispensable for the database department of a company.

  • All right!

  • So

  • In this lesson, we will use theSalesdatabase to illustrate the concept of relational

  • schemas a little better.

  • The data will be stored in 4 tables – “Sales”, “Customers”, “Items”, andCompanies”.

  • You see how these tables have a tabular shape.

  • Let’s see how a relational schema can be applied to represent them.

  • Let’s build the landscape piece by piece.

  • We care about the sales per customer of our company, and that’s why we have a “Sales

  • table.

  • It tells us the number with which each purchase was registered, the date of the purchase,

  • the respective customer ID, and the item code.

  • So, in this table, the dates of a few purchases may coincide, because it is normal to sell

  • some of the same goods in the same day.

  • Analogically, in the third column, the ID of a customer may appear a few times, given

  • that some customers make more than one purchase.

  • People may have bought many units of the same product; hence, there is a possibility to

  • see the same item code a few times in the last column.

  • This reasoning does not apply to the first field, though.

  • Each purchase is unique!

  • In databases, this means all the numbers in this column will be different, because each

  • purchase will be assigned a specific number that is unique.

  • There is a term for such type of field.

  • A column (or a set of columns) whose value exists and is unique for every record in a

  • table is called a primary key.

  • Each table can have one and only one primary key.

  • In one table, you cannot have 3 or 4 primary keys.

  • For instance, in ourSalestable, “number of purchasecan act as a single-column

  • primary key, and there will be no other primary keys.

  • If the database creators decide, the primary key of a table may be composed of a set of

  • columns, not of just one column.

  • In theSalestable, bothnumber of purchaseanddate of purchasecan

  • be thought of as a unique-identifier pair for the data in this table.

  • For instance, purchase number 1 and purchase date, which is the 3rd of September, will

  • form a unique pair and so will purchase number 2 and the same date, 3rd of September.

  • This means these 2 rows are different.

  • Please, don’t overlook the fact you cannot have a combination of the same purchase number,

  • 1, and the same date, 3rd of September, more than once in your table.

  • Either the number or the date must be different.

  • This is because, as we already said, a primary key must be unique.

  • Good!

  • Both ways of representation make sense.

  • Since having a one-column primary key here would mean all purchases in our database will

  • be recorded under a different number, this means they can go from 1 to, hopefully, a

  • million, or two

  • This logic will preserve the uniqueness of the purchases.

  • Ok.

  • So, we will proceed withpurchase numberas a single-column primary key.

  • Awesome!

  • Therefore, you can think of a primary key as the field or group of fields that identifies

  • the content of a table in a unique way.

  • For this reason, the primary keys are also called the unique identifiers of a table.

  • Another crucial feature of primary keys is they cannot contain null values.

  • This means, in the example with a single-column primary key, there must always be a value

  • inserted in the rows under this column.

  • You cannot leave it blank.

  • Please, be careful and don’t forget this characteristic of the primary key!

  • Technically, to create a relational schema, we draw a table in the form of a rectangle,

  • and we place its name on top.

  • Then, we enlist the fields vertically (as opposed to their horizontal organization in

  • a spreadsheet).

  • The field that is the primary key of the table is usually quoted on top of the other fields.

  • What is more important - it is always underlined.

  • Keeping these simple rules in mind, when you spot this table, you will immediately know:

  • It is calledSales”; its primary key ispurchase number”; and there are three

  • other fields – “date of purchase”, “customer ID”, “item code”.

  • This image corresponds to tabular data in the following form

  • All right?

  • Good.

  • One last remark about primary keys.

  • Please, remember that not all tables you work with will have a primary key, although almost

  • all tables in any database will have a single-column or a multi-column primary key.

  • To conclude, the term for representing a table with the help of such a rectangle and the

  • names of the fields is called a relational schema.

  • When you combine the relational schemas of the tables we have in a database, this gives

  • us a database schema.

  • Informally, it is ok to call the representation of the entire database in plural - relational

  • schemas.

  • All right!

  • Excellent!

  • TheCustomer IDcolumn does not give away much information about the purchaser,

  • does it?

  • We will store her first and last names, email address, and number of complaints submitted

  • in another table, calledCustomers”.

  • Here is a table containing all these fields, as well as a “Customer IDcolumn, which

  • is unique; there is no need to keep the information about the same customer more than once, right?

  • Therefore, in this table, theCustomer IDfield is the primary key.

  • The other columns represent the rest of the information about a customerfirst name,

  • last name, email address, and number of complaints.

  • Interesting!

  • We already have two tables – “SalesandCustomers”!

  • We know they are related through theCustomer IDfield, and as you can see, a “Customer

  • IDcolumn exists in theSalestable.

  • The coincidence of the two column names is chosen purely for convenience.

  • The same relationship could be based on columns with different names – “Column IDand

  • justID”, for example.

  • What is important is that the information in theSalestable can be obtained through

  • theCustomer IDcolumn in theCustomerstable.

  • Great!

  • Let’s stick to using the same name for both columns.

  • All right.

  • Here is the big trick.

  • Customer IDis a primary key for theCustomerstable and a foreign key in

  • theSalestable.

  • In the relational schemas form of representation, relations between tables are expressed in

  • the following waythe column name that designates the logical match is a foreign

  • key in one table, and it is connected with a corresponding column from another table.

  • Often, the relationship goes from a foreign key to a primary key, but in more advanced

  • circumstances, this will not be the case.

  • To catch the relations on which a database is built, always look for the foreign keys,

  • as they show us where the relations are.

  • How is this shown on the canvas?

  • First, you should indicate thatCustomer IDis a foreign key in theSales

  • table with F.K. in brackets next to the column name.

  • Second, you must draw an arrow that begins from the same level as the field name of the

  • foreign key and ends pointing at the column name from the table it is related to.

  • Don’t worry; the arrow won’t be straightwhat matters are its start and its end.

  • Observe the following example from ourSalesdatabase to avoid confusion between primary

  • and foreign keys.

  • In theCustomerstable, “Customer IDcontains different values; that is,

  • different numbers are stored on every row, and there are no missing values.

  • This complies with the requirement for primary keys to contain unique values only.

  • In theSalestable, though, we see under theCustomer IDcolumn we can have repeating

  • and missing values.

  • This is ok.

  • For instance, it is normal to expect customer number 2 to have executed three purchases

  • in our shop, correct?

  • Also, it is possible to have not registered who executed purchase number 9.

  • The values under this field in theSalestable will be mentioned in theCustomers

  • table, as well.

  • If we have registered information for four customers with IDs 1,2,3, and 4, we cannot

  • have a value of 5, 10, or 100 here, right?

  • This also means, if you try to insert in theSalestable a record with a “Customer

  • IDthat differs from the ones we have already seen in theCustomerstable, MySQL will

  • raise an error and the record won’t be stored.

  • Fantastic!

  • By the same token, we can create theItemstable.

  • The fieldItem codeappears in this table as a primary key and inSales

  • as a foreign key.

  • Analogically to the previous example with theCustomer IDcolumn from theCustomers

  • table, “Item Codecontains a letter and a number forming a unique combination in the

  • Itemstable, while inSales”, the values could repeat, as is the case with item

  • B2, for instance.

  • Therefore, we can use “F.K.” to indicate anItem Codeis a foreign key in the

  • Salestable and by drawing an arrow starting from theItem Codefield in

  • theSalestable pointing at the column with the same name in theItemstable.

  • Great!

  • Remember that primary keys were called unique identifiers of a table?

  • Well, foreign keys can be called identifiers, too, but they identify the relationships between

  • tables, not the tables themselves.

  • Amazing!

  • Ok, great.

  • Now we have a good idea of what the database will look like.

  • However, there is one missing bitwe have to create a connection between theSales”,

  • Customers”, andItemstable and theCompaniestable you created earlier.

  • And why is theCompaniestable so important to us?

  • Well, I hope you can already see thatCompany IDis a primary key in it.

  • Spotting this must be a piece of cake for you already.

  • On the other hand, you can rarely have two or more companies with the same name, but

  • it’s not impossible, especially if they are registered in a different state or if

  • they operate in different industries.

  • How about theHeadquarters Phone Numberfield?

  • Can you have two US numbers that are completely identical?

  • No!

  • When you dial a phone number, there is just one phone that will ring somewhere in the

  • US, right?

  • Therefore, we can conclude that the values in this field have to be unique.

  • But wait!

  • This also means thatHeadquarters Phone Numberis a primary key!

  • Would that be possible?

  • We just said that there is one and only one primary key in a table, and here theCompany

  • IDcolumn is a primary key!

  • Don’t worry, there is a name for this type of field, and it is unique keys.

  • They are used whenever you would like to specify that you don’t want to see duplicate data

  • in a given field, exactly as in our example with the headquarters phone number.

  • There is one substantial difference between unique keys and primary keys, which is that

  • they can contain null values.

  • Meaning, if we don’t know the headquarters phone number of Company D, we can still have

  • this record and our table will be functioning as part of the relational database.

  • However, it won’t give us information about the phone number of Company D.

  • If, instead, we remove number two as company ID and leave the nameCompany B” and

  • the relevant phone number, SQL will display an error message becauseCompany ID

  • is a primary key and we are obliged to provide data in every row of this column.

  • In addition, a table can have 0, 1, 2 or more unique keys.

  • This means that depending on the requirements of the database, a table may or may not contain

  • one or more unique key columns.

  • This is different with respect to the characteristics of the primary key constraint: remember, you

  • can have only one primary key per table.

  • Furthermore, what is similar between the two types of constraints is that both can be applied

  • to multiple columns, not just to a single column.

  • In other words, the unique key constraint can be defined by a single field from the

  • table it is applied to, or it could comprise a few of the columns in that table.

  • Both scenarios are possible.

  • Wonderful!

  • This completes the creation of the entireSalesdatabase scheme!

  • All right!

  • Excellent!

  • This is a lecture in which we will try to illustrate that relationships between tables

  • can be categorized.

  • We will not explore all types of relationships in detail given that this topic is rather

  • theoretical and time consuming in its entirety.

  • Instead, we will study the main types of relationships you will likely need in your workplace.

  • Relationships tell you how much of the data from a foreign key field can be seen in the

  • primary key column of the table the data is related to and vice versa.

  • What do we mean by that?

  • As we already discussed, theCustomer IDcolumn is a primary key of theCustomers

  • table.

  • This means it contains only unique values – 1, 2, 3, and 4.

  • The same information about customer ID can be found in theSalestable as a foreign

  • key, but you will likely have a lot more than 4 rows there.

  • Hence, the values from 1 to 4 can be repeated many times because the same customer could

  • execute more than one purchase.

  • This is an example of a ‘one-to-manytype of relationship: one value from the customer

  • ID column under theCustomerstable can be found many times in the customer ID

  • column in theSalestable.

  • In the relational schema, this is shown by assigning the correct symbols at the end of

  • the arrow.

  • You should always read the symbols according to the direction of the relationship you are

  • exploring.

  • For instance, think of it this way: a single customer could have made one purchase, but

  • she could have made more than one purchase too!

  • Therefore, the second symbol, which is next to the rectangle, shows the minimum number

  • of instances of theCustomersentity that can be associated with theSales

  • entity.

  • When this symbol is a tiny line, it meansone”.

  • The symbol located next to the rectangle indicates the maximum number of instances that can be

  • associated with theSalesentity.

  • The angle-like symbol stands formany”.

  • Let’s check the relationship in the opposite directionfor a single purchase registered

  • in theSalestable a single customer can be indicated as the buyer.

  • So, the minimum number of customers you will find in theCustomerstable for whom

  • we have such information as name, email, and number of complaints, and that corresponds

  • to a single purchase in theSalestable, is one.

  • At the same time, we just said that for a given purchase we cannot have more than one

  • buyer, meaning that the maximum number of instances fromSalesassociated with

  • Customersis also one.

  • Therefore, for every purchase, we can obtain the name, email, and number of complaints

  • data for one customer, and we represent this logic by drawing а line for the minimum,

  • and а line for the maximum.

  • The two symbols in closer proximity to the rectangles form the relationship between the

  • Customersand theSalestables, according to the specified direction.

  • In our case, it is correct to say that theCustomerstoSalesrelationship

  • is one-to-many, whileSalestoCustomersis many-to-one.

  • Great!

  • The symbols showing us relationship limitations are called cardinality constraints.

  • There are other symbols that can be used too – M or N for infinite associations, or a

  • circle for optional instances, which would have been the case if it wasn’t necessary

  • for a registered person to have purchased an item.

  • There are some other types of relationships as wellone-to-one, many-to-many, and

  • others.

  • This is information that we share with you for general knowledge, as this is a specialized

  • topic which is of interest mainly to advanced users.

  • To sum it up, relational (or database) schemas do not just represent the concept database

  • administrators must implement.

  • They depict how a database is organized.

  • They can be thought of as blueprints, or a plan for a database, because they are usually

  • prepared at the stage of database design.

  • Drawing a relational schema isn’t an easy job, but relational schemas will help you

  • immensely while writing your queries – a neat and complete visualization of the structure

  • of the entire database will always be useful when retrieving information.

What

字幕與單字

單字即點即查 點擊單字可以查詢單字解釋

B1 中級

MySQL在25分鐘內的主要和外鍵|數據庫術語初學者 (MySQL in 25 Minutes | Primary and Foreign Keys | Database Terminology for Beginners)

  • 10 0
    林宜悉 發佈於 2021 年 01 月 14 日
影片單字