SQL SELECT教程 - Part 2 |¦| SQL教程 |¦| SQL for Beginners。 (SQL SELECT Tutorial - Part 2 |¦| SQL Tutorial |¦| SQL for Beginners)

字幕列表影片播放

In skilled hands, the SELECT statement seems like magic.
With just the right query, you can reach into a table... hunt around... and voila!
Find the exact data you desire.
This is the power of SQL.
So hang on to your hat and / or glasses, because we are going to take a wild ride
with SELECT queries...
A “SELECT” query is used to retrieve data from one or more tables.
The first part of a SELECT query is to list the columns you want data for.
If you do not want to specify the columns, you can use an asterisk to retrieve all columns.
Next, write a FROM clause to list the tables you want data from.
You can provide a list of restrictions on the data to retrieve with a WHERE clause.
Here, you specify the conditions that must be met by the data in order to be included
in the results.
If you have more than one condition, you usually separate them with “AND” ... or “OR”.
And finally, you can SORT the data returned by your query.
For example, to order the data by the values in a specific column, write “ORDER BY”
followed by the column name.
By default, it will sort in ascending order.
But if you would prefer to be explicit, you can enter “ASC” to sort in ascending order…
… or “DESC” to sort in descending order.
These are the primary clauses in a “SELECT” query: SELECT, FROM, WHERE, and ORDER BY.
The table we will use today is the “earthquake” table.
The SQL database I will be using is Postgres with the pgAdmin GUI.
You can download the data used in this video from Github.
If you select all the data from the earthquake table, you can see the columns available.
There are 10 columns describing each earthquake.
Note that the first column is NOT data from the table.
Much like a spreadsheet, this column numbers the rows.
Now, what we just did was a bit risky.
What if this table had several million rows?
Our SELECT query asked for all data so we could look at the columns.
But when you think about it using your brain, we only needed to select 1 row to see the columns.
So let us start over and SELECT all columns from the earthquake data, but LIMIT our results
to 1 row with the LIMIT statement.
Much better.
Let us now use SELECT queries to explore the earthquake table.
To begin, how many rows are in this table?
This can be answered by selecting a count from the earthquake table.
This table contains information on 23,119 earthquakes.
COUNT is a function.
Some other common and useful SQL functions are MIN, MAX, AVERAGE, and SUM.
Let us use the MIN and MAX functions to find out the timespan covered by this table.
The “occured_on” column is a timestamp marking when the earthquake happened.
In a single query, we can select the minimum and maximum values in the “occurred_on” column.
Execute…
We see that the earthquake table includes quakes from 1969 through 2018.
A half century of rumbles and tumbles...
What about the magnitude?
Does the table include ALL earthquakes?
Again, we can use functions to answer this question.
Let us select the minimum and maximum magnitudes.
Execute…
The smallest earthquake in this table measured 5.5 on the Richter scale.
And sadly, the largest event was a terrifying 9.1 earthquake.
This provides us with a quick and accurate summary of this table.
The earthquake table contains 10 columns…
Information on 23,119 quakes of magnitude 5.5 or greater covering the 50 year period
from 1969 through 2018.
Next, let us investigate the “cause” column.
One might be tempted to select the “cause” data from the table.
Execute…
Houston, we have a problem.
As we scroll through, we see the query returned 23,119 results with a LOT of duplicates.
We can fix this with the SELECT DISTINCT statement.
The “DISTINCT” keyword makes sure we will not see duplicate rows.
Execute…
Well isn’t that a kick in the pants.
Nuclear tests can cause measurable earthquakes.
Let us count how many quakes were the result of each cause.
We first count the earthquakes that were caused by a natural earthquake.
We only want the count… and we want to restrict the count to those events where the cause
is “earthquake.”
Execute…
22,942…
Well, we can be thankful that most of the quakes were due to good old-fashioned tectonic
activity.
Next, count the number of earthquakes due to nuclear explosions.
Execute…
173…
Yowsers…
Lastly, how many events were caused by an “explosion.”
Execute…
Only 4 results.
Curious...
Since there are so few, let us look at the 4 rows.
After some searching, I learned that all four of these earthquakes were also the result
of a nuclear test.
Two by the Soviet Union, and two by France.
This example illustrates a common problem with databases.
The data may not be perfect.
These four rows should have a cause of “nuclear explosion” and not just “explosion.”
Let us now construct a query to find the details about the most recent earthquake caused by
a nuclear explosion.
We will select the place…
magnitude… and occurred_on columns.
The cause will be ‘nuclear explosion’.
To find the most recent event, order by the “occurred_on” column in descending order.
We are only interested in the most recent event, so limit our query to 1.
Execute…
And there it is…
Let us answer a different question using SQL.
What were the 10 largest earthquakes in this time period?
Again, we select the place, magnitude, and occurred_on columns.
FROM the earthquake table…
To find the largest quakes, ORDER the data by magnitude in descending order…
And LIMIT our query to 10 rows.
Execute…
This is a powerful and destructive list.
The earthquake off the coast of Indonesia was one of the most destructive earthquakes
in history causing the catastrophic boxing day tsunami.
The 9.1 quake off the coast of Japan also triggered a devastating tsunami.
And as typically happens with quakes of these magnitude, it spawned numerous aftershocks.
How can we count the number of aftershocks?
One solution is to search for all earthquakes that contain both “Honshu” and “Japan”
in the PLACE column... that occurred within a week of the March 11 quake.
To do this, select a count … from the earthquake table…
WHERE the place is LIKE ‘%Honshu%Japan%’
Here is a new operator.
In quotes, we specify a string pattern.
The % symbol matches zero or more characters.
So this pattern will match any place containing the location “Honshu” before the word
“Japan” Next, restrict the rows to earthquakes that
occurred one week from the primary earthquake.
To do this, we can use another operator called “BETWEEN”.
This will only select rows with dates between March 11 and March 18, 2011.
Run…
There were 166 earthquakes of magnitude 5.5 or greater within a week of the 9.1 quake...
Today we have learned many new ways to build SELECT queries.
These queries described real events impacting millions of people.
Today, scientists and engineers are taking this data and building early warning systems.
A few seconds warning before an earthquake can give people time to seek shelter.
And a few MINUTES warning of an incoming tsunami can give people time to retreat to safer ground.
SQL is a powerful tool for both exploring the past, and building a safer and brighter
tomorrow.