  • Hello and welcome to Using MySQL

  • to Build Big Data Applications.

  • This is going to be a tutorial about,

  • obviously, using MySQL

  • to build Big Data applications, but

  • when I say Big Data,

  • there could be two things. It could be...

  • Sorry, there could be two problems that you are addressing. Either it's a problem of scaling,

  • as in, my system already has a lot of data and

  • I would like to be able to

  • make the existing features more performant or

  • be able to handle more volume,

  • and the other problem is reporting,

  • in the sense that you already have Big Data

  • and you are asked to make use of that in some way,

  • to either give more insight

  • to the business users in your organization or give aggregated reports

  • to your customers about how they are performing.

  • And I'm going to focus today on this side (reporting)

  • of the Big Data problem.

  • So what is the problem with Big Data?

  • Basically, it's as if you have a very large table

  • with millions or billions of rows,

  • and in order to do the reporting that you need to do,

  • you need to gather all this information from this table and process it in some way.

  • However, what does that mean in terms of the underlying physics of it?

  • You have a hard disk

  • (let's pretend that's a hard disk),

  • and in order to get certain rows from the table on the hard disk,

  • you have to go over many different places on the hard disk.

  • So, if it is a large amount of data, that would obviously be more time-consuming.

  • If the data is fragmented across different places on the hard disk, that would mean you have to spin more.

  • And once you have that,

  • you need to get that data into the CPU (roughly)

  • to aggregate that data,

  • to process it, to manipulate it into whatever form you need it to be in,

  • and then you produce a report,

  • which you later provide to your users,

  • and they are happy about it.

  • (I'm not sure if you can see that.)

  • So, this problem has actually been going on for a very long time:

  • how are we able to, with existing hardware technologies,

  • get more data faster to be able to process it and turn it into a report?

  • Many years ago, a person called Ralph Kimball,

  • who is the main or one of the two main contributors to data warehousing

  • (I wouldn't say movement, but technology),

  • came up with the idea in 1995 or 1996,

  • where he said, basically, no matter what the technology is,

  • we'll always have to go through a large number of rows,

  • so how can we design our database

  • in a way that we are able to produce reports without

  • very resource-intensive operations?

  • His solution to this problem was basically to create something called a summary table,

  • and a summary table is an aggregated

  • version of this table,

  • obviously smaller and with fewer rows.

  • The data has already been

  • taken from here (the large table) and

  • summarized here (the small table). So when you access this summary table,

  • it's obviously much easier to get the rows and much easier to give back the results.

  • So let me give some examples of

  • what that would look like.

  • So let's say you have

  • a table,

  • and it has orders,

  • like a basic e-commerce site,

  • and you have

  • usually a hundred thousand rows

  • per day.

  • So it's

  • not really an issue for any relational database.

  • You store those rows

  • in the database. That's fine.

  • But over a period of time, let's say a year,

  • you have quite a large number of rows.

  • So you start to have 36.5 million rows,

  • and that could get cumbersome,

  • and in some cases

  • it could be much more than 100,000 rows, but let's stick to this example.

  • So you want to

  • create a report from the orders table.

  • The business users in your organization want to know how certain products are doing across particular dates.

  • What you could do is create a summary table

  • like this,

  • and

  • for the sake of clarity, I'll write a SELECT statement here that will explain the contents of the summary table. So you have

  • SELECT.

  • So let's say we need date, because that was what was requested,

  • and product_id,

  • and we want to get the aggregated details of revenue,

  • and then we GROUP BY

  • basically the two keys (columns),

  • date and product_id.

  • This is now the new summary table, and we can call it

  • product revenue summary.

  • And let's say we have a hundred products, so this will have

  • a hundred rows a day.
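The SELECT described above can be sketched as runnable code. This is a minimal illustration, not the speaker's own script: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for MySQL, and the column types, sample rows and the table name `product_revenue_summary` (the spoken "product revenue summary" with underscores) are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        date       TEXT NOT NULL,   -- ISO date, e.g. '2021-01-14'
        product_id TEXT NOT NULL,
        revenue    REAL NOT NULL
    );
    INSERT INTO orders (date, product_id, revenue) VALUES
        ('2021-01-14', '13A', 10.0),
        ('2021-01-14', '13A', 15.0),
        ('2021-01-14', '27B', 7.5),
        ('2021-01-15', '13A', 20.0);

    -- One row per (date, product_id), however many orders there were.
    CREATE TABLE product_revenue_summary AS
    SELECT date, product_id, SUM(revenue) AS revenue
    FROM orders
    GROUP BY date, product_id;
""")

summary_rows = conn.execute(
    "SELECT date, product_id, revenue FROM product_revenue_summary "
    "ORDER BY date, product_id"
).fetchall()
print(summary_rows)

# Yearly row counts using the figures from the talk.
orders_per_year = 100_000 * 365   # rows in orders
summary_per_year = 100 * 365      # at most this many rows in the summary
print(orders_per_year, summary_per_year, orders_per_year // summary_per_year)
```

With the talk's numbers, the summary holds at most 36,500 rows a year against 36.5 million in the orders table, a thousand times fewer rows to scan for the same per-product, per-day answer.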

  • So obviously,

  • after generating this table,

  • you could provide this table to your business users and say,

  • "Do whatever you need. Find out whatever information you want to gather."

  • So let's say, for example,

  • someone were to query for product 13A

  • and how it performed on weekends.

  • So perhaps you would have a table

  • of weekends or dates,

  • get only the weekends, and perhaps INNER JOIN it with that summary table,

  • and you'll get the answer very quickly,

  • and your users will be happy because of it.
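A sketch of that weekend query, under assumptions: the dates table is called `calendar` here and carries an `is_weekend` flag, neither of which is named in the talk, and SQLite again stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_revenue_summary (
        date TEXT NOT NULL, product_id TEXT NOT NULL, revenue REAL NOT NULL,
        PRIMARY KEY (date, product_id)
    );
    -- Hypothetical dates table: one row per calendar day.
    CREATE TABLE calendar (date TEXT PRIMARY KEY, is_weekend INTEGER NOT NULL);

    INSERT INTO product_revenue_summary VALUES
        ('2021-01-15', '13A', 20.0),   -- Friday
        ('2021-01-16', '13A', 30.0),   -- Saturday
        ('2021-01-17', '13A', 12.5),   -- Sunday
        ('2021-01-16', '27B', 99.0);   -- Saturday, different product
    INSERT INTO calendar VALUES
        ('2021-01-15', 0), ('2021-01-16', 1), ('2021-01-17', 1);
""")

# Only the small summary table is touched; the orders table is not read.
weekend_revenue = conn.execute("""
    SELECT s.date, s.revenue
    FROM product_revenue_summary s
    INNER JOIN calendar c ON c.date = s.date
    WHERE s.product_id = ? AND c.is_weekend = 1
    ORDER BY s.date
""", ("13A",)).fetchall()
print(weekend_revenue)
```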

  • A different example,

  • or a different summary table, could be for people who are interested to know how the

  • product is selling across a particular geography,

  • and in this case, let's say city.

  • So,

  • what we would need to do for that, since city_id isn't recorded in the orders table:

  • we would need to enrich

  • the table a little bit,

  • and the way we do that is

  • we INNER JOIN it with the addresses table,

  • and

  • what we would do is, basically... I'll just write it here.

  • You would do SELECT,

  • let's do

  • o for orders, o.date,

  • and

  • city,

  • and

  • SUM(o.revenue) FROM orders o

  • INNER JOIN addresses a

  • ON,

  • (actually) USING

  • address_id,

  • GROUP BY date and city,

  • and we will fill up a new summary table

  • called

  • city revenue summary.
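Written out in full, the statement dictated above might look like the following. It is a sketch: the `addresses` columns, the sample data and the underscore table name `city_revenue_summary` are assumed, and SQLite is used in place of MySQL so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE addresses (
        address_id INTEGER PRIMARY KEY,
        city_id    INTEGER NOT NULL
    );
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        date       TEXT NOT NULL,
        address_id INTEGER NOT NULL REFERENCES addresses(address_id),
        revenue    REAL NOT NULL
    );
    INSERT INTO addresses VALUES (1, 100), (2, 100), (3, 200);
    INSERT INTO orders (date, address_id, revenue) VALUES
        ('2021-01-14', 1, 10.0),
        ('2021-01-14', 2, 5.0),    -- different address, same city
        ('2021-01-14', 3, 8.0),
        ('2021-01-15', 3, 4.0);

    -- The join is paid once, when the summary is built.
    CREATE TABLE city_revenue_summary AS
    SELECT o.date AS date, a.city_id AS city_id, SUM(o.revenue) AS revenue
    FROM orders o
    INNER JOIN addresses a USING (address_id)
    GROUP BY o.date, a.city_id;
""")

city_rows = conn.execute(
    "SELECT date, city_id, revenue FROM city_revenue_summary "
    "ORDER BY date, city_id"
).fetchall()
print(city_rows)
```

Reports that read `city_revenue_summary` no longer need the `addresses` table at all, which is the saved INNER JOIN the talk refers to.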

  • So here we have

  • two summary tables,

  • two different ways of slicing the data. Now,

  • you aren't exactly limited in the number of summary tables you can have.

  • Obviously, they take a certain amount of space, and

  • they also take some effort to create, but we'll get into that soon.

  • What you could have done here, for example, is add city to product,

  • so you have product...

  • you have here date, product_id and city_id.

  • That makes it a larger summary table, but you can get the data in two different ways.

  • Or perhaps you can then have a

  • more extensive summary table with a higher level of granularity.

  • You could search by product and city and date;

  • that could be a user requirement. It depends

  • if you're interested in getting to the data in one way.

  • If you are only interested in slicing the data in this way, or slicing the data in this way (second summary table),

  • currently you have two summary tables,

  • and this particular summary table has saved you an INNER JOIN.

  • That could be quite valuable in terms of performance, saving you an INNER JOIN.
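To illustrate the trade-off just described, here is a sketch of the single, more granular summary table keyed by date, product_id and city_id, queried in both ways. The table name `product_city_revenue_summary` and the sample rows are invented for the example, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_city_revenue_summary (
        date       TEXT NOT NULL,
        product_id TEXT NOT NULL,
        city_id    INTEGER NOT NULL,
        revenue    REAL NOT NULL,
        PRIMARY KEY (date, product_id, city_id)
    );
    INSERT INTO product_city_revenue_summary VALUES
        ('2021-01-14', '13A', 100, 10.0),
        ('2021-01-14', '13A', 200, 6.0),
        ('2021-01-14', '27B', 100, 3.0);
""")

# Slice 1: revenue per product per day (city is rolled up).
by_product = conn.execute("""
    SELECT date, product_id, SUM(revenue)
    FROM product_city_revenue_summary
    GROUP BY date, product_id
    ORDER BY date, product_id
""").fetchall()

# Slice 2: revenue per city per day (product is rolled up).
by_city = conn.execute("""
    SELECT date, city_id, SUM(revenue)
    FROM product_city_revenue_summary
    GROUP BY date, city_id
    ORDER BY date, city_id
""").fetchall()

print(by_product)
print(by_city)
```

The combined table has more rows than either of the two narrower ones, so each report does a little more work at query time in exchange for maintaining one table instead of two.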

  • So, those are the two examples.

  • I'd just like to quickly

  • give another example

  • of what happens nowadays in some other companies.

  • Some social networks

  • already kind of use the idea of summary tables

  • in their systems.

  • Let's say

  • they have lots of servers,

  • spread geographically: this is Europe,

  • this is North America, this is South America,

  • and this is Asia,

  • and

  • in order for them to get the reports that they are interested in,

  • what they would do is get data

  • from all the servers

  • into, let's say, a MapReduce system,

  • in this case, let's say, Hadoop, for example.

  • And remember,

  • once it arrives here, we don't need the exact data from them. We need the aggregated data to go

  • to another database or another summary table,

  • and once the data from here is aggregated,

  • it goes into a reporting database.

  • Depending on their needs, this can be a MySQL database;

  • if their needs are greater, it could be

  • any number of commercial or open-source solutions which can handle

  • larger amounts of data.
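The flow described above (raw data from regional servers, a map/reduce aggregation, then aggregated rows for a reporting database) can be sketched in plain Python. This is only an illustration of the idea: it does not use Hadoop or any of its APIs, and the event layout and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw events as they might sit on the regional servers:
# (date, product_id, revenue)
regional_events = {
    "Europe":        [("2021-01-14", "13A", 10.0), ("2021-01-14", "13A", 5.0)],
    "North America": [("2021-01-14", "13A", 7.0), ("2021-01-14", "27B", 2.0)],
    "South America": [],
    "Asia":          [("2021-01-15", "13A", 1.0)],
}


def map_phase(events):
    """Emit ((date, product_id), revenue) pairs for one server's events."""
    for date, product_id, revenue in events:
        yield (date, product_id), revenue


def reduce_phase(pairs):
    """Sum the values for each key across all servers."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


all_pairs = (
    pair
    for events in regional_events.values()
    for pair in map_phase(events)
)

# Only these aggregated rows would be loaded into the reporting database.
reporting_rows = reduce_phase(all_pairs)
print(reporting_rows)
```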

  • But the theory is very similar to

  • the example of summary tables: you get

  • data from some place,

  • you summarize it, you aggregate it, you put it in the reporting database, and your users

  • access that database

  • and query it however they see fit. [unintelligible] You can also create

  • reports on your own,

  • but that is a static report:

  • you can't

  • change it. Of course you can change it, but the report is as it is,

  • whereas here,

  • if they want to change

  • the query to get

  • different information, say once they

  • discover there's something wrong [unintelligible] and want to slice

  • more information according to what they found, they can

  • query it themselves, whereas there it's static.

  • So,

  • as well,

  • what you're adding here is basically an ETL

  • system, a subsystem

  • or something,

  • and you need to

  • basically

  • create it.

  • It needs to

  • make sure that

  • data is constantly

  • arriving in the summary tables,

  • and you need to monitor that everything goes accordingly as well.

  • So,

  • there are two ways

  • that you can refresh

  • the summary tables:

  • one is real time

  • and one is batch.

  • Batch means either an hourly job,

  • running once an hour,

  • or a job at night time,

  • running, for example, when no one is using the server.

  • [unintelligible]

  • Let's say, for example,

  • that it runs once an hour, so you're getting all the orders

  • that happened

  • over the last hour, aggregating them

  • and putting them into the summary tables.

  • At night time,

  • it's the same principle, but

  • for a whole day's worth of orders,

  • so it's becoming a

  • larger amount of data.

  • In these cases it's important to note that

  • the summary tables are only refreshed

  • once an hour or once a day, so

  • if it's a business decision that that's okay,

  • then you can go ahead and do that.
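An hourly batch refresh along these lines could be sketched as follows. Several things here are assumptions for the sake of a runnable example: the `created_at` timestamp column, the `refresh_window` helper, and the use of SQLite's `ON CONFLICT ... DO UPDATE` in place of MySQL's `INSERT ... SELECT ... ON DUPLICATE KEY UPDATE`. In practice a scheduler such as cron would call the function once per window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        created_at TEXT NOT NULL,      -- 'YYYY-MM-DD HH:MM:SS'
        product_id TEXT NOT NULL,
        revenue    REAL NOT NULL
    );
    CREATE TABLE product_revenue_summary (
        date TEXT NOT NULL, product_id TEXT NOT NULL, revenue REAL NOT NULL,
        PRIMARY KEY (date, product_id)
    );
    INSERT INTO orders (created_at, product_id, revenue) VALUES
        ('2021-01-14 09:10:00', '13A', 10.0),
        ('2021-01-14 09:50:00', '13A', 5.0),
        ('2021-01-14 10:05:00', '13A', 2.0);
""")


def refresh_window(conn, window_start, window_end):
    """Aggregate orders in [window_start, window_end) into the daily summary."""
    with conn:  # commits on success, rolls back if the statement fails
        conn.execute("""
            INSERT INTO product_revenue_summary (date, product_id, revenue)
            SELECT date(created_at), product_id, SUM(revenue)
            FROM orders
            WHERE created_at >= ? AND created_at < ?
            GROUP BY date(created_at), product_id
            ON CONFLICT (date, product_id) DO UPDATE
            SET revenue = product_revenue_summary.revenue + excluded.revenue
        """, (window_start, window_end))


# Two consecutive hourly runs.
refresh_window(conn, "2021-01-14 09:00:00", "2021-01-14 10:00:00")
refresh_window(conn, "2021-01-14 10:00:00", "2021-01-14 11:00:00")

hourly_result = conn.execute(
    "SELECT date, product_id, revenue FROM product_revenue_summary"
).fetchall()
print(hourly_result)
```

Because each run adds to the existing daily total, a window must be processed exactly once; running the same hour twice would double-count it, which is one reason such a job needs monitoring.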

  • [unintelligible]

  • With this kind of system

  • you need to set up monitoring,

  • because you can't just take for granted that you can

  • just create something like the SELECT statement that I created,

  • put it in the crontab and trust that nothing goes wrong. You have to make

  • sure that

  • everything ran,

  • that there are no warning messages and no errors, and that it did everything

  • as it should.

  • Regarding MySQL,

  • to set something like this up,

  • you can basically use the SELECT statement that I wrote,

  • with the SELECT and a

  • WHERE clause depending on your

  • interval,

  • and an INSERT INTO.

  • [unintelligible] For example, you have

  • one server for reporting, a slave,

  • and you have the data coming through replication [unintelligible].

  • So you can set up

  • master-slave replication.

  • For example,

  • this is your master

  • and here is your

  • slave,

  • the reporting slave,

  • [unintelligible]

  • and here, for example, you have the two summary tables,

  • and you would use the INSERT statement here,

  • an INSERT INTO ... SELECT:

  • take it from here,

  • aggregate it

  • [unintelligible]

  • and

  • insert it into

  • the table.

  • The other way of doing this is

  • perhaps

  • slightly more confusing, but sometimes it's a requirement of

  • some databases:

  • you do a SELECT

  • ... INTO OUTFILE

  • to some kind of file,

  • and then [unintelligible]

  • the LOAD DATA

  • INFILE

  • command in MySQL.

  • [unintelligible]

  • Please do whatever is more convenient

  • for you.

  • I would

  • advocate,

  • here for example,

  • where there is a GROUP BY,

  • also adding an ORDER BY NULL,

  • because MySQL by default

  • would have the GROUP BY

  • also

  • implicitly ORDER BY the columns that you chose, and that means you have to do an

  • additional filesort.

  • If you don't need that and don't require it

  • [unintelligible],

  • you can cancel it with an ORDER BY NULL.

  • You can also

  • have unique keys on these tables,

  • and you can

  • use REPLACE INTO [unintelligible].

  • [unintelligible]

  • If

  • it's only new data arriving, that's fine, but if, for example,

  • there's a

  • chance of older data being updated, then you have to

  • perhaps include more than one hour's worth of data in your interval. You have

  • to maybe

  • use, as I said, REPLACE INTO

  • for data that may be updated. You either have to recalculate the totals of specific days

  • or you can just

  • say, in bulk:

  • for the last three, six, seven days,

  • if any data was updated in that period of time, please update it.
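One way to sketch the "recalculate the last few days" approach is to rebuild the summary rows for a recent date range with `REPLACE INTO ... SELECT`, relying on a unique key on the summary table. The helper name `rebuild_since`, the sample data and the use of SQLite instead of MySQL are assumptions of this example (both databases accept `REPLACE INTO`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        date       TEXT NOT NULL,
        product_id TEXT NOT NULL,
        revenue    REAL NOT NULL
    );
    CREATE TABLE product_revenue_summary (
        date TEXT NOT NULL, product_id TEXT NOT NULL, revenue REAL NOT NULL,
        PRIMARY KEY (date, product_id)   -- the unique key REPLACE relies on
    );
    INSERT INTO orders VALUES
        (1, '2021-01-08', '13A', 10.0),
        (2, '2021-01-13', '13A', 20.0);
    INSERT INTO product_revenue_summary VALUES
        ('2021-01-08', '13A', 10.0),
        ('2021-01-13', '13A', 20.0);

    -- An older order is corrected after the summary was built.
    UPDATE orders SET revenue = 25.0 WHERE order_id = 2;
""")


def rebuild_since(conn, first_date):
    """Recompute every summary row from first_date onward, overwriting old rows."""
    with conn:
        conn.execute("""
            REPLACE INTO product_revenue_summary (date, product_id, revenue)
            SELECT date, product_id, SUM(revenue)
            FROM orders
            WHERE date >= ?
            GROUP BY date, product_id
        """, (first_date,))


# Rebuild roughly the last few days rather than only the last hour.
rebuild_since(conn, "2021-01-10")

rebuilt = conn.execute(
    "SELECT date, product_id, revenue FROM product_revenue_summary ORDER BY date"
).fetchall()
print(rebuilt)
```

Unlike an additive refresh, this rebuild is safe to run repeatedly over the same range. It does not remove a summary row whose orders were all deleted; that case would need a separate DELETE for the range before the rebuild.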

  • [unintelligible]

  • That's something to look out for.

  • What you can do as well is use

  • Hadoop. [unintelligible]

  • [unintelligible] quite good recently,

  • and

  • this is still the same batch approach,

  • still batch.

  • I would say, though, that relational databases, regarding

  • the GROUP BYs,

  • the aggregation, are still very, very strong.

  • [unintelligible]

  • It does not parallelize very well.

  • If it takes a very long time, you may want to [unintelligible] in some way.

  • The difference here is that instead of one server it can be five,

  • six, seven, eight servers

  • [unintelligible] the data, and then you would get it

  • into a MySQL

  • database

  • [unintelligible]

  • with the

  • specific reporting data.

  • Regarding real time,

  • this basically means triggers.

  • [unintelligible]

  • So, triggers: once you update the regular database,

  • when you insert into the regular database,

  • it updates the summary tables as well,

  • in the same

  • instance.

  • So

  • there is a bit of an overhead.

  • If you have a high load on the database

  • in general

  • and you can't take this additional overhead,

  • you may want to consider batch. [unintelligible]

  • [unintelligible] the

  • requirement has to be real time, and I think for social networks this

  • is going to be a very likely requirement.

  • So,

  • basically, when you have an INSERT statement here,

  • if it's a SUM, for example,

  • if you have some aggregation, then you would,

  • for this particular line,

  • add [unintelligible]

  • this one row

  • into this table, and that is

  • it.

  • If it's been updated,

  • change it; if it's deleted, remove it.
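The real-time approach with triggers can be sketched for a SUM aggregate as below. This uses SQLite trigger syntax so that the example runs on its own; MySQL triggers express the same idea with different syntax (for instance an explicit `FOR EACH ROW`). The trigger names are hypothetical, and the update trigger assumes only `revenue` changes, not the date or product of an order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        date       TEXT NOT NULL,
        product_id TEXT NOT NULL,
        revenue    REAL NOT NULL
    );
    CREATE TABLE product_revenue_summary (
        date TEXT NOT NULL, product_id TEXT NOT NULL, revenue REAL NOT NULL,
        PRIMARY KEY (date, product_id)
    );

    -- Insert: make sure the summary row exists, then add the new amount.
    CREATE TRIGGER orders_after_insert AFTER INSERT ON orders
    BEGIN
        INSERT OR IGNORE INTO product_revenue_summary
            VALUES (NEW.date, NEW.product_id, 0);
        UPDATE product_revenue_summary
            SET revenue = revenue + NEW.revenue
            WHERE date = NEW.date AND product_id = NEW.product_id;
    END;

    -- Update: replace the old amount with the new one.
    CREATE TRIGGER orders_after_update AFTER UPDATE OF revenue ON orders
    BEGIN
        UPDATE product_revenue_summary
            SET revenue = revenue - OLD.revenue + NEW.revenue
            WHERE date = NEW.date AND product_id = NEW.product_id;
    END;

    -- Delete: subtract the removed amount.
    CREATE TRIGGER orders_after_delete AFTER DELETE ON orders
    BEGIN
        UPDATE product_revenue_summary
            SET revenue = revenue - OLD.revenue
            WHERE date = OLD.date AND product_id = OLD.product_id;
    END;
""")

conn.execute("INSERT INTO orders VALUES (1, '2021-01-14', '13A', 10.0)")
conn.execute("INSERT INTO orders VALUES (2, '2021-01-14', '13A', 5.0)")
conn.execute("UPDATE orders SET revenue = 8.0 WHERE order_id = 2")
conn.execute("DELETE FROM orders WHERE order_id = 1")

trigger_result = conn.execute(
    "SELECT date, product_id, revenue FROM product_revenue_summary"
).fetchall()
print(trigger_result)
```

SUM works this way because it can be adjusted incrementally from the old and new values alone. Aggregates such as MIN and MAX cannot always be: deleting the current maximum means rescanning the remaining rows to find the next one.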

  • There are also a lot of other functions, average, min,

  • max, that

  • can be more complicated.

  • There is an article

  • from another website

  • that speaks about that,

  • how to write

  • [unintelligible]

  • It's not an issue to add these triggers [unintelligible],

  • and it's a bit more convenient,

  • because you don't need to monitor it so much.

  • [unintelligible] If you insert

  • and something goes wrong [unintelligible], you fix

  • it and you're done.

  • If you're using a trigger to log something and then [unintelligible]

  • asynchronously

  • [unintelligible],

  • then you may need to use monitoring.

  • But I'm not

  • trying to confuse you.

  • So we've basically covered triggers,

  • and you can find out how to set that up.

  • [unintelligible]

  • So,

  • just to summarize:

  • even as technology progresses,

  • if you want to speed up

  • your reports,

  • it is a good idea to have summary tables,

  • and you can use them for your reports.

  • It's very common

  • in [unintelligible] nowadays,

  • and

  • I'm sure it can help you.

  • It does take a bit of additional design, and I haven't spoken about it much, apart from a

  • bit of an example where

  • we removed an INNER JOIN [unintelligible].

  • It's something you have to look at more [unintelligible], and I haven't given

  • too many code examples.

  • This is really an introduction to understand what summary

  • tables are.

  • Thank you for watching [unintelligible].

  • If you want to contact me,

  • [unintelligible]

  • website slash blog,

  • where you can find more information [unintelligible].

  • Thank you very much.

Using MySQL to Build Big Data Applications

Posted by Chris Lyu on January 14, 2021