Placeholder Image

字幕列表 影片播放

  • >> Okay. Hi, guys. So this is Jay Pipes. He's a MySQL employee. He's in charge of North

  • American Community Relations. I met him this week at the MySQL User Conference which is

  • down in Sta. Clara. Now, the deal is that we're going to host this on Google video.

  • So if you have confidential questions, if you could leave them till the cameras are

  • turned off, that would be cool. With that, no more ado, Jay Pipes.

  • >> PIPES: All right. Nice to meet all of you. I'm going to be giving a talk that I gave

  • on, I think, Tuesday or Wednesday, on "Performance Tuning Best Practices for MySQL." Some of

  • the things are version specific but for the most part, it's pretty general. We're going

  • to be covering Profiling and Benchmarking Core Concepts, Sources of Problems that you

  • might encounter, Index Guidelines, Schema Guidelines, Coding Techniques, and lastly,

  • we'll go over some general things about tuning server parameters. Since the audience is,

  • I think, sort of mixed levels of experience, we'll just shout out questions or tell me

  • if you want some grand stuff or so. All right, Benchmarking Concepts. The general idea behind

  • benchmarking MySQL is to, obviously, have a track record. You know, when you change

  • things in your code, in your application, or the configuration files with MySQL, or

  • your schema design, or you're indexing, you want to make sure you have a track record

  • of what you're doing. Because a lot of times, if you--if you don't keep a track record and

  • don't do benchmarking, you won't really know whether a specific change that you made had

  • a detrimental impact or whether it actually increased performance. So, the baseline--and

  • you always want to have your baseline for your starting point. And always give yourself

  • a target. Don't just do it so it's a moving start--moving target, you always want to have

  • an end goal, you know. You either want to get, you know, 20% increase in performance.

  • So that's your goal. But just don't have it, you know, just kind of end with nothing because

  • then you'll just spend years and years and years trying to, you know, tune milliseconds

  • out of your--out of your application when there's really no benefit to it. So, change

  • only one thing at a time. A pretty basic concept but it really--it goes a long way in benchmarking.

  • If you change two things and you rerun a test, how do you know which one of those two things

  • actually affected the difference between the two tests? So, always change one thing at

  • a time and rerun the benchmarks. And record everything, configuration files, information

  • on the hardware, information on obviously the operating system, the version of MySQL,

  • and everything to do with the benchmarks. Keep it in a separate folder and keep everything

  • for historical record. A lot of times, you'll think that you won't need information when

  • you're running benchmarks. And then, six months later, you realize that, "Wow! It would be

  • really useful if I had--oh, I threw it away". So, always keep information from your benchmarks

  • so you have a historical record of what you did. And specifically with MySQL, you want

  • to disable the query cache whenever you're running benchmarks. The reason is it will

  • skew the results that you get from read-intensive queries and write-intensive queries as well.

  • So, disable the query cache by setting the query cache size equal to zero when you--when

  • you run benchmarks, so. ProfYes. Go ahead. >> [INDISTINCT] benchmarking for [INDISTINCT]

  • problem with the operating system, caches [INDISTINCT]?

  • >> PIPES: Right. Yes. The question was--what you were experiencing that the operating system

  • caching was skewing the results of the benchmark. Generally, the idea is either you can disable

  • caching if it's actually getting in the way or if you bump up the actual number of runs

  • in the benchmark tests you can minimize the skew effect. So that's usually what we recommend.

  • But, when you get into specific caches there's very specific things to do. So it kind of

  • depend on the exact scenario that you're doing, so. But, yes. Bumping up the number of runs

  • in the benchmarks generally tends to mitigate a lot of that, so. Does that answer your question?

  • Okay. Profiling Concepts, profiling is a little different than benchmarking. Generally, you're

  • profiling a production or running a test system that you're trying to figure out, you know,

  • diagnosing what's going on while it's running. So, instead of doing--you know finding the

  • stress or the load that the system can take, you're actually trying to pinpoint where the

  • bottlenecks are in the code while it's running. With MySQL, the best tool that you have is

  • the EXPLAIN command. So, regardless of what version of MySQL you are running, you really

  • want to get familiar with everything in the EXPLAIN command. Understand what the access

  • types, does everyone know what the EXPLAIN command is? Everyone use it? Okay. Does everyone

  • know what every single type of access type is? Okay. It's sort of a poorly named column

  • in the--in the EXPLAIN results. But, it actually describes the path or the optimization pattern

  • at which the optimizer's saying, "Okay well, I'm going to, you know, access, you know,

  • these results at and join it to these results at." And each of those things, the ref, the

  • refrenol, the equaf range, all the different access types mean different things. And the

  • performance that you get from each of them, you can go into MySQL manual, depending on

  • your version. You can see which ones have a better performance for different types of

  • queries. It's not always the same. So, you want to make sure you understand those. Understanding

  • obviously that if you see all in the access type, it means it's doing a full table scan.

  • You know just little things like that. Be aware of what they all mean and what the performance

  • impact of the different ones are. So, by using the Slow Query Log and mysqldumpslow to parse

  • the query log, very easy to set up, Log Slow Queries and then you give it a long query

  • time, anything above that, it will log to the--to the Slow Query Log. There's also--I

  • think in 4.1, we put it an option that you can--anything that--any table, any select

  • statement that does not use an index on a table can get logged to the Slow Query Log

  • regardless of how fast it runs. And that's--and that's useful for when you're doing an indexing

  • strategy and you're trying to determine okay my patterns have access, do I--do I have indexes

  • on those fields that I'm actually accessing? You know, and my where clause and ON clause

  • and that kind of thing. So, low hanging fruit law of diminishing returns tackles stuff that's

  • going to get you the absolute best return on investment for your time. Don't concentrate

  • on, you know, I know you guys are Google. So, a millisecond definitely does count. But,

  • if you got a--if you got a choice of, you know, going from five milliseconds to four

  • milliseconds, you're going from a minute to, you know, a second, it's a no-brainer. You

  • know, concentrate on the stuff that--that's going to give you the biggest impact. Using

  • Mytop. Jeremy Zawodny wrote a nice little tool. Most of you are probably familiar with

  • the top utility in UNIX. It's almost identical but it's just for MySQL. And it gives you

  • an ability to on a delay--on a repeating delay, you know, show you the process list of what's

  • going on in MySQL, and show, you know, show various show commands, the status variables

  • and gives you a nice little interface to profile into the system while it's running. And one

  • nice thing about Mytop that I really like is that you can--you can use it for remote

  • hosts as well. So, you can--you can run it on your local machine and access a remote

  • host. So that it is a nice little tool. Okay. Sources of Problems. This is kind of my personal

  • opinion of where I rank the law of diminishing returns, like what I actually like go for

  • immediately. The performance team tends to think that, you know, sometimes you can--you

  • can tune a server parameter and that's like the biggest bang for the least, you know,

  • amount of time. But I think, really, where performance problems come in are the--are

  • the definitely the top three here. Poor Indexing Choices will kill an application faster than

  • really anything else. If you're querying on a field and you don't have an index on it

  • and you're repeatedly hitting that, it's going to kill the application faster than, you know,

  • whether sort buffer size can be adjusted. So, same thing with the Inefficient or Bloated

  • Schema Design, you know, I talked in the--in the performance talk. I've seen a number of

  • cases where people designing new applications, well, actually de-normalized from the beginning

  • because they think it's like going to give them better performance. Don't do that. You

  • know, there's specific cases where you can de-normalize the schema and you can get some

  • performance benefits. But don't do it right away. You know, just because you think it's

  • going to get, you know, a performance benefit. Use good, efficient, normalize schemas. And

  • we'll talk--we'll go into detail on this later. Bad coding practices. MySQL has some inefficiency

  • problems with sub-queries. But in general, using joins is a lot more efficient than using

  • sub-queries. We're going to go show some examples of specific where you can really get a performance

  • increase by just simply changing the way that you--the way that you code SQL. Server Variables

  • Not Tuned Properly, that's a source of problems. But in my opinion, very specific situations

  • can be--can be helped by tuning server parameters. It's not some catch-all, you know, it totally

  • depends on, you know, how much memory you have in the machine, what the architecture

  • is, what operating system you're running on, what version of MySQL, what storage engines

  • you're using, your, you know, the type of the application that you have, whether it's

  • OLTP or heavy reads, you know, all these things. Yes. Tuning server parameters can help but

  • it's very specific to certain situations. And it also requires retesting and rerunning.

  • But, you know, you tweak one thing, you have to re, you know, run the benchmark test to

  • see if--see if you got the performance gain out it. It can be time consuming to tweak

  • server variables whereas it's pretty easy to, you know, change some seq--some SQL statements

  • and immediately rerun it and see if you got a huge performance gain from doing that. And

  • finally, Hardware and--and Network Bottlenecks. I/O, CPU bound, network latency. Right. There

  • is a good book that I brought along that I recommend for that and when we get to the

  • last slide, I'll show you that, so. All right, Indexing Guidelines. A poor or missing index,

  • definitely the fastest way to kill an application, like I said. Look for covering index opportunities.

  • We're going to go into some slides here to kind of explain what that is. But the covering

  • index is--especially for my--who's using MyISAM versus InnoDB? InnoDB, raise hands. MyISAM,

  • raise hands. Nobody is using MyISAM? For a single thing, okay.

  • >> [INDISTINCT] >> PIPES: Say again?

  • >> [INDISTINCT] >> PIPES: Got you, okay. Well, we'll explain

  • some things about covering index opportunities that are very specific to InnoDB. But covering

  • index is basically what it means is that the MySQL can get all the information from the

  • index records which are by definition slimmer than the data records because they don't contain

  • all the fields in the table and use that information to complete whatever the query was without

  • having to go into the data records. One of the key concepts to this, the slimmer that

  • your index records are, the more that you can fit into a single index block and the

  • more that you can fit into an index block, the fewer reads you're going to do, the faster

  • your application's going to be. So, covering index opportunities definitely look around

  • for those. When you're deciding on which fields you actually want to index, make sure you

  • take a look at what the selectivity of the--of the field is. Selectivity is the cardinality

  • or the number of unique values that are contained in that field divided by the total number

  • of values in a table. So, obviously a unique index would be a cardinality of one. Because

  • it's, you know, unique over the total count. That's obviously the best-case scenario but

  • you can get situations where you've got awful selectivity. If you're querying just on that

  • index or just on that field you may get awful performance from that because--or may just

  • decide not to use the index because it ceases to be useful. I think generally like, [INDISTINCT]

  • could talk a little bit more about this but I think that it numbers about 30%. If your

  • optimizer sees that, okay you do--you do a query select, you know, start from whatever,

  • where column A equals two. And it will take a look and see an estimate from what the information

  • that has on the--on the index, well this particular query returned a third--you know, around 30%

  • or more of the rows in the table. If it does it's actually much more efficient to just

  • sequentially scan the table than it is to go into the index and randomly seek and grab

  • a bunch of records and then do lookups back to the data records. So, be aware of--of the

  • uniqueness of the values in your index. For lower selective fields were you--were you

  • don't have very much selectivity you can sometimes tack them on, you know, to a multicolumn index

  • so that, especially we grouped by queries and we'll show an example of this you get

  • some efficiencies there. But just be aware of the selectivity, so. On multicolumn indexes,