Justin Silverton at Jaslabs has a supposed list of 10 tips for optimizing MySQL queries. I couldn't read this and let it stand because this list is really, really bad. Some guy named Mike noted this, too. So in this entry I'll do two things: first, I'll explain why his list is bad; second, I'll present my own list which, hopefully, is much better. Onward, intrepid readers!

Why That List Sucks

  1. He's swinging for the top of the trees

    The rule in any situation where you want to opimize some code is that you first profile it and then find the bottlenecks. Mr. Silverton, however, aims right for the tippy top of the trees. I'd say 60% of database optimization is properly understanding SQL and the basics of databases. You need to understand joins vs. subselects, column indices, how to normalize data, etc. The next 35% is understanding the performance characteristics of your database of choice. COUNT(*) in MySQL, for example, can either be almost-free or painfully slow depending on which storage engine you're using. Other things to consider: under what conditions does your database invalidate caches, when does it sort on disk rather than in memory, when does it need to create temporary tables, etc. The final 5%, where few ever need venture, is where Mr. Silverton spends most of his time. Never once in my life have I used SQL_SMALL_RESULT.

  2. Good problems, bad solutions

    There are cases when Mr. Silverton does note a good problem. MySQL will indeed use a dynamic row format if it contains variable length fields like TEXT or BLOB, which, in this case, means sorting needs to be done on disk. The solution is not to eschew these datatypes, but rather to split off such fields into an associated table. The following schema represents this idea:

    CREATE TABLE posts (
        id int UNSIGNED NOT NULL AUTO_INCREMENT,
        author_id int UNSIGNED NOT NULL,
        created timestamp NOT NULL,
        PRIMARY KEY(id)
    );

    CREATE TABLE posts_data (
        post_id int UNSIGNED NOT NULL.
        body text,
        PRIMARY KEY(post_id)
    );

  3. That's just…yeah

    Some of his suggestions are just mind-boggling, e.g., "remove unnecessary paratheses." It really doesn't matter whether you do SELECT * FROM posts WHERE (author_id = 5 AND published = 1) or SELECT * FROM posts WHERE author_id = 5 AND published = 1. None. Any decent DBMS is going to optimize these away. This level of detail is akin to wondering when writing a C program whether the post-increment or pre-increment operator is faster. Really, if that's where you're spending your energy, it's a surprise you've written any code at all

My list

Let's see if I fare any better. I'm going to start from the most general.

  1. Benchmark, benchmark, benchmark!

    You're going to need numbers if you want to make a good decision. What queries are the worst? Where are the bottlenecks? Under what circumstances am I generating bad queries? Benchmarking is will let you simulate high-stress situations and, with the aid of profiling tools, expose the cracks in your database configuration. Tools of the trade include supersmack, ab, and SysBench. These tools either hit your database directly (e.g., supersmack) or simulate web traffic (e.g., ab).

  2. Profile, profile, profile!

    So, you're able to generate high-stress situations, but now you need to find the cracks. This is what profiling is for. Profiling enables you to find the bottlenecks in your configuration, whether they be in memory, CPU, network, disk I/O, or, what is more likely, some combination of all of them.

    The very first thing you should do is turn on the MySQL slow query log and install mtop. This will give you access to information about the absolute worst offenders. Have a ten-second query ruining your web application? These guys will show you the query right off.

    After you've identified the slow queries you should learn about the MySQL internal tools, like EXPLAIN, SHOW STATUS, and SHOW PROCESSLIST. These will tell you what resources are being spent where, and what side effects your queries are having, e.g., whether your heinous triple-join subselect query is sorting in memory or on disk. Of course, you should also be using your usual array of command-line profiling tools like top, procinfo, vmstat, etc. to get more general system performance information.

  3. Tighten Up Your Schema

    Before you even start writing queries you have to design a schema. Remember that the memory requirements for a table are going to be around #entries * size of a row. Unless you expect every person on the planet to register 2.8 trillion times on your website you do not in fact need to make your user_id column a BIGINT. Likewise, if a text field will always be a fixed length (e.g., a US zipcode, which always has a canonical representation of the form "XXXXX-XXXX") then a VARCHAR declaration just adds a superfluous byte for every row.

    Some people poo-poo database normalization, saying it produces unecessarily complex schema. However, proper normalization results in a minimization of redundant data. Fundamentally that means a smaller overall footprint at the cost of performance — the usual performance/memory tradeoff found everywhere in computer science. The best approach, IMO, is to normalize first and denormalize where performance demands it. Your schema will be more logical and you won't be optimizing prematurely.

  4. Partition Your Tables

    Often you have a table in which only a few columns are accessed frequently. On a blog, for example, one might display entry titles in many places (e.g., a list of recent posts) but only ever display teasers or the full post bodies once on a given page. Horizontal vertical partitioning helps:

    CREATE TABLE posts (
        id int UNSIGNED NOT NULL AUTO_INCREMENT,
        author_id int UNSIGNED NOT NULL,
        title varchar(128),
        created timestamp NOT NULL,
        PRIMARY KEY(id)
    );

    CREATE TABLE posts_data (
        post_id int UNSIGNED NOT NULL,
        teaser text,
        body text,
        PRIMARY KEY(post_id)
    );

    The above represents a situation where one is optimizing for reading. Frequently accessed data is kept in one table while infrequently accessed data is kept in another. Since the data is now partitioned the infrequently access data takes up less memory. You can also optimize for writing: frequently changed data can be kept in one table, while infrequently changed data can be kept in another. This allows more efficient caching since MySQL no longer needs to expire the cache for data which probably hasn't changed.

  5. Don't Overuse Artificial Primary Keys

    Artificial primary keys are nice because they can make the schema less volatile. If we stored geography information in the US based on zip code, say, and the zip code system suddenly changed we'd be in a bit of trouble. On the other hand, many times there are perfectly fine natural keys. One example would be a join table for many-to-many relationships. What not to do:

    CREATE TABLE posts_tags (
        relation_id int UNSIGNED NOT NULL AUTO_INCREMENT,
        post_id int UNSIGNED NOT NULL,
        tag_id int UNSIGNED NOT NULL,
        PRIMARY KEY(relation_id),
        UNIQUE INDEX(post_id, tag_id)
    );

    Not only is the artificial key entirely redundant given the column constraints, but the number of post-tag relations are now limited by the system-size of an integer. Instead one should do:

    CREATE TABLE posts_tags (
        post_id int UNSIGNED NOT NULL,
        tag_id int UNSIGNED NOT NULL,
        PRIMARY KEY(post_id, tag_id)
    );
  6. Learn Your Indices

    Often your choice of indices will make or break your database. For those who haven't progressed this far in their database studies, an index is a sort of hash. If we issue the query SELECT * FROM users WHERE last_name = 'Goldstein' and last_name has no index then your DBMS must scan every row of the table and compare it to the string 'Goldstein.' An index is usually a B-tree (though there are other options) which speeds up this comparison considerably.

    You should probably create indices for any field on which you are selecting, grouping, ordering, or joining. Obviously each index requires space proportional to the number of rows in your table, so too many indices winds up taking more memory. You also incur a performance hit on write operations, since every write now requires that the corresponding index be updated. There is a balance point which you can uncover by profiling your code. This varies from system to system and implementation to implementation.

  7. SQL is Not C

    C is the canonical procedural programming language and the greatest pitfall for a programmer looking to show off his database-fu is that he fails to realize that SQL is not procedural (nor is it functional or object-oriented, for that matter). Rather than thinking in terms of data and operations on data one must think of sets of data and relationships among those sets. This usually crops up with the improper use of a subquery:

    SELECT a.id,
        (SELECT MAX(created)
        FROM posts
        WHERE author_id = a.id)
    AS latest_post
    FROM authors a

    Since this subquery is correlated, i.e., references a table in the outer query, one should convert the subquery to a join.

    SELECT a.id, MAX(p.created) AS latest_post
    FROM authors a
    INNER JOIN posts p
        ON (a.id = p.author_id)
    GROUP BY a.id
  8. Understand your engines

    MySQL has two primary storange engines: MyISAM and InnoDB. Each has its own performance characteristics and considerations. In the broadest sense MyISAM is good for read-heavy data and InnoDB is good for write-heavy data, though there are cases where the opposite is true. The biggest gotcha is how the two differ with respect to the COUNT function.

    MyISAM keeps an internal cache of table meta-data like the number of rows. This means that, generally, COUNT(*) incurs no additional cost for a well-structured query. InnoDB, however, has no such cache. For a concrete example, let's say we're trying to paginate a query. If you have a query SELECT * FROM users LIMIT 5,10, let's say, running SELECT COUNT(*) FROM users LIMIT 5,10 is essentially free with MyISAM but takes the same amount of time as the first query with InnoDB. MySQL has a SQL_CALC_FOUND_ROWS option which tells InnoDB to calculate the number of rows as it runs the query, which can then be retreived by executing SELECT FOUND_ROWS(). This is very MySQL-specific, but can be necessary in certain situations, particularly if you use InnoDB for its other features (e.g., row-level locking, stored procedures, etc.).

  9. MySQL specific shortcuts

    MySQL provides many extentions to SQL which help performance in many common use scenarios. Among these are INSERT … SELECT, INSERT … ON DUPLICATE KEY UPDATE, and REPLACE.

    I rarely hesitate to use the above since they are so convenient and provide real performance benefits in many situations. MySQL has other keywords which are more dangerous, however, and should be used sparingly. These include INSERT DELAYED, which tells MySQL that it is not important to insert the data immediately (say, e.g., in a logging situation). The problem with this is that under high load situations the insert might be delayed indefinitely, causing the insert queue to baloon. You can also give MySQL index hints about which indices to use. MySQL gets it right most of the time and when it doesn't it is usually because of a bad scheme or poorly written query.

  10. And one for the road…

    Last, but not least, read Peter Zaitsev's MySQL Performance Blog if you're into the nitty-gritty of MySQL performance. He covers many of the finer aspects of database administration and performance.

96 Comments

  1. Mike April 10th, 2007 / 12:40 am

    Hey Jesse,

    I just scanned your post (it’s past 2AM and I have an early flight tomorrow), but it looks like a _huge_ improvement over the Jaslabs list. In fact, it looks like the kind of post I would have written had I had more time! Great job.

  2. Mike April 10th, 2007 / 1:01 am

    Oh yea, one other thing… preincrement is faster than postincrement b/c postincrement requires the creation of a temporary copy of the variable to return as the result of the statement. Unless of course your compiler optimizes away the difference :).

  3. Robert (Jamie) Munro April 10th, 2007 / 7:14 am

    On point 7, I disagree that you should change subqueries to joins. You should use subqueries where they make the SQL easier to understand, and joins when they make it easier to understand. If the two methods of writing the SQL produce the same results, then the query optimiser should optimise them to do the same thing internally anyway. If one method is slower than the other, that is a bug in the database engine.

  4. Bobby Voliva April 10th, 2007 / 7:35 am

    Your point in #5 is all fine and dandy until you use some type of ORM framework in your application. Bring in something like Hibernate to your application and try to run a code-generation on that table. It will create it’s own primary key for you in the classes it creates, which is a real pain to work with in the code.

  5. Andrew April 10th, 2007 / 7:39 am

    WOW your response to optimizing MySQL queries is basically to set up your database right. Genius!

    What if you don’t control the schema? I guess then you don’t have much.

  6. Yitz April 10th, 2007 / 11:53 am

    Hmm. If it’s on duggmirror, can I still post a comment?

  7. Nick April 10th, 2007 / 12:18 pm

    I attended a presentation on performance optimization by Jay Pipes from MySQL a couple weeks back, and his advice was to always use auto_increment primary keys unless you have a very good reason not to. I’m a little fuzzy on the details, but I believe what he said was when you use an auto_increment PK the engine inherently knows what the next available value is, while if you use another unique column for the PK it needs to do a little more work when inserting to ensure the value you’re providing is unique. The end result according to him was better performance when dealing with high insert volume when using a synthetic auto_increment key. I believe he was talking about InnoDB at the time, so I’m not sure if the same concept applies to MyISAM.

    In the example you gave I agree, keep the auto_increment out of there, but in other cases where there’s natural key you may actually be hurting performance.

  8. Jesse April 10th, 2007 / 12:22 pm

    Robert,

    Unfortunately that problem is hard. I personally wouldn’t trust a database to reliably translate subqueries into joins where possible. I also don’t think joins are frequently more obscure than subselects, especially if you’re (1) familiar with SQL idioms and (2) leave comments about what data the query is fetching.

  9. Jesse April 10th, 2007 / 12:25 pm

    Bobby,

    That is true. Rails, for example, doesn’t support composite primary keys out of the box. Unfortunately that is a problem since often composite primary keys are really the best solution. I know there’s a mixin for Rails which adds support for composite keys, although I don’t know how well it performs.

  10. Jesse April 10th, 2007 / 12:26 pm

    Andrew,

    If you don’t have any control over the layout of your database or the database configuration itself then there’s not much you can do. Use the profiling tools I described to isolate the expensive queries and try to rewrite them so that they’re not so expensive. Since you can’t alter indices or change the schema this might prove to be very difficult.

  11. Jesse April 10th, 2007 / 12:29 pm

    Nick,

    I think he was talking about using auto_increment versus incrementing via some other mechanism. Drupal, which I’ve written about elsewhere, does just this and it’s really annoying — it has its own sequence table. This makes multiple inserts difficult if not impossible. But if you have a natural primary key then adding an auto incrementing artificial key doesn’t get you anything. IOW, if you have an artificial key, it’s best to make it an auto_increment integer column, in MySQL at least.

  12. Ryan April 10th, 2007 / 12:29 pm

    Actually, for #7, you’d want a LEFT JOIN to ensure you got all authors even if they haven’t posted yet.

  13. Jesse April 10th, 2007 / 12:34 pm

    Ryan,

    Well, it depends on what data you’re actually trying to fetch. It was meant as an illustration of the subquery vs. join issue, though. I think it suffices for that. :)

  14. myshpa April 10th, 2007 / 12:56 pm

    Robert (Jamie) Munro: You’re joking, right?

  15. Surf SSL » 10 Tips for Optimizing MySQL Queries (That don’t suck) April 10th, 2007 / 1:15 pm

    […] So, I spent an hour or so and wrote up my own list. I think it fares much better. What do you think?read more | digg […]

  16. Tom April 10th, 2007 / 1:29 pm

    Good list. I’ve attended a presentation given by Jay Pipes from MySQL, and he addresses the same points, I suspect you may have been to a similar presentation or read his book. :)

    As for the sub-queries being changed to joins, the MySQL query optimizer simply does not optimize these well, Oracle or MSSQL on the other hand convert those sub-queries to joins. The point is to realize the strengths and weaknesses of your database, and develop accordingly. Whether or not it should be done by the optimizer is a moot point when you’re writing code. Since the optimizer isn’t going to help you out until a later version, you should account for this if you’re going to be using the DB. If you don’t, the result will be extremely inefficient queries.

  17. Jesse April 10th, 2007 / 1:37 pm

    Tom,

    Good to know. The environments in which I’ve used Oracle and MSSQL have never approached the level of activity of the environments in which I’ve used MySQL (LAMP stack applications generating 30M+ pageviews a month, etc.).

    As for Jay Pipes, I’ve never heard of the guy. These all come from my own experience, particularly making Drupal perform well, and reading Peter Zeitsev’s great weblog. Drupal basically rides right on top of the database so I’ve applied all these things many times over scaling Drupal applications. Have any good Pipes-related links?

  18. Tom April 10th, 2007 / 2:16 pm

    Jay is the North American community relations manager for MySQL AB, he’s also the co-author of Pro MySQL. You can find his blog at jaypipes.com and the book here http://www.amazon.com/MySQL-Experts-Voice-Open-Source/dp/159059505X

    A local university was lucky enough to get him to give a nice (and free) presentation on optimizing MySQL, and I must say, it was the most informative 4 hour session I’ve been to, regardless of price. He’s got a great understanding on how the database works, and what we as developers can and conversely shouldn’t do to get the best performance out of it.

    Looks like they just did a few of these in the Midwest, and are now done… but if you get a chance to go to one of these in the future I’d highly recommend it. I’ve been to MS and Oracle’s events like these, and while those are marketing driven, this was highly valuable.

  19. Jesse April 10th, 2007 / 3:16 pm

    Thanks, Tom! :)

  20. 10 Ways to Optimise MySQL Queries | Entropy April 10th, 2007 / 3:35 pm

    […] 10 Tips for Optimizing MySQL Queries […]

  21. Eric April 10th, 2007 / 8:22 pm

    For once I make a reply.
    Good post, and thanks for the tools in #1, I have been looking around for some and was actually seeking opinions on a few. Everyone has their opinions and different practices work better in different environments based on so many factors, so lets all be friends. Again, thank you for the input.

  22. Greg April 10th, 2007 / 9:30 pm

    Well written and very on target, much better than the other list. I just wanted to add another point regarding normalization and I’ll refer back to where you said “However, proper normalization results in a minimization of redundant data. Fundamentally that means a smaller overall footprint at the cost of performance”. From a storage point of view that’s a good argument but I think there is an even more important reason for normalization and that’s management of redundant data. If the same data is stored in more than one place in your database, which one is currently most correct. Just something for new developers to the database world.
    –G

  23. Because Its Possible.com » Blog Archives » links for 2007-04-11 April 10th, 2007 / 11:42 pm

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck) | 20bits (tags: database development digg howto mysql optimization performance programming tips sql) Posted in My Del.icio.us Posts | Trackback | del.icio.us | Top Of Page […]

  24. Blog by Lars S. Linnet » Blog Archive » MySQL best practices by Jay Pipes April 11th, 2007 / 2:07 am

    […] If you want to know how to do these things perhaps looking at this video can help you. Or if you rather want to read some thing similar go to 10 tips for optimizing mysql queries or a critic of the previous article 10 Tips for Optimizing MySQL Queries (That don’t suck) […]

  25. troels April 11th, 2007 / 2:44 am

    Thanks for an interesting post. Nothing much for the seasoned MySql user, I suspect, but for an application programmer like myself, who merely uses the database as a place to store data, there were a couple of eye-openers.

    Your site looks crappy in firefox btw.

  26. oz4.org - links for 2007-04-11 April 11th, 2007 / 4:22 am

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck) | 20bits More mysql optimization tips. (tags: database development mysql programming) […]

  27. All in a days work… April 11th, 2007 / 5:04 am

    […] 10 tips for optimizing mysql queries (tags: MySQL) […]

  28. Jesse April 11th, 2007 / 9:09 am

    troels,

    Hmm, what version of Firefox are you using? It looks alright in 2.0.

  29. The Abarentos Narrative » links for 2007-04-11 April 11th, 2007 / 5:27 pm

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck) (tags: optimization database sql tips programming development performance webdev mysql) […]

  30. jon April 11th, 2007 / 8:17 pm

    #4 is actually an example of vertical partitioning. You’re moving “columns” into another table — think of it as slicing the table vertically.

    An example of horizontal partitioning would be let’s say if you have a table full of names, and you made a file group for last names A-L, and a file group for M-Z. Then, commonly, you would put each filegroup on a separate [set of] disk[s], speeding up performance.

  31. Jesse April 11th, 2007 / 9:10 pm

    Yep, you’re right, jon. My mistake. :)

  32. Sherif Mansour April 12th, 2007 / 7:09 am

    Thanks for the tips!

    I actually also found a Google Tech Talk Video on MySQL perofmance Tips and found it to be perfect. Check it out: http://blog.sherifmansour.com/?p=72 “Performance Tuning Best Practices for MySQL”

  33. Jesse April 12th, 2007 / 9:20 am

    Sherif,

    That presentation is way more comprehensive than my list. Thanks. :)

  34. links for 2007-04-12 - Bomb.org.uk April 12th, 2007 / 12:27 pm

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck) | 20bits (tags: mysql optimization database sql tips programming development article tricks webdev web) […]

  35. Devlounge | Friday Focus #26 April 13th, 2007 / 1:19 pm

    […] 10 Tips for Optimizing MySQL Queries (That Don’t Suck) Ten ways you can speed up your MySQL Queries to help ease the load on your server and the wait time […]

  36. Justin Silverton April 13th, 2007 / 10:58 pm

    “Andrew,

    If you don’t have any control over the layout of your database or the database configuration itself then there’s not much you can do. Use the profiling tools I described to isolate the expensive queries and try to rewrite them so that they’re not so expensive. Since you can’t alter indices or change the schema this might prove to be very difficult

    in other words, use the tips at my site that jesse claimed were bad.

    “The rule in any situation where you want to opimize some code is that you first profile it and then find the bottlenecks. Mr. Silverton, however, aims right for the tippy top of the trees. I’d say 60% of database optimization is properly understanding SQL and the basics of databases. ”

    If you had bothered to even read (or understand) my article, you would have known that it was titled “10 tips for optimizing mysql queries” IE: Specific things you can do to your queries (not the databse, engine, or software) that can help in optimization.

    Your list is very generic and can be found in almost any book about databases.

  37. Jesse April 13th, 2007 / 11:30 pm

    Justin,

    I was wondering when you’d find your way over here. I still stand by my statement that your tips suck, or at the very least aren’t anything better than content created solely to generate traffic. What’s more, you didn’t address my critiques. So, here they are again:
    1) You are approaching the problem of performance from the wrong end
    2) Some of the problems you identify have solutions, but you give the wrong one.
    3) Some of your “tips” don’t even correspond to actual problems.

    And, I’d add a fourth: since you offer no analysis it’s impossible to judge the merits of your tips. Let’s take the SQL_SMALL_RESULT “tip” again. What are the downsides to using it? What happens if I use SQL_SMALL_RESULT on a query that accidentally returns a large result set? How does SQL_SMALL_RESULT affect the memory usage of MySQL? Do its effects vary between MySQL engines?

    These are concerns someone seriously interested in database performance would need to know, but you give none of it. My suggestions may be more general but they are, I hope, better supported.

    Cheers.

  38. 10 Tips for Optimizing MySQL Queries (That don’t suck) « v1ruz blog April 14th, 2007 / 5:11 am

    […] read more | digg story […]

  39. Yong April 14th, 2007 / 10:18 pm

    This is really useful!

  40. 10 Tips for Optimizing MySQL Queries (That don’t suck) « BoNnErIx.NeT April 15th, 2007 / 11:38 am

    […] aquellos que desean optimizar sus consultas en MySQL, les recomiendo el post de Jesse en 20bits.com y su lista de […]

  41. Top Programming Diggs for the last 30 days - Intelligentedu.com Free Computer Training Blogs April 16th, 2007 / 12:25 pm

    […] help you become a better programmer with MySQL, SQL, Rails, Javascript, CSS, PHP, AJAX, and more. 10 Tips for Optimizing MySQL Queries (That don’t suck)When I read the last list of 10 tips for optimizing MySQL queries I wanted to scream. They were […]

  42. troels April 16th, 2007 / 2:49 pm

    > Hmm, what version of Firefox are you using? It looks alright in 2.0.

    Using 2.0.0.3

    The div class=ch_code_container sections get a big grey block at the bottom. It happens because of the height=100%.

  43. troels April 16th, 2007 / 2:53 pm

    Ah … I just realized that I was looking at a mirrored site (http://duggmirror.com/programming/10_Tips_for_Optimizing_MySQL_Queries_That_don_t_suck/). This probably changes the doctype, which makes the height=100% behave differently.

  44. Bill’s Dev Blog » Blog Archive » 10 Tips for Optimizing MySQL Queries (That don’t suck) April 16th, 2007 / 6:58 pm

    […] Borrowed from this site. […]

  45. » Savaitgalio skaitiniai #23 Archyvas » Pixel.lt April 20th, 2007 / 10:27 pm

    […] serveris Gentoo sistemoje“, bei pristatytas bOOm.lt. Na ir savaitgalio skaitiniai: 10 Tips for Optimizing MySQL Queries (That don’t suck) Markup as a Craft […]

  46. Yuzle! April 22nd, 2007 / 4:44 pm

    Looks like Digg needs a ‘This is a Response to…’ ala YouTube!

  47. Motorcycle Guy April 26th, 2007 / 12:40 pm

    I normally simply use apache’s ab for benchmark testing.

  48. Archive » Friday roundup » stillbreathing.co.uk April 28th, 2007 / 5:13 pm

    […] 10 tips for optimising MySQL queries […]

  49. PHP Programmer Of Bangladesh Best PHP tools of the month « May 2nd, 2007 / 10:05 pm

    […] 10 tips for MySQL query Optimization […]

  50. links for 2007-05-05 -- Daniel Costa May 4th, 2007 / 11:28 pm

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck) | 20bits (tags: mysql optimization performance) […]

  51. louis May 21st, 2007 / 3:34 am

    Very nice article.

    I’ve got a question though.. I’m making a site that uses cron jobs to update every entry in a table every 15mins.

    If my table has thousands’s of rows, its gona kill my server every 15mins right?

    Is there a way to spread the load out? Maybe an UPDATE equivalent of INSERT_DELAYED?

    Thanks
    Louis

  52. Jesse May 21st, 2007 / 8:46 am

    Louis,

    It depends on the structure of your table and what you’re doing every 15 minutes. It also depends on how heavily used the database is. But “thousands” of rows is not a lot. Hundreds of thousands isn’t even that bad, presuming your table isn’t designed horribly.

  53. 10 Tips for Optimizing MySQL Queries (That don’t suck) « Veronica’s Lore May 23rd, 2007 / 6:29 pm

    […] Read the original source… […]

  54. Тут Хумора.NET » Выпуск #306 May 28th, 2007 / 7:06 am

    […] 10 хитростей по ускорению MySQL-запросов. И еще в тему: 10 Tips for Optimizing MySQL Queries (That don’t suck).Use multiple-row INSERT statements to store many rows with one SQL statement.Interview Question and […]

  55. » mysql performance June 1st, 2007 / 4:07 am

    […] http://20bits.com/2007/04/10/10-tips-for-optimizing-mysql-queries-that-dont-suck/ This entry was published on Thursday, May 31st, 2007 at 2:45 pm and is categorized under mysql. You can follow any responses to this entry through the magic of RSS 2.0. You can also leave a response, or trackback from your own site. […]

  56. tending the garden » links for 2007-06-14 June 14th, 2007 / 10:23 pm

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck) | 20bits (tags: Mysql optimize performance sql blog database development) […]

  57. freelancer June 26th, 2007 / 8:28 am

    Thanks for sharing!

  58. Zerd June 30th, 2007 / 10:30 pm

    I was trying to rewrite a sub-query into a join query, but it wouldn’t run any faster (actually a lot slower). I’m tryign to fetch number of comments for each blog entry. It goes like this:

    SELECT title, COUNT(com.id) comments FROM entries LEFT JOIN com ON entries.id = com.entry GROUP BY entries.id ORDER BY entries.time DESC LIMIT 10

    The subquery just selected COUNT(*) from comments where id = entries.id.

  59. Zerd June 30th, 2007 / 11:22 pm

    According to Jay Pipes, your example (by splitting columns) is horizontal. Making more tables (splitting and merging) is vertical.

  60. Jesse July 6th, 2007 / 2:43 pm

    Zerd,

    I’m pretty sure horizontal partitioning is when you split according to rows and vertical is when you split by column. So, e.g., if you have an accounting system and all accounts in a given year are in their own table, that’d be horizontal partitioning. What I’ve described is vertical partitioning.

    As for your query, I can take a guess. I’d wager that you’re using MyISAM tables, in which case count(*) is opimized away. Your indices might also be messed up. That is, if com.entry isn’t an index then joining becomes more expensive. I presume entries.id and com.id are primary keys, so they’re automatically indexed.

    It might even be faster to do: SELECT title, c.comments FROM entries JOIN (SELECT entry, COUNT(*) as comments FROM comments GROUP BY (comments.entry)) c ON c.entry = entries.id;

  61. Performance Within Reach » Links: July 20, 2007 July 20th, 2007 / 1:58 am

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck), Jesse @ 20bits.com […]

  62. Performance Within Reach » Links: July 20, 2007 July 20th, 2007 / 1:58 am

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck), Jesse @ 20bits.com […]

  63. SD 2007/07号のMySQL記事連動スクリプトを試してみた。 // under construction always July 23rd, 2007 / 3:15 am

    […] What to tune in MySQL Server after installation 10 Tips for Optimizing MySQL Queries (That don’t suck) […]

  64. Straw Dogs Code Blog » 10 Indispensible MySQL Resources July 28th, 2007 / 6:36 am

    […] 10 Tips on Optimizing MySQL - From 20bits. […]

  65. Agosto 2007 - Asier Marqués August 1st, 2007 / 4:39 am

    […] 10 consejos para optimización de MySQL. […]

  66. 20 recursos para mySQL August 1st, 2007 / 4:43 am

    […] 10 consejos para optimización de MySQL. […]

  67. 20 Recursos Para Trabajar Con MySQL August 2nd, 2007 / 12:53 am

    […] 10 consejos para optimización de MySQL. […]

  68. Actualidad, Entretenimiento y Humor » 20 Recursos Para Trabajar Con MySQL August 2nd, 2007 / 1:54 am

    […] 10 consejos para optimización de MySQL. […]

  69. 20 Recursos MySQL « Neozeratul en la Red August 9th, 2007 / 10:18 am

    […] 10 consejos para optimización de MySQL. […]

  70. ITVO Alternativo :: 20 Recursos para gestionar Bases de Datos en MySQL :: August :: 2007 August 15th, 2007 / 6:45 am

    […] 10 consejos para optimización de MySQL. […]

  71. smalls blogger » Blog Archive » links for 2007-09-02 September 1st, 2007 / 6:33 pm

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck) | 20bits 10 Tips for Optimizing MySQL Queries (That don’t suck) (tags: mysql optimization database sql performance tips programming) Posted in Bookmarks | […]

  72. Consejos y manuales sobre MySQL « Sistemas Operativos September 30th, 2007 / 4:19 am

    […] 10 tips for optimizing MySQL queries (that don’t suck): en inglés, post relacionado con el anterior. […]

  73. Straw Dogs » 10 Indispensible MySQL Resources September 30th, 2007 / 8:34 am

    […] 10 Tips on Optimizing MySQL - From 20bits. […]

  74. Wahoo October 5th, 2007 / 5:52 pm

    Thank you for sharing!

  75. 10 Tips for Optimizing MySQL Queries (That don’t suck) « Programming News October 10th, 2007 / 10:42 am

    […] read more | digg story […]

  76. Bieber Labs » links for 2007-10-23 October 22nd, 2007 / 9:12 pm

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck) | 20bits (tags: mysql tips optimization query) […]

  77. Matt Carpenter November 3rd, 2007 / 6:34 pm

    Very nice article Jesse. Sound, practical advice well presented and argued. The advanced topics such as benchmarking and profiling are areas I haven’t really delved into yet so thanks alot for the pointers. p.s. I don’t think Justin Silverton has indexed his `retort` column as his comeback query is taking a long time to run.

  78. MySQL Query Optimization Tips « Hone Watson Bookmarks November 7th, 2007 / 4:46 pm

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck) | 20bits […]

  79. I am just a programmer » 10 Tips for Optimizing MySQL Queries (That don’t suck) November 18th, 2007 / 8:15 am

    […] read more | digg story […]

  80. MySQL Optimizing Tips, Best Practices | David Bisset: Web Designer, Coder, Wordpress Guru December 20th, 2007 / 12:40 am

    […] some interesting reading. I’m currently reading 10 Tips for optimizing mysql queries and 10 Tips for Optimizing MySQL Queries (That don’t suck), both I found referenced here. Tags: […]

  81. مدونة cssbit » أرشيف المدونة » روابط (6) February 23rd, 2008 / 1:16 pm

    […] عشر طرق لتحسين MySQL Queries […]

  82. fcicq's del.icio.us » Blog Archive » links for 2008-02-24 February 24th, 2008 / 3:14 am

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck) | 20bits (tags: optimization mysql tips) […]

  83. Mysql freak February 26th, 2008 / 11:10 am

    The missing thing still is how to make sql queries faster

  84. Best PHP tools of the month « Web Technology: A Revolution March 4th, 2008 / 10:38 pm

    […] 10 tips for MySQL query Optimization […]

  85. Optimize MySQL Queries | Unix Trix March 10th, 2008 / 8:29 am

    […] is using the right queries, and going about analyzing them in the right way, so 20bits gives you: 10 tips for Optimizing MySQL Queries that don’t suck. Share and Enjoy: These icons link to social bookmarking sites where readers can share and […]

  86. test 03/18/2008 « Strange Kite March 18th, 2008 / 11:15 am

    […] test 03/18/2008 10 Tips for Optimizing MySQL Queries (That don’t suck) | 20bits  Annotated […]

  87. ken March 18th, 2008 / 11:44 am

    Jesse:

    Rails doesn’t support composite keys for model classes, but posts_tags here is just a join table, and it’s fine with composite keys here — in fact I’m pretty sure it’s the default. We do precisely this in several places, and I don’t remember needing to do anything special.

  88. Jesse March 18th, 2008 / 11:51 am

    ken,

    I think when I wrote this Rails would flip out if even the join table didn’t have its own auto incrementing primary key. This might have changed, but it’s also possible I was wrong.

  89. ken March 18th, 2008 / 1:19 pm

    OK, I guess I have to give you that. We do pass :id=>false (which I think has been around basically forever) to the join table, but it’s true that some Rails features flip out on :id=>false tables (even with 2.0.2). I don’t think I’ve seen any outflipping for join tables in particular, but I wouldn’t bet against it.

  90. Jesse March 18th, 2008 / 1:42 pm

    ken,

    Yeah. I don’t know what version what around when I wrote this article — but that was 11 months ago. It’d also be unfortunate if Rails flipped out by default (i.e., required :id => false), since, AFAIK, Rails is supposed to be about “convention no configuration.”

  91. My daily readings 03/19/2008 « Strange Kite March 19th, 2008 / 5:17 am

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck) | 20bits  Annotated […]

  92. Sp3w » Blog Archive » Linkage 2008.03.24 March 24th, 2008 / 12:17 pm

    […] 10 Tips for Optimizing MySQL Queries […]

  93. Notional Slurry » links for 2008-05-06 May 6th, 2008 / 12:16 am

    […] 10 Tips for Optimizing MySQL Queries (That don’t suck) | 20bits (tags: MySQL coding database optimization performance programming) […]

  94. EllisGL May 8th, 2008 / 7:21 pm

    I’ve read a lot of the “MySQL” Optimization guides and yours is the best so far as a single post. Most of them have gone over “Indexing” or totally skewed the guide you end up screwing yourself.

    I use parenthesis when I’m trying to do two compares on a column ie: (`date` >= ‘xxxx’ AND `date` <= ‘yyyy’). Just make me feel comfortable with grouping it like that.

    As for the other things - I think you have given me a lot of stuff to help me in my job that I start in 2 weeks. I leave the current job tomorrow (friday), where even I know my read calls are really quick, but i can see it being at least 2x faster with these tips.

Leave a Reply