3 ways your MySQL migration project can shake you up
Once a development or operations team gets over the hurdle of open source and starts to feel at home with the way software works outside of the enterprise world, it is tempting to settle in. Best not to get too cushy, though, for there are more surprises hiding around the corner. Here are a few of the biggest ones.
1. Replication Is Not Perfect
Yes, you’ve installed MySQL, set up a slave, and got it replicating properly. You check the slave and it’s 0 seconds behind the master. What’s more, you monitor the error log file and have a check in place to alert you if anything shows up there. Job done, right?
Not so fast. Unfortunately this is not the end of the story. Many MySQL replication slaves are not consistent with their masters; they drift apart silently. Really? Yes, really. But there are tools to help you. Maatkit includes a checksum tool that builds checksums for all of your tables and stores that data in a table. Your checksum data then flows through replication to the slave, where you can run a check for any differences. Such a check is easy to wire into Nagios or other monitoring software, so you are alerted as soon as a difference materializes in your slave database.
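As a sketch of how this works in practice: Maatkit's checksum tool lives on as pt-table-checksum in Percona Toolkit, and the invocation below assumes that version. Host names, credentials, and the database name are placeholders; check the tool's documentation for the options in your release.

```
# Run on the master: writes per-chunk checksums into percona.checksums,
# which then replicates to the slaves (names below are placeholders).
pt-table-checksum --replicate=percona.checksums \
  --databases=mydb h=master-host,u=checker,p=secret

# Then, on each slave, look for chunks that differ from the master:
mysql -h slave-host -e "SELECT db, tbl, chunk FROM percona.checksums
  WHERE master_cnt <> this_cnt OR master_crc <> this_crc
     OR ISNULL(master_crc) <> ISNULL(this_crc);"
```

That final SELECT is the query you would wrap in a Nagios check: any rows returned means the slave has drifted.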
With vigilant data integrity checking in place, your MySQL slave is reliable enough to take backups from. But beware if you’re relying only on the error log rather than checking with tools like these: that is unfortunately not sufficient.
If you want more specific help, we wrote a howto on bulletproofing MySQL replication with checksums.
Also, you may find after running integrity checks that your slave database is out of sync with your master. There are various ways to sync up the differences, but if they are too numerous you may choose to rebuild your slave. Here’s how you can rebuild replication without downtime using hotbackups.
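When the differences are small enough to repair in place, a sketch of the sync step (again assuming the Percona Toolkit successor to Maatkit, with placeholder host and credentials) looks like this:

```
# Preview the statements pt-table-sync would run to fix the slave:
pt-table-sync --print --sync-to-master h=slave-host,u=syncer,p=secret

# If the output looks sane, apply the fixes. The changes are executed
# on the master so they flow through replication to the slave:
pt-table-sync --execute --sync-to-master h=slave-host,u=syncer,p=secret
```

Running with --print first is the important design choice: you see exactly which rows will change before anything is written.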
2. Queries With Bad Performance
MySQL’s optimizer is still fairly rudimentary. There are some classes of queries that MySQL does not handle well. If I were to offer a general rule, it would be to stick to fairly simple queries with at most a few joins. Simpler lookups tend to be fast. Less complex queries are also easier to understand, optimize, and tune later on if they require it.
Subqueries
Although MySQL supports subqueries, they often optimize badly and perform terribly. Review the EXPLAIN plan to be sure your query is optimal, and replace such a query with a join where possible.
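As an illustration of that rewrite (the tables and columns here are invented for the example):

```sql
-- On older MySQL versions, an IN subquery like this was often
-- re-evaluated for every row of `customers`:
SELECT c.name
FROM customers c
WHERE c.id IN (SELECT customer_id FROM orders WHERE total > 100);

-- The equivalent join usually optimizes far better; DISTINCT keeps
-- the result set the same when a customer has several matching orders:
SELECT DISTINCT c.name
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.total > 100;
```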
MySQL only supports the nested-loops algorithm for joins. Merge join and hash join, two alternative and very efficient algorithms, have unfortunately not yet been implemented in MySQL. Be aware: run EXPLAIN and profile your slow queries.
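Checking a plan takes one keyword; the query below reuses the hypothetical tables from the earlier example:

```sql
-- Prefix any SELECT with EXPLAIN to see the chosen plan:
EXPLAIN SELECT c.name, o.total
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.total > 100;

-- Things to watch in the output: type = ALL means a full table scan,
-- the rows column estimates how many rows each step examines, and the
-- Extra column flags "Using temporary" or "Using filesort".
```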
Using Temporary; Using Filesort
When MySQL has to do sorting, it does so in one of two ways. If the dataset is small, it uses a quicksort in memory; in this case you’ll see just “Using filesort”. However, if you see “Using temporary; Using filesort” in EXPLAIN plan output, it means MySQL is using a two-pass algorithm, creating a temporary table because the data won’t all fit in memory. Sometimes a composite or multi-column index can help in these situations. You may also affect this behavior by setting sort_buffer_size and join_buffer_size at the session level, then rerunning the EXPLAIN.
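A quick session-level experiment looks like this; the buffer sizes and the index columns are illustrative, not recommendations:

```sql
-- Raise the sort and join buffers for this session only, then re-run
-- EXPLAIN on the slow query to see whether the plan changes:
SET SESSION sort_buffer_size = 4 * 1024 * 1024;
SET SESSION join_buffer_size = 4 * 1024 * 1024;

-- A composite index matching the ORDER BY can remove the sort entirely
-- (table and column names here are hypothetical):
ALTER TABLE orders ADD INDEX idx_cust_created (customer_id, created_at);
```

Because the SET statements are session-scoped, they vanish when you disconnect, which makes them safe for this kind of tuning experiment.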
Newer versions of MySQL such as 5.6 have gotten much smarter about this, as the optimizer is always being tweaked and improved. Check your query plans in dev & test before migrating your production database.
Single Index Usage
Although MySQL supports very efficient B-tree indexes, it cannot always use multiple indexes, or merge indexes together, for a more efficient path to the data. Although there are some exceptions, figure that at each stage of a query, and on each table, MySQL will likely choose only one index. When you’re not satisfied with this, you may choose to build a covering or multi-column index. Keep in mind that such an index will only serve that one particular query, or queries against its leading-edge columns. Also keep in mind that the more indexes you build on a table, the slower inserts, updates, and deletes against that table will be, since all of those operations require index maintenance. You must balance read performance (indexes on more columns) against write performance (the cost of maintaining and updating those indexes).
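A covering index for one specific query might look like this (the table, columns, and query are hypothetical):

```sql
-- Index every column the query touches, filter column first:
ALTER TABLE orders ADD INDEX idx_cover (customer_id, status, total);

-- This query can now be answered from the index alone, without reading
-- the table rows; EXPLAIN shows "Using index" in the Extra column:
SELECT status, total FROM orders WHERE customer_id = 42;
```

The same index also serves queries filtering only on customer_id, the leading-edge column, but not queries filtering only on status or total.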
3. Where Are The Transactional Tables?
Yes, MySQL does offer row-level locking, ACID compliance, repeatable read and other isolation levels, and referential integrity. In fact, its implementation of the buffer cache, which supports reading and writing blocks from disk, is modeled on Oracle’s own algorithm. So why aren’t you getting all of that?
As it turns out, MySQL supports various storage engines, unlike other databases which have all of this rolled into the kernel and only support transactional tables. This can be thought of as a feature, although in reality it is more an anomaly of how MySQL has evolved and where its roots lie. Historically the default storage engine has been MyISAM, which is not transactional: all changes are effectively auto-committed, and it supports only table-level locking. (MySQL 5.5 and later default to InnoDB, but migrated and legacy installations often still carry MyISAM tables.)
If you haven’t checked already, verify what table types you have as follows:
mysql> select table_schema, engine, count(*) from information_schema.tables group by table_schema, engine;
Outside of system tables, your application tables should all be InnoDB. You can modify an existing table to be transactional as follows:
mysql> alter table mytable engine=innodb;
You should also set the default storage engine variable (storage_engine in older versions, default_storage_engine in newer ones) to InnoDB. Finally, you may also choose to add NO_ENGINE_SUBSTITUTION to sql_mode, which forces MySQL to raise an error whenever the storage engine you choose is not available, rather than silently creating the table with a different engine.
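Both settings can be made persistent in my.cnf or changed at runtime; the sketch below assumes a version new enough to use default_storage_engine:

```sql
-- In my.cnf, under [mysqld]:
--   default_storage_engine = InnoDB
--   sql_mode = NO_ENGINE_SUBSTITUTION

-- Or at runtime (affects new sessions; add to my.cnf to survive restart):
SET GLOBAL default_storage_engine = InnoDB;
SET GLOBAL sql_mode = 'NO_ENGINE_SUBSTITUTION';
```

With NO_ENGINE_SUBSTITUTION set, a CREATE TABLE ... ENGINE=InnoDB on a server where InnoDB is unavailable fails loudly instead of quietly giving you a MyISAM table.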
Read this far? Grab our newsletter – scalable startups.