5 Ways to fortify MySQL replication

fort

Also find Sean Hull’s ramblings on twitter @hullsean.

MySQL replication technology is powerful and flexible. But it doesn’t do everything perfectly all the time. You may experience trouble with the slaves falling behind the master, or want to scale horizontally by building new slaves automatically. Or you might need to build a slave without blocking the master database.

All of these goals can be achieved using some powerful tools. Here’s a quick guide to those tools and how to use them.

    1. Build new Replicas without Downtime

Something we’re sure you need to do quite often is to build new slaves. You can snapshot from another slave by bringing the slave down, copying it’s datadir to an alternate machine, updating the server_id and then starting up. However sometimes you have no slave, or your current slaves are serving data.

In those cases you’ll need a way to snapshot off the master. With the great xtrabackup tool, you can create a snapshot of your production database without the locking of mysqldump. What’s more the resulting snapshot is a full datadir, so you won’t need to import all the data as you would with mysqldump. Saves yourself a lot of time!

Take a look at our how-to for building replication slaves using hotbackups.

    1. Autoscale in the Cloud

We wrote an extensive how-to diving into the nuts and bolts of MySQL Autoscaling.

    1. Use semisynchronous replication

In MySQL 5.5, the Google code contributions got rolled into the Generally Available version. That brings some great new features and functionality to make your replicas more resilient and improve overall redundancy.

We have an upcoming article planned for May that digs into configuring semisynchronous replication.

    1. Add integrity checking

Statement based replication, as powerful as it is, has some serious limitations. As it turns out, many slave databases drift silently out of sync with the master. What’s more they don’t flag errors when they do so. The reason why this can happen has to do with the nature of statement based replication. If you combine transactional & non-transactional tables, for example, and a transaction rolls back, the statements on MyISAM tables will still get replicated, resulting in potentially different rows on the slave. Other cases include various non-deterministic functions such as sysdate which may return different results on the slave.

Row-based replication begins to address these problems, by offering an alternative which includes the actual data changes replicated over the wire, instead of the statement instructions. Still this technology is new, and situations limit it’s use in some environments.

The best way to handle this situation is by using the Percona Toolkit checksum tool. This calculates checksums just like you might to compare files at the filesystem, and just as rsync does to sync data over ssh.

We have an upcoming article planned for May, covering table checksuming in depth. We’ll discuss how to collect the checksums via a cronjob, and then how to write a check which you can roll into Nagios, to monitor your data regularly.

    1. Watch out for the Potholes
    1. Use stored procedures & triggers sparingly or not at all

Stored procedures can behave strangely with statement based replication, and can easily break things. That’s because stored procedures are fairly non-deterministic by nature. If you want your replication to be bulletproof, avoid them.

    1. Don’t write to both masters

Sounds straightforward enough, but MySQL does not prevent you from doing so. We recommend you set the read-only flag on inactive masters, to protect you. You could also monitor certain key tables to be further cautious.

    1. Be sure to set unique server_id

MySQL’s binary logging uses the server_id setting to uniquely identify servers in the replication topology. So it’s crucial that they be set on all servers, and be unique.

    1. Be very wary using temporary tables

If a slave server dies, it’s temporary tables will be missing upon restart. If you have subsequent queries that rely on them, they’ll fail. Better to avoid them or use them very sparingly.

    1. Avoid MyISAM tables altogether

InnoDB should really be used for all MySQL environments. Any exceptions are so few, as to be not worth mentioning.

MyISAM tables are not crash safe, can and will lose data, and can be very slow due to locking problems.

Do you need to do full text searching, Sphinx comes to mind. It’s more scalable, takes load off the database server, and is lightening quick!

Lastly MyISAM tables can break replication, and we don’t want that! They’re not transaction safe, so if you mix them with InnoDB, bad things can happen!

    1. Avoid non-deterministic functions.

Some functions such as UUID() and RAND() will behave differently on different calls. That means they may behave differently when the slave calls the same statement it fetches from the binlog. These types of problems may break things silently, allowing your slave to drift out of sync with your master without error. See integrity checking above for ways to fix this.

    1. UPDATE with LIMIT clause

This interesting SQL combination relies heavily on the current sort of rows in the table. Suffice it to say it can behave differently on the slave, so be cautious if you have this syntax in your code. It may break replication.

  1. Row-based replication (still fairly new)

We are not ready to recommend row-based replication yet, as there have been reports of some troubles and surprises with it. It is a fairly new code path and though it’s included in GA release, we don’t see it widely in production yet. It does take a stab at addressing many of the issues with statement based replication though, so the potential is there. We expect to see it used more widely in the future.

  • http://scale-out-blog.blogspot.com Robert Hodges

    I’m a little surprised you don’t recommend row replication. It has been around for two full releases and is very stable now. In my opinion MIXED binlog format is a best practice as it uses row replication selectively to deal with issues D, F, and G above. Row replication is also indispensible for heterogenous cases such as we handle with Tungsten Replicator. There are of course downsides to row replication, most notably that large transactions can block the binlog because they generate so many row changes. However, these are design trade-offs rather than bugs and you can always go back to statement replication for such cases.

  • http://www.iheavy.com Sean Hull

    Hi Robert, thx for the comments. It’s true row-based replication has a lot of strengths. There are some things that I’ve found stumbling blocks with row-based replication. Here are a few:

    1. Doesn’t replicate schema changes
    2. Can’t be enforced across environment, so replicas can switch from row-based to statement-based down the line.
    3. Row-based logging isn’t backward compatible, so all our servers would have to run the latest version of MySQL.
    4. Troubleshooting is tougher with row-based replication since the originating statement isn’t recorded in the binlog.

    As with any new technology your mileage may vary, but we’re not ready to recommend it wholesale just yet.

    -Sean

  • jesse

    This is some great info, thanks!
    PS: your numbering seems to have broken, as all five ways are #1, and all subheadings are A.

  • hullsean

    Thx Jesse. Yeah the formatting for this theme is a bit wonky sometimes. I’ll try to fix thx. Cheers & happy new year!