No tools to reconcile MySQL with two masters

Here’s the last nail in the coffin. We hope this convinces you not to write to multiple masters with MySQL replication.

Reason 10 – No tools to reconcile inconsistent masters

For all the endless reasons we’ve already outlined MySQL replication is prone to failure. We know it’s going to happen, you now know too.

When replication breaks, your application may still be writing to both sides, both masters. Now you have data differences on both sides, random rows, in random tables.

Percona Toolkit checksum tool is great when you have a single authoritative master. It can compare & show you diffs. But when you’ve changed two masters, you have no authoritative side. Neither one is the last word on your data. You’re then faced with an intractable problem of reconciling your data in some manual or do-it-yourself method.

Not only would such a scenario be hairy and prone to error, but you would effectively have an outage for the duration. All ugly scenarios to be sure.

Click through to the end for multi-master solutions that work with MySQL.

PREV: Reason 9 – Temp tables can break replication

Want more? Grab our Scalable Startups monthly for more tips and special content. Here’s a sample

Temp tables can break replication

If you’re not convinced yet that writing to dual masters is a bad idea, we have a couple more reasons.

9. Temp tables break replication after restart

MySQL’s replication is sensitive to temporary tables. You shouldn’t use them. If your queries create them to work against, and a node crashes, the temp tables will be missing upon restart. You can imagine what this will do to all those queries which expect to see a temp table. That’s right, they’ll fail.

Theoretically you could have a start script which creates and populates such temp tables. However this may add operational complexity if not carefully coordinated with developers & change management.

NEXT: Reason 10 – No tools to reconcile inconsistent masters

PREV: Reason 8 – Crashed nodes can corrupt your cluster

Want more? Grab our Scalable Startups monthly for more tips and special content. Here’s a sample

Crashed nodes corrupt your MySQL cluster

Writing to two masters is like walking around with a loaded shotgun. Eventually one of your instances will fail and when it does, replications position & synchronization information could easily become corrupt!

Reason 8 – Crashed nodes cause big problems

MySQL instances, unfortunately can crash. When that happens, they don’t always sync the replication position properly. That’s a big risk. It’s one of the reasons why operational DBAs know that replication slaves need to be rebuilt from time to time.

If the replication position doesn’t sync properly when MySQL restarts it may do so at the wrong position and replay or miss some transactions. What this would do to an active-active setup is unpredictable. Again no single authoritative master means trouble!

MySQL 5.5 has introduced some parameters to address this. During my tests I’ve seen a performance hit, so use them with caution.

[code]
sync_binlog
sync_relay_log
sync_master_info
sync_relay_log_info
[/code]

NEXT: Reason 9 – Crashed nodes cause big problems

PREV: Reason 7 – Can’t add third node easily

Want more? Grab our Scalable Startups monthly for more tips and special content. Here’s a sample

Transaction isolation breaks when writing two masters

Continuing our discussion of multi-master replication, we hit on five more reasons why writing to two masters aka active-active replication is very dangerous.

Click through to the end for multi-master solutions that work with MySQL.

Reason 6 – You lose transaction isolation

This may sound like a theoretical point to some. But hopefully we can all appreciate that while you’re making changes to your data, you don’t want another user or session with their fingers in the pie, right?

This is implemented in all relational databases, MySQL, Oracle, and SQL Server included as transactions or transactional isolation. While you’re operating on rows they’re locked for other sessions to write to. If they want to do so, they wait in line until you’re done.

In master-master replication, this principal is completely broken. Altough your MySQL instance enforces this on the source side, the second master knows nothing about what you are doing. All of the transactions on that side may do whatever they like to your data.

Let’s give an example:

[code]
mysql> select * from a;
+------+
| c1 |
+------+
| 5 |
+------+
1 row in set (0.00 sec)

masterdb1> set autocommit=0;
masterdb1> update a set c1 = c1 + 5;

masterdb2> set autocommit=0;
masterdb2> update a set c1 = c1 * 5;

masterdb1> commit;
masterdb2> commit;
[/code]

Depending on the order you commit one side will then have 10 as the value, while the other side has 25! Strange indeed.

There’s no way to prevent a myriad of scenarios like this without locking. That’s why relational database introduced shared and exclusive locks, and that’s what’s missing in active-active multi-master replication.

Isolation is big, it’s serious, and without it you’re doomed!

NEXT: Reason 7 – Can’t add third node easily

PREV: Reason 5 – Cannot do integrity checking with active-active

Want more? Grab our Scalable Startups monthly for more tips and special content. Here’s a sample

Active-active replication reduces operational flexibility

Among the myriad data integrity and corruption risks associated with active-active replication, you also lose out on configurability and operational flexibility.

Reason 7 – Can’t add nodes easily

The often touted solution to use auto_increment_increment and auto_increment_offset fixes you to a given setup forever. You can set it up with a discrete two nodes, all primary keys on node1 being even, all on node2 being odd, or three nodes in a similar way.

But once you’ve set that up, adding a new node would require you to renumber every row in every table. A mountainous and painful undertaking.

NEXT: Reason 8 – Crashed nodes cause big problems

PREV: Reason 6 – You lose transaction isolation

Want more? Grab our Scalable Startups monthly for more tips and special content. Here’s a sample

MySQL needs single master to check data integrity

MySQL slaves can drift out of sync. Many of our clients are surprised to find some data differences in their replication topology, once we do some checking and sniffing around. Such checks require a single reliable or authoritative master to compare against.

Click through to the end for multi-master solutions that work with MySQL.

Reason 5 – No way to do integrity checking

MySQL’s statement based replication has a number of problems. It is notoriously bad at handling stored procedures. There are also lots of problems around non-deterministic functions. When you’re lucky it fails with an error. But many environments have replication problems and data inconsistencies that they don’t even know about.

This problem of data drift is well known in MySQL replication. Luckly there is a solution. Formerly part of Maatkit, now redubbed percona toolkit, the table checksum tool attacks this problem head on. pt-table-checksum is a great tool, but won’t help you much when you don’t have a single master.

Which database holds the correct view of all your data?

NEXT: Reason 6 – Active-active breaks transaction isolation

PREV: Reason 4 – Cannot clone & build a slave without a single master

Want more? Grab our Scalable Startups monthly for more tips and special content. Here’s a sample

MySQL requires an authoritative master to build slaves

In MySQL database operations, you often need to rebuild slaves. They fail for a lot of different reasons, fall out of sync or crash. When this happens you may find you need to reclone and start fresh. This is normally done by finding your authoritative master database, and doing a hotbackup.

Click through to the end for multi-master solutions that work with MySQL.

Reason 4 – You cannot reclone without single master

If you use active-active multi-master replication, you no longer have a single authoritative master copy of all your data. That means you have no reliable master to clone from, if a replica breaks.

If you’re an operational dba, you probably know that MySQL replication breaks regularly. You always need to be able to go back to a fresh copy when things get mixed up on the slaves.

NEXT: Reason 5 – MySQL row-based replication has limitations

PREV: Reason 3 – Limitations of MySQL row-based replication

Want more? Grab our Scalable Startups monthly for more tips and special content. Here’s a sample

Limitations of MySQL row-based replication

MySQL offers a few different options for how you perform replication. Statement-based has been around a lot longer, and though it has some troublesome characteristics they’re known well and can be managed. What’s more it supports online schema changes with multi-master active-passive setup. We recommend this solution.

Row-based replication is newer. It attempts to address problems like those introduced by non-deterministic functions, and replicating stored procedures. But it introduces it’s own challenges.

Click through to the end for multi-master solutions that work with MySQL.

3. Row-based replication limitations

Row-based replication addresses some of the limitations of statement based replication.

o works better with stored procedures
o reduces problems associated with non-deterministic functions

But it creates a few of it’s own, some are show-stoppers:

o won’t work if target table storage engine, column order, data types or row itself are different or missing.
o doesn’t write SQL to the binlogs – useful for troubleshooting
o harder to do point-in-time recovery without SQL in binlogs
o harder to do online schema changes by switching masters

NEXT: Reason 4 – Cannot reclone without a single master

PREV: Reason 2 – MySQL Replication is prone to failure

Want more? Grab our Scalable Startups monthly for more tips and special content. Here’s a sample

10 reaons active-active is hard and how to solve it

Multi-master replication provides redundant copies of your most important business assets. What’s more it allows applications to scale out, which is perfect for cloud hosting solutions like Amazon Web Services.

But when you decide you need to scale your write capacity, you may be considering active-active setup. This is dangerous, messy and prone to failure. We’ve outlined how.

Click through to the end for multi-master solutions that work with MySQL.

Reason 1 – auto_increment introduces new problems

o MySQL’s auto_inc settings make it more difficult to change servers around in your overall replication topology

o Using auto_inc settings can cause MySQL to introducing gaps in your primary keys which is a waste of space

o Such a solution would require all tables to have auto_inc primary keys

NEXT: Reason 2 – MySQL replication is brittle to start with

Want more? Grab our Scalable Startups monthly for more tips and special content. Here’s a sample