Category Archives: High Availability

Is Zero downtime even possible on RDS?


Join 29,000 others and follow Sean Hull on twitter @hullsean.

Oh RDS, you offer such promise, but damn it if the devil isn’t always buried in the details.

Diving into a recent project, I’ve been looking at upgrading RDS MySQL. Major MySQL upgrades can be a bit messy. Since the entire engine is rebuilt, query performance can change, syntax can break, and triggers & stored procedures can certainly have problems.

That’s not even getting into it with storage engines. Still have some tables on MyISAM? Beware.

The conclusion I would make is if you want zero downtime, or even nearly zero, you’re going to want to roll your own MySQL on EC2 instances.

Read: Why high availability is so very hard to deliver

1. How long did that upgrade take?

First thing I set out to do was upgrade a test instance. One of the first questions my client asked was: how long did that take? “Ummm… you know I can’t tell you clearly.” For an engineer this is the worst feeling. We live & die by finding answers. When your hands are tied, you really can’t say what’s going on behind the curtain.

While I’m sitting at the web dashboard, I feel like I’m trying to pick up a needle with thick leather gloves. Nothing to grasp here. At one point the dashboard was still spinning, and I was curious what was happening. I logged out and back in again, and found the entire upgrade step had already completed. I think that added five minutes to perceived downtime.

Sure I can look at the RDS instance log, and tell you when RDS logged various events. But when did the machine go offline, and when did it return for users? That’s a harder question to answer.

Without command line, I can’t monitor the process carefully, and minimize downtime. I can only give you a broad brush idea of what’s happening.

Also: RDS or MySQL 10 use cases

2. Did we need to restart the instance?

RDS insists on rebooting the instance itself every time it performs a “Modify” operation. Often restarting the MySQL process would have been enough! This is like hunting squirrels with a bazooka. Definitely overkill.

As a DBA, it’s frustrating to watch the minutes spin by while your hands are tied. At some point I’m starting to wonder… Why am I even here?

Related: Howto automate MySQL slow query analysis with Amazon RDS

3. EBS Snapshots are blunt instruments

RDS provides some protection against a failed upgrade. The process will automatically snapshot your volume before it begins. That’s great as far as it goes, but a snapshot is a blunt instrument: if the upgrade goes badly, restoring means standing up a whole new instance from that snapshot, not surgically rolling back the change.

See also: Is Amazon RDS hard to manage

4. Even promoting a read-replica sucks

I also evaluated using a read-replica. Here you spin up a slave first. You then upgrade *THAT* box to 5.6 ahead of your master. While your master is still sending data to the slave, your downtime would in theory be very minimal. Put the master in read-only mode, wait a few seconds for the slave to catch up, switch the application to point to the slave, then promote it!

All that would work well with command line, as your instances don’t restart. But with RDS, it takes over seven long minutes!
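For reference, here’s roughly what that promotion step looks like from the AWS CLI. Treat it as a sketch: the instance name is a placeholder, and on RDS the read_only flag is typically flipped through the DB parameter group since you don’t get SUPER.

[code]
# sketch only -- "my-upgraded-replica" is a placeholder instance name
# flip read_only on the master via its DB parameter group first, then:
aws rds promote-read-replica --db-instance-identifier my-upgraded-replica
[/code]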

Read this: 5 Reasons to move data to Amazon Redshift

5. RDS can upgrade to MySQL 5.6!

MySQL 5.6 introduced a new timestamp datatype which allows for fractional seconds. Great feature, but it means the on-disk data structures are different. Uh oh!

If you’re replicating from MySQL 5.5 to 5.6, it can break: rows flow out of the master in the old format and clash with the 5.6 formatted datafiles! Not good.

The solution requires running ALTER commands on the master beforehand. That in turn locks up tables. So it turns out promoting a read-replica is a non-starter for 5.5 to 5.6. It doesn’t really save much.

All of this devil in the details stuff is terrible when you don’t have command line access.

Read: Are SQL databases dead?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Why the Twitter IPO mentions scalability

ShannaBanan-o-rama

Join 12,000 others and follow Sean Hull on twitter @hullsean.

1. High availability is hard

After seven years in the business you might think Twitter has operations and scalability nailed. I wouldn’t blame you for hoping, but here’s one thing they said in their IPO filing:

“we are not currently serving traffic equally through our co-located data centers”

What does this mean exactly? Let’s think of your drive to work everyday. Remember that one intersection that’s always congested? Could the city designers have envisioned that 50 or 100 years ago? Probably not. In the present day, with all the buildings & roads, can we redesign around it? Not easily. So we adapt, and evolve and deal with the day-to-day realities of an evolving city.

James Urquhart says these are complex systems. The internet, the cloud and your startup infrastructure are by nature brittle.

Also: Why generalists are better at scaling the web.

2. Fail whale is part of the DNA

The graphic above is a whimsical remake of Twitter’s own by Shanna Banan. Consider though, someone at Twitter was tasked with designing a graphic for when the site fails. The devops team then built a page for failure, and have it at the ready, for when there’s an outage, not if. It’s symbolic of the many other things your operations team does behind the scenes in expectation of that fateful day.

As Eric Ries argues, design for failure. Then manage it.

Related: 5 reasons why scalability is a process.

3. Investors, wall street: we’re working on it

What Twitter is really saying is, hey investors, we understand that five nines is extremely difficult, we’re vulnerable in certain ways and want to disclose that.

ReadWrite argues Twitter has not banished the fail whale and is “surprisingly vulnerable”. ReadWrite, I ask you… who has? Google? Nope. Facebook? Nope. Not AirBNB or Reddit either.

These are world class firms. They’ve got the deep pockets to do it right, and the engineering talent to match. They still have failures.

Read this: How to hire a developer that doesn’t suck.

Get more in your inbox: Exclusive monthly Scalable Startups. We share tips and special content. Here’s a sample

Make MySQL clustering work for you

We’ve told you all about MySQL multi-master replication’s limitations. If you write to two masters it is bound to fail for myriad reasons.

Now what? Do what the pros do, that’s what.

A. Don’t write to both masters

Using multi-master replication works great as long as you do so in active-passive mode. Never write to two masters at the same time. When you promote a new side to being a master, do it carefully:

o Put application into read-only mode temporarily (disable editing)
o Set current master to read-only mode
o Change all webservers to point to new master
o Change new master to read-write mode
o Turn application edit back on
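
Here’s a minimal sketch of that switchover at the MySQL level, assuming you have command line access to both boxes; the host prompts are just placeholders.

[code]
-- on the current master: stop accepting writes
old-master> SET GLOBAL read_only = ON;

-- on the new master: confirm replication has caught up
new-master> SHOW SLAVE STATUS\G    -- wait for Seconds_Behind_Master: 0

-- repoint the webservers, then open the new master for writes
new-master> SET GLOBAL read_only = OFF;
[/code]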

B. Use Statement based replication

Statement based replication has been around forever; it’s proven and its limitations are well known. It’s been tested for many, many edge cases. You know what you’re getting.

o supports online schema changes

Perform ALTER TABLEs, add or drop columns, or modify indexes on the inactive master. Once those changes are complete, promote the inactive side to being primary master and perform the changes again on the other master. All with zero downtime to your application. Statement based replication makes this easy, as differences in columns, column order and so forth won’t break things.
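
As a hypothetical example (table and column names are invented), the change goes to the inactive master first; keeping it out of the binlog stops it replicating straight back to the active side.

[code]
-- on the inactive master: keep this ALTER out of the binlog
inactive-master> SET SESSION sql_log_bin = 0;
inactive-master> ALTER TABLE users ADD COLUMN last_login DATETIME NULL;
inactive-master> SET SESSION sql_log_bin = 1;

-- after swapping roles, repeat the same statement on the other master
[/code]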

o facilitates point-in-time recovery

With the SQL of all your queries being written directly to binlogs, the forensic process of reconstructing things during point-in-time recovery becomes much easier.
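
A rough sketch of what that looks like in practice; the binlog file name and timestamp are placeholders.

[code]
# replay statements from the binlog, stopping just before the bad change
mysqlbinlog --stop-datetime="2014-01-15 09:59:59" mysql-bin.000042 | mysql -u root -p
[/code]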

o perform regular checksums against current master

Use the pt-table-checksum tool to verify data. Integrity checking will help you avoid any data drift and keep everything tightly in sync.
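
A minimal invocation looks something like this; the DSN details are placeholders. Checksums are written through replication, so each slave computes its own copy and diffs show up per slave.

[code]
# run against the master; slaves are discovered via the processlist
pt-table-checksum --replicate=percona.checksums --recursion-method=processlist \
  h=master-db,u=checksum_user,p=secret
[/code]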

C. Degrade gracefully – build for a read-only mode

- facilitates failover
- facilitates maintenance
- insurance plan
- disaster recovery
- levers & dials for the operations team

D. Put Memcache between application and database

- reduces load on database
- reduces latency for remote write master
- key value stores are easier to scale
- continue to get fast application response

E. Misc recommendations

o use provisioned IOPS for the database servers
o use Percona Server 5.6
o use multi-threaded slaves
o use semi-synchronous replication
o use the Percona Toolkit checksum tool to provide data integrity checks
o use the Percona Toolkit heartbeat tool to check slave lag
o use Percona XtraBackup to do hotbackups
o perform fire drills to restore backups
o perform fire drills to do point-in-time recovery
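
To make a couple of those concrete, here’s an illustrative my.cnf fragment for Percona Server / MySQL 5.6. The values are examples, not tuned recommendations.

[code]
[mysqld]
# multi-threaded slave: apply replicated events in parallel, per schema
slave_parallel_workers       = 4
# semi-synchronous replication (the semisync plugins must be installed)
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_slave_enabled  = 1
[/code]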

Like our content? We publish an exclusive monthly Scalable Startups where we share more tips and special content. Here’s a sample

No tools to reconcile MySQL with two masters

Here’s the last nail in the coffin. We hope this convinces you not to write to multiple masters with MySQL replication.

Reason 10 – No tools to reconcile inconsistent masters

For all the endless reasons we’ve already outlined, MySQL replication is prone to failure. We know it’s going to happen; now you know too.

When replication breaks, your application may still be writing to both sides, both masters. Now you have data differences on both sides, random rows, in random tables.

The Percona Toolkit checksum tool is great when you have a single authoritative master. It can compare & show you diffs. But when you’ve been writing to two masters, you have no authoritative side. Neither one is the last word on your data. You’re then faced with an intractable problem of reconciling your data in some manual or do-it-yourself fashion.

Not only would such a scenario be hairy and prone to error, but you would effectively have an outage for the duration. All ugly scenarios to be sure.

Click through to the end for multi-master solutions that work with MySQL.

PREV: Reason 9 – Temp tables can break replication

Want more? Grab our Scalable Startups monthly for more tips and special content. Here’s a sample

Temp tables can break replication

If you’re not convinced yet that writing to dual masters is a bad idea, we have a couple more reasons.

9. Temp tables break replication after restart

MySQL’s replication is sensitive to temporary tables. You shouldn’t use them. If your queries create them to work against, and a node crashes, the temp tables will be missing upon restart. You can imagine what this will do to all those queries which expect to see a temp table. That’s right, they’ll fail.
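
To illustrate (table names invented), a sequence like this replicates fine until a restart wipes the temporary table out from under the statements that follow:

[code]
-- builds an intermediate result set in a temporary table
CREATE TEMPORARY TABLE tmp_totals AS
  SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id;

-- ... node crashes and restarts here; tmp_totals is gone ...

-- this statement now fails with "Table 'tmp_totals' doesn't exist"
INSERT INTO daily_report SELECT * FROM tmp_totals;
[/code]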

Theoretically you could have a startup script which creates and populates such temp tables. However, this may add operational complexity if not carefully coordinated with developers & change management.

NEXT: Reason 10 – No tools to reconcile inconsistent masters

PREV: Reason 8 – Crashed nodes can corrupt your cluster


Crashed nodes corrupt your MySQL cluster

Writing to two masters is like walking around with a loaded shotgun. Eventually one of your instances will fail, and when it does, replication’s position & synchronization information could easily become corrupted!

Reason 8 – Crashed nodes cause big problems

MySQL instances, unfortunately, can crash. When that happens, they don’t always sync the replication position properly. That’s a big risk. It’s one of the reasons why operational DBAs know that replication slaves need to be rebuilt from time to time.

If the replication position doesn’t sync properly, then when MySQL restarts it may resume at the wrong position and replay or miss some transactions. What this would do to an active-active setup is unpredictable. Again, no single authoritative master means trouble!

MySQL 5.5 has introduced some parameters to address this. During my tests I’ve seen a performance hit, so use them with caution.

[code]
sync_binlog
sync_relay_log
sync_master_info
sync_relay_log_info
[/code]
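
For illustration, here’s how they’d typically appear in my.cnf. Setting them to 1 forces a sync per event or transaction, which is exactly where that performance hit comes from.

[code]
[mysqld]
sync_binlog         = 1   # fsync the binary log at each commit
sync_relay_log      = 1   # fsync the relay log after each event
sync_master_info    = 1   # persist master.info after each event
sync_relay_log_info = 1   # persist relay-log.info after each transaction
[/code]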

NEXT: Reason 9 – Temp tables can break replication

PREV: Reason 7 – Can’t add third node easily


Transaction isolation breaks when writing two masters

Continuing our discussion of multi-master replication, we hit on five more reasons why writing to two masters aka active-active replication is very dangerous.

Click through to the end for multi-master solutions that work with MySQL.

Reason 6 – You lose transaction isolation

This may sound like a theoretical point to some. But hopefully we can all appreciate that while you’re making changes to your data, you don’t want another user or session with their fingers in the pie, right?

This is implemented in all relational databases, MySQL, Oracle, and SQL Server included, as transactions, or transaction isolation. While you’re operating on rows, they’re locked so other sessions can’t write to them. If they want to do so, they wait in line until you’re done.

In master-master replication, this principle is completely broken. Although your MySQL instance enforces this on the source side, the second master knows nothing about what you are doing. All of the transactions on that side may do whatever they like to your data.

Let’s give an example:

[code]
-- both masters start with the same row
mysql> select * from a;
+------+
| c1   |
+------+
|    5 |
+------+
1 row in set (0.00 sec)

-- session on the first master
masterdb1> set autocommit=0;
masterdb1> update a set c1 = c1 + 5;

-- session on the second master
masterdb2> set autocommit=0;
masterdb2> update a set c1 = c1 * 5;

-- each side commits, unaware of the other
masterdb1> commit;
masterdb2> commit;
[/code]

Depending on the order you commit, one side will then have 10 as the value, while the other side has 25! Strange indeed.

There’s no way to prevent a myriad of scenarios like this without locking. That’s why relational databases introduced shared and exclusive locks, and that’s what’s missing in active-active multi-master replication.

Isolation is big, it’s serious, and without it you’re doomed!

NEXT: Reason 7 – Can’t add third node easily

PREV: Reason 5 – Cannot do integrity checking with active-active


Active-active replication reduces operational flexibility

Among the myriad data integrity and corruption risks associated with active-active replication, you also lose out on configurability and operational flexibility.

Reason 7 – Can’t add nodes easily

The often touted solution of using auto_increment_increment and auto_increment_offset fixes you to a given setup forever. You can set it up with two discrete nodes, all primary keys on node1 being even, all on node2 being odd, or three nodes in a similar way.
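
For reference, the two-node even/odd layout described above is wired up with a couple of my.cnf settings like these (node names are placeholders):

[code]
# node1: even primary keys (2, 4, 6, ...)
auto_increment_increment = 2
auto_increment_offset    = 2

# node2: odd primary keys (1, 3, 5, ...)
auto_increment_increment = 2
auto_increment_offset    = 1
[/code]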

But once you’ve set that up, adding a new node would require you to renumber every row in every table. A mountainous and painful undertaking.

NEXT: Reason 8 – Crashed nodes cause big problems

PREV: Reason 6 – You lose transaction isolation


MySQL needs single master to check data integrity

MySQL slaves can drift out of sync. Many of our clients are surprised to find some data differences in their replication topology, once we do some checking and sniffing around. Such checks require a single reliable or authoritative master to compare against.

Click through to the end for multi-master solutions that work with MySQL.

Reason 5 – No way to do integrity checking

MySQL’s statement based replication has a number of problems. It is notoriously bad at handling stored procedures. There are also lots of problems around non-deterministic functions. When you’re lucky it fails with an error. But many environments have replication problems and data inconsistencies that they don’t even know about.

This problem of data drift is well known in MySQL replication. Luckily there is a solution. Formerly part of Maatkit, now redubbed Percona Toolkit, the table checksum tool attacks this problem head on. pt-table-checksum is a great tool, but won’t help you much when you don’t have a single master.

Which database holds the correct view of all your data?

NEXT: Reason 6 – Active-active breaks transaction isolation

PREV: Reason 4 – Cannot clone & build a slave without a single master
