The Mythical MySQL DBA

I’ve  been getting more than my fair share of calls from recruiters of late. Even in this depressed economic climate where jobs are rarer than a cab at rush-hour, it’s heartening to know that tech engineers are in great demand. And it’s even more heartening to think that demand for MySQL DBAs has never been better.

My reckoning was confirmed by a Bloomberg news report about stalwart retailers suffering from a dearth of talented engineers. Bloomberg cited Target’s outage-prone e-commerce site as a symptom of, among other things the market’s shortage. One of the challenges old-timers like Target face is having to compete with Silicon Valley startups as a fulfilling and ultimately, financially rewarding place to work. Continue reading The Mythical MySQL DBA

What Ops doesn’t tell you about your MySQL Database

MySQL is a very scalable platform which has proven robust even in the most dense and complex data environments. MySQL’s indispensable replication function is ‘sold’ as being fail-safe so you have little to sweat about as long as your backups are running regularly. But what the ops guys aren’t telling you is MySQL performs replication with tiny margins of error that could cause big problems in times of disaster.

Replication database backup

The Scenario

Imagine the scene, you use replication to backup your data. Your secondary database is your peace of mind. It’s the always-on clone of your crown jewels. You even perform backups off of it so you don’t impact your live website. Your backups run without errors. Your slave database runs without errors. Then the dreaded day comes when your primary database fails. You instruct your team to switchover your application to point to your live backup database. The site comes online again. But all is not right. You notice subtle differences and your team begins to question how deep the data divide could be.

The Problem with MySQL replication

Although MySQL replication is fairly easy to setup, and even to keep running without error, you may have unseen problems. MySQL’s core technology to replicate data between master and slave is primarily statement based. Various scenarios can cause what in other database platforms you might call database corruption, that is silent drifting of data from what tables and rows contain on the master. It is no fault of your own, or perhaps one might argue even of your operations team. It is a fundamental flaw in how MySQL performs replication.

The Solution

Fortunately there is a solution. Checksums, the wonderful computational tool for comparing things can be put to work nicely to compare database. The Percona Toolkit (formerly maatkit) includes just such a utility for use with MySQL. It can be used to check the integrity of your slave databases.
If you’ve never performed such a check, you should do so ASAP. If your database has been running for months at a stretch, chances are there could be differences lying undiscovered between the two systems.

Depending on the volume changing in your database, you can continue to use this tool periodically to confirm that all is consistent. If integrity checks fail, there is another tool in Maatkit to syncronize differences, and bring everything back to order.