
Bulletproofing MySQL replication with checksums



Are your MySQL replicas running well? You might not even know if they aren’t. One of the scariest things about MySQL replication is that it can drift out of sync with the master “silently”. No errors, no warnings.

  1. What and Why?

    MySQL’s replication solution evolved as a statement based technology. Instead of sending actual block changes, MySQL just has to log committed transactions and reapply them on the slave side. This affords a wonderful array of topologies and different uses, but it has its drawbacks. The biggest occur when data does not get updated or changed in the same way on the slave. If you’re new to MySQL or coming from the Oracle world you might expect that this would flag an error. But there are many scenarios in which MySQL will not flag an error:

    • mixed transactional and non-transactional tables
    • use of non-deterministic functions such as uuid()
    • stored procedures and functions
    • update with LIMIT clause

    There are others but suffice it to say if you want to rely on your slave being consistent, you need to check it!
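    As a quick illustration (the table and data here are made up), consider a statement that uses UUID(). Under statement based replication the function is re-evaluated when the slave replays the statement, so master and slave quietly end up with different values:

    CREATE TABLE widgets (id CHAR(36), name VARCHAR(64)) ENGINE=InnoDB;
    -- UUID() runs once on the master and again on the slave, producing different ids
    INSERT INTO widgets (id, name) VALUES (UUID(), 'sprocket');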

  2. The solution – mathematical checksums

    If you’re a seasoned Linux user, you’re probably familiar with the md5sum command. It creates a checksum on a file. You can do so on different servers to compare a file in a mathematically exact way. In fact rsync uses this technique to efficiently determine what files or pieces of files need to be copied across a network. That’s what makes it so fast!

    It turns out that MySQL can checksum tables too. However, were we to roll our own solution by hand, we’d have trouble getting consistent results because table data is constantly in a state of flux.
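    For a quick manual comparison, MySQL’s built-in CHECKSUM TABLE statement works, though on a busy server the data may change between the time you run it on the master and on the slave:

    CHECKSUM TABLE mysql.user, mysql.db;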

    Enter Percona’s pt-table-checksum tool formerly part of Maatkit. Run it periodically against your master schemas or the entire instance if you like. It will store checksums of all of your tables in a special checksum table. The data from this table then will propagate through replication to all of your connected slaves.

    The tool then has a check mode, which allows you to verify all the connected slaves are ok, or report the differences if it finds any.

  3. Step-by-step Setup

    First you’ll need to grab a copy of the Percona Toolkit. Note that if you previously installed Maatkit, you may want to delete the old scripts to avoid confusion: mk-table-checksum from Maatkit, or pt-table-checksum from the 1.0 toolkit versions. You likely installed them using wget or perl Makefile.PL, so you may need to remove them manually.

    Assuming you’ve already got the Percona repository installed, issue:

    $ yum install -y percona-toolkit

    I’ve found some of these tools to be rather fussy about getting all the options right. The first thing to do, which will help simplify this, is to add a section to your local user’s “.my.cnf” file like this:

    [client]

    user=root

    password=myrootpassword

    That way the Percona tools will find credentials whenever they need to authenticate. We assume localhost for this example, so verify that you can connect with the mysql client as root from localhost.
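    A quick sanity check, assuming the .my.cnf above is in place:

    $ mysql -h localhost -e "SELECT 1"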

    Now let’s checksum the “mysql” system schema.

    $ pt-table-checksum --replicate=test.checksum --create-replicate-table --databases=mysql localhost

    Note the --create-replicate-table option. You only need this option the first time. From there the test.checksum table will exist.

    You should see some output that looks like this:

    TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
    04-24T16:06:45 0 0 0 1 0 0.099 mysql.columns_priv
    04-24T16:06:45 0 0 32 1 0 0.100 mysql.db
    04-24T16:06:45 0 0 0 1 0 0.096 mysql.event
    04-24T16:06:45 0 0 0 1 0 0.096 mysql.func
    04-24T16:06:45 0 0 38 1 0 0.102 mysql.help_category
    04-24T16:06:45 0 0 452 1 0 0.106 mysql.help_keyword
    04-24T16:06:46 0 0 993 1 0 0.096 mysql.help_relation
    04-24T16:06:46 0 0 506 1 0 0.100 mysql.help_topic
    04-24T16:06:46 0 0 0 1 0 0.099 mysql.host
    04-24T16:06:46 0 0 0 1 0 0.104 mysql.ndb_binlog_index
    04-24T16:06:46 0 0 0 1 0 0.107 mysql.plugin
    04-24T16:06:46 0 1 1 1 0 0.115 mysql.proc
    04-24T16:06:46 0 0 0 1 0 0.186 mysql.procs_priv
    04-24T16:06:46 0 1 1 1 0 0.097 mysql.proxies_priv
    04-24T16:06:47 0 0 0 1 0 0.097 mysql.servers
    04-24T16:06:47 0 0 0 1 0 0.096 mysql.tables_priv
    04-24T16:06:47 0 0 0 1 0 0.098 mysql.time_zone
    04-24T16:06:47 0 0 0 1 0 0.097 mysql.time_zone_leap_second
    04-24T16:06:47 0 0 0 1 0 0.100 mysql.time_zone_name
    04-24T16:06:47 0 0 0 1 0 0.100 mysql.time_zone_transition
    04-24T16:06:47 0 0 0 1 0 0.095 mysql.time_zone_transition_type
    04-24T16:06:47 0 1 38 1 0 0.100 mysql.user

  4. How to check slaves

    Once you’ve collected all those fancy checksums for your tables, nicely timestamped, you’ll want to verify that your slaves are happily in sync. You can do that with the following command, also on the master:

    $ pt-table-checksum --replicate=test.checksum --replicate-check-only --databases=mysql localhost

    If there are no differences you’ll see no output. If there is a difference it’ll look something like this:

    Differences on ip-10-15-27-19
    TABLE CHUNK CNT_DIFF CRC_DIFF CHUNK_INDEX LOWER_BOUNDARY UPPER_BOUNDARY
    mysql.user 1 1 1

    In our case you can see we created some users on the slaves accidentally, hence the differences. It illustrates how easy it is for differences to creep into your environment and also how easy it now is to find them!
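    You can also query the checksum table directly on any slave to spot tables that differ from the master. This is a sketch based on the column names pt-table-checksum normally uses for its replicate table:

    SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
    FROM test.checksum
    WHERE master_cnt <> this_cnt
       OR master_crc <> this_crc
       OR ISNULL(master_crc) <> ISNULL(this_crc)
    GROUP BY db, tbl;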

  5. Special Cases

    One of my clients uses Drupal, and they’ve had trouble replicating the semaphore table. It’s a MyISAM table, and unfortunately no one dares convert it to InnoDB. So from time to time some gunk builds up in there, and it fails on the slave. We could clean out the table, but we decided to just filter out this one table. Since Drupal doesn’t use fully qualified schema.table names in its code, only USE statements, we have found this to be safe.

    However the Percona Toolkit explicitly checks for replication filters and will not run. It’ll stop with an error as follows:

    $ pt-table-checksum --replicate=test.checksum --databases=sean --ignore-tables=semaphore localhost

    04-24T15:59:29 Replication filters are set on these hosts:
    ip-10.15.27.19
      replicate_ignore_table = sean.semaphore
    ip-10-15-27-72
      replicate_ignore_table = sean.semaphore
    ip-10-15-27-18
      replicate_ignore_table = sean.semaphore
    Please read the --check-replication-filters documentation to learn how to solve this problem. at /usr/bin/pt-table-checksum line 6166.

The solution is the --nocheck-replication-filters option. Keep in mind that this sanity check is there for a reason, so be sure to skip the relevant tables both when building checksums and when checking them.

To build checksums skipping the semaphore table use this command:

$ pt-table-checksum --replicate=test.checksum --ignore-tables=prod.semaphore --nocheck-replication-filters localhost

Now you can check your slaves but ignore the semaphore table:


$ pt-table-checksum --replicate=test.checksum --replicate-check-only --ignore-tables=prod.semaphore --nocheck-replication-filters localhost

We also found a bug which prevented us from specifying multiple tables in the ignore-tables option. So we used multiple invocations to cover different schemas, like this:


$ pt-table-checksum --replicate=test.checksum --replicate-check-only --databases=prod1 --ignore-tables=prod1.semaphore --nocheck-replication-filters localhost

$ pt-table-checksum --replicate=test.checksum --replicate-check-only --databases=prod2 --ignore-tables=prod2.semaphore --nocheck-replication-filters localhost

  6. Crash Protection

    If you’ve used MySQL replication for any length of time, you’ve probably seen a server crash. MySQL replication can have trouble restarting if you’re using temporary tables, as they’ll be missing upon restart. Also, MySQL before 5.5 leaves syncing the info files to the operating system, so they may be incorrect after a crash.

    1. Use MySQL 5.5 if possible

    There are some new parameters in 5.5 that protect the info files from a crash. These are a great addition, and will make your slave databases more bulletproof.

    sync_master_info = 1

    sync_relay_log = 1

    sync_relay_log_info = 1

    2. Don’t use temporary tables

    After a restart they’ll simply be gone, so queries requiring or running against them will fail.

    The Percona guys’ new book, High Performance MySQL, third edition, suggests an alternative to temporary tables. Use a special schema to hold your temp data, but create the tables as normal permanent ones. Be sure your application creates them with unique names, using CONNECTION_ID() for example. Lastly, have a cleanup process drop tables periodically, based on closed connection ids.
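    Here’s a rough sketch of that idea, using a hypothetical scratch schema and a table name built from the connection id; the cleanup job is left as an exercise:

    CREATE DATABASE IF NOT EXISTS scratch;
    -- build a uniquely named, permanent working table for this connection
    SET @ddl = CONCAT('CREATE TABLE scratch.work_', CONNECTION_ID(),
                      ' (id INT, payload VARCHAR(255)) ENGINE=InnoDB');
    PREPARE stmt FROM @ddl;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;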


    5 Ways to fortify MySQL replication



    MySQL replication technology is powerful and flexible. But it doesn’t do everything perfectly all the time. You may experience trouble with the slaves falling behind the master, or want to scale horizontally by building new slaves automatically. Or you might need to build a slave without blocking the master database.

    All of these goals can be achieved using some powerful tools. Here’s a quick guide to those tools and how to use them.

      1. Build new Replicas without Downtime

    Something we’re sure you need to do quite often is build new slaves. You can snapshot from another slave by bringing the slave down, copying its datadir to an alternate machine, updating the server_id, and then starting up.
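    In rough outline, cloning from an existing slave looks something like this; hostnames and paths are placeholders:

    $ mysqladmin shutdown                            # on the slave being cloned
    $ rsync -av /var/lib/mysql/ newslave:/var/lib/mysql/
    $ vi /etc/my.cnf                                 # on the new slave, set a unique server_id
    $ /etc/init.d/mysql start                        # then start both servers back up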

    However sometimes you have no spare slave, or your current slaves are busy serving data. In those cases you’ll need a way to snapshot off the master. With the great xtrabackup tool, you can create a snapshot of your production database without the locking of mysqldump. What’s more, the resulting snapshot is a full datadir, so you won’t need to import all the data as you would with mysqldump. Save yourself a lot of time!

    Take a look at our how-to for building replication slaves using hotbackups.
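    For the impatient, a minimal sketch of the xtrabackup route, assuming the innobackupex wrapper and placeholder paths (the timestamped directory name is whatever the tool creates):

    $ innobackupex /data/backups                     # hot copy of the running master
    $ innobackupex --apply-log /data/backups/2012-04-24_16-00-00
    $ rsync -av /data/backups/2012-04-24_16-00-00/ newslave:/var/lib/mysql/
    # on the new slave, run CHANGE MASTER TO using the binlog coordinates
    # recorded in xtrabackup_binlog_info, then START SLAVE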

      2. Autoscale in the Cloud

    We wrote an extensive how-to diving into the nuts and bolts of MySQL Autoscaling.

      3. Use semisynchronous replication

    In MySQL 5.5, the Google code contributions got rolled into the Generally Available version. That brings some great new features and functionality to make your replicas more resilient and improve overall redundancy.

    We have an upcoming article planned for May that digs into configuring semisynchronous replication.
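    As a preview, enabling it on 5.5 looks roughly like this, using the stock semisync plugins that ship with MySQL:

    -- on the master
    INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
    SET GLOBAL rpl_semi_sync_master_enabled = 1;

    -- on each slave
    INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
    SET GLOBAL rpl_semi_sync_slave_enabled = 1;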

      4. Add integrity checking

    Statement based replication, as powerful as it is, has some serious limitations. As it turns out, many slave databases drift silently out of sync with the master. What’s more they don’t flag errors when they do so. The reason why this can happen has to do with the nature of statement based replication. If you combine transactional & non-transactional tables, for example, and a transaction rolls back, the statements on MyISAM tables will still get replicated, resulting in potentially different rows on the slave. Other cases include various non-deterministic functions such as sysdate which may return different results on the slave.

    Row-based replication begins to address these problems by offering an alternative that replicates the actual data changes over the wire, instead of the statement instructions. Still, this technology is new, and various situations limit its use in some environments.

    The best way to handle this situation is with the Percona Toolkit checksum tool. It calculates checksums just as you might do to compare files at the filesystem level, and just as rsync does to sync data over ssh.

    We have an upcoming article planned for May, covering table checksuming in depth. We’ll discuss how to collect the checksums via a cronjob, and then how to write a check which you can roll into Nagios, to monitor your data regularly.
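    As a preview, the collection side might be as simple as a crontab entry like this (the schedule and log path are only examples):

    # rebuild checksums on the master every night at 3am
    0 3 * * * pt-table-checksum --replicate=test.checksum localhost >> /var/log/pt-checksum.log 2>&1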

      5. Watch out for the Potholes

        1. Use stored procedures & triggers sparingly or not at all

    Stored procedures can behave strangely with statement based replication, and can easily break things. That’s because stored procedures are fairly non-deterministic by nature. If you want your replication to be bulletproof, avoid them.

        2. Don’t write to both masters

    Sounds straightforward enough, but MySQL does not prevent you from doing so. We recommend you set the read-only flag on inactive masters, to protect you. You could also monitor certain key tables to be further cautious.
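    Setting the flag on the inactive master is a one-liner, and it’s worth adding read_only = 1 to my.cnf as well so it survives a restart:

    SET GLOBAL read_only = 1;   -- note: accounts with the SUPER privilege can still write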

        3. Be sure to set unique server_id

    MySQL’s binary logging uses the server_id setting to uniquely identify servers in the replication topology. So it’s crucial that they be set on all servers, and be unique.
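    Set it in my.cnf on every server (the value below is just an example):

    [mysqld]
    server_id = 2   # any integer, as long as no two servers share it

    You can confirm the value on each server with SHOW VARIABLES LIKE 'server_id';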

        4. Be very wary using temporary tables

    If a slave server dies, its temporary tables will be missing upon restart. If you have subsequent queries that rely on them, they’ll fail. Better to avoid them, or use them very sparingly.

        5. Avoid MyISAM tables altogether

    InnoDB should really be used for all MySQL environments. The exceptions are so few as to not be worth mentioning.

    MyISAM tables are not crash safe, can and will lose data, and can be very slow due to locking problems.

    Need full text searching? Sphinx comes to mind. It’s more scalable, takes load off the database server, and is lightning quick!

    Lastly MyISAM tables can break replication, and we don’t want that! They’re not transaction safe, so if you mix them with InnoDB, bad things can happen!

        6. Avoid non-deterministic functions

    Some functions such as UUID() and RAND() will behave differently on different calls. That means they may behave differently when the slave calls the same statement it fetches from the binlog. These types of problems may break things silently, allowing your slave to drift out of sync with your master without error. See integrity checking above for ways to fix this.

        7. UPDATE with LIMIT clause

    This interesting SQL combination relies heavily on the current ordering of rows in the table. Suffice it to say it can behave differently on the slave, so be cautious if you have this syntax in your code. It may break replication.
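    For instance, a statement like this hypothetical one updates whichever row the server happens to read first, which may be a different row on the slave; adding an ORDER BY on a unique key makes it deterministic:

    -- risky: no defined order, the slave may pick a different row
    UPDATE queue SET claimed = 1 WHERE claimed = 0 LIMIT 1;

    -- safer: order by a unique key so master and slave pick the same row
    UPDATE queue SET claimed = 1 WHERE claimed = 0 ORDER BY id LIMIT 1;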

        8. Row-based replication (still fairly new)

    We are not ready to recommend row-based replication yet, as there have been reports of some troubles and surprises with it. It is a fairly new code path and, though it’s included in the GA release, we don’t see it widely in production yet. It does take a stab at addressing many of the issues with statement based replication though, so the potential is there. We expect to see it used more widely in the future.


    5 things toxic to scalability

    The.Rohit – Flickr

    Check out our followup post 5 More Things Deadly to Scalability

    If you’re using MySQL, check out 5 ways to boost MySQL scalability.

    1. Object Relational Mappers

    ORMs are popular among developers but not among performance experts.  Why is that?  Primarily because these two groups of engineers experience a web application from entirely different perspectives.  One is building functionality, delivering features, and measuring results against business requirements.  Performance and scalability are often low priorities at this stage.  ORMs allow developers to be much more productive, abstracting away the SQL difficulties of interacting with the backend datastore, and allowing them to concentrate on building the features and functionality.


    Scalability is about application, architecture and infrastructure design, and careful management of server components.

    On the performance side the picture is a bit different.  By leaving SQL query writing to an ORM, you are faced with complex queries that the database cannot optimize well.  What’s more ORMs don’t allow easy tweaking of queries, slowing down the tuning process further.

    Also: Is the difference between dev & ops a four-letter word?

    2. Synchronous, Serial, Coupled or Locking Processes

    Locking in a web application operates something like traffic lights in the real world.  Replacing a traffic light with a traffic circle often speeds up traffic dramatically.  That’s because when you’re out somewhere in the country where there’s very little traffic, no one is waiting idly at a traffic light for no reason.  What’s more, even when there’s a lot of traffic, a traffic circle keeps things flowing.  If you need locking, better to use InnoDB tables, as they offer granular row-level locking rather than the table-level locking of MyISAM tables.

    Avoid things like semi-synchronous replication that will wait for a message from another node before allowing the code to continue.  Such waits can add up in a highly transactional web application with many thousands of concurrent sessions.

    Avoid any type of two-phase commit mechanism that we see in clustered databases quite often.  Multi-phase commit provides a serialization point so that multiple nodes can agree on what data looks like, but they are toxic to scalability.  Better to use technologies that employ an eventually consistent algorithm.

    Related: Is automation killing old-school operations?

    3. One Copy of Your Database

    Without replication, you rely on only one copy of your database.  In this configuration, you limit all of your webservers to using a single backend datastore, which becomes a funnel or bottleneck.  It’s like a highway that is under construction, forcing all the cars to squeeze into one lane.  It’s sure to slow things down.  Better to build parallel roads to start with, and allow the application aka the drivers to choose alternate routes as their schedule and itinerary dictate.

    Using MySQL? Check out our howto Easy Replication Setup with Hotbackups.

    Read: Do managers underestimate operational cost?

    4. Having No Metrics

    Having no metrics in place is toxic to scalability because you can’t visualize what is happening on your systems.  Without this visual cue, it is hard to get business units, developers and operations teams all on the same page about scalability issues.  If teams are having trouble grokking this, realize that these tools simply provide analytics for infrastructure.

    There are tons of solutions too that use SNMP and are non-invasive.  Consider Cacti, Munin, OpenNMS, Ganglia and Zabbix, to name a few.  Metrics collection can include business metrics like user registrations, accounts or widgets sold.  And of course it should also include low-level system cpu, memory, disk & network usage, as well as database-level activity like buffer pool, transaction log, locking, sorting, temp table and queries-per-second activity.

    Also: Are SQL Databases dead?

    5. Lack of Feature Flags

    Applications built without feature flags make it much more difficult to degrade gracefully.  If your site gets bombarded by a spike in web traffic and you aren’t magically able to scale and expand capacity, having inbuilt feature flags gives the operations team a way to dial down the load on the servers without the site going down.   This can buy you time while you scale your webservers and/or database tier or even retrofit your application to allow multiple read and write databases.

    Without these switches in place, you limit scalability and availability.

    Also: Is high availability overrated? The myth of five nines…



    5 Ways to Boost MySQL Scalability

    There are a lot of scalability challenges we see with clients over and over. The list could easily include 20, 50 or even 100 items, but we shortened it down to the biggest five issues we see.

    1. Tune those queries

    By far the biggest bang for your buck is query optimization. Queries can be functionally correct and meet business requirements without being stress tested for high traffic and high load. This is why we often see clients with growing pains, and scalability challenges as their site becomes more popular. This also makes sense. It wouldn’t necessarily be a good use of time to tune a query for some page off in a remote corner of your site, that didn’t receive real-world traffic. So some amount of reactive tuning is common and appropriate.

    Enable the slow query log and watch it. Use mk-query-digest, the great tool from Maatkit, to analyze the log. Also make sure the log_queries_not_using_indexes flag is set.  Once you’ve found a heavy, resource-intensive query, optimize it!  Use the EXPLAIN facility, use a profiler, look at index usage, create missing indexes, and understand how it is joining and/or sorting.
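    In sketch form, on MySQL 5.1 or later that workflow looks something like this (the slow log path is whatever you’ve configured):

    $ mysql -e "SET GLOBAL slow_query_log = 1"
    $ mysql -e "SET GLOBAL long_query_time = 1"
    $ mysql -e "SET GLOBAL log_queries_not_using_indexes = 1"
    $ mk-query-digest /var/log/mysql/mysql-slow.log > /tmp/digest.txt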

    Also: Why generalists are better at scaling the web

    2. Employ Master-Master Replication

    Master-master active-passive replication, otherwise known as circular replication, can be a boon for high availability, but also for scalability.  That’s because you immediately have a read-only slave for your application to hit as well.  Many web applications exhibit an 80/20 split, where 80% of activity is read or SELECT and the remainder is INSERT and UPDATE.  Configure your application to send read traffic to the slave or rearchitect so this is possible.  This type of horizontal scalability can then be extended further, adding additional read-only slaves to the infrastructure as necessary.

    If you’re setting up replication for the first time, we recommend you do it using hotbackups. Here’s how.

    Keep in mind that MySQL replication has a tendency to drift, often silently, from the master. Data can really get out of sync without throwing errors! Be sure to bulletproof your setup with checksums.

    Related: Why you can’t find a MySQL DBA

    3. Use Your Memory

    It sounds very basic and straightforward, yet there are often details overlooked.  At minimum be sure to set these (a sample my.cnf sketch follows the list):

    • innodb_buffer_pool_size
    • key_buffer_size (MyISAM index caching)
    • query_cache_size – though beware of issues on large SMP boxes
    • thread_cache & table_cache
    • innodb_log_file_size & innodb_log_buffer_size
    • sort_buffer_size, join_buffer_size, read_buffer_size, read_rnd_buffer_size
    • tmp_table_size & max_heap_table_size
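    As a starting point, a my.cnf sketch might look like the following. The numbers are purely illustrative; size them to your own RAM and workload:

    [mysqld]
    innodb_buffer_pool_size = 8G      # often 50-80% of RAM on a dedicated server
    key_buffer_size         = 128M    # MyISAM index cache
    query_cache_size        = 64M     # beware contention on large SMP boxes
    thread_cache_size       = 32
    table_open_cache        = 1024
    innodb_log_file_size    = 256M
    innodb_log_buffer_size  = 8M
    sort_buffer_size        = 2M      # per-connection buffers: keep these modest
    join_buffer_size        = 2M
    read_buffer_size        = 1M
    read_rnd_buffer_size    = 1M
    tmp_table_size          = 64M
    max_heap_table_size     = 64M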

    Read: Why Twitter made a shocking admission about their data centers in the IPO

    4. RAID Your Disk I/O

    What is underneath your database?  You don’t know?  Well please find out!  Are you using RAID 5?  That’s a big performance hit.  RAID 5 is slow for inserts and updates, and it is almost non-functional during a rebuild if you lose a disk, with very, very slow performance.  What should you use instead?  RAID 10 mirroring and striping, with as many disks as you can fit in your server or RAID cabinet.  A database does a lot of disk I/O even if you have enough memory to hold the entire database.  Why?  Sorting requires rearranging rows, as do GROUP BY, joins, and so forth.  Plus the transaction log is disk I/O as well!

    Are you running on EC2?  In that case EBS is already fault tolerant and redundant.  So give your performance a boost by striping only (RAID 0) across a number of EBS volumes using Linux md software RAID.
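    A sketch of that striping setup, assuming four EBS volumes attached at the device names below (use whatever filesystem you prefer):

    $ mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
    $ mkfs.xfs /dev/md0
    $ mount /dev/md0 /var/lib/mysql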

    Also check out our Intro to EC2 Cloud Deployments.

    Also of interest autoscaling MySQL on EC2.

    Also: Why startups are trying to do without techops and failing

    5. Tune Key Parameters

    These additional parameters can also help a lot with performance.

    innodb_flush_log_at_trx_commit=2

    This speeds up inserts & updates dramatically by being a little bit lazy about flushing the innodb log buffer.  You can do more research yourself but for most environments this setting is recommended.

    innodb_file_per_table

    InnoDB was developed, like Oracle, around the tablespace model for storage.  Unfortunately the default of keeping everything in one shared tablespace turns out to be a performance bottleneck, with contention for file descriptors and so forth.  This setting makes InnoDB create a tablespace and underlying datafile for each table, just like MyISAM does.

    Read this: Why a four letter word still divides dev and ops



    Database Replication – What is it and why is it important?

    Database replication is a service for shipping changes to your database, off to a copy housed on another server, potentially even in another data center.

    Changes to the records in your database are organized into transactions, each of which gets a unique identifier or change number.  Those groupings of changes are written to transaction logs, which are then moved across to the sibling database server and applied there.

    Oracle offers this facility in the form of Standby Database, aka Dataguard.  Although it can be messy to set up, it is fairly bulletproof.  If you don’t receive any errors in your logfiles, you can rest assured that the data on your main or master database server looks exactly the same as what’s on your secondary server.  With MySQL the situation can be a bit more complicated.  MySQL replication was built based on statements, not changed data.  So those same statements or instructions (DML & DDL SQL statements) get rerun on the secondary database server.  Some of the problems associated with this are:

    1. MySQL standby transactions may execute at different times than master
    2. Serialization is important to maintain order.  Transactions need to get applied in the same order as on the primary.  However, not all tables & storage engines in MySQL are transactional.  This can cause trouble, and potentially break replication on the slave side.
    3. MySQL replication can fail silently, that is data can drift out of sync with the master without your knowing.

    Caveats aside, MySQL replication is powerful, and quite fast.  With the right caution, and correct tools, it can be as robust and reliable as Oracle’s own standby database technology.

    • Use checksum tools like Maatkit to verify & compare tables on primary & secondary servers
    • Use rigorous monitoring to watch error logs, and checksum logs hourly if necessary.
    • Use Maatkit sync tool to resync tables if they get out of whack.
    • Be especially vigilant if you’re taking backups off the secondary server

    Replication can also be a powerful high availability tool in your arsenal.  Set up as a master-master cluster, you can keep your application pointed to the “active” side while performing blocking operations on the inactive database server.  Since a master-master cluster, also known as circular replication, applies changes to both databases, you’ll need to stop replication during the blocking operation.  Once it is completed on the inactive side, redirect the webservers & application to the database where you completed the operation, then re-enable replication and watch the databases catch up with each other!
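    In sketch form the switchover looks something like this, where dbA is the active master, dbB the inactive one, and the ALTER is just a stand-in for your blocking operation:

    -- on dbB (inactive): pause replication and run the blocking operation
    STOP SLAVE;
    ALTER TABLE orders ADD COLUMN notes VARCHAR(255);

    -- repoint the webservers/application at dbB, then resume replication;
    -- the change flows back to dbA while traffic is served from dbB
    START SLAVE;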

    Quora discussions – What is database replication?