
RDS or MySQL – Ten Use Cases

Amazon’s Relational Database Service is based on MySQL under the hood.  So many colleagues and clients ask me – should I go with RDS or MySQL? As with every technology question, the answer is – it depends.

Here are some scenarios to help you decide.

  1. I'm replicating into Amazon from a physical datacenter
    A: This setup is common if you're using Amazon's VPC, or Virtual Private Cloud.  With a router dropped into your datacenter, VPC lets you extend your network and spin up virtual instances in Amazon as if they were sitting in your own existing datacenter.  Great stuff, but you won't be able to replicate from your existing master MySQL instance to cloud Amazon RDS instances.  To do that, roll your own with MySQL 5.5 or Percona 5.5.  RDS can only replicate with other RDS instances, at least for now.
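
    A minimal sketch of that roll-your-own setup, assuming you've already loaded a copy of the data onto a MySQL 5.5 or Percona 5.5 instance inside the VPC (hostnames, credentials and binlog coordinates below are placeholders):

      # on the cloud replica, point it at the datacenter master over the VPC link
      mysql -u root -p -e "
        CHANGE MASTER TO
          MASTER_HOST='10.1.2.3',
          MASTER_USER='repl',
          MASTER_PASSWORD='replpass',
          MASTER_LOG_FILE='mysql-bin.000123',
          MASTER_LOG_POS=4;
        START SLAVE;"
      # then confirm Slave_IO_Running and Slave_SQL_Running are both Yes
      mysql -u root -p -e "SHOW SLAVE STATUS\G"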

  2. I'm using Amazon and replicating to another cloud provider
    A: Again, roll your own, as RDS won't work in this configuration.

  3. I'm a developer or small startup building a new application
    A: Use RDS.  It's easier to deploy, and though there are fewer levers and dials you can control, you can always dump your data and move to a roll-your-own deployment later.

  4. We're a small shop without a full-time DBA
    A: Use RDS, as it is easier to deploy.  You'll have a couple of replication options at your disposal.  There are Read Replicas, which are basic slaves using MySQL's built-in asynchronous replication.  You can scale horizontally with these read-only copies, but you cannot fail over to them.  For that you'll need a Multi-AZ configuration, where you deploy across availability zones.  Multi-AZ is said to be synchronous, so it's likely built on top of a distributed filesystem such as DRBD.

    Keep in mind, though, there is no free lunch.  RDS lacks the slow query log, which is the primary way you'll identify errant queries and fix them.  You'll need to be extra vigilant about QA and testing as code deploys add new SQL to your application.  There *IS* a way to log slow queries to a table, which is fine for occasional use, but there is real overhead to leaving the feature turned on all the time.
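
    A rough sketch of that slow-log-to-table approach (the endpoint and credentials are made up; the settings live in your RDS DB parameter group):

      # in the instance's DB parameter group set:
      #   slow_query_log = 1, long_query_time = 1, log_output = TABLE
      # then pull the worst offenders out of the mysql.slow_log table
      mysql -h myapp.abc123.us-east-1.rds.amazonaws.com -u admin -p -e "
        SELECT start_time, query_time, rows_examined, sql_text
        FROM mysql.slow_log
        ORDER BY query_time DESC
        LIMIT 10;"
      # RDS also ships mysql.rds_rotate_slow_log to truncate the table periodically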

  5. We want to do master-master for easy failover and failback
    A: Use MySQL 5.5 or Percona 5.5.  Neither of RDS's two replication configurations supports this.  With AWS Read Replicas you have one master and multiple read-only slaves, but no failover.  With Multi-AZ you can't access the inactive secondary database *until* you fail over; once you do, that instance becomes the primary.
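
    For reference, a bare-bones my.cnf sketch of what master-master looks like when you roll your own (server ids and offsets are illustrative):

      # node A
      [mysqld]
      server-id                = 1
      log-bin                  = mysql-bin
      auto_increment_increment = 2   # both nodes count by two...
      auto_increment_offset    = 1   # ...node A hands out odd ids

      # node B is identical except:
      #   server-id             = 2
      #   auto_increment_offset = 2   # node B hands out even ids

    Each node is then configured as a slave of the other with CHANGE MASTER TO, giving you a warm secondary you can fail over to and back again.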

  6. I'm concerned about clean data. Is replication bulletproof?
    A: No, not out of the box.  Whether you are using RDS or MySQL, there are various scenarios where MySQL slaves can drift out of sync with the master without throwing an error.  This likely affects Read Replicas, though it's hard to say whether it affects Multi-AZ.

    If you’re concerned about data integrity with MySQL replication (and you should be!) take a look at our Guide to MySQL Integrity Checking with Percona Toolkit.
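
    As a quick taste, a checksum run with Percona Toolkit looks roughly like this (host and credentials are placeholders; you run it on the master and it verifies the slaves through replication):

      pt-table-checksum --replicate=percona.checksums \
        h=master.example.com,u=checksum_user,p=secret
      # tables reported with DIFFS > 0 indicate slave drift;
      # pt-table-sync can then print the statements needed to repair it
      pt-table-sync --replicate percona.checksums --print \
        h=master.example.com,u=checksum_user,p=secret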

  7. Our use case is to scale reads with multiple instances
    A: RDS can do this handily with Read Replicas.  As long as your requirements call for vanilla read-only replicas, this will work well for you.

    To learn more about this, check out our Guide to Autoscaling MySQL on Amazon.
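
    Spinning one up is a one-liner with Amazon's tooling; with the aws CLI it looks something like this (instance names are made up):

      aws rds create-db-instance-read-replica \
        --db-instance-identifier myapp-replica-1 \
        --source-db-instance-identifier myapp-master
      # repeat for replica-2, replica-3, ... and point read-only traffic at their endpoints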

  8. I don't want to be beholden to Amazon for my database
    A: Then your choice is simpler.  Locking in with Amazon RDS means that when you hit a bug, your hands are somewhat tied.  Amazon's DNA is as a DIY infrastructure provider, and although they've added support contracts to the mix, they're not a Rackspace.  At least not yet.

    So there's a tradeoff.  Go with a roll-your-own solution, where you control all the nuts and bolts of your technology, or go with something more managed.  RDS is a managed database solution, so to the extent things are automated, your hands are also tied when you hit a snag.

  9. We want to use an alternate MySQL distribution
    A: Again, if you want to go with Percona Server, MariaDB or Drizzle, you're going to be rolling your own.

  10. We want to use an exotic replication topology
    A: If you're in this camp, you probably already know RDS isn't going to support you.  MySQL's replication technology can support a myriad of configurations.  Here are a few that might work for you:

  • Master-master or Circular Replication
  • Distribution Master
  • Tree Replication Setup
  • Multi-source Replication
  • Logging Only Slave Server
  • Slaving off of a Slave
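
    To give a flavor of the last one, slaving off of a slave only requires that the intermediate slave write the changes it replicates into its own binary log.  A minimal my.cnf sketch for that intermediate box (the server id is illustrative):

      [mysqld]
      server-id         = 10
      log-bin           = mysql-bin
      log-slave-updates       # record replicated changes in this box's binlog
                              # so downstream slaves can replicate from it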

Best of Guide – Highlights of Our Popular Content

We cherry-pick the five most popular posts on various topics we've covered in recent months.

Ten things to remember about MySQL backups

  1. Use Hot Backups
    Hot backups are an excellent way to back up MySQL.  They can run without blocking your application, and save tons of time on restore.  Percona's xtrabackup tool is a great way to do this.  We wrote a how-to on using xtrabackup for hot backups.
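
    A minimal hot backup sketch using xtrabackup's innobackupex wrapper (paths and credentials are placeholders):

      # take the backup while MySQL keeps serving traffic
      innobackupex --user=backup --password=secret /backups/
      # "prepare" the resulting timestamped directory so it's consistent and ready to restore
      innobackupex --apply-log /backups/2013-06-12_03-00-05/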

  2. Use Logical Backups
    Just because we love hot backups using xtrabackup doesn't mean mysqldump isn't useful.  Want to load data into Amazon RDS?  Want to isolate and load only one schema, or just one table?  All these uses make mysqldump indispensable.  Use it in combination with periodic hot backups to give yourself more recovery options.
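
    A few of those uses, sketched with made-up schema and table names:

      # consistent dump of everything, including stored code
      mysqldump --single-transaction --routines --all-databases > full.sql
      # just one schema, ready to load into RDS or a dev box
      mysqldump --single-transaction shop > shop.sql
      # just one table
      mysqldump --single-transaction shop orders > shop_orders.sql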

  3. Replication isn't a backup
    While replication provides a great way to keep a hot copy of your production database, it's not the same as a backup.  Why?  Operator error, that's why!  People make mistakes, dropping tables and database schemas that later need to be restored.  This can and will happen, so head off the disaster by doing real backups.

    As an additional note, if you’re using replication, you surely want to perform regular checksums of your data.  These ensure that the primary and secondary do indeed contain the same data.

  4. Firedrills & Restore Time
    The only way to be sure your backup is complete is to test restoring everything.  Yes, it's a pain, but it will inevitably be a learning experience.  You'll document the process to speed it up in future tests, you'll learn how long recovery takes, and you'll find additional pieces of the puzzle that must be kept in place.  Doing this in advance of d-day is the only way to be ready for the real thing.

    Different backups have different recovery times.  In industry vernacular, your RTO, or recovery time objective, should inform what will work for you.  Although a mysqldump may take 30 minutes to complete, your restore of that data might take 8 hours or more.  That's due in part to rebuilding all those indexes.  When you perform the dump, each index amounts to one create statement pulled from the data dictionary, but on import the data must be sorted and organized to rebuild the index from scratch.  Percona's mysqldump utility will capitalize on MySQL's fast index rebuild for InnoDB tables.  According to the Percona guys this can bring a big improvement in import time.  Yet another great reason to use the Percona distro!

  5. Transaction Logs
    If you want to be able to do point-in-time recovery, you'll need all the binlog files as well.  These are created continuously, as new transactions are committed in your database.  If your last backup was last night at 3am and you want to recover up to 3pm today, you'll need all the binary logs from the intervening hours to apply to that backup.  This process is called point-in-time recovery, and it can bring your database restore up to the last committed transaction.
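
    Roughly, it works like this: restore the 3am backup first, then replay the binary logs up to the moment you want (file names and times below are illustrative):

      mysqlbinlog --start-datetime="2013-06-12 03:00:00" \
                  --stop-datetime="2013-06-12 15:00:00" \
                  mysql-bin.000142 mysql-bin.000143 | mysql -u root -p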

  6. Backup Config Files
    Don't forget that lonely /etc/my.cnf file.  It's an important part of a backup if you're rebuilding on a newly built server.  It may not need to be backed up with the same frequency, but it should be included.

  7. Stored Code & Grants
    Stored procedures, triggers and functions are all stored in the mysql database schema.  If you are restoring just one database schema, you may not have them, or they may make the restore more complicated.  So it can be a good idea to back up stored code separately.  mysqldump can do this with the --routines option.  Hot backups, by their nature, capture everything in the entire instance – that is, all database schemas, including the system ones.

    Grants are another thing you may want to back up separately.  For the same reasons as stored code, grants live in the system tables.  Percona Toolkit includes a nice tool for this called pt-show-grants.  We recommend running it periodically anyway, as it gives you some perspective on the permissions granted in your database.  You're reviewing those, right?
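
    Both can be captured in small, separate files (the schema name and connection details are placeholders):

      # stored procedures, functions and events only - no table definitions or rows
      mysqldump --no-create-info --no-data --routines --events shop > shop_code.sql
      # all grants, in a form you can replay with the mysql client
      pt-show-grants --host=localhost --user=root --ask-pass > grants.sql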

  8. Events & Cronjobs
    MySQL allows events to run inside the database.  SHOW EVENTS, or SHOW EVENTS FROM schema_name, will display the scheduled events.

    You may also have cronjobs enabled.  Use crontab -l to display those for specific users.  Be sure to check at least “mysql” and “root” users as well as other possible application users on the server.
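
    A quick way to eyeball both (the schema and usernames are examples):

      # scheduled events inside MySQL
      mysql -u root -p -e "SHOW EVENTS FROM shop\G"
      # cronjobs for the usual suspects
      for u in root mysql deploy; do echo "== $u =="; sudo crontab -l -u "$u"; done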

  9. Monitoring
    Backups are a nitpicky job, and often you don't know whether they're complete until it's time to restore.  That's why we recommend the firedrills above; they're very important.  You can also monitor the backups themselves.  Use an error log with mysqldump or xtrabackup, and check that logfile for new messages.  In addition, check the size of the resulting backup file.  If it has changed measurably from recent backup sizes, that may indicate problems.  If your backup size is 0, something is seriously wrong.  If it's half the size of recent ones, the backup may have failed halfway through, or the filesystem may have filled up.
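
    A minimal sketch of that size check, assuming nightly dumps named by date (paths, naming scheme and alert address are all made up):

      #!/bin/bash
      # compare today's dump to yesterday's; complain if it's empty or shrank by half
      today=/backups/full-$(date +%F).sql.gz
      yesterday=/backups/full-$(date -d yesterday +%F).sql.gz
      t=$(stat -c%s "$today" 2>/dev/null || echo 0)
      y=$(stat -c%s "$yesterday" 2>/dev/null || echo 0)
      if [ "$t" -eq 0 ] || [ "$t" -lt $((y / 2)) ]; then
        echo "backup size suspicious: today=$t bytes, yesterday=$y bytes" \
          | mail -s "MySQL backup alert" ops@example.com
      fi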

  10. Security
    This is an often overlooked area, but it may be a concern for some environments.  Is the data contained in your backup sensitive?  Consider where the backups are stored and how long they are retained.  Think about who has access to those files, and apply the rule of least privilege.

Like our stuff? Don’t forget to grab our newsletter!