Tag Archives: mysql

Scalability Happiness – A Quiet Query Log

Peter Van Allen - Pin Drop

Join 7500 others and follow Sean Hull on twitter @hullsean.

There’s a lot of talk on the web about scalability. Making web applications scale is not easy. The modern web architecture has so many moving parts. How can we grapple with the underlying problem?

Also: Why Are MySQL DBAs So Hard to Find?

The LAMP stack scales well

The truth that is half right. True there are a lot of moving parts, and a lot to setup. The internet stack made up of Linux, Apache, MySQL & PHP. LAMP as it’s called, was built to be resilient, dynamic, and scalable. It’s essentially why Amazon works. Why what they’re doing is possible. Windows & .NET for example don’t scale well. Strange to see Oracle mating with them, but I digress…

[quote]
Linux and LAMP that is built on top of it, are highly scalable and dynamic to begin with.
[/quote]

Also: AirBNB Didn’t Have to Fail During an AWS Outage

Ok, so what’s this got to do with MySQL? Well a LOT.

The webserver tier, the caching layers like memcache & varnish, as well as the search tier solr. These all scale fairly easily because their assets are fixed. Or almost so.

The database tier is different. So what affects performance of a database server? Server size? Main memory? Disk speed? The truth is all of those. But

Also check out: The Sexiest New Feature of AWS Speeds Up EBS

After you setup the server – set memory settings and so forth, it’s a fairly fixed object. True there are parameters to tweak but on the whole there isn’t a ton of day-to-day tuning to do.

Well if that’s true, why does performance take a hit?! As applications grow, the db server slows down, don’t we need to tweak server settings? Do we need new hardware?

Read this: A CTO Must Never Do This

The answer is possibly, but 9 times out of 10 what really needs to happen is queries must be tuned.

[quote]
In 17 years of consulting that is the single largest cause of scalability problems. Fix those queries and your problems are over.
[/quote]

The Elephant in the Room – Query Tuning

I was talking with a colleague today at AppNexus. He said, so should we do some of that work inside the application, instead of doing a huge UNION or a large JOIN? I said yes you can move work onto the application, but it makes the application more complex. On the flip side the webserver tier is easier to scale. So there are tradeoffs.

I said this:

[quote]
By and large, if scalability is our goal, we should work to quiet the activity in the slow query log. This is an active project for developers & DBAs. Keep it quiet and your server will run well.
[/quote]

Also: Top MySQL DBA Interview Questions for Candidates, Hiring Managers & Recruiters

Yet I still talk to teams where this is mysterious. It’s unclear. There’s no conviction there. And that’s where I think DBAs are failing. Because this is our subject matter expertise, and if we haven’t convinced developer teams of this, we’re not working together enough. API teams aren’t separate from DBA and operations. Siloing technology departments is a killer…

[mytweetlinks]

As you roll out new code, if some queries show up, then those need attention. Tweak the code until the queries drop out. This is the primary project of scalability.

When should I think about upgrading hardware?

If your code is stable, but you’re seeing a steady line rising on load average of the server, *THEN* go up in hardware. Load average means cpu & disk are being taxed. The server can’t keep up.

Related: Should I use RDS or build a MySQL server on AWS?

Devops means work together!

I close with a final point. Devops means bring dev & ops together! Don’t silo them off in different wings. Communicate. DBAs it’s your job to educate Developers about scalability and help with query tuning. Devs, profile new SQL code, test with large datasets & for god sakes don’t use an ORM – it’s one of 5 things toxic to scalability. Run explain and be sure to index all the right columns.

Together we can tackle this scalability thing!

Get some in your inbox: Exclusive monthly Scalable Startups. We share tips and special content. Here’s a sample

MySQL requires an authoritative master to build slaves

In MySQL database operations, you often need to rebuild slaves. They fail for a lot of different reasons, fall out of sync or crash. When this happens you may find you need to reclone and start fresh. This is normally done by finding your authoritative master database, and doing a hotbackup.

Click through to the end for multi-master solutions that work with MySQL.

Reason 4 – You cannot reclone without single master

If you use active-active multi-master replication, you no longer have a single authoritative master copy of all your data. That means you have no reliable master to clone from, if a replica breaks.

If you’re an operational dba, you probably know that MySQL replication breaks regularly. You always need to be able to go back to a fresh copy when things get mixed up on the slaves.

NEXT: Reason 5 – MySQL row-based replication has limitations

PREV: Reason 3 – Limitations of MySQL row-based replication

Want more? Grab our Scalable Startups monthly for more tips and special content. Here’s a sample

Limitations of MySQL row-based replication

MySQL offers a few different options for how you perform replication. Statement-based has been around a lot longer, and though it has some troublesome characteristics they’re known well and can be managed. What’s more it supports online schema changes with multi-master active-passive setup. We recommend this solution.

Row-based replication is newer. It attempts to address problems like those introduced by non-deterministic functions, and replicating stored procedures. But it introduces it’s own challenges.

Click through to the end for multi-master solutions that work with MySQL.

3. Row-based replication limitations

Row-based replication addresses some of the limitations of statement based replication.

o works better with stored procedures
o reduces problems associated with non-deterministic functions

But it creates a few of it’s own, some are show-stoppers:

o won’t work if target table storage engine, column order, data types or row itself are different or missing.
o doesn’t write SQL to the binlogs – useful for troubleshooting
o harder to do point-in-time recovery without SQL in binlogs
o harder to do online schema changes by switching masters

NEXT: Reason 4 – Cannot reclone without a single master

PREV: Reason 2 – MySQL Replication is prone to failure

Want more? Grab our Scalable Startups monthly for more tips and special content. Here’s a sample

10 ways I avoid trouble in database operations

1. Avoid destructive commands

From time to time I’m working with new recruits and bringing them up to speed in operations. The first thing I emphasize is care with destructive commands.

What do I mean here? Well there are all sorts of them. SQL commands such as DROP table & DROP database. But also TRUNCATE and DELETE are all destructive. They’re easy to execute but harder to undo. Think of all the steps it would take to restore from your backup.

If you are logged in as root there are many many ways to shoot your own foot. I hope you know this right? rm has lots of options that can be very difficult to step back from like -r (recursive) and -f (force). Better to not use the command at all and just move the file or directory you’re working on by renaming it. You can always delete later.

2. Set your command prompts

When working on the command line, your prompt is crucial. You check it over and over to make sure you’re working on the right box. At the OS, your prompt can tell you if you’re root or not, what directory you’re sitting in, and what’s the hostname of the box. With a few different terminals open, it’s very easy to execute a heavy loading command or destructive command on the wrong box. Check thrice, cut once!

You can also set your mysql prompt too. This can provide similar insurance. It can tell you the database schema you’re set at default, and the user you’re logged in as. Hostname or localhost too. It is one more piece in the risk aversion puzzle.

3. Perform backups & test them

I know I know, we’re all doing backups already. Well I sure hope so. But if you’re getting on a system for the first time, it should be your very initial impulse to check and find out what types of backups are being done. If they’re not, you should set them up. I don’t care how big the database is. If it’s an obstacle, you need to sell or educate management on what might happen if. Paint some ugly scenarios. It’s not always easy to see urgency in these things without a good war story or two.

We wrote a guide to using xtrabackup for hotbackups. These can be done online even while your production database is serving customers without table locking or other downtime.

4. Stay off production machines

This may sound funny to some of you, but I live by it. If it ain’t broke, don’t go and try to fix it! You don’t need to be on all these boxes all the time. That goes for other folks too. Don’t give devs access to every production box. Too many hands in the pie so to speak. Also limit root users. But again if those systems are running well, you don’t have to login to them and poke around every five minutes. This just brings more chances for operator error.

5. Avoid change as much as possible

This one might sound controversial but it’s saved me more than once.

I worked at one firm a few years back managing the MySQL servers. The Oracle DBA was going on vacation for a few weeks so I was picking up the reigns for a bit. I met with the DBA for some brain dump sessions, and he outlined the main things that can and do go wrong. He also asked that I avoid any table alterations.

Sure enough ten days into his vacation, a problem arose in the application. One page on the site was failing silently. There was a missing field which needed to be added. I resisted. A fight ensued. Suddenly a lot of money was at stake if this change wasn’t pushed through. I continued to resist. I explained that if such a change were not done correctly, it very likely would break replication, pushing a domino of other things to break and causing an unpredictable mess.

I also knew I only had to hold on for a few more days. The resident dba would be returning and he could juggle the change. You see Oracle was setup to use multi-master replication those changes needed to go through a rather complex process to be applied. Done incorrectly the damage would have taken days to cleanup and caused much more financial damage.

The DBA was very thankful at my resistance and management somewhat magically found a solution to the application & edit problem.

Push back is very important sometimes.

[quote]
Many of these ten tips are great characteristics to select for in the DBA hiring process. If you’re a candidate, emphasize your caution and track record with uptime. If you’re a manager, ask candidates about how they handle these situations. We wrote a MySQL DBA hiring guide too.

[/quote]

6. Monitor important things

You should monitor your OS syslog and MySQL error log for starters. But also your slow query log for new activity, analyze them and send the reports along to devs. Provide analysis. Monitor your partitions. You don’t ever want disks to fill up. Monitor load average, and have a check that the database login or some other simple transaction can succeed. You can even monitor your backups to make sure they complete without error. Use your judgement to decide what checks satisfy these requirements.

7. Use one or more slaves & checksum

MySQL slave databases are a great way to provide insurance. You can use a lagging slave to provide insurance against operator error, or one of those destructive commands we mentioned above. Have it lag a few hours behind so you’ll have that much insurance. At night this slave may be fresh enough to use for backups.

Also since mysql uses statement based replication, data can get out of sync over time. Those problems may or may not flag errors. So use a tool to compare your master and slave for data consistency. We wrote a howto on using checksums to do just that.

8. Be very careful of automatic failover

Automation is wonderful when it works. We dream of a data center that works like clockwork, with robots that never sleep. We can work towards this ideal, and in some cases get close. But it’s important to also understand that failure is by nature *not* what we predicted. The myriad ways that complex systems can fail boggles the mind, and surprises even seasoned veterans of operations. So maintain a heathy suspicion of this type of automation. Understand that if you automate things to happen in this crucial time, you can potentially put yourself in an even *more* compromised position than simply failing.

Sometimes monitoring, alerting, and manual intervention are the more prudent path. Your mileage may vary of course.

9. Be paranoid

It takes many years of doing ops to realize you can never be paranoid enough. Already checked that you’re on the right host, and about to execute some command? Quit the shell prompt and check again. Go back and ask the team if that table really needs to be dropped. Try to rephrase what you’re about to do in different words. Email out again to the team and wait some time before you pull the trigger. Check one more time that you have a fresh backup.

Delay that destructive command as long as you possibly can.

10. Keep it simple

I know I know, we all want to use that new command or tool, or jump on the latest hardware and take it for a spin. We want to build beautiful architectures that perform great feats of magic. But the fewer moving parts, the less things that can go wrong. And in ops, your job is stability and availability. Can you avoid using multi-master replication and go with just basic master-slave replication in MySQL? That’s simpler. Can you have fewer schemas or fewer filter rules? Can you skip the complicated HA layer, and use monitoring and manual failover?

Made it this far? Grab our newsletter.

Accidental DBA's Guide to MySQL Management

problem solvingSo you’ve been tasked with managing the MySQL databases in your environment, but you’re not sure where to start.  Here’s the quick & dirty guide. Oh yeah, and for those who love our stuff, take a look to your right. See that subscribe button? Grab our newsletter!

1. Installation

The “yum” tool is your friend.  If you’re using debian, you’ll use apt-get but it’s very similar. You can do a “yum list” to see what packages are available. We prefer to use the Percona distribution of MySQL.  It’s fully compatible with stock MySQL distribution, but usually a bit ahead in terms of tweak and fixes.  Also if you’re not sure, go with MySQL 5.5 for new installations.

$ rpm -Uhv http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm
$ yum install Percona-Server-client-55
$ yum install Percona-Server-shared-55
$ yum install Percona-Server-shared-compat
$ yum install Percona-Server-server-55

The last command will create a fresh database for you as well.

Already have data in an existing database? Then you can migrate between MySQL and Oracle.

2. Setup replication

MySQL replication is a process you’ll need to setup over and over again. It’s statement based in MySQL. A lot of INSERT, UPDATE, DELETE & CREATE statements are transferred to the slave database, and applied by a thread running on that box.

The steps to setup are as follows:

A. lock the primary with FLUSH TABLES WITH READ LOCK;

B. issue SHOW MASTER STATUS and note the current file & position

C. make a copy of the data. You can dump the data:

$ mysqldump -A --single-transaction > full_primary.mysql

Alternatively you can use xtrabackup to take setup replication without locking!

D. copy the dump to the slave database (scp works, but rsync is even better as it can restart if the connection dies).

E. import the dump on the slave box (overwrites everything so make sure you got your boxes straight!)

$ mysql < full_primary.mysql

F. point to the master

mysql> change master to
> master_user='rep',
> master_password='rep',
> master_host='10.20.30.40',
> master_log_file='bin-log.001122',
> master_log_pos=11995533;

G. start replication & check

mysql> start slave;
mysql> show slave statusG;

You should see something like this:

Slave_IO_Running: Yes
Slave_SQL_Running: Yes

3. Analyze slow query & tune

If you’re managing an existing MySQL database and you hit a performance blip, it’s likely due to something that has changed. You may be getting a spike in user traffic, that’s new! Or you may have some new application code that has been recently deployed, that’s new SQL that’s running in your database. What to do?

If you haven’t already, enable the slow query log:

mysql> set global slow_query_log=1;
mysql> set global long_query_time=0.50;

Now wait a while. A few hours perhaps, or a few days. The file should default to

/var/lib/mysql/server-slow.log

Now analyze it. You’ll use a tool from the percona toolkit to do that. If you haven’t already done so, install the percona toolkit as well.

$ yum install percona-toolkit
$ pt-query-digest /var/lib/mysql/server-slow.log > /tmp/server-report.txt

Once you’ve done that “less” the file, and review. You’ll likely see the top five queries account for 75% of the output. That’s good news because it means less query tuning. Concentrate on those five and you’ll get the most bang for your buck.

Bounce your opinions about the queries off of the developers who build application code. Ask them where the code originates. What are those pages doing?  Check the tables, are there missing indexes? Look at the EXPLAIN output. Consider tuning the table data structures, multi-column or covering indexes. There is typically a lot that can improve these troublesome queries.

4. Monitoring command line tools

You’ll want to have a battery of day-to-day tools at your disposal for interactive monitoring of the database.  Don’t go overboard. Obsessive tuning means obsessively turning knobs and dials. If there are no problems, you’re likely to create some.  So keep that in mind.

innotop is a “top” like utility for monitoring what’s happening inside your little database universe.  It’s probably already available through yum and the “epel” repository:

$ yum install innotop

First edit the .my.cnf file and add:
[client]
user=root
password=mypw

From there you should be able to just fire up innotop without problems.

mysqltuner is a catch all tool that does a once over of your server, and gives you some nice feedback.  Get a copy as follows:

$ wget mysqltuner.pl

Then run it:
$ chmod +x mysqltuner.pl
$ ./mysqltuner.pl

Here are a couple of useful mysql shell commands to get database information:

mysql> show processlist;
mysql> show innodb status;
mysql> show status;

There is also one last tool which can come in handy for reviewing a new MySQL server. Also from percona toolkit, the summary tool. Run it as follows:

$ pt-summary

5. Backups

You absolutely need to know about backups if you want to sleep at night. Hardware and database servers fail, software has bugs that bite. And if all that doesn’t get you, people make mistakes. So-called operator error will surely get you at some point. There are three main types:

A. cold backups

With the database shutdown, make a complete copy of the /var/lib/mysql directory, along with perhaps the /etc/my.cnf file. That together amounts to a cold backup of your database.

B. hot backups

There has been an enterprise tool for MySQL that provides this for some time. But we’re all very lucky to also have the open source Percona xtrabackup at our disposal. Here’s a howto using it for replication setup.

C. logical backups

These will generate a file containing all the CREATE statements to recreate all your objects, and then INSERT statements to add data.

$ mysqldump -A > my_database_dump.mysql

6. Review existing servers

The percona toolkit summary tool is a great place to start.

$ pt-summary

Want to compare the my.cnf files of two different servers?

$ pt-config-diff h=localhost h=10.20.30.40

Of course you’ll want to review the my.cnf file overall. Be sure you have looked at these variables:

tmp_table_size
max_head_table_size
default_storage_engine
read_buffer_size
read_rnd_buffer_size
sort_buffer_size
join_buffer_size
log_slow_queries
log_bin
innodb_log_buffer_size
innodb_log_file_size
innodb_buffer_pool_size
key_buffer_size (for MyISAM)
query_cache_size
max_packet_size
max_connections
table_cache
thread_cache_size
thread_concurrency

7. Security essentials

The output of the pt-summary and mysqltuner.pl scripts should give you some useful information here. Be sure to have passwords set on all accounts. Use fewer privileges by default, and only add additional ones to accounts as necessary.

You can use wildcards for the IP address but try to be as specific as possible. Allow for a subnet, not the whole internet ‘10.20.30.%’ for example instead of just ‘%’.

Also keep in mind that at the operating system or command line level, anyone with root access can really mess up your database. Writing to the wrong datafile or changing permissions can hose a running database very quickly.

8. Monitoring

Use a monitoring system such as Nagios to keep eye on things.  At minimum check for:

A. connect to db
B. server load average
C. disk partitions have free space
D. replication running – see above IO & SQL running status messages
E. no swapping – plenty of free memory

9. Ongoing maintenance

Periodically it’s a good idea to review your systems even when they’re running smoothly. Don’t go overboard with this however. As they say if it ain’t broke, don’t fix it.

A. check for unused & duplicate indexes
B. check for table fragmentation
C. perform table checks (if using MyISAM)

10. Manage the surprises

MySQL is full of surprises. In the Oracle world you might be surprised at how arcane some things are to setup, or how much babysitting they require. Or you might be surprised at how obscure some tuning & troubleshooting techniques are. In the MySQL world there are big surprises too. Albeit sometimes of a different sort.

A. replication checksums

One that continues to defy my expectations are those surrounding replication. Even if it is running without error, you still have more checking today. Unfortunately many DBAs don’t even know this!  That’s because MySQL replication can drift out of sync without error. We go into specific details of what things can cause this, but more importantly how to check and prevent it, by bulletproofing MySQL with table checksums.

B. test & confirm restores of backups

Spinup a cloud server in Amazon EC2, and restore your logical dump or hotbackup onto that box. Point a test application at that database and verify that all is well. It may seem obvious that a backup will do all this. But besides the trouble when a filesystem fills up, or some command had the wrong flag or option included. There can be even bigger problems if some piece or section of the database was simply overlooked.  It’s surprising how easy it is to run into this trouble. Testing also gives you a sense of what restore time looks like in the real world. A bit of information your boss is sure to appreciate.

If you made it this far, you know you want to grab the newsletter.

RDS or MySQL – Ten Use Cases

Amazon’s Relational Database Service is based on MySQL under the hood.  So many colleagues and clients ask me – should I go with RDS or MySQL? As with every technology question, the answer is – it depends.

Here are some scenarios to help you decide.

  1. I’m replicating into Amazon from a physical datacenter
  2. A: This setup is common if you’re using Amazon’s VPC or Virtual Private Cloud.  With a router dropped into your datacenter, VPC allows you to extend and spinup virtual instances from Amazon as if they’re sitting in your own existing datacenter.  Great stuff, but you won’t be able to replicate from your existing master MySQL instance to cloud Amazon RDS instances.  To do that, roll your own with MySQL 5.5 or Percona 5.5.  RDS can only work with other RDS instances.  At least for right now.

  3. I’m using Amazon and replicating to another cloud provider
  4. A: Again roll your own, as RDS won’t work with this configuration.

  5. I’m developer or small startup building a new application.
  6. A: Use RDS.  It’s easier to deploy, and though there are fewer levers and dials you can control, you can always dump your data and move it to roll your own deployment later.

  7. We’re a small shop without a fulltime DBA
  8. A: Use RDS as it is easier to deploy.  You’ll have a couple of replication options at your disposal.  There are read replicas which are basic slaves using MySQL’s built-in asyncronous replication.  You can scale horizontally with these read-only copies, but cannot failover to these instances.  For that you’ll need a multi-az configuration where you deploy across availability zones.  Multi-az is said to be syncronous so it’s likely built on top of a distributed filesystem such as DRBD.

    Keep in mind though there is no free lunch. RDS lacks the slow query log, which is the primary way you’ll identify errant queries, and fix them. You’ll need to be extra vigilant about QA & Test as code deploys add new SQL to your application. There *IS* a way to log slow queries to a table, which is ok for off and on use. However there’s quite some overhead to this feature being turned on all the time.

  9. We want to do master-master for easy failover and failback
  10. A: Use MySQL 5.5 or Percona 5.5.  The two configurations for RDS don’t support this.  With AWS Read Replicas you have one master and multiple read-only slaves, but no failover.  With Multi-AZ you can’t access the inactive secondary database *until* you failover.  Once you failover that instance becomes the primary.

  11. I’m concerned about clean data. Is replication bulletproof?
  12. A: No not out of the box.  Whether you are using RDS or MySQL there are various scenarios where MySQL slaves can drift out of sync from the master, without throwing an error. This likely impacts Read Replicas though it’s hard to say if it affects Multi-AZ.

    If you’re concerned about data integrity with MySQL replication (and you should be!) take a look at our Guide to MySQL Integrity Checking with Percona Toolkit.

  13. Our use case is to scale reads with multiple instances
  14. A: RDS can do this handily with Read Replicas.  As long as your requirements are for vanilla replicas, this will work well for you.

    To learn more about this checkout our Guide to Autoscaling MySQL on Amazon.

  15. I don’t want to be beholden to Amazon for my database
  16. A: Well then your choice is simpler.  Locking in with Amazon RDS means when you hit a bug, your hands are a bit tied.  Amazon’s DNA is as a DIY infrastructure provider, and although they’ve added support contracts to the mix, they’re not a Rackspace.  At least not yet.

    So there’s a tradeoff.  Go with a roll your own solution where you have control over all the nuts and bolts of your technology or go with something more managed.  RDS is a managed database solution, so to the extent things are automated, your hands are also tied when you hit a snag.

  17. We want to use an alternate MySQL distribution
  18. A: Again if you want to go with a Percona, MariaDB or Drizzle, you’re going to be using a roll your own distribution.

  19. We want to use an exotic replication topology

A: If you’re in this camp, you probably already know RDS isn’t going to support you.  MySQL’s replication technology can support a myriad of configurations. Here are a few that might work for you:

  • Master-master or Circular Replication
  • Distribution Master
  • Tree Replication Setup
  • Multi-source Replication
  • Logging Only Slave Server
  • Slaving off of a Slave

Ten things to remember about MySQL backups

  1. Use Hot Backups
  2. Hot backups are an excellent way to backup MySQL.  They can run without blocking your application, and save tons on restore time.  Percona’s xtrabackup tool is a great way to do this.  We wrote a how-to on using xtrabackup for hotbackups.

  3. Use Logical Backups
  4. Just because we love hot backups using xtrabackup doesn’t mean mysqldump isn’t useful.  Want to load data into Amazon RDS?  Want to isolate and load only one schema, or just one table?  All these great uses make mysqldump indispensable.  Use it in combination with periodic hot backups to give you more recovery options.

  5. Replication isn’t a backup
  6. While replication provides a great way to keep a hot copy of your production database, it’s not the same as a backup.  Why?  Operator error, that’s why!  People make mistakes, drop tables and database schemas that later need to be restored.  This can and will happen, so head off the disaster by doing real backups.

    As an additional note, if you’re using replication, you surely want to perform regular checksums of your data.  These ensure that the primary and secondary do indeed contain the same data.

  7. Firedrills & Restore Time
  8. The only way to be sure your backup is complete is to test restoring everything.  Yes it’s a pain, but it will inevitably be a learning experience.  You’ll document the process to speed it up in future tests, you’ll learn how long recovery takes, and find additional pieces to the pie that must be kept in place.  Doing this in advance of d-day is

    Different backups have different recovery times.  In the industry vernacular, your RTO or recovery time objective should inform what will work for you.  Although a mysqldump may take 30 minutes to complete, your restore of that data might take 8 hours or more.  That’s due in part to rebuilding all those indexes.  When you perform the dump one create index statement is formulated from the data dictionary, but on import the data must be sorted and organized to rebuild the index from scratch.  Percona’s mysqldump utility will capitalize on MySQL’s fast index rebuild for Innodb tables.  According to the Percona guys this can bring a big improvement in import time.  Yet another great reason to use the Percona distro!

  9. Transaction Logs
  10. If you want to be able to do point in time recovery, you’ll need all the binlog files as well.  These are being created all the time, while new transactions are completed in your database. If your last backup was last night at 3am, and you want to recovery today until 3pm, you’ll need all the binary logs from the intervening hours to apply to that backup.  This process is called point-in-time recovery, and can bring your database restore up to the current commited transactions.

  11. Backup Config Files
  12. Don’t forget that lonely /etc/my.cnf file.  That’s an important part of a backup if you’re rebuilding on a newly built server.  It may not need to be backed up with the same frequency, but should be included.

  13. Stored Code & Grants
  14. Stored procedures, triggers and functions are all stored in the mysql database schema.  If you are doing a restore of just one database schema, you may not have this, or it may make the restore more complicated.  So it can be a good idea to backup code separately.  mysqldump can do this with the –routines option.  Hot backups by their nature, will capture everything in the entire instance – that is all database schemas including the system ones.

    Grants are another thing you may want to backup separately.  For the same reasons as stored code, grants are stored in the system tables.  Percona toolkit includes a nice tool for this called pt-show-grants.  We recommend running this periodically anyway, as it’ll give you some perspective on permissions granted in your database.  You’re reviewing those right?

  15. Events & Cronjobs
  16. MySQL allows the running of events inside the database.  SHOW EVENTS or SHOW EVENTS schema_name will display the events scheduled.

    You may also have cronjobs enabled.  Use crontab -l to display those for specific users.  Be sure to check at least “mysql” and “root” users as well as other possible application users on the server.

  17. Monitoring
  18. Backups are a nit picky job, and often you don’t know if they’re complete until it’s time to restore.  That’s why we recommend firedrills above, and they’re very important.  You can also monitor the backups themselves.  Use an error log with mysqldump or xtrabackup, and check that logfile for new messages.  In addition you can check the size of the resulting backup file.  If it has changed measurably from the recent backup sizes, it may indicate problems.  Is your backup size 0, something serious is wrong.  Half the size of recent ones, it may have failed halfway through, or the filesystem filled up.

  19. Security
  20. This is often overlooked area, but may be a concern for some environments.  Is the data contained in your backup sensitive?  Consider where the backups are stored and retained for long term.  Reason who has access to those files, and make use of the least privileges rule.

Like our stuff? Don’t forget to grab our newsletter!

Oracle to MySQL – prepare to bushwhack through the open source jungle

oracle to mysql

I was recently approached by a healthcare company for advice on suitable database solutions capable of executing its new initiative. The company was primarily an Oracle shop so naturally, they began by shopping for possible Oracle solutions.

The CTO relayed his conversation with the Oracle sales rep, who at first recommended an Oracle solution that, expensive as it may have been, ultimately aligned with the company’s existing technology and experience. Unfortunately this didn’t match their budget and so predictably, the Oracle sales rep whipped out a MySQL-based solution as an alternative.

Having worked as an Oracle DBA throughout the dot-com years, I know the technology well. I also know the cultural differences between enterprises that choose Oracle solutions and those that choose open-source ones.

This encounter with the healthcare firm struck me as a classic conundrum for today’s companies who are under pressure to meet business targets under a tight budget, and in a very short time.

Can an open-source solution like MySQL be the answer to such huge demands?

The Oracle sales rep will likely nod excitedly and say no sweat. But as a consultant I could only manage an equivocal yes.

As the healthcare CTO rattled off the list of products he wanted to use, specific RTOs and RPOs (recovery time objective + recovery point objective – all I could think was to react with concern.

In my experience with startup after startup I’ve seen plenty of different MySQL installations but I’d never heard of one with the technology stack he described. What’s more I’d never heard of these solutions described with the Oracle Corp titles.

On one hand I wanted to discuss the merits of the solution he was keen to implement, while on the other, I was expressing concern over possible directions and paths we might take.

An Oracle cluster is not a MySQL cluster

The solution Oracle suggested was a MySQL Cluster. The term cluster unfortunately means different things to different people. Such loose usage of the word dilutes its meaning. In particular a lot of Oracle technologists expect that this solution might be similar to Oracle’s Real Application Cluster technology. It’s not. There are a lot of limitations, and frankly it’s really just a different beast.

The list also included various management dashboards which Oracle likes to push, but which I rarely see in my consulting assignments. What’s more I heard nothing about replication integrity considering that replication problems are an ongoing concern for real-world MySQL installations due to the particular technology used under the hood. There are reliable solutions to this problem but none yet available from Oracle. In fact, this is a big problem but one that may be completely off the sales guys’ radar.

Don’t let sales frame your architecture

Honestly, I don’t have a particularly large axe to grind with the sales guys. They have a job to do, and providing solutions which bring revenue to their firm and commissions for themselves is what puts food on their tables. Each party is motivated in different ways. But as a company shopping for solutions, this should be kept clearly in mind when starting down that road.

Beware prescribed architectural frameworks that appear too easy because they almost always don’t do what they say on the tin. Unfortunately sales folks don’t have experiencing designing architectures in the real world, so they can’t really know how the technologies work beyond the data sheet with feature bullet points.

As we all know in the technology space, all software come with bugs and real-world experience does not match the feature lists in the brochures. In law they have de jure and de facto. The former describes what is written and the latter, what’s practiced. For technology solutions, its never just adding water for something to work.

Do your homework

Before you embark on a new trip through the open source technology jungle, do some due diligence. Read up on real-world solutions, and how other large firms are using the technology. What configurations are they having success with? Which are causing trouble for a lot of people.

One of the great advantages of open-source are the very vibrant communities, forums and discussion groups where people are glad to share their experiences and offer advice.

Allow sufficient time to test and
bring your team up to speed

This is very important one. Shifting from an enterprise that relies primarily on Oracle for it’s relational database solution over to one that relies on open source technologies is a very big step indeed. Open-source technologies tend to be much more do-it-yourself and roll your own. Oracle solutions tend much more toward predefined paths and solutions and prescriptions for customers.

There are merits to each of these paths, with attendant pros and cons. But they are decidedly different. It’s likely that your team will also require time to get up to speed, not just with the particular software components, but with the new process by which things happen in the open-source space. Allow sufficient time for this shift to take place, lest you create more problems than solutions.

Top MySQL DBA interview questions (Part 2)

Also find Sean Hull’s ramblings on twitter @hullsean.

Continuing from our Top MySQL DBA interview questions (Part 1) here are five more questions that test a MySQL DBA’s knowledge, with two that will help suss out some personality traits.

  1. Disk I/O
  2. Disk performance should be an ever present concern to a DBA.  So although they don’t need to be a storage specialist, they should have a working knowledge.  Ask them about RAID versions, mirroring versus striping, and so forth.  Mirroring combines two disks as a unit.  Every write is duplicated on both disks.  If you lose one disk, you have an immediate copy.  Like a tandem truck that has spare tires running in parallel.  Lose one, and you don’t have to pull over immediately to replace it.  Striping spreads I/O over multiple disks so you on the one hand increase throughput linearly as you add disks.  That’s because you have more disks working for you.  At the same time you increase risk with each new disk you add, because the failure rate is then the sum total of all those disks.

    For relational databases the best RAID level is 10, which is striping over mirrored sets.  You use more disks, but disks are cheap compared to the hassle of any outage.

    If you’re deploying on Amazon, your candidate should be familiar with the Elastic Block Storage offering also known as EBS.  This is virtualized storage, so it introduces a whole world of operational flexibility.  No longer do you have to jump through hoops to attach, add or reconfigure storage on your servers.  It can all be done through command line API calls.  That said EBS suffers from variability problems as with any other shared resource.  Although Amazon guarantees your average throughput, the I/O you get at a given time can swing wildly from low to high.  Consider Linux software RAID across multiple EBS volumes to mitigate against this.

  3. How would you setup master/slave & master/master replication?
  4. A basic replication setup involves creating a full dump of the primary database, while it’s tables are locked.  The DBA should capture the master status, logfile & position at that time.  She should then copy the dump file to the secondary machine & import the full dump.  Finally the CHANGE MASTER TO statement should be run to point this database instance to it’s master.  Lastly START SLAVE should be issued.  If all goes well SHOW SLAVE STATUS should show YES for both of these status variables:

    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes

    Master-master replication is similar, except one additional step.  After the above steps have run, you know that your application is not pointing at the slave database.  If you’re not sure, verify that fact first.  Now determine the logfile name & position on the slave with SHOW MASTER STATUS.  Return to the primary box, and run the CHANGE MASTER TO command to make it slave from the secondary box.  You’ve essentially asked MySQL to create a circular loop of replication.

    How does MySQL avoid getting into an infinite loop in this scenario?  The server_id variable must be set, and be unique for all MySQL instances in your replication topology.

    For extra credit, ask the candidate about replication integrity checking.  As important as this piece is to a solid reliable replication setup, many folks in the MySQL world are not aware of the necessity.  Though replication can be setup, and running properly, that does not mean it will keep your data clean and perfect.  Due to the nature of statement based replication, and non-deterministic functions and/or non-transactional tables, statements can make their way into the binary logs, without completing.  What this means is they may then complete on the slave, resulting in a different row set on the same table in master & slave instance.

    Percona’s pt-table-checksum is the preventative tool to use.  It can build checksums of all your tables, and then propagate those checksums through replication to the slave.  An additional check can then be run on the slave side to confirm consistency, or show which rows & data are different.

  5. How are Users & Grants different in MySQL than other DBs?
  6. Creating a grant in MySQL can effectively create the user as well.  MySQL users are implemented in a very rudimentary fashion.  The biggest misunderstanding in this area surrounds the idea of a user.  In most databases a username is unique by itself.  In MySQL it is the *combination* of user & hostname that must be unique.  So for example if I create user sean@localhost, sean@server2 and sean@server3, they are actually three distinct users, which can have distinct passwords, and privileges.  It can be very confusing that sean logging in from the local command line has different privileges or password than sean logging in from server2 and server3.  So that’s an important point.

  7. How might you hack a MySQL server?
  8. This is a good opportunity for the candidate to show some creativity with respect to operations and Linux servers.  There are all sorts of ways into a database server:

    a.bad, weak or unset passwords
    b.files with incorrect permissions – modifying or deleting filesystem files can take a database down or corrupt data
    c.intercepting packets – could reveal unencrypted data inside the database
    d.unpatched software – bugs often reveal vulnerabilities that allow unauthorized entry
    e.moving, disabling or interrupting the backup scripts – a possible timebomb until you need to restore
    f.DNS spoofing, could allow login as a different user
    g.generous permissions – may allow an unprivileged user access to protected data

    There are endless possibilities here.  Listening for creative thinking here, reveals how much that person will think thoroughly and effectively about protecting your systems from those same threats.

  9. Brain teasers, riddles and coding problems
  10. Google for a long time was a fan of these types of tests at interviews, but I’m not at all.  For one thing you filter for good test takers, and for another the candidate has no resources – either books or the internet at their disposal.

    Why not instead ask them to tell a story.  Storytelling conveys a lot of things.  It conveys a bit of teaching ability, which extends far beyond internalizing some multiple choice questions.  It tells you more about their personality, which as I’ve said is very important.  It shows how they solve problems, as they’ll take you through their process.  And gives them an opportunity to tell you a real world triumph they presided over.

Personality Questions

In my experience, some of the most important traits of a new hire center around personality traits, and how they might mix with your existing team.  Being punctual for an interview, for instance sets a precedent for many things.  But that door swings both ways, so if you want to hire these types of folks, don’t keep them waiting either!

Pay attention to whether or not the candidate takes some lead in the conversation at all.  This can indicate the person is a self starter.  Obviously a great candidate will also listen carefully and patiently to what you have to say, but may then take the ball and run with it somewhat.

Listen for signals that the person is active in the field, posting on forums, and attending conferences, meetups and forums on technology topics.  You might also ask them if they blog, and what topics interest them.

Top MySQL DBA interview questions (Part 1)

MySQL DBA interview questions

Also find Sean Hull’s ramblings on twitter @hullsean.

MySQL DBAs are in greater demand now than they’ve ever been. While some firms are losing the fight for talent, promising startups with a progressive bent are getting first dibs with the best applicants. Whatever the case, interviewing for a MySQL DBA is a skill in itself so I thought I’d share a guide of top MySQL DBA interview questions to help with your screening process.
It’s long and detailed with some background to give context so I will be publishing this in two parts.

The history of the DBA as a career

In the Oracle world of enterprise applications, the DBA has long been a strong career path. Companies building their sales staff required Peoplesoft or SAP, and those deploying the financial applications or e-business suite needed operations teams to manage those systems. At the heart of that operations team were database administrators or DBAs, a catchall title that included the responsibility of guarding your businesses crown jewels. Security of those data assets, backups, management and performance were all entrusted to the DBA.

In the world of web applications, things have evolved a bit differently. Many a startup are driven only by developers. In those smaller shops, Operations tasks are designated to one developer who takes on the additional responsibility of managing systems. In that scenario, Operations or DBA duties becomes a sort of secondary role to the primary one of building the application. Even in cases where the startup creates a specific operations role with one person managing systems administration, chances are they don’t also have DBA experience. Instead, these startups are more likely to manage the database as a typical Linux application.

When I grow up I (don’t) want to be a MySQL DBA

Where do they come from, and why don’t a lot of computer science folks gravitate towards operations, and DBA? This may be in part due to and the romance of certain job roles which we discussed in a past article, The Mythical MySQL DBA. This pattern appeared a lot in the Oracle world as well. Many folks who were career DBAs actually moved to that role from the business side. In fact you’d find that many didn’t have a computer science or engineering background in the first place. In my experience I saw many Linux and Unix administrators with a stronger foundation who would fit into the DBA role but were simply not interested in it. The same can be said of the MySQL side of the house. Computer science grads don’t get out of school aiming for a career in ops or as a DBA because it has never been regarded as the pinnacle. It’s typically the PROGRAMMERS who become the rockstars in a cool startup.

But as the Internet grows into a richer and more complex medium, things are changing. People talk about scalability, high availability, zero downtime and performance tuning. When brief outages cost millions in losses expectations are very high and that requires skilled, experienced DBAs.

We’ve made a list of comprised of skill questions, general questions and ‘good-to-know’ questions. Have fun grilling your candidate with them, although bear in mind that with interviews it’s not about knowing it all, rather how the person demonstrates critical thinking skills.

Skills Questions

  1. Why are SQL queries so fundamental to database performance?
  2. This is the one question which a DBA should have an answer to. If they can’t answer this question, they’re unlikely to be a good candidate.

    After a MySQL server is setup and running, with many of the switches and dials set to use memory, and play well with other services on the Linux server, queries remain an everyday challenge. Why is this?

    SQL queries are like little programs in and of themselves. They ask the database server to collect selections of records, cross tabulate them with other collections of records, then sort them, and slice and dice them. All of this requires MySQL to build temporary tables, perform resource intensive sorts and then organize the output in nice bite size chunks.

    Unfortunately there are many ways to get the syntax and the results right, yet not do so efficiently. This might sound like a moot point, but with modern websites you may have 5000 concurrent users on your site, each hitting pages that have multiple queries inside them.

    What makes this an ongoing challenge is that websites are typically a moving target, with business requirements pushing new code changes all the time. New code means new queries, which pose ongoing risks to application stability.

  3. Indexes – too many, too few; what’s the difference?
  4. Indexes are very important to the smooth functioning of a relational database. Imagine your telephone book of yore. I can look up all the people with last name of “Hull” in Manhattan because I have the proper index. But most yellow pages don’t include an index for *first* names even though they might occaisionally come in handy, for example with the names “Star” or “Persephone”.

    You can imagine that, if you had a phone book which you maintain and update, everytime you add or remove a name you also have to update the index. That’s right, and the same goes for your relational database.

    So therein lies the trade off, and it’s an important one. When you are *modifying* your data, adding, updating or removing records, you must do work to keep the index up to date. More indexes mean more work. However when you’re looking up data or *querying* in SQL speak, more indexes mean more ways of looking up data fast. One more trade off is that indexes take up more pages in your phonebook, and so too they take up more space on disk.

  5. Backup & Recovery – explain various types & scenarios for restore
  6. Backups come in a few different flavors that the DBA should be familiar with.

    Cold backups involve shutdown down the database server (mysqld) and then backing up all the data files by making a copy of them to another directory. To be really thorough, the entire datadir including binlogs, log files, /etc/my.cnf config file should also be backed up. The cold backup is a database in itself, and can be copied to an alternate server and mounted as-is.

    Logical backups involve using the mysqldump tool. This locks tables while it runs to maintain consistency of changing data, and can cause downtime. The resulting dump file contains CREATE DATABASE, CREATE TABLE & CREATE INDEX statements to rebuild the database. Note the file itself is not a database, but rather a set of instructions which can tell a MySQL server *HOW* to reconstruct the database. Important distinction here.

    Hot backups are a great addition to the mix as they allow the physical database data files to be backed up *WHILE* the server is up and running. In MySQL this can be achieved with the xtrabackup tool, available from Percona. Despite the name, it works very well with MyISAM and InnoDB tables too, so don’t worry if you’re not using xtradb tables.

    There are a few different restore scenarios, and the candidate should be able to describe how these various backups can be restored, and what the steps to do so would be. In addition they should understand what point-in-time recovery is, and how to perform that as well. After restoring one of the above three backup types, the DBA would use the mysqlbinlog utility to apply any subsequent transactions from the binary logs. So if the backup was made at 2am last night, and you restore that backup, the mysqlbinlog tool would be used to dig up transactions since 2am, and apply them to that restored database.

  7. Troubleshooting Performance
  8. Since this is an ongoing challenge with relational databases, a good grasp of it is crucial. One way to challenge the candidate would be to describe a recent performance problem you experienced with your infrastructure, and ask them how they would go about resolving it.

    If they struggle with the particulars of what you ran into, ask them to describe a big performance challenge they solved, what the cause was, and how they performed analysis.

    Typically, first steps involve mitigating the immediate problem by finding out what changed in the environment either operationally or code changes. If there is a bug that was hit, or other strange performance anomaly, the first stop is usually looking at log files. MySQL server error logs, and the slow query log are key files. From there, analyzing those files during the timeframe where problems occurred should yield some clues.

    You might also hope to hear some comment about metrics collection in this discussion. Tools such as cacti, munin, opennms, or ganglia are invaluable tools for drilling down on a past event or outage, and sifting through server stats to find trouble.

  9. Joins – describe a few kinds and how the server performs them
  10. A basic understanding of INNER JOIN and OUTER JOIN would be a great start. A simple example might be employees and departments. If you have four employees and two departments, an INNER JOIN of these tables together will give you the departments employees belong to. Add another employee without assigning her to a department, and the inner join won’t display her. Further adding a new department which doesn’t yet contain employees won’t display either. However performing an OUTER JOIN will give you those matches with null in the department field, and null in the employee field respectively.

    Thought of with another example, take a credit card company. One tables contains cardholders identity, their number, address, and other personal information. A second table contains their account activity. When they first join, they don’t have any monthly statements, so an INNER JOIN of cardholders with statements will yield no rows. However an OUTER JOIN on those two tables will yield a record, with a null for the statements columns.

Feeling like a MySQL expert yet? In Part 2 of Top MySQL DBA Interview Questions we’ll walkthrough four more questions plus a bonus.