Deploying in the Amazon cloud is touted as a great way to achieve high scalability while paying only for the computing power you use. How do you get the best scalability from the technology?
1. Tune Those Queries
By far the biggest bang for your buck is query optimization. Queries can be functionally correct and meet business requirements without ever having been stress tested under high traffic and high load. This is why we often see clients hit growing pains and scalability challenges as their site becomes more popular. That is understandable: it wouldn’t necessarily be a good use of time to tune a query for a page in some remote corner of your site that receives no real-world traffic. So some amount of reactive tuning is common and appropriate.
Enable the slow query log and watch it. Use mk-query-digest, the great tool from Maatkit, to analyze the log. Also make sure the log_queries_not_using_indexes flag is set. Once you’ve found a heavy, resource-intensive query, optimize it! Use the EXPLAIN facility, use a profiler, look at index usage and create missing indexes, and understand how the query joins and sorts its rows.
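To make the idea of digesting a log concrete, here is a minimal Python sketch of the grouping step. It is not how mk-query-digest is implemented; it only illustrates the principle of collapsing literal values so that structurally identical queries are counted together and ranked by total time. The function names and the sample queries are hypothetical, and the input is assumed to be already parsed into (query, seconds) pairs rather than read from a real log file.

```python
import re
from collections import defaultdict


def fingerprint(sql):
    """Collapse literals so structurally identical queries group together."""
    sql = sql.strip().lower()
    sql = re.sub(r"'[^']*'", "?", sql)   # quoted strings -> ?
    sql = re.sub(r"\b\d+\b", "?", sql)   # bare numbers -> ?
    sql = re.sub(r"\s+", " ", sql)       # normalize whitespace
    return sql


def digest(entries):
    """entries: iterable of (sql_text, seconds).

    Returns (fingerprint, count, total_seconds) rows, slowest total first.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for sql, seconds in entries:
        bucket = totals[fingerprint(sql)]
        bucket[0] += 1
        bucket[1] += seconds
    return sorted(
        ((fp, count, total) for fp, (count, total) in totals.items()),
        key=lambda row: row[2],
        reverse=True,
    )


# Hypothetical sample data standing in for parsed slow-log entries.
sample = [
    ("SELECT * FROM orders WHERE customer_id = 42", 2.5),
    ("SELECT * FROM orders WHERE customer_id = 7", 3.0),
    ("SELECT name FROM users WHERE email = 'a@example.com'", 0.4),
]

for fp, count, total in digest(sample):
    print(f"{total:6.1f}s  {count:3d}x  {fp}")
```

The query class at the top of the output is the one to run through EXPLAIN first, since it accounts for the most total time.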
2. Employ Master-Master Replication
Master-master active-passive replication, otherwise known as circular replication, can be a boon for high availability, but also for scalability. That’s because you immediately have a read-only slave for your application to hit as well. Many web applications exhibit an 80/20 split, where 80% of activity is reads (SELECT) and the remainder is INSERT and UPDATE. Configure your application to send read traffic to the slave, or rearchitect it so that this is possible. This type of horizontal scalability can then be extended further by adding more read-only slaves to the infrastructure as necessary.
If you’re setting up replication for the first time, we recommend you do it using hot backups.
Keep in mind that a MySQL slave has a tendency to drift from the master, often silently. Data can really get out of sync without throwing errors! Be sure to bulletproof your setup with checksums.
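As a rough illustration of sending read traffic to the slaves, here is a minimal Python sketch of a statement router. The class name and the host names are hypothetical, and the rule it applies (plain SELECTs go to a slave, everything else goes to the master) is a deliberate simplification, not a complete policy.

```python
import itertools


class ReadWriteRouter:
    """Pick a server for each statement: slaves for reads, master for writes."""

    def __init__(self, master, slaves):
        self.master = master
        # Rotate through the slaves; fall back to the master if there are none.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql):
        words = sql.lstrip().split(None, 1)
        verb = words[0].upper() if words else ""
        is_plain_read = verb == "SELECT" and "FOR UPDATE" not in sql.upper()
        if is_plain_read and self._slaves is not None:
            return next(self._slaves)
        return self.master


# Hypothetical host names, for illustration only.
router = ReadWriteRouter("master-db", ["slave-1", "slave-2"])
print(router.route("SELECT * FROM users WHERE id = 1"))
print(router.route("UPDATE users SET name = 'x' WHERE id = 1"))
```

A real application would also keep reads that follow a write inside the same transaction on the master, since a slave may lag behind by the time the read arrives.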
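The checksum idea can be sketched in a few lines of Python. This is only an illustration of the principle, assuming the rows of each table have already been fetched from both servers into memory; the function names and sample data are hypothetical. On a production system you would use a purpose-built tool (Maatkit ships mk-table-checksum for this) rather than pulling whole tables into a script.

```python
import hashlib


def table_checksum(rows):
    """Checksum a table's rows, ignoring the order they were returned in."""
    digest = hashlib.sha1()
    for row in sorted(rows):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def find_drift(master_tables, slave_tables):
    """Return the names of tables whose contents differ between servers."""
    drifted = []
    for name, rows in master_tables.items():
        if table_checksum(rows) != table_checksum(slave_tables.get(name, [])):
            drifted.append(name)
    return sorted(drifted)


# Hypothetical data: the slave's copy of "orders" has silently diverged.
master = {"users": [(1, "ann"), (2, "bob")], "orders": [(10, 1, 99.5)]}
slave = {"users": [(2, "bob"), (1, "ann")], "orders": [(10, 1, 95.0)]}

print(find_drift(master, slave))
```

Run a comparison like this on a schedule, because drift produces no errors on its own and is otherwise discovered only when the slave is promoted or queried for something important.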
3. Use Your Memory
It sounds basic and straightforward, yet details are often overlooked. At a minimum, be sure to set these:
- key_buffer_size (MyISAM index caching)
- query_cache_size – though beware of issues on large SMP boxes
- thread_cache_size & table_cache (named table_open_cache from MySQL 5.1.3 onward)
- innodb_log_file_size & innodb_log_buffer_size
- sort_buffer_size, join_buffer_size, read_buffer_size, read_rnd_buffer_size
- tmp_table_size & max_heap_table_size
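When sizing these, it helps to remember that some of the buffers above are allocated once for the whole server, while the sort, join, and read buffers can be allocated per connection. The Python sketch below gives a rough worst-case estimate under that assumption. The specific sizes and the max_connections value are hypothetical, the function name is ours, and the formula is a common rule of thumb that leaves out other consumers of memory, so treat the result as a sanity check and not a precise figure.

```python
MB = 1024 * 1024


def worst_case_memory(global_buffers, per_connection_buffers, max_connections):
    """Rough upper bound: global buffers plus per-connection buffers for every connection."""
    return sum(global_buffers.values()) + max_connections * sum(
        per_connection_buffers.values()
    )


# Hypothetical settings, for illustration only.
global_buffers = {
    "key_buffer_size": 256 * MB,
    "query_cache_size": 64 * MB,
    "innodb_log_buffer_size": 8 * MB,
}
per_connection_buffers = {
    "sort_buffer_size": 2 * MB,
    "join_buffer_size": 2 * MB,
    "read_buffer_size": 1 * MB,
    "read_rnd_buffer_size": 1 * MB,
}

total = worst_case_memory(global_buffers, per_connection_buffers, max_connections=200)
print(f"worst case: {total // MB} MB")
```

If the estimate exceeds the physical memory on the box, lower the per-connection buffers first, since they are multiplied by every connection.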
4. RAID Your Disk I/O
What is underneath your database? You don’t know? Well, please find out! Are you using RAID 5? That is a big performance hit. RAID 5 is slow for inserts and updates, and it is almost non-functional during a rebuild if you lose a disk: performance becomes very, very slow. What should you use instead? RAID 10, mirroring plus striping, with as many disks as you can fit in your server or RAID cabinet. A database does a lot of disk I/O even if you have enough memory to hold the entire database. Why? Sorting requires rearranging rows, as do GROUP BY, joins, and so forth. Plus, writing the transaction log is disk I/O as well!
Are you running on EC2? In that case EBS is already fault tolerant and redundant. So give your performance a boost by striping only, across a number of EBS volumes, using Linux md software RAID.
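To put rough numbers on the difference, here is a small Python sketch using the commonly quoted write penalties: a small random write costs about four disk operations on RAID 5 (read data, read parity, write data, write parity), two on RAID 10 (one per mirror), and one on a pure stripe. These penalties are rules of thumb, and the disk count and per-disk IOPS figure are hypothetical; real arrays with caching controllers will behave differently.

```python
# Approximate disk operations needed per small random write (rule of thumb).
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4}


def random_write_iops(level, disks, iops_per_disk):
    """Estimate random-write IOPS an array can sustain at a given RAID level."""
    return disks * iops_per_disk // WRITE_PENALTY[level]


# Hypothetical array: 8 disks at 150 IOPS each.
for level in ("raid5", "raid10", "raid0"):
    print(level, random_write_iops(level, disks=8, iops_per_disk=150))
```

Under these assumptions RAID 10 sustains about twice the random writes of RAID 5 on the same disks, and striping only doubles that again, which is why striping only is attractive when the underlying volumes are already redundant.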
5. Tune Key Parameters
These additional parameters can also help a lot with performance.
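- innodb_flush_log_at_trx_commit=2
This speeds up inserts and updates dramatically by being a little bit lazy about flushing the InnoDB log buffer. The trade-off is durability: an operating system crash or power failure can lose up to about the last second of transactions. Do more research yourself, but for most environments this setting is recommended.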
- innodb_file_per_table
InnoDB was developed like Oracle, with the tablespace model for storage. Apparently the kernel developers didn’t do a very good job, because the default of a single shared tablespace turns out to be a performance bottleneck, with contention for file descriptors and so forth. This setting makes InnoDB create a tablespace and underlying datafile for each table, just like MyISAM does.