There are a lot of components that make up modern internet websites, and a lot of places to get stuck in the mud. Website performance starts with the browser: what caching it is doing, its bandwidth to your server, what the webserver is doing (whether it caches and how), whether the webserver has sufficient memory, then what the application code is doing, and lastly how it interacts with the backend database. Continue reading Top 3 Questions From Clients
One very strong case for cloud computing is that it can satisfy applications with seasonal traffic patterns. One way to test the advantages of the cloud is through a hybrid approach.
Cloud infrastructure can be built completely through scripts. You can spin up specific AMIs or machine images, automatically install and update packages, install your credentials, start up services, and you're running.
All of these steps can be performed in advance of your need, at little cost. Simply build and test. When you're finished, shut down those instances. What you walk away with is the scripts. What do we mean?
The power here is that you carry zero cost for that burst capacity until you need it. You've already built the automation scripts and have them in place. When your capacity planning warrants it, spin up additional compute power and watch your internet application scale horizontally. Once your busy season is over, scale back and disable your usage until you need it again.
Does anyone remember 15 years ago, when the dot-com boom was just starting? A lot of companies were running on Sun. Sun was the best hardware you could buy for the price. It was reliable, and a lot of engineers had experience with its operating system, SunOS, a flavor of Unix.
Yet suddenly companies were switching to cheap, crappy hardware. The stuff failed more often, had lower quality control, and used cheaper, slower buses. Despite all of that, cutting-edge firms and startups were moving to commodity hardware in droves. Why? Continue reading The New Commodity Hardware Craze aka Cloud Computing
MySQL databases are the great workhorses of the internet. They back tons of modern websites, from blogs and checkout carts to huge sites like Facebook. But these technologies don't run themselves. When you're faced with a system that is slowing down, you'll need the right tools to diagnose and troubleshoot the problem. MySQL has a huge community following, and that means scores of great tools for your toolbox. Here are 7 ways to troubleshoot MySQL. Continue reading 7 Ways to Troubleshoot MySQL
Deploying in the Amazon cloud is touted as a great way to achieve high scalability while paying only for the computing power you use. How do you get the best scalability from the technology? Continue reading 3 Ways to Boost Cloud Scalability
1. Tune those queries
By far the biggest bang for your buck is query optimization. Queries can be functionally correct and meet business requirements without ever being stress tested for high traffic and high load. This is why we often see clients with growing pains and scalability challenges as their site becomes more popular. This also makes sense: it wouldn't necessarily be a good use of time to tune a query for some page off in a remote corner of your site that doesn't receive real-world traffic. So some amount of reactive tuning is common and appropriate.
Enable the slow query log and watch it. Use mk-query-digest, the great tool from Maatkit, to analyze the log. Also make sure the log_queries_not_using_indexes flag is set. Once you've found a resource-intensive query, optimize it! Use the EXPLAIN facility, use a profiler, look at index usage, create missing indexes, and understand how the query is joining and/or sorting.
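As a starting point, a minimal my.cnf fragment to turn on slow query logging might look like the following (assuming MySQL 5.1 or later; the log path and one-second threshold are illustrative, not prescriptive):

```ini
[mysqld]
slow_query_log                = 1
slow_query_log_file           = /var/log/mysql/slow.log
long_query_time               = 1     # log anything running longer than 1 second
log_queries_not_using_indexes = 1     # also log full scans, even if fast
```

Once queries accumulate, point mk-query-digest at the log file to get a ranked report of the worst offenders.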
2. Employ Master-Master Replication
Master-master active-passive replication, otherwise known as circular replication, can be a boon for high availability, but also for scalability. That's because you immediately have a read-only slave for your application to hit as well. Many web applications exhibit an 80/20 split: 80% of activity is reads (SELECT) and the remainder is writes (INSERT and UPDATE). Configure your application to send read traffic to the slave, or rearchitect so this is possible. This type of horizontal scalability can then be extended further, adding additional read-only slaves to the infrastructure as necessary.
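In my.cnf terms, a minimal sketch for one node of a two-master ring might look like this (server IDs and offsets are illustrative; each node then points at the other with CHANGE MASTER TO, and the passive node should run read-only):

```ini
# node A (node B mirrors this with server-id = 2, auto_increment_offset = 2)
[mysqld]
server-id                = 1
log-bin                  = mysql-bin
log-slave-updates
auto_increment_increment = 2   # step AUTO_INCREMENT by the number of masters
auto_increment_offset    = 1   # so the two nodes never generate the same key
```

The auto_increment settings are what keep INSERTs on the two masters from colliding if writes ever hit both sides.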
If you’re setting up replication for the first time, we recommend you do it using hotbackups. Here’s how.
Keep in mind that MySQL replication has a tendency to drift silently out of sync with the master. Data can really diverge without throwing errors! Be sure to bulletproof your setup with checksums.
Related: Why you can’t find a MySQL DBA
3. Use Your Memory
It sounds very basic and straightforward, yet there are often details overlooked. At minimum be sure to set these:
- key_buffer_size (MyISAM index caching)
- query_cache_size – though beware of issues on large SMP boxes
- thread_cache & table_cache
- innodb_log_file_size & innodb_log_buffer_size
- sort_buffer_size, join_buffer_size, read_buffer_size, read_rnd_buffer_size
- tmp_table_size & max_heap_table_size
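Pulling the list above into a my.cnf fragment might look like the following. These sizes are purely illustrative starting points; the right values depend on your RAM, workload, and MySQL version (table_cache, for instance, became table_open_cache in newer releases):

```ini
[mysqld]
key_buffer_size        = 256M   # MyISAM index cache
query_cache_size       = 32M    # keep modest: large values cause mutex contention on big SMP boxes
thread_cache_size      = 16
table_cache            = 512
innodb_log_file_size   = 256M
innodb_log_buffer_size = 8M
sort_buffer_size       = 2M     # the next four are per-connection: keep them small
join_buffer_size       = 2M
read_buffer_size       = 1M
read_rnd_buffer_size   = 2M
tmp_table_size         = 64M
max_heap_table_size    = 64M    # keep equal to tmp_table_size
```

Remember that the per-connection buffers multiply by max_connections in the worst case, so oversizing them can exhaust memory under load.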
4. RAID Your Disk I/O
What is underneath your database? You don't know? Please find out! Are you using RAID 5? That's a big performance hit: RAID 5 is slow for inserts and updates, and it is almost non-functional during a rebuild if you lose a disk. What should you use instead? RAID 10, mirroring plus striping, with as many disks as you can fit in your server or RAID cabinet. A database does a lot of disk I/O even if you have enough memory to hold the entire dataset. Why? Sorting requires rearranging rows, as do GROUP BY, joins, and so forth. Plus the transaction log is disk I/O as well!
Are you running on EC2? In that case EBS is already fault tolerant and redundant, so give your performance a boost by striping only (RAID 0) across a number of EBS volumes using Linux md software RAID.
Also check out our Intro to EC2 Cloud Deployments.
Also of interest: Autoscaling MySQL on EC2.
5. Tune Key Parameters
These additional parameters can also help a lot with performance.
Setting innodb_flush_log_at_trx_commit to 2 speeds up inserts and updates dramatically by being a little bit lazy about flushing the InnoDB log buffer. You can do more research yourself, but for most environments this setting is recommended.
InnoDB was developed, like Oracle, around the tablespace model for storage. Unfortunately the default configuration, a single shared tablespace, turns out to be a performance bottleneck: contention for file descriptors and so forth. Enabling innodb_file_per_table makes InnoDB create a separate tablespace and underlying datafile for each table, just like MyISAM does.
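The two parameters being described here are commonly set together in my.cnf; a sketch (note the trade-off comments, and test in your own environment before relying on them):

```ini
[mysqld]
innodb_flush_log_at_trx_commit = 2   # flush the log buffer lazily: big insert/update speedup,
                                     # at the risk of losing up to ~1s of commits in a crash
innodb_file_per_table          = 1   # one tablespace + datafile per table, as MyISAM does;
                                     # applies only to tables created after it is enabled
```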
Made it to the end eh?!?! Grab our newsletter.
Best practices: a term we hear thrown around a lot. But like that New Year's diet, it too often ends up more talk than action.
Operator error, i.e. typing the wrong command, is always a risk. Logging into the wrong server to drop a database, or writing a dump command backwards so that data flows into the database instead of out of it: these are risks that operations folks face every day.
Accountability is important, so be sure all of your systems folks log in to their own accounts. Apply the least-privilege model: grant permissions on an as-needed basis.
Set prompts with big bold names that indicate production servers and their purpose. Automate repetitive commands that are prone to typos.
Don’t be afraid to give developers read-only accounts on production servers.
Regular team meetings, a la Agile stand-ups, are a great way to encourage folks to communicate. Bring the developers and operations folks together. Ask everyone in turn to voice their current todos, their concerns, and the risks they see. Encourage everyone to listen with an open mind and consider different perspectives.
Communication is a cultural attribute, so it comes from the top. Encourage it as a CTO or CIO by asking questions, voicing your concerns, and paraphrasing your own requests in different words. Listen to what your team is saying, repeat and rephrase their concerns, and spell out how and when they will be addressed.
A culture of documenting services, and processes is healthy. It provides a central location and knowledge base for the team. It also prevents sliding into the situation where only one team member understands how to administer critical business components. Were that person to be unavailable or to leave the company, you’re stuck reverse engineering your infrastructure and guessing at architectural decisions.
Rather than think of best practices as something you need to achieve today, think of it as an ongoing day-to-day quest for improvement.
- repetitive manual processes – employ automation & script those processes where possible.
- where steps require investigation and research – document it
- where production changes are involved – communicate with business units, qa & operations
- always be improving – striving for better practices
In search of a good book on Chef itself, I picked up this new title on O’Reilly. It’s one of their new format books, small in size, only 75 pages.
There was some very good material in this book. Mr. Nelson-Smith’s writing style is good, readable, and informative. The discussion of risks of infrastructure as code was instructive. With the advent of APIs to build out virtual data centers, the idea of automating every aspect of systems administration, and building infrastructure itself as code is a new one. So an honest discussion of the risks of such an approach is bold and much needed. I also liked the introduction to Chef itself, and the discussion of installation.
Chef isn’t really the main focus of this book, unfortunately. The book spends a lot of time introducing us to Agile Development, and specifically test driven development. While these are lofty goals, and the first time I’ve seen treatment of the topic in relation to provisioning cloud infrastructure, I did feel too much time was spent on that. Continue reading Review – Test Driven Infrastructure with Chef – Stephen Nelson-Smith
IOPs (I/O operations per second) are an attempt to standardize the comparison of disk speeds across different environments. When you turn on a computer, everything must be read from disk, but thereafter things are kept in memory. However, applications typically read and write to disk frequently. When you move to enterprise-class applications, especially relational databases, a lot of disk I/O is happening, so the performance of disk resources is crucial.
For a basic single SATA drive that you might have in a server or laptop, you can typically get 30-40 IOPs. These numbers vary depending on whether you are talking about random versus sequential reads or writes. Picture the needle on a vinyl record: the surface moves past it faster around the outside edge and slower near the center. The same thing happens under the magnetic head inside your hard drive, which is why outer tracks deliver better throughput.
In the Amazon EC2 environment, there is a lot of variability in performance from EBS. You can stripe across four separate EBS volumes, which will sit in four different locations on the underlying storage, and you'll get a big boost in disk I/O. Disk performance also varies across the m1.small, m1.large and m1.xlarge instance types, with the larger types getting the lion's share of network bandwidth and so better disk I/O performance. But in the end your best EBS performance will be in the range of 500-1000 IOPs. That's not huge by physical hardware standards, so an extremely disk-intensive application will probably not perform well in the Amazon cloud.
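One rough-and-ready way to get a feel for a volume's write behavior, wherever it lives, is a small dd probe. This is only a crude sketch, not a substitute for a real benchmark like sysbench or fio, and it assumes GNU dd on Linux; the /tmp paths are illustrative:

```shell
# Write 200 x 4 KB blocks, forcing each write to disk (oflag=dsync),
# so the elapsed time dd reports approximates per-write latency
# rather than the speed of the in-memory page cache.
dd if=/dev/zero of=/tmp/iops_probe bs=4k count=200 oflag=dsync 2> /tmp/iops_stats

# dd reports records written, elapsed seconds, and throughput on stderr.
cat /tmp/iops_stats

# Clean up the probe file.
rm -f /tmp/iops_probe
```

Dividing the 200 writes by the elapsed seconds gives a ballpark synchronous-write IOPs figure for comparison across volumes.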
Still the economic pressures and infrastructure and business flexibility continue to push cloud computing adoption, so expect the trend to continue.
So-called ETL, extract, transform and load, refers to moving data from external sources into and out of relational databases or data warehouses.
Source systems may store data in an infinite variety of formats. Extracting involves getting that data into common files for moving to the destination system. A CSV file, or comma-separated values file, is so named because each record is stored as one line in the file, with fields separated by commas and often surrounded by quotes as well. In MySQL, the SELECT ... INTO OUTFILE syntax can perform this function. If you have a lot of tables to work with, you can script the process, using the data dictionary as a lookup for table names, to create a .mysql script to then run with the mysql shell. In Oracle you would use the spool command in SQL*Plus, the command-line shell; spool sends subsequent screen output to a file as well.
This transformation step involves modifying the extracted data in preparation for moving it into the target database server. It may involve sweeping out blank records, rearranging columns, or breaking files into smaller subsets of data. You might also map values differently: for instance, if one column in the source database was gender, with values M/F, you might transform those to the strings "Male" and "Female" if that is more useful for your target database server. Or you might transform them to numerical values; for instance, Male and Female might be 0/1 in your target database.
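The gender mapping just described is a one-liner with sed. Here is a minimal sketch with made-up sample data (the file names and records are illustrative only):

```shell
# A tiny sample extract: id, quoted name, single-letter gender code.
cat > /tmp/people.csv <<'EOF'
1,"Alice",F
2,"Bob",M
EOF

# Map the trailing M/F codes to the quoted strings the target schema expects.
sed -e 's/,F$/,"Female"/' -e 's/,M$/,"Male"/' /tmp/people.csv > /tmp/people_transformed.csv

# Inspect the result.
cat /tmp/people_transformed.csv
```

The same pattern extends to the 0/1 numeric mapping: just change the replacement text in the two sed expressions.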
Although a myriad of high-level GUI tools exist to perform these functions, the Unix operating system includes a plethora of very powerful tools that every experienced system administrator is familiar with. Those include grep and sed, which operate on regular expressions and can perform data transformation at lightning speed. Then there is sort, which can sort data and send the results to stdout or the file of your choosing. Other tools include wc (word count) and cut, which can remove columns, and so forth.
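These tools compose naturally in pipelines. A small sketch, again on made-up sample data, showing cut, sort, uniq and wc working together:

```shell
# Sample extract: order id, category, amount.
cat > /tmp/orders.csv <<'EOF'
101,books,19.99
102,music,9.99
103,books,24.50
EOF

# cut pulls out column 2 (category); sort + uniq -c tallies each value,
# and the final sort -rn lists the most common categories first.
cut -d, -f2 /tmp/orders.csv | sort | uniq -c | sort -rn

# Count distinct categories: sort -u dedupes, wc -l counts the lines.
cut -d, -f2 /tmp/orders.csv | sort -u | wc -l
```

Pipelines like this are often all the "transform" stage needs, with no GUI tool in sight.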
This final step involves moving the data into the database server and its final target tables. In MySQL this might be done with the LOAD DATA INFILE syntax, while in Oracle you might use SQL*Loader, a very fast flat-file data loader.