This article is part of a multi-part series Intro to EC2 Cloud Deployments
What types of applications fit well in the cloud?
- Applications with Seasonal Traffic Patterns
- Proof-of-Concept Applications
- Quick Temporary Dev & Test Environments
- CPU Intensive Applications
- On-Demand or Unknown Future Demand
Seasonal Traffic Patterns
Web applications often show the following traffic pattern: traffic is steady for weeks or months, then experiences a spike. That spike may be due to the launch of a new product or service, a new marketing or advertising campaign, or sudden user interest. Inevitably you’ll need more servers and compute power to handle that spike. That is your peak capacity requirement.
With traditional servers you would need to buy enough servers, or big enough ones, to support that load or else suffer outages. What’s more, you’d have to plan in advance in order to have those servers online and integrated into the web infrastructure.
With cloud computing, you already have spinup scripts for your server types, and can bring additional compute power online with only a few commands. Even better, with AWS Autoscaling you can define rules to have new servers spin up for you automatically!
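The kind of rule Autoscaling evaluates boils down to a simple threshold check. Here’s a minimal sketch of that decision logic; `desired_capacity` and its thresholds are illustrative helpers, not part of the AWS API:

```python
def desired_capacity(current: int, cpu_pct: float,
                     scale_up_at: float = 70.0, scale_down_at: float = 25.0,
                     min_servers: int = 2, max_servers: int = 10) -> int:
    """Return how many servers we want, given average CPU utilization."""
    if cpu_pct >= scale_up_at:
        current += 1          # traffic spike: add a server
    elif cpu_pct <= scale_down_at:
        current -= 1          # load has dropped: retire a server
    # never scale below a safe floor or above a cost ceiling
    return max(min_servers, min(max_servers, current))
```

In a real setup you define thresholds like these in an Autoscaling policy and CloudWatch does the measuring; the point is that capacity planning becomes a rule you write once, not a purchase order.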
Proof-of-Concept Applications
If you’re in the process of testing a new business idea or internet startup, you may not have the budget to order all sorts of heavy iron to support it. Cloud computing complements this type of requirement very nicely. You need dev servers? Voila, they’re up and running, quickly and cheaply. You may not know what you’ll need in six months, or whether your idea will take off, so you don’t have to risk a big purchase. Buy only what you need.
Dev and Test Environments
Another application type that complements cloud computing well is dev and test environments. You may want to clone your production servers, or bring up a temporary test environment with all of the same components as production. But you don’t need that setup all of the time. Just bring the servers online when you need them, and stop them when you’re done testing. You won’t incur instance charges while the servers are stopped, and the server images will remain resident in your EBS snapshots!
CPU Intensive Applications
Server farms are used for all sorts of applications, such as SETI or the Human Genome Project. These applications require legions of servers working together to churn through large amounts of data. Such workloads are uniquely suited to cloud computing, as they are CPU-intensive. Once you are done, you can easily decommission all of those servers.
Online gaming is another CPU-intensive application. With users accessing Facebook applications such as Farmville, it’s hard to know in advance what demand will be from day to day. Enabling a feature like AWS Autoscaling means the compute power does a lot of the capacity planning for you, responding dynamically to need. We wrote a piece on autoscaling MySQL databases.
On-Demand or Unknown Future Requirements
Any other types of applications that have on-demand needs, and for which you don’t know what the future will look like, match cloud computing well. You avoid the up-front costs of buying a whole rack of servers, and keep servers offline when they’re not busy.
Hey you… made it this far? Grab our newsletter – Scalable Startups.
Your recent social media campaign has gone viral. It’s what you’ve been dreaming about, pinning your hopes on, and all of your hard work is now coming to fruition. Tens of thousands of internet users, hordes of them in fact, are now descending on your website. Only one problem: it went down!
That’s a situation you want to avoid. Luckily there are some best practices for avoiding scenarios like the one I described. In engineering it’s termed “degrade gracefully”: that is, continue functioning, but with the heaviest features disabled.
Browsing Only, But Still Functioning
One way to do this is for your site to have a browsing-only mode. On the database side, you can still be functioning with a read-only database. With a switch like that, your site will continue to function while pointed at any of your read-only replication slaves. What’s more, you can load balance across those easily, and keep your site up and running.
Decouple Components
In software development, decoupling involves breaking apart components or pieces of an application that should not depend on one another. One way to do this is to use a queuing system, such as Amazon’s SQS, to allow pieces of the application to queue up work to be done. This makes those pieces asynchronous, i.e. they’ll return right away. Another way is to expose services internal to your site through web services. These individual components can then be scaled out as needed. This makes them more highly available, and reduces the need to scale your memcache, webservers or database servers – the hardest ones to scale.
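The queuing idea can be sketched with Python’s standard library. Here `queue.Queue` stands in for SQS, and the thumbnail job is a made-up example of heavy work pushed off the request path:

```python
import queue
import threading

work = queue.Queue()  # stand-in for an SQS queue

def enqueue_thumbnail_job(image_id: str) -> None:
    """The web tier returns immediately; the heavy work happens elsewhere."""
    work.put(image_id)

processed = []

def worker() -> None:
    """A background consumer, like a worker box polling SQS."""
    while True:
        image_id = work.get()
        if image_id is None:              # sentinel: shut down
            break
        processed.append(f"thumb-{image_id}")  # pretend to resize the image
        work.task_done()

t = threading.Thread(target=worker)
t.start()
for img in ("a.jpg", "b.jpg"):
    enqueue_thumbnail_job(img)            # returns right away
work.join()                               # the demo waits; a web tier wouldn't
work.put(None)
t.join()
```

With SQS the producer and consumer would live on different instances entirely, which is exactly what lets you scale each side independently.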
Identify Features You Can Disable
Typically your application will have features that are superfluous, or that are not part of the core functionality. Perhaps you have star ratings, or some other components that are heavy. Work with the development and operations teams to identify the areas of the application that are heaviest, and that would warrant disabling if the site hits heavy storms.
Once you’ve done all that, document how to disable and re-enable those features, so other team members will be able to flip the switches if necessary.
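Those switches can be as simple as a flag table that the page-rendering code consults. A minimal sketch, with hypothetical feature names:

```python
# Hypothetical feature flags; in production these might live in a config
# file or database so ops can flip them without a deploy.
FLAGS = {"star_ratings": True, "related_items": True}

def feature_enabled(name: str) -> bool:
    return FLAGS.get(name, False)

def degrade_gracefully() -> None:
    """Flip off the heavy, non-core features when the site is under storm."""
    for name in FLAGS:
        FLAGS[name] = False

def render_product_page() -> str:
    parts = ["product details"]          # core functionality always renders
    if feature_enabled("star_ratings"):
        parts.append("star ratings")     # heavy: extra queries per page view
    if feature_enabled("related_items"):
        parts.append("related items")
    return ", ".join(parts)
```

The key property: when the flags go off, the page still renders its core content instead of timing out.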
With traditional managed hosting solutions, we have best practices, we have business continuity plans, we have disaster recovery, we document our processes and all the moving parts in our infrastructure. At least we pay lip service to these goals, though from time to time we admit to getting sidetracked with bigger fish to fry, high priorities and the emergency of the day. We add “firedrill” to our todo list, promising we’ll test restoring our backups. But many times we find it is in the event of an emergency that we are forced to find out if we actually have all the pieces backed up and can reassemble them properly.
Cloud computing is different. These goals are no longer lofty ideals, but must be put into practice. Here’s why.
- Virtual servers are not as reliable as physical servers
- Amazon EC2 has a lower SLA than many managed hosting providers
- Devops introduces a new paradigm: infrastructure scripts can be version controlled
- EC2 environment really demands scripting and repeatability
- New flexibility and peace of mind
EC2 virtual servers can and will die. Your spinup scripts and infrastructure should treat this possibility not as some far-off anomalous event, but as a day-to-day concern. With proper scripts and testing of various scenarios, this becomes manageable. Use snapshots to back up EBS root volumes, and build spinup scripts with AMIs that have all the components your application requires. Then test, test and test again.
Amazon EC2’s SLA – Only 99.95%
The computing industry throws the 99.999% or five-nines uptime SLA standard around a lot. That amounts to less than six minutes of downtime per year. Amazon’s 99.95% allows for 263 minutes of downtime per year, and greater downtime merely gets you a credit on your account. With that in mind, repeatable processes and scripts to bring your infrastructure back up in different availability zones, or even different datacenters, are a necessity. Along with your infrastructure scripts, offsite backups also become a wise choice. You should further take advantage of availability zones and regions to make your infrastructure more robust. By using private IP addresses and the private network, you can host a MySQL database slave in a separate zone, for instance. You can also do GDLB, or Geographically Distributed Load Balancing, to send customers on the west coast to that zone, and those on the east coast to one closer to them. In the event that one region or availability zone goes out, your application is still responding, though perhaps with slightly degraded performance.
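The downtime arithmetic behind those SLA figures is straightforward:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes(sla_pct: float) -> float:
    """Minutes of downtime per year permitted by an uptime SLA."""
    return MINUTES_PER_YEAR * (100.0 - sla_pct) / 100.0

# Five nines vs. Amazon's 99.95%:
five_nines = downtime_minutes(99.999)  # about 5.3 minutes per year
ec2_sla = downtime_minutes(99.95)      # about 263 minutes per year
```

That gap, roughly fifty-fold, is why the rest of this section insists on multi-zone deployments and repeatable spinup scripts rather than trusting any single instance.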
Devops – Infrastructure as Code
With traditional hosting, you either physically manage all of the components in your infrastructure, or have someone do it for you. Either way, a phone call is required to get things done. With EC2, every piece of your infrastructure can be managed from code, so your infrastructure itself can be managed as software. Whether you’re using the waterfall method or agile as your software development lifecycle, you have the new flexibility to place all of these scripts and configuration files in version control. This raises the manageability of your environment tremendously. It also provides a kind of ongoing documentation of all of the moving parts. In short, it forces you to deliver on all of those best practices you’ve been preaching over the years.
EC2 Environment Considerations
When servers get restarted they get new IP addresses – both private and public. This may affect configuration files from webservers to mail servers, and even database replication. Your new server may mount an external EBS volume which contains your database. If that’s the case, your start scripts should check for that volume, and not start MySQL until they find it. To further complicate things, you may choose to use software RAID over a handful of EBS volumes to get better performance.
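That start-script volume check is just a poll-until-mounted loop. A sketch, with the mount test injected so the logic is easy to exercise; a real script would pass in `os.path.ismount` and only launch mysqld after this returns True:

```python
import time

def wait_for_volume(path: str, is_mounted, timeout_s: float = 300,
                    poll_s: float = 5.0) -> bool:
    """Poll until the EBS volume is attached and mounted, or give up.

    `is_mounted` is a callable (e.g. os.path.ismount) taking the path;
    it is injected here so the loop can be tested without a real volume.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_mounted(path):
            return True       # safe to start MySQL now
        time.sleep(poll_s)    # volume not attached yet; wait and retry
    return False              # timed out: alert, don't start with no datadir
```

The important behavior is the failure path: starting MySQL against an empty mount point can silently initialize a fresh, empty datadir, which is far worse than refusing to start.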
The more special cases you have, the more you quickly realize how important it is to manage these things in software. The more the process needs to be repeated, the more the scripts will save you time.
New Flexibility in the Cloud
Ultimately, if you take into account that virtual servers are less reliable, and mitigate that with zones, regions and automated scripts, you can then enjoy all the new benefits of the cloud.
- easy test & dev environment setup
- robust load & scalability testing
- vertically scaling servers in place – in minutes!
- pause a server – incurring only storage costs for days or months as you like
- cheaper costs for applications with seasonal traffic patterns
- no huge up-front costs
The term clustering is often used loosely in the context of enterprise databases. In relation to MySQL in the cloud you can configure:
- Master-master active/passive
- Sharded MySQL Database
- NDB Cluster
Master-Master active/passive replication
Also sometimes known as circular replication, this is used for high availability. You can perform operations on the inactive node (backups, ALTER TABLEs or other slow operations), then switch roles so the inactive node becomes active. You would then perform the same operations on the former active master. Applications see “zero downtime” because they are always pointing at the active master database. In addition, the inactive master can be used as a read-only slave to run SELECT queries and large reporting queries. This is quite powerful, as typical web applications tend to have 80% or more of their work performed by read-only queries such as browsing, viewing, and verifying data and information.
Sharded MySQL Database
This is similar to what, in the Oracle world, is called “application partitioning”. In fact, before Oracle 10 most Parallel Server and RAC installations required you to do this. For example, a user table might be sharded by putting names A-F on node A, G-L on node B, and so forth.
You can also achieve this somewhat transparently with user_ids. MySQL has an autoincrement column type to handle serving up unique ids. It also has a cluster-friendly pair of settings, auto_increment_increment and auto_increment_offset. So in an example where you had *TWO* nodes, all EVEN numbered IDs would be generated on node A and all ODD numbered IDs would be generated on node B. They would also be replicating changes to each other, yet avoid collisions.
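A quick simulation shows how the offset/increment pair keeps the two nodes’ ID streams disjoint:

```python
def next_ids(offset: int, increment: int, count: int) -> list:
    """IDs a node hands out under auto_increment_offset/auto_increment_increment.

    MySQL starts at `offset` and steps by `increment`, so with distinct
    offsets and a shared increment, nodes never generate the same ID.
    """
    return [offset + i * increment for i in range(count)]

node_a = next_ids(offset=2, increment=2, count=4)  # even IDs: 2, 4, 6, 8
node_b = next_ids(offset=1, increment=2, count=4)  # odd IDs: 1, 3, 5, 7
```

In my.cnf terms, node A would set auto_increment_increment=2 and auto_increment_offset=2, and node B the same increment with offset=1.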
Obviously all this has to be done with care, as the database is not otherwise preventing you from doing things that would break replication and your data integrity.
One further caution with sharding your database is that although it increases write throughput by horizontally scaling the master, it ultimately reduces availability. An outage of any server in the cluster means at least a partial outage of the cluster itself.
NDB Cluster
This is actually a storage engine, and can be used in conjunction with InnoDB and MyISAM tables. Normally you would use it sparingly, for a few special tables, providing availability and read/write access to multiple masters. This is decidedly *NOT* like Oracle RAC, though many mistake it for that technology.
MySQL Clustering In The Cloud
The most common MySQL cluster configuration we see in the Amazon EC2 environment is by far the Master-Master configuration described above. By itself it provides higher availability of the master node, and a single read-only node for which you can horizontally scale your application queries. What’s more you can add additional read-only slaves to this setup allowing you to scale out tremendously.
Also find Sean Hull’s ramblings on twitter @hullsean.
Migrating from MySQL to Oracle can be as complex as picking up your life and moving from the country to the city. Things in the MySQL world are often just done differently than they are in the Oracle world. Our guide will give you a bird’s-eye view of the differences, to help you determine what is the right path for you.
MySQL comes from a more open-source, DIY background: one of Unix and Linux administrators, and even developers, carrying the responsibilities of a DBA.
- Installation & Administration Considerations
- Query and Optimizer Differences
- Security Strengths and Weaknesses
- Replication & High Availability
- Table Types & Storage Engines
- Applications, Connection Pooling, Stored Procedures and More
- Backups & Disaster Recovery
- Community – MySQL & Oracle Differences
- TCO, Licensing, and Cloud Considerations
- Advanced Oracle Features – Missing in MySQL
Check back soon as we update each of these sections.
What Do Consultants Do?
Consultants bring a whole host of tools and experiences to bear on solving your business problems. They can fill a need quickly, look in the right places, reframe the problem, communicate and get teams working together, and bring to light problems on the horizon. And they tell stories of challenges they faced at other businesses, and how they solved them.
Frame or Reframe The Problem
Oftentimes businesses see the symptoms of a larger problem, but not the cause. Perhaps their website is sluggish at key times, causing them to lose customers. Or perhaps it is locking up inexplicably. Framing the problem may involve identifying the bottleneck and pointing to a particular misconfigured option in the database or webserver. Or it may mean looking at the technical problem you’ve chosen to solve and asking if it meets or exceeds what the business needs.
Tell Business Stories
Clients often have a collection of technologies and components in place to meet their business needs. But the day-to-day running of a business is ultimately about bringing a product or service to your customer. Telling stories of the challenges and solutions of past customers helps illustrate, educate, and communicate the problems you’re facing today.
Fill A Need Quickly
If you have an urgent problem, and your current staff is overextended, bringing in a consultant to solve a specific problem can be a net gain for everyone. They get up to speed quickly, bring fresh perspectives, and review your current processes and operations. What’s more, they can be used in a surgical way, to augment your team for a short stint.
Get Teams Communicating
I’ve worked at quite a number of firms over the years, tasked with solving a specific technical problem, only to find the problem was a people problem to begin with. In some cases the firm already has the knowledge and expertise to solve a problem, but some members are blocking. This can be because some folks feel threatened by a new solution that will take away responsibilities they formerly held. Or it can be because they feel some solution will create new problems which they will then be responsible for cleaning up. In either case, bridging the gap between business needs and the operations teams that solve those needs can mean communicating to each team in ways that make sense to them: a technical, detail-oriented focus when working with the engineering teams, and a business and bottom-line focus when communicating with the management team.
Bring To Light Problems On The Horizon
Is our infrastructure a ticking timebomb? Perhaps our backups haven’t been tested and are missing some crucial component? Or we’ve missed some security consideration, left some password unset, left the proverbial gate open to the castle. When you deal with your operations on a day-to-day basis, little details can be easy to miss. A fresh perspective can bring needed insight.
BOOK REVIEW – Jaron Lanier – You Are Not a Gadget
Lanier is a programmer, musician, the father of VR way back in the ’90s, and a wide-ranging thinker on topics in computing and the internet.
His new book is a great, if at times meandering, read on technology, programming, schizophrenia, inflexible design decisions, Marxism, finance transformed by the cloud, obscurity & security, logical positivism, strange loops and more.
He opposes the thinking-du-jour among computer scientists, leaning in a more humanist direction summed up here: “I believe humans are the result of billions of years of implicit, evolutionary study in the school of hard knocks.” The book is worth a look.
Planning and implementing a bulletproof disaster recovery strategy forms a large piece of your business continuity plans. We can:
- Review your entire web operations & cloud environment
- Perform fire drills to test backups, scripts, and processes
- Examine security of operations
- Provide feedback on your current environment
- Work closely with your team
- Performance related outages
- MySQL database crash
- Upgrade related problems
- Server & Hardware outages
Best practices for backups and disaster recovery aren’t tremendously different in the cloud than in a managed hosting environment. But they are more crucial, since cloud servers are less reliable than physical servers. Also, the security aspect may play a heightened role in the cloud. Here are some points to keep in mind.
1. Perform multiple types of backups
2. Keep non-proprietary backups offsite
3. Test your backups – perform firedrills
4. Encrypt backups in S3
5. Perform replication integrity checks
There are a lot of considerations for deploying MySQL in the Cloud. Some concepts and details won’t be obvious to DBAs used to deploying on traditional servers. Here are eight best practices which will certainly set you off on the right foot.
This article is part of a multi-part series Intro to EC2 Cloud Deployments.
1. Replication
Master-Slave replication is easy to set up, and provides a hot online copy of your data. One or more slaves can also be used for scaling your database tier horizontally.
Master-Master active/passive replication can also be used to bring higher uptime, and allow some operations such as ALTER statements and database upgrades to be done online with no downtime. The secondary master can be used for offloading read queries, and additional slaves can also be added as in the master-slave configuration.
Caution: MySQL’s replication can drift silently out of sync with the master. If you’re using statement based replication with MySQL, be sure to perform integrity checking to make your setup run smoothly. Here’s our guide to bulletproofing MySQL replication.
2. Security
You’ll want to create an AWS security group for databases which opens port 3306, but not to the internet at large: only to your AWS-defined webserver security group. You may also decide to use a single box and security group which allows port 22 (ssh) from the internet at large. All ssh connections will then come in through that box, and internal security groups (the database & webserver groups) should only allow port 22 connections from that security group.
When you set up replication, you’ll be creating users and granting privileges. You’ll need to grant to the wildcard ‘%’ hostname designation, as your internal and external IPs will change each time a server dies. This is safe, since you expose your database server’s port 3306 only to other AWS security groups, and not to internet hosts.
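Such a grant might be generated like this. The helper, user name and password are placeholders; the `IDENTIFIED BY` clause inside GRANT reflects the MySQL 5.x syntax of this era:

```python
def replication_grant(user: str, password: str) -> str:
    """Build a GRANT for a replication user with the wildcard '%' host.

    The wildcard is required because instance IPs change on every respin.
    It is only safe because port 3306 is opened solely to your own
    security groups, never to the internet at large.
    """
    return (f"GRANT REPLICATION SLAVE ON *.* "
            f"TO '{user}'@'%' IDENTIFIED BY '{password}';")
```

You would run the resulting statement on the master before pointing a slave at it with CHANGE MASTER TO.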
You may also decide to use an encrypted filesystem for your database mount point, your database backups, and/or your entire filesystem. Be particularly careful of your most sensitive data. If compliance requirements dictate, choose to store very sensitive data outside of the cloud and secure network connections to incorporate it into application pages.
Be particularly careful of your AWS logins. The password recovery mechanism in Amazon Web Services is all that prevents an attacker from controlling your entire infrastructure, after all.
3. Backups
There are a few ways to back up a MySQL database. By far the easiest way in EC2 is using the AWS snapshot mechanism for EBS volumes. Keep in mind you’ll want to encrypt these snapshots, as S3 may not be as secure as you might like. Although you’ll need to lock your MySQL tables during the snapshot, it will typically take only a few seconds before you can release the database locks.
Now snapshots are great, but they can only be used within the AWS environment, so it also behooves you to perform additional backups and move them offsite, either to another cloud provider or to your own internal servers. For this your choices are logical backups or hotbackups.
mysqldump can perform logical backups for you. These backups perform a SELECT * on every table in your database, so they can take quite some time, and they really destroy the warm blocks in your InnoDB buffer cache. What’s more, rebuilding a database from a dump can take quite some time. All these factors should be considered before deciding a dump is the best option for you.
xtrabackup is a great open source tool available from Percona. It can perform hotbackups of all MySQL tables, including MyISAM, InnoDB and XtraDB if you use them. This means the database will stay online, not locking tables, with smarter, less destructive hits to your buffer cache and database server as a whole. The hotbackup builds a complete copy of your datadir, so bringing up the server from a backup involves setting the datadir in your my.cnf file and starting.
4. Disk I/O
Obviously disk I/O is of paramount importance for any database server, including MySQL. In AWS you do not want to use instance store storage at all. Be sure your AMI is built on EBS and, further, use a separate EBS mount point for the database datadir.
An even better configuration than the above, though slightly more complex to configure, is a software RAID stripe over a number of EBS volumes. Linux’s software RAID will create an md0 device file, on top of which you will then create a filesystem – use xfs. Keep in mind that this arrangement will require some care during snapshotting, but it can still work well. The performance gains are well worth it!
5. Network & IPs
When configuring master & slave replication, be sure to use the internal or private IPs and internal domain names, so as not to incur additional network charges. The same goes for your webservers, which will point to your master database and to one or more slaves for read queries.
6. Availability Zones
Amazon Web Services provides a tremendous leap in options for high availability. Take advantage of availability zones by putting one or more of your slaves in a separate zone where possible. Interestingly if you ensure the use of internal or private IP addresses and names, you will not incur additional network charges to servers in other availability zones.
7. Disaster Recovery
EC2 servers are, out of the gate, *NOT* as reliable as traditional servers. This should send shivers down your spine if you’re trying to treat AWS like a traditional hosted environment. You shouldn’t. It should force you to get serious about disaster recovery. Build bulletproof scripts to spin up your servers from custom-built AMIs, and test them. Finally you’re taking disaster recovery as seriously as you always wanted to. Take advantage of availability zones as well, and test various failure scenarios.
8. Vertical and Horizontal Scaling
Interestingly, vertical scaling can be done quite easily in EC2. If you start with a 64-bit AMI, you can stop such a server without losing the root EBS mount. From there you can start a new, larger instance in EC2 and use that existing EBS root volume, and voila, you’ve VERTICALLY scaled your server in place. This is quite a powerful feature at the system administrator’s disposal. Devops has never been smarter! You can do the same to scale *DOWN* if you are no longer using all the power you thought you’d need. Combine this phenomenal AWS feature with a MySQL master-master active/passive configuration, and you can scale vertically with ZERO downtime. Powerful indeed.
We wrote an EC2 Autoscaling Guide for MySQL that you should review.
Along with vertical scaling, you’ll also want the ability to scale out – that is, add more servers to the mix as required, and scale back when your needs reduce. Build smarts into your application so you can point SELECT queries to read-only slaves. Many web applications exhibit the bulk of their work in SELECTs, so being able to scale those horizontally is very powerful and compelling. By baking this logic into the application you also allow the application to check for slave lag. If your slave is lagging slightly behind the master, you can see stale or missing data. In those cases your application can choose to go to the master to get the freshest data.
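The read/write split with a lag check can be sketched as a small routing function. The server names and lag threshold here are hypothetical, and the lag value would come from the slave’s SHOW SLAVE STATUS (Seconds_Behind_Master):

```python
MAX_SLAVE_LAG_S = 5  # hypothetical freshness threshold, in seconds

def pick_server(sql: str, slave_lag_s: int,
                master: str = "db-master", slave: str = "db-slave-1") -> str:
    """Route a query: SELECTs go to a read-only slave unless it lags too far.

    Writes always go to the master; reads fall back to the master when the
    slave is behind, so the application never serves badly stale data.
    """
    is_read = sql.lstrip().upper().startswith("SELECT")
    if is_read and slave_lag_s <= MAX_SLAVE_LAG_S:
        return slave
    return master
```

Real applications usually bury this decision in their database access layer, so individual page handlers never need to know which box answered.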
What about RDS?
Wondering whether RDS is right for you? It may be. We wrote a comprehensive guide to evaluating RDS over MySQL.
If you read this far, you should grab our newsletter!