Join 20,000 others and follow Sean Hull’s scalability, startup & innovation content on twitter @hullsean.
Cloud computing is heralding us into a wonderful era where computing can be bought in small increments, like a utility. This changes the whole way we plan, manage budgets, and accelerates startups making them more agile.
But it’s not all wine & roses up there. I’ve heard a few refrains from clients over the years, and thought I’d share some of the most common.
1. Scaling is automatic
Rather recently I was working with a client on building some sophisticated reports. They needed to slice & dice customer data, over various time series, and summarize with invoices & tracking data. Unfortunately their dataset was large, in the half terabyte range.
Client: Can we just load all this data into the cloud?
Me: Yes we can do that. Build a system in Amazon public cloud, can support large datasets.
Client: I want it to scale easily. So we won’t have these slow reports. And as we add data, it’ll just manage it easily for us.
Me: Well it’s a little bit more complicated than that, unfortunately.
Unfortunately this is a rather familiar conversation that I have quite often. A lot of the press around cloud scalability, centers around auto-scaling, Amazon’s renowned & superb virtualization feature. Yes it’s true you can roll out webservers to scale out this way, but that’s not the end of the story. Typically web applications have a lot of components, from caching servers, to search servers, and of course their backend datastore.
But can we scrap our relational database, such as MySQL and go with one that scales out of the box like Riak, Cassandra or Dynamodb?
Those NoSQL solutions are built to be distributed from the start, it’s true. And they lend themselves to that type of architecture. However, if you’ve built up a dataset in MySQL or Oracle, and more so an application around that, you’ll have to migrate data into the NoSQL solution. That process will take some time.
Like teaching a fish to fly, it make take some time. They do well in water, but evolution takes a bit longer.
Related: RDS or MySQL 10 use cases
2. Disaster recovery is free
In the traditional datacenter, when you want DR, you setup a parallel environment. Hopefully not in the same room, same city or same coast even. Preferrably you do so in a different region. What you can’t get around is dishing out cash for that second datacenter. You need the servers, just in case.
In the cloud, things are different. That’s why we’re here, right? In amazon you have regions already setup & available for plugin-n-play use. Setup your various components, servers, software & configure. Once you’ve verified you can failover to the parallel environment you can just turn off all those instances. Great, no big charges for all that iron that you’d pay for to keep the rooms warm in an old-school datacenter. Or do you?
As it turns out, since you don’t have this environment running all the time, you’ll want to test it more often, run fire drills to bring the servers back online. That’ll incur some costs in terms of manpower. You’ll also want to include in there some scripts to start those servers up, and/or some detailed documentation on how to do that. And don’t lose that documentation, either will you?
You may also want to build some infrastructure as code unit tests. Things change, code checkouts evolve, especially in the agile & continuous integration world. Devops beware!
3. Machines are fast
Fast, fast, fast. That’s what we expect, things keep getting faster, right? Hard to believe then that the world of computing took a big step backward when it jumped into the cloud. Something similar happened when we jumped to commodity Linux a decade ago.
In amazon, it’s a multi-tenant world. And just like apartment buildings, popular restaurants, or busy highways you must share. When things are quiet you may have the road to yourself, but it’ll never be as quiet as a dirt road in the country!
Amazon is making big strides though. They now offer memory optimized & storage optimized instances. And an even bigger development is the addition of the most important feature for performance & scalability. That said the network & EBS can still be a real bottleneck.
4. Backups aren’t necessary
I’ve experienced a few horror stories over the years. I wrote about one noteworthy one When fat fingers take down your business.
True EBS snapshots make backing up your whole server, well a snap! That said a few extra steps have to happen (flush the filesystem & lock tables) to make this work for a relational database like MySQL or Oracle. And suddenly you have a verification step that you also need to perform. You see no backups are valid until they’ve been restored, remember?
But even with these wonderful disk snapshots, you’ll still want to do database dumps, and perhaps table dumps. Operator error, deleting the wrong data, or dropping the wrong tables, will always be a risk. Ignore backups at your own peril!
Check this: Why CTOs underestimate operational costs
5. Outages won’t happen
In an ideal world, everything is redundant, and outages will be a thing of the past. We’ll finally reach five nines uptime and devops everywhere will be out of work. 🙂
It’s true that Amazon provides all the components to build redundancy into your architecture, and very cutting edge firms that have taken netflix’s approach with chaos monkey are seeing big improvements here. But AirBNB did fail and at root it was an Amazon outage that shouldn’t ever happen.