I’m interviewed in a new book Devops Paradox

I had the good fortune to be interviewed earlier in the year by Viktor Farcic. I’m excited to see the book finally released. You can pickup your own copy of Devops Paradox here.

Join 35,000 others and follow Sean Hull on twitter @hullsean.

Our interview is pretty wide ranging, touching on aspects of database administration, cloud operations, automation and devops.

1. Defining Devops

Viktor Farcic: Moving on to a more general subject, how would you define DevOps? I’ve gotten a different answer from every single person I’ve asked.

Sean Hull: I have a lot of opinions about it actually. I wrote an article on my blog a few years ago called The Four-Letter Word Dividing Dev and Ops, with the implication being that the four-letter word might be a swear word, akin to the development team swearing at the operations team, and the operations team swearing at the development team. But the four-letter word I was referring to was “risk.”
To summarize my article, in my view the development and the operations teams of old were separate silos in business, and they had very different mandates. Developers are tasked with writing code to build a product and to answer the needs of the customers, while directly building change into and facilitating a more sophisticated product. So, their thinking from day to day is about change and answering the requirements of the products team.
On the other hand, the operations team’s mandate is stability. It’s, “I don’t want these systems going down at 2:00 a.m.” So, over the long term, the operations teams are thinking about being as conservative as possible and having fewer moving parts, less code, and less new technologies. The simpler your stack is, the more reliable it is and the more robust and less likely it is to fail. I think the traditional reason why developers and operations teams were separated into silos was because of those two very different mandates.
They’re two different ways of prioritizing your work and your priorities when you think about the business and the technology. However, the downside was that those teams didn’t really communicate very well, and they were often at each other’s throats, pushing each other in opposite directions. But to answer your question, “what is DevOps?” I think of DevOps as a cultural movement that has made efforts to allow those teams to communicate better, and that’s a really good thing.

Read: What happened when I offered advice outside my pay grade?

2. How to adapt to change

Viktor Farcic: I have the impression that the speed with which new things are coming is only increasing. How do you keep up with it, and how do companies you work with keep up with all that?

Sean Hull: I don’t think they do keep up. I’ve gone to a lot of companies where they’ve never used serverless. None of their engineers know serverless at all. Lambda, web tasks, and Google Cloud functions have been out for a while, but I think there are very few companies that are able to really take advantage of them. I wrote another article blog post called Is Amazon Web Services Too Complex for Small Dev Teams? where I sort of implied that it is.
I do find a lot of companies want the advantage of on-demand computing, but they really don’t have the in-house expertise yet to really take advantage of all the things that Amazon can do and offer. That’s exactly why people aren’t up to speed on the technology, as it’s just changing so quickly. I’m not sure what the answer is. For me personally, there’s definitely a lot of stuff that I don’t know. I know I’m stronger in Python than I am with Node.js. Some companies have Node.js, and you can write Lambda functions in Java, Node.js, Python, and Go. So, I think Amazon’s investment in new technology allows the platform to evolve faster than a lot of companies are able to really take advantage of it.

Read: What did Matt Ranney discover scaling Uber to 1000 microservices?

3. What does the future hold for Devops?

Viktor Farcic: I’m going to ask you a question now that I hate being asked, so you’re allowed not to answer. Where do you see the future, let’s say a year from now?

Sean Hull:
I see more fragmentation happening across the technology landscape, and I think that that is ultimately making things more fragile because, for example, with microservices, companies don’t think twice about having Ruby, Python, Node.js, and Java. They have 10 different stacks, so when you hire new people, either you have to ask them to learn all those stacks or you have to hire people with each of those individual areas of expertise. The same is true with all these different clouds with their own sets of features: there’s a fragmentation happening.
Let’s look at the iPhone as an example. Think about how complex application testing is for Android versus the iPhone. I mean, you have hundreds of different smartphones that run Android, all with different screen sizes, different hardware, different amounts of memory, and the underlying stuff. Some may even have some extra chips that others don’t have, so how do you test your application across all those different platforms?
When you have fragmentation like that, it means the applications end up not working as well. I think the same thing is happening across the technology spectrum today that happened 10 to 15 years ago, where for your database backend there was Oracle, SQL Server, MySQL, and Postgres. Maybe somebody who’s a DB2 enterprise customer uses DB2, but now there are hundreds of open source databases, graph databases, and DynamoDB versus Cassandra, and so on and so on. There’s no real deep expertise in any of those databases.
What ends up happening is you have cases like what happened with customers who were using MongoDB. They found out the hard way about all of the weird behaviors and performance problems it had, because there just weren’t people around with deep knowledge of what was happening behind the scenes, whereas in Oracle’s space, for example, there are career DBAs that are performance experts that specialize in Oracle internals, so you can hire somebody to solve particular problems in that space.
There aren’t, as far as I know, a lot of people with MongoDB internals expertise. You’d have to call MongoDB themselves; maybe they have a few engineers that they can send out, so what’s the future? I see a lot of fragmentation and complexity, and that makes the internet and internet applications more fragile, more brittle, and more prone to failure.

Related: Can humility help you in your career?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

6 Devops interview questions

via GIPHY

Devops is in serious demand these days. At every meetup or tech event I attend, I hear a recruiter or startup founder talking about it. It seems everyone wants to see benefits of talented operations brought to their business.

Join 37,000 others and follow Sean Hull on twitter @hullsean.

That said the skill set is very broad, which explains why there aren’t more devs picking up the batton.





I thought it would be helpful to put together a list of interview questions. There are certainly others, but here’s what I came up with.

1. Explain the gitflow release process

As a devops engineer you should have a good foundation about software delivery. With that you should understand git very well, especially the standard workflow.

Although there are other methods to manage code, one solid & proven method is gitflow. In a nutshell you have two main branches, development & master. Developers checkout a new branch to add a feature, and push it back to development branch. Your stage server can be built automatically off of this branch.

Periodically you will want to release a new version of the software. For this you merge development to master. UAT is then built automatically off of the master branch. When acceptance testing is done, you deploy off of master to production. Hence the saying always ship trunk.

Bonus points if you know that hotfixes are done directly off the master branch & pushed straight out that way.

Related: 8 questions to ask an AWS expert

2. How do you provision resources?

There are a lot of tools in the devops toolbox these days. One that is great at provisioning resources is Terraform. With it you can specify in declarative code everything your application will need to run in the cloud. From IAM users, roles & groups, dynamodb tables, rds instances, VPCs & subnets, security groups, ec2 instances, ebs volumes, S3 buckets and more.

You may also choose to use CloudFormation of course, but in my experience terraform is more polished. What’s more it supports multi-cloud. Want to deploy in GCP or Azure, just port your templates & you’re up and running in no time.

It takes some time to get used to the new workflow of building things in terraform rather than at the AWS cli or dashboard, but once you do you’ll see benefits right away. You gain all the advantages of versioning code we see with other software development. Want to rollback, no problem. Want to do unit tests against your infrastructure? You can do that too!

Related: Does a 4-letter-word divide dev & ops?

3. How do you configure servers?

The four big choices for configuration management these days are Ansible, Salt, Chef & Puppet. For my money Ansible has some nice advantages.

First it doesn’t require an agent. As long as you have SSH access to your box, you can manage it with Ansible. Plus your existing shell scripts are pretty easy to port to playbooks. Ansible also does not require a server to house your playbooks. Simply keep them in your git repository, and checkout to your desktop. Then run ansible-playbook on the yaml file. Voila, server configuration!

Related: How to hire a developer that doesn’t suck

4. What does testing enable?

Unit testing & integration testing are super import parts of continuous integration. As you automate your tests, you formalize how your site & code should behave. That way when you automate the deployment, you can also automate the test process. Let the software do the drudgework of making sure a new feature hasn’t broken anything on the site.

As you automate more tests, you accelerate the software development process, because you’re doing less and less manually. That means being more agile, and makes the business more nimble.

Related: Is AWS too complex for small dev teams?

5. Explain a use case for Docker

Docker a low overhead way to run virtual machines on your local box or in the cloud. Although they’re not strictly distinct machines, nor do they need to boot an OS, they give you many of those benefits.

Docker can encapsulate legacy applications, allowing you to deploy them to servers that might not otherwise be easy to setup with older packages & software versions.

Docker can be used to build test boxes, during your deploy process to facilitate continuous integration testing.

Docker can be used to provision boxes in the cloud, and with swarm you can orchestrate clusters too. Pretty cool!

Related: Will Microservices just die already?

6. How is communicating relevant to Devops

Since devops brings a new process of continuous delivery to the organization, it involves some risk. Actually doing things the old way involves more risk in the long term, because things can and will break. With automation, you can recovery quicker from failure.

But this new world, requires a leap of faith. It’s not right for every organization or in every case, and you’ll likely strike a balance from what the devops holy book says, and what your org can tolerate. However inevitably communication becomes very important as you advocate for new ways of doing things.

Related: How do I migrate my skills to the cloud?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Is a dangerous anti-ops movement gaining momentum?

devops divide

I was talking with a colleague recently. He asked me …

What do you think of the #no-ops movement that seems to be gaining ground? How is it related to devops?

It’s an interesting question. With technologies like lambda & docker containers, the role & responsibilities & challenges of operations are definitely changing quickly.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

The tooling & automation stacks that are available now are great.  Groundbreaking. Paradigm shifting.  But there’s another devops story that’s buried there waiting to be heard…

1. What is ops anyway?

What exactly is operations anyway? Charity Majors wrote an amazing piece – WTF is operations which I highly recommend reading.

At root, operations is about providing a safe nest where software can live. From incubation, to birth, then care & feeding to maturity.

Also: Why Reddit CTO Martin Weiner wants a boring tech stack

2. Is Noops possible?

The trend to a #NOops movement I think is a dangerous one.

At first glance this might seem reflexive on my part.  After all I’ve specialized in operations & databases for years.  But I think there’s something more insidious here.

Devs are often presiding over the first wave of software. That’s the initial period of perhaps five years, where frenetic product development is happening.  After those years have passed, early innovators are long gone, and an OPS team is trying to keep things running, and patch where necessary.  This is when more conservative thinking, and the perspective of fewer moving parts & a simpler infrastructure seems so obvious.  All the technical debt is piled up & it’s hard to find the front door.

There’s an interesting article The ops identity crisis by Susan Fowler that I’d recommend for further reading.

Related: Is zero downtime even possible on RDS?

3. The dev mandate

I’ve sat in on teams talking about getting rid of ops & how it’ll mean more money to spend on devs etc.  It’s always a surprising sentiment to hear.

I would argue that developers have a mandate to build production & functionality that can directly help customers. This is in essence a mandate for change. Faster, more agile & responsive means quicker to market & more responsive to changes there.

Read: Five reasons to move data to Amazon Redshift

4. The ops mandate

I’ve also heard the other camp, ops talking about how stupid & short sighted devs can be. Deploying the lastest shiny toys, without operational or long term considerations being thought of.

The ops mandate then is for this longer term view. How can we keep systems stable at 2am in the morning? How can we keep them chugging along after five or more years?

This great article Happiness is a boring stack by Jason Kester really sums up the sentiment. The sure & steady, standard & reliable stack wins the operations test every time.

Also: Is Amazon too big to fail?

5. Coming together

Ultimately dev & ops have different mandates.  One for change & new product features, the other against change, for long term stability.  It’s about striking a balance between the two.

It’s always a dance. That’s why dev & ops need to come together. That’s really what devops is all about.

For some further reading, I found Julia Evans’ piece What Is Devops to be an excellent read.

Also: Is the difference between dev & ops a four-letter word?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

How do we lock down cloud systems from disgruntled engineers?

CommitStrip.com

I worked at a customer last year, on a short term assignment. A brilliant engineer had built their infrastructure, automated deployments, and managed all the systems. Sadly despite all the sleepless nights, and dedication, they hadn’t managed to build up good report with management.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

I’ve seen this happen so many times, and I do find it a bit sad. Here’s an engineer who’s working his butt off, really wants the company to succeed. Really cares about the systems. But doesn’t connect well with people, often is dismissive, disrespectful or talks down to people like they’re stupid. All of this burns bridges, and there’s a lot of bad feelings between all parties.

How do you manage the exit process? Here’s a battery of recommendations for changing credentials & logins so that systems can’t be accessed anymore.

1. Lock out API access

You can do this by removing the administrator role or any other role their IAM user might have. That way you keep the account around *just in case*. This will also prevent them from doing anything on the console, but you can see if they attempt any logins.

Also: Is AWS too complex for small dev teams?

2. Lock out of servers

They may have the private keys for various serves in your environment. So to lock them out, scan through all the security groups, and make sure their whitelisted IPs are gone.

Are you using a bastion box for access? That’s ideal because then you only have one accesspoint. Eliminate their login and audit access there. Then you’ve covered your bases.

Related: Does Amazon eat it’s own dogfood?

3. Update deployment keys

At one of my customers the outgoing op had setup many moving parts & automated & orchestrated all the deployment processes beautifully. However he also used his personal github key inside jenkins. So when it went to deploy, it used those credentials to get the code from github. Oops.

We ended up creating a company github account, then updating jenkins with those credentials. There were of course other places in the capistrano bits that also needed to be reviewed.

Read: Is aws a patient that needs constant medication?

4. Update dashboard logins

Monitoring with NewRelic or Nagios? Perhaps you have a centralized dashboard for your internal apps? Or you’re using Slack?

Also: Is Amazon too big to fail?

5. Audit Non-key based logins

Have some servers outside of AWS in a traditional datacenter? Or even servers in AWS that are using usernames & passwords? Be sure to audit the full list of systems, and change passwords or disable accounts for the outgoing sysop.

Also: When hosting data on Amazon turns bloodsport?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Are career promotions like marriage… appealing until your first divorce?

surge pricing engineers

I was recently flipping through an interesting email list. It’s focused for tech leaders, managers & startup entrepreneurs. An HR team lead posted asking about “promotion paths” for engineers.

While I have an intuitive grasp of what engineers at those different levels look like, I’m having trouble making those concrete.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

It struck me how antiquated the whole “career ladder” concept is. Work one job for 20-30 years. It feels like the fairytale of dating that leads safely to marriage. It all seems like a wonderful plan until it fizzles out, employees get jaded, they start seeing the real money being paid elsewhere, and begin looking around.

1. Talent in short supply

I’m not a CTO.  I should preface with that bit.  I’m a consultant.  That said I’ve worked in the tech industry for 20 years, so I have a bit of an opinion here.

Going to meetups, startup industry & pitch events. They’re all like a feeding frenzy. There are more companies hiring now than I remember back in 1998 & 1999. It’s just crazy.

Angel List says 18,000 companies are hiring right now. What about Made In NYC? That shows 735 jobs. And of course there’s Ycombinator who is hiring April 2016, which posts every other month. It has 720 comments as of this writing.

Also: Why I don’t work with recruiters

2. Are salary jumps always larger through external promotion?

I’ve seen a pattern repeated over & over.  An outside firm offers more money & grabs the talent, or the talent gets restless, starts looking & finds they get a bigger bump in salary by leaving, than by internal promotions.  

I don’t know why this is, but it seems almost universal that salary jumps are larger from outside firms, than internally through promotion.  

Also: Why devops talent is so hard to find

3. Building a better ladder

There are great posts on engineering ladders like this one from Neo and also this one from RTR. Also take a look at this one at Artsy. And of course somebody has to go and put theirs up on github. 🙂

All the titles & internal shuffling in the world aren’t going to hide industry pay for long.  When an employee gets wise to their career & the skills marketplace, they’ll eventually learn that title does not equal compensation.

Related: How to hire a developer that doesn’t suck?

4. Building a better culture

In a pricey city like New York, the only thing that seems a counterweight to this is phenomenal culture, chance to build something cool & be surrounded by coworkers you love.  To be sure bouncing around you get less of this. Companies like Etsy comes to mind. According to glassdoor companies like Airbnb, Hubspot & facebook also fit the bill.

Read: 8 questions to ask an aws expert

5. Surge pricing for engineers?

Alternatively to better ladders & promotions, perhaps what Uber did for taxi driving would make sense for hiring engineers too. Let the freelancing phenomenon grow even bigger!

Perhaps we need surge pricing for engineers. That way the very best really do get rewarded the most. Let the marketplace work it’s magic.

Also: When you have to take the fall

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Best of Startup Content on Scalable Startups

strawberries

Join 28,000 others and follow Sean Hull on twitter @hullsean.

Costs

Costs of techops can involve short-term architectural, decisions, but what about the longer term affects of choices? Do cto’s underestimate operational costs?

A stack of…

These days the full stack of a internet or mobile startup involves a lot of varied components, from Chef, Puppet & Ansible, to Nginx, haproxy, redis, solr and some database like MySQL or Postgres on the relational end of the spectrum, or Mongodb, Hbase or Cassandra on the NoSQL side. What type of challenges does this pose to a team? I’m curious,
Do startups assemble at their own risk?

Most used tech

Leo Polovets ran some stats over the Angellist data of startups. He wanted to know Which tech do startups use most? and I summarized the results.

Death of ops?

These days with all the talk of automation, I’ve heard heard developers & even CTO’s argue of a diminishing need for backend administrators. Do startups still need techops?

Speed as a feature

Is Fred Wilson right to say speed is a feature? What does this mean for those migrating or already running in the cloud? How does scalability come into play?

Avoiding outages

Are many outages avoidable? Did Airbnb have to fail?

Performance Review

Reviewing architecture & site speed is a type of engagement that a lot of startups can benefit from. Here’s my Anatomy of a performance review.

Let things fail

Does it sometimes make sense to let things break a little? A tale of managed failure.

Young founders

I worked at one startup with a CTO just out of college. Although they were flush with cash & had real problems scaling, communication problems ultimately soured the engagement. Are you too young to be a boss?

80 million fix

Sometimes fixing serious performance bottlenecks can get a site back up on it’s feet. In this success story they went on to get acquired weeks after the fix. In tongue in cheek fashion I askWhere’s my 80 million dollars?

CTO’s should never do

There are times to get into the trenches. But what if it sacrifices leadership?What should a CTO never do?

Startups too cool for school

Joining YC but have no ideas? No problem. Is my startup too cool for your business school?

Instant business, just add water

Can a business be built in just a weekend? Is there a problem with startup bootcamps?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

5 Things I just learned from James Turnbull about Docker

docker containers

Join 28,000 others and follow Sean Hull on twitter @hullsean.

I just got my hands on a copy of James Turnbull’s new book The Docker Book. It’s an excellent introduction to Linux containers & the powerful things you can do with them. It’s 335 pages covering all the introductory topics to get you up and running and then more advanced topics like working with the docker API, building services & extending docker.

Here’s what I learned…

1. Containers aren’t new

The technology today we call containers in Unix is based on chroot mechanism which was introduced way back in the 80’s.

With traditional virtualization, we use a hypervisor layer, so we emulate hardware. The virtual machine running on top, can run anything, from Windows, to different flavors & versions of unix. It appears to be a completely separate piece of hardware.

With containers we move up to the operating system level, and we create isolation between users. These users all share the same parent operating system. This means it requires dramatically less overhead. That means speed!

Docker is an automation layer built on Lightweight Linux Containers or LXC. To applications it looks like they have their own machine, their own userspace, their own filesystem, their own network.

Also: Is Apple betting against big data?

2. No more VirtualBoxes

Are you tired of waiting for your VMs to spinup? Building dev & test environments becomes lightening fast with Docker. This accelerates software development, and makes a lot of things easier.

Also: When prospects mislead

3. Images, registries & containers

Images share some of the properties of images in hypervisor virtualization. However they are implemented with union file systems. While VirtualBox images take some time to boot, as the entire filesystem must be read & code executed anew, docker images are more like source code to the LXC subsystem.

Registries store your public and private images. The Docker Hub is one popular one. You can also host & deploy your own docker registry as your needs dictate.

Like VMs, containers can be started & stopped at will, albeit at lightening fast speed. They can also be deleted much as a VM can be.

Also: What can new york fashion week teach Chad Dickerson about Net Neutrality?

4. Lightning fast sandboxes

As we mentioned containers are fast. Did we mention really fast?

This can facilitate unit testing & continuous integration. A lot of shops are starting to use Jenkins for continuous integration, and fast testing is key to this process.

Also: Is automation killing old-school technical operations?

5. They work with Vagrant

Are you already using Vagrant to automate deployment of virtual environments. If so the transition is easy. Here Docker becomes your provisioner.

Mark Stratmann put together a great how to, Implementing a Vagrant / Docker Dev environment which we’d recommend you take a look at. You can also head over to the Vagrant docs themselves.

Also: Which tech do startups use most?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Is Hunter Walk right about operations & startups?

The.Rohit - Flickr
The.Rohit – Flickr

Join 26,000 others and follow Sean Hull on twitter @hullsean.

Hunter Walk blogged recently about the importance of building great operations teams. And while he was speaking primarily about business operations, the startup technical operations teams are equally difficult to get right.

1. performance & scalability

As your grows like Birchbox, your customer growth curve may begin to look like a hockey stick. That’s a good problem to have. Will your web application be able to keep up with the onslaught of traffic those customers bring?

Getting performance and scalability just right, will mean fewer site crashes during those key moments when all eyes are on your site.

Also: Is top operations talent hard to find?

2. Operations is key to architecture

Developers will always have strong opinions on architecture. However they may be heavily influenced by their own mandate, features, deliverability & deadlines. So it’s no surprise that they may sometimes choose to build on ORM’s, the middleware brought to you by Hibernate, Cake PHP, Active Record & the like.

And while these technologies seem a necessity in todays modern architectures, they play havoc with your long term scalability. Strong technical operations teams mean a better vision in this area. Heading off your reliance on these technologies will mean managing technical debt before it takes down your country.

Read: Are generalists better at scaling the web?

3. Operations informs strategy

Did you build in those operational switches to turn off the heaviest code, when your site gets overloaded? Operations strategy can help you see these problems on the horizon before they overwhelm you.

Have you considered building a browse only mode for your site? If you’ve ever visited Facebook or Yelp after hours you may have been greeted with the message “We can’t save your comments. Please try again later”. A small innocuous message to end users doesn’t disrupt their enjoyment of the site terribly. But from an technical operations perspective it’s huge. It means teams can perform backups, upgrades and maintenance without interrupting day-to-day activity on the site.

Related: Is scalability a big business?

4. Operations means resilience

We only learn real disaster recovery lessons from storms like Sandy. That’s because resilience highlighted best when it is a real & urgent need.

In technical operations, getting backups right & testing your recovery plan all form key steps in your path to excellence. Get them right before you need them, and ensure repeatability.

Read: Is high availability a real possibility?

5. Operations means technical strength

At the end of the day, getting technical operations right, means you can move from strength to strength. It means building on a solid foundation the likes of Google, Facebook, Foursquare & Etsy. It means you can evolve & grow with your customers, and meet their needs confidently.

Check out: Do startup CEO’s underestimate operational costs?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

5 cloud ideas that aren’t actually true

storm coming

Join 20,000 others and follow Sean Hull’s scalability, startup & innovation content on twitter @hullsean.

Cloud computing is heralding us into a wonderful era where computing can be bought in small increments, like a utility. This changes the whole way we plan, manage budgets, and accelerates startups making them more agile.

But it’s not all wine & roses up there. I’ve heard a few refrains from clients over the years, and thought I’d share some of the most common.

1. Scaling is automatic

Rather recently I was working with a client on building some sophisticated reports. They needed to slice & dice customer data, over various time series, and summarize with invoices & tracking data. Unfortunately their dataset was large, in the half terabyte range.


Client: Can we just load all this data into the cloud?
Me: Yes we can do that. Build a system in Amazon public cloud, can support large datasets.
Client: I want it to scale easily. So we won’t have these slow reports. And as we add data, it’ll just manage it easily for us.
Me: Well it’s a little bit more complicated than that, unfortunately.

Unfortunately this is a rather familiar conversation that I have quite often. A lot of the press around cloud scalability, centers around auto-scaling, Amazon’s renowned & superb virtualization feature. Yes it’s true you can roll out webservers to scale out this way, but that’s not the end of the story. Typically web applications have a lot of components, from caching servers, to search servers, and of course their backend datastore.

But can we scrap our relational database, such as MySQL and go with one that scales out of the box like Riak, Cassandra or Dynamodb?

Those NoSQL solutions are built to be distributed from the start, it’s true. And they lend themselves to that type of architecture. However, if you’ve built up a dataset in MySQL or Oracle, and more so an application around that, you’ll have to migrate data into the NoSQL solution. That process will take some time.

Like teaching a fish to fly, it make take some time. They do well in water, but evolution takes a bit longer.

Related: RDS or MySQL 10 use cases

2. Disaster recovery is free

In the traditional datacenter, when you want DR, you setup a parallel environment. Hopefully not in the same room, same city or same coast even. Preferrably you do so in a different region. What you can’t get around is dishing out cash for that second datacenter. You need the servers, just in case.

In the cloud, things are different. That’s why we’re here, right? In amazon you have regions already setup & available for plugin-n-play use. Setup your various components, servers, software & configure. Once you’ve verified you can failover to the parallel environment you can just turn off all those instances. Great, no big charges for all that iron that you’d pay for to keep the rooms warm in an old-school datacenter. Or do you?

As it turns out, since you don’t have this environment running all the time, you’ll want to test it more often, run fire drills to bring the servers back online. That’ll incur some costs in terms of manpower. You’ll also want to include in there some scripts to start those servers up, and/or some detailed documentation on how to do that. And don’t lose that documentation, either will you?

You may also want to build some infrastructure as code unit tests. Things change, code checkouts evolve, especially in the agile & continuous integration world. Devops beware!

Read this: Why a killer title can make or break your content efforts

3. Machines are fast

Fast, fast, fast. That’s what we expect, things keep getting faster, right? Hard to believe then that the world of computing took a big step backward when it jumped into the cloud. Something similar happened when we jumped to commodity Linux a decade ago.

In amazon, it’s a multi-tenant world. And just like apartment buildings, popular restaurants, or busy highways you must share. When things are quiet you may have the road to yourself, but it’ll never be as quiet as a dirt road in the country!

Amazon is making big strides though. They now offer memory optimized & storage optimized instances. And an even bigger development is the addition of the most important feature for performance & scalability. That said the network & EBS can still be a real bottleneck.

Also: What is a relational database & why is it important?

4. Backups aren’t necessary

I’ve experienced a few horror stories over the years. I wrote about one noteworthy one When fat fingers take down your business.

True EBS snapshots make backing up your whole server, well a snap! That said a few extra steps have to happen (flush the filesystem & lock tables) to make this work for a relational database like MySQL or Oracle. And suddenly you have a verification step that you also need to perform. You see no backups are valid until they’ve been restored, remember?

But even with these wonderful disk snapshots, you’ll still want to do database dumps, and perhaps table dumps. Operator error, deleting the wrong data, or dropping the wrong tables, will always be a risk. Ignore backups at your own peril!

Check this: Why CTOs underestimate operational costs

5. Outages won’t happen

In an ideal world, everything is redundant, and outages will be a thing of the past. We’ll finally reach five nines uptime and devops everywhere will be out of work. 🙂

It’s true that Amazon provides all the components to build redundancy into your architecture, and very cutting edge firms that have taken netflix’s approach with chaos monkey are seeing big improvements here. But AirBNB did fail and at root it was an Amazon outage that shouldn’t ever happen.

Read: Why Oracle won’t kill MySQL

Get more. Monthly insights about scalability, startups & innovation.. Our latest Are SQL Databases Dead?

Why I can’t raise the bar at every firm

Screen-shot-2012-08-02-at-1.28.35-PM

Join 17,000 others and follow Sean Hull on twitter @hullsean. Also check out: Who is Sean Hull?

It may seem counterintuitive. If I am not the best solution provider, why on earth would I highlight it?

I believe by pointing those cases out, I also underline the clients and problems that I’m particularly well suited too, and for which I can really provide value. Read on!

1. People Problems

Sometimes, you’re hired to solve a particular problem which is framed as a technical one. Some process needs to be reworked, recoded or retooled. It’s framed as a technical problem, yet as things unfold the client already has the expertise in-house to solve & write the code. What then?

It may be that the right people aren’t communicating, project managers aren’t seeing the issues, or part of the human systems are gummed up. We can’t raise the technical bar, but we can help getting those folks talking.

I wrote about this before in When You’re Hired to Solve a People Problem.

Also: Why are oil spills & financial instability related to datacenter outages?

2. ORM Usage & Technical Debt

If you’ve read my blog you know I am not very fond of Object Relational Modelers.

I would also argue as Ward Cunningham does so elloquently that technical debt can be a real and pressing problem.

Here we would help identify and frame the problem, though the work of raising the bar technically involves the longer process of retooling & refactoring your code base.

Related: Why database choices are tricky

3. Where Commodity & Offshore Works

Some firms are already making use of odesk or offshoring resources, where you might pay as little as $150/day. If you have a very technical manager or CTO, such a solution may work well for you.

At the other end of the spectrum are the high priced senior consultants from firms like Oracle, Percona or Pythian. Yes they may set you back as much as $3500/day.

In those cases a scalability & performance review may make sense. Here’s how.. Although specialists are necessary, remember to ask yourself Why generalists are better at scaling the web.

We sit in the sweet spot between the two options. With low overhead, our prices are more affordable. At the same time you’re getting a whole lot more than a commodity solution. We’ll communicate in plain language with folks at every technical level. And for many firms that in itself is a value add.

Check this: Does Oracle Aim to Kill MySQL?

4. Existing team did their homework

Believe it or not, I’ve gone into consulting engagements where the existing team has really really done their homework.

In those cases it becomes much harder to raise the bar technically. In those cases I can help when existing team missed something. But more importantly, I can validate a correct setup, or identify technical debt.

Having an outside perspective then, can provide reassurance. As I see ten to fifteen new environments per year, I’ve seen hundreds in the past decade & a half. That’s helpful perspective in itself.

Read: Does Oracle Aim to Kill MySQL?

5. Availability & Uptime Are Already High

I wrote in depth about high availability in the Myth of Five Nines

At the end of the day, availability can only approach perfection, not actually reach it. That’s a property of complex systems. If your uptime is already extremely high, again we can validate your environment, review and provide & summarize findings. But we may not be able to raise the bar.

If that’s you, it’s a good problem to have!

Also: Why AirBNB Didn’t Have to Fail

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters