Why managers & CTOs underestimate operational costs


Join 19k others and follow Sean Hull on twitter @hullsean.

1. Technology choices & talent shortage

I worked at one firm evaluating their technology stack. When we got to the programming language, I stopped in my tracks. “Haskell?” I asked. “Oh, you haven’t heard of it? It’s a really cool functional programming language, and we found it had some cool features that we really wanted to use.”

I had to fight the urge to roll my eyes. Yes, I’d heard of the language. It sits in the same club as Scheme, Lisp & Prolog, the languages you study at university. They’re certainly an interesting bunch and, to be sure, make some things easy that are awkward in imperative programming languages. But did it belong in the stack of this run-of-the-mill internet startup?

In this case the developers had free rein to choose any technologies they liked, adding more & more to the mix almost daily. But what are some of the ramifications here?

Two years, three years, or five years down the line, this team will be long gone, and another team will be picking up the pieces. Will you as a manager be able to find many Haskell experts? What’s more, will you be able to support those choices operationally? Will updates be made often enough to keep the stack secure for years to come?

Also: 5 things toxic to scalability

2. Scalability & server costs

Server costs are easier than ever to estimate. Build your application to serve your first 10,000 customers on Amazon with a couple of web servers and a database server. To grow 100x to a million customers, just vertically scale your db, scale out your web servers and you’re good. Or are you?

What happens when you hit a wall? Did you build your application on ORM technology or take on technical debt? I’ve seen firm after firm struggle with technologies like Hibernate, which eat up precious resources while the team is helpless to eliminate the problem. Tread carefully on these types of questions.

Related: Why you’re not hitting five nines uptime

3. Patching, fixing bugs & managing security

Another long-term cost of an application will be minor repairs and bug fixes. Those might appear in a slow, steady trickle over the years, but security may loom larger. Cross-site scripting, SQL injection and many other threats can be a real headache.

What’s more, fixes may involve the libraries your application sits on top of. And when they are upgraded, your application will require tweaks too. It’s all basic stuff when you’re knee deep in development, but when your application has been deployed, the original team is long gone, and you’re supporting it years later, it can surely get messy.
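To make one of those threats concrete, here is a minimal sketch of guarding against SQL injection with a bound parameter. It uses Python’s built-in sqlite3 module; the users table, the find_user function and the demo helper are assumptions invented for illustration, not part of any application discussed here.

```python
import sqlite3


def find_user(conn, name):
    """Look up a user by name using a bound parameter.

    The ? placeholder makes the driver treat `name` as data, so input
    like "x' OR '1'='1" cannot change the structure of the query.
    """
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


def demo():
    """Build a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn
```

The same lookup built by pasting the name into the SQL string would be open to injection, and that is the kind of fix that is cheap during development and expensive to retrofit years later.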

Read: The four-letter-word dividing dev & ops

4. Missing operational switches

When building a web application, all eyes are on features. Which ones to include, and which are a priority. Pressure is heavy to build functions that can be sold to customers. Pleasing customers is of obvious importance.

So it’s no surprise that backend switches are often missing. But they can be a real boon for an operations team. Suppose you roll out a new feature to support star-ratings on certain pieces of content. An operational switch can be built to allow that feature to be disabled as necessary. If the site is under heavy load, or trouble is brewing, you may desperately want some switches to disable parts of the site, without the whole thing going down. I talk about this in AirBNB didn’t have to fail.
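As a rough sketch of what such a switch might look like, here is the star-ratings example in Python. Everything named here is hypothetical: the FEATURE_ environment variable prefix, the DEFAULT_SWITCHES table, and the is_enabled and render_content functions are assumptions for illustration, and a real site might keep its switches in a config service or database instead.

```python
import os

# Hypothetical switch names and defaults. A real site might store these
# somewhere operations can change them without a deploy.
DEFAULT_SWITCHES = {"star_ratings": True}


def is_enabled(name, env=os.environ):
    """Return True if the named feature is switched on.

    An environment variable such as FEATURE_STAR_RATINGS=off overrides
    the default, which lets operations disable a feature quickly.
    """
    override = env.get("FEATURE_" + name.upper())
    if override is not None:
        return override.strip().lower() in ("1", "on", "true", "yes")
    return DEFAULT_SWITCHES.get(name, False)


def render_content(title, rating, env=os.environ):
    """Render a piece of content, skipping star ratings when disabled."""
    lines = [title]
    if is_enabled("star_ratings", env):
        lines.append("Rating: " + "*" * rating)
    return "\n".join(lines)
```

With the switch off, the page still renders, just without the ratings. Unknown switch names default to off, so a typo fails safe rather than enabling something by accident.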

Another useful thing is a browse-only mode. This allows your site to operate, even when writing to the database is not possible. If you’ve ever tried to post an update on a social network like Twitter, Facebook or Instagram, perhaps late at night, and gotten a “please try again later” message, you’ll understand the value. Here users can’t make changes, but otherwise the site appears to be working, and browsing works normally.
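A browse-only mode can be sketched the same way. In this minimal example the Site class, its in-memory list standing in for the database, and the handle_new_post function are all assumptions made up for illustration. The point is only that writes are refused cleanly while reads carry on.

```python
class ReadOnlyModeError(Exception):
    """Raised when a write is attempted while the site is browse only."""


class Site:
    def __init__(self):
        self.read_only = False          # the operational switch
        self.posts = ["hello world"]    # stand-in for the database

    def list_posts(self):
        # Reads keep working in browse-only mode.
        return list(self.posts)

    def add_post(self, text):
        # Writes are refused up front rather than failing at the database.
        if self.read_only:
            raise ReadOnlyModeError("Please try again later.")
        self.posts.append(text)


def handle_new_post(site, text):
    """Return an (HTTP-style status, message) pair for a post request."""
    try:
        site.add_post(text)
    except ReadOnlyModeError as err:
        return 503, str(err)
    return 201, "created"
```

Flipping site.read_only to True turns every new post into a polite “try again later” response, while list_posts keeps serving what is already there.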

Check this: Are SQL Databases Dead?

5. Consider bitcoin

Mt. Gox, the Japanese exchange handling bitcoin, failed in spectacular fashion. Roughly $500 million worth of the digital currency was stolen. And what’s more, since it’s a frictionless currency that is hard to trace, there are no marked bills to try and track down. Oops!

How does this relate to operational costs? The failure lay squarely with the operations department. Functionally, the site worked fine. But security wasn’t handled well enough, intrusion detection wasn’t employed, and “unspecified weaknesses” were to blame.

Security is one of those things that can be ignored without pain, until something goes wrong. What’s more, if it is being handled well, it’s invisible, and unappreciated besides.

Read this: Why Oracle won’t kill MySQL
