
Do simpler systems fail better?


I was recently reading Greg Kogan’s blog post, “Simple Systems have less downtime.”

It really caught my attention.

Join 35,000 others and follow Sean Hull on twitter @hullsean.

As a professional services consultant, I’ve worked with almost 200 firms over the years. Many of those engagements required unraveling complex systems, often ones that were no longer well understood because the first wave of builders had long since gone.

So this topic resonates strongly for me.

I believe that if firms had adopted this advice, I would have had a lot less work over the years. Seriously.

1. Redundancy

Redundancy means backup systems. If your laptop fails, do you have a second one with all your up-to-date data? If us-east-1 fails, do you have a backup or live copy of your database in another region?

Redundancy isn’t just backup systems; it is also backup people. If Jane who manages Salesforce gets in an accident, what will the business do to support the sales teams? If a system gets hacked and compromised, how can you restore the most recent data?

Complex systems fail in surprising ways. Having a plan B, a plan C, and, for really essential services, a plan D will save the day.
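To make the plan B and plan C idea concrete, here is a minimal Python sketch of a fallback chain. It tries each option in order and uses the first one that works. The function names (`first_that_works`, `read_primary`, `read_replica`, `read_backup`) and the returned strings are hypothetical stand-ins, not a real database API.

```python
from typing import Callable, Sequence


def first_that_works(plans: Sequence[tuple[str, Callable[[], str]]]) -> tuple[str, str]:
    """Try each plan in order and return (plan name, result) for the first success."""
    failures = []
    for name, plan in plans:
        try:
            return name, plan()
        except Exception as exc:  # a real system would catch narrower errors
            failures.append(f"{name}: {exc}")
    raise RuntimeError("every plan failed: " + "; ".join(failures))


# Hypothetical stand-ins for real database reads.
def read_primary() -> str:
    raise ConnectionError("us-east-1 is unreachable")


def read_replica() -> str:
    return "rows from the replica in another region"


def read_backup() -> str:
    return "rows from last night's backup"


plans = [
    ("plan A: primary", read_primary),
    ("plan B: replica", read_replica),
    ("plan C: backup", read_backup),
]
used, data = first_that_works(plans)
print(used, "->", data)
```

The point of keeping it this simple is that the fallback path itself should be easy to understand and test. A plan B nobody has ever exercised is not much of a plan.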

Take Greg’s example of a container ship:
o if the automatic system fails you can steer the thing manually. Wow!
o if other electronics fail, you can control the damn rudder by hand!

Incredible to think a ship that big is basically a giant sailboat when you disengage the powered systems. That is truly a lesson for all of us startup engineers.


2. Overlapping skillsets

If you only have one guy who knows how to use the database platform, that’s a problem. If you have only one woman who knows how to program in Rust, that’s a problem. If there’s only one person who knows how the reporting system works, and can make changes, that’s a problem.

Better to have overlapping job roles and skillsets. If you have a chance to adopt a new technology, make sure it’s a rock-solid one that is mainstream and easy to hire for.
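One rough way to spot these single points of failure is to write down who knows what and flag any skill that only one person has. The sketch below assumes a made-up team; the names other than Jane, the skill lists, and the helper `single_owner_skills` are purely illustrative.

```python
from collections import defaultdict

# Hypothetical team data: who can work on what.
skills = {
    "Jane": {"Salesforce", "reporting"},
    "Raj": {"database", "Rust"},
    "Ana": {"database", "reporting"},
}


def single_owner_skills(team: dict[str, set[str]]) -> dict[str, str]:
    """Return {skill: person} for every skill that only one person has."""
    owners = defaultdict(list)
    for person, known in team.items():
        for skill in known:
            owners[skill].append(person)
    return {skill: people[0] for skill, people in owners.items() if len(people) == 1}


# Print the skills that would vanish if one person left.
for skill, person in sorted(single_owner_skills(skills).items()):
    print(f"Only {person} knows {skill}")
```

Anything this prints is a candidate for cross-training or a new hire.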


3. Beware of technical debt

We’ve all heard the reasons.


o We don’t have the luxury to fix that now.
o We can’t afford the downtime.
o We have pressing features to ship.

But as technical debt piles on, so does complexity. And you’ll quickly end up carrying a larger burden than you realized.

As Kogan advocates, rip and replace is often the more serious solution, and better for the firm. Yes, you’ll have some downtime. Yes, you’ll redirect team members temporarily. But you’ll solve the real problem, and you’ll bring more simplicity to your architecture.

What’s more, the pain of paying down the debt will make you think twice about borrowing in the future!

