Why is maintenance sometimes a forgotten art?

I just finished reading the excellent Why do people neglect maintenance?

Join 35,000 others and follow Sean Hull on twitter @hullsean.

With a wide ranging discussion, from cultural myths to cognitive biases, there are a lot of reasons why your organization may not be giving maintenance the attention it deserves.

Here are some thoughts…

1. Weighing short & long term tradeoffs

When looking at maintenance problems, we must sometimes weigh an easy-to-implement quick fix against the better, though weightier, longer-term fix.

Often the longer-term fix requires more downtime, and so is a harder pill to swallow. So sometimes taking that medicine is put off till later. Weighing how much that delay could cost you is not easy. But that is often the reality for maintenance and ops teams.

Read: How to hire a developer that doesn’t suck

2. Keeping it sexy

One interesting point they make in the article is around the culture of innovation. Since building new products and changing the world is sexy, somehow the day-to-day realities of managing and maintaining those products after delivery can take a back seat.

Just as we prioritize creating new features, and building a changed product, we must always weigh the costs of such change. As I wrote in the four letter word dividing dev and ops, different team members have different mandates. And that’s important.

While developers are mandated with bringing new features to life, ops are mandated with keeping things running at 4am. Software updates, maintenance, and outages are what the ops team is worried about.

Related: Does Amazon have a dirty little secret?

3. Status symbols – dev versus ops

There is some interesting discussion by Andy Jess & Lee about how these different job roles can be viewed as low or high status in an organization.

In my article why we need techops I mention some of this narrative. Once at a keynote, I heard a sales guy advocate a product that would obviate your needing to hire ops. Yep, more money to hire Devs!

On the flip side at ops and DBA conferences, I’ve heard over and over the story of “some idiot developer that took down our production systems…”

You get it on both sides. When will teams ever work together?

Read: The art of resistance – when you have to be the bad guy

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

What tools & tech are devops engineers using today?

I just stumbled upon Graham King’s blog, and I’m liking his writing. He wrote an excellent piece: a developer goes to a DevOps conference.

I’ve been to plenty of Unix & operations type conferences over the years, so topics don’t surprise me. But hearing about a developer’s experience brings a new perspective and some great insights.

1. Tools change but mindset stays the same

Some talk about Devops as doing away with operations. Those job roles just aren’t necessary anymore. Well maybe for a small firm, or maybe shops that have pushed 2-pizza agile to the max. But handing the operations duties to developers has limitations. As I mentioned here (the difference between dev and ops is a four letter word…) these different job roles have different mandates.

It’s like an architect can design a building, and it can be a very beautiful house. But a super or building manager keeps it running over the years. He or she knows what to look for in cracked roofs, knows how to keep rodents & pests at bay, knows how to repair and maintain & stay ahead of the game.

In that analogy, the architect is the developer, while the super or building manager is the operations team. They’re two different mindsets, rarely shared in one person.

Read: What did Matt Ranney discover scaling Uber to 1000 microservices?

2. Being on-call is a b*tch

I could write volumes about being on-call. Getting woken up in the middle of the night because someone pushed broken code is no fun. What’s more, “broken” can have different meanings.

Broken can be something QA should catch, like a button that doesn’t work or an issue with some browser. It could also be that some new product feature doesn’t work properly.

But from the ops perspective, broken could also mean some new feature doesn’t scale. It makes a million API calls, or makes a serverless call that times out. These types of broken are much harder to test for.
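One way to catch the "million API calls" kind of broken before it wakes anyone up is to count calls in a test. Here is a minimal sketch in Python. The client, the two dashboard functions and the URL paths are all hypothetical, made up for illustration. It only shows the idea of measuring how call counts grow with input size.

```python
class CountingClient:
    """Hypothetical stand-in for a real API client; it only counts calls."""

    def __init__(self):
        self.calls = 0

    def get(self, path):
        self.calls += 1
        return {"path": path}


def render_dashboard_naive(client, user_ids):
    # One API call per user: the call count grows with the input size.
    return [client.get(f"/users/{uid}") for uid in user_ids]


def render_dashboard_batched(client, user_ids):
    # One call for the whole list, however many users there are.
    ids = ",".join(str(uid) for uid in user_ids)
    return client.get(f"/users?ids={ids}")


def calls_used(feature, user_ids):
    """Run a feature against a fresh counting client and report its call count."""
    client = CountingClient()
    feature(client, user_ids)
    return client.calls


if __name__ == "__main__":
    users = list(range(1000))
    print("naive:", calls_used(render_dashboard_naive, users))
    print("batched:", calls_used(render_dashboard_batched, users))
```

A check like this won't replace real load testing, but a call budget asserted in CI can flag a feature whose cost grows with every new user.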

This is also why traditionally operations and development were two different teams. Because from the vantage of the business, they had different mandates.

Ops was mandated with stability. So they don’t want change. Change breaks things, and wakes you up at 3am.

Devs are mandated with feature changes and product improvement. So they naturally bring change to the table.

And between the two we search for balance. I wrote a piece that hit on exactly these points: the difference between dev and ops is a four letter word…

Related: Can humility help you in your career?

3. The kingmaker tools

Kubernetes – you’ve heard of it, you’re probably using it. Devs package their app as a Docker container, and ops push that container through a CI/CD pipeline, and finally orchestrate & deploy with Kubernetes. Seems like the *only* way to do things these days, right?

But some argue Docker may not be right for everyone and certainly this stack brings a *lot* of complexity for small organizations.

Related: Is AWS too complex for small dev teams?

Terraform: I’m a big fan of this technology. Once you’ve captured your entire stack in code, you can version it, check it into git, and manage it like any other asset. That’s great, but there are so many other benefits. You can easily deploy that same stack in another region, or tweak it to create dev, stage and production. Cool stuff!
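To make the "same stack, different environment" idea concrete, here is a rough sketch in Python rather than in Terraform's own language. The stack keys, the override table and the helper names are assumptions for illustration only. The point is that one base definition plus small per-environment tweaks gives you dev, stage and production without copy and paste.

```python
import copy

# Hypothetical base stack; the keys are illustrative, not Terraform syntax.
BASE_STACK = {
    "region": "us-east-1",
    "web": {"instance_type": "t3.small", "count": 2},
    "db": {"instance_type": "db.t3.medium", "multi_az": False},
}

# Only the differences from the base are recorded per environment.
OVERRIDES = {
    "dev": {"web": {"count": 1}},
    "stage": {},
    "production": {
        "region": "us-west-2",
        "web": {"count": 6},
        "db": {"multi_az": True},
    },
}


def merge(base, override):
    """Return a deep copy of base with override applied recursively."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def stack_for(environment):
    """Build the full stack definition for one environment."""
    return merge(BASE_STACK, OVERRIDES[environment])


if __name__ == "__main__":
    for env in OVERRIDES:
        print(env, stack_for(env))
```

Because the overrides live in version control next to the base, a diff shows exactly how production differs from stage.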

Related: I tried to build infrastructure as code with Terraform and AWS. It didn’t go as I expected

Ansible: All those bash scripts you have sitting around? Check them into version control before it’s too late! One great thing about Ansible is that, with slight tweaks, it can run those bash scripts almost as-is.

And for ops who already have experience managing things by hand, you can get up to speed with Ansible in a few days. The learning curve isn’t as tough as Puppet or Chef, and it brings many or most of the benefits.

Packer: Here’s another cool tool. Chances are all those AMIs that Amazon has pre-baked need tweaks for your setup. Now you could do all that work post-spinup with Ansible. And that’s fine. But it’ll be slower, and possibly prone to breaking if the base AMI changes.

Enter Packer, another great tool from HashiCorp, the folks who brought us Terraform. This tool allows you to write JSON or HCL template files that then build AMIs. You can then use your pipeline and other automation tools to automate those builds as well. Cool!

Read: What happened when I offered advice outside my pay grade?

Is there a serious skills shortage around devops space?

As devops adoption picks up pace, the signs are everywhere. Infrastructure as code, once a backwater concept and a hoped-for ideal, has become essential to many startups.

Why might that be?

My theory is that devops enables the business in a lot of profound ways. Sure it means one sysadmin can do much more, manage a fleet of servers, and support a large user base. But it goes much deeper than that.

Being able to stand up your entire dev, qa, or production environment at the click of a button transforms software delivery dramatically. It means it can happen more often, more easily, and with less risk to the business. It means you can do things like blue/green deployments, rolling out features without any risk to the production environment running in parallel.

What kind of chops does it take?

Strong generalist skills

For starters you’ll need a pragmatist mindset. Not fanatical about one technology, but open to the many choices available. And as a generalist, you start with a familiarity with a broad spectrum of skills, from coding, troubleshooting & debugging, to performance tuning & integration testing.

Stir into the mix good operating system fundamentals, top to bottom knowledge of Unix & Linux, networking, configuration and more. Maybe you’ve built kernels, compiled packages by hand, or better yet contributed to a few open source projects yourself.

You’ll be comfortable with databases, frontend frameworks, backend technologies & APIs. But that’s not all. You’ll need a broad understanding of cloud technologies, from GCP to AWS. S3, EC2, VPCs, EBS, webservers, caching servers, load balancing, Route53 DNS, serverless lambda. Add to all of that programmable infrastructure through CloudFormation or Terraform.

Related: 30 questions to ask a serverless fanboy

Competent programmer

Although as a devops engineer you probably won’t be doing frontend dev, you’ll need some cursory understanding of it. You should be competent at Python and perhaps Node.js. Maybe Ruby & bash scripts. You’ll need to understand JSON & YAML, CloudFormation & Terraform if you want to deliver IaC.

Related: Does a 4-letter-word divide dev & ops?

Strong sysadmin with ops mindset

These are fundamental. But what does that mean? Ops mindset is born out of necessity. Having seen failures & outages, you prioritize around uptime. A simpler stack means fewer moving parts & less to manage. Do as Martin Weiner would suggest & use boring tech.

But you’ll also need to reason about all these components. That’ll come from dozens of debug & troubleshooting sessions you’ll do through years of practice.

Related: How to hire a developer that doesn’t suck

Understand build systems & deployment models

Build systems like CircleCI, Jenkins or GitLab offer a way to automate code delivery. And as their use becomes more widespread, knowing them becomes de rigueur. But it doesn’t end there.

With deployments you’ll have a lot to choose from. These range from the very simplest, a single-target deploy, to all-at-once, minimum-in-service and rolling upgrades. But if you have completely automated your dev, qa & prod infra buildout, you can dive into blue/green deployments, where you build a completely new infra for each deploy, test it, then tear down the old.
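The difference between these deployment models is easier to see in code. Below is a small sketch in Python. The function names and the health check callback are my own assumptions, not any deployment tool's API. The first function plans a rolling upgrade that keeps a minimum number of instances in service. The second captures the blue/green rule that traffic only moves once the new environment tests healthy.

```python
def rolling_batches(instances, min_in_service):
    """Split instances into upgrade batches so that at least
    min_in_service instances keep serving traffic during each batch."""
    batch_size = len(instances) - min_in_service
    if batch_size < 1:
        raise ValueError("min_in_service leaves no room to take anything down")
    return [
        instances[i:i + batch_size]
        for i in range(0, len(instances), batch_size)
    ]


def blue_green_switch(live, candidate, healthy):
    """Return the environment that should receive traffic after a deploy.
    The candidate only goes live if its health check passes."""
    return candidate if healthy(candidate) else live


if __name__ == "__main__":
    fleet = ["web1", "web2", "web3", "web4", "web5"]
    print(rolling_batches(fleet, min_in_service=3))
    print(blue_green_switch("blue", "green", healthy=lambda env: True))
```

With `min_in_service=0` the plan collapses to a single batch, which is the all-at-once deploy. Raising it trades deploy speed for capacity during the rollout.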

Related: Is AWS too complex for small dev teams?

Personality to communicate across organization

I think if you’ve made it this far you will agree that the technical know-how is a broad spectrum of modern computing expertise. But you’ll also need excellent people skills to put all this into practice.

That’s because devops is also about organizational transformation. Yes devs & ops have to get up to speed on the tech, but the organization has to get on board too. Many entrenched orgs pay lip service to devops, but still do a lot of things manually. This is out of fear as much as it stands as technical debt.

But getting past that requires evangelizing, and advocating. For that a leader in the devops department will need superb people skills. They’ll communicate concepts broadly across the organization to win hearts and minds.

Related: Will Microservices just die already?

Is maintenance as sexy as innovation?

A recent NYT piece on our aging American infrastructure got me thinking. It seems that roads, bridges, airports & city sewer systems are all in need of repair. Sadly, budgets to keep these systems in good repair are often short, so they become larger problems to fix by the time their status becomes critical.

“Americans have an impoverished and immature conception of technology, one that fetishizes innovation as a kind of art and demeans upkeep as a mere drudgery.”

I’m not sure this is an American-only phenomenon. However I do see it a lot with technology companies & startups.

1. Do we have to manage ops in the cloud?

The cloud has enabled infrastructure automation in some pretty phenomenal ways. Code pipelines can deliver changes to a repo, through automated unit testing, and out to customers all without human intervention. This makes teams more agile, and ultimately businesses faster & more profitable.
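As a rough illustration of what "without human intervention" means, here is a toy pipeline runner in Python. The stage names and the lambdas standing in for real build steps are hypothetical. The only idea it shows is that stages run in order and a failure stops everything downstream, so broken code never reaches the deploy step.

```python
def run_pipeline(stages):
    """Run (name, step) pairs in order; stop at the first step that returns False.
    Returns the list of stage names that completed successfully."""
    completed = []
    for name, step in stages:
        if not step():
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    stages = [
        ("checkout", lambda: True),
        ("unit tests", lambda: False),  # a failing test suite
        ("deploy", lambda: True),       # never reached in this run
    ]
    print(run_pipeline(stages))
```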

We might be distracted enough to stop worrying about operations altogether. After all, Amazon knows how to manage broken servers & alert us, right? I wrote about this previously in do we have to manage operations in the cloud, as this sentiment seems to be growing.

Modern applications have a ton of interdependencies. Even with decent integration testing, the full stack is complex, and requires monitoring. Co-tenancy can complicate your performance tuning efforts as neighboring customers may directly affect your application. Third party services may be delivered from smaller or less experienced companies, whose SLA may be limiting besides. And hey if Amazon goes down, I can just tell my customers it was their fault, right?

Also: Is Amazon too big to fail?

2. Do you know Dustin Moskovitz?

Chances are you’ll say no. He was part of the original Facebook team alongside Zuckerberg. You don’t know his name? He had the sexy job of, you guessed it, maintenance! He was the operations guy. Did he write the application code? More than likely he knew that code very well, as he had to fix & maintain it, along with the infrastructure to scale & support Facebook’s massive growth.

Read: Is AWS too complex for small dev teams? The growing demand for Cloud SRE

3. Is a little technical debt ok?

Ward Cunningham has an excellent interview about technical debt. Is a little bit ok? Maybe. But each amount is kicking the can down the road. As the NYT article on maintenance makes clear, you can move the responsibility on to the next administration, the next term, or someone else, but eventually you’ll have a critical problem on your hands, which will be much more expensive to fix.

Related: How to build an operational datastore on Amazon Redshift with S3

Is AWS too complex for small dev teams & startups?

I was discussing a server outage with a colleague recently. AWS had done some confusing things, and the team was rallying to troubleshoot & fix.

He made an offhand comment that caught my attention…


“AWS is too complex for small dev teams. I’d recommend we host in a traditional datacenter.”

It’s an interesting point. For all the fanfare over Amazon, lost in the shuffle is the staggering complexity that we’re taking on. For small firms, this is a cost that’s often forgotten when we smell the on-demand Kool-Aid that is EC2.

Here are my thoughts…

1. Over 70 services offered

Every time I log in to the AWS console there’s a new service offering. Lambda & serverless computing. CodeDeploy, Redshift, EMR, VPCs, developer tools, IoT, the list goes on. If you haven’t enabled MFA on your IAM accounts you’re not alone!

Also: Is Amazon too big to fail?

2. Still complex to build high availability

The song I hear out of Amazon is: we offer all the components for a high availability infrastructure. Multiple availability zones, regions, load balancers, autoscaling, geo & latency DNS routing. What’s more, companies like Netflix have open sourced tools to help.
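The core idea behind multiple availability zones is simple enough to sketch in a few lines of Python. The instance ids and function names below are hypothetical, and real placement is handled by autoscaling groups and load balancers rather than by hand. The sketch just shows why spreading instances matters when a whole zone disappears.

```python
from itertools import cycle


def place_across_zones(instance_ids, zones):
    """Assign instances to zones round-robin; returns {instance_id: zone}."""
    return dict(zip(instance_ids, cycle(zones)))


def survivors_if_zone_fails(placement, failed_zone):
    """Count instances still running if one whole zone goes away."""
    return sum(1 for zone in placement.values() if zone != failed_zone)


if __name__ == "__main__":
    zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
    placement = place_across_zones([f"web{i}" for i in range(6)], zones)
    print(placement)
    print(survivors_if_zone_fails(placement, "us-east-1a"))
```

Put all six instances in one zone instead, and the same failure leaves you with none.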

But at a lot of startups that I see, all these components are not in use, nor are they well understood. Many admins are still using Amazon like an old-school datacenter. And that’s not good.

Sometimes it seems that AWS is a patient in need of constant medication.

Related: Are we fast approaching cloud-mageddon?

3. Need a dedicated devops

As AWS becomes more complex, and the offering more robust, so too grows the need for dedicated ops. If your devs are already out of bandwidth, but you don’t quite have enough need for a full-time resource, a consultant may be an option. Round out the team & keep costs manageable.

If you’re looking for an AWS solutions architect, we can help!

Check out: Does Amazon eat its own dogfood?

4. Orchestration involves many moving parts

Infrastructure as code offers the promise of completely versioning all your servers, configurations and changes. From there we can apply test driven development & bring a more professional level of service to our business. That’s the theory anyway.
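Here is a small sketch of what test driven development can look like once infrastructure is declared as data. It's written in Python with a made-up stack format. The field names (`public`, `open_ports`, `backups`) and both rules are assumptions for illustration, not the schema of any real tool.

```python
def check_stack(stack):
    """Return a list of human-readable problems found in a declared stack."""
    problems = []
    for name, server in stack["servers"].items():
        # Rule 1 (illustrative): no SSH exposed on public-facing servers.
        if 22 in server.get("open_ports", []) and server.get("public", False):
            problems.append(f"{name}: SSH is open on a public server")
        # Rule 2 (illustrative): every server must have backups enabled.
        if not server.get("backups", False):
            problems.append(f"{name}: backups are not enabled")
    return problems


if __name__ == "__main__":
    stack = {
        "servers": {
            "web": {"public": True, "open_ports": [22, 443], "backups": True},
            "db": {"public": False, "open_ports": [5432], "backups": False},
        }
    }
    for problem in check_stack(stack):
        print(problem)
```

Run in CI against every change, a check like this fails the build before a misconfigured server is ever created.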

In practice it brings an incredible number of new toolsets to master and a more complex stack besides. All those components can have bugs and need troubleshooting. This sometimes just kicks the can down the road, moving the complexity elsewhere.

It’s not clear that for smaller shops, all this complexity is manageable.

Also: 5 things toxic to scalability

5. Troubleshooting failed deployments

I was looking at a problem with a broken deploy recently. Turns out a developer had copied & pasted some code off the internet, possibly from a tutorial, and broke deployments to staging.

Yes, perhaps this was avoidable, and more checks & balances could fix it. But my thought is that continuous integration & continuous deployment are not a panacea. More complexity brings a more complex web to untangle.

I sometimes wonder if we aren’t fast approaching cloud-mageddon?

Read: Why Airbnb didn’t have to fail?
