Tag Archives: devops

How do we lock down cloud systems from disgruntled engineers?

I worked at a customer last year, on a short-term assignment. A brilliant engineer had built their infrastructure, automated deployments, and managed all the systems. Sadly, despite all the sleepless nights and dedication, they hadn’t managed to build up good rapport with management.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

I’ve seen this happen so many times, and I do find it a bit sad. Here’s an engineer who’s working his butt off and really wants the company to succeed. He really cares about the systems, but he doesn’t connect well with people, and is often dismissive, disrespectful or talks down to people like they’re stupid. All of this burns bridges, and there are a lot of bad feelings between all parties.

How do you manage the exit process? Here’s a battery of recommendations for changing credentials & logins so that systems can’t be accessed anymore.

1. Lock out API access

You can do this by detaching the administrator policy, and any other policies or group memberships, from their IAM user, and by deactivating their access keys. That way you keep the account around *just in case*. This will also prevent them from doing anything on the console, but you can still see if they attempt any logins.
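If you would rather script this than click through the console, here is a minimal sketch. It assumes `iam` behaves like `boto3.client("iam")`; the function name `lock_out_iam_user` is made up for illustration. It ignores pagination and inline policies, so treat it as a starting point rather than a complete lockout.

```python
def lock_out_iam_user(iam, user_name):
    """Strip permissions and API keys from an IAM user without deleting it.

    `iam` is assumed to behave like boto3.client("iam"). It is passed in so
    the function can be tried against a stub before touching a real account.
    Pagination and inline policies are not handled in this sketch.
    """
    actions = []

    # Detach managed policies, for example AdministratorAccess.
    attached = iam.list_attached_user_policies(UserName=user_name)
    for policy in attached["AttachedPolicies"]:
        iam.detach_user_policy(UserName=user_name, PolicyArn=policy["PolicyArn"])
        actions.append(("detached", policy["PolicyArn"]))

    # Remove the user from any groups, since groups also grant permissions.
    groups = iam.list_groups_for_user(UserName=user_name)
    for group in groups["Groups"]:
        iam.remove_user_from_group(GroupName=group["GroupName"], UserName=user_name)
        actions.append(("left_group", group["GroupName"]))

    # Deactivate access keys rather than deleting them, so they stay auditable.
    keys = iam.list_access_keys(UserName=user_name)
    for key in keys["AccessKeyMetadata"]:
        iam.update_access_key(
            UserName=user_name, AccessKeyId=key["AccessKeyId"], Status="Inactive"
        )
        actions.append(("deactivated_key", key["AccessKeyId"]))

    return actions


# Usage against a real account (requires boto3 and admin credentials):
#   import boto3
#   print(lock_out_iam_user(boto3.client("iam"), "departing.engineer"))
```

The user itself is left in place, which matches the idea of keeping the account around just in case.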

Also: Is AWS too complex for small dev teams?

2. Lock out of servers

They may have the private keys for various servers in your environment. So to lock them out, scan through all the security groups, and make sure their whitelisted IPs are gone.
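Scanning security groups by hand gets tedious fast. Below is a rough sketch of the check, assuming you already have the `SecurityGroups` list in the shape that EC2's `describe_security_groups` call returns. The function name and the sample address are illustrative only, and IPv6 ranges are not covered.

```python
import ipaddress


def find_rules_allowing(security_groups, ip, include_open=False):
    """List ingress rules whose IPv4 CIDR range covers the given address.

    `security_groups` is assumed to look like the "SecurityGroups" list from
    EC2's describe_security_groups response. Rules open to the whole world
    (prefix length 0) are skipped unless include_open is True, because they
    are not specific to one person.
    """
    address = ipaddress.ip_address(ip)
    matches = []
    for group in security_groups:
        for permission in group.get("IpPermissions", []):
            for ip_range in permission.get("IpRanges", []):
                network = ipaddress.ip_network(ip_range["CidrIp"], strict=False)
                if network.prefixlen == 0 and not include_open:
                    continue
                if address in network:
                    matches.append((
                        group["GroupId"],
                        ip_range["CidrIp"],
                        permission.get("FromPort"),
                        permission.get("ToPort"),
                    ))
    return matches


# Usage with real data (requires boto3):
#   import boto3
#   groups = boto3.client("ec2").describe_security_groups()["SecurityGroups"]
#   for hit in find_rules_allowing(groups, "203.0.113.7"):
#       print(hit)
```

Each hit is a rule to review and revoke. Rules open to everyone are a separate problem, so they are left out by default.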

Are you using a bastion box for access? That’s ideal because then you only have one access point. Eliminate their login and audit access there. Then you’ve covered your bases.

Related: Does Amazon eat it’s own dogfood?

3. Update deployment keys

At one of my customers the outgoing op had set up many moving parts & automated & orchestrated all the deployment processes beautifully. However, he also used his personal GitHub key inside Jenkins. So when it went to deploy, it used those credentials to get the code from GitHub. Oops.

We ended up creating a company GitHub account, then updating Jenkins with those credentials. There were of course other places in the Capistrano bits that also needed to be reviewed.
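Finding every place a personal account is wired in is mostly a search problem. Here is a small sketch that walks a directory tree and reports files mentioning a given string, such as a personal GitHub username. The function name, the path and the username in the usage comment are assumptions for illustration, and a plain text search will not see credentials that the tool stores encrypted.

```python
import os


def find_references(root, needle):
    """Return sorted (path, line_number) pairs for lines containing needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="ignore" lets us skim binary-ish files without crashing.
                with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                    for line_number, line in enumerate(handle, start=1):
                        if needle in line:
                            hits.append((path, line_number))
            except OSError:
                # Unreadable file (permissions, broken symlink): skip it.
                continue
    return sorted(hits)


# Usage (hypothetical path and username):
#   for path, line_number in find_references("/var/lib/jenkins/jobs", "jdoe-personal"):
#       print(f"{path}:{line_number}")
```

Run the same search over the deployment scripts and config repos, not just the CI server.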

Read: Is aws a patient that needs constant medication?

4. Update dashboard logins

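Monitoring with New Relic or Nagios? Perhaps you have a centralized dashboard for your internal apps? Or you’re using Slack? Each of these has its own logins, so change any shared passwords and remove or disable the outgoing engineer’s accounts there too.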

Also: Is Amazon too big to fail?

5. Audit Non-key based logins

Have some servers outside of AWS in a traditional datacenter? Or even servers in AWS that are using usernames & passwords? Be sure to audit the full list of systems, and change passwords or disable accounts for the outgoing sysop.
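For the password-based boxes, a first pass at the audit can be as simple as listing which accounts are able to log in at all. The sketch below parses text in the standard `/etc/passwd` format and returns accounts whose shell is not one of the usual "no login" shells. The set of shells and the function name are assumptions, and it does not look at whether a password is locked, so it only narrows down the list to review.

```python
# Shells commonly used to mark an account as unable to log in (an assumption;
# adjust for your distribution).
NOLOGIN_SHELLS = {"/sbin/nologin", "/usr/sbin/nologin", "/bin/false"}


def login_capable_accounts(passwd_text):
    """Return account names from passwd-format text that have a login shell."""
    accounts = []
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 7:
            continue  # Malformed entry: skip rather than guess.
        name, shell = fields[0], fields[6]
        # An empty shell field falls back to the system default shell,
        # so it still counts as login capable.
        if shell not in NOLOGIN_SHELLS:
            accounts.append(name)
    return accounts


# Usage on a server:
#   with open("/etc/passwd", encoding="utf-8") as handle:
#       print(login_capable_accounts(handle.read()))
```

Run it on each server in the inventory and compare the output against the list of people who should still have access.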

Also: When hosting data on Amazon turns bloodsport?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Are career promotions like marriage… appealing until your first divorce?

I was recently flipping through an interesting email list. It’s aimed at tech leaders, managers & startup entrepreneurs. An HR team lead posted asking about “promotion paths” for engineers.

While I have an intuitive grasp of what engineers at those different levels look like, I’m having trouble making those concrete.

It struck me how antiquated the whole “career ladder” concept is. Work one job for 20-30 years. It feels like the fairytale of dating that leads safely to marriage. It all seems like a wonderful plan until it fizzles out, employees get jaded, they start seeing the real money being paid elsewhere, and begin looking around.

1. Talent in short supply

I’m not a CTO.  I should preface with that bit.  I’m a consultant.  That said I’ve worked in the tech industry for 20 years, so I have a bit of an opinion here.

Go to meetups, startup industry & pitch events, and they’re all like a feeding frenzy. There are more companies hiring now than I remember back in 1998 & 1999. It’s just crazy.

Angel List says 18,000 companies are hiring right now. What about Made In NYC? That shows 735 jobs. And of course there’s Y Combinator’s “Who is hiring?” thread for April 2016, which posts every other month. It has 720 comments as of this writing.

Also: Why I don’t work with recruiters

2. Are salary jumps always larger through external promotion?

I’ve seen a pattern repeated over & over.  An outside firm offers more money & grabs the talent, or the talent gets restless, starts looking & finds they get a bigger bump in salary by leaving, than by internal promotions.  

I don’t know why this is, but it seems almost universal that salary jumps are larger from outside firms, than internally through promotion.  

Also: Why devops talent is so hard to find

3. Building a better ladder

There are great posts on engineering ladders like this one from Neo and also this one from RTR. Also take a look at this one at Artsy. And of course somebody has to go and put theirs up on github. 🙂

All the titles & internal shuffling in the world aren’t going to hide industry pay for long.  When an employee gets wise to their career & the skills marketplace, they’ll eventually learn that title does not equal compensation.

Related: How to hire a developer that doesn’t suck?

4. Building a better culture

In a pricey city like New York, the only thing that seems a counterweight to this is phenomenal culture, the chance to build something cool & be surrounded by coworkers you love. To be sure, when you bounce around you get less of this. Companies like Etsy come to mind. According to Glassdoor, companies like Airbnb, Hubspot & Facebook also fit the bill.

Read: 8 questions to ask an aws expert

5. Surge pricing for engineers?

As an alternative to better ladders & promotions, perhaps what Uber did for taxi driving would make sense for hiring engineers too. Let the freelancing phenomenon grow even bigger!

Perhaps we need surge pricing for engineers. That way the very best really do get rewarded the most. Let the marketplace work its magic.

Also: When you have to take the fall

Is AWS too complex for small dev teams & startups?

I was discussing a server outage with a colleague recently. AWS had done some confusing things, and the team was rallying to troubleshoot & fix.

He made an offhand comment that caught my attention…


AWS is too complex for small dev teams. I’d recommend we host in a traditional datacenter.

It’s an interesting point. For all the fanfare over Amazon, lost in the shuffle is the staggering complexity that we’re taking on. For small firms, this is a cost that’s often forgotten when we smell the on-demand Kool-Aid that is EC2.

Here are my thoughts…

1. Over 70 services offered

Every time I log in to the AWS console there’s a new service offering. Lambda & serverless computing. CodeDeploy, Redshift, EMR, VPCs, developer tools, IoT, the list goes on. If you haven’t enabled MFA on your IAM accounts you’re not alone!

Also: Is Amazon too big to fail?

2. Still complex to build high availability

The song I hear out of Amazon is: we offer all the components for a high availability infrastructure. Multiple availability zones, regions, load balancers, autoscaling, geo & latency DNS routing. What’s more, companies like Netflix have open sourced tools to help.

But at a lot of startups that I see, all these components are not in use, nor are they well understood. Many admins are still using Amazon like an old-school datacenter. And that’s not good.

Sometimes it seems that AWS is a patient in need of constant medication.

Related: Are we fast approaching cloud-mageddon?

3. Need a dedicated devops

As AWS becomes more complex, and the offering more robust, so too grows the need for dedicated ops. If your devs are already out of bandwidth, but you don’t quite have enough need for a full-time resource, a consultant may be an option. Round out the team & keep costs manageable.

If you’re looking for an aws solutions architect, we can help!

Check out: Does Amazon eat it’s own dogfood?

4. Orchestration involves many moving parts

Infrastructure as code offers the promise of completely versioning all your servers, configurations and changes. From there we can apply test driven development & bring a more professional level of service to our business. That’s the theory anyway.

In practice it brings an incredible number of new toolsets to master and a more complex stack besides. All those components can have bugs and need troubleshooting. This sometimes just kicks the can down the road, moving the complexity elsewhere.

It’s not clear that for smaller shops, all this complexity is manageable.

Also: 5 things toxic to scalability

5. Troubleshooting failed deployments

I was looking at a problem with a broken deploy recently. Turns out a developer had copied & pasted a code solution off the internet, possibly from a tutorial, and broke deployments to staging.

Yes, perhaps this was avoidable, and more checks & balances can fix it. But my thought is continuous integration & continuous deployments are not a panacea. More complexity brings a more complex web to unweave.

I sometimes wonder if we aren’t fast approaching cloud-mageddon?

Read: Why Airbnb didn’t have to fail?

Why you need a performance dashboard like StackExchange

Most startups talk about performance being crucial. But often, with all the other pressing business demands, it can be forgotten until it becomes a real problem.

Flipping through High Scalability today, I found a post about Stack Exchange’s performance dashboard.

The dashboard for Stack Exchange performance is truly a tectonic shift. They have done a tremendous job with the design, to make this all visually appealing.

But to focus just on the visual aesthetics would be to miss many of the other impacts to the business.

1. Highlight reliability to the business

Many dashboards, from Cacti to New Relic, present performance data. But they’re also quite technical and complicated to understand. This inhibits their usefulness across the business.

The dashboard at Stack Exchange boils performance down to the essentials. What customers are viewing, how quickly the site is serving them, and where bottlenecks are if any.

Also: Is the difference between dev & ops a four-letter word?

2. What’s our architecture?

Another thing their dashboard does is illustrate their infrastructure clearly.

I can’t count the number of startups I’ve worked at where there are extra services running, odd side utility boxes performing tasks, and general disorganization. In some cases engineering can’t tell you what one service or server does.

By outlining the architecture here, they create a living network diagram that everyone benefits from.

Related: Is automation killing old-school operations?

3. Because Fred Wilson says so

If you’re not convinced by what Google says, consider Fred Wilson, who surely should know. He says speed is an essential feature. In fact *the* essential feature.

The 10 Golden Principles of Successful Web Apps from Carsonified on Vimeo.

Read: Do managers underestimate operational cost?

4. Focus on page loading times!

If you scroll to the very bottom of the dashboard, you have two metrics. Homepage load time, and their “questions” page. The homepage is a metric everyone can look at, as many customers will arrive at your site through this portal. The questions page will be different for everyone. But there will be some essential page or business process that it highlights.

By sifting down to just these two metrics, we focus on what’s most important. All of this computing power, all these servers & networks are all working together to bring the fastest page load times possible!
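If you want a crude version of those two numbers for your own site, here is a minimal sketch using only the standard library. It times how long a page takes to download from the server, which is only part of what a user experiences in a browser. The function names and URLs are placeholders, not anything Stack Exchange uses.

```python
import time
import urllib.request


def fetch_seconds(url, opener=urllib.request.urlopen, clock=time.perf_counter):
    """Return the seconds taken to request a URL and read the full body."""
    start = clock()
    with opener(url, timeout=10) as response:
        response.read()
    return clock() - start


def measure_pages(pages, opener=urllib.request.urlopen, clock=time.perf_counter):
    """Time each named page and return a {name: seconds} mapping."""
    return {name: fetch_seconds(url, opener, clock) for name, url in pages.items()}


# Usage (placeholder URLs; swap in your homepage and your key business page):
#   timings = measure_pages({
#       "homepage": "https://www.example.com/",
#       "key_page": "https://www.example.com/questions",
#   })
#   for name, seconds in timings.items():
#       print(f"{name}: {seconds * 1000:.0f} ms")
```

The `opener` and `clock` parameters exist so the timing logic can be checked without hitting the network.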

Also: Is the difference between dev & ops a four-letter word?

5. Expose reliability to the customer

This performance page doesn’t just face the business. It also faces the customers. It lets them know how important speed is, and can underscore how seriously the business takes its customers. Having an outage or a spike that’s slowing you down? Customers have some transparency into what’s happening.

Also: Is the difference between dev & ops a four-letter word?

Best of Startup Content on Scalable Startups

Costs

Costs of techops can involve short-term architectural decisions, but what about the longer term effects of choices? Do CTOs underestimate operational costs?

A stack of…

These days the full stack of an internet or mobile startup involves a lot of varied components, from Chef, Puppet & Ansible, to Nginx, HAProxy, Redis, Solr and some database like MySQL or Postgres on the relational end of the spectrum, or MongoDB, HBase or Cassandra on the NoSQL side. What type of challenges does this pose to a team? I’m curious: Do startups assemble at their own risk?

Most used tech

Leo Polovets ran some stats over the AngelList data of startups. He wanted to know: Which tech do startups use most? I summarized the results.

Death of ops?

These days, with all the talk of automation, I’ve heard developers & even CTOs argue there is a diminishing need for backend administrators. Do startups still need techops?

Speed as a feature

Is Fred Wilson right to say speed is a feature? What does this mean for those migrating or already running in the cloud? How does scalability come into play?

Avoiding outages

Are many outages avoidable? Did Airbnb have to fail?

Performance Review

Reviewing architecture & site speed is a type of engagement that a lot of startups can benefit from. Here’s my Anatomy of a performance review.

Let things fail

Does it sometimes make sense to let things break a little? A tale of managed failure.

Young founders

I worked at one startup with a CTO just out of college. Although they were flush with cash & had real problems scaling, communication problems ultimately soured the engagement. Are you too young to be a boss?

80 million fix

Sometimes fixing serious performance bottlenecks can get a site back up on its feet. In this success story they went on to get acquired weeks after the fix. In tongue-in-cheek fashion I ask Where’s my 80 million dollars?

CTO’s should never do

There are times to get into the trenches. But what if it sacrifices leadership? What should a CTO never do?

Startups too cool for school

Joining YC but have no ideas? No problem. Is my startup too cool for your business school?

Instant business, just add water

Can a business be built in just a weekend? Is there a problem with startup bootcamps?

5 Things I just learned from James Turnbull about Docker

I just got my hands on a copy of James Turnbull’s new book The Docker Book. It’s an excellent introduction to Linux containers & the powerful things you can do with them. It’s 335 pages covering all the introductory topics to get you up and running and then more advanced topics like working with the docker API, building services & extending docker.

Here’s what I learned…

1. Containers aren’t new

The technology we today call containers in Unix is based on the chroot mechanism, which was introduced way back around 1980.

With traditional virtualization, we use a hypervisor layer, so we emulate hardware. The virtual machine running on top can run anything, from Windows to different flavors & versions of Unix. It appears to be a completely separate piece of hardware.

With containers we move up to the operating system level, and we create isolation between users. These users all share the same parent operating system. This means it requires dramatically less overhead. That means speed!

Docker is an automation layer originally built on Linux Containers, or LXC. To applications it looks like they have their own machine, their own userspace, their own filesystem, their own network.

Also: Is Apple betting against big data?

2. No more VirtualBoxes

Are you tired of waiting for your VMs to spin up? Building dev & test environments becomes lightning fast with Docker. This accelerates software development, and makes a lot of things easier.

Also: When prospects mislead

3. Images, registries & containers

Images share some of the properties of images in hypervisor virtualization. However they are implemented with union file systems. While VirtualBox images take some time to boot, as the entire filesystem must be read & code executed anew, docker images are more like source code to the LXC subsystem.

Registries store your public and private images. The Docker Hub is one popular one. You can also host & deploy your own docker registry as your needs dictate.

Like VMs, containers can be started & stopped at will, albeit at lightning fast speed. They can also be deleted much as a VM can be.

Also: What can new york fashion week teach Chad Dickerson about Net Neutrality?

4. Lightning fast sandboxes

As we mentioned containers are fast. Did we mention really fast?

This can facilitate unit testing & continuous integration. A lot of shops are starting to use Jenkins for continuous integration, and fast testing is key to this process.

Also: Is automation killing old-school technical operations?

5. They work with Vagrant

Are you already using Vagrant to automate deployment of virtual environments? If so, the transition is easy. Here Docker becomes your provisioner.

Mark Stratmann put together a great how-to, Implementing a Vagrant / Docker Dev environment, which we’d recommend you take a look at. You can also head over to the Vagrant docs themselves.

Also: Which tech do startups use most?

Is automation killing old-school operations?

I was shocked to find this article on ReadWrite: The Truth About DevOps: IT Isn’t Dead; It’s not even Dying. Wait a second, do people really think this?

Truth is I have heard whispers of this before. I was at a meetup recently where the speaker claimed “With more automation you can eliminate ops. You can then spend more on devs”. To an audience of mostly developers & startup founders, I can imagine the appeal.

1. Does less ops mean more devs?

If you’re listening to a platform service sales person or a developer who needs more resources to get his or her job done, no one would be surprised to hear this. If we can automate away managing the stack, we’ll be able to clear the way for the real work that needs to be done!

This is a very seductive perspective. But it may be akin to taking on technical debt, ignoring the complexity of operations and the perspective that can inform a longer view.

Puppet Labs’ Luke Kanies says “Become uniquely valuable. Become great at something the market finds useful.” I couldn’t agree more.

Read: Are SQL Databases Dead?

2. What happens when developers leave?

I would argue that ops have a longer view of product lifecycle. I for one have been brought in to many projects after the first round of developers have left, and teams are trying to support that software five years after the first version was built.

That sort of long term view, of how to refresh performance, and revitalize code is a unique one. It isn’t the “building the future” mindset, the sexy products, and disruptive first mover “we’re changing the world” mentality.

It’s a more stodgy & conservative one. The mindset is of reliability, simplicity, and long term support.

Also: How to hire a developer that doesn’t suck

3. What’s your mandate?

From what I’ve seen, devs & ops are divided by a four-letter word.

That word, I believe, is “risk”. Devs have a mandate from the business to build features & directly answer customer requests today. Ops have a mandate for reliability, working against change and thinking in terms of making all that change manageable.

Different mandates mean different perspectives.

Related: What is Devops & why is it important?

4. Can infrastructure live as code?

Puppet along with infrastructure automation & configuration management tools like Chef offer the promise of fully automated infrastructure. But the truth is much much more complex. As typical technology stacks expand from load balancer, webserver & database, to multiple databases, caching server, search server, puppet masters, package repositories, monitoring & metrics collection & jump boxes we’re all reaching a saturation point.

Yes automation helps with that saturation, but ultimately you need people with those wide ranging skills, to manage the complex web of dependencies when things fail.

And fail they will.

Check out: Why are MySQL DBA’s and ops so hard to find?

5. ORM’s and architecture

If you aren’t familiar, ORM (object-relational mapper) is a rather dry-sounding name for a component that is regularly overlooked. It’s middleware sitting between application & database, and it drastically simplifies developers’ lives. It helps them write better code and get on with the work of delivering to the business. It’s no wonder ORMs are popular.

But as Ward Cunningham eloquently explains, they are surely technical debt that eventually must get paid. Indeed.

There is broad agreement among professional DBAs. Each query should be written, each one tuned, and each one deployed. Just like any other bit of code. Handing that process to a library is doomed to failure. Yet ORMs are still evolving, and the dream still lives on.
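To make "each one tuned" a little more concrete, here is a small sketch using SQLite from the Python standard library. It writes one query by hand, asks the database for its plan, adds an index, and asks again. The table, column and index names are invented for the example, and the exact plan wording varies between SQLite versions.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the descriptive text.
    return [row[-1] for row in rows]


def plans_before_and_after_index():
    """Show how one hand-written query's plan changes once an index exists."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(i % 50, i * 1.5) for i in range(1000)],
    )

    sql = "SELECT id, total FROM orders WHERE customer_id = ?"
    before = query_plan(conn, sql, (7,))  # expect a full table scan

    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    after = query_plan(conn, sql, (7,))  # expect a search using the new index

    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = plans_before_and_after_index()
    print("before:", before)
    print("after: ", after)
```

With an ORM the SQL is generated for you, so this kind of review only happens if someone goes looking for the queries it emits.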

And all that because devs & ops have a completely different perspective. We need both of them to run modern internet applications. Let’s not forget, folks. 🙂

Read this: Do managers and CTO’s underestimate operational costs?

Is Hunter Walk right about operations & startups?

Hunter Walk blogged recently about the importance of building great operations teams. And while he was speaking primarily about business operations, the startup technical operations teams are equally difficult to get right.

1. performance & scalability

As your startup grows like Birchbox, your customer growth curve may begin to look like a hockey stick. That’s a good problem to have. Will your web application be able to keep up with the onslaught of traffic those customers bring?

Getting performance and scalability just right will mean fewer site crashes during those key moments when all eyes are on your site.

Also: Is top operations talent hard to find?

2. Operations is key to architecture

Developers will always have strong opinions on architecture. However they may be heavily influenced by their own mandate, features, deliverability & deadlines. So it’s no surprise that they may sometimes choose to build on ORM’s, the middleware brought to you by Hibernate, Cake PHP, Active Record & the like.

And while these technologies seem a necessity in today’s modern architectures, they play havoc with your long term scalability. Strong technical operations teams mean a better vision in this area. Heading off your reliance on these technologies will mean managing technical debt before it takes down your country.

Read: Are generalists better at scaling the web?

3. Operations informs strategy

Did you build in those operational switches to turn off the heaviest code, when your site gets overloaded? Operations strategy can help you see these problems on the horizon before they overwhelm you.

Have you considered building a browse-only mode for your site? If you’ve ever visited Facebook or Yelp after hours you may have been greeted with the message “We can’t save your comments. Please try again later”. A small innocuous message to end users doesn’t disrupt their enjoyment of the site terribly. But from a technical operations perspective it’s huge. It means teams can perform backups, upgrades and maintenance without interrupting day-to-day activity on the site.
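Here is a toy sketch of what such an operational switch can look like in application code. The class names are invented for illustration, and a real site would keep the flag somewhere shared, like a config service or database row, rather than in process memory.

```python
class ReadOnlyError(Exception):
    """Raised when a write is attempted while the site is in browse-only mode."""


class SiteSwitches:
    """Operational switches that ops can flip without a code deploy."""

    def __init__(self):
        self.read_only = False


class CommentStore:
    """Toy comment storage that respects the browse-only switch."""

    def __init__(self, switches):
        self.switches = switches
        self.comments = []

    def add(self, text):
        # Writes are refused while the switch is on; reads keep working.
        if self.switches.read_only:
            raise ReadOnlyError("We can't save your comments. Please try again later.")
        self.comments.append(text)

    def list(self):
        return list(self.comments)


if __name__ == "__main__":
    switches = SiteSwitches()
    store = CommentStore(switches)
    store.add("Great post!")

    switches.read_only = True  # ops flips the switch before maintenance
    try:
        store.add("Another comment")
    except ReadOnlyError as error:
        print(error)
    print(store.list())  # browsing still works
```

The point is that the check lives in one place on the write path, so turning it on degrades the site gracefully instead of taking it down.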

Related: Is scalability a big business?

4. Operations means resilience

We only learn real disaster recovery lessons from storms like Sandy. That’s because resilience is highlighted best when it is a real & urgent need.

In technical operations, getting backups right & testing your recovery plan all form key steps in your path to excellence. Get them right before you need them, and ensure repeatability.

Read: Is high availability a real possibility?

5. Operations means technical strength

At the end of the day, getting technical operations right, means you can move from strength to strength. It means building on a solid foundation the likes of Google, Facebook, Foursquare & Etsy. It means you can evolve & grow with your customers, and meet their needs confidently.

Check out: Do startup CEO’s underestimate operational costs?

What happens when you combine devops & continuous delivery into a card game?

Alex Papadimoulis & the guys at Inedo put together CodeMash The Game, an interesting game that brings a new twist to conference going.

Now they’re at it again with a kickstarter to build Release! a game about devops & continuous delivery.

1. Bring your team together

Weekly standups are great, but what about throwing a quick card game in to mix things up? It’s an interesting twist and one that’s sure to help with team building.

Read: Why has no-one heard of Moskovitz but everyone knows Zuckerberg?

2. Learn more about cutting edge software development

Weak on your agile, or want to raise your team’s software quality? Release! seems like a new and surprising way to do just that.

Related: Why I ask clients for a deposit

3. Learn about software development luminaries

Many of the important folks in the evolution of software development are featured in the game, such as Patrick Debois, Jez Humble & Dan North.

Also: Is Amazon RDS hard to manage?