Category Archives: Devops

Top Amazon Lambda questions for hiring a serverless expert

via GIPHY

If you’re looking to fill a job roll that says microservices or find an expert that knows all about serverless computing, you’ll want to have a battery of questions to ask them.

Join 33,000 others and follow Sean Hull on twitter @hullsean.

For technical interviews, I like to focus on concepts & the big picture. Which rules out coding exercises or other puzzles which I think are distracting from the process. I really like what what the guys at 37 Signals say

“Hire for attitude. Train for skill.”

So let’s get started.

1. How do you automate deployment?

Programming lambda functions is much like programming in other areas, with some particular challenges. When you first dive in, you’ll use the Amazon dashboard to upload a zipfile with your code. But as you become more proficient, you’ll want to create a deployment pipeline.

o What features in Amazon facilitate automatic deployments?

AWS Lambda supports environment variables. Use these for credentials & other data you don’t want in your deployment package.

Amazon’s serverless offering, also supports aliases. You can have a dev, stage & production alias. That way you can deploy functions for testing, without interrupting production code. What’s more when you are ready to push to production, the endpoint doesn’t change.

o What frameworks are available for serverless?

Serverless Framework is the most full featured option. It fully supports Amazon Lambda & as of 1.0 provides support for other platforms such as IBM Openwhisk, Google Cloud Functions & Azure functions. There is also something called SAM or Serverless Application Model which extends CloudFormation. With this, you can script changes to API Gateway, Dynamo DB & Cognos authentication stuff.

If you’re using Auth0 instead of Cognito or Firebase instead of Dynamodb, you’ll have to come up with your own way to automate changes there.

Also: Is the difference between dev & ops a four-letter word?

2. What are the pros of serverless?

Why are we moving to a serverless computing model? What are the advantages & benefits of it?

o easier operations means faster time to market
o large application components become managed
o reduced costs, only pay while code is running
o faster deploy means more experimentation, more agile
o no more worry about which servers will this code run on?
o reduced people costs & less infrastructure
o no chef playbooks to manage, no deploy keys or IAM roles

Related: Is automation killing old-school operations?

3. What are the cons of serverless?

There are a lot of fanboys of serverless, because of the promise & hope of this new paradigm. But what about healthy criticism? A little dose of reality can identify a critical & active mind.

o With Lambda you have less vendor control which could mean… more downtime, system limits, sudden cost changes, loss of functionality or features and possible forced API upgrades. Remember that Amazon will choose the needs of the many over your specific application idiosyncracies.

o There’s no dedicated hardware option with serverless. So you have the multi-tenant challenges of security & performance problems of other customers code. You may even bump into problems because of other customers errors!

o Vendor lock-in is a real obvious issue. Changing to Google Cloud Functions or Azure Functions would mean new deployment & monitoring tools, a code rewrite & rearchitect, and new infrastructure too. You would also have to export & import your data. How easy does Amazon make this process?

o You can no longer store application & state data in local server memory. Because each instantiation of a function will effectively be a new “server”. So everything must be stored in the database. This may affect performance.

o Testing is more complicated. With multiple vendors, integration testing becomes more crucial. Also how do you create dev db instance? How do you fully test offline on a laptop?

o You could hit system wide limits. For example a big dev deploy could take out production functions by hitting an AWS account limit. You would thus have DDoS yourself! You can also hit the 5 minute execution time limit. And code will get aborted!

o How do you do zero downtime deployments? Since Amazon currently deploys function-by-function, if you have a group of 10 or 20 that act as a unit, they will get deployed in pieces. So your app would need to be taken offline during that period or it would be executing some from old version & some from new version together. With unpredictable results.

Read: Do managers underestimate operational cost?

4. How does security change?

o In serverless you may use multiple vendors, such as Auth0 for authentication, and perhaps Firebase for your data. With Lambda as your serverless platform you now have three vendors to work with. More vendors means a larger area across which hackers may attack your application.

o With the function as a service application model, you lose the protective wall around your database. It is no longer safely deployed & hidden behind a private subnet. Is this sufficient protection of your key data assets?

Also: Is the difference between dev & ops a four-letter word?

5. How do you troubleshoot & debug microservices?

o Monitoring & debugging is still very limited. This becomes a more complex process in the serverless world. You can log error & warning messages to CloudWatch.

o Currently Lambda doesn’t have any open API for third party tooling. This will probably come with time, but again it’s hard to see & examine a serverless function “server” while it is running.

o For example there is no New Relic for serverless.

o Performance tuning may be a bit of a guessing game in the serverless space right now. Amazon will surely be expanding it’s offering, and this is one area that will need attention.

Also: Is the difference between dev & ops a four-letter word?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Key lessons from the Devops Handbook

I picked up a copy of the DevOps Handbook.

This is not a book about how to setup Amazon servers, how to use git, codePipeline or Jenkins. It’s not about Chef or Ansible or other tools.

Join 33,000 others and follow Sean Hull on twitter @hullsean.

This is a book about processes & people. It’s about how & why automation & world-class infrastructure will make your business more agile, raise quality & increase productivity.

1. Infrastructure in version control

With technologies like Terraform and CloudFormation, the entire state of your infrastructure can be captured. That means you can manage it just like any other code.

Also: Myth of five nines – Why high availability is overrated

2. Pushbutton builds

You’ve heard it before. Automate your builds. That means putting everything in version control, from environment building scripts, to configs, artifacts & reference data. Once you can do that, you’re on your way to automating production deploys completely.

Related: 5 ways to move data to amazon redshift

3. Devs & Ops comingled

In the devops world, devs should learn about operations, infrastructure, performance & more. What’s more operations teams should work closely with devs.

Read: Why were dev & ops siloed job roles?

4. Servers as cattle not pets

In the old days, we logged into servers & provided personal care & feeding. We treated them like pets.

In the new world of devops, we should treat servers like cattle. When it begins to fail, take it out back and shoot it. (tbh i don’t love the analogy, but it carries some meaning…)

Also: Are SQL databases dead?

5. Open to learnings & failures

Organizations that are open to failures, without playing the blame game, learn quicker & recover from problems faster.

Also: Is Amazon too big to fail?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

30 questions to ask a serverless fanboy

Everyone is hot under the collar again. So-called serverless or no-ops services are popping up everywhere allowing you to deploy “just code” into the cloud. Not only won’t you have to login to a server, you won’t even have to know they’re there.

As your code is called, but cloud events such a file upload, or hitting an http endpoint, your code runs. Behind the scene through the magic of containers & autoscaling, Amazon & others are able to provision in milliseconds.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

Pretty cool. Yes even as it outsources the operations role to invisible teams behind Amazon Lambda, Google Cloud Functions or Webtask it’s also making companies more agile, and allowing startup innovation to happen even faster.

Believe it or not I’m a fan too.

That said I thought it would be fun to poke a hole in the bubble, and throw some criticisms at the technology. I mean going serverless today is still bleeding edge, and everyone isn’t cut out to be a pioneer!

With that, here’s 30 questions to throw on the serverless fanboys (and ladies!)…

1. Security

o Are you comfortable removing the barrier around your database?
o With more services, there is more surface area. How do you prevent malicious code?
o How do you know your vendor is doing security right?
o How transparent is your vendor about vulnerabilities?

Also: Myth of five nines – Why high availability is overrated

2. Testing

o How do you do integration testing with multiple vendor service components?
o How do you test your API Gateway configurations?
o Is there a way to version control changes to API Gateway configs?
o Can Terraform or CloudFormation help with this?
o How do you do load testing with a third party db backend?
o Are your QA tests hitting the prod backend db?
o Can you easily create & destroy test dbs?

Related: 5 ways to move data to amazon redshift

3. Management

o How do you do zero downtime deployments with Lambda?
o Is there a way to deploy functions in groups, all at once?
o How do you manage vendor lock-in at the monitoring & tools level but also code & services?
o How do you mitigate your vendors maintenance? Downtime? Upgrades?
o How do you plan for move to alternate vendor? Database import & export may not be ideal, plus code & infrastructure would need to be duplicated.
o How do you manage a third party service for authentication? What are the pros & cons there?
o What are the pros & cons of using a service-based backend database?
o How do you manage redundancy of code when every client needs to talk to backend db?

Read: Why were dev & ops siloed job roles?

4. Monitoring & debugging

o How do you build a third-party monitoring tool? Where are the APIs?
o When you’re down, is it your app or a system-wide problem?
o Where is the New Relic for Lambda?
o How do you degrade gracefully when using multiple vendors?
o How do you monitor execution duration so your function doesn’t fail unexpectedly?
o How do you monitor your account wide limits so dev deploy doesn’t take down production?

Also: Are SQL databases dead?

5. Performance

o How do you handle startup latency?
o How do you optimize code for mobile?
o Does battery life preclude a large codebase on client?
o How do you do caching on server when each invocation resets everything?
o How do you do database connection pooling?

Also: Is Amazon too big to fail?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Some irresistible reading for March – outages, code, databases, legacy & hiring

via GIPHY

I decided this week to write a different type of blog post. Because some of my favorite newsletters are lists of articles on topics of the day.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

Here’s what I’m reading right now.

1. On Outages

While everyone is scrambling to figure out why part of the internet went down … wait is S3 is part of the internet, really? While I’m figuring out if it is a service of Amazon, or if Amazon is so big that Amazon *is* the internet now…

Let’s look at s3 architectural flaws in depth.

Meanwhile Gitlab had an outage too in which they *gasp* lost data. Seriously? An outage is one thing, losing data though. Hmmm…

And this article is brilliant on so many levels. No least because Matthew knows that “post truth” is a trending topic now, and uses it his title. So here we go, AWS Service status truth in a post truth world. Wow!

And meanwhile the Atlantic tries to track down where exactly are those Amazon datacenters?

Also: Is Amazon too big to fail?

2. On Code

Project wise I’m fiddling around with a few fun things.

Take a look at Guy Geerling’s Ansible on a Mac playbooks. Nice!

And meanwhile a very nice deep dive on Amazon Lambda serverless best practices.

Brandur Leach explains how to build awesome APIs aka ones that are robust & idempotent

Meanwhile Frans Rosen explains how to 0wn slack. And no you don’t want this. 🙂

Related: 5 surprising features in Amazon’s serverless Lambda offering

3. On Hiring & Talent

Are you a rock star dev or a digital nomad? Take a look at the 12 best international cities to live in for software devs.

And if you’re wondering who’s hiring? Well just about everyone!

Devs are you blogging? You should be.

Looking to learn or teach… check out codementor.

Also: why did dev & ops used to be separate job roles?

4. On Legacy Systems

I loved Drew Bell’s story of stumbling into home ownership, attempting to fix a doorbell, and falling down a familiar rabbit hole. With parallels to legacy software systems… aka any older then oh say five years?

Ian Bogost ruminates why nothing works anymore… and I don’t think an hour goes by where I don’t ask myself the same question!

Also: Are we fast approaching cloud-mageddon?

5. On Databases

If you grew up on the virtual world of the cloud, you may have never touched hardware besides your own laptop. Developing in this world may completely remove us from understanding those pesky underlying physical layers. Yes indeed folks containers do run in “virtual” machines, but those themselves are running on metal, somewhere down the stack.

With that let’s not forget that No, databases are not for containers… but a healthy reminder ain’t bad..

Meanwhile Larry’s mothership is sinking…(hint: Oracle) Does anybody really care? Now’s the time to revisit Mike Wilson’s classic The difference between god and Larry Ellison.

Read: Are SQL Databases Dead?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

5 surprising features in Amazon’s Lambda serverless offering

Amazon is building out it’s serverless offering at a rapid clip. Lambda makes a great solution for a lot of different use cases including:

o a hybrid approach, building lambda functions for small pieces of your application, sitting along side your full application, working in concert with it

o working with Kinesis firehose to add ETL functionality into your pipeline. Extract Transform & Load is a method of transforming data from a relational or backend transactional databases, into one better fit for reporting & analytics.

o retrofitting your API? Layer Lambda functions in front, to allow you to rebuild in a managed way.

o a natural way to build microservices, with each function as it’s own little universe

Join 32,000 others and follow Sean Hull on twitter @hullsean.

Great, tons of ways to put serverless to use. What’s Amazon doing to make it even better? Here are some of the features you’ll find indispensible in building with Lambda.

1. Versioned functions

As your serverless functions get more sophisticated, you’ll want to control & deploy different versions. Lambda supports this, allowing you to upload multiple copies of the same function. Coupled with Aliases below, this becomes a very powerful feature.

Also: When hosting data on Amazon turns bloodsport

2. Aliases

As you deploy multiple versions of your functions in AWS, you don’t want to recreate the API endpoints each time. That’s where aliases come in. Create one alias for dev, another for test, and a third for production. That way when new versions of those are deployed, all you have to do is change the alias & QA or customers will be hitting the new code. Cool!

Related: Are you getting errors building lambda functions?

3. Caching & throttling

Using the API gateway, we can do some fancy footwork with Lambda. First we can enabling caching to speedup access to our endpoint. Control the time-to-live, capacity of the cache easily. We’ll also need to invalidate the cache when we make changes & redeploy our functions.

Throttling is another useful feature, allowing you to control the maximum number of times your function can be called per second on average (the rate) and maximum number of times (burst limit). These can be set at both the stage & method levels.

Read: Is Amazon too big to fail?

4. Stage variables

Creating multiple stages, for dev, test & production means you can separate out and control environment variables with more granular control. For example suppose you have access & secret keys to reach S3. You can set environment variables for these to avoid committing any credentials or secrets in your code. Definitely don’t do that!

Allowing multiple copies of stage variables, means you can set them separately for dev, test & production.

Also: How to deploy on Amazon EC2 with Vagrant?

5. Logging

You can enable logging in your Lambda function configuration. This will send error and/or info warning messages out to CloudWatch.

You may also choose the log all of the request & response data. This is controlled in the API Gateway settings for individual stages.

Also: Is Amazon RDS hard to manage?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

How can startups learn from the Dyn DNS outage?

storm coming

As most have heard by now, last Friday saw a serious DDOS attack against one of the major US DNS providers, Dyn.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

DNS being such a critical dependency, this affected many businesses across the board. We’re talking twitter, etsy, github, Airbnb & Reddit to name just a few. In fact Amazon Web Services itself was severely affected. And with so many companies hosting on the Amazon cloud, it’s no wonder this took down so much of the internet.

1. What happened?

According to Brian Krebs, a Mirai botnet was responsible for the attack. What’s even scarier, those requests originated for IOT devices. You know, baby monitors, webcams & DVRs. You’ve secured those right? 🙂

Brian has posted a list of IOT device makers that have backdoors & default passwords and are involved. Interesting indeed.

Also: Is a dangerous anti-ops movement gaining momentum?

2. What can be done?

Companies like Dyn & Cloudflare among others spend plenty of energy & engineering resources studying attacks like this one, and figuring out how to reduce risk exposure.

But what about your startup in particular? How can we learn from these types of outages? There are a number of ways that I outline below.

Also: How do we lock down systems from disgruntled engineers?

3. What are your dependencies?

After an outage like the Dyn one, it’s an opportunity to survey your systems. Take stock of what technologies, software & services you rely on. This is something your ops team can & likely wants to do.

What components does your stack rely on? Which versions are hardest to upgrade? What hardware or services do you rely on? Which APIs do you call out to? Which steps or processes are still manual?

Related: The myth of five nines

4. Put your eggs in many baskets

Awareness around your dependencies, helps you see where you may need to build in redundancy. Can you setup a second cloud provider for DR? Can you use an alternate API to get data, when your primary is out? For which dependencies are your hands tied? Where are your weaknesses?

Read: Is AWS too complex for small dev teams?

5. Don’t assume five nines

The gold standard in technology & startup land has been 5 nines availability. This is the SLA we’re expected to shoot for. I’ve argued before (see: myth of five nines) that it’s rarely ever achieved. Outages like this one, bringing hours long downtime, kill hour 5 nines promise for years. That’s because 5 nines means only 5 ½ minutes downtime per year!

Better to be realistic that outages can & will happen, manage & mitigate, and be realistic with your team & your customers.

Also: Is AWS a patient that needs constant medication?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Is a dangerous anti-ops movement gaining momentum?

devops divide

I was talking with a colleague recently. He asked me …

What do you think of the #no-ops movement that seems to be gaining ground? How is it related to devops?

It’s an interesting question. With technologies like lambda & docker containers, the role & responsibilities & challenges of operations are definitely changing quickly.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

The tooling & automation stacks that are available now are great.  Groundbreaking. Paradigm shifting.  But there’s another devops story that’s buried there waiting to be heard…

1. What is ops anyway?

What exactly is operations anyway? Charity Majors wrote an amazing piece – WTF is operations which I highly recommend reading.

At root, operations is about providing a safe nest where software can live. From incubation, to birth, then care & feeding to maturity.

Also: Why Reddit CTO Martin Weiner wants a boring tech stack

2. Is Noops possible?

The trend to a #NOops movement I think is a dangerous one.

At first glance this might seem reflexive on my part.  After all I’ve specialized in operations & databases for years.  But I think there’s something more insidious here.

Devs are often presiding over the first wave of software. That’s the initial period of perhaps five years, where frenetic product development is happening.  After those years have passed, early innovators are long gone, and an OPS team is trying to keep things running, and patch where necessary.  This is when more conservative thinking, and the perspective of fewer moving parts & a simpler infrastructure seems so obvious.  All the technical debt is piled up & it’s hard to find the front door.

There’s an interesting article The ops identity crisis by Susan Fowler that I’d recommend for further reading.

Related: Is zero downtime even possible on RDS?

3. The dev mandate

I’ve sat in on teams talking about getting rid of ops & how it’ll mean more money to spend on devs etc.  It’s always a surprising sentiment to hear.

I would argue that developers have a mandate to build production & functionality that can directly help customers. This is in essence a mandate for change. Faster, more agile & responsive means quicker to market & more responsive to changes there.

Read: Five reasons to move data to Amazon Redshift

4. The ops mandate

I’ve also heard the other camp, ops talking about how stupid & short sighted devs can be. Deploying the lastest shiny toys, without operational or long term considerations being thought of.

The ops mandate then is for this longer term view. How can we keep systems stable at 2am in the morning? How can we keep them chugging along after five or more years?

This great article Happiness is a boring stack by Jason Kester really sums up the sentiment. The sure & steady, standard & reliable stack wins the operations test every time.

Also: Is Amazon too big to fail?

5. Coming together

Ultimately dev & ops have different mandates.  One for change & new product features, the other against change, for long term stability.  It’s about striking a balance between the two.

It’s always a dance. That’s why dev & ops need to come together. That’s really what devops is all about.

For some further reading, I found Julia Evans’ piece What Is Devops to be an excellent read.

Also: Is the difference between dev & ops a four-letter word?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters