Categories
All Cloud Computing CTO/CIO Devops

When should I use Ansible versus packer or Terraform?

via GIPHY

I was having a conversation with a colleague recently. We wee discussing devops, and the topic of Ansible came up as I was advocating it as a great too to get things done.

Join 35,000 others and follow Sean Hull on twitter @hullsean.

Here’s what he had to say…

— quote —
I’ve tried using ansible a few times and this is what I found with it.

It is great for what it does. It’s wonderful to be able to spin up a new app or web server automatically. However what I have found for my needs is …

It is easier to build a piece of furniture than it is to explain all the steps required for someone else to build it. Or in order to replicated the steps automatically.

With cloud servers, it’s enough, for me that I’ve built it once. When I need to spin up another, I simply clone the working copy.

— unquote —

My thoughts below.

1. When is Terraform good

Terraform is a coss-platform infrastructure building tool. If you need an IAM user or S3 bucket, Terraform can create it. Need an ec2 instance of a particular type, deployed with an autoscaling group TF is a great tool for that.

With Terraform you can capture in code, everything about your application stack, so that you can standup a complete copy in another region, that’s powerful!

Read: How can 1% of something equal nothing?

2. When is packer right?

Packer is another useful tool that Devops can use to automate. Like AWS own EC2 Image Builder, it allows you to create the images that you boot your instances off of. Think of them as docker images for the server itself.

For example there are lots of dependencies your application requires, and you’ll install with your package manager. And there are services you want to start. You *could* use an ansible playbook to get these going, but better to build a new image that contains all the software you need on the box.

Packer easily sits into your CI pipeline, so you can have new software deployed and ready anytime.

The principal difference is that a new AMI requires you to spinup a new server. You can’t take action on a running server with this tool.

Related: Is Fred Wilson right about dealing in an honest, direct and transparent way?

3. When does Ansible make sense?

In particular here’s what my response was about Ansible itself.

— quote —

Absolutely. It’s an interesting balance to strike.

Because of course packer or EC2 image builder are very powerful and fit neatly into a CI pipeline. That said there are things ansible is nicely suited for too.

For example I want to distribute public keys onto specific servers. I have a yml file with the keys. I have a new developer starting, I have him or her git checkout branch, edit keys.yml, commit, push changes, then make a pull request. When the new keys.yml file gets merged, an ansible playbook kicks off to distribute the new set of keys to the relavant servers.

— unquote —

If you want to take actions on running servers, like deploying keys or other ongoing tweaks, that is where Ansible really shines.

Related: What mistakes did you make when starting as a consultant?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Categories
All Consulting CTO/CIO Security

How do we secure an existing aws hosted application?

via GIPHY

What if you don’t have the luxury of a greenfield. You are looking at an already built application, and asking yourself, how do I secure this?

Join 35,000 others and follow Sean Hull on twitter @hullsean.

One can think of it as a giant labyrinth, with many turns and many paths. Some of those paths have not had light shining in them for some time. So you’ll need to be cautious, thorough, and vigilant.

Here are some notes on where to start.

1. Scanning – code

One area you’ll need to dig into is the application code itself. If you don’t have the luxury to push new code, you’ll need to verify what version is deployed, and scan the repository for keys or passwords. You can also scan on the server itself. Better to double your efforts.

Read: What do the best engineers do better?

2. Scanning – network

Your VPC is obviously your first layer of defense. Scan the routing table policies, to make sure there aren’t open ports or whitelisted IPs.

Do the same sort of review for security groups, as those are an alternative method for configuring access to servers.

AWS has a service called Flowlogs, which can be enabled. These give you detailed network layer logging, which you can then scan for trouble.

Related: Is Fred Wilson right about dealing in an honest, direct and transparent way?

3. Scanning – IAM, keys & console

Your existing devs probably have keys to some or all of the EC2 boxes. If you don’t want to relaunch all of these boxes with new keys, or don’t have the luxury to do that, you’ll need to lock down the security groups, whitelisted IPs and VPC routing rules.

You’ll also need to carefully review IAM roles & policies. Amazon Inspector may be a useful tool to scan your environment, and find glaring holes and enforce best practices. But you’ll also want to do your own scanning both automated and manually eyeballing the accounts.

You’ll also want to lock down console access, especially the root account, and any others that have adminstrator privileges. Enable password policies and password rotation, as well as multi-factor authentication. There is also a nice toggle for “alert on login”. You certainly want to know about those!

Related: What mistakes did you make when starting as a consultant?

4. Scanning – services

Review all of the AWS services that are deployed. Ask yourself some of these questions:

o which regions & availability zones am I deployed in?
o what elastic IPs do I have configured where?
o what IAM roles & policies do I have created?
o what databases, API gateways & S3 buckets are configured
o etc…

Cloudtrail can be a great help here as it can log all sorts of useful information. You can then scan those logs for problems.

Related: Why did mailchimp fraudulently charge my credit card?

5. Rebuilding

The scanning approach can work, but there is a strong need to be thorough. If you miss one whitelisted IP or existing ssh key, you can leave the whole network open to a crafty intruder.

Another option is to rebuild the whole application. This gives you the time to:

o automate the whole stack with terraform
o test that everything is working
o plan for failover
o ensure that every bucket is secure with lifecycle policies enabled
o ensure that every EBS volume is encrypted
o enabled cloudtrail, cloudwatch etc

o potentially setup in a *brand new* aws account, for even more confidence
o backup all the pieces of the application as you go

Read: Did Disney+ have to fail?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Categories
All Cloud Computing CTO/CIO Devops

I’m interviewed in a new book Devops Paradox

I had the good fortune to be interviewed earlier in the year by Viktor Farcic. I’m excited to see the book finally released. You can pickup your own copy of Devops Paradox here.

Join 35,000 others and follow Sean Hull on twitter @hullsean.

Our interview is pretty wide ranging, touching on aspects of database administration, cloud operations, automation and devops.

1. Defining Devops

Viktor Farcic: Moving on to a more general subject, how would you define DevOps? I’ve gotten a different answer from every single person I’ve asked.

Sean Hull: I have a lot of opinions about it actually. I wrote an article on my blog a few years ago called The Four-Letter Word Dividing Dev and Ops, with the implication being that the four-letter word might be a swear word, akin to the development team swearing at the operations team, and the operations team swearing at the development team. But the four-letter word I was referring to was “risk.”
To summarize my article, in my view the development and the operations teams of old were separate silos in business, and they had very different mandates. Developers are tasked with writing code to build a product and to answer the needs of the customers, while directly building change into and facilitating a more sophisticated product. So, their thinking from day to day is about change and answering the requirements of the products team.
On the other hand, the operations team’s mandate is stability. It’s, “I don’t want these systems going down at 2:00 a.m.” So, over the long term, the operations teams are thinking about being as conservative as possible and having fewer moving parts, less code, and less new technologies. The simpler your stack is, the more reliable it is and the more robust and less likely it is to fail. I think the traditional reason why developers and operations teams were separated into silos was because of those two very different mandates.
They’re two different ways of prioritizing your work and your priorities when you think about the business and the technology. However, the downside was that those teams didn’t really communicate very well, and they were often at each other’s throats, pushing each other in opposite directions. But to answer your question, “what is DevOps?” I think of DevOps as a cultural movement that has made efforts to allow those teams to communicate better, and that’s a really good thing.

Read: What happened when I offered advice outside my pay grade?

2. How to adapt to change

Viktor Farcic: I have the impression that the speed with which new things are coming is only increasing. How do you keep up with it, and how do companies you work with keep up with all that?

Sean Hull: I don’t think they do keep up. I’ve gone to a lot of companies where they’ve never used serverless. None of their engineers know serverless at all. Lambda, web tasks, and Google Cloud functions have been out for a while, but I think there are very few companies that are able to really take advantage of them. I wrote another article blog post called Is Amazon Web Services Too Complex for Small Dev Teams? where I sort of implied that it is.
I do find a lot of companies want the advantage of on-demand computing, but they really don’t have the in-house expertise yet to really take advantage of all the things that Amazon can do and offer. That’s exactly why people aren’t up to speed on the technology, as it’s just changing so quickly. I’m not sure what the answer is. For me personally, there’s definitely a lot of stuff that I don’t know. I know I’m stronger in Python than I am with Node.js. Some companies have Node.js, and you can write Lambda functions in Java, Node.js, Python, and Go. So, I think Amazon’s investment in new technology allows the platform to evolve faster than a lot of companies are able to really take advantage of it.

Read: What did Matt Ranney discover scaling Uber to 1000 microservices?

3. What does the future hold for Devops?

Viktor Farcic: I’m going to ask you a question now that I hate being asked, so you’re allowed not to answer. Where do you see the future, let’s say a year from now?

Sean Hull:
I see more fragmentation happening across the technology landscape, and I think that that is ultimately making things more fragile because, for example, with microservices, companies don’t think twice about having Ruby, Python, Node.js, and Java. They have 10 different stacks, so when you hire new people, either you have to ask them to learn all those stacks or you have to hire people with each of those individual areas of expertise. The same is true with all these different clouds with their own sets of features: there’s a fragmentation happening.
Let’s look at the iPhone as an example. Think about how complex application testing is for Android versus the iPhone. I mean, you have hundreds of different smartphones that run Android, all with different screen sizes, different hardware, different amounts of memory, and the underlying stuff. Some may even have some extra chips that others don’t have, so how do you test your application across all those different platforms?
When you have fragmentation like that, it means the applications end up not working as well. I think the same thing is happening across the technology spectrum today that happened 10 to 15 years ago, where for your database backend there was Oracle, SQL Server, MySQL, and Postgres. Maybe somebody who’s a DB2 enterprise customer uses DB2, but now there are hundreds of open source databases, graph databases, and DynamoDB versus Cassandra, and so on and so on. There’s no real deep expertise in any of those databases.
What ends up happening is you have cases like what happened with customers who were using MongoDB. They found out the hard way about all of the weird behaviors and performance problems it had, because there just weren’t people around with deep knowledge of what was happening behind the scenes, whereas in Oracle’s space, for example, there are career DBAs that are performance experts that specialize in Oracle internals, so you can hire somebody to solve particular problems in that space.
There aren’t, as far as I know, a lot of people with MongoDB internals expertise. You’d have to call MongoDB themselves; maybe they have a few engineers that they can send out, so what’s the future? I see a lot of fragmentation and complexity, and that makes the internet and internet applications more fragile, more brittle, and more prone to failure.

Related: Can humility help you in your career?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Categories
All Cloud Computing Consulting CTO/CIO Devops

Viktor Farcic Interview excerpts

via GIPHY

I recently did an interview with Viktor Farcic all about operations, DBA & Devops. Here are some excerpts: What does Dev-Ops mean?

Join 35,000 others and follow Sean Hull on twitter @hullsean.

Continuing where I left off, I’ve included a few more highlights below. Enjoy!

1. Can I use a tool to migrate to the public cloud?

Viktor Farcic: I’ve seen quite a few of these tools that tell you if you buy our tool, we’re going to transfer whatever you have to the cloud. For example, Docker announced in the last DockerCon that they’re going to put in containers without a single change and everything will work. What do you think about that?

Sean Hull Salespeople often simplify things quite a bit in order to sell a product; in my experience, the devil is in the detail. It’s not to say that an automation tool like that might not be valuable and useful. It might be a good first step to getting your application in the cloud, and it might be an easier way than to rebuild everything one by one. But I doubt that it’s going to work magically just by one script.
EC2 instances, for example, have different performance characteristics, not only in terms of the disk I/O, memory, and CPU, but in smaller instances, they actually throttle the network access so you might spin up an instance and it just might not behave well. It might take time. In fact, all sorts of things could happen. You might have written MySQL scripts that assume you have root access to the server and then you rebuild that in an RDS and you get errors because you don’t have access to those resources on RDS. There’s a lot of things to consider.

Read: What happened when I offered advice outside my pay grade?

2. How do you adapt to change?

Viktor Farcic: I have the impression that the speed with which new things are coming is only increasing. How do you keep up with it, and how do companies you work with keep up with all that?

Sean Hull: I don’t think they do keep up. I’ve gone to a lot of companies where they’ve never used serverless. None of their engineers know serverless at all. Lambda, web tasks, and Google Cloud functions have been out for a while, but I think there are very few companies that are able to really take advantage of them. I wrote another article blog post called Is Amazon Web Services Too Complex for Small Dev Teams? where I sort of implied that it is.
I do find a lot of companies want the advantage of on-demand computing, but they really don’t have the in-house expertise yet to really take advantage of all the things that Amazon can do and offer. That’s exactly why people aren’t up to speed on the technology, as it’s just changing so quickly. I’m not sure what the answer is. For me personally, there’s definitely a lot of stuff that I don’t know. I know I’m stronger in Python than I am with Node.js. Some companies have Node.js, and you can write Lambda functions in Java, Node.js, Python, and Go. So, I think Amazon’s investment in new technology allows the platform to evolve faster than a lot of companies are able to really take advantage of it.
Read: What did Matt Ranney discover scaling Uber to 1000 microservices?

3. What is the future of Devops?

Viktor Farcic: I’m going to ask you a question now that I hate being asked, so you’re allowed not to answer. Where do you see the future, let’s say a year from now?

Sean Hull:
I see more fragmentation happening across the technology landscape, and I think that that is ultimately making things more fragile because, for example, with microservices, companies don’t think twice about having Ruby, Python, Node.js, and Java. They have 10 different stacks, so when you hire new people, either you have to ask them to learn all those stacks or you have to hire people with each of those individual areas of expertise. The same is true with all these different clouds with their own sets of features: there’s a fragmentation happening.
Let’s look at the iPhone as an example. Think about how complex application testing is for Android versus the iPhone. I mean, you have hundreds of different smartphones that run Android, all with different screen sizes, different hardware, different amounts of memory, and the underlying stuff. Some may even have some extra chips that others don’t have, so how do you test your application across all those different platforms?
When you have fragmentation like that, it means the applications end up not working as well. I think the same thing is happening across the technology spectrum today that happened 10 to 15 years ago, where for your database backend there was Oracle, SQL Server, MySQL, and Postgres. Maybe somebody who’s a DB2 enterprise customer uses DB2, but now there are hundreds of open source databases, graph databases, and DynamoDB versus Cassandra, and so on and so on. There’s no real deep expertise in any of those databases.
What ends up happening is you have cases like what happened with customers who were using MongoDB. They found out the hard way about all of the weird behaviors and performance problems it had, because there just weren’t people around with deep knowledge of what was happening behind the scenes, whereas in Oracle’s space, for example, there are career DBAs that are performance experts that specialize in Oracle internals, so you can hire somebody to solve particular problems in that space.
There aren’t, as far as I know, a lot of people with MongoDB internals expertise. You’d have to call MongoDB themselves; maybe they have a few engineers that they can send out, so what’s the future? I see a lot of fragmentation and complexity, and that makes the internet and internet applications more fragile, more brittle, and more prone to failure.

Related: Can humility help you in your career?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Categories
All Cloud Computing CTO/CIO High Availability Startups

Are you as good as the public cloud?

via GIPHY

According to Lyft’s recent public filing, they plan to spend 300 million buckaroos in the next 2.5 years on AWS.

Did I hear that right?

Join 38,000 others and follow Sean Hull on twitter @hullsean.

Perhaps that is their estimate, or the maximum amount they want to budget for. Regardless that’s a lot of money any way you slice it. A lot of folks are commenting about how crazy that is, and how much datacenter you could build yourself with that much money.

What do you think? Is it foolhardy? Or is there a hidden wisdom here?

Here’s my take.

1. Do you have one million customers testing your datacenter?

If you’re comparing the cost of the cloud to the raw numbers of running your own datacenter, the hardware costs are not enough. You’ll need to include the ops teams & other engineers. Right, you probably guessed that.

But did you factor in the costs of a legion of testers. This is the hidden cost that commercial software carries, even while open source software gets this benefit for free.

With a public cloud like AWS you have millions of customers testing the product everyday, and running into edge cases long before you do. So you get a better service, that’s more reliable, all invisibly for free.

Related: How can we keep cloud architectures simple

2. Do you have 66 datacenters spread across 21 regions and a free network between them?

Anybody who was building web applications in the year 2000 will remember how websites didn’t load the same for different customers. Depending on where in the world they were located, they could experience a very different user experience.

These days we assume that we can be global from day one. But how exactly do we achieve this? Remember with a public cloud, you’re getting tons of things for free, without knowing it. Moving data between AZs or regions? That’s all going across a private interconnect.

And that’s not even including the 180 nodes inside cloudfront that give you a global CDN footprint too!

Also: What hidden things does a deposit reveal?

3. Do you have an engineering team automating away job roles?

I remember the days of DBA job role, do you? Probably not. I specialized in this for years, and there were tons of companies hiring me to help them with it. First Oracle, then MySQL, then Postgres.

Then along came Amazon RDS. Guess what, companies don’t really hire for that role anymore. They do need help with it from time to time, but not as a primary specialization.

What do I mean? Well by hosting your application on AWS, you’re benefiting from the work of teams of engineers in different departments, all expanding on APIs and automating things that those one million customers are asking for.

You’re not going to be able to innovate that well and that quickly in your own datacenter. So you’ll pay more!

Read: Can communication mixups sour an engagement?

4. Do you have APIs that tons of engineers have already written code for?

A quick peek at Terraform’s community modules on Github and you’ll probably blush. From VPCs to bastion boxes, key management to load balancers, lots of code has been written and open sourced.

By deploying on a platform that a lot of other devs are using, you’ll benefit from all this open source code. That means you won’t have to write that stuff yourself.

Sure you’ll have integration work to do, but the hidden benefit of being on a popular platform saves you money.

Check out: How I use 5 daily habits to stay on track

5. Can you do disaster recovery for free?

If you build your own datacenter, you have to buy all your capacity. So there are no spare servers sitting around waiting for your use. In the public cloud there is always spare capacity.

What that means is you can write automation code to spinup copies of your application stack in alternate regions, at the push of a button. Thus you effectively get disaster recovery for free!

Also: Can daily notes help you work better with clients?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Categories
All Cloud Computing CTO/CIO Data Database Operations Devops

How can we keep cloud architectures simple?

via GIPHY

I was reading hacker news, as I often do. And I found David Futcher’s post You Don’t Need all That Complex/Expensive/Distracting Infrastructure..

Of course it caught my attention. You may be surprised by the reasons

Join 38,000 others and follow Sean Hull on twitter @hullsean.

One quote that should raise your eyebrows…

I’ve seen the idea that every minute spent on infrastructure is a minute less spent shipping features

Here’s what I think…

1. Performance tuning is often about removing things

That sounds strange right? How can performance tuning be about removing things?

Here are a few examples:

o removing results: When you add an index you remove data, returning just the pieces you need.
o removing lag time: When you remove time, you get faster response. This cascades through your entire application, allowing more requests to get handled in a fixed amount of time. On AWS you get allocated a faster NIC when you use a larger instance size. It’s automatic, though somewhat invisible.
o removing data: By trimming tables, access speeds go up. Reads are faster when you hit the whole table, because there’s fewer records to sift through. Writes are faster because you are maintaining smaller associated indexes.
o removing codepaths: By having fewer libraries, and layers between your application, and the data it retrieves, you have less overhead. And that translates to quicker response time too.
o removing databases: If you’re fully microservices, you have a database behind every service. This means your service sometimes proxies just to get at data that has been decoupled. By consolidating databases to a shared db model, you reduce this cross-traffic dramatically.

Related: When you have to take the fall

2. Are we just building what everyone else does?

In technology as with any other industry, following the big trends is safe. If you’re building an architecture that is used by Facebook, Amazon, Apple, Netflix & Google are using, you’re on the best path, right? Certainly few would criticise their success. So yes it is safe. Even if it fails.

Going with a much simpler architecture, that has even a whiff of so-called legacy, may seem like bucking the trend. But fewer moving parts means less to break, less to manage, and less to tune.

Related: Why generalists are better at scaling the web

3. Customers don’t care

Remember, customers aren’t devops gurus nor do they care about Rust versus Swift versus Elixir. What they care about is they can comment on their social media app or order your widget. They want your product to work.

They don’t care if it is hosted in the cloud, or at a managed datacenter. They probably don’t even care about tiny short outages either. What they do care about is that it works, and works well. And fast.

If your infrastructure allows you to be responsive to customers, roll out new product features & updates, you’re going to have some happy customers. The end!

Related: Why i ask for a deposit

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Categories
All CTO/CIO Devops

Does migrating to the cloud require a mindset change of your team?

via GIPHY

We’ve all heard the success stories at firms that have grappled with automation. The dividends are legendary.

Take Amazon themselves for example. By decoupling their teams, allowing each to grow independently and at their own pace, they’ve been able to scale massively.

Join 38,000 others and follow Sean Hull on twitter @hullsean.

One look at the AWS dashboard these days, or their wikipedia page, reveals over 90 services on offer. And each of those is growing and expanding by day.

I’ve worked with a lot of startups, trying to get there. They’ve heard the gospel, and want to gain the benefits themselves.

Here are the challenges I’ve found.

1. Building ain’t easy

One example story was building an ELK box. ELK is elasticsearch, Logstash and Kibana. It provides a centralized place to send all your application & service logs, collect them all together on one dashboard. It’s the business intelligence of devops & software development. Super valuable tool.

In building our solution, we took a marketplace AMI off the shelf, and then customized that. After building the terraform code to spinup the server, we added Ansible scripts to further customize. This allowed us to add a cronjob for backups, set a password, add additional logstash configs, and a few other important housekeeping tasks.

All was great until we hit a snag, we found some CloudWatch logs were not making there way into ELK. Digging through the log messages, we eventually uncovered an error. And that was caused by a conflicting port configuration. So we removed that unused in logstash.conf, and problem solved.

Later, we rebuilt the server and that was pretty quick. Having all the scripts in place, meant we could rebuild quickly. In this case we just needed to resize the root volume by 25x to make room for future logs. This was 3 lines of terraform code and then done!

A couple of weeks later however, we found missing logs again. Digging digging digging, and then we finally discover it is a repeat of our old problem! Turns out the change to logstash.conf never got rolled into the automation scripts. It was done manually! Bad bad!

Moral of the story, with automation, your workflow needs to change. You should *always be working on the scripts* and then reapplying those. Never work on the server directly!

Time to eat my own dogfood!

Related: Is AGILE right for fixing performance issues?

2. Troubleshooting is tough

In the automation universe, as I wrote above, you really want to avoid logging into servers and doing things manually. But that may be easier said than done.

Take another example, I had an ssh key distribution script. I repurposed from the Terraform Community Modules. It works great when it works. It gets injected onto the server at boot time, by terraform inside the user-data script.

The code gets added to cron, and relies on awscli. As it turns out awscli is *not* on all of the aws linux images. Who knows why?!? But that’s where we are.

Should be easy to install. Use yum to get pip (python package manager) installed. Then use pip to install awscli. The script even has *both* yum and apt-get commands to attempt to install pip on either ubuntu or amzn linux. Problem is sometimes it doesn’t. Sometimes? You ask. Yes indeed.

Digging further, it seems that the new pip package gets installed in /usr/local/bin, while it used to install in /usr/bin/. Seems simple. Add a symlink. Yeah did that. Sometimes the package has a different name, such as python-pip3. Great!

Now all this is magnified because you can’t just go on the box and go through the steps. Why? Because in the primordial cromagnon universe that is linux server boot time, sometimes things happen in weird orders, or slower. So you may have something missing during that period, that is later available. So after boot you see no errors.

Yes complicated. Yes you need to build, destory, build destroy the server in endless cycles.

At the next level of automation, we will implement infrastructure testing pipeline. This will automatically build the server for you. The infrastructure unit testing framework seems pretty darn cool. And there is also Gruntworks Terratest.

Related: Is automation killing old-school operations?

3. The dividend is agility

What have i seen in terms of agility?

Well moving our application to a new region takes 20 minutes. Crazy as that sounds, from vpc, to 3 private subnets, 3 public subnets, bastion boxes, load balancers, rds & redis instances, security groups, ingress rules, iam roles, users, s3 buckets, ecs cluster, and various ec2 instances, route 53 zones & cnames, plus even EIPs all can be moved with a few simple code changes. Wow!

What else? We can resize our ELK box root volume by deploying a brand new setup, all in about ten minutes.

This kind of speed is so exciting. It brings repeatability to your engineering processes. It brings confidence to all of those components.

And best of all it allows the business to experiment with new product ideas, and accelerate in the marketplace.

And we all know what that means!

Related: I have a new appreciation for AGILE

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Categories
All Cloud Computing Consulting CTO/CIO Devops

Before you do infrastructure as code, consider your workflow carefully

via GIPHY

What happens with infrastructure as code when you want to make a change on prod?

I’ve been working on automation for a few years now. When you build your cloud infrastructure with code, you really take everything to a whole new level.

Join 38,000 others and follow Sean Hull on twitter @hullsean.

You might be wondering, what’s the day to day workflow look like? It’s not terribly different from regular software development. You create a branch, write some code, test it, and commit it. But there are some differences.

1. Make a change on production

In this scenario, I have development branch reference the repo directly on my laptop. There is a module, but it references locally like this:

source = “../”

So here are the steps:

1. Make your change to terraform in main repo

This happens in the root source directory. Make your changes to .tf files, and save them.

2. Apply change on dev environment

You haven’t committed any changes to the git repo yet. You want to test them. Make sure there are no syntax errors, and they actually build the cloud resources you expect.

$ terraform plan
$ terraform apply

fix errors, etc.

3. Redeploy containers

If you’re using ECS, your code above may have changed a task definition, or other resources. You may need to update the service. This will force the containers to redeploy fresh with any updates from ECR etc.

4. Eyeball test

You’ll need to ssh to the ecs-host and attach to containers. There you can review env, or verify that docker ps shows your new containers are running.

5. commit changes to version control

Ok, now you’re happy with the changes on dev. Things seem to work, what next? You’ll want to commit your changes to the git repo:

$ git commit -am “added some variables to the application task definition”

6. Now tag your code – we’ll use v1.5

$ git tag -a v1.5 -m “added variables to app task definition”
$ git push origin v1.5

Be sure to push the tag (step 2 above)

7. Update stage terraform module to use v1.5

In your stage main.tf where your stage module definition is, change the source line:

source = “git::https://github.com/hullsean/infra-repo.git?ref=v1.5”

8. Apply changes to stage

$ terraform init
$ terraform plan
$ terraform apply

Note that you have to do terraform init this time. That’s because you are using a new version of your code. So terraform has to go and fetch the whole thing, and cache it in .terraform directory.

9. apply change on prod

Redo steps 7 & 8 for your prod-module main.tf.

Related: When you have to take the fall

2. Pros of infrastructure code

o very professional pipeline
o pipeline can be further automated with tests
o very safe changes on prod
o infra changes managed carefully in version control
o you can back out changes, or see how you got here
o you can audit what has happened historically

Related: When clients don’t pay

3. Cons of over automating

o no easy way to sidestep
o manual changes will break everything
o you have to have a strong knowledge of Terraform
o you need a strong in-depth knowledge of AWS
o the whole team has to be on-board with automation
o you can’t just go in and tweak things

Related: Why i ask for a deposit

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Categories
All Devops

How to setup an Amazon EKS demo with Terraform

via GIPHY

Since EKS is pretty new, there aren’t a lot of howtos on it yet.

I wanted to follow along with Amazon’s Getting started with EKS & Kubernetes Guide.

However I didn’t want to use cloudformation. We all know Terraform is far superior!

Join 38,000 others and follow Sean Hull on twitter @hullsean.

With that I went to work getting it going. And a learned a few lessons along the way.

My steps follow pretty closely with the Amazon guide above, and setting up the guestbook app. The only big difference is I’m using Terraform.

1. create the EKS service role

Create a file called eks-iam-role.tf and add the following:

resource "aws_iam_role" "demo-cluster" {
  name = "terraform-eks-demo-cluster"

  assume_role_policy = --POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "demo-cluster-AmazonEKSClusterPolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = "${aws_iam_role.demo-cluster.name}"
}

resource "aws_iam_role_policy_attachment" "demo-cluster-AmazonEKSServicePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSServicePolicy"
  role       = "${aws_iam_role.demo-cluster.name}"
}

Note we reference demo-cluster resource. We define that in step #3 below.

Related: How to setup Amazon ECS with Terraform

2. Create the EKS vpc

Here’s the code to create the VPC. I’m using the Terraform community module to do this.

There are two things to notice here. One is I reference eks-region variable. Add this in your vars.tf. “us-east-1” or whatever you like. Also add cluster-name to your vars.tf.

Also notice the special tags. Those are super important. If you don’t tag your resources properly, kubernetes won’t be able to do it’s thing. Or rather EKS won’t. I had this problem early on and it is very hard to diagnose. The tags in this vpc module, with propagate to subnets, and security groups which is also crucial.

#
provider "aws" {
  region = "${var.eks-region}"
}

#
module "eks-vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "eks-vpc"
  cidr = "10.0.0.0/16"

  azs             = "${var.eks-azs}"
  private_subnets = "${var.eks-private-cidrs}"
  public_subnets  = "${var.eks-public-cidrs}"

  enable_nat_gateway = false
  single_nat_gateway = true

  #  reuse_nat_ips        = "${var.eks-reuse-eip}"
  enable_vpn_gateway = false

  #  external_nat_ip_ids  = ["${var.eks-nat-fixed-eip}"]
  enable_dns_hostnames = true

  tags = {
    Terraform                                   = "true"
    Environment                                 = "${var.environment_name}"
    "kubernetes.io/cluster/${var.cluster-name}" = "shared"
  }
}

resource "aws_security_group_rule" "allow_http" {
  type              = "ingress"
  from_port         = 80
  to_port           = 80
  protocol          = "TCP"
  security_group_id = "${module.eks-vpc.default_security_group_id}"
  cidr_blocks       = ["0.0.0.0/0"]
}

resource "aws_security_group_rule" "allow_guestbook" {
  type              = "ingress"
  from_port         = 3000
  to_port           = 3000
  protocol          = "TCP"
  security_group_id = "${module.eks-vpc.default_security_group_id}"
  cidr_blocks       = ["0.0.0.0/0"]
}

Related: How I resolved some tough Docker problems when i was troubleshooting amazon ECS

3. Create the EKS Cluster

Creating the cluster is a short bit of terraform code below. The aws_eks_cluster resource.

#
# main EKS terraform resource definition
#
resource "aws_eks_cluster" "eks-cluster" {
  name = "${var.cluster-name}"

  role_arn = "${aws_iam_role.demo-cluster.arn}"

  vpc_config {
    subnet_ids = ["${module.eks-vpc.public_subnets}"]
  }
}

output "endpoint" {
  value = "${aws_eks_cluster.eks-cluster.endpoint}"
}

output "kubeconfig-certificate-authority-data" {
  value = "${aws_eks_cluster.eks-cluster.certificate_authority.0.data}"
}

Related: Is Amazon too big to fail?

4. Install & configure kubectl

The AWS docs are pretty good on this point.

First you need to install the client on your local desktop. For me i used brew install, the mac osx package manager. You’ll also need the heptio-authenticator-aws binary. Again refer to the aws docs for help on this.

The main piece you will add is a directory (~/.kube) and edit this file ~/.kube/config as follows:

apiVersion: v1
clusters:
- cluster:
    server: https://3A3C22EEF7477792E917CB0118DD3X22.yl4.us-east-1.eks.amazonaws.com
    certificate-authority-data: "a-really-really-long-string-of-characters"
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: aws
  name: aws
current-context: aws
kind: Config
preferences: {}
users:
- name: aws
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: heptio-authenticator-aws
      args:
        - "token"
        - "-i"
        - "sean-eks"
      #  - "-r"
      #  - "arn:aws:iam::12345678901:role/sean-eks-role"
      #env:
      #  - name: AWS_PROFILE
      #    value: "seancli"%  

Related: Is AWS too complex for small dev teams?

5. Spinup the worker nodes

This is definitely the largest file in your terraform EKS code. Let me walk you through it a bit.

First we attach some policies to our role. These are all essential to EKS. They’re predefined but you need to group them together.

Then you need to create a security group for your worker nodes. Notice this also has the special kubernetes tag added. Be sure that it there or you’ll have problems.

Then we add some additional ingress rules, which allow workers & the control plane of kubernetes all to communicate with eachother.

Next you’ll see some serious user-data code. This handles all the startup action, on the worker node instances. Notice we reference some variables here, so be sure those are defined.

Lastly we create a launch configuration, and autoscaling group. Notice we give it the AMI as defined in the aws docs. These are EKS optimized images, with all the supporting software. Notice also they are only available currently in us-east-1 and us-west-1.

Notice also that the autoscaling group also has the special kubernetes tag. As I’ve been saying over and over, that super important.

#
# EKS Worker Nodes Resources
#  * IAM role allowing Kubernetes actions to access other AWS services
#  * EC2 Security Group to allow networking traffic
#  * Data source to fetch latest EKS worker AMI
#  * AutoScaling Launch Configuration to configure worker instances
#  * AutoScaling Group to launch worker instances
#

resource "aws_iam_role" "demo-node" {
  name = "terraform-eks-demo-node"

  assume_role_policy = --POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "demo-node-AmazonEKSWorkerNodePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = "${aws_iam_role.demo-node.name}"
}

resource "aws_iam_role_policy_attachment" "demo-node-AmazonEKS_CNI_Policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = "${aws_iam_role.demo-node.name}"
}

resource "aws_iam_role_policy_attachment" "demo-node-AmazonEC2ContainerRegistryReadOnly" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = "${aws_iam_role.demo-node.name}"
}

resource "aws_iam_role_policy_attachment" "demo-node-lb" {
  policy_arn = "arn:aws:iam::12345678901:policy/eks-lb-policy"
  role       = "${aws_iam_role.demo-node.name}"
}

resource "aws_iam_instance_profile" "demo-node" {
  name = "terraform-eks-demo"
  role = "${aws_iam_role.demo-node.name}"
}

resource "aws_security_group" "demo-node" {
  name        = "terraform-eks-demo-node"
  description = "Security group for all nodes in the cluster"

  #  vpc_id      = "${aws_vpc.demo.id}"
  vpc_id = "${module.eks-vpc.vpc_id}"

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = "${
    map(
     "Name", "terraform-eks-demo-node",
     "kubernetes.io/cluster/${var.cluster-name}", "owned",
    )
  }"
}

resource "aws_security_group_rule" "demo-node-ingress-self" {
  description              = "Allow node to communicate with each other"
  from_port                = 0
  protocol                 = "-1"
  security_group_id        = "${aws_security_group.demo-node.id}"
  source_security_group_id = "${aws_security_group.demo-node.id}"
  to_port                  = 65535
  type                     = "ingress"
}

resource "aws_security_group_rule" "demo-node-ingress-cluster" {
  description              = "Allow worker Kubelets and pods to receive communication from the cluster control plane"
  from_port                = 1025
  protocol                 = "tcp"
  security_group_id        = "${aws_security_group.demo-node.id}"
  source_security_group_id = "${module.eks-vpc.default_security_group_id}"
  to_port                  = 65535
  type                     = "ingress"
}

data "aws_ami" "eks-worker" {
  filter {
    name   = "name"
    values = ["eks-worker-*"]
  }

  most_recent = true
  owners      = ["602401143452"] # Amazon
}

# EKS currently documents this required userdata for EKS worker nodes to
# properly configure Kubernetes applications on the EC2 instance.
# We utilize a Terraform local here to simplify Base64 encoding this
# information into the AutoScaling Launch Configuration.
# More information: https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-06-05/amazon-eks-nodegroup.yaml
locals {
  demo-node-userdata = --USERDATA
#!/bin/bash -xe

CA_CERTIFICATE_DIRECTORY=/etc/kubernetes/pki
CA_CERTIFICATE_FILE_PATH=$CA_CERTIFICATE_DIRECTORY/ca.crt
mkdir -p $CA_CERTIFICATE_DIRECTORY
echo "${aws_eks_cluster.eks-cluster.certificate_authority.0.data}" | base64 -d >  $CA_CERTIFICATE_FILE_PATH
INTERNAL_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
sed -i s,MASTER_ENDPOINT,${aws_eks_cluster.eks-cluster.endpoint},g /var/lib/kubelet/kubeconfig
sed -i s,CLUSTER_NAME,${var.cluster-name},g /var/lib/kubelet/kubeconfig
sed -i s,REGION,${var.eks-region},g /etc/systemd/system/kubelet.service
sed -i s,MAX_PODS,20,g /etc/systemd/system/kubelet.service
sed -i s,MASTER_ENDPOINT,${aws_eks_cluster.eks-cluster.endpoint},g /etc/systemd/system/kubelet.service
sed -i s,INTERNAL_IP,$INTERNAL_IP,g /etc/systemd/system/kubelet.service
DNS_CLUSTER_IP=10.100.0.10
if [[ $INTERNAL_IP == 10.* ]] ; then DNS_CLUSTER_IP=172.20.0.10; fi
sed -i s,DNS_CLUSTER_IP,$DNS_CLUSTER_IP,g /etc/systemd/system/kubelet.service
sed -i s,CERTIFICATE_AUTHORITY_FILE,$CA_CERTIFICATE_FILE_PATH,g /var/lib/kubelet/kubeconfig
sed -i s,CLIENT_CA_FILE,$CA_CERTIFICATE_FILE_PATH,g  /etc/systemd/system/kubelet.service
systemctl daemon-reload
systemctl restart kubelet
USERDATA
}

resource "aws_launch_configuration" "demo" {
  associate_public_ip_address = true
  iam_instance_profile        = "${aws_iam_instance_profile.demo-node.name}"
  image_id                    = "${data.aws_ami.eks-worker.id}"
  instance_type               = "m4.large"
  name_prefix                 = "terraform-eks-demo"
  security_groups             = ["${aws_security_group.demo-node.id}"]
  user_data_base64            = "${base64encode(local.demo-node-userdata)}"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "demo" {
  desired_capacity     = 2
  launch_configuration = "${aws_launch_configuration.demo.id}"
  max_size             = 2
  min_size             = 1
  name                 = "terraform-eks-demo"

  #  vpc_zone_identifier  = ["${aws_subnet.demo.*.id}"]
  vpc_zone_identifier = ["${module.eks-vpc.public_subnets}"]

  tag {
    key                 = "Name"
    value               = "eks-worker-node"
    propagate_at_launch = true
  }

  tag {
    key                 = "kubernetes.io/cluster/${var.cluster-name}"
    value               = "owned"
    propagate_at_launch = true
  }
}

Related: How to hire a developer that doesn’t suck

6. Enable & Test worker nodes

If you haven’t already done so, apply all your above terraform:

$ terraform init
$ terraform plan
$ terraform apply

After that all runs, and all your resources are created. Now edit the file “aws-auth-cm.yaml” with the following contents:

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::12345678901:role/terraform-eks-demo-node
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes% 

Then apply it to your cluster:

$ kubectl apply -f aws-auth-cm.yaml

you should be able to use kubectl to view node status:

$ kubectl get nodes
NAME                           STATUS    ROLES     AGE       VERSION
ip-10-0-101-189.ec2.internal   Ready         10d       v1.10.3
ip-10-0-102-182.ec2.internal   Ready         10d       v1.10.3
$ 

Related: Why would I help a customer that’s not paying?

7. Setup guestbook app

Finally you can follow the exact steps in the AWS docs to create the app. Here they are again:

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.3/examples/guestbook-go/redis-master-controller.json
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.3/examples/guestbook-go/redis-master-service.json
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.3/examples/guestbook-go/redis-slave-controller.json
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.3/examples/guestbook-go/redis-slave-service.json
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.3/examples/guestbook-go/guestbook-controller.json
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.3/examples/guestbook-go/guestbook-service.json

Then you can get the endpoint with kubectl:

$ kubectl get services        
NAME           TYPE           CLUSTER-IP       EXTERNAL-IP        PORT(S)          AGE
guestbook      LoadBalancer   172.20.177.126   aaaaa555ee87c...   3000:31710/TCP   4d
kubernetes     ClusterIP      172.20.0.1                    443/TCP          10d
redis-master   ClusterIP      172.20.242.65                 6379/TCP         4d
redis-slave    ClusterIP      172.20.163.1                  6379/TCP         4d
$ 

Use “kubectl get services -o wide” to see the entire EXTERNAL-IP. If that is saying you likely have an issue with your node iam role, or missing special kubernetes tags. So check on those. It shouldn’t show for more than a minute really.

Hope you got everything working.

Good luck and if you have questions, post them in the comments & I’ll try to help out!

Related: How to migrate my skills to the cloud?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Categories
Cloud Computing Consulting CTO/CIO Devops

I tried to build infrastructure as code Terraform and Amazon. It didn’t go as I expected.

via GIPHY

As I was building infrastructure code, I stumbled quite a few times. You hit a wall and you have to work through those confusing and frustrating moments.

Join 38,000 others and follow Sean Hull on twitter @hullsean.

Here are a few of the lessons I learned in the process of building code for AWS. It’s not easy but when you get there you can enjoy the vistas. They’re pretty amazing.

Don’t pass credentials

As you build your applications, there are moments where components need to use AWS in some way. Your webserver needs to use S3 or your ELK box needs to use CloudWatch. Maybe you want to do an RDS backup, or list EC2 instances.

However it’s not safe to pass your access_key and secret_access_key around. Those should be for your desktop only. So how best to handle this in the cloud?

IAM roles to the rescue. These are collections of privileges. The cool thing is they can be assigned at the INSTANCE LEVEL. Meaning your whole server has permissions to use said resources.

Do this by first creating a role with the privileges you want. Create a json policy document which outlines the specific rules as you see fit. Then create an instance profile for that role.

When you create your ec2 instance in Terraform, you’ll specify that instance profile. Either by ARN or if Terraform created it, by resource ID.

Related: How to avoid insane AWS bills

Keep passwords out of code

Even though we know it should not happen, sometimes it does. We need to be vigilant to stay on top of this problem. There are projects like Pivotal’s credential scan. This can be used to check your source files for passwords.

What about something like RDS? You’re going to need to specify a password in your Terraform code right? Wrong! You can define a variable with no default as follows:

variable "my_rds_pass" {
  description = "password for rds database"
}

When Terraform comes upon this variable in your code, but sees there is no “default” value, it will prompt you when you do “$ terraform apply”

Related: How best to do discovery in cloud and devops engagements?

Versioning your code

When you first start building terraform code, chances are you create a directory, and some tf files, then do your “$ terraform apply”. When you watch that infra build for the first time, it’s exciting!

After you add more components, your code gets more complex. Hopefully you’ve created a git repo to house your code. You can check & commit the files, so you have them in a safe place. But of course there’s more to the equation than this.

How do you handle multiple environments, dev, stage & production all using the same code?

That’s where modules come in. Now at the beginning you may well have a module that looks like this:

module "all-proj" {

  source = "../"

  myvar = "true"
  myregion = "us-east-1"
  myami = "ami-64300001"
}

Etc and so on. That’s the first step in the right direction, however if you change your source code, all of your environments will now be using that code. They will get it as soon as you do “$ terraform apply” for each. That’s fine, but it doesn’t scale well.

Ultimately you want to manage your code like other software projects. So as you make changes, you’ll want to tag it.

So go ahead and checkin your latest changes:

# push your latest changes
$ git push origin master
# now tag it
$ git tag -a v0.1 -m "my latest coolest infra"
# now push the tags
$ git push origin v0.1

Great now you want to modify your module slightly. As follows:

module "all-proj" {

  source = "git::https://[email protected]/hullsean/myproj-infra.git?ref=v0.1"

  myvar = "true"
  myregion = "us-east-1"
  myami = "ami-64300001"
}

Cool! Now each dev, stage and prod can reference a different version. So you are free to work on the infra without interrupting stage or prod. When you’re ready to promote that code, checkin, tag and update stage.

You could go a step further to be more agile, and have a post-commit hook that triggers the stage terraform apply. This though requires you to build solid infra tests. Checkout testinfra and terratest.

Related: Are you getting good at Terraform or wrestling with a bear?

Managing RDS backups

Amazon’s RDS service is a bit weird. I wrote in the past asking Is upgrading RDS like a shit-storm that will not end?. Yes I’ve had my grievances.

My recent discovery is even more serious! Terraform wants to build infra. And it wants to be able to later destroy that infra. In the case of databases, obviously the previous state is one you want to keep. You want that to be perpetual, beyond the infra build. Obvious, no?

Apparently not to the folks at Amazon. When you destroy an RDS instance it will destroy all the old backups you created. I have no idea why anyone would want this. Certainly not as a default behavior. What’s worse you can’t copy those backups elsewhere. Why not? They’re probably sitting in S3 anyway!

While you can take a final backup when you destroy an RDS instance, that’s wondeful and I recommend it. However that’s not enough. I highly suggest you take matters into your own hands. Build a script that calls pg_dump yourself, and copy those .sql or .dump files to S3 for safe keeping.

Related: Is zero downtime even possible on RDS?

When to use force_destroy on S3 buckets

As with RDS, when you create S3 buckets with your infra, you want to be able to cleanup later. But the trouble is that once you create a bucket, you’ll likely fill it with objects and files.

What then happens is when you go to do “$ terraform destroy” it will fail with an error. This makes sense as a default behavior. We don’t want data disappearing without our knowledge.

However you do want to be able to cleanup. So what to do? Two things.

Firstly, create a process, perhaps a lambda job or other bucket replication to regularly sync your s3 bucket to your permanent bucket archive location. Run that every fifteen minutes or as often as you need.

Then add a force_destroy line to your s3 bucket resource. Here’s an example s3 bucket for storing load balancer logs:

data "aws_elb_service_account" "main" {}

resource "aws_s3_bucket" "lb_logs" {
  count         = "${var.create-logs-bucket ? 1 : 0}"
  force_destroy = "${var.force-destroy-logs-bucket}"
  bucket        = "${var.lb-logs-bucket}"
  acl           = "private"

  policy = POLICY
{
  "Id": "Policy",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::${var.lb-logs-bucket}/*",
      "Principal": {
        "AWS": [
          "${data.aws_elb_service_account.main.arn}"
        ]
      }
    }
  ]
}
POLICY

  tags {
    Environment = "${var.environment_name}"
  }
}

NOTE: There should be “< <" above and to the left of POLICY. HTML was not having this, and I couldn't resolve it quickly. Oh well.

Related: Why generalists are better at scaling the web

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters