How to setup an Amazon ECS cluster with Terraform

via GIPHY

ECS is Amazon’s Elastic Container Service. That’s greek for how you get docker containers running in the cloud. It’s sort of like Kubernetes without all the bells and whistles.

It takes a bit of getting used to, but This terraform how to, should get you moving. You need an EC2 host to run your containers on, you need a task that defines your container image & resources, and lastly a service which tells ECS which cluster to run on and registers with ALB if you have one.

Join 38,000 others and follow Sean Hull on twitter @hullsean.

For each of these sections, create files: roles.tf, instance.tf, task.tf, service.tf, alb.tf. What I would recommend is create the first file roles.tf, then do:


$ terraform init
$ terraform plan
$ terraform apply

Then move on to instance.tf and do the terraform apply. One by one, next task, then service then finally alb. This way if you encounter errors, you can troubleshoot minimally, rather than digging through five files for the culprit.

This howto also requires a vpc. Terraform has a very good community vpc which will get you going in no time.

I recommend deploying in the public subnets for your first run, to avoid complexity of jump box, and private IPs for ecs instance etc.

Good luck!

May the terraform force be with you!

First setup roles

Roles are a really brilliant part of the aws stack. Inside of IAM or identity access and management, you can create roles. These are collections of privileges. I’m allowed to use this S3 bucket, but not others. I can use EC2, but not Athena. And so forth. There are some special policies already created just for ECS and you’ll need roles to use them.

These roles will be applied at the instance level, so your ecs host doesn’t have to pass credentials around. Clean. Secure. Smart!


resource "aws_iam_role" "ecs-instance-role" {
name = "ecs-instance-role"
path = "/"
assume_role_policy = "${data.aws_iam_policy_document.ecs-instance-policy.json}"
}

data "aws_iam_policy_document" "ecs-instance-policy" {
statement {
actions = ["sts:AssumeRole"]

principals {
type = "Service"
identifiers = ["ec2.amazonaws.com"]
}
}
}

resource "aws_iam_role_policy_attachment" "ecs-instance-role-attachment" {
role = "${aws_iam_role.ecs-instance-role.name}"
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
}

resource "aws_iam_instance_profile" "ecs-instance-profile" {
name = "ecs-instance-profile"
path = "/"
role = "${aws_iam_role.ecs-instance-role.id}"
provisioner "local-exec" {
command = "sleep 60"
}
}

resource "aws_iam_role" "ecs-service-role" {
name = "ecs-service-role"
path = "/"
assume_role_policy = "${data.aws_iam_policy_document.ecs-service-policy.json}"
}

resource "aws_iam_role_policy_attachment" "ecs-service-role-attachment" {
role = "${aws_iam_role.ecs-service-role.name}"
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceRole"
}

data "aws_iam_policy_document" "ecs-service-policy" {
statement {
actions = ["sts:AssumeRole"]

principals {
type = "Service"
identifiers = ["ecs.amazonaws.com"]
}
}
}

Related: 30 questions to ask a serverless fanboy

Setup your ecs host instance

Next you need EC2 instances on which to run your docker containers. Turns out AWS has already built AMIs just for this purpose. They call them ECS Optimized Images. There is one unique AMI id for each region. So be sure you’re using the right one for your setup.

The other thing that your instance needs to do is echo the cluster name to /etc/ecs/ecs.config. You can see us doing that in the user_data script section.

Lastly we’re configuring our instance inside of an auto-scaling group. That’s so we can easily add more instances dynamically to scale up or down as necessary.


#
# the ECS optimized AMI's change by region. You can lookup the AMI here:
# https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-optimized_AMI.html
#
# us-east-1 ami-aff65ad2
# us-east-2 ami-64300001
# us-west-1 ami-69677709
# us-west-2 ami-40ddb938
#

#
# need to add security group config
# so that we can ssh into an ecs host from bastion box
#

resource "aws_launch_configuration" "ecs-launch-configuration" {
name = "ecs-launch-configuration"
image_id = "ami-aff65ad2"
instance_type = "t2.medium"
iam_instance_profile = "${aws_iam_instance_profile.ecs-instance-profile.id}"

root_block_device {
volume_type = "standard"
volume_size = 100
delete_on_termination = true
}

lifecycle {
create_before_destroy = true
}

associate_public_ip_address = "false"
key_name = "testone"

#
# register the cluster name with ecs-agent which will in turn coord
# with the AWS api about the cluster
#
user_data = <> /etc/ecs/ecs.config
EOF
}

#
# need an ASG so we can easily add more ecs host nodes as necessary
#
resource "aws_autoscaling_group" "ecs-autoscaling-group" {
name = "ecs-autoscaling-group"
max_size = "2"
min_size = "1"
desired_capacity = "1"

# vpc_zone_identifier = ["subnet-41395d29"]
vpc_zone_identifier = ["${module.new-vpc.private_subnets}"]
launch_configuration = "${aws_launch_configuration.ecs-launch-configuration.name}"
health_check_type = "ELB"

tag {
key = "Name"
value = "ECS-myecscluster"
propagate_at_launch = true
}
}

resource "aws_ecs_cluster" "test-ecs-cluster" {
name = "myecscluster"
}

Related: Is there a serious skills shortage in the devops space?

Setup your task definition

The third thing you need is a task. This one will spinup a generic nginx container. It’s a nice way to demonstrate things. For your real world usage, you’ll replace the image line with a docker image that you’ve pushed to ECR. I’ll leave that as an exercise. Once you have the cluster working, you should get the hang of things.

Note the portmappings, memory and CPU. All things you might expect to see in a docker-compose.yml file. So these tasks should look somewhat familiar.


data "aws_ecs_task_definition" "test" {
task_definition = "${aws_ecs_task_definition.test.family}"
depends_on = ["aws_ecs_task_definition.test"]
}

resource "aws_ecs_task_definition" "test" {
family = "test-family"

container_definitions = <

Related: Is AWS too complex for small dev teams?

Setup your service definition

The fourth thing you need to do is setup a service. The task above is a manifest, describing your containers needs. It is now registered, but nothing is running.

When you apply the service your container will startup. What I like to do is, ssh into the ecs host box. Get comfortable. Then issue $ watch "docker ps". This will repeatedly run "docker ps" every two seconds. Once you have that running, do your terraform apply for this service piece.

As you watch, you'll see ECS start your container, and it will suddenly appear in your watch terminal. It will first show "starting". Once it is started, it should say "healthy".


resource "aws_ecs_service" "test-ecs-service" {
name = "test-vz-service"
cluster = "${aws_ecs_cluster.test-ecs-cluster.id}"
task_definition = "${aws_ecs_task_definition.test.family}:${max("${aws_ecs_task_definition.test.revision}", "${data.aws_ecs_task_definition.test.revision}")}"
desired_count = 1
iam_role = "${aws_iam_role.ecs-service-role.name}"

load_balancer {
target_group_arn = "${aws_alb_target_group.test.id}"
container_name = "nginx"
container_port = "80"
}

depends_on = [
# "aws_iam_role_policy.ecs-service",
"aws_alb_listener.front_end",
]
}

Related: Does AWS have a dirty secret?

Setup your application load balancer

The above will all work by itself. However for a real-world use case, you'll want to have an ALB. This one has only a simple HTTP port 80 listener. These are much simpler than setting up 443 for SSL, so baby steps first.

Once you have the ALB going, new containers will register with the target group, to let the alb know about them. In "docker ps" you'll notice they are running on a lot of high numbered ports. These are the hostPorts which are dynamically assigned. The container ports are all 80.


#
#
resource "aws_alb_target_group" "test" {
name = "my-alb-group"
port = 80
protocol = "HTTP"
vpc_id = "${module.new-vpc.vpc_id}"
}

resource "aws_alb" "main" {
name = "my-alb-ecs"
subnets = ["${module.new-vpc.public_subnets}"]
security_groups = ["${module.new-vpc.default_security_group_id}"]
}

resource "aws_alb_listener" "front_end" {
load_balancer_arn = "${aws_alb.main.id}"
port = "80"
protocol = "HTTP"

default_action {
target_group_arn = "${aws_alb_target_group.test.id}"
type = "forward"
}
}

You will also want to add a domain name, so that as your infra changes, and if you rebuild your ALB, the name of your application doesn't vary. Route53 will adjust as terraform changes are applied. Pretty cool.


resource "aws_route53_record" "myapp" {
zone_id = "${aws_route53_zone.primary.zone_id}"
name = "myapp.mydomain.com"
type = "CNAME"
ttl = "60"
records = ["${aws_alb.main.dns_name}"]

depends_on = ["aws_alb.main"]
}

Related: How to deploy on EC2 with vagrant

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don't work with recruiters

How do we test performance in a microservices world?

via GIPHY

I recently ran across this interesting question on a technology forum.

“I’m an engineering team lead at a startup in NYC. Our app is written in Ruby on Rails and hosted on Heroku. We use metrics such as the built-in metrics on Heroku, as well as New Relic for performance monitoring. This summer, we’re expecting a large influx of traffic from a new partnership and would like to have confidence that our system can handle the load.”

“I’ve tried to wrap my head around different types of performance/load testing tools like JMeter, Blazemeter, and others. Additionally, I’ve experimented with scripts which have grown more complex and I’m following rabbit holes of functionality within JMeter (such as loading a CSV file for dynamic user login, and using response data in subsequent requests, etc.). Ultimately, I feel this might be best left to consultants or experts who could be far more experienced and also provide our organization an opportunity to learn from them on key concepts and best practices.”

Join 38,000 others and follow Sean Hull on twitter @hullsean.

Here’s my point by point response.

I’ve been doing performance tuning since the old dot-com days.

It used to be you point a loadrunner type tool at your webpage and let it run. Then watch the load, memory & disk on your webserver or database. Before long you’d find some bottlenecks. Shortage of resources (memory, cpu, disk I/O) or slow queries were often the culprit. Optimizing queries, and ripping out those pesky ORMs usually did the trick.

Related: Why generalists are better at scaling the web

Today things are quite a bit more complicated. Yes jmeter & blazemeter are great tools. You might also get newrelic installed on your web nodes. This will give you instrumentation on where your app spends time. However it may still not be easy. With microservices, you have the docker container & orchestration layer to consider. In the AWS environment you can have bottlenecks on disk I/O where provisioned IOPS can help. But instance size also impacts network interfaces in the weird world of multi-tenant. So there’s that too!

Related: 5 things toxic to scalability

What’s more a lot of frameworks are starting to steer back towards ORMs again. Sadly this is not a good trend. On the flip side if you’re using RDS, your default MySQL or postgres settings may be decent. And newer versions of MySQL are getting some damn fancy & performant indexes. So there’s lots of improvement there.

Related: Anatomy of a performance review

There is also the question of simulating real users. What is a real user? What is an ACTIVE user? These are questions that may seem obvious, although I’ve worked at firms where engineering, product, sales & biz-dev all had different answers. But lets say you’ve answered that. Does are load test simply login the user? Or do they use a popular section of the site? Or how about an unpopular section of the site? Often we are guessing what “real world” users do and how they use our app.

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

What makes a highly valued docker expert?

via GIPHY

What exactly do we need to know about to manage docker effectively? What are the main pain points?

Join 38,000 others and follow Sean Hull on twitter @hullsean.

The basics aren’t tough. You need to know the anatomy of a Dockerfile, and how to setup a docker-compose.yml to ease the headache of docker run. You also should know how to manage docker images, and us docker ps to find out what’s currently running. And get an interactive shell (docker exec -it imageid). You’ll also make friends with inspect. But what else?

1. Manage image bloat

Docker images can get quite large. Even as you try to pair them down they can grow. Why is this?

Turns out the architecture of docker means as you add more stuff, it creates more “layers”. So even as you delete files, the lower or earlier layers still contain your files.

One option, during a package install you can do this:

RUN apt-get update && apt-get install -y mypkg && rm -rf /var/lib/apt/lists/*

This will immediately cleanup the crap that apt-get built from, without it ever becoming permanent in that layer. Cool! As long as you use “&&” it is part of that same RUN command, and thus part of that same layer.

Another option is you can flatten a big image. Something like this should work:

$ docker export 0453814a47b3 | docker import – newimage

Related: 30 questions to ask a serverless fanboy

2. Orchestrate

Running docker containers on dev is great, and it can be a fast and easy way to get things running. Plus it can work across dev environments well, so it solves a lot of problems.

But what about when you want to get those containers up into the cloud? That’s where orchestration comes in. At the moment you can use docker’s own swarm or choose fleet or mesos.

But the biggest players seem to be kubernetes & ECS. The former of course is what all the cool kids in town are using, and couple it with Helm package manager, it becomes very manageable system. Get your pods, services, volumes, replicasets & deployments ready to go!

On the other hand Amazon is pushing ahead with it’s Elastic Container Service, which is native to AWS, and not open source. It works well, allowing you to apply a json manifest to create a task. Then just as with kubernetes you create a “service” to run one or more copies of that. Think of the task as a docker-compose file. It’s in json, but it basically specifies the same types of things. Entrypoint, ports, base image, environment etc.

For those wanting to go multi-cloud, kubernetes certainly has an appeal. But amazon is on the attack. They have announced a service to further ease container deployments. Dubbed Amazon Fargate. Remember how Lambda allowed you to just deploy your *code* into the cloud, and let amazon worry about the rest? Imaging you can do that with containers, and that’s what Fargate is.

Check out what Krish has to say – Why Kubernetes should be scared of AWS

Related: What’s the luckiest thing that’s happened in your career?

3. Registries & Deployment

There are a few different options for where to store those docker images.

One choice is dockerhub. It’s not feature rich, but it does the job. There is also Quay.io. Alternatively you can run your own registry. It’s as easy as:

$ docker run -d -p 5000:5000 registry:2

Of course if you’re running your own registry, now you need to manage that, and think about it’s uptime, and dependability to your deployment pipeline.

If you’re using ECS, you’ll be able to use ECR which is a private docker registry that comes with your AWS account. I think you can use this, even if you’re not on ECS. The login process is a little weird.

Once you have those pieces in place, you can do some fun things. Your jenkins deploy pipeline can use docker containers for testing, to spinup a copy of your app just to run some unittests, or it can build your images, and push them to your registry, for later use in ECS tasks or Kubernetes manifests. Awesome sauce!

Related: Is Amazon Web Services too complex for small dev teams?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

6 Devops interview questions

via GIPHY

Devops is in serious demand these days. At every meetup or tech event I attend, I hear a recruiter or startup founder talking about it. It seems everyone wants to see benefits of talented operations brought to their business.

Join 37,000 others and follow Sean Hull on twitter @hullsean.

That said the skill set is very broad, which explains why there aren’t more devs picking up the batton.





I thought it would be helpful to put together a list of interview questions. There are certainly others, but here’s what I came up with.

1. Explain the gitflow release process

As a devops engineer you should have a good foundation about software delivery. With that you should understand git very well, especially the standard workflow.

Although there are other methods to manage code, one solid & proven method is gitflow. In a nutshell you have two main branches, development & master. Developers checkout a new branch to add a feature, and push it back to development branch. Your stage server can be built automatically off of this branch.

Periodically you will want to release a new version of the software. For this you merge development to master. UAT is then built automatically off of the master branch. When acceptance testing is done, you deploy off of master to production. Hence the saying always ship trunk.

Bonus points if you know that hotfixes are done directly off the master branch & pushed straight out that way.

Related: 8 questions to ask an AWS expert

2. How do you provision resources?

There are a lot of tools in the devops toolbox these days. One that is great at provisioning resources is Terraform. With it you can specify in declarative code everything your application will need to run in the cloud. From IAM users, roles & groups, dynamodb tables, rds instances, VPCs & subnets, security groups, ec2 instances, ebs volumes, S3 buckets and more.

You may also choose to use CloudFormation of course, but in my experience terraform is more polished. What’s more it supports multi-cloud. Want to deploy in GCP or Azure, just port your templates & you’re up and running in no time.

It takes some time to get used to the new workflow of building things in terraform rather than at the AWS cli or dashboard, but once you do you’ll see benefits right away. You gain all the advantages of versioning code we see with other software development. Want to rollback, no problem. Want to do unit tests against your infrastructure? You can do that too!

Related: Does a 4-letter-word divide dev & ops?

3. How do you configure servers?

The four big choices for configuration management these days are Ansible, Salt, Chef & Puppet. For my money Ansible has some nice advantages.

First it doesn’t require an agent. As long as you have SSH access to your box, you can manage it with Ansible. Plus your existing shell scripts are pretty easy to port to playbooks. Ansible also does not require a server to house your playbooks. Simply keep them in your git repository, and checkout to your desktop. Then run ansible-playbook on the yaml file. Voila, server configuration!

Related: How to hire a developer that doesn’t suck

4. What does testing enable?

Unit testing & integration testing are super import parts of continuous integration. As you automate your tests, you formalize how your site & code should behave. That way when you automate the deployment, you can also automate the test process. Let the software do the drudgework of making sure a new feature hasn’t broken anything on the site.

As you automate more tests, you accelerate the software development process, because you’re doing less and less manually. That means being more agile, and makes the business more nimble.

Related: Is AWS too complex for small dev teams?

5. Explain a use case for Docker

Docker a low overhead way to run virtual machines on your local box or in the cloud. Although they’re not strictly distinct machines, nor do they need to boot an OS, they give you many of those benefits.

Docker can encapsulate legacy applications, allowing you to deploy them to servers that might not otherwise be easy to setup with older packages & software versions.

Docker can be used to build test boxes, during your deploy process to facilitate continuous integration testing.

Docker can be used to provision boxes in the cloud, and with swarm you can orchestrate clusters too. Pretty cool!

Related: Will Microservices just die already?

6. How is communicating relevant to Devops

Since devops brings a new process of continuous delivery to the organization, it involves some risk. Actually doing things the old way involves more risk in the long term, because things can and will break. With automation, you can recovery quicker from failure.

But this new world, requires a leap of faith. It’s not right for every organization or in every case, and you’ll likely strike a balance from what the devops holy book says, and what your org can tolerate. However inevitably communication becomes very important as you advocate for new ways of doing things.

Related: How do I migrate my skills to the cloud?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Very easy cloudformation template comparison with simple terraform for beginners

via GIPHY

If you search a bit on google, you’ll find lots of sample templates for both of these systems. However I found they had a lot of complexity.

When you’re just starting, you want a very simple example. So I thought I’d put one together.

Join 38,000 others and follow Sean Hull on twitter @hullsean.

I’m going to compare both terraform & cloudformation. They get you to the same endpoint, but do it slightly differently.

Very basic terraform template

Ok, you’ve got terraform installed right? If not there are howtos here.

Now let’s create a server.

Create a directory “terraform” then cd into it. Edit this file as main.tf

provider "aws" {
    region = "us-east-1"
}
resource "aws_instance" "example" {
    ami = "ami-40d28157"
    subnet_id = "subnet-111ddaaa"
    instance_type = "t2.micro"
    key_name = "seanKey"
}

Please change the subnet to a valid one for you. In the real world you would definitely *not* hardcode a subnet like this. But I wanted to keep this example very simple. Don’t know what subnet to use? Navigate your aws dashboard over to “VPC” and dig around.

Also of course edit for your key.

Ok, you’re ready to test. Let’s first ask terraform what it will do with the “plan” command:

levanter:terraform sean$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but
will not be persisted to local or remote state storage.


The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed. Cyan entries are data sources to be read.

Note: You didn't specify an "-out" parameter to save this plan, so when
"apply" is called, Terraform can't guarantee this is what will execute.

+ aws_instance.example
    ami:                      "ami-40d28157"
    availability_zone:        ""
    ebs_block_device.#:       ""
    ephemeral_block_device.#: ""
    instance_state:           ""
    instance_type:            "t2.micro"
    key_name:                 "seanKey"
    network_interface_id:     ""
    placement_group:          ""
    private_dns:              ""
    private_ip:               ""
    public_dns:               ""
    public_ip:                ""
    root_block_device.#:      ""
    security_groups.#:        ""
    source_dest_check:        "true"
    subnet_id:                "subnet-111ddaaa"
    tenancy:                  ""
    vpc_security_group_ids.#: ""


Plan: 1 to add, 0 to change, 0 to destroy.
levanter:terraform sean$

Related: What is devops and why is it important?

Build & change with Terraform

Next you want to ask terraform to go ahead and do the work. Because above we only did a dry-run.

levanter:terraform sean$ terraform apply
aws_instance.example: Creating...
  ami:                      "" => "ami-40d28157"
  availability_zone:        "" => ""
  ebs_block_device.#:       "" => ""
  ephemeral_block_device.#: "" => ""
  instance_state:           "" => ""
  instance_type:            "" => "t2.micro"
  key_name:                 "" => "seanKey"
  network_interface_id:     "" => ""
  placement_group:          "" => ""
  private_dns:              "" => ""
  private_ip:               "" => ""
  public_dns:               "" => ""
  public_ip:                "" => ""
  root_block_device.#:      "" => ""
  security_groups.#:        "" => ""
  source_dest_check:        "" => "true"
  subnet_id:                "" => "subnet-111ddaaa"
  tenancy:                  "" => ""
  vpc_security_group_ids.#: "" => ""
aws_instance.example: Still creating... (10s elapsed)
aws_instance.example: Still creating... (20s elapsed)
aws_instance.example: Creation complete

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

The state of your infrastructure has been saved to the path
below. This state is required to modify and destroy your
infrastructure, so keep it safe. To inspect the complete state
use the `terraform show` command.

State path: terraform.tfstate
levanter:terraform sean$ 

One thing I like is terraform shows us the progress at command line. Cloudformation isn’t so nicely finished. ๐Ÿ™‚

Ok, let’s add a tag name to our server. We’re going to add just three lines to our main.tf file:

provider "aws" {
    region = "us-east-1"
}

resource "aws_instance" "example" {
    ami = "ami-40d28157"
    subnet_id = "subnet-111ddaaa"
    instance_type = "t2.micro"
    tags {
        Name = "terraform-box"
    }
}

Now we do terraform apply again. Look how easy that change is to make!

levanter:terraform sean$ terraform apply
aws_instance.example: Refreshing state... (ID: i-0ddd063bbbbce56e2)
aws_instance.example: Modifying...
  tags.%:    "0" => "1"
  tags.Name: "" => "terraform-box"
aws_instance.example: Modifications complete

Apply complete! Resources: 0 added, 1 changed, 0 destroyed.

The state of your infrastructure has been saved to the path
below. This state is required to modify and destroy your
infrastructure, so keep it safe. To inspect the complete state
use the `terraform show` command.

State path: terraform.tfstate
levanter:terraform sean$ 

Navigate to the EC2 dashboard and you should see the first column showing your new name.

That was cool!

Chances are you don’t wanna leave these components sitting around. Let’s cleanup. That’s easy too!

levanter:terraform sean$ terraform destroy
Do you really want to destroy?
  Terraform will delete all your managed infrastructure.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

aws_instance.example: Refreshing state... (ID: i-0ddd063bbbbce56e2)
aws_instance.example: Destroying...
aws_instance.example: Still destroying... (10s elapsed)
aws_instance.example: Still destroying... (20s elapsed)
aws_instance.example: Still destroying... (30s elapsed)
aws_instance.example: Still destroying... (40s elapsed)
aws_instance.example: Still destroying... (50s elapsed)
aws_instance.example: Still destroying... (1m0s elapsed)
aws_instance.example: Destruction complete

Destroy complete! Resources: 1 destroyed.
levanter:terraform sean$ 

Related: Top questions to ask on a devops interview

Very basic CloudFormation template example

Hopefully you wrote down your subnet name & keyname. So this will be easy.

Let’s create a “cfn” directory and cd into it.

Next edit main.yml

AWSTemplateFormatVersion: '2010-09-09'

Resources:
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.micro
      SubnetId: subnet-333dfe6a
      KeyName: "iheavy"
      ImageId: "ami-40d28157"

Now let’s build that with cloudformation. You need to have the awscli installed. Here’s amazon’s howto.

Now let’s create. Cloudformation organizes things as “stacks.

aws cloudformation create-stack --template-body file://sean-instance.yml --stack-name cfn-test

Since I didn’t define “outputs” to keep the yaml simple, the command above should just return without error.

You can go into the aws dashboard, and navigate to “CloudFormation” and see the stack being created. You can also see under “EC2” a new instance has been created.

Related: How do I migrate my skills to the cloud?

Add an instance name with tags in Cloud Formation

As we did with terraform, let’s add a name to the server. This is just a tag, not a hostname, so it’s only useful throughout the AWS API.

AWSTemplateFormatVersion: '2010-09-09'

Resources:
  EC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.micro
      SubnetId: subnet-333dfe6a
      KeyName: "iheavy"
      ImageId: "ami-40d28157"
      Tags:
        - Key: "Name"
          Value: "cfn-box"

Note the three new lines at the bottom. Ok, let’s apply those changes:

levanter:cfn sean$ aws cloudformation update-stack --template-body file://sean-instance.yml --stack-name cfn-test

Navigate to the EC2 dashboard and you should see the first column showing your new name.

Time to cleanup. Let’s delete that stack:

levanter:cfn sean$ aws cloudformation delete-stack --stack-name cfn-test12
levanter:cfn sean$ 

Related: Is upgrading Amazon RDS like a sh*t storm that will not end?

Conclusions

Terraform just supports JSON or it’s HCL (hashicorp configuration language). Actually the latter way of formatting is better supported.

On the CloudFormation side you can use yaml or json.

However CloudFormation can be clunky and frustrating to work with. For example to dry-run in terraform is easy. Just use “plan”. And isn’t something we’re going to do over and over?

In CloudFormation there is a “validate-template” option, but this just checks your JSON or YAML. It doesn’t hit amazon’s API or test things in any real way. They have added something called Change Sets, but I haven’t tried them too much yet.

Also CloudFormations error messages are really lacking. They often give you a syntax error or tell you a resource is incomplete without real details on where or how. It makes debugging slow and tedious. Sometimes I see errors at create-stack calls. Other times that succeeds only to find errors within the CloudFormation dashboard.

Terraform is wayyyyy better.

Related: Is Amazon Web Services too complex for small dev teams?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

How is automation impacting the dba role?

via GIPHY

I was at a dinner party recently, and talking with some colleagues. I had worked with them years back on Oracle systems.

One colleague Maria said she really enjoyed my newsletter.

Join 38,000 others and follow Sean Hull on twitter @hullsean.

She went on to say how much has changed in the last decade. We talked about how the database administrator, as a career role, wasn’t really being hired for much these days. Things had changed. Evolved a lot.

How do you keep up with all the new technology, she asked?

I went on to talk about Amazon RDS, EC2, lambda & serverless as really exciting stuff. And lets not forget terraform (I wrote a howto on terraform), ansible, jenkins and all the other deployment automation technologies.





We talked about Redshift too. It seems to be everywhere these days and starting to supplant hadoop as the warehouse of choice for analytics.

It was a great conversation, and afterward I decided to summarize my thoughts. Here’s how I think automation and the cloud are impacting the dba role.

My career pivots

Over the years I’ve poured all those computer science algorithms, coding & hardware skills into a lot of areas. Tools & popular language change. Frameworks change. But solid deductive reasoning remains priceless.

o C++ Developer

Fresh out of college I was doing Object Oriented Programming on the Macintosh with Codewarrior & powerplant. C++ development is no joke, and daily coding builds strength in a lot of areas. Turns out he application was a database application, so I was already getting my feet wet with databases.

o Jack of all trades developer & Unix admin

One type of job role that I highly recommend early on is as a generalist. At a small startup with less than ten employees, you become the primary technology solutions architect. So any projects that come along you get your hands dirty with. I was able to land one of these roles. I got to work on Windows one day, Mac programming another & Unix administration & Oracle yet another day.

o Oracle DBA

The third pivot was to work primarily on Oracle. I attended Oracle conferences & my peers were Oracle admins. Interestingly, many of the Oracle “experts” came from more of a business background, not computer science. So to have a more technical foundation really made you stand out.

For the startups I worked with, I was a performance guru, scalability expert. Managers may know they have Oracle in the mix, but ultimately the end goal is to speed up the website & make the business run. The technical nuts & bolts of Oracle DBA were almost incidental.

o MySQL & Postgres

As Linux matured, so did a lot of other open source projects. In particular the two big Open Source databases, MySQL & Postgres became viable.

Suddenly startups were willing to put their businesses on these technologies. They could avoid huge fees in Oracle licenses. Still there were not a lot of career database experts around, so this proved a good niche to focus on.

o RDS & Redshift on Amazon Cloud

Fast forward a few more years and it’s my fifth career pivot. Amazon Web Services bursts on the scene. Every startup is deploying their applications in the cloud. And they’re using Amazon RDS their managed database service to do it. That meant the traditional DBA role was less crucial. Sure the business still needed data expertise, but usually not as a dedicated role.

Time to shift gears and pour all of that Linux & server building experience into cloud deployments & migrating to the cloud.

o Devops, data, scalability & performance

Now of course the big sysadmin type role is usually called an SRE or Devops role. SRE being site reliability engineer. New name but many of the same responsibilities.

Now though infrastructure as code becomes front & center. Tools like CloudFormation & Terraform, plus Ansible, Chef & Jenkins are all quite mature, and being used everywhere.

Checkout your infrastructure code from git, and run terraform apply. And minutes later you have rebuilt your entire stack from bare metal to fully functioning & autoscaling application. Cool!

Related: 30 questions to ask a serverless fanboy

How I’ve steered DBA skills

There’s no doubt that data expertise & management skills are still huge. But the career role of database administrator has evolved quite a bit.

Related: 5 surprising features of Amazon Lambda serverless computing

Pros of automation & managing databases

For DBAs who are looking at the cloud from the old way of doing things, there’s a lot to love about it.

Automation brings repeatability to work & jobs. This is great. It raises the bar & makes us more professional, reducing manual processes & mistakes.

Infrastructure as code is self documenting. It means we have a better idea of day-to-day processes, and can more easily handoff to new folks as we change roles or companies.

Related: Why generalists are better at scaling the web

Cons of automation & databases

However these days cloud, automation & microservices have brought a lot of madness too! Don’t believe me check out this piece on microservice madness.

With microservices you have more databases across the enterprise, on more platforms. How do you restore all at the same time? How do you do point-in-time recovery? What if your managed service goes down?

Migration scripts have become popular to make DDL changes in the database. Going forward (adding columns or tables) is great. But should we be letting our deployment automation roll *BACK* DDL changes? Remember that deletes data right? ๐Ÿ™‚

What about database drop & rebuild? Or throwing databases in a docker container? No bueno. But we’re seeing this more and more. New performance problems are cropping up because of that.

What about when your database upgrades automatically? Remember when you use a managed service, it is build for 1000 users, not one. So if your use case is different you may struggle.

In my experience upgrading RDS was a nightmare. Database as a service upgrades lack visibility. You don’t have OS or SSH access so you can’t keep track of things. You just simply wait.

No longer do we have “zero downtime”. With amazon RDS you have guarenteed downtime upgrades. No seriously.

As the field of databases fragments, we are wearing many more hats. If you like this challenge & enjoy being a generalist, you may feel at home here. But it is a long way from one platform one skill set career path.

Also fragmented db platforms means more complex recovery. I can’t stress this enough. It would become practically impossible to restore all microservices, all their underlying databases & all systems to one single point in time, if you need to.

Related: Is upgrading Amazon RDS like a sh*t storm that will not end?

DBAs, it’s time to step up and pivot

As the DBA role evolves, it also brings great opportunity. For those with solid database & data skills are sorely in need at startups and many fortune 500 organizations.

What I’m seeing is that organizations have lost much of the discipline they had as separate dba or operations departments. Schemaless databases have proliferated, and performance has suffered.

All these are more complex now, but strong DBA, performance & troubleshooting skills are needed now more than ever.

Related: The art of resistance in tech consulting

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

How do I migrate my skills to the cloud?

via GIPHY


Hi, I’m currently an IT professional and I’m training for AWS Solutions Architect – Associate exam. My question is how to gain some valuable hands-on experience without quitting my well-paying consulting gig I currently have which is not cloud based. I was thinking, perhaps I could do some cloud work part time after I get certified.

Join 38,000 others and follow Sean Hull on twitter @hullsean.


I work in the public sector and the IT contract prohibits the agency from engaging any cloud solutions until the current contract expires in 2019. But I can’t just sit there without using these new skills – I’ll lose it. And if I jump ship I’ll loose $$$ because I don’t have the cloud experience.


Hi George,

Here’s what I’d suggest:

1. Setup your AWS account

A. open aws account, secure with 2FA & create IAM roles

First things first, if you don’t already have one, go signup. Takes 5 minutes & a credit card.

From there be sure to enable two factor authentication. Then stop using your root account! Create a new IAM user with permissions to command line & API. Then use that to authenticate. You’ll be using the awscli python package.

Also: Is Amazon too big to fail?

2. Automatic deployments

B. plugin a github project
C. setup CI & deployment
D. get comfy with Ansible

Got a pet project on github? If not it’s time to start one. ๐Ÿ™‚

You can also alternatively use Amazon’s own CodeCommit which is a drop-in replacement for github and works fine too. Get your code in there.

Next setup codedeploy so that you can deploy that application to your EC2 instance with one command.

But you’re not done yet. Now automate the spinup of the EC2 instance itself with Ansible. If you’re comfortable with shell scripts, or other operational tools, the learning curve should be pretty easy for you.

Read: Is AWS too complex for small dev teams? The growing demand for Cloud SRE

3. Clusters

E. play around with kubernetes or docker swarm

Both of these technologies allow you to spinup & control a fleet of containers that are running on a fixed set of EC2 instances. You may also use Amazon ECS which is a similar type of offering.

Related: How to deploy on EC2 with Vagrant

4. Version your infrastructure

F. use terraform or cloudformation to manage your aws objects
G. put your terraform code into version control
H. test rollback & roll foward infrastructure changes

Amazon provides CloudFormation as it’s foundational templating system. You can use JSON or YAML. Basically you can describe every object in your account, from IAM users, to VPCs, RDS instances to EC2, lambda code & on & on all inside of a template file written in JSON.

Terraform is a sort of cloud-agnostic version of the same thing. It’s also more feature rich & has got a huge following. All reasons to consider it.

Once you’ve got all your objects in templates, you can checkin these files into your git or CodeCommit repository. Then updating infrastructure is like updating any other pieces of code. Now you’re self-documenting, and you can roll-forward & backward if you make a mistake!

Related: How I use terraform & composer to automate wordpress on AWS

5. Learn serverless

I. get familiar with lambda & use serverless framework

Building applications & deploying only code is the newest paradigm shift happening in cloud computing. On Amazon you have Lambda, on Google you have Cloud Functions.

Related: 30 questions to ask a serverless fanboy

6. Bonus: database skills

J. Learn RDS – MySQL, Postgres, Aurora, Oracle, SQLServer etc

For a bonus page on your resume, dig into Amazon Relational Database Service or RDS. The platform supports various databases, so try out the ones you know already first. You’ll find that there are a few surprises. I wrote Is upgrading RDS like a sh*t storm that will not end?. That was after a very frustrating weekend upgrading a customers production RDS instance. ๐Ÿ™‚

Related: Is Amazon about to disrupt your data warehouse?

7. Bonus: Data warehousing

K. Redshift, Spectrum, Glue, Quicksight etc

If you’re interested in the data side of the house, there is a *LOT* happening at AWS. From their spectrum technology which allows you to keep most of your data in S3 and still query it, to Glue which provides an ETL as a service offering.

You can also use a world-class columnar storage database called Redshift. This is purpose built for reporting & batch jobs. It’s not going to meet your transactional web-backend needs, but it will bring up those Tableau reports blazingly fast!

Related: Is Amazon about to disrupt your data warehouse?

8. Now go find that cloud deployment job!


With the above under your belt there’s plenty of work for you. There is tons of demand right now for this stuff.

Did you do learn all that? You’ve now got very very in-demand skills. The recruiters will be chomping at the bit. Update those buzzwords (I mean keywords). This will help match you with folks looking for someone just like you!

Related: Why I don’t work with recruiters

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Why would I help a customer that’s not paying? The reason might surprise you

via GIPHY

I just received an email. It was from a woman building a website, and wanted help with AWS. She wondered if I might be able to provide any assistance.

Join 37,000 others and follow Sean Hull on twitter @hullsean.

Having a popular publicly facing blog, I get a lot of leads that seem to come out of thin air. This is the good problem of publicity. ๐Ÿ™‚

I followed up with her and asked what she was building. “Nothing” she explained, I just want to learn about AWS. I was a little confused at first, but as we talked further, it seemed she was just beginning to branch out onto the wild world of the internet, and didn’t know where to start.

I explained that to build an e-commerce site, she could use a service like Shopify, and would likely not need to use AWS directly, and certainly wouldn’t have to learn it. That might take five to ten years learning computing first!

I realized I was telling her she didn’t need the services of someone like me, and further giving her half of a solution. Though I couldn’t help her build a product, the information could surely help her sell it.

Then I thought to myself, why would I do that? Why give away your time & advice for free?

1. Find time to followup

LESSON: a quick call is always worthwhile networking

Yep it’s true, I’ve learned over the years it’s always worth your time for a quick call. I even talk to recruiters on occasion though I don’t work with them.

You’d be surprised how often you learn from someone, especially when they don’t work in your domain. You learn from the way they frame questions, how others might view or search for you. You learn how better to explain & sell your services to future customers too.

Also: When clients don’t pay

2. Be helpful

LESSON: Provide some real help or value

In a call like this one, it costs me very little to “drop some knowledge” as the cool kids like to say. ๐Ÿ™‚ Sure my time is worth something, and yes I’m giving something away for free. But in this case it was someone who currently doesn’t have the budget for my services so isn’t my target audience anyway.

Read: When you have to take the fall

3. Pay forward

LESSON: Always be networking

Be patient. As Keith Ferrazzi likes to say “Never Eat Alone”! I’ve taken hundreds of calls like this one over the years, and some later get funded & call me back. They’re eager to put me to work, already sold on my integrity & personality.

What’s more she may run in different circles than I do, bump into a colleague or recommend me at some point. If your openness really stands out, it’ll leave a memorable impression long into the future.

In a place like New York where we’re often singularly focused on profit & personal gain, it’s easy to stand out by a small act of kindness.

Related: A look at the serverless hype cycle

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Is maintenance as sexy as innovation?

via GIPHY

A recent NYT piece on our aging american infrastructure got me thinking. It seems that roads, bridges, airports & city sewer systems are all in need of repair. Sadly as budgets to maintain these systems in good repair are often short, they become larger problems to fix as their status becomes critical.

Join 37,000 others and follow Sean Hull on twitter @hullsean.

“Americans have an impoverished and immature conception of technology, one that fetishizes innovation as a kind of art and demeans upkeep as a mere drudgery.”

I’m not sure this is an American-only phenomenon. However I do see it a lot with technology companies & startups.

1. Do we have to manage ops in the cloud?

The cloud has enabled infrastructure automation in some pretty phenomenal ways. Code pipelines can deliver changes to a repo, through automated unit testing, and out to customers all without human intervention. This makes teams more agile, and ultimately businesses faster & more profitable.

We might be distracted enough to stop worrying about operations altogether. After all Amazon knows how to manage broken servers & alert us right? I write do we have to manage operations in the cloud previously, as this sentiment seems to be growing.

Modern applications have a ton of interdependencies. Even with decent integration testing, the full stack is complex, and requires monitoring. Co-tenancy can complicate your performance tuning efforts as neighboring customers may directly affect your application. Third party services may be delivered from smaller or less experienced companies, whose SLA may be limiting besides. And hey if Amazon goes down, I can just tell my customers it was their fault, right?

Also: Is Amazon too big to fail?

2. Do you know Dustin Moskovitz?

Chances are I’m guessing you’ll say no. He was part of the original Facebook team alongside Zuckerberg. You don’t know his name? He had the sexy job of, you guessed it maintenance! He was the operations guy. Did he write the application code? More than likely he knew that code very well as he had to fix & maintain it. Along with the infrastructure to scale & support Facebook’s massive growth.

Read: Is AWS too complex for small dev teams? The growing demand for Cloud SRE

3. Is a little technical debt ok?

Ward Cunningham has an excellent interview about technical debt. Is a little bit ok? Maybe. But each amount is kicking the can down the road. As the NYT article on maintenance makes clear, you can move the responsibility on to the next administration, the next term, or someone else, but eventually you’ll have a critical problem on your hands, which will be much more expensive to fix.

Related: How to build an operational datastore on Amazon Redshift with S3

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

What does the fight between palantir & nypd mean for your data?

via GIPHY

In a recent buzzfeed piece, NYPD goes to the mat with Palantir over their data. It seems the NYPD has recently gotten cold feet.

Join 35,000 others and follow Sean Hull on twitter @hullsean.

As they explored options, they found an alternative that might save them a boatload of money. They considered switching to an IBM alternative called Cobalt.

And I mean this is Silicon Valley, what could go wrong?

Related: Will SQL just die already?

Who owns your data?

In the case of Palantir, they claim to be an open system. And of course this is good marketing. Essential in fact to get the contract. Promise that it’s easy to switch. Don’t dig too deep into the technical details there. According to the article, Palantir spokeperson claims:

“Palantir is an open platform. As with all our customers, their data & analysis are available to them at all times in an open & nonproprietary format.”

And that does appear to be true. What appears to be troubling NYPD isn’t that they can’t get the analysis, for that’s available to them in perpetuity. Within the Palantir system. But getting access to how the analysis is done, well now that’s the secret sauce. Palantir of course is not going to let go of that.

And that’s the devil in the details when you want to switch to a competing service.

Also: Top serverless interview questions for hiring aws lambda experts

Who owns the algorithms?

Although the NYPD can get their data into & out of the Palantir system easily, that’s just referring to the raw data. That’s the data they ingested in the first place, arrest records, license plate reads, parking tickets, stuff like that.

“This notion of how portable your data is when you engage in a contract with a platform is really, really complex, and hasnโ€™t really been tested” – Tal Klein

Palantir’s secret sauce, their intellectual property, is finding the needle in the haystack. What pieces of data are relevant & how can I present the detectives the right information at the right time.

Analysis *is* the algorithms. It’s the big data 64 million dollar question. Or in this case $3.5 million per year, as the contract is reported to be worth!

Related: Which engineering roles are in greatest demand?

The nature of software as a service

The web is bringing us great platforms, like google & amazon cloud. It’s bringing a myriad of AI solutions to our fingertips. Palantir is providing a push button solution to those in need of insights like the NYPD.

The Cobalt solution that IBM is offering goes the other way. Build it yourself, manage it, and crucially control it. And that’s the difference.

It remains to be seen how the rush to migrate the universe of computing to Amazon’s own cloud will settle out. Right now their in a growth phase, so it’s all about lowering prices. But at some point their market muscle will mean they can go the Oracle route a la Larry Ellison. That’s why customers start feeling the squeeze.

If the NYPD example is any indication, it could get ugly!

Read: Can on-demand consulting save startups time & money?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters