Does migrating to the cloud require a mindset change of your team?


We’ve all heard the success stories at firms that have grappled with automation. The dividends are legendary.

Take Amazon itself, for example. By decoupling its teams, allowing each to grow independently and at its own pace, it has been able to scale massively.

Join 38,000 others and follow Sean Hull on twitter @hullsean.

One look at the AWS dashboard these days, or their wikipedia page, reveals over 90 services on offer. And each of those is growing and expanding by the day.

I’ve worked with a lot of startups trying to get there. They’ve heard the gospel, and want to gain the benefits themselves.

Here are the challenges I’ve found.

1. Building ain’t easy

One example story was building an ELK box. ELK is Elasticsearch, Logstash and Kibana. It provides a centralized place to send all your application & service logs, and collects them together on one dashboard. It’s the business intelligence of devops & software development. A super valuable tool.

In building our solution, we took a marketplace AMI off the shelf, and then customized that. After building the terraform code to spin up the server, we added Ansible scripts to further customize. This allowed us to add a cronjob for backups, set a password, add additional logstash configs, and a few other important housekeeping tasks.
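Here’s a minimal sketch of what that terraform looked like. The AMI ID, instance type and playbook name are placeholders, not our actual values:

resource "aws_instance" "elk" {
  ami           = "ami-0123456789abcdef0"   # marketplace ELK AMI (placeholder)
  instance_type = "t2.large"
  key_name      = "${var.key_name}"

  # hand off to ansible for the housekeeping tasks after boot
  provisioner "local-exec" {
    command = "ansible-playbook -i '${self.public_ip},' elk.yml"
  }
}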

All was great until we hit a snag: we found some CloudWatch logs were not making their way into ELK. Digging through the log messages, we eventually uncovered an error, caused by a conflicting port configuration. So we removed the unused port from logstash.conf, and problem solved.

Later, we rebuilt the server, and that was pretty quick. Having all the scripts in place meant we could rebuild easily. In this case we just needed to resize the root volume by 25x to make room for future logs. This was 3 lines of terraform code and then done!
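Those three lines sit inside the aws_instance resource, and look roughly like this. The size is hypothetical:

  root_block_device {
    volume_size = 500   # 25x the original size, room for future logs
  }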

A couple of weeks later, however, we found missing logs again. Digging, digging, digging, and then we finally discovered it was a repeat of our old problem! Turns out the change to logstash.conf never got rolled into the automation scripts. It was done manually! Bad, bad!

Moral of the story: with automation, your workflow needs to change. You should *always be working on the scripts* and then reapplying those. Never work on the server directly!

Time to eat my own dogfood!

Related: Is AGILE right for fixing performance issues?

2. Troubleshooting is tough

In the automation universe, as I wrote above, you really want to avoid logging into servers and doing things manually. But that may be easier said than done.

Take another example: I had an ssh key distribution script, which I repurposed from the Terraform Community Modules. It works great, when it works. It gets injected onto the server at boot time, by terraform, inside the user-data script.

The code gets added to cron, and relies on awscli. As it turns out awscli is *not* on all of the aws linux images. Who knows why?!? But that’s where we are.

Should be easy to install. Use yum to get pip (the python package manager) installed. Then use pip to install awscli. The script even has *both* yum and apt-get commands, to attempt to install pip on either ubuntu or amzn linux. Problem is, sometimes it doesn’t work. Sometimes? You ask. Yes indeed.

Digging further, it seems that the new pip package gets installed in /usr/local/bin, while it used to install in /usr/bin. Seems simple: add a symlink. Yeah, did that. Sometimes the package has a different name, such as python-pip3. Great!
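The bootstrap logic in the user-data script looks roughly like this, symlink workaround included. A sketch, since package names vary by image:

# try yum first (amzn linux), fall back to apt-get (ubuntu)
yum install -y python-pip || apt-get install -y python-pip
pip install awscli
# newer pip drops binaries in /usr/local/bin; make sure cron jobs can find aws
[ -x /usr/local/bin/aws ] && ln -sf /usr/local/bin/aws /usr/bin/aws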

Now all this is magnified because you can’t just go on the box and walk through the steps. Why? Because in the primordial Cro-Magnon universe that is linux server boot time, things sometimes happen in weird orders, or slower. So something may be missing during that window that is available later. After boot, you see no errors.

Yes, it’s complicated. Yes, you need to build, destroy, build, destroy the server in endless cycles.

At the next level of automation, we will implement an infrastructure testing pipeline, which will automatically build the server for you. The testinfra unit testing framework seems pretty darn cool. And there is also Gruntwork’s Terratest.

Related: Is automation killing old-school operations?

3. The dividend is agility

What have I seen in terms of agility?

Well, moving our application to a new region takes 20 minutes. Crazy as that sounds, everything from the VPC, to 3 private subnets, 3 public subnets, bastion boxes, load balancers, RDS & Redis instances, security groups, ingress rules, IAM roles, users, S3 buckets, the ECS cluster, various EC2 instances, Route 53 zones & CNAMEs, plus even EIPs, can be moved with a few simple code changes. Wow!
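How? Because the region is just a parameter to the code. A minimal sketch:

variable "region" {
  default = "us-east-1"
}

provider "aws" {
  region = "${var.region}"
}

Change that one variable, reapply, and the whole stack builds in the new region.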

What else? We can resize our ELK box root volume by deploying a brand new setup, all in about ten minutes.

This kind of speed is so exciting. It brings repeatability to your engineering processes. It brings confidence to all of those components.

And best of all it allows the business to experiment with new product ideas, and accelerate in the marketplace.

And we all know what that means!

Related: I have a new appreciation for AGILE


What types of management problems plague startups?


Being an avid reader of Fred Wilson’s AVC, I’ve learned much over the years. One thing he underscores is that *ideas* are a dime a dozen, and that great investments are in team & execution.


As a long time consultant, I’ve had the good fortune to see a lot of startups under the microscope. If you work a FT role for 2 years, over a decade you may work at 5 companies. In the same amount of time, I’ve worked at over 65 companies.

In those years I’ve encountered great teams that are super organized, and continue to move the product forward. But I have also seen a number of symptoms that caused those businesses problems, and slowed their march forward.

Low morale

One firm I worked at a few years back was in the education space, specifically with a lot of microlearning products, and big customers doing corporate onboarding.

Their sales team was world class, closing bigger and bigger deals, but engineering had terrible, festering problems. As it went, they grew to hundreds of employees in a matter of a year or two. Meanwhile the CTO was not a big people person. He didn’t like speaking in front of large groups, nor was he very hands on. At a small ten-person startup he was super technical and talented, but as the company grew so fast, it left a leadership vacuum.

And then some bad hires grew the engineering team fast. But internally there was a lot of infighting. The original founding team worked hard and had strong direction, while the new hires all vied for control. And the ugly personalities reared their heads.

After a few short weeks, half the engineering team quit, in a matter of days. A tough blow to a team already struggling to keep up with growth.

It is not easy to right a large plane in mid flight like that, carrying plenty of technical debt besides.

Related: A CTO must never do this

Bad alignment

Another place I had the pleasure of working at was a well known digital media brand, one that expanded into film production, recording and even investigative reporting. For all its wide-ranging efforts, it presided over a huge growth business, with seemingly unlimited revenue. Impressive to be sure.

On the technology side, however, things were not so sunny. As their business grew, they planned to consolidate data from many disparate divisions. This is a process that many growing businesses go through: finance in one platform & database, bookings & production in another, and analytics and viewer statistics in yet a third. But how to report on all of that data?

As a special crack team of big data experts, we were assigned the task of building out this centralized repository of business truth. And as we built and architected that system, we needed to work closely with the operations division.

Now this business was using public cloud, Amazon Web Services, like many other startups. However they had a separate devops team who presided over those accounts.

As our team was handed strict deadlines to deliver working reports & systems, we had conference calls with the devops team. However, that team was not on board with those deadlines. They pushed back, claiming such systems would take months to set up.

As we explained the expectations being pushed onto our shoulders, devops said “just push back and say no”. They advised that we “send it back up the chain”.

But what if there’s a chink in the chain?

Clearly the two teams were not aligned at all on deadlines & deliverables. And that’s not the fault of either team. It falls straightaway in the lap of management to align them.

And we were somehow stuck in the middle. Ugh!

Related: How to avoid legal trouble in consulting

Loose discipline

One startup I worked at had a security and authentication app.

Here teams were fairly happy on the whole. In fact they raved about having a great boss. Indeed the boss was a very kind leader, understanding, patient, and hardworking.

However, over and over, we lacked a “decider”. Team members were giving each other tasks. Promises were made loosely, and then forgotten a week or two later. And a constant lack of direction dragged down delivery.

For my money, a promise to have a meeting at 10am is one all parties should abide by, whatever their level in the business. Don’t be late, don’t make excuses about trains, and don’t simply skip the meeting with no explanation. These types of habits cause the team to grow weary, and lower the bar of expectations.

Frustrating indeed.

Related: When you have to take the fall


Before you do infrastructure as code, consider your workflow carefully


What happens with infrastructure as code when you want to make a change on prod?

I’ve been working on automation for a few years now. When you build your cloud infrastructure with code, you really take everything to a whole new level.


You might be wondering, what does the day-to-day workflow look like? It’s not terribly different from regular software development. You create a branch, write some code, test it, and commit it. But there are some differences.

1. Make a change on production

In this scenario, I have my development branch reference the repo directly on my laptop. There is a module, but it is referenced locally, like this:

source = "../"

So here are the steps:

1. Make your change to terraform in main repo

This happens in the root source directory. Make your changes to .tf files, and save them.

2. Apply change on dev environment

You haven’t committed any changes to the git repo yet. You want to test them. Make sure there are no syntax errors, and they actually build the cloud resources you expect.

$ terraform plan
$ terraform apply

fix errors, etc.

3. Redeploy containers

If you’re using ECS, your code above may have changed a task definition, or other resources. You may need to update the service. This will force the containers to redeploy fresh with any updates from ECR etc.
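Assuming a reasonably current awscli, forcing that redeploy looks something like this. Cluster & service names are placeholders:

$ aws ecs update-service --cluster my-cluster --service my-app \
    --force-new-deployment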

4. Eyeball test

You’ll need to ssh to the ecs-host and attach to containers. There you can review env, or verify that docker ps shows your new containers are running.
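Something like this, with hypothetical host & container IDs:

$ ssh ec2-user@ecs-host-1
$ docker ps                      # are the new containers running?
$ docker exec -it abc123 env     # review the environment inside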

5. Commit changes to version control

Ok, now you’re happy with the changes on dev. Things seem to work, what next? You’ll want to commit your changes to the git repo:

$ git commit -am "added some variables to the application task definition"

6. Now tag your code – we’ll use v1.5

$ git tag -a v1.5 -m "added variables to app task definition"
$ git push origin v1.5

Be sure to push the tag (the second command above).

7. Update stage terraform module to use v1.5

In your stage main.tf where your stage module definition is, change the source line:

source = "git::https://github.com/hullsean/infra-repo.git?ref=v1.5"

8. Apply changes to stage

$ terraform init
$ terraform plan
$ terraform apply

Note that you have to do terraform init this time. That’s because you are using a new version of your code, so terraform has to go fetch the whole thing, and cache it in the .terraform directory.

9. Apply changes on prod

Redo steps 7 & 8 for your prod-module main.tf.

Related: When you have to take the fall

2. Pros of infrastructure code

o very professional pipeline
o pipeline can be further automated with tests
o very safe changes on prod
o infra changes managed carefully in version control
o you can back out changes, or see how you got here
o you can audit what has happened historically

Related: When clients don’t pay

3. Cons of over automating

o no easy way to sidestep
o manual changes will break everything
o you have to have a strong knowledge of Terraform
o you need a strong in-depth knowledge of AWS
o the whole team has to be on-board with automation
o you can’t just go in and tweak things

Related: Why I ask for a deposit


I have a new appreciation for Agile and not because it worked


A couple of years ago I worked at a startup in the online publishing space. But this story isn’t about online publishing. This story is about deadlines, and not meeting them.

For those who don’t know me, I’m a professional services consultant. I’ve worked on a project basis for 90% of my two-plus decades of professional career. I mention this to give my opinions and perspectives some context. Although I’m not a manager, I’ve worked under more managers than most. Because I work on 5-10 projects per year, that comes to close to two hundred in my career.


My career path has given me a unique perspective on teams, leadership and motivation. Of course, like anyone who’s worked in tech, I’ve seen a lot of agile.

Daily standups are de rigueur, of course. As is breaking up your work into two-week sprints, and assigning story points to those tasks.

I have to admit, there were times when Agile seemed like the most fashionable way the teams I saw were working. And I guess I believed it was more trendy than functional.

Doesn’t everybody already work off tight todo lists, break tasks down into manageable pieces, and always deliver what they promised? It may be because I’ve spent a lot of time managing myself, and surviving as a freelancer that I’ve picked up some of these habits. But I digress…

1. Crunch time

While we as a team had been working on two-week sprints, something happened to derail us. Suddenly a major customer was in jeopardy, because we had not delivered on sales promises.

Although we *could* build what was being promised, we were stumbling over the details.

As an emergency plan, we dropped our current sprint tasks, and marshaled our forces towards this new goal. Everyone on the team knew the customer was hanging by a thread.

Related: When you have to take the fall

2. We need to deliver production by Friday

While this edict looks great on paper, it didn’t work out so well. Developers and ops weren’t clear which systems were included in “production”, and how they needed to be connected together.

Each engineer ended up interpreting the goal in their own way, often assuming that others were shouldering ultimate responsibility of delivery.

“Well I did my part, this other part of the team is responsible for those other pieces…” was the refrain I heard. Sadly the deadline was missed, and everything was a mess. Only later, when management dug in and sifted through the rubble, did they actually break up the tasks, assign story points, and give each engineer actionable items to work on.

Related: When clients don’t pay

3. Please work together to make that happen

This doesn’t work because “production” is not an actionable item.

Actionable is much more specific:
o deploy this container
o setup these five servers
o fix these three bugs and push code
o setup these new environment variables

Why is “production” not specific?

Which application? Which layer or API must work? Are there intervening steps before production will work? Are there individual tasks for each engineer?

To me, you need to “break things down” further if you have tasks that span multiple people and multiple sessions. I think of a session as a 2-3 hour bucket of productive work. For me it is also the length of time my laptop battery takes to drain from 100% to 0%. When that happens, I know I need a break.

And I know that’s how I get chunks of work done.

So to take this to a more specific level: if Friday is 5 work days away, I figure I have 12 increments of work each person can do in that time, and if I have 3 engineers, then I need 36 chunks of work.

If you assign 36 chunks of work I believe you are much more likely to get 25-30 of those chunks done.

If you stick to the one macro goal of “get production to work”, engineers may secretly drop the ball, figuring, well, there are dependent tasks that are not done, so we’re not gonna get there anyway. And since the goal points at everyone, nobody shoulders the failure.

Whereas if you have 36 chunks of work, individual engineers are less likely to drag their feet, because it’ll be clear that the holdup was three of John’s tasks…

Related: Why I ask for a deposit

All of this gave me a new appreciation for Agile. It truly does keep teams on track, and focuses everyone on actionable work. This allows management more transparency on what’s working, and gives engineers the feedback they need to get to the finish line.


How to avoid legal problems in consulting


I posted a newsletter recently entitled “When Clients Don’t Pay”.

I got a lot of responses in email, which is always encouraging. I’m happy to know that folks are reading and getting something out of my ideas.

One colleague suggested that I modify my last point about going to court. He suggested that legal action does make sense after other avenues are exhausted.


My feeling about avoiding court has only grown stronger over the years.

There are usually only a few reasons a customer won’t pay. In my experience, each of them is avoidable without going to court.

Here are my thoughts on those…

1. Misaligned on tasks, deliverables or deadlines

I find weekly progress reports and endless notes go a long way towards avoiding this problem. If it does arise, there is usually something specific in those notes that can be remedied.

One also needs to be willing to compromise. Putting yourself in the other’s shoes will help to understand their perspective.

Communicate, communicate, and communicate more!

Related: When you have to take the fall

2. Budget problems

Here there isn’t a lot to do anyway. Although companies are obligated by law to meet payroll, they are not so obligated with vendors. If they are out of cash, will court really resolve that?

My way of heading off this problem is billing/invoicing in smaller increments, getting a deposit, and keeping on top of things, so larger debts don’t build up.

Related: The fine art of resistance

3. Shady customers

These I usually suss out well before becoming engaged. I’ve had a few incidents where a prospect was meeting me to get “free advice”. They ask a lot of architectural questions, and take careful notes. Then they don’t engage, or use their own people to implement.

One situation in particular I remember was around scalability. The product was a website & app for teachers. From the beginning they built it to sync data instantly. As they got bigger and more customers used the platform, their servers became heavily loaded.

I suggested, instead of looking for a technical solution, why not offer your customers silver, bronze & gold service levels? For the gold customers, yeah, they get their own servers, and can sync all the time. But for the silver ones, once a day would probably suffice. Much less load on the servers, because 75% of customers would go silver, 20% bronze and 5% gold.

They actually ran with the idea and implemented it, but never hired me even for an hour of work. I knew they implemented it because I had a friend in the company. It is experiences like that which teach you quite a lot about business and about how you conduct yourself.

This has happened a few times, and I guess it’s part of doing business. But usually that comes out before we go much further, so in a sense it’s a blessing in disguise. 🙂

Related: How to hire a developer that doesn’t suck


What are the key AWS skills and how do you interview for them?


Whether you’re striving for a new role as a Devops engineer, or a startup looking to hire one, you’ll need to be on the lookout for specific skills.


I’ve been on both sides of the fence, at times interviewing candidates, and other times the candidate looking to impress to win a new role.

Here are my suggestions…

Devops Pipeline

Jenkins isn’t the only build server, but it’s been around a long time, so it’s everywhere. You can also do well with CircleCI or Travis. Or even Amazon’s own CodeBuild & CodePipeline.

You should also be comfortable with a configuration management system. Ansible is my personal favorite but obviously there is lots of Puppet & Chef out there too. Talk about a playbook you wrote, how it configures the server, installs packages, edits configs and restarts services.

Bonus points if you can talk about handling deployments with autoscaling groups. Those dynamic environments can’t easily be captured in static host manifests, so talk about how you handle that.

Of course you should also be strong with Git, whether Bitbucket or CodeCommit. Talk about how you create a branch, what gitflow is, and when/how you tag a release.

Also be ready to talk about how a code checkin can trigger a post-commit hook, which can then go and build your application, or new infra to test your code.

Related: How to avoid insane AWS bills

CloudFormation or Terraform

I’m partial to Terraform. Terraform is the MacOSX or iPhone to CloudFormation’s Android or Windows. Why do I say that? Well, it’s more polished and a nicer language to write in. CloudFormation is downright ugly. But hey, both get the job done.

Talk about some code you wrote: how you configured IAM roles and instance profiles, or how you spin up an ECS cluster with Terraform, for example.

Related: How best to do discovery in cloud and devops engagements?

AWS Services

There are lots of them. But the core services are what you should be ready to talk about. CloudWatch for centralized logging: how does it integrate with ECS or EKS?

Route53, how do you create a zone? How do you do geo load balancing? How does it integrate with CertificateManager? Can Terraform build these things?

EC2 is the basic compute service. Tell me what happens when an instance dies. What happens when it boots? What is a user-data script? How would you use one? What’s an AMI? How do you build them?
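For example, be ready to sketch a user-data script in terraform. A minimal example, with a placeholder AMI and a hypothetical nginx install:

resource "aws_instance" "web" {
  ami           = "ami-12345678"   # placeholder
  instance_type = "t2.micro"

  # runs once, at first boot
  user_data = <<EOF
#!/bin/bash
yum install -y nginx
service nginx start
EOF
}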

What about virtual networking? What is a VPC? And a private subnet? What’s a public subnet? How do you deploy a NAT? What’s it for? How do security groups work?

What are S3 buckets? Talk about the infrequently accessed storage tier. How about Glacier? What are lifecycle policies? How do you do cross-region replication? How do you set up CloudFront? What’s a distribution?

What types of load balancers are there? Classic & Application are the main ones. How do they differ? ALB is smarter; it can integrate with ECS, for example. What are some settings I should be concerned with? What about healthchecks?

What is Autoscaling? How do I set up EC2 instances to do this? What’s an autoscaling group? Target? How does it work with ECS? What about EKS?

Devops isn’t about writing application code, but you’re surely going to be writing jobs. What language do you like? Python and shell scripting are a start. What about Lambda? Talk about frameworks to deploy applications.

Related: Are you getting good at Terraform or wrestling with a bear?

Databases

You should have some strong database skills, even if you’re not the day-to-day DBA. Amazon RDS certainly makes administering a bit easier most of the time. But upgrades often require downtime, and unfortunately that’s wired into the service. I see mostly Postgresql, MySQL & Aurora. Get comfortable tuning and optimizing SQL queries. Analyze your slow query log and provide an output.

Amazon’s analytics offering is getting stronger. The purpose-built Redshift is everywhere these days. It may use a postgresql driver, but there’s a lot more under the hood. You also may want to look at Spectrum, which provides an EXTERNAL TABLE type interface, to query data directly from S3.

Not on Redshift yet? Well you can use Athena as an interface directly onto your data sitting in S3. Even quicker.

For larger data analysis, or folks that have systems built around the technology, Hadoop deployments or EMR may be good to know as well. At least be able to talk intelligently about them.

Related: Is zero downtime even possible on RDS?

Questions

Have you written any CloudFormation templates or Terraform code? For example, how do you create a VPC with private & public subnets, plus a bastion box, with Terraform? What gotchas do you run into?

If you are given a design document, how do you proceed from there? How do you build infra around those requirements? What is your first step? What questions would you ask about the doc?

What do you know about Nodejs? Or Python? Why do you prefer that language?

If you were asked to store 500 terabytes of data on AWS, and were going to do analysis of that data, what would be your first choice? Why? Let’s say you evaluated S3 and Athena, and found the performance wasn’t there; what would you move to? Redshift? How would you load the data?

Describe a multi-AZ VPC setup that you recommend. How do you deploy multiple subnets in a high availability arrangement?

Related: Why generalists are better at scaling the web


I tried to build infrastructure as code with Terraform and Amazon. It didn’t go as I expected.


As I was building infrastructure code, I stumbled quite a few times. You hit a wall, and you have to work through those confusing and frustrating moments.


Here are a few of the lessons I learned in the process of building code for AWS. It’s not easy but when you get there you can enjoy the vistas. They’re pretty amazing.

Don’t pass credentials

As you build your applications, there are moments where components need to use AWS in some way. Your webserver needs to use S3 or your ELK box needs to use CloudWatch. Maybe you want to do an RDS backup, or list EC2 instances.

However, it’s not safe to pass your access_key and secret_access_key around. Those should stay on your desktop only. So how best to handle this in the cloud?

IAM roles to the rescue. These are collections of privileges. The cool thing is they can be assigned at the INSTANCE LEVEL, meaning your whole server has permissions to use said resources.

Do this by first creating a role with the privileges you want. Create a json policy document which outlines the specific rules as you see fit. Then create an instance profile for that role.

When you create your ec2 instance in Terraform, you’ll specify that instance profile, either by ARN or, if Terraform created it, by resource ID.
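Here’s a minimal sketch of that wiring. The names are placeholders, and the trust policy file is assumed to allow ec2.amazonaws.com to assume the role:

resource "aws_iam_role" "app" {
  name               = "app-role"
  assume_role_policy = "${file("policies/ec2-assume-role.json")}"
}

resource "aws_iam_instance_profile" "app" {
  name = "app-profile"
  role = "${aws_iam_role.app.name}"
}

resource "aws_instance" "app" {
  ami                  = "ami-12345678"   # placeholder
  instance_type        = "t2.micro"
  iam_instance_profile = "${aws_iam_instance_profile.app.name}"
}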

Related: How to avoid insane AWS bills

Keep passwords out of code

Even though we know it should not happen, sometimes it does. We need to be vigilant to stay on top of this problem. There are projects like Pivotal’s credential scan, which can be used to check your source files for passwords.

What about something like RDS? You’re going to need to specify a password in your Terraform code right? Wrong! You can define a variable with no default as follows:

variable "my_rds_pass" {
  description = "password for rds database"
}

When Terraform comes upon this variable in your code, but sees there is no “default” value, it will prompt you when you do “$ terraform apply”.
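The prompt looks something like this:

$ terraform apply
var.my_rds_pass
  password for rds database

  Enter a value:

If you’re scripting, you can instead pass it with -var 'my_rds_pass=...' on the command line, or set a TF_VAR_my_rds_pass environment variable, so the password never lands in the repo.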

Related: How best to do discovery in cloud and devops engagements?

Versioning your code

When you first start building terraform code, chances are you create a directory, and some tf files, then do your “$ terraform apply”. When you watch that infra build for the first time, it’s exciting!

After you add more components, your code gets more complex. Hopefully you’ve created a git repo to house your code. You can check in & commit the files, so you have them in a safe place. But of course there’s more to the equation than this.

How do you handle multiple environments, dev, stage & production all using the same code?

That’s where modules come in. Now at the beginning you may well have a module that looks like this:

module "all-proj" {

  source = "../"

  myvar = "true"
  myregion = "us-east-1"
  myami = "ami-64300001"
}

And so on. That’s the first step in the right direction. However, if you change your source code, all of your environments will now be using that code. They will get it as soon as you do “$ terraform apply” for each. That’s fine, but it doesn’t scale well.

Ultimately you want to manage your code like other software projects. So as you make changes, you’ll want to tag it.

So go ahead and checkin your latest changes:

# push your latest changes
$ git push origin master
# now tag it
$ git tag -a v0.1 -m "my latest coolest infra"
# now push the tags
$ git push origin v0.1

Great. Now you want to modify your module slightly, as follows:

module "all-proj" {

  source = "git::https://[email protected]/hullsean/myproj-infra.git?ref=v0.1"

  myvar = "true"
  myregion = "us-east-1"
  myami = "ami-64300001"
}

Cool! Now dev, stage and prod can each reference a different version. So you are free to work on the infra without interrupting stage or prod. When you’re ready to promote that code, check in, tag and update stage.

You could go a step further to be more agile, and have a post-commit hook that triggers the stage terraform apply. This, though, requires you to build solid infra tests. Check out testinfra and terratest.

Related: Are you getting good at Terraform or wrestling with a bear?

Managing RDS backups

Amazon’s RDS service is a bit weird. I wrote in the past asking Is upgrading RDS like a shit-storm that will not end? Yes, I’ve had my grievances.

My recent discovery is even more serious! Terraform wants to build infra, and it wants to be able to later destroy that infra. In the case of databases, obviously the previous state is one you want to keep. You want it to persist beyond the infra build. Obvious, no?

Apparently not to the folks at Amazon. When you destroy an RDS instance, it will destroy all the old backups you created. I have no idea why anyone would want this. Certainly not as a default behavior. What’s worse, you can’t copy those backups elsewhere. Why not? They’re probably sitting in S3 anyway!

You can take a final backup when you destroy an RDS instance; that’s wonderful and I recommend it. However it’s not enough. I highly suggest you take matters into your own hands: build a script that calls pg_dump yourself, and copy those .sql or .dump files to S3 for safekeeping.
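A minimal sketch of such a script, with hypothetical host & bucket names. It assumes credentials come from a .pgpass file or the instance profile:

#!/bin/bash
# dump the db & ship it to s3 for safekeeping; run from cron nightly
STAMP=$(date +%Y%m%d)
pg_dump -h mydb.xxxxx.us-east-1.rds.amazonaws.com -U myuser mydb > /tmp/mydb-$STAMP.sql
aws s3 cp /tmp/mydb-$STAMP.sql s3://my-backup-bucket/rds/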

Related: Is zero downtime even possible on RDS?

When to use force_destroy on S3 buckets

As with RDS, when you create S3 buckets with your infra, you want to be able to clean up later. But the trouble is that once you create a bucket, you’ll likely fill it with objects and files.

What then happens is when you go to do “$ terraform destroy” it will fail with an error. This makes sense as a default behavior. We don’t want data disappearing without our knowledge.

However, you do want to be able to clean up. So what to do? Two things.

Firstly, create a process, perhaps a lambda job or bucket replication, to regularly sync your s3 bucket to your permanent archive bucket location. Run that every fifteen minutes, or as often as you need.
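That process can be as simple as an aws s3 sync job in cron. The bucket names here are hypothetical:

# copy anything new over to the permanent archive bucket
aws s3 sync s3://my-app-bucket s3://my-archive-bucket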

Then add a force_destroy line to your s3 bucket resource. Here’s an example s3 bucket for storing load balancer logs:

data "aws_elb_service_account" "main" {}

resource "aws_s3_bucket" "lb_logs" {
  count         = "${var.create-logs-bucket ? 1 : 0}"
  force_destroy = "${var.force-destroy-logs-bucket}"
  bucket        = "${var.lb-logs-bucket}"
  acl           = "private"

  policy = <<POLICY
{
  "Id": "Policy",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::${var.lb-logs-bucket}/*",
      "Principal": {
        "AWS": [
          "${data.aws_elb_service_account.main.arn}"
        ]
      }
    }
  ]
}
POLICY

  tags {
    Environment = "${var.environment_name}"
  }
}


Related: Why generalists are better at scaling the web


How to avoid insane AWS bills


I was flipping through the AWS news recently and ran into this article by Juan Ramallo – I was billed 14k on AWS!

Scary stuff!


When you see headlines like this, your first instinct as a CTO is probably, “Am I at risk?” And then “What are the chances of this happening to me?”

Truth can be stranger than fiction. Our efforts as devops should be towards mitigating risk, and reducing potential for these kinds of things to happen.

1. Use aws instance profiles instead

Those credentials that aws provides are great for enabling the awscli. That’s because you control your desktop tightly. Don’t you?

But passing them around in your application code is prone to trouble. Eventually they’ll end up in a git repo. Not good!

The solution is applying aws IAM permissions at the instance level. That’s right, you can grant an instance permissions to read or write an s3 bucket, describe instances, create & write to dynamodb, or anything else in aws. The entire cloud is api configurable. You create a custom policy for your instance, and attach it to a named instance profile.

When you spin up your EC2 instance, or later modify it, you attach that instance profile, and voila! The instance has those permissions! No messy credentials required!
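You can even attach a profile to an already-running instance from the awscli. A sketch, with placeholder IDs & names:

$ aws ec2 associate-iam-instance-profile \
    --instance-id i-0123456789abcdef0 \
    --iam-instance-profile Name=app-profile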

Related: Is Amazon too big to fail?

2. Enable 2 factor authentication

If you haven’t already, you should force 2 factor authentication on all of your IAM users. It’s an extra step, but well worth it. Here’s how to set it up.

Mobile phones support all sorts of 2FA apps now, from Duo, to Authenticator, and many more.

Related: Is AWS too complex for small dev teams?

3. Scan your code for credentials

Encourage developers to use tools like Pivotal’s Credentials Scan.

Hey, while you’re at it, why not add a post commit hook to your code repo in git. Have it run the credentials scan each time code is committed. And when it finds trouble, it should email out the whole team.
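A sketch of such a hook. The scan command and team address are placeholders; substitute your actual tool:

#!/bin/bash
# .git/hooks/post-commit
if ! credential-scan . > /tmp/scan.txt 2>&1; then
    mail -s "credentials found in latest commit!" team@example.com < /tmp/scan.txt
fi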

This will get everybody on board quick!

Related: Are we fast approaching cloud-mageddon?

4. Scan your S3 Buckets

Open S3 buckets can be a real disaster, offering up your private assets & business data to the world. What to do about it?

Scan your S3 buckets regularly.
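A quick sketch using the awscli, flagging any bucket whose ACL grants access to AllUsers:

#!/bin/bash
for b in $(aws s3api list-buckets --query 'Buckets[].Name' --output text); do
    aws s3api get-bucket-acl --bucket $b | grep -q AllUsers && echo "public: $b"
done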

Also you can tie in this scanning process to a monitoring alert. That way as soon as an errant bucket is found, you’re notified of the problem. Better safe than sorry!

Related: Which tech do startups use most?


How to find freelance work


I’ve decided to take the plunge, and begin a career as a freelancer. What do you think of services like UpWork? Can I build a business around that?


There are lots of services that promise the same thing. Headshops too are businesses built around reselling you to customers.

1. Whose relationship?

On those platforms you are a commodity. And further you don’t control the relationship. Upwork becomes your customer.

This is a crucial point. You can’t negotiate additional services or fees, or build on the relationship. Because your customer is UpWork. They control the business they bring to you.

Just remember, your boss/client/customer is the one who writes you a check.

Related: When you have to take the fall

2. Learn sales

If you think you’re not so great at sales, join the club. It’s a real talent, and one not everybody is born with.

But if you want to work for yourself, it’s absolutely crucial. So get practicing!

Related: When clients don’t pay

3. Go to events

The ways I have found: network, go to meetups, blog weekly, and send out a monthly newsletter. Add everyone you ever meet to your newsletter. Write interesting things & appeal to a broad audience. Some receiving your newsletter will not read it, but they will see your name pop up in their inbox once a month.

Related: Why I ask for a deposit

4. Expand

As you network, ask others for recommendations. Events, private email lists, single day conferences, forums etc.

Related: Can progress reports help consulting engagements succeed?

5. Craft an origin story

And don’t forget to tell your story. And tell it well. Craft a memorable origin narrative. Practice, and add or remove things that resonate with people you meet. Even ask people: what do you think about my presentation? Any suggestions? Is it confusing, enticing, exciting?

Related: Why do people leave consulting?


How to succeed with fixed price projects


Bidding on projects is an art as much as a science. Exciting a customer around skills and past successes is as important as being able to see details that haven’t yet materialized.


So how does one approach this challenge? One way is to steer towards time and materials, and let things evolve in their own way. But that may not always work.

Here are my thoughts on how to navigate a fixed-fee project.

Overhead costs

When thinking about the costing of projects, there are a lot of hidden costs. For full-time folks, there is the cost of overhead around office space, supplies, training, liability & health insurance, retirement, time off, and even severance in some cases.

There is also the cost of time: to hire the right team, manage them, and bring all the pieces together to get a successful product out the door.

Lots of intangibles.

Related: Can progress reports help you achieve successful engagements?

Evolving scope

When looking at a project, to come up with a realistic fixed bid, the scope must be carefully considered. If the bridge has two spans at either end and you decide to add one in the middle, does that mean a project of twice the size?

Both the vendor and the manager must together attempt to break down the full scope into smaller pieces. Inevitably there will be some amount of emergent tasks, and the scope will change and evolve.

Both consultant and customer must be realistic about this. You can call them product features, or in the agile universe, stories, but at the end of the day, when you have many pieces, surprises will happen.

The devil is surely in the details!

Related: How best to do discovery in cloud and devops engagements?

Horse Trading Skills

Given that we know things will change, the customer and vendor should plan for change.

If both parties have a realistic perspective, there is the possibility of exchanging original scoped items for emergent or evolving scoped surprises.

That is, both need to be comfortable doing some sort of horse trading, to keep the levels balanced. The client then gets some leeway, as does the consultant in deliverables.

It’s not easy, but it’s truly necessary in a fixed-price project. Because a scope never really sits still.

Related: Why do people leave consulting?

Underbidding

Another approach that may work is underbidding to win the project. Here your scope is expected to change, and it becomes a painful process each time it does. If you are strong on sales, this may work, but you’re sure to get an endless stream of change orders, and many, many scrapes and bruises.

Related: Why I ask for a deposit

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters