What are the key aws skills and how do you interview for them?

via GIPHY

Whether you’re striving for a new role as a Devops engineer, or a startup looking to hire one, you’ll need to be on the lookout for specific skills.

Join 38,000 others and follow Sean Hull on twitter @hullsean.

I’ve been on both sides of the fence, at times interviewing candidates, and other times the candidate looking to impress to win a new role.

Here are my suggestions…

Devops Pipeline

Jenkins isn’t the only build server, but it’s been around a long time, so it’s everywhere. You can also do well with CircleCI or Travis. Or even Amazon’s own CodeBuild & CodePipeline.

You should also be comfortable with a configuration management system. Ansible is my personal favorite but obviously there is lots of Puppet & Chef out there too. Talk about a playbook you wrote, how it configures the server, installs packages, edits configs and restarts services.

Bonus points if you can talk about handling deployments with autoscaling groups. Those dynamic environments can’t easily be captured in static host manifests, so talk about how you handle that.

Of course you should also be strong with Git, bitbucket or codecommit. Talk about how you create a branch, what’s gitflow and when/how do you tag a release.

Also be ready to talk about how a code checkin can trigger a post commit hook, which then can go and build your application, or new infra to test your code.

Related: How to avoid insane AWS bills

CloudFormation or Terraform

I’m partial to Terraform. Terraform is MacOSX or iPhone to CloudFormation as Android or Windows. Why do I say that? Well it’s more polished and a nicer language to write in. CloudFormation is downright ugly. But hey both get the job done.

Talk about some code you wrote, how you configured IAM roles and instance profiles, how you spinup an ECS cluster with Terraform for example.

Related: How best to do discovery in cloud and devops engagements?

AWS Services

There are lots of them. But the core services, are what you should be ready to talk about. CloudWatch for centralized logging. How does it integrate with ECS or EKS?

Route53, how do you create a zone? How do you do geo load balancing? How does it integrate with CertificateManager? Can Terraform build these things?

EC2 is the basic compute service. Tell me what happens when an instance dies? When it boots? What is a user-data script? How would you use one? What’s an AMI? How do you build them?

What about virtual networking? What is a VPC? And a private subnet? What’s a public subnet? How do you deploy a NAT? WHat’s it for? How do security groups work?

What are S3 buckets? Talk about infraquently accessed? How about glacier? What are lifecycle policies? How do you do cross region replication? How do you setup cloudfront? What’s a distribution?

What types of load balancers are there? Classic & Application are the main ones. How do they differ? ALB is smarter, it can integrate with ECS for example. What are some settings I should be concerned with? What about healthchecks?

What is Autoscaling? How do I setup EC2 instances to do this? What’s an autoscaling group? Target? How does it work with ECS? What about EKS?

Devops isn’t about writing application code, but you’re surely going to be writing jobs. What language do you like? Python and shell scripting  are a start. What about Lambda? Talk about frameworks to deploy applications.

Related: Are you getting good at Terraform or wrestling with a bear?

Databases

You should have some strong database skills even if you’re not the day-to-day DBA. Amazon RDS certainly makes administering a bit easier most of the time. But upgrade often require downtime, and unfortunately that’s wired into the service. I see mostly Postgresql, MySQL & Aurora. Get comfortable tuning SQL queries and optimizing. Analyze your slow query log and provide an output.

Amazon’s analytics offering is getting stronger. The purpose built Redshift is everywhere these days. It may use a postgresql driver, but there’s a lot more under the hood. You also may want to look at SPectrum, which provides a EXTERNAL TABLE type interface, to query data directly from S3.

Not on Redshift yet? Well you can use Athena as an interface directly onto your data sitting in S3. Even quicker.

For larger data analysis or folks that have systems built around the technology, Hadoop deployments or EMR may be good to know as well. At least be able to talk intelligently about it.

Related: Is zero downtime even possible on RDS?

Questions

Have you written any CloudFormation templates or Terraform code? For example how do you create a VPC with private & public subnets, plus bastion box with Terraform? What gotches do you run into?

If you are given a design document, how do you proceed from there? How do you build infra around those requirements? What is your first step? What questions would you ask about the doc?

What do you know about Nodejs? Or Python? Why do you prefer that language?

If you were asked to store 500 terrabytes of data on AWS and were going to do analysis of the data what would be your first choice? Why? Let’s say you evaluated S3 and Athena, and found the performance wasn’t there, what would you move to? Redshift? How would you load the data?

Describe a multi-az VPC setup that you recommend. How do you deploy multiple subnets in a high availability arragement?

Related: Why generalists are better at scaling the web

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

5 core pieces of the Amazon Cloud puzzle to get your project off the ground

amazon cloud automation

One of the most common engagements I do is working with firms in and around the NYC startup sector. I evaluate AWS infrastructures & applications built in the Amazon cloud.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

I’ve seen some patterns in customers usage of Amazon. Below is a laundry list of the most important ones.

On our products & pricing page you can see more detail including how we perform a performance review and a sample executive summary.

1. Use automation

When you first start using Amazon Web Services to host your application, you like many before you may think of it like you’re old school hosting. Setup a machine, configure it, get your code running. The traditional model of systems administration. It’s fine for a single server, but if you’re managing a more complex deploy with continuous integration, or want to be resilient to regular server failures you need to do more.

Enter the various automation tools on offer. The simplest of the three is Elastic Beanstalk. If you’re using a very standard stack & don’t need a lot of customizations, this may well work for you.

With more complex deployments you’ll likely want to look at Opsworks Sounds familiar? That’s because it *is* Opscode Chef. Everything you can do with Chef & all the templates out there will work with Amazon’s offering. Let AWS manage your templates & make sure your servers are in the right state, just like hosted chef.

If you want to get down to the assembly language layer of infrastructure in Amazon, you’ll eventually be dealing with CloudFormation. This is JSON code which defines everything, from a server with an attached EBS volume, to a VPC with security rules, IAM users & everything inbetween. It is ultimately what these other services utilize under the hood.

Also: Is Amazon too big to fail?

2. Use Advisor & Alerts

Amazon has a few cool tools to help you manage your infrastructure better. One is called Trusted Advisor . This helps you by looking at your aws usage for best practices. Cost, performance, security & high availability are the big focal points.

In order to make best use of alerts, you’ll want to do a few things. First define an auto scaling group. Even if you don’t want to use autoscaling, putting your instance into one allows amazon to do the monitoring you’ll want.

Next you’ll want to analyze your CloudWatch metrics for usage patterns. Notice a spike, could be a job that is running, or it could be a seasonal traffic spike that you need to manage. Once you have some ideas here, you can set alerts around normal & problematic usage patterns.

Related: Are we fast approaching cloud-mageddon?

3. Use Multi-factor at Login

If you haven’t already done so, you’ll want to enable multi-factor authentication on your AWS account. This provides much more security than a password (even a sufficiently long one) can ever do. You can use Google authenticator to generate the mfa codes and associated it with your smartphone.

While you’re at it, you’ll want to create at least one alternate IAM account so you’re not logging in through the root AWS account. This adds a layer of security to your infrastructure. Consider creating an account for your command line tools to spinup components in the cloud.

You can also use MFA for your command line SSH logins. This is also recommended & not terribly hard to setup.

Read: When hosting data on Amazon turns bloodsport

4. Use virtual networking

Amazon offers Virtual Private Cloud which allows you to create virtual networks within the Amazon cloud. Set your own ip address range, create route tables, gateways, subnets & control security settings.

There is another interesting offering called VPC peering. Previously, if you wanted to route between two VPCs or across the internet to your office network, you’d have to run a box within your VPC to do the networking. This became a single point of failure, and also had to be administered.

With VPC peering, Amazon can do this at the virtualization layer, without extra cost, without single point of failure & without overhead. You can even use VPC peering to network between two AWS accounts. Cool stuff!

Also: Are SQL databases dead?

5. Size instances & I/O

I worked with one startup that had been founded in 2010. They had initially built their infrastructure on AWS so they chose instances based on what was available at the time. Those were m1.large & m1.xlarge. A smart choice at the time, but oh how things evolve in the amazon world.

Now those instance types are “previous generation”. Newer instances offer SSD, more CPU & better I/O for roughly the same price. If you’re in this position, be sure to evaluate upgrading your instances.

If you’re on Amazon RDS, you may not be able to get to the newer instance sizes until you upgrade your database. Does upgrading MySQL involve much more downtime on Amazon RDS? In my experience it surely does.

Along with instance sizes, you’ll also want to evaluate disk I/O options. By default instances in amazon being multi-tenant, use disk as a shared resource. So they’ll see it go up & down dramatically. This can kill database performance & can be painful. There are expensive solutions. Consider looking at provisioned IOPS and additional SSD storage.

Also: Is the difference between dev & ops a four-letter word?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Connect to MySQL in the Amazon Public Cloud

MySQL on Amazon Cloud AWS

Troubleshooting MySQL on Amazon can be a real test of patience. There are quite a few different things to watch out for in terms of connectivity & networking. Sometimes a checklist can help.

Join 16,000 others and follow Sean Hull on twitter @hullsean.

Here’s my exhaustive list of things that can block you.

1. Be sure to create users & grants

Chances are you did something like this to create your user:


mysql> CREATE USER ‘sean’@‘localhost’ IDENTIFIED BY ‘password’;
mysql> GRANT ALL PRIVILEGES ON sean_schema.* TO ‘sean’@‘localhost' WITH GRANT OPTION;

But that won’t help you when connecting from a remote Amazon box. So what to do? Here’s an example:


mysql> CREATE USER ‘sean’@’10.10.%’ IDENTIFIED BY ‘password’;
mysql> GRANT ALL PRIVILEGES ON sean_schema.* TO ‘sean’@‘%’ WITH GRANT OPTION;

You may need to make your source IP wildcard *more* aggressive. For example consider ’10.%’. You *may* even with with ‘%’ which allows *all* source IPs. This may sound dangerous, but if you use a tight security group (see item #3 below), you can still be safe.

Related: Why Oracle Won’t Kill MySQL

2. Make sure iptables is not a problem

IPTables is a Linux service that acts like a private firewall for each server. Some AMIs will have it enabled by default. If you’re having trouble like I did, this can definitely trip you up. That’s because your connection will fail silently without telling you, hey the OS won’t let me into that port!

If you are a networking pro you’ve probably already fiddled with iptables. Feel free to add specific rules, and keep it turned on. However I’d recommend just disabling it completely, and using your Amazon security groups to protect your ports.


$ /etc/init.d/iptables stop
$ chkconfig --list iptables
iptables 0:off 1:off 2:on 3:on 4:on 5:on 6:off
$ chkconfig --del iptables
$ chkconfig --list iptables
service iptables supports chkconfig, but is not referenced in any runlevel (run 'chkconfig --add iptables')

Also: Are SQL Databases Dying Out?

3. Test & verify amazon security group settings

Security groups in Amazon can be tricky. I recommend the following:

o create a security group webserver_group
– allow port 80 from 0.0.0.0/0
– allow port 443 from 0.0.0.0/0
– allow port 22 from

o create a security group db_group
– allow port 22 from
– allow 3306 from

What’s happening here? We can’t specify a fixed set of IP addresses because they can change in Amazon. So essentially what we’ve done is say *any* requests from servers in our Amazon package, which are in the webserver_group security group, can connect to port 3306. Pretty cool right?

This means we’re pretty locked down. No internet connections to 3306, so we can be a little looser (see item #1 above) about our grants and source IPs.

What about if you want to use your GUI tools to hit your Amazon hosted MySQL boxes? Say you like to use the Oracle Workbench, Navicat or Toad to connect to MySQL. One way you could do this is configure your db_group to allow 3306 from your office subnet. Then anyone VPN’d into your office will be able to use the tools they like.

Another option is to use Amazon VPC for your servers. You’ll setup an Amazon Virtual Private Gateway, which is a direct VPN connection between Amazon’s datacenter and your datacenter. This can be a messy process, and you’ll want to contact your network admin to help. Once it’s setup, amazon boxes appear to sit on your office or datacenter network. Cool stuff!


$ mysql -h xxx.xxx.xxx.xxx -u admin -p
Enter password:
ERROR 2003 (HY000): Can't connect to MySQL server on 'xxx.xxx.xxx.xxx'

Read this: Why are MySQL experts in such short supply?

4. MySQL network settings

If MySQL is bound to the wrong IP address you can have real problems. First be sure skip_networking is OFF. If it is ON change it in /etc/my.cnf & restart MySQL.


mysql> show variables like 'skip_net%';
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| skip_networking | OFF |
+-----------------+-------+
1 row in set (0.00 sec)

The other MySQL setting that can be problematic is bind-address. First check what it is set to:


$ cat /etc/my.cnf | grep bind
bind-address=127.0.0.1

This isn’t going to allow remote connections. In amazon however, your IP address may change upon reboot. So there is a special setting to allow binding to any IP:


bind-address=0.0.0.0

Related: Bulletproofing MySQL Replication with Checksums

5. installing mysql client & telnet for troubleshooting

You have two options for troubleshooting on the webserver side. If you’re simply trying to check by mysql command line, you may get blocked up if the network settings & security groups aren’t configured right. So use telnet first.


$ yum install -y telnet

$ telnet 10.10.10.1 3306
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
4
5.1.71??gu9Y6B'/y9Oay`QV

If you don't get a responce, it's not an issue with users or grants, but rather that the port isn't opened. Check iptables, check bind-address and check security groups.

Check this: Top MySQL DBA Interview Questions

6. SE Linux related issues

SE Linux will do a lot of good, if managed properly. However if you're not aware of it's existence, it can be very very frustrating. Symptoms can be as abstract as allergies, a cold or flu. It can monitor files, and prevent MySQL from being able to write where it needs to,

Read this: Migrating MySQL to Oracle

7. RPM & later centos yum repo install conflicts

I had real problems doing a custom install for a customer. They didn't want to use a repository for various settings, but preferred downloading RPMs. There were a few other customizations which were tripping things up.

Based on all the connectivity issues I was having, I backed out of the RPM based install, and then ran through a stock yum install. After doing that, I started seeing these weird errors in the mysqld.log

120328 21:32:40 [ERROR] Can't start server: Bind on TCP/IP port: Address already in use
120328 21:32:40 [ERROR] Do you already have another mysqld server running on port: 3306 ?
120328 21:32:40 [ERROR] Aborting
If I run "netstat -nat | grep 3306" in my terminal, I get the following:
tcp4 0 0 *.3306 . LISTEN

I spent hours spinning my wheels and not able to figure out what was happening here. At first it seemed a leftover pid file was the culprit. In the end it appeared the *old* /etc/init.d/mysql script was still in place, and the new yum packages wouldn't work with that.

I ended up just scrapping the whole box, and starting from scratch. Sometimes you have to do that. After a clean build, all was fine.

Related: RDS or MySQL 10 Use Cases

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don't work with recruiters