
Does Amazon’s security work well for startups?


I was sifting through my project & progress reports from former clients today. Something struck me loud and clear. It seems 4 out of 5 of them don’t implement VPC best practices.

Join 35,000 others and follow Sean Hull on twitter @hullsean.

Which raises the question again and again: is the service just too damn complicated? I wrote about this topic before… Is AWS a bit too complex for most, or at least for smaller dev teams?

1. No private subnets

What are those you ask? I really hope you’re not asking that.

The best-practices way to deploy on Amazon is to use a VPC. This provides a logical grouping. You could have dev, stage and prod VPCs, and perhaps a utility one for other more permanent services.

Within that VPC, you want to have everything deployed in one or more private subnets. These are each mapped to a specific AZ in that region. The AZ maps to a physical datacenter, a single building within that region. These private subnets have *NO route to the internet*.

How do you reach resources in the private subnet? You must be coming from the public subnet deployed within that same VPC. All the routing rules enforce this. The only two types of resources that belong in the public subnet: a load balancer for 80/443 traffic, and a jump or bastion box for ssh.
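
Want to verify this on an existing setup? Here's a quick check from the CLI. The subnet id below is a placeholder; a private subnet's route table should show no 0.0.0.0/0 route pointing at an igw- internet gateway:

# list the routes attached to a given subnet
$ aws ec2 describe-route-tables \
    --filters Name=association.subnet-id,Values=subnet-0abc123 \
    --query "RouteTables[].Routes[].[DestinationCidrBlock,GatewayId,NatGatewayId]" \
    --output text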

Read: How can 1% of something equal nothing?

2. Security groups with all ports open

Another thing that I see more often than you might guess is all ports open by some wildcard rule. *BAD*. We all know it’s bad, but it happens. And then it gets forgotten. We see developers doing it as a temporary fix to get something working, then forgetting to plug up the hole later.

Even for security groups that don’t have this problem, they often allow port 22 from anywhere on the internet (0.0.0.0/0). This is unnecessary and rather reckless. Everyone should be coming from known source IPs. That can be an office network, some other trusted server on the internet, or a block of IPs that you’ll always have assigned.

And of course don’t have your database port open. MySQL and Postgres don’t have particularly great protections here.
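
Here's a sketch of how you might audit and fix this from the CLI. The group id and office CIDR below are placeholders:

# find any groups with a rule open to the whole internet
$ aws ec2 describe-security-groups \
    --filters Name=ip-permission.cidr,Values=0.0.0.0/0 \
    --query "SecurityGroups[].[GroupId,GroupName]" --output text

# close up ssh, then re-allow it only from a known office block
$ aws ec2 revoke-security-group-ingress --group-id sg-0abc123 \
    --protocol tcp --port 22 --cidr 0.0.0.0/0
$ aws ec2 authorize-security-group-ingress --group-id sg-0abc123 \
    --protocol tcp --port 22 --cidr 203.0.113.0/24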

Related: Is Fred Wilson right about dealing in an honest, direct and transparent way?

3. No flowlogs enabled

Flowlogs allow you to log things at the packet level. Want to know about failed ssh attempts? Log that. Want to know about other ports? Log that too.

If you’re funneling all your connections through a jump box, you can enable flowlogs just for that box itself. You may also want to watch what’s happening with the load balancer too.

Flowlogs work at the network interface layer of your VPC, so you’ll need to understand VPCs in depth.
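
Here's a rough sketch of turning flowlogs on for just the bastion's network interface. The eni id, log group and IAM role are placeholders, and you'll need a role that lets the flow logs service write to CloudWatch Logs. REJECT traffic is where those failed ssh attempts show up:

$ aws ec2 create-flow-logs \
    --resource-type NetworkInterface \
    --resource-ids eni-0abc123 \
    --traffic-type REJECT \
    --log-group-name bastion-flowlogs \
    --deliver-logs-permission-arn arn:aws:iam::123456789012:role/flowlogs-role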

Related: What mistakes did you make when starting as a consultant?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest: Why I don’t work with recruiters


How do we secure an existing AWS hosted application?


What if you don’t have the luxury of a greenfield? You’re looking at an already built application, and asking yourself: how do I secure this?

Join 35,000 others and follow Sean Hull on twitter @hullsean.

One can think of it as a giant labyrinth, with many turns and many paths. Some of those paths have not had light shining in them for some time. So you’ll need to be cautious, thorough, and vigilant.

Here are some notes on where to start.

1. Scanning – code

One area you’ll need to dig into is the application code itself. If you don’t have the luxury to push new code, you’ll need to verify what version is deployed, and scan the repository for keys or passwords. You can also scan on the server itself. Better to double your efforts.
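
A minimal starting point for that scan might look like this. The patterns and the /opt/app path are stand-ins for your own:

# scan every commit in the repo, not just the working tree
$ git grep -nIE "(password|secret|aws_access_key)" $(git rev-list --all) | head

# and double up by scanning the deployed server directly
$ grep -rnIE "(password|secret)" /etc /opt/app 2>/dev/null | head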

Read: What do the best engineers do better?

2. Scanning – network

Your VPC is obviously your first layer of defense. Scan the route tables and network ACLs, to make sure there aren’t open ports or overly permissive whitelisted IPs.

Do the same sort of review for security groups, as those are an alternative method for configuring access to servers.

AWS has a feature called VPC Flow Logs, which can be enabled. These give you detailed network layer logging, which you can then scan for trouble.

Related: Is Fred Wilson right about dealing in an honest, direct and transparent way?

3. Scanning – IAM, keys & console

Your existing devs probably have keys to some or all of the EC2 boxes. If you don’t want to relaunch all of these boxes with new keys, or don’t have the luxury to do that, you’ll need to lock down the security groups, whitelisted IPs and VPC routing rules.

You’ll also need to carefully review IAM roles & policies. Amazon Inspector may be a useful tool to scan your environment, find glaring holes and enforce best practices. But you’ll also want to do your own scanning, both automated and manually eyeballing the accounts.

You’ll also want to lock down console access, especially the root account, and any others that have administrator privileges. Enable password policies and password rotation, as well as multi-factor authentication. There is also a nice toggle for “alert on login”. You certainly want to know about those!
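
For the manual eyeballing, the IAM credential report is a good first pass. A sketch, assuming your CLI user has IAM read access:

# the report lists every user: password age, key age, MFA on or off
$ aws iam generate-credential-report
$ aws iam get-credential-report --query Content --output text | base64 --decode | head

# quick counts of users, MFA devices, policies & more
$ aws iam get-account-summary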

Related: What mistakes did you make when starting as a consultant?

4. Scanning – services

Review all of the AWS services that are deployed. Ask yourself some of these questions:

o which regions & availability zones am I deployed in?
o what elastic IPs do I have configured where?
o what IAM roles & policies do I have created?
o what databases, API gateways & S3 buckets are configured?
o etc…
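
Here's a rough way to start answering those from the CLI, looping over every region. Nothing below is specific to any one account:

$ for r in $(aws ec2 describe-regions --query "Regions[].RegionName" --output text); do
    echo "== $r =="
    aws ec2 describe-addresses --region "$r" --query "Addresses[].PublicIp" --output text
    aws rds describe-db-instances --region "$r" \
      --query "DBInstances[].DBInstanceIdentifier" --output text
  done

# S3 buckets are global, so one call covers them
$ aws s3api list-buckets --query "Buckets[].Name" --output text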

CloudTrail can be a great help here as it can log all sorts of useful information. You can then scan those logs for problems.

Related: Why did mailchimp fraudulently charge my credit card?

5. Rebuilding

The scanning approach can work, but there is a strong need to be thorough. If you miss one whitelisted IP or existing ssh key, you can leave the whole network open to a crafty intruder.

Another option is to rebuild the whole application. This gives you the time to:

o automate the whole stack with terraform
o test that everything is working
o plan for failover
o ensure that every bucket is secure with lifecycle policies enabled
o ensure that every EBS volume is encrypted
o enable CloudTrail, CloudWatch etc

o potentially set up in a *brand new* AWS account, for even more confidence
o backup all the pieces of the application as you go
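
Once rebuilt, a couple of spot checks confirm those last few items. A sketch:

# any unencrypted EBS volumes left behind?
$ aws ec2 describe-volumes \
    --query 'Volumes[?Encrypted==`false`].VolumeId' --output text

# and is CloudTrail actually on?
$ aws cloudtrail describe-trails --query "trailList[].[Name,HomeRegion]" --output text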

Read: Did Disney+ have to fail?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest: Why I don’t work with recruiters


When I found gold in my customer archives


I’m good at keeping notes. I’ve blogged about it before: Can progress reports & daily notes help engagements succeed? I would give that an emphatic YES!

From helping with communication, to sharing arcane details about blocking issues, struggles & hurdles, notes can illuminate things that a CTO or manager may not otherwise be aware of.

Join 35,000 others and follow Sean Hull on twitter @hullsean.

I was digging through my archives recently, and found a bunch of notes on how to think about Terraform. In particular, how do you think about infrastructure as code? How do you architect to make it all work together?

1. A dead end started me backtracking

You’re going to dig your heels in getting your application working. To do that you’ll spin up a VPC, private & public subnets, bastion boxes, ECS hosts to deploy containers to, and an application load balancer endpoint. Getting that all working wasn’t terrible. We even included a prometheus node, to give us some monitoring visibility, and added our jenkins server into the mix. Do you see where this is going?

At a certain point we of course needed to destroy the whole setup, but didn’t want to destroy the CI pipeline. Duh! And what about monitoring? Lose all that data each time? No way!

Read: Infrastructure provisioning – what is it and why is it important?

2. Organize around VPCs

After dragging yourself through that, you see a bit better. It’s like standing at 20,000 feet.

Your VPC is a logical collection of instances. A box that holds your application, provides security, and gets created and destroyed with it. You can even see it in your Terraform code: a subnet requires a VPC id within which to create it. And an instance requires a subnet within which to create it. For security reasons the application instances will sit within PRIVATE subnets, and only the bastion box & load balancers will sit in PUBLIC subnets.

To my mind that means each environment, DEV, STAGE and PROD, gets its own VPC. This also allows you to control who can access stage & production, as they have their own bastion access points.
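
One way to slice that in Terraform is workspaces: same code, separate state per environment. Separate state files per directory work too; this is just a sketch of the workspace route:

$ terraform workspace new dev
$ terraform workspace new stage
$ terraform workspace new prod

$ terraform workspace select stage
$ terraform plan    # touches only the stage environment's state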

Read: How to hire a developer that doesn’t suck

3. Build a utility VPC

What you’ll also see from the above story is that you need a place for business-wide, non-application services to sit. Welcome the UTILITY VPC!

This can contain prometheus, ELK or other log collection service, your jenkins or other CI pipeline, and any other services that don’t logically fit within the application VPC.
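
In practice that can be as simple as two directories, each with its own Terraform state, so destroying the application VPC never touches jenkins or your metrics. A sketch:

# infra/app      -> application VPC: subnets, ECS hosts, ALB
# infra/utility  -> utility VPC: jenkins, prometheus, log collection

$ cd infra/app && terraform destroy    # CI pipeline & monitoring survive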

Related: Why generalists are better at scaling the web

4. A VPC should be ok to destroy and rebuild in another region – in one click

When you use infrastructure code, you want to test, create & destroy often. That shouldn’t disrupt anything. That means all state data should sit outside of those instances. Logging data? Send it to logstash or CloudWatch. Application state? Keep that inside an RDS instance. And you’ve tested those backups, right?

Speaking of RDS, I encountered problems with Amazon’s own backup & restore, and ended up writing a custom db dump script. That may require a custom restore too, so buyer beware. Here’s my story though… I tried to build infrastructure as code with Terraform and Amazon and it didn’t go as I expected.
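
For what it's worth, the dump script needn't be fancy. A minimal sketch, assuming credentials live in ~/.my.cnf and my-db-backups is a hypothetical bucket:

# nightly logical dump, streamed straight to S3
$ mysqldump --single-transaction --routines --all-databases \
    | gzip \
    | aws s3 cp - "s3://my-db-backups/dump-$(date +%F).sql.gz"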

You also may encounter issues when you move across regions, such as elastic IPs and so forth. And you’ll need to check and verify the code which creates and destroys S3 buckets and domain name certs. These areas gave me some hiccups, but you can work through them with diligence!

Read: Is zero downtime even possible on RDS?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest: Why I don’t work with recruiters


What are the key aws skills and how do you interview for them?


Whether you’re striving for a new role as a Devops engineer, or a startup looking to hire one, you’ll need to be on the lookout for specific skills.

Join 38,000 others and follow Sean Hull on twitter @hullsean.

I’ve been on both sides of the fence, at times interviewing candidates, and at other times being the candidate looking to impress and win a new role.

Here are my suggestions…

Devops Pipeline

Jenkins isn’t the only build server, but it’s been around a long time, so it’s everywhere. You can also do well with CircleCI or Travis. Or even Amazon’s own CodeBuild & CodePipeline.

You should also be comfortable with a configuration management system. Ansible is my personal favorite but obviously there is lots of Puppet & Chef out there too. Talk about a playbook you wrote, how it configures the server, installs packages, edits configs and restarts services.

Bonus points if you can talk about handling deployments with autoscaling groups. Those dynamic environments can’t easily be captured in static host manifests, so talk about how you handle that.

Of course you should also be strong with Git, Bitbucket or CodeCommit. Talk about how you create a branch, what gitflow is, and when/how you tag a release.

Also be ready to talk about how a code checkin can trigger a post-commit hook, which can then go and build your application, or spin up new infra to test your code.
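
If you want something concrete to walk through, the day-to-day flow looks roughly like this:

$ git checkout -b feature/login-page     # branch off develop, gitflow style
$ git commit -am "add login form"
$ git push -u origin feature/login-page  # the push can kick off your CI build

$ git tag -a v1.4.0 -m "release 1.4.0"   # an annotated tag marks the release
$ git push origin v1.4.0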

Related: How to avoid insane AWS bills

CloudFormation or Terraform

I’m partial to Terraform. Terraform is to CloudFormation as MacOSX is to Windows, or iPhone to Android. Why do I say that? Well it’s more polished and a nicer language to write in. CloudFormation is downright ugly. But hey, both get the job done.

Talk about some code you wrote, how you configured IAM roles and instance profiles, how you spin up an ECS cluster with Terraform for example.
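
It also helps to know the CLI equivalents of what that Terraform wires up. A sketch with hypothetical names, where trust.json is a policy allowing ec2.amazonaws.com to assume the role:

$ aws iam create-role --role-name app-role \
    --assume-role-policy-document file://trust.json
$ aws iam attach-role-policy --role-name app-role \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
$ aws iam create-instance-profile --instance-profile-name app-profile
$ aws iam add-role-to-instance-profile \
    --instance-profile-name app-profile --role-name app-role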

Related: How best to do discovery in cloud and devops engagements?

AWS Services

There are lots of them. But the core services are what you should be ready to talk about. CloudWatch for centralized logging. How does it integrate with ECS or EKS?

Route53, how do you create a zone? How do you do geo load balancing? How does it integrate with CertificateManager? Can Terraform build these things?

EC2 is the basic compute service. Tell me what happens when an instance dies? When it boots? What is a user-data script? How would you use one? What’s an AMI? How do you build them?
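
A user-data script is just a script handed to the instance at launch that runs once, as root, on first boot. A minimal sketch for an Amazon Linux 2 style AMI:

#!/bin/bash
# runs once as root when the instance first boots
yum update -y
yum install -y nginx
systemctl enable --now nginx
echo "$(hostname) came up at $(date)" >> /var/log/first-boot.log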

What about virtual networking? What is a VPC? And a private subnet? What’s a public subnet? How do you deploy a NAT? What’s it for? How do security groups work?

What are S3 buckets? Talk about infrequently accessed storage. How about Glacier? What are lifecycle policies? How do you do cross region replication? How do you set up CloudFront? What’s a distribution?
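
Lifecycle policies are worth being able to sketch out loud. Something like this (my-bucket is a placeholder) moves objects to infrequent access at 30 days and Glacier at 90:

$ aws s3api put-bucket-lifecycle-configuration --bucket my-bucket \
    --lifecycle-configuration '{
      "Rules": [{
        "ID": "archive",
        "Status": "Enabled",
        "Filter": {"Prefix": ""},
        "Transitions": [
          {"Days": 30, "StorageClass": "STANDARD_IA"},
          {"Days": 90, "StorageClass": "GLACIER"}
        ]
      }]
    }'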

What types of load balancers are there? Classic & Application are the main ones. How do they differ? ALB is smarter, it can integrate with ECS for example. What are some settings I should be concerned with? What about healthchecks?

What is Autoscaling? How do I setup EC2 instances to do this? What’s an autoscaling group? Target? How does it work with ECS? What about EKS?
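
Being able to rough out the CLI version is a good test of whether you know the moving parts. A sketch, with a hypothetical launch template and target group:

$ aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name web-asg \
    --launch-template LaunchTemplateName=web-lt,Version='$Latest' \
    --min-size 2 --max-size 6 --desired-capacity 2 \
    --vpc-zone-identifier "subnet-0aaa,subnet-0bbb" \
    --target-group-arns arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web/abc123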

Devops isn’t about writing application code, but you’re surely going to be writing jobs. What language do you like? Python and shell scripting are a start. What about Lambda? Talk about frameworks to deploy applications.

Related: Are you getting good at Terraform or wrestling with a bear?

Databases

You should have some strong database skills even if you’re not the day-to-day DBA. Amazon RDS certainly makes administering a bit easier most of the time. But upgrades often require downtime, and unfortunately that’s wired into the service. I see mostly Postgresql, MySQL & Aurora. Get comfortable tuning SQL queries and optimizing. Analyze your slow query log and walk through the output.
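
Here's a minimal version of that exercise on a self-managed server. Paths will differ, and on RDS you'd enable the slow log through a parameter group instead:

# catch anything slower than a second
$ mysql -e "SET GLOBAL slow_query_log = ON; SET GLOBAL long_query_time = 1;"

# summarize, sorted by total time, top 10 offenders
$ mysqldumpslow -s t -t 10 /var/log/mysql/slow.log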

Amazon’s analytics offering is getting stronger. The purpose-built Redshift is everywhere these days. It may use a postgresql driver, but there’s a lot more under the hood. You also may want to look at Spectrum, which provides an EXTERNAL TABLE type interface, to query data directly from S3.

Not on Redshift yet? Well you can use Athena as an interface directly onto your data sitting in S3. Even quicker.

For larger data analysis or folks that have systems built around the technology, Hadoop deployments or EMR may be good to know as well. At least be able to talk intelligently about it.

Related: Is zero downtime even possible on RDS?

Questions

Have you written any CloudFormation templates or Terraform code? For example how do you create a VPC with private & public subnets, plus a bastion box, with Terraform? What gotchas do you run into?

If you are given a design document, how do you proceed from there? How do you build infra around those requirements? What is your first step? What questions would you ask about the doc?

What do you know about Nodejs? Or Python? Why do you prefer that language?

If you were asked to store 500 terabytes of data on AWS and were going to do analysis of the data, what would be your first choice? Why? Let’s say you evaluated S3 and Athena, and found the performance wasn’t there, what would you move to? Redshift? How would you load the data?

Describe a multi-az VPC setup that you recommend. How do you deploy multiple subnets in a high availability arrangement?

Related: Why generalists are better at scaling the web

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest: Why I don’t work with recruiters


5 core pieces of the Amazon Cloud puzzle to get your project off the ground


One of the most common engagements I do is working with firms in and around the NYC startup sector. I evaluate AWS infrastructures & applications built in the Amazon cloud.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

I’ve seen some patterns in customers’ usage of Amazon. Below is a laundry list of the most important ones.

On our products & pricing page you can see more detail including how we perform a performance review and a sample executive summary.

1. Use automation

When you first start using Amazon Web Services to host your application, you, like many before you, may think of it like your old school hosting. Set up a machine, configure it, get your code running. The traditional model of systems administration. It’s fine for a single server, but if you’re managing a more complex deploy with continuous integration, or want to be resilient to regular server failures, you need to do more.

Enter the various automation tools on offer. The simplest of the three is Elastic Beanstalk. If you’re using a very standard stack & don’t need a lot of customizations, this may well work for you.

With more complex deployments you’ll likely want to look at OpsWorks. Sound familiar? That’s because it *is* Opscode Chef. Everything you can do with Chef & all the templates out there will work with Amazon’s offering. Let AWS manage your templates & make sure your servers are in the right state, just like hosted Chef.

If you want to get down to the assembly language layer of infrastructure in Amazon, you’ll eventually be dealing with CloudFormation. This is JSON code which defines everything, from a server with an attached EBS volume, to a VPC with security rules, IAM users & everything in between. It is ultimately what these other services utilize under the hood.
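
Working with it day to day mostly means validating and launching templates. A sketch, with stack.json standing in for your own template:

$ aws cloudformation validate-template --template-body file://stack.json
$ aws cloudformation create-stack --stack-name my-stack \
    --template-body file://stack.json \
    --capabilities CAPABILITY_IAM    # needed when the template creates IAM resources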

Also: Is Amazon too big to fail?

2. Use Advisor & Alerts

Amazon has a few cool tools to help you manage your infrastructure better. One is called Trusted Advisor. This helps you by checking your AWS usage against best practices. Cost, performance, security & high availability are the big focal points.

In order to make best use of alerts, you’ll want to do a few things. First define an auto scaling group. Even if you don’t want to use autoscaling, putting your instance into one allows Amazon to do the monitoring you’ll want.

Next you’ll want to analyze your CloudWatch metrics for usage patterns. Notice a spike? It could be a job that’s running, or a seasonal traffic spike that you need to manage. Once you have some ideas here, you can set alerts around normal & problematic usage patterns.
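
The alert itself is a one-liner once you know your thresholds. A sketch, with a hypothetical group name and SNS topic:

$ aws cloudwatch put-metric-alarm --alarm-name cpu-high \
    --namespace AWS/EC2 --metric-name CPUUtilization \
    --dimensions Name=AutoScalingGroupName,Value=web-asg \
    --statistic Average --period 300 --evaluation-periods 2 \
    --threshold 80 --comparison-operator GreaterThanThreshold \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts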

Related: Are we fast approaching cloud-mageddon?

3. Use Multi-factor at Login

If you haven’t already done so, you’ll want to enable multi-factor authentication on your AWS account. This provides much more security than a password (even a sufficiently long one) can ever do. You can use Google Authenticator to generate the MFA codes and associate them with your smartphone.
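
The console walks you through this, but it can also be done from the CLI. A sketch, assuming a user named sean and your own account id:

# generate the QR code to scan with Google Authenticator
$ aws iam create-virtual-mfa-device --virtual-mfa-device-name sean \
    --outfile qr.png --bootstrap-method QRCodePNG

# confirm with two consecutive codes from the app
$ aws iam enable-mfa-device --user-name sean \
    --serial-number arn:aws:iam::123456789012:mfa/sean \
    --authentication-code1 123456 --authentication-code2 789012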

While you’re at it, you’ll want to create at least one alternate IAM account so you’re not logging in through the root AWS account. This adds a layer of security to your infrastructure. Consider creating an account for your command line tools to spin up components in the cloud.

You can also use MFA for your command line SSH logins. This is also recommended & not terribly hard to set up.

Read: When hosting data on Amazon turns bloodsport

4. Use virtual networking

Amazon offers Virtual Private Cloud which allows you to create virtual networks within the Amazon cloud. Set your own IP address range, create route tables, gateways, subnets & control security settings.

There is another interesting offering called VPC peering. Previously, if you wanted to route between two VPCs or across the internet to your office network, you’d have to run a box within your VPC to do the networking. This became a single point of failure, and also had to be administered.

With VPC peering, Amazon can do this at the virtualization layer, without extra cost, without single point of failure & without overhead. You can even use VPC peering to network between two AWS accounts. Cool stuff!
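
Setting one up is two calls plus routing. A sketch, with placeholder ids:

$ aws ec2 create-vpc-peering-connection --vpc-id vpc-0aaa --peer-vpc-id vpc-0bbb
$ aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id pcx-0ccc

# then add a route on each side pointing the other VPC's CIDR at pcx-0ccc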

Also: Are SQL databases dead?

5. Size instances & I/O

I worked with one startup that had been founded in 2010. They had initially built their infrastructure on AWS so they chose instances based on what was available at the time. Those were m1.large & m1.xlarge. A smart choice at the time, but oh how things evolve in the Amazon world.

Now those instance types are “previous generation”. Newer instances offer SSD, more CPU & better I/O for roughly the same price. If you’re in this position, be sure to evaluate upgrading your instances.

If you’re on Amazon RDS, you may not be able to get to the newer instance sizes until you upgrade your database. Does upgrading MySQL involve much more downtime on Amazon RDS? In my experience it surely does.

Along with instance sizes, you’ll also want to evaluate disk I/O options. By default, instances in Amazon, being multi-tenant, use disk as a shared resource. So you’ll see I/O go up & down dramatically. This can kill database performance & can be painful. There are expensive solutions. Consider looking at provisioned IOPS and additional SSD storage.
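
If you go the provisioned IOPS route, it's an option right on the volume. A sketch with made-up numbers:

# a 500G io1 volume with 4000 guaranteed IOPS
$ aws ec2 create-volume --availability-zone us-east-1a \
    --size 500 --volume-type io1 --iops 4000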

Also: Is the difference between dev & ops a four-letter word?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest: Why I don’t work with recruiters


Connect to MySQL in the Amazon Public Cloud


Troubleshooting MySQL on Amazon can be a real test of patience. There are quite a few different things to watch out for in terms of connectivity & networking. Sometimes a checklist can help.

Join 16,000 others and follow Sean Hull on twitter @hullsean.

Here’s my exhaustive list of things that can block you.

1. Be sure to create users & grants

Chances are you did something like this to create your user:


mysql> CREATE USER 'sean'@'localhost' IDENTIFIED BY 'password';
mysql> GRANT ALL PRIVILEGES ON sean_schema.* TO 'sean'@'localhost' WITH GRANT OPTION;

But that won’t help you when connecting from a remote Amazon box. So what to do? Here’s an example:


mysql> CREATE USER 'sean'@'10.10.%' IDENTIFIED BY 'password';
mysql> GRANT ALL PRIVILEGES ON sean_schema.* TO 'sean'@'10.10.%' WITH GRANT OPTION;

You may need to make your source IP wildcard *more* aggressive. For example consider '10.%'. You *may* even go with '%', which allows *all* source IPs. This may sound dangerous, but if you use a tight security group (see item #3 below), you can still be safe.

Related: Why Oracle Won’t Kill MySQL

2. Make sure iptables is not a problem

IPTables is a Linux service that acts like a private firewall for each server. Some AMIs will have it enabled by default. If you’re having trouble like I did, this can definitely trip you up. That’s because your connection will fail silently without telling you, hey the OS won’t let me into that port!

If you are a networking pro you’ve probably already fiddled with iptables. Feel free to add specific rules, and keep it turned on. However I’d recommend just disabling it completely, and using your Amazon security groups to protect your ports.


$ /etc/init.d/iptables stop
$ chkconfig --list iptables
iptables 0:off 1:off 2:on 3:on 4:on 5:on 6:off
$ chkconfig --del iptables
$ chkconfig --list iptables
service iptables supports chkconfig, but is not referenced in any runlevel (run 'chkconfig --add iptables')

Also: Are SQL Databases Dying Out?

3. Test & verify amazon security group settings

Security groups in Amazon can be tricky. I recommend the following:

o create a security group webserver_group
– allow port 80 from 0.0.0.0/0
– allow port 443 from 0.0.0.0/0
– allow port 22 from your office or other trusted IP block

o create a security group db_group
– allow port 22 from your office or other trusted IP block
– allow 3306 from webserver_group

What’s happening here? We can’t specify a fixed set of IP addresses because they can change in Amazon. So essentially what we’ve done is say *any* requests from servers in our Amazon account which are in the webserver_group security group can connect to port 3306. Pretty cool right?

This means we’re pretty locked down. No internet connections to 3306, so we can be a little looser (see item #1 above) about our grants and source IPs.

What about if you want to use your GUI tools to hit your Amazon hosted MySQL boxes? Say you like to use the Oracle Workbench, Navicat or Toad to connect to MySQL. One way you could do this is configure your db_group to allow 3306 from your office subnet. Then anyone VPN’d into your office will be able to use the tools they like.
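
A third option, often the simplest: tunnel through your bastion box with ssh port forwarding. A sketch, with hypothetical hosts:

# forward local port 3307 through the bastion to the db box
$ ssh -N -L 3307:10.10.10.1:3306 ec2-user@bastion.example.com

# then point Workbench, Navicat or the cli at the local end
$ mysql -h 127.0.0.1 -P 3307 -u sean -p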

Another option is to use Amazon VPC for your servers. You’ll setup an Amazon Virtual Private Gateway, which is a direct VPN connection between Amazon’s datacenter and your datacenter. This can be a messy process, and you’ll want to contact your network admin to help. Once it’s setup, amazon boxes appear to sit on your office or datacenter network. Cool stuff!


If the groups or network settings aren’t right, here’s the kind of error you’ll run into:

$ mysql -h xxx.xxx.xxx.xxx -u admin -p
Enter password:
ERROR 2003 (HY000): Can't connect to MySQL server on 'xxx.xxx.xxx.xxx'

Read this: Why are MySQL experts in such short supply?

4. MySQL network settings

If MySQL is bound to the wrong IP address you can have real problems. First be sure skip_networking is OFF. If it is ON change it in /etc/my.cnf & restart MySQL.


mysql> show variables like 'skip_net%';
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| skip_networking | OFF |
+-----------------+-------+
1 row in set (0.00 sec)

The other MySQL setting that can be problematic is bind-address. First check what it is set to:


$ cat /etc/my.cnf | grep bind
bind-address=127.0.0.1

This isn’t going to allow remote connections. In amazon however, your IP address may change upon reboot. So there is a special setting to allow binding to any IP:


bind-address=0.0.0.0

Related: Bulletproofing MySQL Replication with Checksums

5. Installing mysql client & telnet for troubleshooting

You have two options for troubleshooting on the webserver side. If you’re simply trying to check by mysql command line, you may get blocked if the network settings & security groups aren’t configured right. So use telnet first.


$ yum install -y telnet

$ telnet 10.10.10.1 3306
Trying 10.10.10.1...
Connected to 10.10.10.1.
Escape character is '^]'.
4
5.1.71??gu9Y6B'/y9Oay`QV

If you don't get a response, it's not an issue with users or grants, but rather that the port isn't open. Check iptables, check bind-address and check security groups.

Check this: Top MySQL DBA Interview Questions

6. SE Linux related issues

SE Linux will do a lot of good, if managed properly. However if you're not aware of its existence, it can be very, very frustrating. Symptoms can be as vague as allergies, a cold or flu. It can monitor files, and prevent MySQL from being able to write where it needs to.
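
A quick way to rule it in or out (permissive mode is a diagnostic, not a fix):

$ getenforce                    # Enforcing, Permissive or Disabled
$ sudo setenforce 0             # go permissive temporarily; does MySQL work now?
$ sudo grep mysqld /var/log/audit/audit.log | tail    # look for denials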

Read this: Migrating MySQL to Oracle

7. RPM & later CentOS yum repo install conflicts

I had real problems doing a custom install for a customer. They didn't want to use a repository for various reasons, but preferred downloading RPMs. There were a few other customizations which were tripping things up.

Based on all the connectivity issues I was having, I backed out of the RPM based install, and then ran through a stock yum install. After doing that, I started seeing these weird errors in the mysqld.log

120328 21:32:40 [ERROR] Can't start server: Bind on TCP/IP port: Address already in use
120328 21:32:40 [ERROR] Do you already have another mysqld server running on port: 3306 ?
120328 21:32:40 [ERROR] Aborting

If I run "netstat -nat | grep 3306" in my terminal, I get the following:

tcp4 0 0 *.3306 *.* LISTEN

I spent hours spinning my wheels and not able to figure out what was happening here. At first it seemed a leftover pid file was the culprit. In the end it appeared the *old* /etc/init.d/mysql script was still in place, and the new yum packages wouldn't work with that.

I ended up just scrapping the whole box, and starting from scratch. Sometimes you have to do that. After a clean build, all was fine.

Related: RDS or MySQL 10 Use Cases

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest: Why I don't work with recruiters