Tag Archives: ec2

How do I migrate my skills to the cloud?

via GIPHY


Hi, I’m currently an IT professional and I’m training for AWS Solutions Architect – Associate exam. My question is how to gain some valuable hands-on experience without quitting my well-paying consulting gig I currently have which is not cloud based. I was thinking, perhaps I could do some cloud work part time after I get certified.

Join 38,000 others and follow Sean Hull on twitter @hullsean.


I work in the public sector and the IT contract prohibits the agency from engaging any cloud solutions until the current contract expires in 2019. But I can’t just sit there without using these new skills – I’ll lose it. And if I jump ship I’ll loose $$$ because I don’t have the cloud experience.


Hi George,

Here’s what I’d suggest:

1. Setup your AWS account

A. open aws account, secure with 2FA & create IAM roles

First things first, if you don’t already have one, go signup. Takes 5 minutes & a credit card.

From there be sure to enable two factor authentication. Then stop using your root account! Create a new IAM user with permissions to command line & API. Then use that to authenticate. You’ll be using the awscli python package.

Also: Is Amazon too big to fail?

2. Automatic deployments

B. plugin a github project
C. setup CI & deployment
D. get comfy with Ansible

Got a pet project on github? If not it’s time to start one. 🙂

You can also alternatively use Amazon’s own CodeCommit which is a drop-in replacement for github and works fine too. Get your code in there.

Next setup codedeploy so that you can deploy that application to your EC2 instance with one command.

But you’re not done yet. Now automate the spinup of the EC2 instance itself with Ansible. If you’re comfortable with shell scripts, or other operational tools, the learning curve should be pretty easy for you.

Read: Is AWS too complex for small dev teams? The growing demand for Cloud SRE

3. Clusters

E. play around with kubernetes or docker swarm

Both of these technologies allow you to spinup & control a fleet of containers that are running on a fixed set of EC2 instances. You may also use Amazon ECS which is a similar type of offering.

Related: How to deploy on EC2 with Vagrant

4. Version your infrastructure

F. use terraform or cloudformation to manage your aws objects
G. put your terraform code into version control
H. test rollback & roll foward infrastructure changes

Amazon provides CloudFormation as it’s foundational templating system. You can use JSON or YAML. Basically you can describe every object in your account, from IAM users, to VPCs, RDS instances to EC2, lambda code & on & on all inside of a template file written in JSON.

Terraform is a sort of cloud-agnostic version of the same thing. It’s also more feature rich & has got a huge following. All reasons to consider it.

Once you’ve got all your objects in templates, you can checkin these files into your git or CodeCommit repository. Then updating infrastructure is like updating any other pieces of code. Now you’re self-documenting, and you can roll-forward & backward if you make a mistake!

Related: How I use terraform & composer to automate wordpress on AWS

5. Learn serverless

I. get familiar with lambda & use serverless framework

Building applications & deploying only code is the newest paradigm shift happening in cloud computing. On Amazon you have Lambda, on Google you have Cloud Functions.

Related: 30 questions to ask a serverless fanboy

6. Bonus: database skills

J. Learn RDS – MySQL, Postgres, Aurora, Oracle, SQLServer etc

For a bonus page on your resume, dig into Amazon Relational Database Service or RDS. The platform supports various databases, so try out the ones you know already first. You’ll find that there are a few surprises. I wrote Is upgrading RDS like a sh*t storm that will not end?. That was after a very frustrating weekend upgrading a customers production RDS instance. 🙂

Related: Is Amazon about to disrupt your data warehouse?

7. Bonus: Data warehousing

K. Redshift, Spectrum, Glue, Quicksight etc

If you’re interested in the data side of the house, there is a *LOT* happening at AWS. From their spectrum technology which allows you to keep most of your data in S3 and still query it, to Glue which provides an ETL as a service offering.

You can also use a world-class columnar storage database called Redshift. This is purpose built for reporting & batch jobs. It’s not going to meet your transactional web-backend needs, but it will bring up those Tableau reports blazingly fast!

Related: Is Amazon about to disrupt your data warehouse?

8. Now go find that cloud deployment job!


With the above under your belt there’s plenty of work for you. There is tons of demand right now for this stuff.

Did you do learn all that? You’ve now got very very in-demand skills. The recruiters will be chomping at the bit. Update those buzzwords (I mean keywords). This will help match you with folks looking for someone just like you!

Related: Why I don’t work with recruiters

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

5 surprising features in Amazon’s Lambda serverless offering

Amazon is building out it’s serverless offering at a rapid clip. Lambda makes a great solution for a lot of different use cases including:

o a hybrid approach, building lambda functions for small pieces of your application, sitting along side your full application, working in concert with it

o working with Kinesis firehose to add ETL functionality into your pipeline. Extract Transform & Load is a method of transforming data from a relational or backend transactional databases, into one better fit for reporting & analytics.

o retrofitting your API? Layer Lambda functions in front, to allow you to rebuild in a managed way.

o a natural way to build microservices, with each function as it’s own little universe

Join 32,000 others and follow Sean Hull on twitter @hullsean.

Great, tons of ways to put serverless to use. What’s Amazon doing to make it even better? Here are some of the features you’ll find indispensible in building with Lambda.

1. Versioned functions

As your serverless functions get more sophisticated, you’ll want to control & deploy different versions. Lambda supports this, allowing you to upload multiple copies of the same function. Coupled with Aliases below, this becomes a very powerful feature.

Also: When hosting data on Amazon turns bloodsport

2. Aliases

As you deploy multiple versions of your functions in AWS, you don’t want to recreate the API endpoints each time. That’s where aliases come in. Create one alias for dev, another for test, and a third for production. That way when new versions of those are deployed, all you have to do is change the alias & QA or customers will be hitting the new code. Cool!

Related: Are you getting errors building lambda functions?

3. Caching & throttling

Using the API gateway, we can do some fancy footwork with Lambda. First we can enabling caching to speedup access to our endpoint. Control the time-to-live, capacity of the cache easily. We’ll also need to invalidate the cache when we make changes & redeploy our functions.

Throttling is another useful feature, allowing you to control the maximum number of times your function can be called per second on average (the rate) and maximum number of times (burst limit). These can be set at both the stage & method levels.

Read: Is Amazon too big to fail?

4. Stage variables

Creating multiple stages, for dev, test & production means you can separate out and control environment variables with more granular control. For example suppose you have access & secret keys to reach S3. You can set environment variables for these to avoid committing any credentials or secrets in your code. Definitely don’t do that!

Allowing multiple copies of stage variables, means you can set them separately for dev, test & production.

Also: How to deploy on Amazon EC2 with Vagrant?

5. Logging

You can enable logging in your Lambda function configuration. This will send error and/or info warning messages out to CloudWatch.

You may also choose the log all of the request & response data. This is controlled in the API Gateway settings for individual stages.

Also: Is Amazon RDS hard to manage?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Deploy wordpress on aws by first decoupling assets

too much inventory

You want to make your wordpress site bulletproof? No server outage worries? Want to make it faster & more reliable. And also host on cheaper components?

I was after all these gains & also wanted to kick the tires on some of Amazon’s latest devops offerings. So I plotted a way forward to completely automate the deployment of my blog, hosted on wordpress.

Here’s how!

Join 28,000 others and follow Sean Hull on twitter @hullsean.

The article is divided into two parts…

Deploy a wordpress site on aws – decouple assets (part 1)

In this one I decouple the assets from the website. What do I mean by this? By moving the db to it’s own server or RDS of even simpler management, it means my server can be stopped & started or terminated at will, without losing all my content. Cool.

You’ll also need to decouple your assets. Those are all the files in the uploads directory. Amazon’s S3 offering is purpose built for this use case. It also comes with easy cloudfront integration for object caching, and lifecycle management to give your files backups over time. Cool !

Terraform a wordpress site on aws – automate deploy (part 2)

The second part we move into all the automation pieces. We’ll use PHP’s Composer to manage dependencies. That’s fancy talk for fetching wordpress itself, and all of our plugins.

1. get your content into S3

How to do it?

A. move your content

$ cd html/wp-content/
$ aws s3 cp uploads s3://iheavy/wp-content/

Don’t have the aws command line tools installed?

$ yum install aws-cli -y
$ aws configure

B. Edit your .htaccess file with these lines:

These above steps handle all *existing* content. However you also want new content to go to S3. For that wordpress needs to understand how to put files there. Luckily there’s a plugin to help!


RewriteEngine On
RewriteRule ^wp-content/uploads/(.*)$ http://s3.amazonaws.com/your-bucket/wp-content/uploads/$1 [P]

C. Fetch WP Offload S3 Lite

You’ll see the plugin below in our composer.json file as “amazon-s3-and-cloudfront”

Theoretically you need to specify your aws credentials inside the wp-config.php. However this is insecure. You don’t ever want stuff like that in S3 or any code repository. What to do?

The best way is to use AWS ROLES for your instance. These give the whole instance access to API calls without credentials. Cool! Read more about AWS roles for instances.

Related: Is there a devops talent gap?

2. Move to your database to RDS

You may also use a roll-your-own MySQL instance. The point here is to make it a different EC2 instance. That way you can kill & rebuild the webserver at will. This offers us some cool advantages.

A. Create an RDS instance in a private subnet.

o Be sure it has no access to the outside world.
o note the root user, password
o note the endpoint or hostname

I recommend changing the password from your old instance. That way you can’t accidentally login to your old db. Well it’s still possible, but it’s one step harder.

B. mysqldump your current wp db from another server

$ cd /tmp
$ mysqldump –opts wp_database > wp_database.mysql

C. copy that dump to an instance in the same VPC & subnet as the rds instance

$ scp -i .ssh/mykey.pem ec2-user@oldbox.com:/tmp/wp_database.mysql /tmp/

D. import the data into your new db

$ cd /tmp
$ echo “create database wp_database” | mysql
$ mysql < wp_database.mysql E. Edit your wp-config.php

define(‘DB_PASSWORD’, ‘abc123’);
define(‘DB_HOST’, ‘my-rds.ccccjjjjuuuu.us-east-1.rds.amazonaws.com’);

Also: When hosting data on Amazon turns bloodsport?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Does AWS have a dirty little secret?

tell a secret

I was recently talking with a colleague of mine about where AWS is today. Obviously there companies are migrating to EC2 & the cloud rapidly. The growth rates are staggering.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

The question was…

“What’s good and bad with Amazon today?”

It’s an interesting question. I think there some dirty little secrets here, but also some very surprising bright spots. This is my take.

1. VPC is not well understood  (FAIL)

This is the biggest one in my mind.  Amazon’s security model is all new to traditional ops folks.  Many customers I see deploy in “classic EC2”.  Other’s deploy haphazerdly in their own VPC, without a clear plan.

The best practices is to have one or more VPCs, with private & public subnet.  Put databases in private, webservers in public.  Then create a jump box in the public subnet, and funnel all ssh connections through there, allow any source IP, use users for authentication & auditing (only on this box), then use google-authenticator for 2factor at the command line.  It also provides an easy way to decommission accounts, and lock out users who leave the company.

However most customers have done little of this, or a mixture but not all of it.  So GETTING TO BEST PRACTICES around vpc, would mean deploying a vpc as described, then moving each and every one of your boxes & services over there.  Imagine the risk to production services.  Imagine the chances of error, even if you’re using Chef or your own standardized AMIs.

Also: Are we fast approaching cloud-mageddon?

2. Feature fatigue (FAIL)

Another problem is a sort of “paradox of choice”.  That is that Amazon is releasing so many new offerings so quickly, few engineers know it all.  So you find a lot of shops implementing things wrong because they didn’t understand a feature.  In other words AWS already solved the problem.

OpenRoad comes to mind.  They’ve got media files on the filesystem, when S3 is plainly Amazon’s purpose-built service for this.  

Is AWS too complex for small dev teams & startups?

Related: Does Amazon eat it’s own dogfood? Apparently yes!

3. Required redundancy & automation  (FAIL)

The model here is what Netflix has done with ChaosMonkey.  They literally knock machines offline to test their setup.  The problem is detected, and new hardware brought online automatically.  Deploying across AZs is another example.  As Amazon says, we give you the tools, it’s up to you to implement the resiliency.

But few firms do this.  They’re deployed on Amazon as if it’s a traditional hosting platform.  So they’re at risk in various ways.  Of Amazon outages.  Of hardware problems under the VMs.  Of EBS network issues, of localized outages, etc.

Read: Is Amazon too big to fail?

4. Lambda  (WIN)

I went to the serverless conference a week ago.  It was exiting to see what is happening.  It is truely the *bleeding edge* of cloud.  IBM & Azure & Google all have a serverless offering now.  

The potential here is huge.  Eliminating *ALL* of the server management headaches, from packages to config management & scaling, hiding all of that could have a huge upside.  What’s more it takes the on-demand model even further.  YOu have no compute running idle until you hit an endpoint.  Cost savings could be huge.  Wonder if it has the potential to cannibalize Amazon’s own EC2 …  we’ll see.

Charity Majors wrote a very good critical piece – WTF is Operations? #serverless
WTF is operations? #serverless

Patrick Dubois 

Also: Is the difference between dev & ops a four-letter word?

5. Redshift  (WIN)

Seems like *everybody* is deploying a data warehouse on Redshift these days.  It’s no wonder, because they already have their transactional database, their web backend on RDS of some kind.  So it makes sense that Amazon would build an offering for reporting.

I’ve heard customers rave about reports that took 10 hours on MySQL run in under a minute on Redshift.  It’s not surprising because MySQL wasn’t built for the size servers it’s being deployed on today.  So it doesn’t make good use of all that memory.  Even with SSD drives, query plans can execute badly.

Also: Is there a better way to build a warehouse in 2016?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Is AWS too complex for small dev teams & startups?

via GIPHY

I was discussing a server outage with a colleague recently. AWS had done some confusing things, and the team was rallying to troublehsoot & fix.

He made an offhand comment that caught my attention…


AWS is too complex for small dev teams. I’d recommend we host in a traditional datacenter.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

It’s an interesting point. For all the fanfare over Amazon, lost in the shuffle is the staggering complexity that we’re taking on. For small firms, this is a cost that’s often forgotten when we smell the on-demand cool-aid that is EC2.

Here are my thoughts…

1. Over 70 services offered

Everytime I login to the AWS console there’s a new service offering. Lambda & serverless computing. CodeDeploy, Redshift, EMR, VPC’s, developer tools, IOT, the list goes on. If you haven’t enabled MFA on your IAM accounts you’re not alone!

Also: Is Amazon too big to fail?

2. Still complex to build high availability

The song I hear out of Amazon is, we offer all the components for a high availability infrastructure. multiple availability zones, regions, load balancers, autoscaling, geo & latency dns routing. What’s more companies like Netflix have open sourced tools to help.

But at a lot of startups that I see, all these components are not in use, nor are they well understood. Many admins are still using Amazon like an old-school datacenter. And that’s not good.

Sometimes it seems that AWS is a patient in need of constant medication.

Related: Are we fast approaching cloud-mageddon?

3. Need a dedicated devops

As AWS becomes more complex, and the offering more robust, so too the need for dedicated ops. If you’re devs are already out of bandwidth, but you don’t quite have so much need for a fulltime resource a consultant may be an option. Round out the team & keep costs manageable.

If you’re looking for an aws solutions architect, we can help!

Check out: Does Amazon eat it’s own dogfood?

4. Orchestration involves many moving parts

Infrastructure as code offers the promise of completely versioning all your servers, configurations and changes. From there we can apply test driven development & bring a more professional level of service to our business. That’s the theory anyway.

In practice it brings an incredible number of new toolsets to master and a more complex stack besides. All those components can have bugs, need troubleshooting. This sometimes just kicks the can down the road, moving the complexity elsewhere.

It’s not clear that for smaller shops, all this complexity is manageable.

Also: 5 things toxic to scalability

5. Troubleshooting failed deployments

I was looking at a problem with a broken deploy recently. Turns out a developer had copy & pasted some code solution off the internet, possibly from a tutorial, and broke deployments to staging.

Yes perhaps this was avoidable, and more checks & balances can fix. But my thought is continuous integration & continuous deployments are not a panacea. More complexity brings a more complex web to unweave.

I sometimes wonder if we aren’t fast approaching cloud-mageddon?

Read: Why Airbnb didn’t have to fail?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Does Amazon eat it’s own dog food (ahem…) or drink it’s own champagne?

laura grit aws amazon retail champagne

I was flipping through the AWS reddit channel and found this excellent presentation from RE:Invent by Laura Grit. She’s in charge of Amazon Retail, and worked very closely with teams on migrating to AWS. She goes in-depth on what that cost in terms of development, what it saved in terms of unused capacity, and surprisingly operational headaches.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

Laura’s a great speaker. I was surprised to find that Amazon Retails migration was similar to many of the customers I’ve worked with in New York. Often they take a hybrid approach where Direct Connect is key, allowing them to move over in a measured way.

What’s more she talks about how EC2 instances have different performance characteristics & applications typically need to be tuned for that world.

I learned a lot more, here are the highlights…

1. Hybrid cloud was key

Around 11:00 in the video she talks about AWS Direct Connect & VPC. These two technologies allow you to leverage AWS as a hybrid cloud, connecting to your existing datacenter. Scale elastically, but migrate in steps.

For example Amazon Retail did only webserver fleet in isolation.

Also: Is Amazon too big to fail?

2. Excite business & developers both

Around 18:20 …

“Moving the webserver fleet not only got the business excited about the cost savings & our ability to scale linearly, but also got developers excited about the operational load decrease that they had to burden.

Once benefits of this were shown to the rest of the company it actually jump started a wave of migrations to ec2 from inside amazon retail. And we found from a program perspective this is important. To find early migrations that benefit both the business & the developers because then they are both working together to figure out how to move their services to AWS.”

And she also pointed out an interesting bit abaout cultural change…

“You may choose to not migration the simplest service from inside your company, but instead one that will create a cultural change in the company & force more migrations automatically to AWS.”

Related: Are SQL Databases dead?

4. Expect application changes

Flip through to 27:47 and she talks about application changes for the new environment of the cloud.

“Don’t expect migrations to require no changes to your applications…
The webserver fleet was not lift & shift”

Also: Why Dropbox didn’t have to fail

5. Cloud not a panacea

Fast forward over to 37:10 and you’ll hear Laura talk about technical debt. That’s big.

“The cloud is not a universal panacea. It can’t coverup for messy engineering practices.
An example of this is availability. Design for failure is a fundamental design principle of amazon.”

Also: Are generalists better at scaling the web?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Is AWS the patient that needs constant medication?

storm coming

I was just reading High Scalability about Why Swiftype moved off Amazon EC2 to Softlayer and saw great wins!

We’ve all heard by now how awesome the cloud is. Spinup infrastructure instantly. Just add water! No up front costs! Autoscale to meet seasonal application demands!

But less well known or even understood by most engineering teams are the seasonal weather patterns of the cloud environment itself!

Join 28,000 others and follow Sean Hull on twitter @hullsean.

Sure there are firms like Netflix, who have turned the fickle cloud into one of virtues & reliability. But most of the firms I work with everyday, have moved to Amazon as though it’s regular bare-metal. And encountered some real problems in the process.

1. Everyday hardware outages

Many of the firms I’ve seen hosted on AWS don’t realize the servers fail so often. Amazon actually choosing cheap commodity components as a cost-savings measure. The assumption is, resilience should be built into your infrastructure using devops practices & automation tools like Chef & Puppet.

The sad reality is most firms provision the usual way, through the dashboard, with no safety net.

Also: Is your cloud speeding for a scalability cliff

2. Ongoing network problems

Network latency is a big problem on Amazon. And it will affect you more. One reason is you’re most likely sitting on EBS as your storage. EBS? That’s elastic block storage, it’s Amazon’s NAS solution. Your little cheapo instance has to cross the network to get to storage. That *WILL* affect your performance.

If you’re not already doing so, please start using their most important & easily missed performance feature – provisioned IOPS.

Related: The chaos theory of cloud scalability

3. Hard to be as resilient as netflix

We’ve by now heard of firms such as Netflix building their Chaos Monkey to actively knock out servers, in effort to test their ability to self-healing infrastructure.

From what I’m seeing at startups, most have a bit of devops in place, a bit of automation, such as autoscaling around the webservers. But little in terms of cross-region deployments. What’s more their database tier is protected only by multi-az or just a read-replica or two. These are fine for what they are, but will require real intervention when (not if) the server fails.

I recommend building a browse-only mode for your application, to eliminate downtime in these cases.

Read: 8 questions to ask an aws expert

4. Provisioning isn’t your only problem

But the cloud gives me instant infrastructure. I can spinup servers & configure components through an API! Yes this is a major benefit of the cloud, compared to 1-2 hours in traditional environments like Softlayer or Rackspace. But you can also compare that with an outage every couple of years! Amazon’s hardware may fail a couple times a hear, more if you’re unlucky.

Meanwhile you’re going to deal with season weather problems *INSIDE* your datacenter. Think of these as swarms of customers invading your servers, like a DDOS attack, but self-inflicted.

Amazon is like a weak immune system attacking itself all the time, requiring constant medication to keep the host alive!

Also: 5 Things toxic to scalability

5. RDS is going to bite you

Besides all these other problems, I’m seeing more customers build their applications on the managed database solution MySQL RDS. I’ve found RDS terribly hard to manage. It introduces downtime at every turn, where standard MySQL would incur none.

In my experience Upgrading RDS is like a shit-storm that will not end!

Also: Does open source enable the cloud?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

If you use MySQL in the Amazon cloud, you need to ask yourself this question

Join 25,000 others and follow Sean Hull on twitter @hullsean.

Are you serious about backups?

If you’re just using Amazon EBS snapshots, that may not be sufficient. There’s a good chance it won’t protect you against your next data loss.

That’s why I like to have a few different types of backups

Also: 5 more things deadly to scalability

Protect against operator error

mysqldump is a tool every DBA is familiar with. Same as a hotbackup or snapshot you say? Just more labor? Not true.

A dump allows you to restore one table, or one schema. That’s why they’re also known as logical backups. What’s more you can edit the file, remove indexes, change object names, or datatypes. All these can be essential in the screwy and unpredictable event of a real world outage.

Expect the unexpected!

Read: Why devops talent is in short supply

Test those backups regularly

If you haven’t actually tried to restore, you really don’t know if you have everything. Did you backup stored procedures & database code? How about grants? Database events? How about cronjobs? What about the my.cnf file? And your replication configuration?

Yes there are a lot of little pieces, and testing your backups by rebuilding everything is an attempt to poke holes in your plan, and hit issues before d-day!

Related: MySQL interview guide for managers and candidates alike

Replication isn’t a backup

Replication is getting better and better in MySQL. It used to fail regularly. MyiSAM was very unpredictable. But even in the comfortable realm of Innodb, there can still be data drift. If you’re on MySQL 5.0 or 5.1, you should consider performing regular checksums. These test the integrity of data and compare what’s actually in master & slave. Bulletproofing MySQL replication with checksums.

Read: Why high availability is so very hard to deliver

Have you considered security around your backup files?

While you’re thinking about backups, make sure the files themselves are secure. Remember they contain your crown jewels. Hopefully individual data that’s sensitive is encrypted, but still you should secure their final resting place as well.

If you’re using S3, consider encrypting the file before shipping it up to the bucket.

Read this: Why a four letter word divides dev and ops

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

5 cloud ideas that aren’t actually true

storm coming

Join 20,000 others and follow Sean Hull’s scalability, startup & innovation content on twitter @hullsean.

Cloud computing is heralding us into a wonderful era where computing can be bought in small increments, like a utility. This changes the whole way we plan, manage budgets, and accelerates startups making them more agile.

But it’s not all wine & roses up there. I’ve heard a few refrains from clients over the years, and thought I’d share some of the most common.

1. Scaling is automatic

Rather recently I was working with a client on building some sophisticated reports. They needed to slice & dice customer data, over various time series, and summarize with invoices & tracking data. Unfortunately their dataset was large, in the half terabyte range.


Client: Can we just load all this data into the cloud?
Me: Yes we can do that. Build a system in Amazon public cloud, can support large datasets.
Client: I want it to scale easily. So we won’t have these slow reports. And as we add data, it’ll just manage it easily for us.
Me: Well it’s a little bit more complicated than that, unfortunately.

Unfortunately this is a rather familiar conversation that I have quite often. A lot of the press around cloud scalability, centers around auto-scaling, Amazon’s renowned & superb virtualization feature. Yes it’s true you can roll out webservers to scale out this way, but that’s not the end of the story. Typically web applications have a lot of components, from caching servers, to search servers, and of course their backend datastore.

But can we scrap our relational database, such as MySQL and go with one that scales out of the box like Riak, Cassandra or Dynamodb?

Those NoSQL solutions are built to be distributed from the start, it’s true. And they lend themselves to that type of architecture. However, if you’ve built up a dataset in MySQL or Oracle, and more so an application around that, you’ll have to migrate data into the NoSQL solution. That process will take some time.

Like teaching a fish to fly, it make take some time. They do well in water, but evolution takes a bit longer.

Related: RDS or MySQL 10 use cases

2. Disaster recovery is free

In the traditional datacenter, when you want DR, you setup a parallel environment. Hopefully not in the same room, same city or same coast even. Preferrably you do so in a different region. What you can’t get around is dishing out cash for that second datacenter. You need the servers, just in case.

In the cloud, things are different. That’s why we’re here, right? In amazon you have regions already setup & available for plugin-n-play use. Setup your various components, servers, software & configure. Once you’ve verified you can failover to the parallel environment you can just turn off all those instances. Great, no big charges for all that iron that you’d pay for to keep the rooms warm in an old-school datacenter. Or do you?

As it turns out, since you don’t have this environment running all the time, you’ll want to test it more often, run fire drills to bring the servers back online. That’ll incur some costs in terms of manpower. You’ll also want to include in there some scripts to start those servers up, and/or some detailed documentation on how to do that. And don’t lose that documentation, either will you?

You may also want to build some infrastructure as code unit tests. Things change, code checkouts evolve, especially in the agile & continuous integration world. Devops beware!

Read this: Why a killer title can make or break your content efforts

3. Machines are fast

Fast, fast, fast. That’s what we expect, things keep getting faster, right? Hard to believe then that the world of computing took a big step backward when it jumped into the cloud. Something similar happened when we jumped to commodity Linux a decade ago.

In amazon, it’s a multi-tenant world. And just like apartment buildings, popular restaurants, or busy highways you must share. When things are quiet you may have the road to yourself, but it’ll never be as quiet as a dirt road in the country!

Amazon is making big strides though. They now offer memory optimized & storage optimized instances. And an even bigger development is the addition of the most important feature for performance & scalability. That said the network & EBS can still be a real bottleneck.

Also: What is a relational database & why is it important?

4. Backups aren’t necessary

I’ve experienced a few horror stories over the years. I wrote about one noteworthy one When fat fingers take down your business.

True EBS snapshots make backing up your whole server, well a snap! That said a few extra steps have to happen (flush the filesystem & lock tables) to make this work for a relational database like MySQL or Oracle. And suddenly you have a verification step that you also need to perform. You see no backups are valid until they’ve been restored, remember?

But even with these wonderful disk snapshots, you’ll still want to do database dumps, and perhaps table dumps. Operator error, deleting the wrong data, or dropping the wrong tables, will always be a risk. Ignore backups at your own peril!

Check this: Why CTOs underestimate operational costs

5. Outages won’t happen

In an ideal world, everything is redundant, and outages will be a thing of the past. We’ll finally reach five nines uptime and devops everywhere will be out of work. 🙂

It’s true that Amazon provides all the components to build redundancy into your architecture, and very cutting edge firms that have taken netflix’s approach with chaos monkey are seeing big improvements here. But AirBNB did fail and at root it was an Amazon outage that shouldn’t ever happen.

Read: Why Oracle won’t kill MySQL

Get more. Monthly insights about scalability, startups & innovation.. Our latest Are SQL Databases Dead?

Connect to MySQL in the Amazon Public Cloud

MySQL on Amazon Cloud AWS

Troubleshooting MySQL on Amazon can be a real test of patience. There are quite a few different things to watch out for in terms of connectivity & networking. Sometimes a checklist can help.

Join 16,000 others and follow Sean Hull on twitter @hullsean.

Here’s my exhaustive list of things that can block you.

1. Be sure to create users & grants

Chances are you did something like this to create your user:


mysql> CREATE USER ‘sean’@‘localhost’ IDENTIFIED BY ‘password’;
mysql> GRANT ALL PRIVILEGES ON sean_schema.* TO ‘sean’@‘localhost' WITH GRANT OPTION;

But that won’t help you when connecting from a remote Amazon box. So what to do? Here’s an example:


mysql> CREATE USER ‘sean’@’10.10.%’ IDENTIFIED BY ‘password’;
mysql> GRANT ALL PRIVILEGES ON sean_schema.* TO ‘sean’@‘%’ WITH GRANT OPTION;

You may need to make your source IP wildcard *more* aggressive. For example consider ’10.%’. You *may* even with with ‘%’ which allows *all* source IPs. This may sound dangerous, but if you use a tight security group (see item #3 below), you can still be safe.

Related: Why Oracle Won’t Kill MySQL

2. Make sure iptables is not a problem

IPTables is a Linux service that acts like a private firewall for each server. Some AMIs will have it enabled by default. If you’re having trouble like I did, this can definitely trip you up. That’s because your connection will fail silently without telling you, hey the OS won’t let me into that port!

If you are a networking pro you’ve probably already fiddled with iptables. Feel free to add specific rules, and keep it turned on. However I’d recommend just disabling it completely, and using your Amazon security groups to protect your ports.


$ /etc/init.d/iptables stop
$ chkconfig --list iptables
iptables 0:off 1:off 2:on 3:on 4:on 5:on 6:off
$ chkconfig --del iptables
$ chkconfig --list iptables
service iptables supports chkconfig, but is not referenced in any runlevel (run 'chkconfig --add iptables')

Also: Are SQL Databases Dying Out?

3. Test & verify amazon security group settings

Security groups in Amazon can be tricky. I recommend the following:

o create a security group webserver_group
– allow port 80 from 0.0.0.0/0
– allow port 443 from 0.0.0.0/0
– allow port 22 from

o create a security group db_group
– allow port 22 from
– allow 3306 from

What’s happening here? We can’t specify a fixed set of IP addresses because they can change in Amazon. So essentially what we’ve done is say *any* requests from servers in our Amazon package, which are in the webserver_group security group, can connect to port 3306. Pretty cool right?

This means we’re pretty locked down. No internet connections to 3306, so we can be a little looser (see item #1 above) about our grants and source IPs.

What about if you want to use your GUI tools to hit your Amazon hosted MySQL boxes? Say you like to use the Oracle Workbench, Navicat or Toad to connect to MySQL. One way you could do this is configure your db_group to allow 3306 from your office subnet. Then anyone VPN’d into your office will be able to use the tools they like.

Another option is to use Amazon VPC for your servers. You’ll setup an Amazon Virtual Private Gateway, which is a direct VPN connection between Amazon’s datacenter and your datacenter. This can be a messy process, and you’ll want to contact your network admin to help. Once it’s setup, amazon boxes appear to sit on your office or datacenter network. Cool stuff!


$ mysql -h xxx.xxx.xxx.xxx -u admin -p
Enter password:
ERROR 2003 (HY000): Can't connect to MySQL server on 'xxx.xxx.xxx.xxx'

Read this: Why are MySQL experts in such short supply?

4. MySQL network settings

If MySQL is bound to the wrong IP address you can have real problems. First be sure skip_networking is OFF. If it is ON change it in /etc/my.cnf & restart MySQL.


mysql> show variables like 'skip_net%';
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| skip_networking | OFF |
+-----------------+-------+
1 row in set (0.00 sec)

The other MySQL setting that can be problematic is bind-address. First check what it is set to:


$ cat /etc/my.cnf | grep bind
bind-address=127.0.0.1

This isn’t going to allow remote connections. In amazon however, your IP address may change upon reboot. So there is a special setting to allow binding to any IP:


bind-address=0.0.0.0

Related: Bulletproofing MySQL Replication with Checksums

5. installing mysql client & telnet for troubleshooting

You have two options for troubleshooting on the webserver side. If you’re simply trying to check by mysql command line, you may get blocked up if the network settings & security groups aren’t configured right. So use telnet first.


$ yum install -y telnet

$ telnet 10.10.10.1 3306
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
4
5.1.71??gu9Y6B'/y9Oay`QV

If you don't get a responce, it's not an issue with users or grants, but rather that the port isn't opened. Check iptables, check bind-address and check security groups.

Check this: Top MySQL DBA Interview Questions

6. SE Linux related issues

SE Linux will do a lot of good, if managed properly. However if you're not aware of it's existence, it can be very very frustrating. Symptoms can be as abstract as allergies, a cold or flu. It can monitor files, and prevent MySQL from being able to write where it needs to,

Read this: Migrating MySQL to Oracle

7. RPM & later centos yum repo install conflicts

I had real problems doing a custom install for a customer. They didn't want to use a repository for various settings, but preferred downloading RPMs. There were a few other customizations which were tripping things up.

Based on all the connectivity issues I was having, I backed out of the RPM based install, and then ran through a stock yum install. After doing that, I started seeing these weird errors in the mysqld.log

120328 21:32:40 [ERROR] Can't start server: Bind on TCP/IP port: Address already in use
120328 21:32:40 [ERROR] Do you already have another mysqld server running on port: 3306 ?
120328 21:32:40 [ERROR] Aborting
If I run "netstat -nat | grep 3306" in my terminal, I get the following:
tcp4 0 0 *.3306 . LISTEN

I spent hours spinning my wheels and not able to figure out what was happening here. At first it seemed a leftover pid file was the culprit. In the end it appeared the *old* /etc/init.d/mysql script was still in place, and the new yum packages wouldn't work with that.

I ended up just scrapping the whole box, and starting from scratch. Sometimes you have to do that. After a clean build, all was fine.

Related: RDS or MySQL 10 Use Cases

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don't work with recruiters