Category Archives: All

5 cloud ideas that aren’t actually true


Join 20,000 others and follow Sean Hull’s scalability, startup & innovation content on twitter @hullsean.

Cloud computing is ushering us into a wonderful era where computing can be bought in small increments, like a utility. This changes the whole way we plan and manage budgets, and it accelerates startups by making them more agile.

But it’s not all wine & roses up there. I’ve heard a few refrains from clients over the years, and thought I’d share some of the most common.

1. Scaling is automatic

Rather recently I was working with a client on building some sophisticated reports. They needed to slice & dice customer data, over various time series, and summarize with invoices & tracking data. Unfortunately their dataset was large, in the half terabyte range.


Client: Can we just load all this data into the cloud?
Me: Yes, we can do that. A system built in the Amazon public cloud can support large datasets.
Client: I want it to scale easily. So we won’t have these slow reports. And as we add data, it’ll just manage it easily for us.
Me: Well it’s a little bit more complicated than that, unfortunately.

Unfortunately, this is a conversation I have quite often. A lot of the press around cloud scalability centers on auto-scaling, Amazon’s renowned & superb virtualization feature. Yes, it’s true you can roll out webservers to scale out this way, but that’s not the end of the story. Typically web applications have a lot of components, from caching servers to search servers and, of course, their backend datastore.

But can we scrap our relational database, such as MySQL, and go with one that scales out of the box like Riak, Cassandra or DynamoDB?

Those NoSQL solutions are built to be distributed from the start, it’s true. And they lend themselves to that type of architecture. However, if you’ve built up a dataset in MySQL or Oracle, and more so an application around that, you’ll have to migrate data into the NoSQL solution. That process will take some time.

Like teaching a fish to fly, it may take some time. They do well in water, but evolution takes a bit longer.

Related: RDS or MySQL 10 use cases

2. Disaster recovery is free

In the traditional datacenter, when you want DR, you set up a parallel environment. Hopefully not in the same room, same city or even the same coast. Preferably you do so in a different region. What you can’t get around is dishing out cash for that second datacenter. You need the servers, just in case.

In the cloud, things are different. That’s why we’re here, right? In Amazon you have regions already set up & available for plug-n-play use. Set up your various components, servers & software, and configure them. Once you’ve verified you can fail over to the parallel environment, you can just turn off all those instances. Great, no big charges for all that iron that you’d pay for to keep the rooms warm in an old-school datacenter. Or are there?

As it turns out, since you don’t have this environment running all the time, you’ll want to test it more often and run fire drills to bring the servers back online. That’ll incur some costs in terms of manpower. You’ll also want to include some scripts to start those servers up, and/or some detailed documentation on how to do that. And don’t lose that documentation either, will you?

You may also want to build some infrastructure as code unit tests. Things change, code checkouts evolve, especially in the agile & continuous integration world. Devops beware!

Read this: Why a killer title can make or break your content efforts

3. Machines are fast

Fast, fast, fast. That’s what we expect, things keep getting faster, right? Hard to believe then that the world of computing took a big step backward when it jumped into the cloud. Something similar happened when we jumped to commodity Linux a decade ago.

In Amazon, it’s a multi-tenant world. And just like apartment buildings, popular restaurants, or busy highways, you must share. When things are quiet you may have the road to yourself, but it’ll never be as quiet as a dirt road in the country!

Amazon is making big strides though. They now offer memory optimized & storage optimized instances. And an even bigger development is the addition of the most important feature for performance & scalability. That said, the network & EBS can still be a real bottleneck.

Also: What is a relational database & why is it important?

4. Backups aren’t necessary

I’ve experienced a few horror stories over the years. I wrote about one noteworthy case in When fat fingers take down your business.

True, EBS snapshots make backing up your whole server, well, a snap! That said, a few extra steps have to happen (flush the filesystem & lock tables) to make this work for a relational database like MySQL or Oracle. And suddenly you have a verification step that you also need to perform. You see, no backups are valid until they’ve been restored, remember?

But even with these wonderful disk snapshots, you’ll still want to do database dumps, and perhaps table dumps. Operator error, deleting the wrong data, or dropping the wrong tables, will always be a risk. Ignore backups at your own peril!
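To make the ordering concrete, here is a minimal Python sketch of the lock, snapshot, unlock sequence. Both `run_sql` and `take_snapshot` are hypothetical callables you would supply yourself (they are not part of any library). `run_sql` has to send every statement over the same MySQL session, because the read lock belongs to that session. The filesystem flush is left out, and restoring the snapshot to verify it is still a separate step.

```python
from contextlib import contextmanager


@contextmanager
def read_locked(run_sql):
    """Hold a global read lock for the duration of the block, then always release it."""
    run_sql("FLUSH TABLES WITH READ LOCK")
    try:
        yield
    finally:
        # Runs even if the snapshot call raises, so writes are never left blocked.
        run_sql("UNLOCK TABLES")


def consistent_snapshot(run_sql, take_snapshot):
    """Lock tables, take the disk snapshot, unlock; returns take_snapshot's result."""
    with read_locked(run_sql):
        return take_snapshot()
```

The point of the context manager is the `finally`: a failed snapshot should not leave your database locked against writes.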

Check this: Why CTOs underestimate operational costs

5. Outages won’t happen

In an ideal world, everything is redundant, and outages will be a thing of the past. We’ll finally reach five nines uptime and devops everywhere will be out of work. :)

It’s true that Amazon provides all the components to build redundancy into your architecture, and cutting-edge firms that have taken Netflix’s approach with Chaos Monkey are seeing big improvements here. But AirBNB did fail, and at root it was an Amazon outage that shouldn’t ever have happened.
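For a sense of scale, the arithmetic behind “N nines” is simple enough to sketch in a few lines of Python. The function name is my own, and the figure ignores leap years and planned maintenance windows.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of N nines (3 means 99.9%)."""
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


for n in range(2, 6):
    print(f"{n} nines: {downtime_minutes_per_year(n):,.1f} minutes/year")
```

Five nines leaves a little over five minutes of downtime per year, so a single provider outage can use up the whole budget.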

Read: Why Oracle won’t kill MySQL

Get more. Monthly insights about scalability, startups & innovation.. Our latest Are SQL Databases Dead?

Why managers & CTO’s underestimate operational costs


1. Technology choices & talent shortage

I worked at one firm evaluating their technology stack. When we got to the programming language, I stopped in my tracks. “Haskell?” I asked. “Oh, you haven’t heard of it? It’s a really cool functional programming language, and we found it had some cool features that we really wanted to use.”

I had to fight the urge to roll my eyes. Yes, I’d heard of the language. It sits in the club with Scheme, Lisp & Prolog, the ones you study at university. They’re certainly an interesting bunch and, to be sure, can do some things that imperative programming languages can’t. But did it belong in the stack of this run-of-the-mill internet startup?

In this case the developers had free rein to choose any technologies they liked, adding more & more to the mix almost daily. But what are some of the ramifications here?

Two years, three years, or five years down the line, this team will be long gone, and another team will be picking up the pieces. Will you as a manager be able to find a lot of Haskell experts? What’s more, will you be able to support those choices operationally? Will updates be made often enough to have a secure stack for years to come?

Also: 5 things toxic to scalability

2. Scalability & server costs

Server costs are easier than ever to estimate. Build your application to serve your first 10,000 customers on Amazon with a couple webservers and a database server. To grow 100x to a million customers, just vertically scale your db, scale out your webservers and you’re good. Or are you?

What happens when you hit a wall? Did you build your application on ORM technology or take on technical debt? I’ve seen firm after firm struggle with technologies like Hibernate, which eat up precious resources while the team is helpless to eliminate the problem. Tread carefully on these types of questions.

Related: Why you’re not hitting five nines uptime

3. Patching, fixing bugs & managing security

Another long term cost of an application will be minor repairs and bug fixes. Those might appear in a slow steady trickle over the years, but security may loom larger. Cross-site scripting, SQL injection and many other threats can be a real headache.

What’s more, fixes may involve the libraries your application sits on top of. And when they are upgraded, your application will require tweaks too. It’s all basic stuff when you’re knee deep in development, but when your application has been deployed, the original team is long gone, and you’re supporting it years later, it can surely get messy.

Read: The four-letter-word dividing dev & ops

4. Missing operational switches

When building a web application, all eyes are on features. Which ones to include, and which are a priority. Pressure is heavy to build functions that can be sold to customers. Pleasing customers is of obvious importance.

So it’s no surprise that backend switches are often missing. But they can be a real boon for the operations team. Suppose you roll out a new feature to support star-ratings on certain pieces of content. An operational switch can be built to allow that feature to be disabled as necessary. If the site is loaded, or trouble is brewing, you may desperately want some switches to disable parts of the site, without the whole thing going down. I talk about this in AirBNB didn’t have to fail.

Another useful thing is a browse-only mode. This allows your site to operate even when writing to the database is not possible. If you’ve ever tried to post an update on a social network like Twitter, Facebook or Instagram, perhaps late at night, and gotten a “please try again later” message, you’ll understand the value. Here users can’t make changes, but otherwise the site appears to be working, and browsing works normally.
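Here is a rough idea of what an operational switch might look like in Python. The `Switches` class and `render_article` function are made up for illustration. A real site would read the switch state from shared configuration so ops can flip it without a deploy.

```python
class Switches:
    """In-memory operational switches; a real site would keep these in shared config."""

    def __init__(self, **defaults: bool):
        self._state = dict(defaults)

    def enabled(self, name: str) -> bool:
        # Unknown switches read as off, so a typo can't accidentally enable a feature.
        return self._state.get(name, False)

    def set(self, name: str, on: bool) -> None:
        self._state[name] = on


def render_article(title: str, rating: float, switches: Switches) -> str:
    """Render a content line, including star ratings only while that switch is on."""
    if switches.enabled("star_ratings"):
        return f"{title} ({rating:.1f} stars)"
    return title
```

When the site is under load, ops calls `switches.set("star_ratings", False)` and the page keeps rendering, just without the expensive part.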

Check this: Are SQL Databases Dead?

5. Consider bitcoin

Mt. Gox, the Japanese exchange handling bitcoin, failed in spectacular fashion. Roughly $500 million worth of the digital currency was stolen. And what’s more, since it’s all frictionless, untraceable currency, there are no marked bills to try and track down. Oops!

How does this relate to operational costs? The failure was squarely with the operations department. Functionally the site worked fine. But security wasn’t handled well enough, intrusion detection wasn’t employed, and “unspecified weaknesses” were to blame.

Security is one of those things that can be ignored without pain. Until something goes wrong. What’s more, if it is being handled well, it’s invisible, and unappreciated besides.

Read this: Why Oracle won’t kill MySQL


Why I can’t raise the bar at every firm


It may seem counterintuitive. If I am not the best solution provider, why on earth would I highlight it?

I believe that by pointing those cases out, I also underline the clients and problems that I’m particularly well suited to, and for which I can really provide value. Read on!

1. People Problems

Sometimes, you’re hired to solve a particular problem which is framed as a technical one. Some process needs to be reworked, recoded or retooled. It’s framed as a technical problem, yet as things unfold the client already has the expertise in-house to solve & write the code. What then?

It may be that the right people aren’t communicating, project managers aren’t seeing the issues, or parts of the human systems are gummed up. We can’t raise the technical bar, but we can help get those folks talking.

I wrote about this before in When You’re Hired to Solve a People Problem.

Also: Why are oil spills & financial instability related to datacenter outages?

2. ORM Usage & Technical Debt

If you’ve read my blog you know I am not very fond of Object Relational Mappers.

I would also argue, as Ward Cunningham does so eloquently, that technical debt can be a real and pressing problem.

Here we would help identify and frame the problem, though the work of raising the bar technically involves the longer process of retooling & refactoring your code base.

Related: Why database choices are tricky

3. Where Commodity & Offshore Works

Some firms are already making use of oDesk or offshoring resources, where you might pay as little as $150/day. If you have a very technical manager or CTO, such a solution may work well for you.

At the other end of the spectrum are the high priced senior consultants from firms like Oracle, Percona or Pythian. Yes they may set you back as much as $3500/day.

In those cases a scalability & performance review may make sense. Here’s how. Although specialists are necessary, remember to ask yourself Why generalists are better at scaling the web.

We sit in the sweet spot between the two options. With low overhead, our prices are more affordable. At the same time you’re getting a whole lot more than a commodity solution. We’ll communicate in plain language with folks at every technical level. And for many firms that in itself is a value add.

Check this: Does Oracle Aim to Kill MySQL?

4. Existing team did their homework

Believe it or not, I’ve gone into consulting engagements where the existing team has really really done their homework.

In those cases it becomes much harder to raise the bar technically. I can still help where the existing team missed something. But more importantly, I can validate a correct setup, or identify technical debt.

Having an outside perspective then, can provide reassurance. As I see ten to fifteen new environments per year, I’ve seen hundreds in the past decade & a half. That’s helpful perspective in itself.

Read: Does Oracle Aim to Kill MySQL?

5. Availability & Uptime Are Already High

I wrote in depth about high availability in the Myth of Five Nines.

At the end of the day, availability can only approach perfection, not actually reach it. That’s a property of complex systems. If your uptime is already extremely high, again we can validate your environment, review it & summarize our findings. But we may not be able to raise the bar.

If that’s you, it’s a good problem to have!

Also: Why AirBNB Didn’t Have to Fail


Why Scalability Is Big Business


1. Complexity Is Growing

Despite automation & the mass migration to the cloud, or perhaps because of it, complexity continues to grow. Back in the dot com era a typical infrastructure included a load balancer, a couple web servers, one Oracle database, and that was pretty much it.

Now that has multiplied. Pile on top of that three to five more webservers, a search server, a page cache, an object cache, one or more slave databases and more. You may have a utility server with Jenkins for continuous integration, monitoring applications like Nagios and Cacti, your source code repository and perhaps configuration management like Puppet or Chef.

That’s not only more moving parts, it’s a wider swath of skills and technologies to understand. That’s one reason Generalists Are Better At Scaling The Web.

Also: Are SQL Databases Dead?

2. Developer Mandate: Features

The pressure to build features that can directly be monetized is obvious. Startups especially have the pressure to grow fast and grow now. So security, technical debt, and scalability often take a back seat. What’s more, in small, scrappy and lean startups, ops sometimes falls on the shoulders of one competent but overworked developer.

Related: Why Oracle Won’t Kill MySQL

3. Startup Growing Pains

With hyper growth, startups can go from 100 customers to millions overnight. That kind of popularity is a good problem to have. But if your app hits a wall and suddenly falls over, everyone is scrambling. The pressure builds, as fear of losing that traction mounts, and heads are put on the chopping block.

Read: AirBNB Didn’t Have To Fail

4. Missing Browse-only Mode & Feature Flags

Ever been browsing for airline tickets, then gone to order and gotten an error? Try again later? If so you’re familiar with a browse-only mode. This is a very powerful addition to any web application but is very often left out. Some mistakenly believe it won’t work for their application, as users will always be changing data.

Ever visited a website that has star ratings, only to find them missing? Or been temporarily unable to edit your rating for a piece of content? This amounts to what’s called a feature flag. These powerful switches give operations teams the ability to disable heavy features while the site is under tremendous load. They can take a huge burden off the shoulders of your servers when you hit that scalability cliff.
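A browse-only mode comes down to a guard on the write path. This is a toy Python example, assuming a hypothetical `Store` wrapper around your datastore. The names are mine, not from any framework.

```python
class BrowseOnlyError(Exception):
    """Raised when a write is attempted while the site is in browse-only mode."""


class Store:
    """Toy datastore wrapper: reads always work, writes respect the browse-only flag."""

    def __init__(self):
        self.browse_only = False
        self._rows = {}

    def read(self, key):
        return self._rows.get(key)

    def write(self, key, value):
        if self.browse_only:
            # The application catches this and shows "please try again later".
            raise BrowseOnlyError("Please try again later")
        self._rows[key] = value
```

The application has to be written to expect that error on every write, which is why this mode is so much easier to build in from the start than to bolt on later.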

Check this: Why I Don’t Work With Recruiters

5. Operations as an afterthought

I outlined some of the top reasons Why Startups Desperately Need Techops. It is a repeating refrain. Priorities of a growing startup often involve taking on technical debt. But if that isn’t managed carefully you’ll run into some of the problems that Ward Cunningham Warns Us About.

Also: 5 Things Are Toxic To Scalability


When a client takes a swing at you


1. A changing of the guard

Back in the dot-com era, circa 1999 I worked for a startup in some transition. Upon meeting the team, I met the new CTO Harvey, who joined just a month before. Also on the team was the IT director Bill, who had been with the firm for five years.

After spending time in initial meetings & discovery, I put together an outline and my plan to migrate them to Oracle. The project kicked off shortly thereafter.

Also: Why Oracle Won’t Kill MySQL

2. Team lead sucker punches you

I spent the first week onsite so I could work closely with the team, specifically at Bill’s request. We worked almost side-by-side for a few days as I worked through some of the challenges of their application, and how it might interact with Oracle. At that time I was still working on some test boxes, as the new Oracle server was not yet set up.

First thing Monday, while working remotely, I emailed Bill and CC’d Harvey to ask how things were going setting up the new server to house Oracle. A fairly harmless email, after what seemed like a successful previous week.

The response from Bill the director of IT was sharp and quick. He emailed back:

“The server is already setup, and I’ve installed Oracle on it. I have much of the data moved over. I’m not sure what you’ve been working on or how you will be able to help us on this project. Please advise.”

This came as a big surprise, as we had been working so closely together. We had also exchanged various emails to get details & configuration steps as well. It also seemed strange that he’d go ahead and complete the work that he had asked me to work on.

Related: Are SQL Databases Dead?

3. Proceed with caution

I quickly reached out to him, discussed status over IM and next steps. I also suggested that I come into the office again, to help with communication.

The following day I returned to the office, and met with him privately. I gently asked about his concerns, and if he had reviewed my task list and consulting agreement. It seemed that some of the terms & details had been overlooked. What’s more he and the CTO weren’t seeing eye-to-eye.

I then explained, in a nice way, that I had no plans to step on any toes, and told him:

“I’m glad to work with you Bill, in any way you see best, and on whatever tasks you decide I can help with.”

This seemed to put him at ease, and we moved forward.

Read this: AirBNB Didn’t Have to Fail

4. Green Shoots

As the engagement progressed it came to light that Harvey had hired me against Bill’s wishes. So Bill’s move seemed more motivated by feeling threatened than anything else.

Over the years I’ve learned time and again not to jump to conclusions. Especially at the start of a consulting assignment, there is likely a complex mix of personalities and human dynamics in play. Sometimes when someone lashes out, it isn’t even directed at you per se, but is the result of a difficult transition period.

Patience, understanding and renewed efforts to communicate often win the day.

Check this: Why Are Devops & DBAs in Short Supply?


Connect to MySQL in the Amazon Public Cloud


Troubleshooting MySQL on Amazon can be a real test of patience. There are quite a few different things to watch out for in terms of connectivity & networking. Sometimes a checklist can help.


Here’s my exhaustive list of things that can block you.

1. Be sure to create users & grants

Chances are you did something like this to create your user:


mysql> CREATE USER 'sean'@'localhost' IDENTIFIED BY 'password';
mysql> GRANT ALL PRIVILEGES ON sean_schema.* TO 'sean'@'localhost' WITH GRANT OPTION;

But that won’t help you when connecting from a remote Amazon box. So what to do? Here’s an example:


mysql> CREATE USER 'sean'@'10.10.%' IDENTIFIED BY 'password';
mysql> GRANT ALL PRIVILEGES ON sean_schema.* TO 'sean'@'10.10.%' WITH GRANT OPTION;

You may need to make your source IP wildcard *more* aggressive. For example consider '10.%'. You *may* even go with '%', which allows *all* source IPs. This may sound dangerous, but if you use a tight security group (see item #3 below), you can still be safe.
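If it helps to see which source IPs a given wildcard covers, here is a small Python approximation of the host pattern matching. It only handles the `%` and `_` wildcards on IP strings. MySQL's real matching also deals with hostnames and netmasks, which this sketch ignores. The function name is my own.

```python
import re


def host_pattern_matches(pattern: str, source_ip: str) -> bool:
    """Approximate a MySQL account host pattern: % matches any run, _ one character."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), source_ip) is not None


for pattern in ("localhost", "10.10.%", "10.%", "%"):
    print(pattern, host_pattern_matches(pattern, "10.11.3.7"))
```

A webserver at 10.11.3.7 is rejected by '10.10.%' but accepted by '10.%' and '%', which is exactly the case where you need the more aggressive wildcard.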

Related: Why Oracle Won’t Kill MySQL

2. Make sure iptables is not a problem

IPTables is a Linux service that acts like a private firewall for each server. Some AMIs will have it enabled by default. If you’re having trouble like I did, this can definitely trip you up. That’s because your connection will fail silently without telling you, hey the OS won’t let me into that port!

If you are a networking pro you’ve probably already fiddled with iptables. Feel free to add specific rules, and keep it turned on. However I’d recommend just disabling it completely, and using your Amazon security groups to protect your ports.


$ /etc/init.d/iptables stop
$ chkconfig --list iptables
iptables 0:off 1:off 2:on 3:on 4:on 5:on 6:off
$ chkconfig --del iptables
$ chkconfig --list iptables
service iptables supports chkconfig, but is not referenced in any runlevel (run 'chkconfig --add iptables')

Also: Are SQL Databases Dying Out?

3. Test & verify amazon security group settings

Security groups in Amazon can be tricky. I recommend the following:

o create a security group webserver_group
- allow port 80 from 0.0.0.0/0
- allow port 443 from 0.0.0.0/0
- allow port 22 from

o create a security group db_group
- allow port 22 from
- allow 3306 from

What’s happening here? We can’t specify a fixed set of IP addresses because they can change in Amazon. So essentially what we’ve done is say *any* requests from servers in our Amazon package, which are in the webserver_group security group, can connect to port 3306. Pretty cool right?

This means we’re pretty locked down. No internet connections to 3306, so we can be a little looser (see item #1 above) about our grants and source IPs.
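To illustrate how a group-referenced rule behaves, here is a toy Python model of the two groups. It is not the AWS API, just a plain data structure and a lookup, and it leaves out the port 22 rules. The `allowed` function and the sample IPs are hypothetical.

```python
from ipaddress import ip_address, ip_network

# Each rule is (port, source), where source is a CIDR block or another group's name.
GROUPS = {
    "webserver_group": [(80, "0.0.0.0/0"), (443, "0.0.0.0/0")],
    "db_group": [(3306, "webserver_group")],
}


def allowed(dest_group, port, source_ip, source_groups=()):
    """Return True if any rule on dest_group admits this port and source."""
    for rule_port, source in GROUPS[dest_group]:
        if rule_port != port:
            continue
        if source in GROUPS:
            # Group-referenced rule: the caller must belong to that group.
            if source in source_groups:
                return True
        elif ip_address(source_ip) in ip_network(source):
            return True
    return False


# A webserver reaches MySQL whatever its current IP; the open internet does not.
print(allowed("db_group", 3306, "10.10.1.5", ["webserver_group"]))
print(allowed("db_group", 3306, "203.0.113.9"))
```

Notice the database rule never mentions an IP address, so it keeps working when Amazon hands your webservers new ones.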

What about if you want to use your GUI tools to hit your Amazon hosted MySQL boxes? Say you like to use MySQL Workbench, Navicat or Toad to connect to MySQL. One way you could do this is to configure your db_group to allow 3306 from your office subnet. Then anyone VPN’d into your office will be able to use the tools they like.

Another option is to use Amazon VPC for your servers. You’ll set up an Amazon Virtual Private Gateway, which gives you a direct VPN connection between Amazon’s datacenter and your datacenter. This can be a messy process, and you’ll want to contact your network admin to help. Once it’s set up, Amazon boxes appear to sit on your office or datacenter network. Cool stuff!


$ mysql -h xxx.xxx.xxx.xxx -u admin -p
Enter password:
ERROR 2003 (HY000): Can't connect to MySQL server on 'xxx.xxx.xxx.xxx'

Read this: Why are MySQL experts in such short supply?

4. MySQL network settings

If MySQL is bound to the wrong IP address you can have real problems. First be sure skip_networking is OFF. If it is ON change it in /etc/my.cnf & restart MySQL.


mysql> show variables like 'skip_net%';
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| skip_networking | OFF |
+-----------------+-------+
1 row in set (0.00 sec)

The other MySQL setting that can be problematic is bind-address. First check what it is set to:


$ cat /etc/my.cnf | grep bind
bind-address=127.0.0.1

This isn’t going to allow remote connections. In Amazon, however, your IP address may change upon reboot. So there is a special setting to allow binding to any IP:


bind-address=0.0.0.0
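A quick way to audit these two settings is to scan the config text. The sketch below is a minimal Python parser, assuming a simple my.cnf with a [mysqld] section. It does not follow !include directives or check command-line flags, and the function names are my own.

```python
def mysqld_settings(cnf_text: str) -> dict:
    """Collect key/value pairs from the [mysqld] section of my.cnf-style text."""
    settings, section = {}, None
    for raw in cnf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "!")):
            continue  # blank lines, comments, and !include directives
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        if section != "mysqld":
            continue
        key, _, value = line.partition("=")
        # MySQL treats dashes and underscores in option names the same way.
        settings[key.strip().replace("-", "_")] = value.strip()
    return settings


def remote_blockers(cnf_text: str) -> list:
    """List settings in the text that would stop remote TCP connections."""
    s = mysqld_settings(cnf_text)
    problems = []
    if "skip_networking" in s and s["skip_networking"].lower() not in ("0", "off", "false"):
        problems.append("skip-networking is enabled")
    if s.get("bind_address") in ("127.0.0.1", "localhost", "::1"):
        problems.append("bind-address is loopback only")
    return problems
```

Feed it the contents of /etc/my.cnf, for example `remote_blockers(open("/etc/my.cnf").read())`, and an empty list means neither setting is in your way.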

Related: Bulletproofing MySQL Replication with Checksums

5. Installing mysql client & telnet for troubleshooting

You have two options for troubleshooting on the webserver side. If you’re simply trying to check by mysql command line, you may get blocked up if the network settings & security groups aren’t configured right. So use telnet first.


$ yum install -y telnet

$ telnet 10.10.10.1 3306
Trying 10.10.10.1...
Connected to 10.10.10.1.
Escape character is '^]'.
4
5.1.71??gu9Y6B'/y9Oay`QV Connection closed by foreign host.
$

If you don't get a response, it's not an issue with users or grants, but rather that the port isn't open. Check iptables, check bind-address and check security groups.
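If telnet isn't installed, the same check can be sketched in Python with the standard socket module. The `port_open` helper is my own name, and the three-second timeout is an arbitrary choice.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts alike.
        return False
```

Run `port_open("10.10.10.1", 3306)` from the webserver. A False here points at the network path (iptables, bind-address, security groups), not at MySQL users or grants.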

Check this: Top MySQL DBA Interview Questions

6. SELinux related issues

SELinux will do a lot of good, if managed properly. However if you're not aware of its existence, it can be very, very frustrating. Symptoms can be as vague as those of allergies, a cold or the flu. It can monitor files, and prevent MySQL from being able to write where it needs to.

Read this: Migrating MySQL to Oracle

7. RPM & later CentOS yum repo install conflicts

I had real problems doing a custom install for a customer. They didn't want to use a repository for various settings, but preferred downloading RPMs. There were a few other customizations which were tripping things up.

Based on all the connectivity issues I was having, I backed out of the RPM based install, and then ran through a stock yum install. After doing that, I started seeing these weird errors in the mysqld.log:

120328 21:32:40 [ERROR] Can't start server: Bind on TCP/IP port: Address already in use
120328 21:32:40 [ERROR] Do you already have another mysqld server running on port: 3306 ?
120328 21:32:40 [ERROR] Aborting

If I run "netstat -nat | grep 3306" in my terminal, I get the following:

tcp4 0 0 *.3306 *.* LISTEN

I spent hours spinning my wheels, unable to figure out what was happening here. At first it seemed a leftover pid file was the culprit. In the end it appeared the *old* /etc/init.d/mysql script was still in place, and the new yum packages wouldn't work with that.

I ended up just scrapping the whole box, and starting from scratch. Sometimes you have to do that. After a clean build, all was fine.

Related: RDS or MySQL 10 Use Cases


5 Things Frans Johansson says about innovation


You may not have heard of the Medici before, but you’ve probably heard of the Renaissance. The Medici family hosted the round tables, the meetups, the social gatherings & mixers. They brought diverse artisans, engineers & thinkers together, and the world hasn’t been the same since!


In The Medici Effect, Frans dissects what this famous family did. His case studies include the likes of Richard Branson, Deepak Chopra, Charles Darwin, Thomas Edison, Orit Gadiesh, Marcus Samuelsson, George Soros & our own favorite, Linus Torvalds.

What he discovered really surprised me.

1. Swim at the intersection

Hanging out with folks in your field is great, whether you’re a physician, financial analyst, Ruby programmer, or artist. But it won’t expose you to enough new ideas. To get that, you need to hang out with those in other disciplines. Learn a language, take dance classes, try your hand at a new sport, or attend meetups of wedding planners or DJs. Whatever it takes to get out of your comfort zone is what will put you at the intersection.

Also: Why a killer title can make or break your content efforts

2. You need quantity to get quality

This was a very surprising finding of their research. One might think that greats like Albert Einstein were geniuses from the start. But it turns out one consistent factor between all these folks is the quantity of their attempts. They came up with many many ideas, and chased as many as they could. Of course they are only remembered for their successes, but this hides the underlying mathematics. It’s a numbers game in almost all of these cases.

Read: Are SQL Databases Dead?

3. Peel all the potatoes and cook them together

Peel one potato and cook it. Then peel another and cook it. Doesn’t sound like a recipe for efficiently preparing dinner does it? Turns out it’s also not great for innovating. Peel & prepare many ideas at once, and try to execute them in parallel if you can. That’s what these greats have done.

Related: Why generalists are better at scaling the web

4. Be ok with more failures

This is a tricky one. But Johansson puts it in perspective with this key quote:

“Inaction is far worse than failure.”

Viewed that way, our caution about diving into a new idea seems more limiting. True, it costs money, time & resources to pursue new ideas, ventures & startups. So be sure to reserve resources. That’s right, spend that money & time carefully lest you run out before hitting on the big one.

He also says to be suspicious of low failure rates, in yourself or those you’re evaluating. This probably indicates you’re not risking enough, or not trying new things often enough.

Read this: Why Oracle Won’t Kill MySQL

5. Break out of your network

Your network is powerful for pursuing your career, or following existing well-traveled paths. But it can be an obstacle when forging new paths, which is what innovation is all about.

So break away from your networks. One way you can do this is by building a new one. But be sure to surround yourself with diverse cultures, upbringing, backgrounds & expertise.

Also: RDS or MySQL 10 Use Cases


How to deploy on Amazon EC2 with Vagrant


Why do I want Vagrant?

Vagrant is a really powerful tool for managing virtual machines. If you’re a developer it can make it push-button simple to set up a dev box on your laptop. It manages the images, and uses configuration files to describe specifics of your machines.

In the Amazon environment, you can deploy machines just as easily as on your desktop. That’s pretty exciting for those of us already familiar with Vagrant. With that I’ve provided a simple 7 step howto for doing just that!

Also: Are SQL Databases Dead?

1. Use the Mac OS X installer

Fetch your download file here:

Vagrant Installer Downloads

Run the installer. It should do the right thing!

Also: Why Oracle Won’t Kill MySQL

2. Install the vagrant-aws plugin


$ vagrant plugin install vagrant-aws

Also: Bulletproofing MySQL Replication with Checksums

3. Fetch a vagrant box image

Box images vary depending on your “provider”, which is Vagrant-speak for the environment you’re running in. For AWS, they’re some simple JSON files that tell Vagrant how to work in that environment.

The creator of the plugin has provided a dummy box. Let’s fetch it:


$ vagrant box add dummy https://github.com/mitchellh/vagrant-aws/raw/master/dummy.box

This command is straight out of the readme. What does it do? Take a look:


$ cd /var/root/.vagrant.d/boxes/dummy/aws

$ cat metadata.json
{
"provider": "aws"
}

There’s also the info.json file which looks like this:


$ cat info.json
{"url":"https://github.com/mitchellh/vagrant-aws/raw/master/dummy.box","downloaded_at":"2014-01-14 17:42:33 UTC"}

There’s not a whole lot going on here. If you’re deploying VirtualBox VMs with Vagrant, you’d see a VMware4 disk image in the box. But Amazon stores its own AMIs on S3, so the box carries no disk image at all. Vagrant simply asks EC2 to launch an instance from the AMI you name.

Related: Intro to EC2 Cloud Deployments

4. Configure Vagrantfile

Create a directory to hold your vagrant metadata. This would be the name of your machine:


$ cd /var/root
$ mkdir testaws
$ cd testaws
$ vagrant init

Edit the file as follows:


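Vagrant.configure("2") do |config|
  # Use the dummy box fetched in step 3; the AWS provider needs no local disk image
  config.vm.box = "dummy"

  config.vm.provider :aws do |aws, override|
    # Placeholder credentials: substitute your own
    aws.access_key_id = "AAAAIIIIYYYY4444AAAA"
    aws.secret_access_key = "c344441LooLLU322223526IabcdeQL12E34At3mm"
    # Name of the SSH keypair registered with Amazon
    aws.keypair_name = "iheavy"

    # AMI to launch; this determines the OS
    aws.ami = "ami-7747d01e"

    # Login user and private key Vagrant uses to SSH in
    override.ssh.username = "ubuntu"
    override.ssh.private_key_path = "/var/root/iheavy_aws/pk-XHHHHHMMMAABPEDEFGHOAOJH1QBH5324.pem"
  end
end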

If you’re familiar with the Amazon command line tools, you’ve probably set up environment variables. Otherwise these may not be familiar to you, so let’s go through them:

Your access_key_id and secret_access_key are two pieces of information Amazon uses to identify your account, authorize your API requests and bill you. Those are unique to your environment so keep them close to the vest. Here’s how you create them or find them on your AWS dashboard.

The keypair_name is the name of your personal SSH key. You may have one on your laptop which you use to access other servers. If so you can upload it to the Amazon environment. If not you can also use the dashboard to create your own. Whenever you spin up a server, you can instruct Amazon to drop that key on the box in the right place. Then you’ll have secure command line access to the box, without a password. Great for automation!

Next is your AMI. This is an important choice, as it determines the OS of the machine you’ll spin up, and many other characteristics. You can go with an Amazon Linux AMI but I quite like the Alestic ones from Eric Hammond. Trusted & reliable.

Looking for an Ubuntu AMI? Try this AMI locator tool.
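
Rather than pasting the access key and secret into the Vagrantfile, where they can end up in version control, one option is to read them from environment variables. The sketch below assumes the variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (the names the AWS command line tools use). The aws_credentials helper is hypothetical and not part of Vagrant or the vagrant-aws plugin.

```ruby
# Hypothetical helper: pull AWS credentials from the environment
# instead of hardcoding them in the Vagrantfile.
REQUIRED_AWS_VARS = %w[AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY].freeze

# env defaults to the real process environment; any Hash-like object
# can be passed in, which makes the helper easy to try out.
def aws_credentials(env = ENV)
  missing = REQUIRED_AWS_VARS.reject { |name| env[name] && !env[name].empty? }
  unless missing.empty?
    raise ArgumentError, "missing environment variables: #{missing.join(', ')}"
  end

  {
    access_key_id: env["AWS_ACCESS_KEY_ID"],
    secret_access_key: env["AWS_SECRET_ACCESS_KEY"]
  }
end

# Inside the provider block of the Vagrantfile this might be used as:
#   creds = aws_credentials
#   aws.access_key_id     = creds[:access_key_id]
#   aws.secret_access_key = creds[:secret_access_key]
```

Failing loudly when a variable is missing is a deliberate choice here: it surfaces the problem before Vagrant ever talks to Amazon.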

Check this: 8 Best Practices for Deploying MySQL on AWS

5. Start up the box

Starting an instance once you’ve configured your Vagrantfile is pretty straightforward.


$ vagrant up --provider=aws

Related: How to autoscale MySQL on Amazon EC2

6. Verify in the Amazon dashboard

Jump over to your Amazon dashboard with this link. If you’re logged in already, that will take you to your EC2 instances. You should see a new one, based on the parameters in your Vagrantfile.

Read: Why devops talent is in short supply

7. Log in to your Amazon instance

Last but not least, you’ll want to log in. Note I’m explicitly specifying my SSH key here. Your path may vary…


$ ssh -i ./iheavy.pem ubuntu@ec2-50-220-50-40.compute-1.amazonaws.com

Also: 5 more things deadly to scalability

Why cloud computing is the spotify-cation of hosting

1. Music collections of old

Way way back in the 70’s I remember riding around in a VW beetle. Maybe I’d be driving with my dad or my uncle. Everybody seemed to own a VW! What everybody also had was a huge collection of 8-track tapes in a big box. You’d dig through the box and find what you wanted to play, then pop in the tape. It was exciting because before 8-tracks you only had records, and you couldn’t play those in the car!

But even record collections were new in the 60’s. Before that, most music was consumed live or on the radio.

Also: Why a killer title can make or break your content efforts

2. When books left the library

A similar trend followed for books and reading. Although newspapers have been sold by subscription for a lot longer, books were mostly consumed in libraries. But the consumer itch to build collections eventually built Barnes & Noble into a powerhouse brick and mortar store.

Internet disruption of that business model came too. Enter Amazon’s Kindle. Although you theoretically *buy* digital books, if you read the fine print you’ll see you actually rent them in perpetuity. In fact there have been cases where Amazon has reached into devices and removed previously purchased media.

Related: Why AirBNB didn’t have to fail

3. Managing collections (even stolen ones) is hard work

When you download music or movies, either from iTunes or, god forbid, by grabbing them off of BitTorrent networks, you need to put them somewhere. You’ll store them on your laptop hard drive or, if your collection is large enough, on some shared storage system at home. And you’ll also probably never back it up.

The thing is, hard drives themselves have a life of about two to four years. As an operations guy I manage data every day. Backups are a big part of that process, so that when the media fails, you won’t lose the collection of movies & music you built lovingly over so many years.

Sadly most people learn the hard way. And when you learn this lesson you probably think, where did all that time go? What did I even *have* in my collection?

Also: Are SQL Databases Dead?

4. Why music & movie theft was just a blip on the historical radar

I’m also a bit of a Doctor Who fan. Since it’s a rather obscure British TV show (or was) I spent some time buying many of the old episodes on DVD. Or I *did* rather, until Netflix started offering the whole classic collection on subscription. They did this with Star Trek too. Now I have no reason to fish through my shelves for a DVD. Why would I?

As users become more accustomed to the subscription model, they’re less likely to want to build a whole collection of media. This goes for books, music & videos. Who would bother downloading off of BitTorrent, managing a home collection, and all that trouble when you can just subscribe? Easy. No mess!

Read: Why Oracle Won’t Kill MySQL

5. Subscriptions, subscriptions everywhere!

Whether you managed a datacenter of physical servers in-house, or bought servers managed by a hosting company, before the subscription model you had to worry about moving parts. You had to worry about failing hard drives, memory & all the rest.

Then along comes Amazon Web Services and its EC2 servers, bringing the subscription model to hosting too. This raises the bar on the biggest failing component, hard drives, by putting all data on EBS, their virtual storage network. All of this raises the bar for a lot of organizations and reduces the drudgery.

What Spotify is doing with music, Netflix is doing for movies & TV shows, and Kindle is doing for books. That same trend has brought great disruption to internet & server hosting. Startups and consumers win big in this game.

Can you think of any businesses where a subscription model might work? They may be ripe for disruption by a new startup.

Check out: Why your startup is failing at Devops

Are SQL Databases Dead?

mesa verde city

I like the image of this city of Mesa Verde. It’s fascinating to see how ancient cities were built, especially as an inhabitant of one of the world’s largest cities today, New York.

I’m a longtime relational database guy. I worked at scores of dot-coms in the 90’s as an old-guard Oracle DBA, and pivoted to MySQL in the new century. Would a guy like me who’s seen 20 years of relational database dominance really believe they could be dying?

There’s a lot to be excited about in this new realm of db, and some interesting bigger trends that are pushing things in a new way.

Join 15,100 others and follow Sean Hull on twitter @hullsean.

1. Growing use of ORMs

ORM probably sounds like some strange fossil archeologists just dug up in the ancient city of Mesa Verde. But they’re important. You may know them by their real-life names, Hibernate, Active Record, SQLAlchemy and Cake. There are many others. Object Relational Mappers provide a middleware between developers and the SQL of your chosen relational database. They abstract away the nitty gritty, and encapsulate it into a library.

In a way they’re like code generators. Markus Winand talks about them in SQL Performance Explained, warning of the “eager fetching” problem. This is DBA speak for specifying all columns (SELECT *) or fetching all rows, when you don’t need them all. It’s inefficient in terms of asking the database to read & cache all that data, but also to send it across the network and then discard it on the webserver side. Like a lazy housekeeper, the clutter & dust will grow to overwhelm you.
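
To make the cost of eager fetching concrete, here is a minimal Ruby sketch. It is a simulation, not a real database call: the in-memory ROWS table, its column names and the helper methods are all hypothetical, and JSON size merely stands in for the bytes a database would send across the network.

```ruby
require "json"

# Hypothetical table: small id and title columns plus one large body column.
ROWS = Array.new(100) { |i| { id: i, title: "Post #{i}", body: "x" * 10_000 } }.freeze

# Stand-in for SELECT *: every column of every row is returned.
def select_star(rows)
  rows.map(&:dup)
end

# Stand-in for SELECT id, title: only the named columns are returned.
def select_columns(rows, *columns)
  rows.map { |row| row.slice(*columns) }
end

# Rough proxy for network cost: size of the serialized result set.
def payload_bytes(rows)
  JSON.generate(rows).bytesize
end

wide = payload_bytes(select_star(ROWS))
narrow = payload_bytes(select_columns(ROWS, :id, :title))
puts "SELECT *         : #{wide} bytes"
puts "SELECT id, title : #{narrow} bytes"
```

If the page only lists titles, everything in the body column is read, shipped and thrown away. Many ORMs let you name the columns you want, which is the usual way out of this particular trap.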

Martin Fowler is a co-author of the great book NoSQL Distilled. He tries to walk the fence in his post ORM Hate, balancing developers’ love of ORMs against the obvious need for scalability. Ted Neward calls ORMs the Vietnam of Computer Science.

Mattias Geniar points out that BAD ORMs are infinitely worse than bad SQL, and Drewsky on High Scalability makes The Case Against ORM Frameworks.

If you agree the ORM conversation is still a huge mess, you’ll be excited to know that NoSQL sidesteps it completely. NoSQL databases are built out of the box to interface more like data structures than like rows and columns. So when you go NoSQL, you eliminate the scalability problems ORMs introduce. That makes developers happy, and pleases DBAs and techops too. Win!

Read: Why Oracle won’t kill MySQL

2. Widening field of options

NoSQL databases are not simply key value stores, though some like Memcache and Riak do fit that mold.

MongoDB offers configurable consistency & durability & the advantages of document storage, so there’s no need for an ORM here. You also have a mix of indexing options that go a little deeper than other NoSQL solutions. A sort of middle ground solution that offers the best of both worlds.

Cassandra is a powerful db that is clustered out of the box. All nodes are writeable, and there are various ways to handle conflict resolution to suit your needs. Cassandra can grow big, and naturally takes advantage of cloud nodes. It also has a nice feature to naturally age out data, based on settings you control. No more monumental archiving jobs.

HBase is the database part of Hadoop, based on Google’s seminal Bigtable paper.

Redis is another option with growing popularity. It’s a key-value store, but it allows more complex data in its buckets, such as hashes, lists, sets and sorted sets. Developers should be salivating at this one.

Also: 5 Great Things about Markus Winand’s Book SQL Performance Explained

3. Lowering bar

The old world of relational databases treats data as sacrosanct. DBAs are tasked with protecting its integrity & consistency. They manage backups to protect against disaster. In this world, every bit of data written is as sacred as any other, whether it’s your bank account balance, or a comment added to a facebook discussion.

But modern non-relational databases introduce the idea of eventual consistency. DBAs and architects would say we are relaxing our consistency & durability requirements. What they mean is data can get slightly out of sync and we’re ok with that. We’ll build our web applications to plan for that, or even, in the case of Riak, expose the levers of durability directly to the developers, allowing them to make some changes instant, while leaving others more lax and lazy.
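
As a rough illustration of those levers: Dynamo-style stores such as Riak let you choose how many replicas hold each key (N), how many must acknowledge a write (W) and how many are consulted on a read (R). Whenever R + W > N, every read set is guaranteed to share at least one replica with every write set. Below that threshold you are trading freshness for speed. The method below is a hypothetical helper that encodes just that arithmetic, not any real client API.

```ruby
# Sketch of the quorum overlap rule for Dynamo-style stores.
#   n = replicas that hold each key
#   r = replicas consulted on a read
#   w = replicas that must acknowledge a write
def quorums_overlap?(n:, r:, w:)
  unless (1..n).cover?(r) && (1..n).cover?(w)
    raise ArgumentError, "r and w must each be between 1 and n"
  end

  r + w > n
end

puts quorums_overlap?(n: 3, r: 2, w: 2) # true: read and write sets always share a replica
puts quorums_overlap?(n: 3, r: 1, w: 1) # false: faster, but a read may hit a stale replica
```

A bank balance might justify the first setting. A comment on a discussion thread can probably live with the second.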

Check this: Why high availability is so very hard to deliver

4. Cloud demands

Virtualized environments like Amazon EC2 give easy access to legions of servers. Availability zones & regions only widen the deployment options. So deploying a single writeable master, the way traditional relational databases work best, is not natural.

Databases like Cassandra, Mongo & Redis are clustered right out of the box. They grew up in this virtual datacenter environment and feel comfortable there.

Related: Why I wrote the book on Oracle & Open Source

5. Only DBAs understand them

Devs may whine at this statement, and to be fair it’s a generalization. The popularity of ORMs speaks volumes here. Anything to eliminate the dreaded SQL writing. Meanwhile DBAs bemoan the use of ORMs for they represent everything they’re trying to fix.

SQL is hard enough, but the ugly truth is each database vendor has their own implementation, their own optimizations, their own optimal tweaks. Even between database versions, SQL code may not perform consistently.

Identifying slow SQL and tweaking it remains one of the primary tasks of performance tuning, for this reason. It hasn’t changed much in my two decades on the job.

Also: Why bemoaning AWS performance sounds like Linux detractors circa 1999
