
Are startup CEOs hiding their scalability problems?


Join 27,000 others and follow Sean Hull on twitter @hullsean.

Your site is running fine, right? You have 1000 customers, and it usually runs smoothly. Just one lingering question: why does it take five high-performance EC2 instances to run the database, all on flash drives? Good question!

The truth is, one of the highest-trafficked sites I managed pulled in 100 million uniques a month and used only three backend databases. It was one of those wildly popular celebrity gossip sites, the ultimate guilty pleasure when you’re at the office and can’t watch reality TV!

Snickers aside, this is huge traffic. And all of the above was built on Drupal, with no ORM in the mix. It could even run, albeit noticeably slower, while memcache was disabled.

1. Servers with solid state drives

I’m very excited to see Amazon introduce servers with SSDs. They can bring you a 100x improvement in disk I/O, and that, my friends, is the be-all and end-all for databases. So why complain?

If you deploy on these boxes right out of the gate, it may be like using a crutch. You become dependent on it and ignore real performance tuning. Solid state drives still won’t make up for the queries coming out of that ORM middleware you’re using.

Also: Do managers & CEOs underestimate operational costs?

2. Memcache saving your bad queries

Memcache is also a powerful tool. It sits between the database and your webservers, reducing load on the database by as much as 10x. That’s a great way to get better response time, and reduce drag on your db tier. But it’s still worthwhile to do your performance tuning without it.

Why? If you can get your site to run without caching, it will run blazingly fast *with* it. Don’t use it as a crutch, use it as rocket fuel for your well tuned site.

Read this: Do startups need techops?

3. A legion of read slaves

I’ve seen smaller sites using a ton of read slaves, all of them deployed to cover up slow & redundant queries pouring out of an ORM middleware layer, in this case CakePHP.

Again, read slaves are great, but tune & test with less hardware, and get the performance up the hard way. With elbow grease!

Related: Howto automate MySQL query analysis with Amazon RDS

4. Really really big memory

64G, 128G, 256G of main memory? If I wax on about the days when you’d get excited by 64k, I’ll sound like an old timer. But with those extreme limitations, you had to write tight code. Otherwise it just wouldn’t do anything.

Really really big memory of today’s servers allows us to get lazy. I hear developers say “Hey, the database is 10G of data, and we have 64G main memory, so the whole thing will fit in memory. Problem solved!”

Duhhh… No. Why not? Because you still have to slice and dice that data. You still have to scan through for bits & pieces that aren’t indexed, then sort, and organize that into temporary memory space. In DBA speak, you’re still doing a ton of logical IOs.

Picture it another way. Imagine the days when you’re on horseback, riding across the west. You travel light because, frankly, your horse can carry only so much. Then along come cars, and you start loading up the trunk. You add the kitchen sink, and the rear end is sagging toward the ground. All seems fine until you hit a steep mountain, and your car is almost stalling at 20mph. If you had only carried the same load as you did on horseback, you’d be speeding across the country at lightning pace.
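One way to see those logical IOs is to ask MySQL how it plans to run a query. Here is a small sketch that wraps EXPLAIN in a shell function. The database `shop`, the table `orders` and the `MYSQL` override variable are hypothetical names picked for illustration only. In the output, `type: ALL` means a full table scan, and that scan still costs you even when every page is already sitting in memory.

```shell
#!/bin/bash
# Sketch: print the execution plan for a query.
# MYSQL may be overridden (a full path, a wrapper); the assumed default
# is the stock "mysql" client on your PATH.
MYSQL=${MYSQL:-mysql}

# explain_query <database> <sql>
# Prints the plan in vertical format so the "type" and "rows" columns
# are easy to read.
explain_query() {
    local db="$1" sql="$2"
    "$MYSQL" --vertical -e "EXPLAIN $sql" "$db"
}

# Example (hypothetical schema):
#   explain_query shop "SELECT * FROM orders WHERE customer_email = 'a@example.com'"
```

If the plan shows a scan over a large `rows` estimate, an index on the filtered column is usually the first thing to try.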

Read: Is Amazon RDS hard to manage?

5. Deploying poor code

Deadlines are looming, and new features must be deployed. So performance testing can wait until later. The code works after all.

Been there, done that. Code gets deployed and all of a sudden there are spikes in server load in the evening. Ops & DBA teams are screaming, “Who wrote this code?”

Load testing should be a part of everyday QA & test. It’s the only way to avoid growing scalability problems.

Check this: Are SQL databases dead?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

How to automate MySQL slow query analysis with Amazon RDS


If you’ve used relational databases for more than ten minutes, I hope you’ve heard of slow queries. Those are the pesky little gremlins that slow down your startup and prevent the scalability you so desperately need.

Luckily there’s a solution. What I’ve found is if I send a report to developers every week, it keeps these issues front and center, for folks that are very busy indeed.

The script below is for RDS, but you can surely modify it if you have a physical server or roll-your-own MySQL box on Amazon. Take a look & enjoy!


1. install percona tools

Percona, as many probably already know, is a wildly successful services firm that supports MySQL and related technologies. They also have a very popular & scalable MySQL distribution by the same name.

Even if you’re not using Percona MySQL, you definitely want to get ahold of the Percona Toolkit. It provides all sorts of useful tools, including the one this article is based on, pt-query-digest.

This tool takes your stock MySQL slow query logfile as input, and summarizes it into a very useful and readable report. Formerly mk-query-digest, it’s now called pt-query-digest. See below.

You can install the Percona tools easily by grabbing Percona’s repository package and installing it (with rpm on Red Hat style systems). From there you can use yum or apt-get, depending on your distribution, to install the percona-toolkit package.

Related: Why a killer title can make or break your content efforts

2. install aws command line tool

Amazon has consolidated all its command line tools into a single one called just “aws”. The options can be a little arcane, and the error messages misleading besides. What’s good, though, is that it is slightly easier to install & configure.

Do you already use Python? Install it this way:


$ pip install awscli

If not, you’ll need to dig into the aws cli installation instructions further.

Also: Do managers underestimate operational costs?

3. edit .aws/config

After you get the tool installed, you need to set up your environment. I edited a file named /home/shull/.aws/config as follows:


[default]
region = us-east-1
aws_access_key_id = BLIBJZMKLWIL5UTNRBMQ
aws_secret_access_key = MF5J/2z7HmN92lQUrV12ZO/FBXNjDVjL52TNRWsG

You can find the access_key_id and secret_access_key values on your Amazon dashboard. Click the upper right-hand corner under your name and select the menu item “Security Credentials”.

Check out: Are SQL Databases Dead?

4. edit send_query_report.sh

I wrote the script below so you can fairly easily edit it.


#!/bin/bash
#
# usage: send_query_report.sh <rds-db-instance-id>

# get the rds db instanceID from command line (or crontab) entry
#
AWS_INSTANCE="$1"

# here's where we'll store the latest slowquery.log
#
SLOWLOG=/tmp/rds_slow.log
#SLOWLOG=`/bin/ls -tr /home/shull/*.log | /usr/bin/tail -1`

# fetch slow query log from rds box
# here I always grab the latest one.
#
/usr/local/bin/aws rds download-db-log-file-portion --db-instance-identifier "$AWS_INSTANCE" --output text --log-file-name slowquery/mysql-slowquery.log > "$SLOWLOG"

# query report output
SLOWREPORT=/tmp/reportoutput.txt

# pt-query-digest location
MKQD=/usr/local/bin/pt-query-digest

# run the tool to get analysis report
"$MKQD" "$SLOWLOG" > "$SLOWREPORT"

# today's date in a variable (month/day/year-hour:minute)
# note: % only needs escaping as \% when the command sits directly in a crontab line
TODAY=$(/bin/date +%m/%d/%Y-%H:%M)
#YESTERDAY=$(/bin/date -d "1 day ago" +%m/%d/%Y-%H:%M)

# report subject
SUBJECT="Sean Query Report -- $TODAY"

# recipient
EMAIL="hullsean@gmail.com"

# send an email using /usr/bin/mailx
/usr/bin/mailx -s "$SUBJECT" "$EMAIL" < "$SLOWREPORT"

Note, if you don't have mailx installed, it should be available in your repository. Use apt-get or yum as necessary to get it installed.
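One refinement worth considering: on a quiet week the slow log may come back empty, and an empty report is just noise in a busy inbox. Here is a minimal sketch of a guard. The function name `has_slow_queries` is made up for this example, and it assumes the stock slow log format, where each entry carries a `# Query_time:` line.

```shell
#!/bin/bash
# has_slow_queries <slowlog>
# Succeeds (exit 0) only if the file is non-empty and contains at least
# one "# Query_time:" line, which the stock MySQL slow log writes for
# every logged statement.
has_slow_queries() {
    local slowlog="$1"
    [ -s "$slowlog" ] && grep -q '^# Query_time:' "$slowlog"
}

# Example use inside send_query_report.sh, before running pt-query-digest:
#   has_slow_queries "$SLOWLOG" || exit 0
```

Exiting quietly with status 0 keeps cron from mailing you an error on weeks when there is simply nothing to report.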

Also: Is high availability overrated & near impossible to deliver?

5. Add to crontab

After you've tested the above script from command line, you will want to add it to a weekly cron job. Voila, automation! Don't forget to chmod +x to make it executable. :)


00 09 * * 5 /home/shull/send_query_report.sh seandb

Read: Are MySQL DBA's impossible to find?


What happens when you combine devops & continuous delivery into a card game?


Alex Papadimoulis & the guys at Inedo put together CodeMash The Game, an interesting game that brought a new twist to conference going.

Now they’re at it again with a Kickstarter to build Release!, a game about devops & continuous delivery.

1. Bring your team together

Weekly standups are great, but what about throwing a quick card game in to mix things up? It’s an interesting twist and one that’s sure to help with team building.

Read: Why has no-one heard of Moskovitz but everyone knows Zuckerberg?

2. Learn more about cutting edge software development

Weak on your agile, or want to raise your team’s software quality? Release! seems like a new and surprising way to do just that.

Related: Why I ask clients for a deposit

3. Learn about software development luminaries

Many of the important folks in the evolution of software development are featured in the game, such as Patrick Debois, Jez Humble & Dan North.

Also: Is Amazon RDS hard to manage?

Why I ask clients for a deposit


1. It indicates both parties are serious

A common refrain when discussing the terms of a project and reviewing the statement of work is “When shall we get started?” The answer should be, “I’m ready to get started anytime you like. Would you like to use PayPal or ACH for the deposit?”

The deposit signals to the vendor that it’s time to get working. This client has the budget and is serious about moving forward today.

Read: Why Fred Wilson is wrong about Apple

2. It protects against scope changes

Startups & seasoned businesses alike have changing needs. That’s why they may choose a situational resource to begin with.

If the winds change, and the client doesn’t need you tomorrow, a deposit defrays the final invoice and/or any discounts you may have applied.

Related: Is Dave Eggers right about risks of social media?

3. Insurance if business fizzles

Fizzling business is a nice way to say the market has changed. Perhaps the startup has decided to pursue other opportunities. In close to twenty years of business I’ve only had this happen twice.

Once I was working with a rewards card business. They were already having trouble meeting payroll. Turns out businesses have a legal obligation to meet payroll. That’s another way of saying employees are at the top of the who-gets-paid list, and vendors may be closer to the bottom. The owner went back to being a lawyer, his profession before the startup.

All in all, a deposit provides some insurance in these cases.

Read this: 5 cloud ideas that aren’t actually true

4. Signals your maturity to client

This is a hard one for some freelancers and consultants to stomach. “I really want to get going with consulting, and don’t want to turn away this client,” the thinking goes. But consulting is a peer relationship, where vendor and client need to be on an equal footing.

Your need for a deposit, and willingness to walk away without one, says to the client you are professional and have been in business for some time.

Also: If you’re using MySQL in the Amazon cloud, you need to ask yourself this question

5. Protection from early termination

That sounds ominous but it doesn’t have to be. In the world of freelance and consulting, a client can decide they no longer need your services tomorrow.

Why? Perhaps they hired a fulltime resource? Perhaps their needs changed? Perhaps the storm of site outages has passed and the urgency has changed.

Whatever the reason, projects change. If you’ve offered a discount for three months of work, but only end up with one month of work, your full fees may apply. In that case, the deposit should be the discount amount.

Check this: Do managers underestimate operational costs?


Why Fred Wilson is wrong about Apple


If you’ve followed the tech news recently, you may have heard Fred Wilson’s comments about Apple. In essence he believes Apple is too reliant on and rooted in hardware, and that hardware isn’t viable in the long term. Mobile hardware is becoming a commodity.


To be sure Androids have come a long way, and they may yet improve a lot by 2020 as Fred says. But the aftermarket value of iPhones really does speak volumes. See below.

1. iPhone has never had the best hardware

If you’ve ever watched a Samsung ad, or talked to someone with the phone, you probably know this already. Competitors have shipped bigger screens, faster processors, the first phones with fingerprint readers, and untethered syncing. The list goes on and on.

Also: 5 Cloud ideas that aren’t actually true

Yes, Apple is rooted in the hardware business, but not in a way that a commodity can disrupt it. They’re rooted in the hardware business only inasmuch as it helps them deliver polish. If it helps them deliver a seamless experience, and a device that Jean-Luc Picard would appreciate, then they are in that business. Just “make it so!”

2. Users are seduced by simplicity

So how is it possible that an inferior piece of hardware could sell more?

Easy. Those users don’t think that way. They aren’t buying hardware. What do I mean?

I would argue many iPhone users buy for the experience, the simplicity, the ease of use. Designers call this User Experience or User Interface, but end users don’t know these terms. What they know is they don’t have a headache. They’re not frustrated trying to move an image from one app to another, or copy/pasting etc.

User interface is that invisible force that just makes everything on the device better. Call it polish, but it’s much more than a pretty face.

Related: Are SQL Databases Dead?

3. Most users don’t care about “open”

Another benefit touted on the Android side is it being “open”. The OS is open-source, and then extended by manufacturers. While this surely brings down costs to them, it may all be irrelevant to end users and consumers.

Yes open standards are great for competition, great for markets, and ultimately great for users. But Microsoft is a great case study in why consumers often still choose a closed solution.

Read: Do managers underestimate operational costs?

4. Apple is Sexy

That may sound fanboyish, but seriously. Look at the accessories market for blinging your phones.

If that’s not enough, look at the aftermarket value. iPhones retain their value, Samsungs don’t.

Read: Five things I learned from David Maister about trust and advising clients

5. Android is still broken

From where I’m standing, and a lot of experts agree, the Android ecosystem is broken.

For one, Android’s app store, being historically unregulated, is chock full of malware and dangerous downloads. Most users aren’t computer experts and aren’t good at evaluating security risks, so they pay the price.

What’s more, many Android phones come stocked with bloatware, slowing down the device, and reducing reliability from day one.

Read: Why the Android ecosystem is broken


Is Dave Eggers right about the risks of social media?


I have to admit, though Eggers is a pretty famous author, I wasn’t familiar with his work. I do however read AVC regularly, the writing of renowned VC & Union Square Ventures partner Fred Wilson. So when one of the commenters pointed to the book as a great read, I grabbed a copy on my Kindle.

Flipping through to the back of the book, the further reading section is telling. Bradbury’s Fahrenheit 451, DeLillo’s White Noise, Huxley’s Brave New World & Orwell’s 1984 are just a few on the list. All books that I’d read & enjoyed not only for their story, but for their cautious warning of a dystopian future.

The Circle story takes place at a fictional Silicon Valley company, “The Circle”, whose campus includes wings such as Old West, Renaissance, Enlightenment, Machine Age & Industrial Revolution. The main character, Mae, has just been hired in customer experience. Employees at the Circle are all but *required* to socialize together. There everything is ranked, from customer satisfaction, to employee participation, comments, likes, posts & shares.


I came away with five major themes from the book. As the characters march through the pages, watch them sacrifice their morality, free will & eventually human rights too.

1. social media is like snack food

What I loved most about the story, was how extreme the social media use had become. It was as though every moment had to be captured, every interaction “shared”. And with that, others then comment, favorite, and interact.

But as we found later, social media became something of lesser value. It was like eating snack food, a simulation of real food, missing in nutrients, but masquerading as the real thing. The metaphor holds together well, as we see people become fatigued with Facebook in the real world, and the constant sharing of everything.

Also: Is quality journalism dead? What I learned from Ryan Holiday

2. Eggers’ fictional technologies are close at hand

At one point in the story, Mae does a search to find out about her family history. What turns up is more than she bargained for. It turns out that her parents had a rather odd affinity for yearly bacchanalian partying, and the photos shock & embarrass Mae.

Turns out some neighbor had scanned a whole shoebox full of photos, and from there the internet crawlers took care of the rest, indexing the photos complete with facial recognition & identification. Once that was complete, a simple search revealed pictures even her parents didn’t know existed.

Facial recognition technologies in fact already exist, though they are not widely used quite yet. Governments are obviously beginning to use them for law enforcement, but Facebook & Google are certainly getting into the act too. What’s more, the SeeChange cameras described in the story parallel Google Glass, for example, which is maturing quickly.

Related: Do consultants need to balance conflicts of interest?

3. secrets are a real human need

After Mae begins wearing the SeeChange monocle, everything she does is streamed to an online audience. It begins as an exercise in transparency, but we quickly see the trouble it brings as Mae has no moments of privacy.

In this world, moments of intimacy become shorter & harder to find. And we see then how Mae begins to crave those moments, and they become more precious too.

Check this: Why Oracle won’t kill MySQL

4. monitoring changes our behavior

Much of the monitoring and transparency in the Circle story comes from a new technology called SeeChange, a camera monocle worn around the neck, perhaps paralleling Google Glass that we have all heard of.

Surveillance can surely help prevent crime, or provide evidence after the fact. But one other effect of the technology is in warping people’s natural behavior, as though we are all on a stage, all on camera all the time. In Mae’s case she begins to act for the camera, and those around her do too.

Read: Why devops talent is in short supply

5. how social media warps our sense of time & human scale

Another interesting scene occurs when Mae follows up with a friend via text. Her friend doesn’t respond, so she sends along another text a few minutes later asking if “everything is ok”. By Mae’s fourth & fifth messages, she’s sure her friend has been kidnapped, and by the tenth message she’s just angry and declares their friendship over! All this in the span of 25 minutes.

I think Eggers uses a sort of extreme example, but really to illustrate an important point. In the world of always-on communication, these types of misunderstandings are more and more common. Our sense of time changes, and we may feel that others are in slow motion.

Also: Are SQL databases dead?


If you use MySQL in the Amazon cloud, you need to ask yourself this question


Are you serious about backups?

If you’re just using Amazon EBS snapshots, that may not be sufficient. There’s a good chance it won’t protect you against your next data loss.

That’s why I like to have a few different types of backups.

Also: 5 more things deadly to scalability

Protect against operator error

mysqldump is a tool every DBA is familiar with. Same as a hotbackup or snapshot you say? Just more labor? Not true.

A dump allows you to restore one table, or one schema. That’s why they’re also known as logical backups. What’s more you can edit the file, remove indexes, change object names, or datatypes. All these can be essential in the screwy and unpredictable event of a real world outage.
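As a rough sketch of what that single-table restore can look like, here are two small shell functions. The names `dump_table` and `restore_table`, the `MYSQLDUMP` and `MYSQL` override variables, and the example schema are all assumptions for illustration. They also assume your credentials come from a file like ~/.my.cnf rather than the command line.

```shell
#!/bin/bash
# Client binaries may be overridden; the assumed defaults are the stock
# tools on your PATH.
MYSQLDUMP=${MYSQLDUMP:-mysqldump}
MYSQL=${MYSQL:-mysql}

# dump_table <database> <table> <outfile>
# Writes a logical (plain SQL) backup of a single table.
# --single-transaction takes a consistent snapshot of InnoDB tables
# without locking them.
dump_table() {
    local db="$1" table="$2" outfile="$3"
    "$MYSQLDUMP" --single-transaction "$db" "$table" > "$outfile"
}

# restore_table <database> <dumpfile>
# Replays the dump into the named database. By default the dump contains
# DROP TABLE IF EXISTS, so an existing table of that name is replaced.
restore_table() {
    local db="$1" dumpfile="$2"
    "$MYSQL" "$db" < "$dumpfile"
}
```

A cautious pattern is to restore into a scratch schema first, for example `restore_table scratch /tmp/orders.sql`, and then copy back only the rows you actually lost.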

Expect the unexpected!

Read: Why devops talent is in short supply

Test those backups regularly

If you haven’t actually tried to restore, you really don’t know if you have everything. Did you back up stored procedures & database code? How about grants? Database events? How about cronjobs? What about the my.cnf file? And your replication configuration?
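Here is a hedged sketch of collecting several of those pieces in one pass. The function name `backup_everything`, the override variables and the default /etc/my.cnf path are assumptions, so adjust them for your own box. It does not capture replication configuration, which you would still need to record separately.

```shell
#!/bin/bash
# Tool names may be overridden; the assumed defaults are the stock tools
# on your PATH (pt-show-grants ships with Percona Toolkit).
MYSQLDUMP=${MYSQLDUMP:-mysqldump}
PT_SHOW_GRANTS=${PT_SHOW_GRANTS:-pt-show-grants}

# backup_everything <database> <outdir> [my.cnf path]
backup_everything() {
    local db="$1" outdir="$2" mycnf="${3:-/etc/my.cnf}"
    mkdir -p "$outdir" || return 1

    # Data plus stored procedures/functions, events and triggers.
    "$MYSQLDUMP" --single-transaction --routines --events --triggers \
        "$db" > "$outdir/$db.sql" || return 1

    # User accounts and grants as replayable SQL.
    "$PT_SHOW_GRANTS" > "$outdir/grants.sql" || return 1

    # Server configuration, if it is readable at the given path.
    [ -r "$mycnf" ] && cp "$mycnf" "$outdir/my.cnf"

    # Cron jobs for the current user; an empty crontab is not an error here.
    crontab -l > "$outdir/crontab.txt" 2>/dev/null || true
}
```

The real test is still the restore: rebuild a server from the contents of that directory and see what is missing.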

Yes there are a lot of little pieces, and testing your backups by rebuilding everything is an attempt to poke holes in your plan, and hit issues before d-day!

Related: MySQL interview guide for managers and candidates alike

Replication isn’t a backup

Replication is getting better and better in MySQL. It used to fail regularly. MyISAM was very unpredictable. But even in the comfortable realm of InnoDB, there can still be data drift. If you’re on MySQL 5.0 or 5.1, you should consider performing regular checksums. These test the integrity of the data by comparing what’s actually on the master & the slave. See: Bulletproofing MySQL replication with checksums.
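One common way to run those checksums is pt-table-checksum from the Percona Toolkit. What follows is only a sketch: the function name `checksum_replication` and the example hostname are made up, and it assumes credentials are picked up from ~/.my.cnf.

```shell
#!/bin/bash
# Tool name may be overridden; the assumed default is pt-table-checksum
# on your PATH.
PT_TABLE_CHECKSUM=${PT_TABLE_CHECKSUM:-pt-table-checksum}

# checksum_replication <master_host>
# Runs the checksum queries on the master. The results are written to the
# percona.checksums table and replicate to the slaves, where each slave
# computes its own values for comparison.
checksum_replication() {
    local host="$1"
    "$PT_TABLE_CHECKSUM" --replicate=percona.checksums "h=$host"
}

# Example (hypothetical host):
#   checksum_replication db-master.example.com || echo "check replication"
```

Treat a non-zero exit status as a cue to look closer at the report, not as proof of drift on its own.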

Read: Why high availability is so very hard to deliver

Have you considered security around your backup files?

While you’re thinking about backups, make sure the files themselves are secure. Remember they contain your crown jewels. Hopefully individual data that’s sensitive is encrypted, but still you should secure their final resting place as well.

If you’re using S3, consider encrypting the file before shipping it up to the bucket.
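A minimal sketch of that encrypt-then-upload step, using openssl for symmetric encryption and the aws command line tool for the copy. The function name `encrypt_and_upload`, the override variables, the passphrase file and the bucket name are all assumptions for illustration.

```shell
#!/bin/bash
# Tool names may be overridden; the assumed defaults are the stock tools
# on your PATH.
OPENSSL=${OPENSSL:-openssl}
AWS=${AWS:-aws}

# encrypt_and_upload <backup file> <passphrase file> <bucket>
encrypt_and_upload() {
    local file="$1" passfile="$2" bucket="$3"

    # Symmetric AES-256 encryption, with the passphrase read from a file.
    # The -pbkdf2 option needs OpenSSL 1.1.1 or newer.
    "$OPENSSL" enc -aes-256-cbc -salt -pbkdf2 \
        -in "$file" -out "$file.enc" -pass "file:$passfile" || return 1

    # Ship only the encrypted copy to the bucket.
    "$AWS" s3 cp "$file.enc" "s3://$bucket/"
}

# Example (hypothetical paths and bucket):
#   encrypt_and_upload /backups/db.sql.gz /root/.backup_pass my-backup-bucket
# To decrypt later:
#   openssl enc -d -aes-256-cbc -pbkdf2 -in db.sql.gz.enc -out db.sql.gz -pass file:/root/.backup_pass
```

Keep the passphrase file somewhere other than the bucket, and make sure the decrypt step is part of your restore tests.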

Read this: Why a four letter word divides dev and ops


How to increase newsletter signup conversions with a nifty iPhone trick

If you’re like me & spending a lot of time on twitter, I hope you’re also seeing the traffic growth I’m seeing. I’m sharing a stream of posts using hootsuite, then actively engaging with journalists, VCs, startups & technology experts.

That’s all great, and I’m finding more and more it’s a good use of my time.

Recently I started using a cool iPhone feature to let followers know about my newsletter. It’s called a shortcut.

Have you ever mistyped a word on iOS? It then offers up the correct spelling. Through this same mechanism, there is an awesome way to quickly type anything. Use a two or three character shortcut to type a paragraph.

Take a look, here’s what I mean.

1. Click through to Settings->General->Keyboard

Open your iPhone settings, and navigate through General, and then Keyboard.

[Screenshot: keyboard tab]

Also: Why you should track your time on social media

2. Find the Shortcuts tab

Navigate until you find shortcuts. It should look like this:

[Screenshot: shortcuts tab]

Read: Do managers underestimate operational costs?

3. Create a shortcut

Add a new shortcut with the plus button.

[Screenshot: create shortcut]

Phrase: “u may also like my newsletter http://iheavy.com/signup-scalable-startups-newsletter”

Shortcut: mytest

[Screenshot: edit shortcut]

Related: When I had to take the fall

4. Use your new shortcut on twitter

Responding to a new follower, or in a dialog with a journalist? In a response somewhere along the way, type your shortcut, for example “dyo”. Just like a typo correction, you’ll see iOS offer you a completion, the full text you want to use. Tap space to accept it.

[Screenshot: use shortcut]

Check this: Why a killer title can make or break your content efforts

5. Post it periodically using trending hashtags

Open twitter & click timelines->discover

Click View more trending…

Scroll through for related topics. For me that means anything technology, startup, scalability, devops, venture, founder or database related. I’ll use that word, hashtag or phrase.

(BONUS) Create four or five shortcut variations

Nobody wants to see the same thing repeated over and over. So create a few variations. Mix it up a bit.

I’m seeing a huge conversion rate on these. I haven’t measured yet (not sure how), but anecdotally I’d say it’s in the 30-50% range. In other words, if I mention my newsletter to 10 people during the day on twitter, I get about 3-5 new signups. Compare this to one newsletter signup per day, passively through my blog.

By directly imploring people to sign up, you bring it front and center to their already busy & distracted attention. It works!

Read: Is scaling automatic in the cloud?


5 Things I learned about bitcoin from Chris Dixon, Balaji Srinivasan & a16z

I’ve avoided the bitcoin hype for long enough. I’ve watched a bit on the periphery, but recently been doing a bit more research. Then I bumped into the new Andreessen Horowitz podcast, and got a crash course on it!


http://blog.pmarca.com/2014/01/22/why-bitcoin-matters/

1. Goldman Sachs has taken notice

Want proof that Bitcoin isn’t just for geeks? Goldman has released a report and they have real interest.

Specifically Goldman identified the potential for 210 billion dollars in savings in payments that Bitcoin could bring. That’s billion with a “B” and serious opportunity for disruption!

Also: 5 cloud ideas that aren’t actually true

2. Solves online trust problem

There are many who feel Bitcoin doesn’t have potential as a currency. But even those folks feel its underlying technology could solve a big problem with online payments, the general ledger problem.

When you want to send digital things, whether a signature, contract, keys or currency, you need a way to establish trust between people. Bitcoin solves this with its technical-sounding “block chain”, which serves as a sort of internet notary public. Anyone can check the status of a transaction on this common general ledger, without fear of compromise, double entries or theft.

For more in-depth discussion, check out Bitcoin & the Byzantine Generals problem. It explains the general ledger aka the block chain in a lot more detail.

Related: Are SQL databases dying out?

3. Better digital wallets

Although bitcoin wallets are currently banned on the iPhone App Store, the potential there is huge. There still isn’t a good digital wallet solution, and bitcoin sits nicely in that space.

Bitcoin is more a platform, and a set of protocols, a new digital infrastructure that solves a lot of big problems online. As new apps are built on top of it, they abstract away the technical complexity, providing day-to-day

Read this: 8 questions to ask a cloud expert

4. Store of value for Greece & Cyprus

Citizens of distressed countries can face the fear of their savings eroding away. That can happen rather quickly as we’ve seen in Greece & Cyprus. Savings in Bitcoin presents an alternate currency within which one could place some of their savings. Since it’s not controlled by any government or power, it provides a hedge against such fears.

Check this: Why Oracle won’t kill MySQL

5. Say goodbye to inflation

Fiat currency, as it’s known, is the currency we live with today. It’s the post-gold-standard currency, where the Federal Reserve controls the money supply. Quantitative easing, aka printing money, is the lever the Fed uses to keep a small, steady inflation on the money supply.

With the gold standard before it, and potentially through something like Bitcoin, you eliminate the government meddling, and inflation along with it. Some argue this would reduce or even eliminate the so-called moral hazard in the present system. With the gold standard, large & systemic firms cannot be bailed out, so they have a huge incentive to behave prudently, or fail.

Read: Why AirBNB didn’t have to fail


5 cloud ideas that aren’t actually true


Cloud computing is ushering us into a wonderful era where computing can be bought in small increments, like a utility. This changes the whole way we plan and manage budgets, and it accelerates startups, making them more agile.

But it’s not all wine & roses up there. I’ve heard a few refrains from clients over the years, and thought I’d share some of the most common.

1. Scaling is automatic

Rather recently I was working with a client on building some sophisticated reports. They needed to slice & dice customer data, over various time series, and summarize with invoices & tracking data. Unfortunately their dataset was large, in the half terabyte range.


Client: Can we just load all this data into the cloud?
Me: Yes, we can do that. A system built in Amazon’s public cloud can support large datasets.
Client: I want it to scale easily. So we won’t have these slow reports. And as we add data, it’ll just manage it easily for us.
Me: Well it’s a little bit more complicated than that, unfortunately.

Unfortunately this is a rather familiar conversation that I have quite often. A lot of the press around cloud scalability centers on auto-scaling, Amazon’s renowned & superb virtualization feature. Yes, it’s true you can roll out webservers to scale out this way, but that’s not the end of the story. Typically web applications have a lot of components, from caching servers, to search servers, and of course their backend datastore.

But can we scrap our relational database, such as MySQL, and go with one that scales out of the box like Riak, Cassandra or DynamoDB?

Those NoSQL solutions are built to be distributed from the start, it’s true. And they lend themselves to that type of architecture. However, if you’ve built up a dataset in MySQL or Oracle, and more so an application around that, you’ll have to migrate data into the NoSQL solution. That process will take some time.

Like teaching a fish to fly, it may take some time. They do well in water, but evolution takes a bit longer.

Related: RDS or MySQL 10 use cases

2. Disaster recovery is free

In the traditional datacenter, when you want DR, you set up a parallel environment. Hopefully not in the same room, same city or even the same coast. Preferably you do so in a different region. What you can’t get around is dishing out cash for that second datacenter. You need the servers, just in case.

In the cloud, things are different. That’s why we’re here, right? In Amazon you have regions already set up & available for plug-n-play use. Set up your various components, servers & software, and configure them. Once you’ve verified you can fail over to the parallel environment, you can just turn off all those instances. Great, no big charges for all that iron that you’d pay for to keep the rooms warm in an old-school datacenter. Or are there?

As it turns out, since you don’t have this environment running all the time, you’ll want to test it more often, and run fire drills to bring the servers back online. That’ll incur some costs in terms of manpower. You’ll also want to include some scripts to start those servers up, and/or some detailed documentation on how to do that. And don’t lose that documentation either, will you?

You may also want to build some infrastructure as code unit tests. Things change, code checkouts evolve, especially in the agile & continuous integration world. Devops beware!

Read this: Why a killer title can make or break your content efforts

3. Machines are fast

Fast, fast, fast. That’s what we expect, things keep getting faster, right? Hard to believe then that the world of computing took a big step backward when it jumped into the cloud. Something similar happened when we jumped to commodity Linux a decade ago.

In amazon, it’s a multi-tenant world. And just like apartment buildings, popular restaurants, or busy highways you must share. When things are quiet you may have the road to yourself, but it’ll never be as quiet as a dirt road in the country!

Amazon is making big strides though. They now offer memory optimized & storage optimized instances. And an even bigger development is the addition of the most important feature for performance & scalability. That said the network & EBS can still be a real bottleneck.

Also: What is a relational database & why is it important?

4. Backups aren’t necessary

I’ve experienced a few horror stories over the years. I wrote about one noteworthy one in “When fat fingers take down your business”.

True, EBS snapshots make backing up your whole server, well, a snap! That said, a few extra steps have to happen (flush the filesystem & lock tables) to make this work for a relational database like MySQL or Oracle. And suddenly you have a verification step that you also need to perform. You see, no backups are valid until they’ve been restored, remember?

But even with these wonderful disk snapshots, you’ll still want to do database dumps, and perhaps table dumps. Operator error, deleting the wrong data, or dropping the wrong tables, will always be a risk. Ignore backups at your own peril!

Check this: Why CTOs underestimate operational costs

5. Outages won’t happen

In an ideal world, everything is redundant, and outages will be a thing of the past. We’ll finally reach five nines uptime and devops everywhere will be out of work. :)

It’s true that Amazon provides all the components to build redundancy into your architecture, and very cutting-edge firms that have taken Netflix’s approach with Chaos Monkey are seeing big improvements here. But AirBNB did fail, and at root it was an Amazon outage that shouldn’t ever happen.

Read: Why Oracle won’t kill MySQL
