Category Archives: Startups

Best of Startup Content on Scalable Startups

Join 28,000 others and follow Sean Hull on twitter @hullsean.

Costs

Techops costs can involve short-term architectural decisions, but what about the longer-term effects of those choices? Do CTOs underestimate operational costs?

A stack of…

These days the full stack of an internet or mobile startup involves a lot of varied components, from Chef, Puppet & Ansible, to Nginx, haproxy, redis, solr and some database like MySQL or Postgres on the relational end of the spectrum, or Mongodb, Hbase or Cassandra on the NoSQL side. What type of challenges does this pose to a team? I’m curious: Do startups assemble at their own risk?

Most used tech

Leo Polovets ran some stats over the AngelList data on startups. He wanted to know: Which tech do startups use most? I summarized the results.

Death of ops?

These days, with all the talk of automation, I’ve heard developers & even CTOs argue that there is a diminishing need for backend administrators. Do startups still need techops?

Speed as a feature

Is Fred Wilson right to say speed is a feature? What does this mean for those migrating or already running in the cloud? How does scalability come into play?

Avoiding outages

Are many outages avoidable? Did Airbnb have to fail?

Performance Review

Reviewing architecture & site speed is a type of engagement that a lot of startups can benefit from. Here’s my Anatomy of a performance review.

Let things fail

Does it sometimes make sense to let things break a little? A tale of managed failure.

Young founders

I worked at one startup with a CTO just out of college. Although they were flush with cash & had real problems scaling, communication problems ultimately soured the engagement. Are you too young to be a boss?

80 million fix

Sometimes fixing serious performance bottlenecks can get a site back up on its feet. In this success story they went on to get acquired weeks after the fix. In tongue-in-cheek fashion I ask Where’s my 80 million dollars?

CTO’s should never do

There are times to get into the trenches. But what if it sacrifices leadership? What should a CTO never do?

Startups too cool for school

Joining YC but have no ideas? No problem. Is my startup too cool for your business school?

Instant business, just add water

Can a business be built in just a weekend? Is there a problem with startup bootcamps?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Today’s startups: assemble at your own risk

I was talking with Todd Hoff recently over at High Scalability about a trend I’ve seen of late.

ME: I really liked this post by Zoli Kahan from Clay.io.  AWS, cloudflare, docker, haproxy, mysql, mongo, memcache, ansible.  They use just about every technology being talked about these days.  

Todd: Yah, that’s why I asked to republish it. I thought it was a good updated sampler stack.

ME: That said I defy you to find a team that actually *KNOWS* all those technologies.  

Todd: Agreed. Systems are a lot of assembly these days, which doesn’t mean we know how to build the parts being assembled.

The article I was referring to was: How Clay.io Built their 10x Arch Using AWS, Docker, HAProxy & Lots More

1. Dizzying array of technologies in use

I’ve been working with startups since the mid-nineties. In those days most application stacks consisted of a PHP application running on Apache, with Oracle on the backend. Both webserver & db ran on Sun Solaris. Hardware was reliable. Most attention was focused on fitting everything in memory, and monitoring the servers for swapping, and disk failure. Boy have those days changed.

I see dozens of startups each year, so I see a lot of very cutting-edge environments. Here’s a peek at what I’m seeing these days:

Database: MySQL, Postgres & Oracle, to Mongodb, Cassandra & Couchbase

Caching: Memcache or Redis

Search: Solr

Webservers: Apache, Nginx, Lighttpd

Load balancers: haproxy, Zen

Languages: PHP, Python & Ruby

Publishing: Drupal, WordPress, Joomla

Continuous Integration: Jenkins

Metrics: Cacti, collectd, NewRelic

Monitoring: Nagios, Ganglia, Munin, OpenNMS

Automation: Ansible, Chef, Puppet, Docker & Vagrant

Logs: Logstash

DDOS & CDN: Cloudflare, UltraDNS

Whew… That’s a long list!! And we’re not even considering the APIs that many applications are now building on.

Also: Are generalists better at scaling the web?

2. Shortcuts abound

Early on, startups don’t have enough working capital to hire a huge engineering team. So that means everyone is stretched. With a list of technologies that is ever growing, something’s gotta give.

They may cut corners by handing the web & technical operations work to a developer who has some skills. But I continue to ask… Does a four-letter word divide dev & ops?

Read: Which tech do startups use most?

3. More things to break & master

Ownership of a software stack, such as a database means mastery of…

o features in current versions
o bugs of current versions
o vulnerabilities of various versions
o troubleshooting
o best practices
o backup & reliability

For example, at a lot of shops where I dig into the database, I find low-hanging fruit, such as misconfigured startup settings, table layout or index usage.

I see similar things when a networking expert pores over the haproxy configuration, or runs ping tests across the network. Most of these components are set up with fairly vanilla configurations, leaving loose ends and frayed threads.

Check out: Why I can’t raise the bar at every firm

4. Many startups carrying technical debt

I’ve seen a growing reliance on ORMs, which is worrying. Build your foundation on a crutch, and it gets very hard to eliminate down the line. Here are Ward Cunningham’s warnings on technical debt.

Related: Are SQL Databases Dead?

5. Long term support & viability

At one five-year-old firm, I was brought in to address scalability problems. I met with the team and was asked to provide a comprehensive review. The first thing I found was that all the original engineers had long since left, so the code was new for everyone. As I dug in, I found multiple versions of Apache along with Nginx on some other servers. Their stack was built on a patchwork of Python, Ruby & PHP. Then digging in further, we found a complicated web of dependencies for digital assets, mounted across servers & unmonitored.

Lack of standards is common in environments like these. Without an operational or architectural lead, developers are left to make decisions with what is directly in front of them. Though a decision of what language to use may appear simple at the outset, it carries long term consequences.

Will that language or technology be supported in five years? Will the community survive? Will your firm be able to hire people with that skill set? Will engineers still be excited about it?

See also: Is high availability overrated? Is five nines a myth?

When prospects mislead

While a story is fresh in one’s mind, it’s a great time to tell it. And so I set out to put pen to paper about a recent consulting war story.

A financial services firm reached out to me, asking about services. We discussed the project plan, and the day after the call I sent along a quote. I suggested three options: a weekly fee, a monthly one, or monthly with advance payment.

They decided to go with option C, and we arranged a kickoff meeting.

1. Level setting on trust

I’ve done this kind of work for so long, and worked with so many clients over the years, that it sometimes becomes second nature. I arrived, and we chatted amicably. I asked him about his Wikipedia page, which he seemed excited to talk about.

I was surprised that there wasn’t a check ready, as we had decided on advance payment in full, but I didn’t mention it right away. He then tried to dial in his partner, but that just went to voicemail. So we continued the meeting without him.

I don’t know how important the meeting was to both team members, but they were both on the invite & emails. His partner never called back through the meeting either.

Read this: When migrating from Oracle to MySQL Prepare to Bushwack

2. Negotiation is part art & part dance

Interestingly I had met up with some colleagues the night before over Italian food. I mentioned I was meeting a new prospect the next day, but had reservations about whether they had really decided to hire me, or were just still prospecting.

So during the meeting I was somewhat conscious of that question. Are we already in exploratory, discovery mode? Has the project even begun? That’s a question, and from what I sensed it was still an open one.

As the meeting wore on, questions about Oracle licenses, versions, and EC2 configurations came up. Furious note taking continued.

Related: Which tech do startups use most?

3. Time & mismanagement

One thing that comes up for me in these situations is questions of time management. In order to work with a new client, I must clear my schedule, and make time available. That has a value to start with. When it turns out a project isn’t actually ready yet, it becomes an awkward stumble out of the gates.

Also: Is automation killing the sysadmin job role?

4. Can you research this one thing

As I raised various concerns about Oracle, the data loader portion, and unknowns around how that software worked, the prospect asked if I could do a little research for them.

This is where things started to crack. Rather than answer the question, I made a more aggressive nod to the question on my mind: Have we really started on this project yet? I explained that I was confused, and had gathered from our email that this was a kickoff meeting. The tension in the air rose noticeably.

He then explained “Well we’re still waiting to hear back from a vendor about XYZ”. From there I began to gather up my things.

Check this: What can fashion week teach Chad Dickerson about Net Neutrality?

5. Watch out for those Rothkos

As I stand up I comment on the digs. “Is this shared office space, those look like Rothkos?” I ask. “Nope this is all ours, my wife is a collector & art dealer. We have some real Warhols too”. “Wow…”, I respond, “tough business to be in!”. With that he says “Well it is very volatile, we can be out of business in a month.”

My takeaway here isn’t to be wary of all new prospects. Each person or business has their own *style* of doing business. Rather, until you’ve established trust with a new client, consider that you may not yet be working on the project at all.

And with that the dance continues. While you may wish to demonstrate and illustrate your knowledge, and the solutions you’d recommend, beware of solving the problem before you’re even hired!

Read: Are SQL Databases Dead?

What can NYFW teach Chad Dickerson about net neutrality?

Here we are again discussing Net Neutrality… Chad Dickerson, CEO of the well-renowned Etsy.com, has come out strongly in favor, and wants everyone to take action.

Honestly when I read his Wired piece Etsy CEO to businesses: If Net Neutrality Perishes, We Will Too, I was struck by one statement:

The FCC proposal will threaten *ANY* business that uses the internet to reach its customers.

Any business? Quite a sweeping statement. Strikes fear into me that’s for sure… And if you read through the comments, the debate is equally fierce. One side says net neutrality is socialism! The other side says anyone against net neutrality is a shill for Comcast or Verizon! Battle lines drawn!

1. Are all businesses at risk?

Isn’t the idea that Etsy will perish overstated? Are they a high bandwidth company? Are they trying to stream video? Is the entire Etsy community alarmed? Isn’t that a rather broad statement?

To be sure, ending net neutrality will impact some businesses. Perhaps one reason VCs like Fred Wilson are so concerned about Net Neutrality isn’t the freedom of millions of internet users, but the threat to disruptive businesses, the startups that VCs directly invest in.

Read: Which tech do startups use most?

2. Will all internet users be impacted?

Here again some of this debate seems overstated. I remember using the internet on a dialup modem. 300 baud was about the speed at which you can type. Then along came 14.4, 28k and upward speeds climbed. All the while the internet was usable. Could I do all the things I can today? Nope.

Even if these horrible Comcasts & Verizons reduce speeds by 100 times, they will still be plenty fast for most internet users. Sure streaming video would be impacted, and yes streaming music would be impacted. But for end users, I would argue most would not be impacted. It is rather the disruptive startups & businesses that would be most impacted.

Also: Is automation killing old-school operations?

3. Are there anti-EDU parallels

In the mid-nineties, before the dot-com bubble, there was a huge raging debate about even having commercial entities on the internet at all. Enlightened internet cognoscenti considered it an abomination.

But the real world pushed its nose in, and today we take it as a given.

Check this: Is Hunter Walk right about operations & startups?

4. Is Google right about millisecond delays?

“Research from Google & Microsoft shows that delays of milliseconds result in fewer page views and fewer sales in both the short & long term”. Yep, that’s a fact. The research shows this. But what do we take away from that?

As a performance and scalability consultant I see a *TON* of websites that have huge delays, well over the tiny millisecond ones that Google frets over. Internet startups struggle with performance every day.

What’s the irony? Slowdowns that Comcast or Verizon might introduce to end users pale in comparison with these larger systemic problems.

Also: 5 Ways startups misstep on scalability

5. Any lessons from sites of New York Fashion Week?

I like the Pingdom speed test tool. I used it to track the speed of some of the websites & blogs that are big for NYFW. Here’s what I found:

[Image: NYFW speed test results]

What do you see? Take a look at the SIZE column. Notice something strange? The LARGEST sites, in terms of images, css & assets aren’t necessarily the SLOWEST! That’s a funny result if you consider net neutrality. If you think the network speed is the same for all websites, shouldn’t the smallest pages load fastest?

Not true at all. It’s a very simplistic way of viewing things. Fashionista.com for example is doing a ton of tuning behind the scenes. As you can see it is making their site far and away the fastest! Network bandwidth and net neutrality be damned!

Related: Are SQL Databases Dead?

Which tech do startups use most?

Leo Polovets of Susa Ventures publishes an excellent blog called Coding VC. There you can find some excellent posts, such as pitches by analogy, an algorithm for seed round valuations, and analyzing Product Hunt data.

He recently wrote a blog post about a topic near and dear to my heart, Which Technologies do Startups Use. It’s worth a look.

One thing to keep in mind looking over the data, is that these are AngelList startups. So that’s not a cross section of all startups, nor does it cover more mature companies either.

In my experience startups can get it right by starting fresh, evaluating the spectrum of new technologies out there, balancing sheer solution power with a bit of prudence and long term thinking.

I like to ask these questions:

o Which technologies are fast & high performance?
o Which technologies have a big, vibrant & robust community?
o Which technologies can I find plenty of engineers to support?
o Which technologies have low operational overhead?
o Which technologies have low development overhead?

1. Database: MySQL

MySQL holds a slight lead according to the AngelList data. In my experience it’s not overly complex to set up and there are some experienced DBAs out there. That said, database expertise can still be hard to find.

We hear a lot about MongoDB these days, and it is surely growing in popularity. Although it doesn’t support joins and arbitrary slicing and dicing of data, it is a very powerful database engine. If your application needs more straightforward data access, it can bring you amazing speed improvements.

Postgres is a close third. It’s a very sophisticated database engine. Although it may have a smaller community than MySQL, overall it’s a more full featured database. I’d have no reservations recommending it.

Also: Top MySQL DBA Interview questions

2. Hosting: Amazon

Amazon Web Services is obviously the giant in the room. They’re big, they’re cheap, they’re nimble. You have a lot of options for server types, they’ve fixed many of the problems around disk I/O and so forth. Although you may still experience latency around multi-tenant related problems, you’ll benefit from a truly global reach, and huge cost savings from the volume of customers they support.

Heroku is included although they’re a different type of service. In some sense their offering is one part operations team & one part automation. Yes ultimately you are getting hosting & virtualization, but some things are tied down. Amazon RDS provides some parallels here. I wrote Is Amazon RDS hard to manage?. Long term you’re likely going to switch to an AWS, Joyent or Rackspace for real scale.

I was surprised to see Azure on the list at all here, as I rarely see startups build on Microsoft technologies. It may work for the desktop & office, but it’s not the right choice for the datacenter.

Read: Are generalists better at scaling the web?

3. Languages: JavaScript

JavaScript & Node.js are clearly very popular. They are also highly scalable.

In my experience I see a lot of PHP & of course Ruby too. Java, although there is a lot of it out there, can be a bear as a web dev language, and brings some additional complication, weight and overhead.

Related: Is Hunter Walk right about operations & startups?

4. Search: Elasticsearch

I like that they broke apart search technology as a separate category. It is a key component of most web applications, and I do see a lot of Elasticsearch & Solr.

That said I think this may be a bit skewed. I think by far the number one solution would be NO SPECIFIC SEARCH technology. That’s right, many times devs choose a database-centric approach, like FULLTEXT or others that perform painfully badly.

If this is you, consider these search solutions. They will bring you huge performance gains.

Check this: Are SQL Databases Dead?

5. Automation: Chef

As with search above, I’d argue the #1 trend by far is to use none of these automation technologies.

Although I do think Chef, Docker & Puppet can bring you real benefits, it’s a matter of having them in the right hands. Do you have an operations team that is comfortable with using them? When they leave in a year’s time, will your new devops also know the technology you’re using? Can you find a good balance between automation & manual configuration, and document accordingly?

Read: Why are database & operations experts so hard to find?

Is Hunter Walk right about operations & startups?

The.Rohit - Flickr

Hunter Walk blogged recently about the importance of building great operations teams. And while he was speaking primarily about business operations, the startup technical operations teams are equally difficult to get right.

1. performance & scalability

As your startup grows like Birchbox, your customer growth curve may begin to look like a hockey stick. That’s a good problem to have. Will your web application be able to keep up with the onslaught of traffic those customers bring?

Getting performance and scalability just right will mean fewer site crashes during those key moments when all eyes are on your site.

Also: Is top operations talent hard to find?

2. Operations is key to architecture

Developers will always have strong opinions on architecture. However they may be heavily influenced by their own mandate, features, deliverability & deadlines. So it’s no surprise that they may sometimes choose to build on ORMs, the middleware brought to you by Hibernate, CakePHP, Active Record & the like.

And while these technologies seem a necessity in today’s modern architectures, they play havoc with your long term scalability. Strong technical operations teams mean a better vision in this area. Heading off your reliance on these technologies will mean managing technical debt before it takes down your company.

Read: Are generalists better at scaling the web?

3. Operations informs strategy

Did you build in those operational switches to turn off the heaviest code, when your site gets overloaded? Operations strategy can help you see these problems on the horizon before they overwhelm you.

Have you considered building a browse only mode for your site? If you’ve ever visited Facebook or Yelp after hours you may have been greeted with the message “We can’t save your comments. Please try again later”. A small innocuous message to end users doesn’t disrupt their enjoyment of the site terribly. But from a technical operations perspective it’s huge. It means teams can perform backups, upgrades and maintenance without interrupting day-to-day activity on the site.
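Here is a minimal sketch of what such switches might look like in code. Everything in it is hypothetical and for illustration only: the `Site` class, the switch names and the placeholder recommendations feature are my own stand-ins, not how Facebook or Yelp actually do it.

```python
class ReadOnlyError(Exception):
    """Raised when a write is attempted while the site is in browse-only mode."""


class Site:
    def __init__(self):
        # Operational switches. In a real deployment these would likely live
        # somewhere ops can flip them without a code deploy (an assumption).
        self.switches = {"browse_only": False, "recommendations": True}
        self.comments = []

    def post_comment(self, text):
        # Writes are refused in browse-only mode; reads keep working.
        if self.switches["browse_only"]:
            raise ReadOnlyError(
                "We can't save your comments. Please try again later."
            )
        self.comments.append(text)

    def render_page(self):
        page = {"comments": list(self.comments)}
        # The heaviest code path can be switched off when the site is overloaded.
        if self.switches["recommendations"]:
            page["recommendations"] = self.expensive_recommendations()
        return page

    def expensive_recommendations(self):
        # Placeholder for a costly query or computation.
        return ["placeholder recommendation"]
```

Flip `browse_only` on before a backup or upgrade and visitors can still read every page, while flipping `recommendations` off sheds the heaviest work during a traffic spike.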

Related: Is scalability a big business?

4. Operations means resilience

We only learn real disaster recovery lessons from storms like Sandy. That’s because resilience is highlighted best when it is a real & urgent need.

In technical operations, getting backups right & testing your recovery plan all form key steps in your path to excellence. Get them right before you need them, and ensure repeatability.

Read: Is high availability a real possibility?

5. Operations means technical strength

At the end of the day, getting technical operations right means you can move from strength to strength. It means building on a solid foundation like those of Google, Facebook, Foursquare & Etsy. It means you can evolve & grow with your customers, and meet their needs confidently.

Check out: Do startup CEO’s underestimate operational costs?

Are startup CEO’s hiding their scalability problems?

Your site is running fine, right? You have 1000 customers, and it usually runs smoothly. Just this one lingering question: why does it take five high performance EC2 instances to run the database, all on flash drives? Good question!

The truth is, one of the highest-trafficked sites I managed pulled in 100 million uniques a month, and only used three backend databases. That site was one of those wildly popular celebrity gossip sites, the ultimate guilty pleasure when you’re at the office and can’t watch reality tv!

Snickers aside, this is huge traffic. And all of the above was built on Drupal, with no ORM in the mix. It could even run, albeit noticeably slower, while memcache was disabled.

1. Servers with solid state drives

I’m very excited to see Amazon introduce servers with SSD drives. They can bring you 100x improvement of disk I/O, and that my friends is the be-all and end-all for databases. So why complain?

If you deploy on these boxes right out of the gates, it may be like using a crutch. You become dependent on it, and ignore real performance tuning. Solid state drives still won’t obviate that ORM middleware you’re using.

Also: Do managers & CEO’s underestimate operational costs?

2. Memcache saving your bad queries

Memcache is also a powerful tool. It sits between the database and your webservers, reducing load on the database by as much as 10x. That’s a great way to get better response time, and reduce drag on your db tier. But it’s still worthwhile performance tuning without it.

Why? If you can get your site to run without caching, it will run blazingly fast *with* it. Don’t use it as a crutch, use it as rocket fuel for your well tuned site.
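To make that concrete, here is a small sketch of the cache-in-front-of-the-database pattern. A plain dict stands in for memcache, and the `QueryCache` class and its `run_query` callback are hypothetical names of my own. The `db_calls` counter is there so you can test the same code with the cache switched off and see what your database would really face.

```python
class QueryCache:
    """Cache-aside wrapper: check the cache first, fall back to the database."""

    def __init__(self, run_query, enabled=True):
        self.run_query = run_query   # function that actually hits the database
        self.enabled = enabled       # set False to tune & test without caching
        self.store = {}              # plain dict standing in for memcache
        self.db_calls = 0            # how many times we really hit the database

    def get(self, key):
        if self.enabled and key in self.store:
            return self.store[key]
        # Cache miss, or caching disabled: go to the database.
        self.db_calls += 1
        value = self.run_query(key)
        if self.enabled:
            self.store[key] = value
        return value
```

Run your test traffic through it once with `enabled=False` and once with `enabled=True`. If the site only survives the second run, the cache is your crutch.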

Read this: Do startups need techops?

3. A legion of read slaves

I’ve seen smaller sites using a ton of read slaves, all of them deployed to cover up slow & redundant queries pouring out of an ORM middleware layer, in this case CakePHP.

Again, read slaves are great, but tune & test with less hardware, and get the performance up the hard way. With elbow grease!
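For anyone who hasn’t seen the pattern, here is a rough sketch of how reads get fanned out across replicas while writes stay on the primary. The `ReadWriteRouter` class and the server names are hypothetical. A real setup would hand back connections rather than names, and would also have to deal with replication lag.

```python
import itertools


class ReadWriteRouter:
    """Send SELECTs round-robin to read replicas, everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads simply fall back to the primary.
        self._readers = itertools.cycle(replicas or [primary])

    def route(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT":
            return next(self._readers)
        return self.primary
```

Notice what this does not do: it doesn’t make a single query faster. Every slow or redundant query still runs, just on more hardware.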

Related: Howto automate MySQL query analysis with Amazon RDS

4. Really really big memory

64G, 128G, 256G of main memory? If I wax on about the days when you’d get excited by 64k, I’ll sound like an old timer. But with those extreme limitations, you had to write tight code. Otherwise it just wouldn’t do anything.

Really really big memory of today’s servers allows us to get lazy. I hear developers say “Hey, the database is 10G of data, and we have 64G main memory, so the whole thing will fit in memory. Problem solved!”

Duhhh… No. Why not? Because you still have to slice and dice that data. You still have to scan through for bits & pieces that aren’t indexed, then sort, and organize that into temporary memory space. In DBA speak, you’re still doing a ton of logical IOs.
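You can see this for yourself with a database that lives entirely in memory. The sketch below uses Python’s built-in sqlite3 module rather than MySQL, and the `orders` table, its columns and the index name are made up for illustration. The exact wording of the plan text varies between SQLite versions.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return "; ".join(row[-1] for row in rows)


# The whole database lives in RAM, so "it all fits in memory" is true here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(10000)],
)

sql = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index the engine still has to walk every row.
plan_without_index = query_plan(conn, sql, (42,))

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index it can jump straight to the matching rows.
plan_with_index = query_plan(conn, sql, (42,))

print(plan_without_index)
print(plan_with_index)
```

The first plan reports a full scan of the table even though nothing ever touches disk. Only the second plan, after the index is added, searches by index.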

Picture it another way. Imagine the days when you’re on horseback, riding across the west. You travel light cause frankly your horse can carry only so much. Then along come cars, and you start loading up the trunk. You add the kitchen sink, and the rear tires are hanging on the ground. All seems fine until you hit a steep mountain, and your car is almost stalling at 20mph. If you had only carried the same load as you did on horseback, you’d be speeding across the country at lightning pace.

Read: Is Amazon RDS hard to manage?

5. Deploying poor code

Deadlines are looming, and new features must be deployed. So performance testing can wait until later. The code works after all.

Been there, done that. Code gets deployed and all of a sudden there are spikes on server load in the evening. Ops & DBA teams are screaming, “Who wrote this code?”.

Load testing should be a part of everyday QA & test. It’s the only way to avoid growing scalability problems.
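A load test doesn’t have to start out fancy. Below is a bare-bones sketch using only the Python standard library. The `load_test` helper, its parameters and the choice of median and 95th percentile are my own assumptions, not a standard tool. You would pass in a function that exercises the code path you care about, such as a page request or a query.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(fn):
    """Run fn once and return how long it took, in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def load_test(fn, requests=50, concurrency=5):
    """Call fn `requests` times across `concurrency` threads and summarize latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(lambda _: timed_call(fn), range(requests)))
    # Simple nearest-rank style 95th percentile on the sorted latencies.
    p95 = latencies[int(0.95 * (len(latencies) - 1))]
    return {
        "count": len(latencies),
        "median": statistics.median(latencies),
        "p95": p95,
        "max": latencies[-1],
    }
```

Wire something like this into the QA run, record the numbers for each build, and a new feature that doubles the p95 shows up before the evening traffic spike does.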

Check this: Are SQL databases dead?

5 Things I learned about bitcoin from Chris Dixon, Balaji Srinivasan & a16z

I’ve avoided the bitcoin hype for long enough. I’ve watched a bit on the periphery, but recently I’ve been doing a bit more research. Then I bumped into the new Andreessen Horowitz podcast, and got a crash course on it!

http://blog.pmarca.com/2014/01/22/why-bitcoin-matters/

1. Goldman Sachs has taken notice

Want proof that Bitcoin isn’t just for geeks? Goldman has released a report and they have real interest.

Specifically Goldman identified the potential for 210 billion dollars in savings in payments that Bitcoin could bring. That’s billion with a “B” and serious opportunity for disruption!

Also: 5 cloud ideas that aren’t actually true

2. Solves online trust problem

There are many who feel Bitcoin doesn’t have potential as a currency. But even those folks feel its underlying technology could solve a big problem with online payments, the general ledger problem.

When you want to send digital things, whether a signature, contract, keys or currency, you need a way to establish trust between people. Bitcoin solves this with its technical-sounding “block chain”, which serves as a sort of internet notary public. Anyone can check the status of a transaction on this common general ledger, without fear of compromise, double entries or theft.

For more in-depth discussion, check out Bitcoin & the Byzantine Generals problem. It explains the general ledger aka the block chain in a lot more detail.
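If you want a feel for why a shared ledger like that is hard to tamper with, here is a toy hash chain. It is only a sketch of the chaining idea and not Bitcoin’s actual protocol: there is no mining, no digital signatures and no network, and the function names and transaction fields are made up for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the very first block


def block_hash(index, prev_hash, transactions):
    """Hash a block's contents together with the hash of the block before it."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, transactions):
    """Append a new block that commits to everything that came before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Anyone holding a copy of the ledger can re-check every link."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["transactions"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True
```

Change a single amount in an old block and its hash no longer matches, so `is_valid` fails for anyone who checks. Repairing that hash breaks the link to every block after it.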

Related: Are SQL databases dying out?

3. Better digital wallets

Although bitcoin wallets are currently banned on the iPhone App Store, the potential there is huge. There still isn’t a good digital wallet solution, and bitcoin sits nicely in that space.

Bitcoin is more a platform, and a set of protocols, a new digital infrastructure that solves a lot of big problems online. As new apps are built on top of it, they abstract away the technical complexity, providing day-to-day

Read this: 8 questions to ask a cloud expert

4. Store of value for Greece & Cyprus

Citizens of distressed countries can face the fear of their savings eroding away. That can happen rather quickly as we’ve seen in Greece & Cyprus. Savings in Bitcoin presents an alternate currency within which one could place some of their savings. Since it’s not controlled by any government or power, it provides a hedge against such fears.

Check this: Why Oracle won’t kill MySQL

5. Say goodbye to inflation

Fiat currency, as it’s known, is the currency we live with today. It’s the post gold standard currency, where the Federal Reserve controls the money supply. Quantitative easing, aka printing money, is the lever the Fed uses to keep a small steady inflation on the money supply.

With the gold standard before it, and potentially through something like Bitcoin, you eliminate the government meddling, and inflation along with it. Some argue this would reduce or even eliminate the so-called moral hazard in the present system. With the gold standard, large & systemic firms cannot be bailed out, so they have a huge incentive to behave prudently, or fail.

Read: Why AirBNB didn’t have to fail

5 cloud ideas that aren’t actually true

Cloud computing is ushering us into a wonderful era where computing can be bought in small increments, like a utility. This changes the whole way we plan and manage budgets, and it accelerates startups, making them more agile.

But it’s not all wine & roses up there. I’ve heard a few refrains from clients over the years, and thought I’d share some of the most common.

1. Scaling is automatic

Rather recently I was working with a client on building some sophisticated reports. They needed to slice & dice customer data, over various time series, and summarize with invoices & tracking data. Unfortunately their dataset was large, in the half terabyte range.


Client: Can we just load all this data into the cloud?
Me: Yes, we can do that. A system built in Amazon’s public cloud can support large datasets.
Client: I want it to scale easily. So we won’t have these slow reports. And as we add data, it’ll just manage it easily for us.
Me: Well it’s a little bit more complicated than that, unfortunately.

Unfortunately this is a conversation that I have quite often. A lot of the press around cloud scalability centers on auto-scaling, Amazon’s renowned & superb virtualization feature. Yes it’s true you can roll out webservers to scale out this way, but that’s not the end of the story. Typically web applications have a lot of components, from caching servers, to search servers, and of course their backend datastore.

But can we scrap our relational database, such as MySQL and go with one that scales out of the box like Riak, Cassandra or Dynamodb?

Those NoSQL solutions are built to be distributed from the start, it’s true. And they lend themselves to that type of architecture. However, if you’ve built up a dataset in MySQL or Oracle, and more so an application around that, you’ll have to migrate data into the NoSQL solution. That process will take some time.

Like teaching a fish to fly, it may take some time. They do well in water, but evolution takes a bit longer.

Related: RDS or MySQL 10 use cases

2. Disaster recovery is free

In the traditional datacenter, when you want DR, you set up a parallel environment. Hopefully not in the same room, same city or same coast even. Preferably you do so in a different region. What you can’t get around is dishing out cash for that second datacenter. You need the servers, just in case.

In the cloud, things are different. That’s why we’re here, right? In Amazon you have regions already set up & available for plug-n-play use. Set up your various components, servers & software, and configure them. Once you’ve verified you can fail over to the parallel environment, you can just turn off all those instances. Great, no big charges for all that iron that you’d pay for to keep the rooms warm in an old-school datacenter. Or are there?

As it turns out, since you don’t have this environment running all the time, you’ll want to test it more often, and run fire drills to bring the servers back online. That’ll incur some costs in terms of manpower. You’ll also want to include in there some scripts to start those servers up, and/or some detailed documentation on how to do that. And don’t lose that documentation either, will you?

You may also want to build some infrastructure as code unit tests. Things change, code checkouts evolve, especially in the agile & continuous integration world. Devops beware!

Read this: Why a killer title can make or break your content efforts

3. Machines are fast

Fast, fast, fast. That’s what we expect, things keep getting faster, right? Hard to believe then that the world of computing took a big step backward when it jumped into the cloud. Something similar happened when we jumped to commodity Linux a decade ago.

In amazon, it’s a multi-tenant world. And just like apartment buildings, popular restaurants, or busy highways you must share. When things are quiet you may have the road to yourself, but it’ll never be as quiet as a dirt road in the country!

Amazon is making big strides though. They now offer memory optimized & storage optimized instances. And an even bigger development is the addition of the most important feature for performance & scalability. That said, the network & EBS can still be a real bottleneck.

Also: What is a relational database & why is it important?

4. Backups aren’t necessary

I’ve experienced a few horror stories over the years. I wrote about one noteworthy case in When fat fingers take down your business.

True, EBS snapshots make backing up your whole server, well, a snap! That said, a few extra steps have to happen (flush the filesystem & lock tables) to make this work for a relational database like MySQL or Oracle. And suddenly you have a verification step that you also need to perform. You see, no backups are valid until they’ve been restored, remember?

But even with these wonderful disk snapshots, you’ll still want to do database dumps, and perhaps table dumps. Operator error, deleting the wrong data, or dropping the wrong tables, will always be a risk. Ignore backups at your own peril!
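To make the "restore it or it doesn’t count" idea concrete, here is a minimal sketch of a dump, restore & verify loop. It uses Python’s built-in sqlite3 module purely as a stand-in for MySQL or Oracle, and the table name and helper functions are hypothetical, made up for illustration. A real check would compare more than row counts.

```python
import sqlite3


def dump_database(conn):
    """Return a logical (SQL text) dump of the whole database."""
    return "\n".join(conn.iterdump())


def restore_database(dump_sql):
    """Load a dump into a fresh in-memory database and return the connection."""
    restored = sqlite3.connect(":memory:")
    restored.executescript(dump_sql)
    return restored


def table_counts(conn):
    """Map each user table name to its row count."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


def verify_backup(source, dump_sql):
    """Restore the dump and compare per-table row counts with the source."""
    restored = restore_database(dump_sql)
    try:
        return table_counts(source) == table_counts(restored)
    finally:
        restored.close()


# Hypothetical demo data standing in for a production database.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount REAL)")
source.executemany(
    "INSERT INTO invoices (amount) VALUES (?)", [(10.0,), (25.5,), (99.0,)]
)
source.commit()

dump_sql = dump_database(source)
backup_ok = verify_backup(source, dump_sql)
```

The point of the design is that the verification runs against a restored copy, never against the dump file itself. A dump that exists on disk but won’t load is exactly the failure you want to catch before the day you need it.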

Check this: Why CTOs underestimate operational costs

5. Outages won’t happen

In an ideal world, everything is redundant, and outages will be a thing of the past. We’ll finally reach five nines uptime and devops everywhere will be out of work. :)

It’s true that Amazon provides all the components to build redundancy into your architecture, and cutting-edge firms that have taken Netflix’s approach with Chaos Monkey are seeing big improvements here. But AirBNB did fail, and at root it was an Amazon outage, the kind that shouldn’t ever happen.
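For a sense of what five nines actually demands, here is a small back-of-the-envelope sketch. The function names are my own, and the redundancy formula assumes failures are independent, which is a big assumption: a provider-wide outage takes out all your copies at once.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability):
    """Allowed downtime per year for a given availability (0.0 to 1.0)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def parallel_availability(single, copies):
    """Availability of several redundant copies, where one survivor is enough.

    Assumes each copy fails independently of the others.
    """
    return 1.0 - (1.0 - single) ** copies


five_nines_budget = downtime_minutes_per_year(0.99999)
three_nines_budget = downtime_minutes_per_year(0.999)
two_copies = parallel_availability(0.99, 2)
```

Five nines leaves a bit over five minutes of downtime for the whole year, while three nines leaves almost nine hours. On paper, two independent 99% components get you to 99.99%, but only if they truly fail independently.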

Read: Why Oracle won’t kill MySQL

Get more. Monthly insights about scalability, startups & innovation. Our latest Are SQL Databases Dead?

Why managers & CTO’s underestimate operational costs

too much inventory

Join 19k others and follow Sean Hull on twitter @hullsean.

1. Technology choices & talent shortage

I worked at one firm evaluating their technology stack. When we got to the programming language, I paused in my tracks. “Haskell?” I asked. “Oh you haven’t heard of it? It’s a really cool functional programming language, and we found it had some cool features that we really wanted to use”.

I had to fight the urge to roll my eyes. Yes, I’d heard of the language. It sits in the club with Scheme, Lisp & Prolog, the ones you study at university. They’re certainly an interesting bunch and, to be sure, can do some things that imperative programming languages can’t. But did it belong in the stack of this run-of-the-mill internet startup?

In this case the developers had free rein to choose any technologies they liked, adding more & more to the mix almost daily. But what are some of the ramifications here?

Two years, three years, or five years down the line, this team will be long gone, and another team will be picking up the pieces. Will you as a manager be able to find a lot of Haskell experts? What’s more, will you be able to support those choices operationally? Will updates be made often enough to have a secure stack for years to come?

Also: 5 things toxic to scalability

2. Scalability & server costs

Server costs are easier than ever to estimate. Build your application to serve your first 10,000 customers on Amazon with a couple webservers and a database server. To grow 100x to a million customers, just vertically scale your db, scale out your webservers and you’re good. Or are you?

What happens when you hit a wall? Did you build your application on ORM technology or take on technical debt? I’ve seen firm after firm struggle with technologies like Hibernate, which eat up precious resources while the team is helpless to eliminate the problem. Tread carefully on these types of questions.

Related: Why you’re not hitting five nines uptime

3. Patching, fixing bugs & managing security

Another long term cost of an application will be minor repairs and bug fixes. Those might appear in a slow steady trickle over the years, but security may loom larger. Cross-site scripting, SQL injection and many other threats can be a real headache.

What’s more, fixes may involve the libraries your application sits on top of. And when they are upgraded, your application will require tweaks too. It’s all basic stuff when you’re knee deep in development, but when your application has been deployed, the original team is long gone, and you’re supporting it years later, it can surely get messy.

Read: The four-letter-word dividing dev & ops

4. Missing operational switches

When building a web application, all eyes are on features. Which ones to include, and which are a priority. Pressure is heavy to build functions that can be sold to customers. Pleasing customers is of obvious importance.

So it’s no surprise that backend switches are often missing. But they can be a real boon for an operations team. Suppose you roll out a new feature to support star-ratings on certain pieces of content. An operational switch can be built to allow that feature to be disabled as necessary. If the site is loaded, or trouble is brewing, you may desperately want some switches to disable parts of the site, without the whole thing going down. I talk about this in AirBNB didn’t have to fail.
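Here is roughly what such a switch might look like, as a minimal sketch. The class, the switch name star_ratings and the page-rendering function are all hypothetical. In a real deployment the flags would likely live in a shared store or config service, so ops can flip them without a code deploy; an in-memory dict keeps the sketch self-contained.

```python
class OperationalSwitches:
    """Holds on/off flags that operations can flip at runtime."""

    def __init__(self, defaults):
        self._flags = dict(defaults)

    def is_enabled(self, name):
        # Unknown switches default to off, the safer choice under load.
        return self._flags.get(name, False)

    def enable(self, name):
        self._flags[name] = True

    def disable(self, name):
        self._flags[name] = False


def render_content_page(content, switches):
    """Build the page, skipping optional features that are switched off."""
    page = {"title": content["title"], "body": content["body"]}
    if switches.is_enabled("star_ratings"):
        page["rating"] = content.get("rating")
    return page


switches = OperationalSwitches({"star_ratings": True})
article = {"title": "Hello", "body": "Some text", "rating": 4.5}

page_normal = render_content_page(article, switches)

# Trouble is brewing: ops turns the feature off, the rest of the page survives.
switches.disable("star_ratings")
page_degraded = render_content_page(article, switches)
```

The core page never depends on the switchable feature, so turning it off degrades the site instead of breaking it.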

Another useful thing is a browse-only mode. This allows your site to operate even when writing to the database is not possible. If you’ve ever tried to update on a social network like Twitter, Facebook or Instagram, perhaps late at night, and gotten a “please try again later” message, you’ll understand the value. Here users can’t make changes, but otherwise the site appears to be working, and browsing works normally.
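A browse-only mode can be sketched in a few lines. Everything here is hypothetical (the Site class, the status codes chosen, the message text), and a real site would guard the database layer, not an in-memory list. It does show the shape of the idea: reads keep working, writes are refused politely.

```python
class ReadOnlyModeError(Exception):
    """Raised when a write is attempted while the site is browse-only."""


class Site:
    def __init__(self):
        self.read_only = False  # flipped by operations when writes are unsafe
        self._posts = []

    def browse(self):
        # Reads are always allowed, even in browse-only mode.
        return list(self._posts)

    def post_update(self, text):
        if self.read_only:
            raise ReadOnlyModeError("Please try again later.")
        self._posts.append(text)


def handle_post(site, text):
    """Turn a refused write into a friendly response instead of a crash."""
    try:
        site.post_update(text)
        return 200, "Posted"
    except ReadOnlyModeError as exc:
        return 503, str(exc)


site = Site()
first = handle_post(site, "hello world")

site.read_only = True  # e.g. the primary database is unavailable for writes
second = handle_post(site, "another update")
still_browsable = site.browse()
```

Users see a "try again later" message on writes, while everything they could already read stays available.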

Check this: Are SQL Databases Dead?

5. Consider bitcoin

Mt. Gox, the Japanese exchange handling bitcoin, failed in spectacular fashion. Roughly 500 million dollars’ worth of the digital currency was stolen. And what’s more, since it’s all frictionless, untraceable currency, there are no marked bills to try and track down. Ooops!

How does this relate to operational costs? The failure was squarely with the operations department. Functionally the site worked fine. But security wasn’t handled well enough, intrusion detection wasn’t employed, and “unspecified weaknesses” were to blame.

Security is one of those things that can be ignored without pain. Until something goes wrong. What’s more, if it is being handled well, it’s invisible, and unappreciated besides.

Read this: Why Oracle won’t kill MySQL

Get more. Grab our exclusive monthly Scalable Startups. We share insights on scalability, startups & innovation. Our latest Why I don’t work with recruiters