Category Archives: Scalability

Some thoughts on 12 factor apps

12 factor app

I was talking with a colleague recently about an upcoming project.

In the summary of technologies, he listed 12 factor, microservices, containers, orchestration, CI and nodejs. All familiar to everyone out there, right?

Join 28,000 others and follow Sean Hull on twitter @hullsean.

Actually it was the first I had heard of 12 factor, so I did a bit of reading.

1. How to treat your data resources

12 factor recommends that backing services be treated like attached resources. Databases are loosely coupled to the applications, making it easier to replace a badly behaving database, or connect multiple ones.
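In practice, "attached resource" mostly means the application discovers its database through config rather than hard-coded hostnames. Here's a minimal Python sketch of that idea; the DATABASE_URL variable and its default value are illustrative assumptions, not from any particular codebase:

import os
from urllib.parse import urlparse

# "Attached resource": the app learns where its database lives from config,
# never from hard-coded hostnames. Swapping out a misbehaving database, or
# attaching a second one, is then just a config change.
DATABASE_URL = os.environ.get("DATABASE_URL", "postgres://localhost:5432/myapp")

def database_settings(url=DATABASE_URL):
    """Parse the attached-resource URL into connection settings."""
    parsed = urlparse(url)
    return {
        "host": parsed.hostname,
        "port": parsed.port or 5432,
        "dbname": parsed.path.lstrip("/"),
        "user": parsed.username,
        "password": parsed.password,
    }

if __name__ == "__main__":
    print(database_settings())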

Also: Is the difference between dev & ops a four-letter word?

2. Stay loosely coupled

In 12 Fractured Apps Kelsey Hightower adds that this loose coupling can be taken a step further. Applications shouldn’t even assume the database is available. Why not fall back to some useful state, even when the database isn’t online? Great idea!
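To make that concrete, here's a rough Python sketch of falling back to a useful state; the db and cache objects and their methods are hypothetical stand-ins for your own data layer:

import logging

def serve_product_page(product_id, db, cache):
    """Serve from the database when it's up, from a cached copy when it isn't."""
    try:
        product = db.fetch_product(product_id)          # normal path
        cache.set("product:%s" % product_id, product)   # keep the fallback warm
        return product
    except ConnectionError:
        logging.warning("database unavailable, serving cached copy")
        stale = cache.get("product:%s" % product_id)
        if stale is not None:
            return stale                                # degraded but still useful
        return {"error": "temporarily unavailable"}     # last-resort state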

Related: Is Amazon too big to fail?

3. Degrade gracefully

A read-only or browse-only mode is another example of this. Allow your application to have multiple decoupled database resources, some of which are read-only. The application behaves intelligently based on what’s available. I’ve advocated for this before in Why Dropbox didn’t have to fail.

Read: When hosting data on Amazon turns bloodsport


The twelve-factor app appears to be an excellent guideline for building clean applications that are easier to deploy and manage. I’ll be delving into it more in future posts, so check back!

Read: Are we fast approaching cloud-mageddon?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Should we be muddying the relational waters? Use cases for MySQL & Mongodb

muddy sewer tunnels

Many of you know I publish a newsletter monthly. One thing I love about it is that after almost a decade of writing it regularly, the list has grown considerably. And I’m always surprised at how many former colleagues are actually reading it.

So that is a really gratifying thing. Thanks to those who are, and if you’re not already on there, sign up here.

Join 28,000 others and follow Sean Hull on twitter @hullsean.

Recently a CTO & former customer of mine reached out. He asked:

“I’m interested to hear your thoughts on the pros and cons of using a json column to embed data (almost like a poor-man’s Mongo) vs having a separate table for the bill of materials.”

Interesting question. Here are my thoughts.

1. Be clean or muddy?

In my view, these types of design decisions are always about tradeoffs.

The old advice was to normalize everything to start off with. Then as you’re performance tuning, denormalize in special cases where it’ll eliminate messy joins. The special cases would then also need to be handled at insert & update time, as you’d have duplication of data.

NoSQL & mongo introduce all sorts of new choices.  So too Postgres with json column data.  

We know that putting everything in one table will be blazingly fast, as you don’t need to join. So reads will be cached cleanly, and hopefully based on a single ID or a small set of ID lookups.
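To make the tradeoff concrete, here's a rough sketch of the two designs with the SQL kept in Python strings; the table and column names are invented for illustration:

# Option A: denormalized -- the bill of materials rides along as a JSONB
# blob, so one primary-key lookup returns everything.
CREATE_DENORMALIZED = """
CREATE TABLE products (
    id        serial PRIMARY KEY,
    name      text NOT NULL,
    materials jsonb            -- e.g. [{"part": "hinge", "qty": 4}, ...]
);
"""
FETCH_DENORMALIZED = "SELECT name, materials FROM products WHERE id = %s;"

# Option B: normalized -- a separate bill-of-materials table you can join,
# index & report on in arbitrary ways later (and the jsonb column goes away).
CREATE_NORMALIZED = """
CREATE TABLE bill_of_materials (
    product_id integer REFERENCES products(id),
    part       text NOT NULL,
    qty        integer NOT NULL
);
"""
FETCH_NORMALIZED = """
SELECT p.name, b.part, b.qty
FROM products p
JOIN bill_of_materials b ON b.product_id = p.id
WHERE p.id = %s;
"""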

Also: Is the difference between dev & ops a four-letter word?

2. Go relational

For example, you might choose MySQL or Postgres as your datastore and use it for what it’s good at. Keep your data in rows & columns, so you can later query it in arbitrary ways. That’s the discipline up front, and the benefit & beauty down the line.

I would shy away from the NoSQL add-ons that some relational vendors have added, to compete with their newer database cousins. This starts to feel like a fashion contest after a while.

Related: Is automation killing old-school operations?

3. Go distributed

If you’d like to go the NoSQL route, you could for example choose MongoDB. You’ll gain advantages like out-of-the-box distribution, eventual consistency, and easy coding & integration with applications.

The downside is you’ll have to rearrange and/or pipeline the data into a relational database or warehouse (Redshift?) if & when you need arbitrary reports from it. For example, there may be new reports & ways of slicing & dicing the data that you can’t foresee right now.

Read: Do managers underestimate operational cost?

4. Hazards of muddy models

Given those two options, I’m erring against muddying the waters. My feeling is that features like JSON blobs in Postgres and the memcache plugin in MySQL are things the db designers are adding to compete in the fashion show with the NoSQL offerings, while still keeping you in their ecosystem. But those interfaces within the relational (legacy?) databases are often cumbersome and clunky compared to their NoSQL cousins like Mongo.

Also: Is the difference between dev & ops a four-letter word?

5. Tradeoffs of isolation

Daniel Abadi and Jose Faleiro published an interesting article on a very related topic Why MongoDB, Cassandra, HBase, DynamoDB, and Riak will only let you perform transactions on a single data item.

The upshot is that in databases you can choose only *TWO* of these three characteristics: fairness, isolation & throughput.

Relational databases sacrifice throughput for fairness & isolation. Distributed databases sacrifice isolation to bring you gains in throughput & horizontal scalability of writes.

That’s a lot of big words to say one simple thing.

You can’t have it both ways.

Also: Is the difference between dev & ops a four-letter word?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Are frameworks a shortcut to slow bloated code?

elephant traffic jam

I was reading one of my favorite blogs again, Todd Hoff’s High Scalability. He has an interesting weekly post format called “Quotable Quotes”. I like them because they’re often choice quotes that highlight some larger truth.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

1. Chasing bloody frameworks

One that caught my eye was this one by Sandro Mancuso:

Frameworks are bad? Aren’t these the foundation of reusable code? Can we build for the web without Ruby’s Rails, PHP’s Laravel or Java’s Spring?

Also: Are we fast approaching cloud-mageddon??

2. Architecture purity or bloat?

Luckily the conversation remained civil. The other side of the aisle piped up with practicality. I think Jeff is saying “Hey we gotta get our jobs done.”

Related: Did MySQL & Mongo have a beautiful baby called Aurora?

3. Look Ma No Frameworks

Then Pablo Chacin jumps in with his slideshare deck, rejecting frameworks as hugely wasteful.

Read: When hosting data on Amazon turns bloodsport

4. Beware the ORM

Anyone who’s a regular reader of this blog knows that I’ve railed against ORMs for a long time. These are object relational mappers: a library on top of a relational database. They allow developers to build software without mucking around in SQL.

Hibernate is a famous one. I remember back in the late 90’s I was brought on to fix some terrible performance problems. At root the problem was the SQL code. It wasn’t being written by their team & tuned carefully. It was written en masse & horribly by this Hibernate library.

Also: Is the difference between dev & ops a four-letter word?

5. Where’s the oversight?

Perhaps the biggest risk to many startups is that the decision isn’t being made at all. What do I mean by this?

I think this tweet gets to the heart of it. Often the decision to use a framework or not is simply hidden in plain sight, an assumption albeit a large one, that this is how you build software.

But these decisions are so fundamental because once your scaffolding is built, it becomes very hard to disentangle. Rip & replace becomes terribly expensive, and scalability becomes a painful unattainable dream.

Also: 5 things toxic to scalability

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Why you should build a portable cloud

486 linux server

I was recently browsing Ycombinator News. It is always an endless trove of the curious & interesting.

I stumbled onto Karanbir Singh’s post The Portable Cloud. I was curious, what is that?

Join 28,000 others and follow Sean Hull on twitter @hullsean.

This story gave me warm fuzzies… I was excited in a similar way when Linux was first released. This was many years back through the mists of time, in 1992. I had recently graduated in computer science, and one of my favorite classes was Operating Systems. We worked to build an OS following Andrew Tanenbaum’s book on Minix.

After graduation, I heard about the Linux project & got excited. I was hearing whispers online that Linux could really, completely replace Windows. So I bought all the parts to build a 486 tower: graphics card, motherboard, memory cards & IDE drives, keyboard & monitor. This ran into the thousands of dollars. Hardware wasn’t cheap then! I even ordered an optical mouse, because it made you feel like you were sitting at a Sun workstation, at home!

I remember putting all this together, and loading the first floppy disk into the thing. Did I image the disks properly? Will it really load something?

Up comes the BIOS and sure enough it’s booting off the floppy drive. I thought “Wow, mother of god! This is amazing!”

From there I had init running, and soon the very seat of the soul, the Unix OS itself. That felt so darn cool.

After that I’d spend weeks configuring X Windows; getting a GUI up seemed like mission impossible. And then you’d go about tweaking and rebuilding your kernel for this or that.

Thank you to Karanbir for rekindling this memory. It’s a great one!

For those starting out now as a developer, operations, or cloud site reliability engineer, I would totally recommend following Karanbir’s instructions. Here’s why!

1. Learning by building

My favorite thing about building a server myself is that there’s something physical going on. You’re plugging in a cable for the disk bus. Bus is no longer just a concept, but a thing you can hold. You’re plugging in memory; you can look at it & say, oh, this is a chip, it’s different than a disk drive. You can hold the drive and say, oh, there’s a miniature little record player in there, with a magnetic arm. Cool!

Also: Is Amazon too big to fail?

2. Linux early beginnings

Another thing I remember about those days, was feeling like I was part of something big. I knew operating systems were crucial. And I knew that Windows wasn’t working. I knew it wouldn’t scale to the datacenter anyway.

I realized I wasn’t the only one to think this way. There were many others as excited as me, who were contributing code, and debug reports.

Related: Did Airbnb have to fail?

3. Debugging & problem solving

Building your own server involves a ton of debugging. In those days you had to compile all those support programs, using the GNU C compiler. You’d run make and get a whole slew of errors, and fire up your editor and get to work.

Configuring your windowing system meant figuring out where the right driver was, and also buying a graphics card that was *supported*. You would then tweak the refresh frequencies, resolution, and so forth. There was no auto detection. You could actually fry your monitor, if you set those numbers wrong!

Read: When hosting data on Amazon turns bloodsport

4. Ownership of the stack

These days you hear a lot about “fullstack engineers”. There is no doubt in my mind that this is the way to become one. Basic systems administration requires you to compile other people’s software & troubleshoot it. Those are developer skills that will come in very handy.

Building a server also forces you to see all the hardware, and how it fits together into a greater, performant whole. It gives you an appreciation for speed. Use one bus such as IDE or another such as SCSI, and you’ll experience a different performance profile. That’s because all the software Unix is paging in and out of memory gets there by reading & writing to disk!

Also: Are SQL Databases Dead?

5. Learn to be a generalist

I’ve argued before that being a generalist allows you to better scale the web. I still think this, and I believe it was this formative experience building servers from individual parts that taught me how the larger whole fits together.

This allows me to look at problems today, and jump to causes of performance problems quickly.

Also: Does Volkswagen highlight the bloodsport in benchmarks?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Are we fast approaching cloud-mageddon?

storm coming

One look at StackShare’s trending technologies, and you’ll discover the exploding growth of languages, webservers, load balancers, databases, caching servers, automation & monitoring tools, continuous integration suites & a broad spectrum of software-as-a-service solutions.

The choices today boggle the mind. Choice is good, but too much choice can mean trouble too.

Join 30,000 others and follow Sean Hull on twitter @hullsean.

1. What am I actually using?

Erich Schubert wrote a superb piece about the sad state of the sysadmin in the age of containers. Here’s what caught my eye:

Stack is the new term for “I have no idea what I’m actually using”.

That definitely rings true for me. The customers I’m seeing these days have such complicated stacks, that nobody really knows what’s installed. That’s dangerous!

Also: Do today’s startups assemble at their own risk?

2. Embrace failure more broadly

Recently I wrote a blog post asking Is AWS the patient that needs constant medication?. It got a lot of traction, and here’s why I think that happened.

AWS uses cheap commodity components. The assumption is, with an infinitely redundant datacenter, component failure is ok. It’s ordinary & everyday.

Unfortunately most startups, even ones that employ some Ansible & devops, still don’t have Netflix-grade automation.

Those regular, everyday failures are still getting detected by old-school manual monitoring. And that’s a recipe for trouble.

Also: 5 Things toxic to scalability

3. What are complex systems?

In this excellent deck, James Urquhart talks about emergent behavior in complex systems. It’s worth a quick read.


Read: How I find entrepreneurial focus

4. What to do? Do you like boring?

Dan McKinley, formerly principal engineer at Etsy & now with Stripe, wrote a brilliant essay arguing for boring technology.

This comes as a shock to many in the startup world. It sort of flies in the face of open source, or does it?

I worked in the enterprise space as an Oracle DBA for a decade starting in the mid-nineties. Among DBAs there was always a chuckle when a new version of Oracle came out. No one wanted to touch it with a ten-foot pole. Sure, we’d install it on test boxes, start learning the new features and so forth. But deploy it? No way. We’d wait a good 2 or even 3 years before upgrading.

Meanwhile management was eager for the latest software. Don’t we want the newest? The Oracle sales guys would be selling the virtues of all sorts of features that nobody needed right away anyway.

Choosing boring components takes discipline, to fight off sexy new technologies & bleeding-edge versions. But staid & stodgy wins you every day in operations uptime.

Related: Is automation killing old-school operations?

5. Use tried & tested components

Do you find your application or stack contains Java, Ruby, Python & PHP? Choose one.

One webserver like nginx, one caching server like memcache or redis, one search server like solr or elasticsearch, one database like MySQL or postgres. Standardize all your components on one image, so you can use it for all your servers.

Fewer components will mean fewer interdependencies, less maintenance, & less chaos.

Also: What’s the luckiest thing that’s happened in your career?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

What happens when entrepreneurs treat data as a product?

Sneachta Pix

I’ve been reading DJ Patil’s thoughts on building data products. As the chief data scientist of the United States, he knows a thing or two.

Join 30,000 others and follow Sean Hull on twitter @hullsean.

I also attended a recent Look & Tell event, where Lincoln Ritter talked about Data Democracy at Animoto. He echoed many of Patil’s lessons.

I took away a few key lessons from these that seem to be repeating refrains…

1. UX of data

UX design involves looking at how customers actually use a product in the real world. What parts of the product work for them, how they flow through that product and so on.

That same design sense can be applied to data. At a high level, that means exposing data in a measured, meaningful & authoritative way. Not all the tables & all the data points, but rather the key ones that help the business make decisions. Then layer discovery tools like Looker on top, to allow biz-ops to make more informed decisions.

Also: 5 Reasons to move data to Amazon Redshift

2. Be iterative

Clean data, presented to business operations in a meaningful way, allows them to explore the data and find useful trends. What’s more, with good discovery tools, biz-ops is empowered to do their own reporting.

All this reduces the need to go to engineering for each report. It reduces friction and facilitates faster iteration. That’s agile!

Related: Is automation killing old-school operations?

3. Be authoritative

Handing the keys to the data kingdom over to business means more eyes on the prize. That may well surface data inconsistencies. Each such case can reduce trust in your data.

Being authoritative means building checks into your data feeds, and identifying where data is amiss. Then fixing it at the source.

Read: Are SQL Databases dead?

4. Spot checks & balances

Spot checks on data are like unit tests on code. They keep you honest. Those rules for how your business works, and what your data should look like, can be captured in code, then applied as tests against source data.
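Here's a rough sketch of what such spot checks might look like in Python, assuming a DB-API style connection (sqlite3's conn.execute shortcut shown); the tables and business rules are invented for illustration:

def check_no_orphan_orders(conn):
    """Every order should reference a customer that actually exists."""
    row = conn.execute(
        "SELECT count(*) FROM orders o "
        "LEFT JOIN customers c ON c.id = o.customer_id "
        "WHERE c.id IS NULL"
    ).fetchone()
    assert row[0] == 0, "%d orphaned orders found" % row[0]

def check_revenue_is_sane(conn):
    """Daily revenue should never be negative."""
    row = conn.execute("SELECT min(daily_revenue) FROM revenue_by_day").fetchone()
    assert row[0] is None or row[0] >= 0, "negative revenue in source data"

# Run these on a schedule, just as unit tests run in CI; a failure is a data
# "outage" and should page someone.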

Also: Is Apple betting against big data?

5. Monitoring for data outages

As data is treated as a product, it should be monitored just like other production systems. A data inconsistency or failed spot check then becomes an “outage”. By taking these very seriously, and fire fighting just as you do other production systems, you can build trust in that data, as those fires become less frequent.

Also: Why Airbnb didn’t have to fail

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Is AWS the patient that needs constant medication?

storm coming

I was just reading High Scalability about Why Swiftype moved off Amazon EC2 to Softlayer and saw great wins!

We’ve all heard by now how awesome the cloud is. Spin up infrastructure instantly. Just add water! No up-front costs! Autoscale to meet seasonal application demands!

But less well known or even understood by most engineering teams are the seasonal weather patterns of the cloud environment itself!

Join 28,000 others and follow Sean Hull on twitter @hullsean.

Sure, there are firms like Netflix who have turned the fickle cloud into a model of virtue & reliability. But most of the firms I work with every day have moved to Amazon as though it were regular bare metal, and encountered some real problems in the process.

1. Everyday hardware outages

Many of the firms I’ve seen hosted on AWS don’t realize the servers fail so often. Amazon actually chooses cheap commodity components as a cost-savings measure. The assumption is that resilience should be built into your infrastructure using devops practices & automation tools like Chef & Puppet.

The sad reality is most firms provision the usual way, through the dashboard, with no safety net.

Also: Is your cloud speeding for a scalability cliff

2. Ongoing network problems

Network latency is a big problem on Amazon. And it will affect you more than you expect. One reason is that you’re most likely sitting on EBS as your storage. EBS? That’s Elastic Block Store, Amazon’s network-attached block storage. Your little cheapo instance has to cross the network to get to storage. That *WILL* affect your performance.

If you’re not already doing so, please start using their most important & easily missed performance feature – provisioned IOPS.
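If you're provisioning through code rather than the dashboard, a rough boto3 sketch of requesting provisioned IOPS on an EBS volume looks like this; the region, availability zone, size and IOPS numbers are placeholders:

import boto3

# Ask for an io1 volume with provisioned IOPS instead of a default volume,
# so database storage gets a guaranteed I/O budget rather than best-effort.
ec2 = boto3.client("ec2", region_name="us-east-1")

volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",  # placeholder AZ
    Size=500,                       # GB, placeholder
    VolumeType="io1",               # provisioned-IOPS volume type
    Iops=5000,                      # the I/O rate you're reserving
)
print(volume["VolumeId"])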

Related: The chaos theory of cloud scalability

3. Hard to be as resilient as netflix

We’ve all by now heard of firms such as Netflix building their Chaos Monkey to actively knock out servers, in an effort to test their infrastructure’s ability to self-heal.

From what I’m seeing at startups, most have a bit of devops in place, a bit of automation, such as autoscaling around the webservers. But little in terms of cross-region deployments. What’s more, their database tier is protected only by multi-AZ or just a read replica or two. These are fine for what they are, but will require real intervention when (not if) the server fails.

I recommend building a browse-only mode for your application, to eliminate downtime in these cases.

Read: 8 questions to ask an aws expert

4. Provisioning isn’t your only problem

But the cloud gives me instant infrastructure! I can spin up servers & configure components through an API! Yes, this is a major benefit of the cloud, compared to 1-2 hours in traditional environments like Softlayer or Rackspace. But compare failure rates too: traditional hardware might suffer an outage every couple of years, while Amazon’s hardware may fail a couple of times a year, more if you’re unlucky.

Meanwhile you’re going to deal with seasonal weather problems *INSIDE* your datacenter. Think of these as swarms of customers invading your servers, like a DDoS attack, but self-inflicted.

Amazon is like a weak immune system attacking itself all the time, requiring constant medication to keep the host alive!

Also: 5 Things toxic to scalability

5. RDS is going to bite you

Besides all these other problems, I’m seeing more customers build their applications on the managed database solution MySQL RDS. I’ve found RDS terribly hard to manage. It introduces downtime at every turn, where standard MySQL would incur none.

In my experience, upgrading RDS is like a shit-storm that will not end!

Also: Does open source enable the cloud?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Can entrepreneurs learn from how science beat Cholera?

cholera ghost map johnson

When I picked up Johnson’s book, I knew nothing about Cholera. Sure I’d heard the name, but I didn’t know what a plague it was, during the 19th century.

Johnson’s book is at once a thriller, tracing the deadly progress of the disease. But in that story, we learn of the squalor the inhabitants of Victorian England endured before public works & sanitation. We learn of architecture & city planning, statistics & how epidemiology was born. The evidence is woven together with map making, the study of pandemics, information design, environmentalism & modern crisis management.

Join 28,000 others and follow Sean Hull on twitter @hullsean.

“It is a great testimony to the connectedness of life on earth that the fates of the largest and the tiniest life should be so closely dependent on each other. In a city like Victorian London, unchallenged by military threats and bursting with new forms of capital and energy, microbes were the primary force reigning in the city’s otherwise runaway growth, precisely because London had offered Vibrio cholerae (not to mention countless other species of bacterium) precisely what it had offered stock-brokers and coffee-house proprietors and sewer-hunters: a whole new way of making a living.”

1. Scientific pollination

John Snow was the investigator who solved the riddle. He didn’t believe that putrid smells carried disease, the prevailing miasma theory of the day.

“Part of what made Snow’s map groundbreaking was the fact that it wedded state-of-the-art information design to a scientifically valid theory of cholera’s transmission. “

Also: Did Airbnb have to fail?

2. Public health by another name

“The first defining act of a modern, centralized public health authority was to poison an entire urban population”

Although they didn’t know it at the time, the dumping of waste water directly into the Thames river was in fact poisoning people & wildlife in the surrounding areas.

In large part the establishment was blinded by its belief in miasma, the theory that disease originated from bad smells & thus traveled through the air.

Related: 5 reasons to move data to amazon redshift

3. A Generalist saves the day

The interesting thing about John Snow was how much of a generalist he really was. Because of this he was able to see things across disciplines that others of the time were not able to see.

“Snow himself was a kind of one-man coffeehouse: one of the primary reasons he was able to cut through the fog of miasma was his multi-disciplinary approach, as a practicing physician, mapmaker, inventor, chemist, demographer & medical detective”

Read: Are generalists better at scaling the web?

4. Enabling the growth of modern cities

The discovery of the cause of cholera prompted the city of London to undertake a huge public works project, to build sewers that would flush waste water out to sea. Truly, it was this discovery and its solution that enabled much of the population density we see in the 21st century. Without modern sanitation, cities of 20 million plus would certainly not be possible.

Also: Is automation killing old-school operations?

5. Bird Flu & modern crisis management

In recent years we have seen public health officials around the world push for Thai poultry workers to get their flu shots. Why? Because although avian flu only spreads from animals to humans now, were it to encounter a person who had our run-of-the-mill flu, it could quickly learn to move human to human.

All these disciplines of science, epidemiology and so forth direct decisions we make today. And much of that is informed by what Snow pioneered with the study of cholera.

Also: 5 Things toxic to scalability

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Is Agile right for fixing performance issues?

storm coming

I was sifting through the CTO school email list recently, and the discussion of performance tuning came up. One manager had posted asking how to organize sprints, and break down stories for the process.

Join 29,000 others and follow Sean Hull on twitter @hullsean.

Another CTO chimed in with a response…

“Agile is not right for fixing performance issues.”

I agree with him & here’s why.

1. Agile roadblocks

At a very high level, agile seeks to organize work around sprints of a few weeks, and sets of stories within those sprints. The assumption here is that you have a set of identified issues. With software development, you have features you’re building. With performance tuning, it’s all about investigation.

How long will it take to solve the crime? Very good question!

Also: 5 things toxic to scalability

2. Reproduce problem

Are you seeing general site slowness? Is there a particular feature that loads extremely slowly? Or is there a report that runs forever? Whatever it is, you must first be able to reproduce it. If it’s general site slowness, identify when it is happening.

Related: Are SQL Databases dead?

3. Search for bottlenecks

Once you’ve reproduced your problem, next you want to start digging. Looking at logfiles can help you find errors, such as timeouts. The database has a slow query log, which you’ll definitely want to review. Slow queries can be surfaced by new code deploys, or middleware in front of your database, such as an ORM.

If you find your logfiles aren’t enabled, it’s a good first step to turn them on. Also look at how you’re caching. The browser should be directed to cache, assets should be on a CDN, a page cache should protect your application server, and an object cache should sit in front of your database.
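For the object cache piece, here's a rough Python sketch using Redis (memcache works the same way); the users table and the db connection object are hypothetical:

import json
import redis

r = redis.Redis(host="localhost", port=6379)

def get_user(user_id, db):
    """Check the object cache before touching the database."""
    key = "user:%s" % user_id
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)              # cache hit: no DB round trip
    cur = db.cursor()
    cur.execute("SELECT id, name, email FROM users WHERE id = %s", (user_id,))
    row = cur.fetchone()
    user = {"id": row[0], "name": row[1], "email": row[2]}
    r.setex(key, 300, json.dumps(user))        # cache for 5 minutes
    return user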

Read: Is five nines a myth that just won’t die?

4. Find the root cause

As you dig deeper into your problem, you’ll likely uncover the root of your scalability problem. Likely causes include synchronous, serial or locking processes & requests, object relational mappers, lack of caching, or new code that has not been tuned well.

Also: Did Airbnb, reddit , heroku & flipboard have to fail?

5. Optimize

This is what I think of as the fun part. You’ve measured the issues, found the problem. Now it’s time to fix it. This is an exciting moment, to bring real benefit to the business. Eliminating a performance problem can feel like springtime at the end of a long cold winter!

Also: Is zero downtime even possible on RDS?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

The chaos theory of cloud scalability

The.Rohit – Flickr

Reading Benedict Evans’ weekly newsletter, you’re bound to bump into something new & useful. His newsletter covers mobile, but that also means it touches on a lot of other areas of tech, innovation & startups.

This week he pointed me to A Weissman’s The Chaos Theory of Startups. He argues a VC’s job is to help a startup identify the right framework. It’s about finding the signal in the noise.

Join 29,000 others and follow Sean Hull on twitter @hullsean.

I think you can carry this idea over to technical operations today. There are a few key maxims I follow to stay on the scalable track.

1. Degrade gracefully

You’ve heard it before, but have you done anything about it?

Build a read-only or browse-only mode into your application. Do it now. You will thank me. When your database goes down unexpectedly (with RDS this might happen sooner than you think), you want to be able to use your lovely read-only slave database. Browse-only mode forces developers to add read-only support in most application functions, keeping the site up and running without a full, visible and ugly outage.

Which brings me to point two: be sure to have copies of your production database. Real, live, read-only copies. In Amazon speak this is a read replica; in MySQL it’s a slave database. Most startups I see these days have this, but if you’re one of the ones dragging your feet, do it now.
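Here's a rough sketch of what that routing might look like in Python; the primary and replica connection objects and the module-level flag are illustrative, not from any particular framework:

# Route reads to the read-only replica, and flip into browse-only mode when
# the primary fails, instead of throwing errors at every user.
BROWSE_ONLY = False

def run_query(sql, params, primary, replica, is_write):
    global BROWSE_ONLY
    if is_write:
        if BROWSE_ONLY:
            raise RuntimeError("browse-only mode: writes are disabled")
        try:
            return primary.execute(sql, params)
        except ConnectionError:
            BROWSE_ONLY = True  # primary is gone: degrade rather than 500
            raise RuntimeError("write failed, entering browse-only mode")
    # Reads always go to the replica, so browsing keeps working even while
    # the primary is being repaired or failed over.
    return replica.execute(sql, params)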

Also: Is the difference between dev & ops a four-letter word?

2. Monitor & measure

Amazon’s CloudWatch is fine for what it is, and so is New Relic. But employing dedicated tools just for monitoring, such as Nagios & Cacti, can give you much more granular intelligence about what’s happening with your infrastructure. Nagios gives you the monitoring & alerting, Cacti gives you the history. It’s like a BI reporting tool for infrastructure.

Related: Is automation killing old-school operations?

3. Keep components simple

Keep it simple, stupid. Don’t adopt new technologies, languages, or versions of software without first vetting them. Ask questions:

o Is there an existing piece of software or service that can overlap this new one, killing two birds with one stone?
o Does everybody know this new technology?
o Does this choice of technology solve any other broad problems we have?
o Is there a large community around the project?
o Are there a lot of engineers with experience in this chosen technology?

Tellingly, many startups don’t have an operations person to start with. In those, the danger is that developers choose new solutions with no pushback.

I asked… Does a four letter word divide dev and ops?

Read: Do managers underestimate operational cost?

4. Don’t force database abstraction

Object relational mappers, aka database middleware, are great in theory. We want a library that takes database & SQL drudgery away from developers. Why reinvent the wheel?

The trouble is database-independent code doesn’t work, and never has. ORMs are painfully inefficient, selecting all columns or repeatedly reading rows from tables. This causes serious traffic jams inside your database.

They come in various guises: CakePHP, Active Record for Ruby, Hibernate for Java, SQLAlchemy for Python.
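A quick illustration of the traffic jam, using a Django/ActiveRecord-style ORM with invented model names, versus the single hand-written query a tuned report would use:

# ORM style: one query for the orders, then one more query per order to
# follow the customer relation -- the classic N+1 pattern.
def report_orm_style(Order):
    for order in Order.objects.all():   # SELECT * FROM orders
        print(order.customer.name)      # one extra SELECT per row

# Hand-written alternative: a single join fetching only the columns needed.
REPORT_SQL = """
SELECT o.id, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id;
"""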

Also: Is the difference between dev & ops a four-letter word?

5. Be asynchronous

This means don’t make your application code wait. Make asynchronous calls to APIs & check back later, use software queues so traffic backups don’t clog your components & communication.
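As a rough Python sketch of the idea, using the standard library's in-process queue as a stand-in for SQS, RabbitMQ or a similar broker:

import queue
import threading

work_queue = queue.Queue()

def handle_signup(email):
    """The request path: enqueue the slow work and respond immediately."""
    work_queue.put(("send_welcome_email", email))
    return {"status": "ok"}

def worker():
    """Background worker drains the queue so callers never wait on slow APIs."""
    while True:
        task, payload = work_queue.get()
        # ... call the slow email API here, retry on failure ...
        work_queue.task_done()

threading.Thread(target=worker, daemon=True).start()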

Avoid any type of two-phase or multi-phase commit. These are common in clustered databases, forcing a serialization point so nodes can agree on what data looks like. However they’re toxic to scalability. Look for technologies that use eventually consistent algorithms.

Also: Is the difference between dev & ops a four-letter word?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters