Why generalists are better at scaling the web

Join 12,100 others and follow Sean Hull on twitter @hullsean.

Recently at Surge 2011, the annual conference on scalability and performance, Google’s CIO Ben Fried gave an illuminating keynote address. His main insight was that generalists are the people who will lead engineering teams in successfully scaling the web.

In a world where the badge of Specialist or Expert is prized, this was a refreshing perspective from an industry bigwig. As tech professionals, or professionals of any kind for that matter, we don’t welcome the label of generalist. The word suggests a jack-of-all-trades and master of none. But the generalist is no less an expert than the specialist. Generalists can get their hands greasy with the tools to fix bugs in the machine, but they are especially good at mobilizing the machine itself. With their broad vision and perspective, they can direct an entire team to accomplish tasks efficiently. This ability to see the big picture should not be underestimated, especially during times of crisis or pressure to meet targets. For a team to scale the web effectively, you’re going to need a good mix of both types of personalities.

Picking out the potential generalist

Startups wanting to achieve scalability are faced with huge pressure to do more with limited budgets. In bringing on new engineers, they must hire people who have the programming skills to realize their big idea. Ideally these programmers should also have some architectural vision, plus a knowledge of web operations and of how to keep performance up as the application becomes popular. And what of maintaining that large infrastructure as it grows?

So the question for a startup is: how do you spot or hire generalists? In the book REWORK, by Jason Fried and David Heinemeier Hansson, the authors emphasize the value of good writers and good teachers. Their point is that in order to teach an idea or concept, you have to understand it thoroughly and be able to step into someone else’s shoes to explain it from their vantage point.

This is in large part the skill that Ben Fried was speaking about at Surge. To borrow his method of using “Disaster Porn” as a way to illustrate a point, we have a story of our own.

Our own disaster porn

About five years ago we worked for a firm that was faced with the ongoing challenges of growth. Their user base was growing by 25%-50% per quarter, but they were suffering from outages because of that growth. What’s more, one of their top engineers was leaving to join another company. They took the opportunity to bring us on board to assess the entire infrastructure.

We looked over the architecture and were surprised at every turn. Although they had a lot of engineers on staff, they were all tasked with building features and responding to ongoing business requirements. None were given any operations responsibilities. There was a very obvious lack of leadership, so you can imagine how this turned out to be a recipe for a fine mess. One day we’d see new servers being added at random; another day we’d witness haphazard decisions about which technologies to use or which versions of frameworks to adopt. In effect, each engineer was making decisions without considering the consequences for the whole.

The infrastructure wound up being built on two different webserver platforms, three – count ’em – three different programming languages and frameworks, and three MySQL databases scattered about on different machines. After a few hours discussing the architecture with the team, we put together a plan that framed the architecture around three simpler tiers. The first two were the standard load-balanced webserver tier and the backend database tier. The third would manage batch jobs and build static assets and media files.

A generalist solution

Our push then was to standardize on one type of webserver, one version of each language stack, and consolidate all the databases into one instance. This huge simplification meant that they could add replication to the database tier, eliminating single points of failure and providing redundancy for all business services. This in itself was a major achievement. We left them with some major problems solved while offering a new direction and a better handle on the remaining challenges. What the company had lacked was not engineering know-how, but rather a generalist’s perspective. The engineers had focused too much on immediate tasks; locked on detail, they had lost sight of the big picture.
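For readers who like to see this sort of big-picture check written down, here is a minimal sketch in Python of how one might audit an infrastructure inventory for the kind of sprawl we found. It is not the tooling we used at the firm. The inventory format, the hostnames, the placeholder product names and the `find_sprawl` function are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical inventory rows: (host, component kind, variant in use).
# Hostnames and variant names are placeholders, not the firm's real setup.
BEFORE = [
    ("host1", "webserver", "webserver-A"),
    ("host2", "webserver", "webserver-B"),
    ("host1", "language", "language-1"),
    ("host2", "language", "language-2"),
    ("host3", "language", "language-3"),
    ("host4", "database", "mysql-instance-1"),
    ("host5", "database", "mysql-instance-2"),
    ("host6", "database", "mysql-instance-3"),
]

# The same inventory after standardizing on one variant per component kind.
AFTER = [
    ("host1", "webserver", "webserver-A"),
    ("host2", "webserver", "webserver-A"),
    ("host1", "language", "language-1"),
    ("host2", "language", "language-1"),
    ("host4", "database", "mysql-instance-1"),
]


def find_sprawl(inventory, limit=1):
    """Return {kind: sorted variants} for each kind with more than `limit` variants."""
    variants = defaultdict(set)
    for _host, kind, variant in inventory:
        variants[kind].add(variant)
    return {
        kind: sorted(found)
        for kind, found in variants.items()
        if len(found) > limit
    }


if __name__ == "__main__":
    # Print every component kind that is not yet standardized.
    for kind, found in find_sprawl(BEFORE).items():
        print(f"{kind}: {len(found)} variants in use: {', '.join(found)}")
```

Run against the `BEFORE` list, the check flags all three component kinds. Run against `AFTER`, it flags nothing. The point is not the script itself but the habit: someone has to look across the whole inventory rather than at one server or one feature at a time.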

As more companies move their applications to the cloud, some carefully and some not, we anticipate many more disaster scenarios such as these. This speaks strongly to the rising cult of DevOps and its push towards broader skills and closer collaboration between developers and operations teams. The good thing to come out of it is that cleaning up messes such as these will force us to hone our strategic thinking and organizational skills, possibly making generalists out of many more of us.