Category Archives: CTO/CIO

Business Agility at AWS re:Invent

Also find Sean Hull’s ramblings on twitter @hullsean.

Although I couldn’t be in Vegas to attend re:Invent, there is so much online that it’s almost better than being at the conference, from an ongoing live stream of keynotes and sessions to an archived collection on YouTube.

The big wins

You may have heard of all the great things that Amazon or cloud computing can do, but I thought Andy Jassy summarized these nicely in these six points.

1. Replace capex with opex
2. Lower total cost of ownership
3. No guessing about capacity
4. Encourage agility & innovation
5. Differentiation
6. Global from the start

Redshift

By far the biggest announcement at the show is Amazon’s new Redshift product. It is a fully managed data warehouse solution that scales to petabytes in its cloud. Currently two business intelligence tools are supported, namely Jaspersoft and MicroStrategy.

[quote]
In 2003 Amazon was a 5 billion dollar company. Today AWS adds the same infrastructure capacity every day to its availability zones!
[/quote]

Reduced prices by 25% for S3

As a lot of folks know, Amazon has always been about cheaper prices. That model has been disruptive in the book selling industry, and in a huge way in the infrastructure and datacenter industry. As more customers sign up, economies of scale mean they can offer the same hardware & services for lower prices.

With that they’re announcing lower prices for S3 by a whopping 25%. To me this speaks to their continuing push to dominate the market by driving prices downward.

Amazon’s Channel on YouTube

If you weren’t able to attend the conference, or want to recap some highlights you might have missed, they have put up a great AWS Channel on YouTube.

Some of the speakers include Sharon Chiarella, VP of Mechanical Turk; Glenn Hazard, CEO of Xceedium; Todd Barr, CMO of Alfresco; Bright Fulton, Operations for Swipely; Colin Percival, FreeBSD Developer; Ted Dunning, Chief Application Architect of MapR Technologies; James Broberg, CTO & Founder of MetaCDN; Mitchell Garnaat, Sr. Engineer; David Etue, Vice President, SafeNet; and Mike Culver, Sr. Consultant, to name just a few.

Read this far? Grab our Scalable Startups for more tips and special content.

Hacking Job Search – Three Meaty Ideas

Demand for talented engineers has never been higher. It is in fact the dirty little secret of the startup industry, that there are simply not enough qualified folks to fill the positions.

What this means for you is that you have a lot of options. What it means for a hiring manager is that you will have to work even harder to find the right candidate. Just going to a recruiter isn’t enough. Use your network, go to meetups, follow Gary’s Guide daily.

Also check out our Mythical MySQL DBA piece where we talk about the shortage of DBAs and operations folks.

Further, if you’ve dabbled in freelance or independent consulting, I wrote an in-depth look at Why do people leave consulting. Understanding this can help you avoid it in your own career, or keep your resources from leaving for better shores.

Find us on twitter @hullsean and LinkedIn where we share content and ideas every day!

1. Build your reputation

As they say, your reputation precedes you. So start building it now. Fulltime or freelance, you want to be known.

Speaking, yes you can do it. Start with some small meetups, volunteer to speak on a topic. A ten person room is easier than 30, 50 or 100. Once you have a couple under your belt, fill out a CFP for Velocity, OSCON or some other software developers conference. There are many.

Blog – if you’re not already doing so you should. Start with once a week. Comment on industry topics, controversial ideas, or engineering know-how. Prospects can look at this and learn a lot more than from a business card.

Write a book, yes you can. It may sound impossible, but the truth is that publishers are always looking for technical writers. Pick a topic near and dear to you. It’ll also give you endless material for your blog.

Go to meetups, you really need to be getting out there and networking. Get some Moo Business Cards and start working on your elevator pitch!

Social media – being active here helps your blog, and helps people find you. Twitter is a great place to do this. Interact with colleagues and startup founders, VCs and more. If you’re a hiring manager or CTO, you may find great programmers and devops this way.

We also wrote a more in-depth article Consulting and Freelance 101. It’s a three-part guide with a lot of useful nuggets.

Also take a look at our MySQL DBA Interview Guide which is as helpful to devops and DBAs as it is to managers hiring them!

[quote]
Above all else, build your network & your reputation. It will put you in front of more people as a person, not a commodity or a resume in a pile of hundreds.
[/quote]

2. Qualify prospects

You definitely don’t want to take the first offer you get, and managers don’t want to hire the first candidate that comes along. You want two or three to choose from. Best way to do this is to have options.

If you’re a candidate, network or work through your colleagues. When you do get a lead, be sure you’re speaking to an economic buyer. If you’re not you’ll need to try to find that person who actually signs the checks. They are the ones who ultimately make the decision, so you want to sell yourself to them.

Get a Deposit – I know I know, if it’s your first freelance job, you don’t want to scare them off. Or maybe you do? The only prospects that would be scared off by this are ones who may not pay down the line. Dragging their feet with a deposit can also mean bureaucratic red tape, so be patient too.

Sara Horowitz has an excellent book, The Freelancer’s Bible. We recommend you grab a copy right now!

Commodity You Are Not so don’t sell yourself as one. What do I mean? You are not an interchangeable part. You have special skills, you have personality, you have things that you’re particularly good at. These traits are what you need to focus on. The dime-a-dozen skills should sit more in the background.

You’ll also need to price and package your services. We talked about this in-depth in Consulting Essentials – Getting the Business.

We also think there is a reason Why Generalists are better at scaling the web.

3. Play the numbers game

For hiring managers this doesn’t mean working through recruiters that might be bringing subpar talent, it means networking through industry events, meetups, startup pitch and venture capital events. There are a few every single day in NYC and there’s no reason not to go to some of them.

For candidates, be eyeing a few different companies, and following up on more than one prospect. You should really think of this process as an integral and enjoyable part of your career, not a temporary in between stage. Networking doesn’t happen overnight, but from a regular process of meeting and engaging with colleagues over years and years in an industry.

At the end of the day hiring is a numbers game so you should play it as such. Keep searching, and always be watching the horizon.

No iPhones Were Harmed in the Creation of this Outage

Apple’s recent iMessage outage had some users confused. What do you mean I can’t text my favorite cat photos?? How can Apple do this to me!?!?

What happened?

Apple provides services to everyone who uses its platform. iCloud for example stores your contacts, calendar, photos, apps and documents in the cloud. No more syncing to iTunes to make sure all your stuff is backed up. It’s automatic in the cloud. Yes, of course, unless iCloud is down.

Same goes for iMessage. Apple has quietly introduced this, as a more feature rich version of text messaging. It’s great until the service isn’t available. What gives?

All these services are backed magically or not so magically by computer servers. These computers sit in datacenters, managed by operations teams, and to some degree with automation. All the things that brought down AWS & AirBNB & Reddit with it could also take out Apple. A serious storm like Sandy also presents real risks.

[quote]
iMessage is a text and SMS replacement service for iPhones & iPads. It is more feature rich, offering device synchronization, group texting & return receipt. But in a very big way it is also an attempt for Apple to muscle into the market and further extend its platform reach.
[/quote]

100% uptime ain’t easy

Even for firms that promise insanely good uptime, five nines remains very very hard to achieve in practice.

For starters, all the components behind your service need to be redundant: multiple load balancers, webservers, caching servers, and of course databases that hold all your business assets.

But as the repeated AWS outages attest, even redundancy here isn’t enough. You also need to use multiple cloud providers. Here you can mirror across clouds so even an outage in one won’t bring down your business.
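To put rough numbers on why redundancy helps, here is a back-of-the-envelope sketch. It assumes failures are independent, which the AWS outages suggest is optimistic, and the 99.95% input is only an illustrative figure.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant components, assuming independent failures.

    The service is down only when every copy is down at the same time.
    """
    return 1 - (1 - single) ** copies


if __name__ == "__main__":
    # Illustrative input: each copy is available 99.95% of the time.
    for copies in (1, 2, 3):
        print(copies, f"{combined_availability(0.9995, copies):.8f}")
```

Each extra copy multiplies the chance of a total outage by the failure rate of one copy, but only if the copies really do fail independently. Two servers in the same datacenter, or two zones behind the same control plane, often don't.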

What about in the world of messaging? Well you can bet your customers don’t likely know or care about high availability, uptime, or any of these other web operations buzzwords. But they sure understand when they can’t use their service. It may give companies like Apple pause as they try to stretch themselves into areas outside their core business of iPhones, iPads, and the iOS platform itself.

iMessage – messaging standards power play

When I first upgraded to an iPhone 4S, the first thing I noticed was the light blue bubbles when texting certain people. Why was that, I wondered? I quickly found out about iMessage, which was conveniently configured to replace my old and trusty text messaging.

Texts or SMS work across all phones, smartphone or not, and Apple or not. But open standards don’t lend themselves well to market muscle and dominance. So it makes sense that Apple would be pushing into this space. I met more than one BlackBerry owner who loved using BBM to keep in touch with colleagues. It’s like your own private club. And that muscle further strengthens Apple’s platform overall. Just take a look at how the Android Ecosystem is broken if you need an example of what not to do.

The flip side is it means you have more to manage. More servers, more services, more dimensions to your business. More frequent outages that can tarnish your reputation.

[quote]
A lot of complaining and publicity, like the iMessage outage received, may just be an indication that you’re big enough for people to care.
[/quote]

Alternatives abound…

There is huge competition in the messaging space. The outage and its publicity further underline this fact.

For example on the iPhone for messaging there is ChatOn, WhatsApp, LINE, Skype & WeChat just to name a few.

Interestingly, while researching this article, I downloaded WhatsApp to give it a try. Only 99 cents, why not. Turns out that they had not one, but two outages, just a week ago. Seems Apple isn’t the only one experiencing growing pains.

A lot of complaining and publicity could be a sign that you’re big enough for people to care!

Crisis Management in the Crosshairs – Sandy

Crisis Management During Sandy

The news this past week has brought endless images of devastation. All across the metropolitan region, the damage is apparent.

More than once in conversation I’ve commented “That’s similar to what I do.” The response is often one of confusion. So I go on to clarify. Web operations is every bit about disaster recovery and crisis management in the datacenter. If you saw Con Edison down in the trenches you might not know how that power gets to your building, or what all those pipes down there do, but you know when it’s out! You know when something is out of order.

That’s why datacenter operations can learn so much about crisis management from the handling of Hurricane Sandy.

This is a followup to our popular article last week Real Disaster Recovery Lessons from Sandy.

1. Run Fire Drills

Nothing can substitute for real world testing. Run your application through its paces, pull the plugs, pull the power. You need to know what’s going to go wrong before it happens. Put your application on life support, and see how it handles. Failover to backup servers, restore the entire application stack and components from backups.
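One small piece of a fire drill is proving that a backup actually restores to what you started with. Here is a minimal sketch of that check. The file names are made up, and the plain file copy is only a stand-in for whatever restore procedure you really use.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_drill(original: Path, backup: Path, restore_dir: Path) -> bool:
    """Restore `backup` into `restore_dir` and check it matches `original`."""
    restored = restore_dir / original.name
    shutil.copy2(backup, restored)  # stand-in for a real restore procedure
    return sha256_of(restored) == sha256_of(original)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        original = root / "data.db"  # hypothetical file standing in for real data
        original.write_bytes(b"business assets")
        backup = root / "data.db.bak"
        shutil.copy2(original, backup)
        (root / "restore").mkdir()
        print("restore ok:", restore_drill(original, backup, root / "restore"))
```

A real drill would restore onto a separate server and start the application against the restored data, but the principle is the same: don't trust a backup you haven't restored.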

2. Let the Pros Handle Cleanup

This week Fred Wilson blogged about a small data room his family managed, for their personal photos, videos, music and so forth. He ruminated on what would have happened to that home datacenter, were he living there today when Sandy struck.

It’s a story many of us can relate to, and points to obvious advantages of moving to the cloud. Handing things over to the pros means basic best practices will be followed. EBS storage, for example, is redundant, so a single hard drive failure won’t take you out. What’s more, S3 offers geographically distributed redundant copies of your data.

After last week’s AWS outage I wrote that AirBNB & Reddit didn’t have to fail. What’s more in the cloud, disaster recovery is also left to the professionals.

[quote]
Web Operations teams do what Con Edison does, but for the interwebs. We drill down into the bowels of our digital city, find the wires that are crossed, and repair them. Crisis management rules the day. I can admire how quickly they’ve brought NYC back up and running after the wrath of storm Sandy.
[/quote]

3. Have a few different backup plans

Watching New Yorkers find alternate means of transportation into the city has been nothing short of inspirational. Trains not running? A bus service takes their place. L trains not crossing the river? A huge stream of bikes takes to the Williamsburg Bridge to get workers to where they need to go.

Deploying on Amazon can be a great cloud option, but consider using multiple cloud providers to give you even more redundancy. Don’t put all your eggs in one basket.

Some very important things to remember about MySQL backups.

4. Keep Open Lines of Communication

While recovery continued apace, city dwellers below 34th street looked to text messages, and old school radios to get news and updates. When would power be restored? Does my building use gas or steam to heat? Why are certain streets coming back online, while others remain dark?

During an emergency like this one, it becomes obvious how important lines of communication are. So too in datacenter crisis management, key people from business units, operations teams, and dev all must coordinate. Orchestrating that is an art all by itself. A great CTO knows how to do this.

Cloud Operations Interview

What does a cloud computing expert need to know? How do you hire a cloud computing expert? Competition for operations & DBAs is fierce, so you’ll want to know how to find the best.

If you’re a systems administrator or ops guy, you may want to prepare for an interview for such a position. Meanwhile, if you’re a director of IT or operations, a recruiter or manager in HR, you’ll want to have some idea how to find the right candidate.

Here’s my guide to do just that. You may also jump to part two Cloud Deployment Interview or the last part three Cloud DBA, Architecture and Management Interview.

1. Solid unix systems administrator

At the top of the list, a cloud operations expert needs to understand Unix and more importantly Linux. Here are some sample questions to get the conversation moving:

o What is web operations and what have you done day-to-day?

Prepare some stories.

o What’s your favorite feature of the Linux kernel?

This is an open ended question, but a systems administrator should have some knowledge here. The kernel is the most basic piece of software that runs when a computer boots up, whether it is a desktop or a server. This piece of software coordinates everything, manages resources, and directs traffic.

o Name some distributions of Linux. What is a distro?

Linux is built by a collaborative team of thousands on the internet. That’s what makes it open source. The distributions include the operating system, along with a collection of software to go along with it. All the supporting utilities, libraries and servers must be compiled and held in a repository. That’s what makes up a distribution. Debian, Red Hat and Ubuntu are a few popular ones.

[quote]
A cloud operations expert needs to have a wide ranging skillset, from unix administration, architecture, scalability, database & webserver administration, troubleshooting & performance, load & stress testing. You’ll also want someone who has learned hard lessons from some failures, has some war stories to tell and has a hard nose for stability.
[/quote]

o What’s the difference between apache and nginx?

These two pieces of software are both webservers, that is they respond to the HTTP protocol, and can serve HTML pages. They also have a myriad of plugins to support different languages and features. The difference? Nginx (pronounced engine-X) is a newer incarnation. It’s been rearchitected from the ground up, building on all the things learned from Apache over the years. It has tighter, more efficient code, and is easier to configure.

You might also enjoy our Intro to EC2 Cloud Deployments Guide.

o What is a key value store? examples?

A key value store is a database that maps each key to a single value, with simple get and set operations. There are lots of examples of these types of databases. Many are used as a very simple memory cache that can interface with most applications. Memcached is a popular example of a key value store. Redis, CouchDB and Voldemort can also do this.
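If it helps to picture it, here is a toy key value store in a few lines. It is only an illustration of the get/set idea with optional expiry, not how Memcached or Redis are implemented, and the class and method names are made up.

```python
import time
from typing import Any, Optional


class TinyKeyValueStore:
    """In-memory key value store with optional per-key expiry (illustrative only)."""

    def __init__(self) -> None:
        self._data = {}  # key -> (value, expiry time or None)

    def set(self, key: str, value: Any, ttl_seconds: Optional[float] = None) -> None:
        expires_at = None
        if ttl_seconds is not None:
            expires_at = time.monotonic() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key: str, default: Any = None) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # lazily evict expired entries
            return default
        return value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)
```

An application would typically check the store first and only fall back to the database on a miss, then write the result back with a short expiry.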

o What is a page cache? Reverse proxy cache? examples?

These are all the same thing. They are basically a very minimal webserver without all the plugins or bells and whistles. You put one of these in front of your webserver to handle all the easy stuff, and speed up overall throughput. Varnish is a popular example.
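The idea can be sketched in a few lines. This is a toy, not how Varnish works internally. It never expires or invalidates anything, and `slow_webserver` is a made-up stand-in for your real origin server.

```python
from typing import Callable, Dict


class PageCache:
    """Serve repeated requests for the same path from memory instead of the origin."""

    def __init__(self, origin: Callable[[str], str]) -> None:
        self._origin = origin
        self._pages: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def fetch(self, path: str) -> str:
        if path in self._pages:
            self.hits += 1
            return self._pages[path]
        self.misses += 1
        page = self._origin(path)  # only reached on a cache miss
        self._pages[path] = page
        return page


def slow_webserver(path: str) -> str:
    """Hypothetical origin; a real one would render templates or query a database."""
    return f"<html><body>content for {path}</body></html>"


if __name__ == "__main__":
    cache = PageCache(slow_webserver)
    cache.fetch("/about")
    cache.fetch("/about")
    print("hits:", cache.hits, "misses:", cache.misses)
```

The second request for the same path never touches the origin, which is where the throughput win comes from.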

o What filesystem do you prefer?

This is a bit arcane, but one should have some opinions here. xfs is a popular filesystem, though ext3 and ext4 are also common. Emphasize the journaling aspect here. Journaling means that if you pull the cord or your server crashes, the filesystem can recover upon reboot. It does this by journaling changes, much as a database keeps a redo log of recent changes to database tables.
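Here is a rough sketch of the journaling idea, using a tiny key/value store instead of a filesystem. It is an analogy only. Real filesystems journal blocks and metadata, and the class name and record format here are invented for illustration.

```python
import json
import os
import tempfile


class JournaledStore:
    """Key/value state protected by an append-only journal (a toy write-ahead log)."""

    def __init__(self, journal_path: str) -> None:
        self.journal_path = journal_path
        self.state = {}
        self._replay()

    def _replay(self) -> None:
        """Rebuild state after a crash by re-applying every complete journal record."""
        if not os.path.exists(self.journal_path):
            return
        with open(self.journal_path, "r", encoding="utf-8") as journal:
            for line in journal:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # a torn final record means that write never completed
                self.state[record["key"]] = record["value"]

    def set(self, key: str, value) -> None:
        """Record the change durably in the journal before applying it in memory."""
        with open(self.journal_path, "a", encoding="utf-8") as journal:
            journal.write(json.dumps({"key": key, "value": value}) + "\n")
            journal.flush()
            os.fsync(journal.fileno())
        self.state[key] = value


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "journal.log")
        store = JournaledStore(path)
        store.set("hostname", "web01")
        # Simulate a crash: throw the object away and recover from the journal alone.
        recovered = JournaledStore(path)
        print(recovered.state)
```

The key design choice is the ordering: the change hits durable storage first, so after a crash the journal is always at least as up to date as whatever was in memory.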

o Command line tools

There are lots of commands in the day-to-day toolbox of a web ops expert. Here are some examples:
rsync (pronounced our-sync) – sync files between servers & do checksums to allow easy restarts
scp (pronounced s-c-p) – secure copy, similar to rsync but no checksums, so less reliable
curl (pronounced kurl) – diagnose & test urls and HTTP from the command line
cron (pronounced cron) – run commands at scheduled times
ssh (pronounced s-s-h) – secure shell, the most basic tool to reach a cloud server
ifconfig (pronounced if-config) – check the network interfaces on the server
vi/emacs (pronounced v-i and e-macks) – terminal editors, to modify config files
uptime (pronounced up-time) – display the current load average of the server
top (pronounced top) – interactive display of system metrics like memory, load, swap & processes
ps (pronounced p-s) – shows running processes on the server
/var/log/messages – essential system logfile
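A candidate who scripts should also know that some of these numbers are available programmatically. As a small sketch, the load averages that uptime prints can be read from Python's standard library on Unix systems. The per-CPU threshold below is an arbitrary assumption, not a rule.

```python
import os


def load_averages():
    """Return the 1, 5 and 15 minute load averages, the same figures `uptime` shows."""
    return os.getloadavg()  # Unix only; raises OSError if unavailable


def is_overloaded(threshold_per_cpu: float = 1.0) -> bool:
    """Rough check: is the 1 minute load above `threshold_per_cpu` per CPU core?"""
    one_minute, _, _ = load_averages()
    cpus = os.cpu_count() or 1
    return one_minute / cpus > threshold_per_cpu


if __name__ == "__main__":
    print("load averages:", load_averages())
    print("overloaded:", is_overloaded())
```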

o What are application servers? How are they different from webservers?

Tomcat & GlassFish are two examples of application servers. These handle heavier weight languages & applications like Java. An application server is on some level just a more heavy duty webserver, and these days Apache can be thought of as an application server also.


2. Cloud concepts

o What is virtualization? What is a hypervisor?

Virtualization allows you to run one or more computers within a computer. You can do virtualization on a desktop, sharing network, memory, cpu and disk resources among a number of virtual servers. But more importantly in cloud computing or IaaS offerings you can do virtualization at the datacenter level. The hypervisor layer is a datacenter virtualization technology that provisions server resources, and balances shared network and disk resources.

o What is an image?

In the Amazon world, the AMI or Amazon Machine Image is a snapshot of a server state at one moment in time. This image is taken at the block level, and includes the master boot record, the first block on disk that a server boots from. All that is the state of a server, when it is shut down, is what is stored on disk or in this image: all config files, logfiles, and anything else written to disk.

o What is multi-tenant?

This means that there are multiple virtual servers sharing the resources of the same physical hardware. The tenants are the customers who each want to get the server, cpu, memory, network and disk that they paid for.

o What is the downside to shared resources?

Contention for resources is always the challenge. If your fellow tenants are not very thirsty, this can work to your advantage. But if they’re also heavy users, the hypervisor layer has to manage the balancing act. You may get a spike of disk I/O at one point, but later get a dearth. This can cause a relational database like MySQL or Oracle to suddenly look stalled.

o What is instance-store? What is ebs?

Instance store servers were Amazon’s original offering, where servers had their own local (and slow) storage. This storage is ephemeral, so its contents are lost when the instance is stopped or terminated (though they survive a simple reboot). These servers also boot slowly. EBS, also known as Elastic Block Store, is a virtualized storage option, similar to NAS or NFS. You can create arbitrary chunks of storage, and attach them to servers, all from command line APIs. Cool!

o What is virtual private cloud?

With the VPC offering, Amazon drops a router into your existing datacenter. You can then provision virtual servers to your heart’s content, and they all appear to be servers in your existing datacenter. Elastically scale, within the network and security model you’re already using.

o What is a hybrid approach to cloud adoption?

Keeping your investments in hardware and datacenter is obviously an appealing option for firms that have a large existing environment. A hybrid approach with a VPC allows you to get your feet wet, but still keep essential applications on physical servers.

o What is Amazon EC2?

Elastic Compute Cloud refers to the virtual servers you spin up in Amazon Web Services.

o What is Amazon RDS, Oracle RDS, MySQL RDS?

Amazon has various relational and non-relational database offerings. RDS stands for relational database service.

RDS or roll your own – which is better? Here are some use cases to help you decide.

o What is multi-az?

Amazon’s infrastructure offering isn’t just a single datacenter with servers. The beauty of what they’ve built is that they offer a number of datacenters (called availability zones) in each of many regions such as Northern Virginia, Oregon and Singapore.

Incidentally multi-az is a key feature to how businesses can protect themselves from failure. Amazon recently had an outage, but AirBNB, Reddit & Foursquare didn’t have to fail.

o What does a CDN do? How does it work? examples?

A CDN is a content delivery network. Remember all those files that make up a webpage? Images, video, css files? Turns out serving these components from servers *closer* to your customer makes their webpages load much faster. CDNs are networks of servers that hold the content of your pages, and serve them faster.

It works by replacing content paths with a special one from your provider. A simple change in your code will allow content to dynamically load from across the web. Cool!
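As a rough sketch of that path replacement, the helper below rewrites static asset URLs to point at a CDN hostname and leaves everything else alone. The hostname and the list of extensions are assumptions for illustration. Your provider will give you the real hostname.

```python
from urllib.parse import urlsplit, urlunsplit

CDN_HOST = "cdn.example.com"  # hypothetical hostname issued by a CDN provider
STATIC_EXTENSIONS = (".png", ".jpg", ".gif", ".css", ".js", ".mp4")


def to_cdn_url(url: str, cdn_host: str = CDN_HOST) -> str:
    """Point static asset URLs at the CDN host; leave dynamic pages untouched."""
    parts = urlsplit(url)
    if not parts.path.lower().endswith(STATIC_EXTENSIONS):
        return url
    scheme = parts.scheme or "https"  # relative paths get an explicit scheme
    return urlunsplit((scheme, cdn_host, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(to_cdn_url("https://www.example.com/img/logo.png"))
    print(to_cdn_url("https://www.example.com/checkout"))
```

In practice this usually lives in a template helper, so every image or stylesheet tag picks up the CDN host without touching individual pages.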

CloudFront is Amazon’s offering coupled with S3 for file storage. Akamai is another big provider.

We’re not done yet. In part two on deployments and part three of this series, we’ll hit on other important skills a cloud ops expert should have including scripting, database administration (Our MySQL Interview Guide), scalability, performance, configuration management, metrics, monitoring, and some all important war stories!

Here are some questions to pique your interest:

o Why does the API battle between Amazon & Eucalyptus (FOSS) matter?
o Do you use command line tools? why?
o What can go wrong with backups? how do we test them?
o Should we encrypt filesystems in the cloud? what are the risks?
o Should we use offsite backups?
o What is DRBD?
o Why is auditing important? access control?
o What is load balancing? why is it difficult with databases?
o How do you perform a benchmark? perform load testing?
o Why use a package manager? can we install from source?

Our Deploying MySQL on Amazon EC2 Guide is also related to this interview process.

You may also jump to part two Cloud Deployment Interview or the last part three Cloud DBA, Architecture and Management Interview.

AirBNB didn't have to fail

Today part of Amazon Web Services failed, taking down with it a slew of startups that all run on Amazon’s Cloud infrastructure. AirBNB was one of the biggest, but Heroku, Reddit, Minecraft, Flipboard & Coursera also went down with it. It’s not the first time. What the heck happened, and why should we care?

1. Root Cause

The AWS service allows companies like AirBNB to build web applications, and host them on servers owned and managed by Amazon. The so-called raw iron of this army of compute power sits in datacenters. Each datacenter is a zone, and there are many in each of their service regions including US East (Northern Virginia), US West (Oregon), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), South America (Sao Paulo), and AWS GovCloud.

Today one of those datacenters in the Northern Virginia region had a failure. What does this mean? Essentially firms like AirBNB that hosted their applications ONLY in Northern Virginia experienced outages.

As it turns out, Amazon has a service level agreement of 99.95% availability. We’ve long since said goodbye to the five nines. HA is overrated.
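To make those percentages concrete, here is a quick bit of arithmetic that turns an availability target into the downtime it allows per year. It is only a sketch, and it ignores leap years and how an SLA actually measures an outage.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service may be down and still meet the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.95, 99.999):
        hours = allowed_downtime_hours(target)
        print(f"{target}% allows {hours:.2f} hours/year ({hours * 60:.1f} minutes)")
```

At 99.95% you get a little over four hours of downtime a year. Five nines leaves you barely five minutes.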

2. Use Redundancy

Although there are lots of pieces and components to a web infrastructure, two big ones are webservers and database servers. Turns out AirBNB could make both of these tiers redundant. How do we do it?

On the database side, you can use Amazon’s multi-az or alternately read-replicas. Each have different service characteristics so you’ll have to evaluate your application to figure out what will work for you.

Then there is the option to host MySQL or Percona directly on Amazon servers yourself and use replication.

[quote]Using redundant components like placing webservers and databases in multiple regions, AirBNB could avoid an Amazon outage like Monday’s that affected only Northern Virginia.[/quote]

When do I want RDS versus MySQL? Here are some use cases for RDS versus roll your own MySQL.

Now that you’re using multiple zones and regions for your database the hard work is completed. Webservers can be hosted in different regions easily, and don’t require complicated replication to do it.
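Once you have copies running in more than one place, something has to decide where traffic goes. Here is a minimal sketch of that decision, assuming you already have some health check. The endpoint names are hypothetical, and in production this job usually belongs to DNS failover or a load balancer rather than application code.

```python
from typing import Callable, Iterable, Optional


def first_healthy(endpoints: Iterable[str], is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first endpoint that passes its health check, or None if all fail."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None


if __name__ == "__main__":
    # Hypothetical endpoints; the first entry is the preferred (primary) region.
    endpoints = ["web.us-east-1.example.com", "web.us-west-2.example.com"]
    down = {"web.us-east-1.example.com"}  # simulate a Northern Virginia outage
    print(first_healthy(endpoints, lambda host: host not in down))
```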

3. Have a browsing only mode

Another step AirBNB can take to be resilient is to build a browsing only mode into their application. Often we hear about this option for performing maintenance without downtime. But it’s even more valuable during a situation like this. In a real outage you don’t have control over how long it lasts or WHEN it happens. So a browsing only mode can provide real insurance.

For a site like AirBNB this would mean the entire website was up and operating. Customers could browse and view listings, and only when they went to book a room would they encounter an error. This would be a very small segment of their customers, and a much less painful PR problem.

Facebook has experienced intermittent outages of its service. People hardly notice because they’ll often only see a message when they are trying to comment on someone’s wall post, send a message or upload a photo. The site is still operating, but not allowing changes. That’s what a browsing only mode affords you.

[quote]A browsing only mode can make a big difference, keeping most of the site up even when transactions or publish are blocked.
[/quote]

Drupal, an open source CMS that powers sites like Adweek.com, TheHollywoodReporter.com, and Economist.com uses this technology. It supports a browsing only mode out of the box. An Amazon outage like this one would only stop editors from publishing new stories temporarily. A huge win to sites that get 50 to 100 million with-an-m pageviews per month.
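In application code, a browsing only mode can be as simple as a switch that every write path checks and every read path ignores. Here is a toy sketch. The class, the listing data and the error message are all invented for illustration, and it assumes reads can still be served from a replica or cache that survived the outage.

```python
class ReadOnlyModeError(Exception):
    """Raised when a write is attempted while the site is in browsing only mode."""


class ListingSite:
    """Toy site whose reads keep working while writes are switched off."""

    def __init__(self) -> None:
        self.read_only = False
        self._listings = {"loft-nyc": "Sunny loft in Brooklyn"}
        self._bookings = []

    def view_listing(self, listing_id: str) -> str:
        return self._listings[listing_id]  # reads never check the flag

    def book(self, listing_id: str, guest: str) -> None:
        if self.read_only:
            raise ReadOnlyModeError("Booking is temporarily unavailable; browsing still works.")
        self._bookings.append((listing_id, guest))


if __name__ == "__main__":
    site = ListingSite()
    site.read_only = True  # flipped by ops during an outage
    print(site.view_listing("loft-nyc"))
    try:
        site.book("loft-nyc", "sean")
    except ReadOnlyModeError as error:
        print(error)
```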

4. Web Applications need Feature Flags

Feature flags give you an on/off switch. Build them into heavy duty parts of your site, and you can disable those in an emergency. Host components in multiple availability zones for extra peace of mind.

One of our all time most popular posts 5 Things Toxic to Scalability included some in-depth discussion of feature flags.
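A bare-bones version looks something like this. It is a sketch only. The flag name and the page being rendered are made up, and a real system would keep the flags somewhere ops can change them without a deploy.

```python
class FeatureFlags:
    """In-memory on/off switches; unknown flags default to off."""

    def __init__(self, **initial: bool) -> None:
        self._flags = dict(initial)

    def enable(self, name: str) -> None:
        self._flags[name] = True

    def disable(self, name: str) -> None:
        self._flags[name] = False

    def is_enabled(self, name: str) -> bool:
        return self._flags.get(name, False)


def render_listing(flags: FeatureFlags, title: str) -> str:
    """Render a page, skipping the expensive recommendations block if it is switched off."""
    page = [title]
    if flags.is_enabled("recommendations"):
        page.append("You might also like...")  # stand-in for a heavy query
    return "\n".join(page)


if __name__ == "__main__":
    flags = FeatureFlags(recommendations=True)
    print(render_listing(flags, "Sunny loft"))
    flags.disable("recommendations")  # emergency: shed the heavy feature
    print(render_listing(flags, "Sunny loft"))
```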

5. Consider Netflix’s Simian Army

Netflix takes a very progressive approach to availability. They bake redundancy and automation right into all of their infrastructure. Then they run an app called the Chaos Monkey which essentially causes outages, randomly. If resilience from constantly falling and getting back up can’t make you stronger, I don’t know what can!
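To get a feel for the idea, here is a toy simulation of random failures against a pool of redundant servers. To be clear, this is not Netflix's tool or its API. It is a made-up sketch of the principle that the service should keep answering as long as one server survives.

```python
import random
from typing import List, Optional


class ServerPool:
    """Toy pool of redundant servers behind a load balancer."""

    def __init__(self, names: List[str]) -> None:
        self.alive = set(names)

    def kill_random(self, rng: random.Random) -> Optional[str]:
        """Terminate one random live server, as a chaos test would."""
        if not self.alive:
            return None
        victim = rng.choice(sorted(self.alive))
        self.alive.discard(victim)
        return victim

    def handle_request(self) -> str:
        """Serve from any surviving server; fail only if none are left."""
        if not self.alive:
            raise RuntimeError("total outage: no servers left")
        return f"served by {min(self.alive)}"


if __name__ == "__main__":
    pool = ServerPool(["web01", "web02", "web03"])
    rng = random.Random()
    print("killed:", pool.kill_random(rng))
    print(pool.handle_request())  # still answers with two servers left
```

Running something like this on purpose, against real infrastructure, is what turns redundancy from a diagram into something you have actually tested.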

Take a look at the Netflix blog for details on intentional load & stress testing.

6. Use multiple cloud providers

If all of the above isn’t enough for you, taking it further you’d do as George Reese of enStratus recommends and use multiple cloud providers. Not being beholden to one company could help in more situations than just these types of service disruptions too.

Basic EC2 Best Practices mean building redundancy into your infrastructure. Multiple cloud providers simply take that one step further.

Why do people leave consulting?

Join 12,100 others and follow Sean Hull on twitter @hullsean.

As a long time freelancer, it’s a question that’s intrigued me for some time. I do have some theories…

First, definitions… I’m not talking about working for a large consulting firm. Although this role may be called “consultant”, my meaning is consultant as sole proprietor, entrepreneur, gun for hire or lone wolf.

1. Make more money in a fulltime role

I’ve met a lot of people who fall into this trap. They take a fulltime role simply because it pays better. That raises a lot of questions…

o Are you pricing right?

You could be pricing too high to get *enough* work. You may also be pricing too low to cover benefits, health insurance and so forth. Or perhaps you can’t sell to your rate. You can be smart skills-wise, but do you feel your clients’ pain? Are you good at being a businessman? Consistent?

o Can you sell, and put together an appealing proposal?

o Can you execute to the client’s satisfaction?

o Can you followup consistently while accounts payable gets tied up in knots?

o Can you followup if your client executes past their spend?

Running a business is complicated, and a lot of expenses can be hard to juggle. You will find times when a client may have spent a little faster than their revenue, and have trouble finding money when the invoice arrives. Followup, patience and persistence are key.

Read: Why high availability is so very hard to deliver

Want more? We wrote an in depth 3 part guide to consulting.

2. Make a consistent paycheck in a fulltime position

o Are you networking enough?

If you take a longterm gig and get comfortable, your pipeline can dry up. And your pipeline is the key to your longterm strength, and regular business. You must get out there, and let people know about you, your services, and your availability.

If you don’t network regularly, post across the web, engage on social media channels, blog regularly and so forth, you’ll likely just land a series of 6-12 month fulltimeish gigs through recruiters or headshops.

Related: 5 ways to evaluate independent consultants

[quote]Being a freelancer or entrepreneur involves wearing many hats. Finding business involves networking & marketing. Delivering to their needs involves emotional intelligence. And actually getting paid on time is a whole artform in itself. Leave a good taste in their mouth and your reputation will spread quickly by word of mouth.[/quote]

o Do you really *LIKE* being an entrepreneur?

Are you consistent? Consulting is like running a marathon; if you burn out you may give up!

Have a large web property or application which is experiencing some growing pains? Take a look at how we do performance reviews. It may be just what you’re looking for.

Related: MySQL interview guide for managers and candidates alike

3. Do you like the lifestyle of larger corporate environments?

o Fulltime roles allow for much more jedi sword play. Maneuvering up the ranks involves relationship building as much as consulting, but with a more well defined ladder to climb.

o Sometimes you’ll find passing the buck and finger pointing quite common.

o There are roles involving managing people and processes. These less often lend themselves to short term or situational consulting arrangements. If you lean towards those roles

Trying to hire top tech talent? Here’s our MySQL DBA hiring guide & interview questions

[quote]Working as a sole proprietor for a couple of decades has taught me to be very entrepreneurial. It is every bit about building a real-world startup[/quote]

4. Want to do more cutting edge & at the keyboard work

Consulting can and often does allow you to bump into the latest technologies, and get your feet wet with what cutting edge firms are doing. However in a fulltime role you can more completely immerse yourself in the technology, and those long term solutions.

Also: Why devops talent is in short supply

o You can take part in R&D – Google’s 20% projects, for example

o You can build hypothetical projects

o You can work in more idealistic environments, operations and even lectures & training

Though you can certainly do all of this as a freelancer, you have to build enough capital, and so forth to make it work.

Juggling job roles as a consultant isn’t easy. What a CTO must never do.

5. Don’t like running a small business

Consulting as a sole proprietor and staying in business for almost twenty years, I’ve learned that it is every bit about running a small business or startup.

A. Acquiring customers, networking, marketing
B. Understanding their needs and delivering to improve their position
C. Pricing in a way your customers understand
D. Offering value to your customers, at a competitive price
E. Managing relationships so your brand or reputation precedes you
F. Making sure payments and invoicing aren’t a hurdle, and following up
G. Pacing yourself like a marathon runner – keep doing what you’re doing right

Read this far? Get our scalable startups monthly newsletter. We cover these topics in detail, year in and year out.

Anatomy of a Performance Review

A lot of firms come to us with a specific scalability problem. “Our user base is growing rapidly and the website is falling over!” Or they’re selling more widgets, “Our shopping cart is slowing down and we’re seeing users abandon their purchases”. These are real startup growing pains, so what to do?

We like to take a measured approach with these types of challenges, so we thought it would be helpful to run through a hypothetical scenario and see how we work.

Related: Why website speed is crucial to business

Having trouble with scalability? Check out our 5 things toxic to scalability piece.

1. Contract outline

First we talk on the phone, or meet face to face and discuss what’s happening. Do you have one page that’s problematic? Is the website slow during certain hours? Or are you seeing erratic behavior and can’t point to a single source?

From there we outline a course of action, based on:

o talking with team, devs & architects
o reviewing systems first hand
o identifying bottlenecks and trouble spots

With this outline we’ll include an estimate of the number of work days it’ll take to complete. We’ll then send that back to you for review, exchange a deposit and set a start date.

2. Meet team & discuss architecture

Next we’ll meet the team and review the problems in more technical detail. If you’re in NYC we’ll probably stop by your offices for a warm meet & greet. If you’re located further afield we can either meet over a Skype call, or arrange to travel to your location for the start of the engagement.

3. Measure current throughput

In order to get a sense of the current state of the systems we’ll measure some system metrics. This could be load average or queries per second or other MySQL internal metrics. We’ll also look at some business metrics such as speed of an ecommerce checkout, or a speed test on a particularly slow page.

These metrics are designed to create a baseline of where things are before any changes are made.

[quote]Measuring both business and system metrics before and after changes allows a rough ROI measurement to be done. This goes a long way towards justifying the expense of a performance review, current and future.[/quote]
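
As a rough sketch of what one baseline number might look like, the Python below derives average queries per second from two readings of MySQL’s cumulative Questions status counter (the value shown by SHOW GLOBAL STATUS LIKE 'Questions'). The function name and the sample readings are made up for illustration, and how you collect the readings is left open.

```python
def queries_per_second(first_count, second_count, interval_seconds):
    """Average QPS between two readings of a cumulative query counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if second_count < first_count:
        # A cumulative counter only goes backwards after a reset.
        raise ValueError("counter went backwards; was the server restarted?")
    return (second_count - first_count) / interval_seconds


# Hypothetical readings taken 60 seconds apart.
baseline_qps = queries_per_second(1_200_000, 1_260_000, 60)
print(baseline_qps)
```

Sampling over a longer interval, and at the busy hours identified in step 1, usually gives a steadier baseline than a single short sample.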

4. Review systems, configurations & setups

Next we’ll jump on the various systems and review configurations. This includes webservers, caching servers and the database servers as necessary. We’ll review memory settings, important configurations, all the dials and switches.

Along with this we’ll also review development and architecture. Are you using Java with Hibernate, a popular ORM? Or perhaps CakePHP? Are you writing custom SQL code? Are developers up to speed with EXPLAIN and query profiling? For that matter is code in version control?

Just looking for a DBA? Check out our MySQL Hiring Guide.

5. Report on actionable advice & findings

Perhaps the most essential and useful part of an initial engagement is our overall findings and review report. We’ve found these are very valuable to firms as they speak to a lot of folks up and down the business hierarchy. They speak to management about high level architectural problems and structural or process related challenges. And they can speak well to developers and operations teams as they provide a third party bird’s-eye view of day-to-day activities.

Take a look at a sample report we’ve prepared for Acme StartUp, Inc.

6. Discuss which steps to move on

From here we’ll meet again. In particular we’ll review the actionable advice. Some changes will be low cost, requiring no downtime, while others might require a downtime window. Further medium term changes might require refactoring some code and deploying. Typically the larger longer term architecture changes will also be outlined.

Based on time & costs, we’ll decide together which changes are a priority. Obviously we’ll want to move on low hanging fruit first, and move forward from there.

Want to learn more about us? Check out our testimonials and our about page.

7. Take action on agreed changes

Once we’ve decided which changes we’ll make, we’ll schedule downtime windows as needed and make the changes to systems. From there we’ll carefully observe everything for stability, and to confirm there are no adverse effects.

8. Measure throughput again

Based on the throughput measurements in #3 above, we’ll perform those same benchmarks again. We’ll check low level system metrics, along with higher level business & user based throughput. Both of these are important as they can provide different perspectives on changes made.

For example, if the system metrics improve markedly but the business or user metrics do not, we know our change had some effect on overall performance, but we likely did not identify the bottleneck that is directly causing the business slowdown.
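
To make that comparison concrete, here is a small Python sketch. It assumes both metrics are “lower is better” (load average, checkout seconds), and the 10% threshold, function names and sample numbers are all our own illustrative choices rather than fixed rules.

```python
def percent_improvement(before, after):
    """Percent drop in a lower-is-better metric; positive means it got better."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


def interpret(system_gain, business_gain, threshold=10.0):
    """Classify before/after results using an assumed significance threshold."""
    system_ok = system_gain >= threshold
    business_ok = business_gain >= threshold
    if system_ok and business_ok:
        return "both improved"
    if system_ok:
        return "system only: keep looking for the business bottleneck"
    if business_ok:
        return "business only: check what else changed"
    return "no meaningful change"


# Hypothetical measurements: load average 8.0 -> 4.0, checkout 6.0s -> 5.8s.
system_gain = percent_improvement(8.0, 4.0)
business_gain = percent_improvement(6.0, 5.8)
print(interpret(system_gain, business_gain))
```

With these sample numbers the load average halves while checkout time barely moves, which is the “system only” case described above.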

9. Summarize findings & performance gain

In the most likely case they both improve markedly, and we can measure the improvements from our entire process of performance review.

This can be helpful in measuring overall return on investment for the engagement. ROI is obviously an important exercise as we want to know that the money is well spent.
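
A back-of-the-envelope version of that ROI calculation might look like the Python below. The formula is the standard (gain minus cost) over cost; the dollar figures are purely hypothetical, and estimating the gain from faster pages or fewer abandoned carts is the hard part that this sketch does not attempt.

```python
def roi_percent(gain, cost):
    """Return on investment as a percentage: (gain - cost) / cost * 100."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (gain - cost) / cost * 100.0


# Hypothetical figures: a $10,000 engagement and $25,000 in estimated
# revenue recovered from faster checkouts over the following quarter.
engagement_roi = roi_percent(gain=25_000, cost=10_000)
print(engagement_roi)
```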

10. Document solutions & recommendations

The last step is to document what we did and what we learned. This allows us to carry forward that knowledge and keep applying it to the development and operations process. This allows the business to continue adding value from the engagement even after it’s completed.

Read this far? Grab our newsletter.

Why you should attend Percona Live 2012

What I loved about Percona Live 2011

Last year I was excited to go to Percona Live for the first time in NYC. I arrived just in time to hear Harrison Fisk from Facebook speak about some of the awesome tweaks they’re running with MySQL there. It’s not every day that you get to hear from top MySQL engineers how they’re using the technology and what their biggest challenges are. If they can make MySQL hum, so can the rest of us!

Afterward, outside in the foyer, I ran into all sorts of luminaries in the MySQL space. Percona folks like Peter Zaitsev & Vadim Tkachenko, plus other big names like Baron Schwartz, Harrison, and Ronald Bradford. I ran into people from firms like Yahoo, Google, Daniweb, Pythian, SkySQL & Palomino.

You might also like our Setup MySQL Replication with Hotbackups as well as How to deploy MySQL on Amazon EC2 servers articles.

What to expect at Percona Live NYC 2012

This year’s event next month features rockstar engineers from an incredible lineup of firms including Etsy, New Relic, Youtube, Paypal, Tumblr, SugarCRM, Square, and of course a few from Percona themselves. I promise you this, these talks won’t be salesy or in any way a waste of your time and money. They will be thoroughly technical talks, with cutting edge insights and advice from those in the trenches using the technology every day.

If I wasn’t heading to Oracle Open World for the publishers seminar & MySQL Connect, I would most certainly be there. In fact I had originally been slated to talk about point-in-time recovery in MySQL. Oh well, I’m sure I’ll catch you at the Percona Live in April 2013.

If you do decide to attend please enjoy a 15% discount with code “SeanHull” !

Looking to hire top MySQL talent? Check out our MySQL DBA Hiring Guide with advice for managers, recruiters, and candidates too! We also have an enduringly popular article about the mythical MySQL DBA and why they’re hard to find.

Also if you’ve read this far, please grab my newsletter scalable startups.

Beware the sales wolf in sheep suits

Recently a colleague called me up to get my opinion.

[quote]We’re in the process of standardizing our systems on Red Hat Linux, but management and higher ups are convinced we should deploy Oracle on Oracle’s own Linux distribution. Which is better?[/quote]

Therein lies the eternal drama in organizations, the push & pull between dollars and technology best practices.

We had a similar experience with a MySQL deployment, and solution framed by Oracle sales.

Battle lines are drawn

Clearly the battle lines are drawn now. Between director of operations & team versus management & business stakeholders, between high level and the trenches, or between the systems that support your business and day-to-day running of them.

Business units & management are tasked with budgets, cost management, and long term thinking about trajectory and what’s best for the business. Operations teams are tasked with the day-to-day stability, the command line perspective.

What is the sales team’s position?

Sales guys at Oracle have a job to sell licenses. This isn’t good or bad, it’s their driver. Understanding all the drivers will help us align them.

Sales guys sell to management, so they will likely frame all their stories to management concerns. Also Oracle’s history here is fairly clear. Get customers locked into Oracle up and down the stack, and they become more and more beholden to you as their primary provider. As customers become more dependent, Oracle will begin to squeeze more and more out of them.

Nothing personal, this is how money is made. But understand the goal.

How do OS choices affect the business bottom line?

Standardizing across the enterprise reduces costs & reduces operational complexity. This can reduce the risk of operator error & other downtime, which increases in a more heterogeneous environment.

On the Oracle distribution side, you likely have tweaks to make Oracle run better. However don’t forget the profit motive. Some tweaks may be conveniently “overlooked” in favor of profit. For example for many years the Oracle installer would not complete without error on many Linux systems. Imagine all the professional services that are sold around running through a complex install. Streamlining such an install would *reduce* profits. Don’t laugh.

What happens on the front lines?

On the front lines of course are the ops teams & DBAs, actually installing and supporting enterprise software. Let’s not forget these guys are at the command line. They know inordinately more about what’s really happening down in the trenches. You may find them repeatedly rolling their eyes at salesmen’s claims.

However they are not the colorful storytellers or communicators that salesmen are, so they may

Want to hire a DBA? Here’s our MySQL interview hiring guide. We also wrote a similar one for Oracle DBA Interview questions.

Align each division’s interests

Despite cultural differences, business management & operations teams should work hard to connect, and align with one another.

Operations should make an effort to better understand the business bottom line. Money doesn’t grow on trees as they say, and choices have to be based on budget, and real-world needs. We’d all like to sit in a university and program or build things just to create something new, but in a business there are market pressures. All teams should reflect on those.

Management should also make an effort to understand ops teams’ needs. Why are my ops teams telling me a different story than the Oracle sales guys? Fight the urge to bond with the sales folks, despite their smooth delivery, great suits and peer positioning.

Weigh short and long term tradeoffs

List out advantages & tradeoffs on all sides. These should be technical and business bullet points. Brainstorming a full list like this, and having the whole team discuss the list openly will help the team together come up with a more realistic outcome. Some questions to ask…

1. What are the advantages & disadvantages of having multiple providers for your technology stack?
2. Which solutions are open and which are proprietary? What are the tradeoffs there?
3. What does your team have subject matter expertise in?
4. Are there real technical advantages to one solution or the other?
5. Are there real cost advantages to one solution or the other?
6. Are there expertise advantages & training savings to go one direction?
7. Is the technology widely used in your industry? Will additional or replacement operations experts be easy or hard to find?

Read this far? Grab our scalable startups newsletter.