
Seasonal Traffic Variations – What is it and why is it important?

Web applications and websites get measurable traffic, recorded in metrics such as pageviews, unique visitors, and visits.  All of this activity translates to hits on a webserver, and work for a database to retrieve information for those pages.

During one month your application might get 150,000 visits.  Then during one week when a large ad campaign hits, or some marketing feature goes viral, you suddenly get 500,000 visits in one week!  This is a “good problem to have” on the business side, but poses great challenges to an infrastructure, as it represents roughly a 14x increase over a normal week.  What’s more, if you do your capacity planning around that peak, you’ll have over 90% of your computing power and servers sitting around idle most of the year (assuming that’s just a blip).
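As a back-of-the-envelope check on those figures, here is a small sketch.  It assumes an average month of 52/12 weeks and that capacity scales linearly with visits, both simplifications; the variable names are only illustrative.

```python
# Rough sizing arithmetic for a traffic spike, using the example figures above.
MONTHLY_VISITS = 150_000      # a normal month
PEAK_WEEK_VISITS = 500_000    # the week the campaign hits
WEEKS_PER_MONTH = 52 / 12     # assumption: average month length in weeks

baseline_week = MONTHLY_VISITS / WEEKS_PER_MONTH
spike_factor = PEAK_WEEK_VISITS / baseline_week

# If you provision for the peak, this share of capacity idles in a normal week.
idle_share = 1 - baseline_week / PEAK_WEEK_VISITS

print(f"baseline week: {baseline_week:,.0f} visits")
print(f"spike factor:  {spike_factor:.1f}x")
print(f"idle capacity when sized for the peak: {idle_share:.0%}")
```

The exact multiple shifts with how you convert a month into weeks, but the conclusion holds either way: sizing for the peak leaves most of the hardware idle.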

Therein lies the challenge of seasonal traffic variations.  Capacity planning attempts to watch for trends in traffic, and growth over time of your user base.  But large spikes like the one described above can often be difficult to predict.  The whim of the masses.

Sean Hull asks on Quora – What are seasonal traffic variations and why are they important?

Decoupling – What is it and why is it important?

Processes are said to be coupled when they are tightly wound together, and dependent on one another.  Decoupling loosens those dependencies, so that each piece can run, fail, and scale on its own.

A loose analogy might be replacing a traffic light with a traffic circle.  You keep the traffic moving, reducing the overall wait time for any car entering the intersection.

Decoupling web applications might involve replacing a makeshift queue your application currently implements in a table, with a message queuing service such as RabbitMQ or Amazon’s SQS.
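To make the idea concrete, here is a minimal sketch of the pattern.  It uses Python's in-process `queue.Queue` purely as a stand-in for a real broker such as RabbitMQ or SQS; the function names and the job format are hypothetical.

```python
import queue
import threading

def handle_request(jobs: queue.Queue, user_id: int) -> str:
    """Web tier: hand off the slow work and return right away."""
    jobs.put({"task": "send_welcome_email", "user_id": user_id})
    return "accepted"

def worker(jobs: queue.Queue, done: list) -> None:
    """Worker tier: drain the queue at its own pace."""
    while True:
        job = jobs.get()
        if job is None:              # sentinel value: time to shut down
            jobs.task_done()
            break
        done.append(job["user_id"])  # stand-in for the real slow work
        jobs.task_done()

jobs = queue.Queue()
processed = []
consumer = threading.Thread(target=worker, args=(jobs, processed))
consumer.start()

responses = [handle_request(jobs, uid) for uid in (1, 2, 3)]
jobs.put(None)    # ask the worker to stop once the backlog is done
jobs.join()       # wait until every queued item has been handled
consumer.join()
print(responses, processed)
```

The web tier never waits on the worker, so in principle either side can be scaled or restarted without touching the other.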

Ultimately decoupling promotes scalability, as you can scale the pieces of your infrastructure that your capacity planning identifies as bottlenecks.  What’s more, you can make those pieces redundant, increasing availability at the same time.

Sean Hull discusses on Quora: What is decoupling and why is it important?

Database Migration – What is it and why is it important?

Migration in the context of enterprise and web-based applications means moving from one platform to another.  Database migrations are particularly complicated, as you have all the challenges of changing your software platform: some old features are missing or behave differently, and some new features are available that you’d like to take advantage of.

In the world of databases, some developers try to build database independent applications, especially using ORMs (object relational mappers).  On the surface this seems like a great option: build your application to use only standard components and features, and then you can easily move to a different platform when requirements dictate.  Unfortunately things are not quite that simple.

Database independent applications necessarily shoot for the lowest common denominator of all of your database platforms, thus lowering the bar on what high-performance features you might take advantage of on the platform you are currently using.

Here are some scenarios:

  1. Building an application which needs to support multiple database backends for customer sites
  2. Building a proof of concept in dev and test, which may be ported to an alternate database in the future
  3. Wanting to avoid being locked into one vendor, while currently planning for only one platform

These are all good reasons to think about features, and database platforms from the outset.

For situation #1, you need to be most serious about cross-platform compatibility from the start.  Build modules for each database platform, with platform specific code isolated in that module.  If the particular feature you want to use is available on only one of the two platforms, the alternate platform will have to include its own implementation of that feature in its database specific module.  By isolating all database specific interactions in one module, you also put boundaries around that code.  If you choose to support another database platform in the future, you merely need to rewrite that database interaction module.
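A minimal sketch of that module boundary might look like the following.  The class and function names are hypothetical, and the example covers just one well-known difference: limiting rows with `LIMIT` in MySQL versus `ROWNUM` in older Oracle releases.  Table names are assumed to be trusted identifiers, not user input.

```python
from abc import ABC, abstractmethod

class Backend(ABC):
    """Everything platform specific lives behind this interface."""

    @abstractmethod
    def first_rows(self, table: str, n: int) -> str:
        """Return SQL that fetches at most n rows from table."""

class MySQLBackend(Backend):
    def first_rows(self, table: str, n: int) -> str:
        return f"SELECT * FROM {table} LIMIT {n}"

class OracleBackend(Backend):
    def first_rows(self, table: str, n: int) -> str:
        # Classic Oracle idiom; the feature exists, but is spelled differently.
        return f"SELECT * FROM {table} WHERE ROWNUM <= {n}"

BACKENDS = {"mysql": MySQLBackend, "oracle": OracleBackend}

def get_backend(name: str) -> Backend:
    return BACKENDS[name]()

def recent_orders_sql(backend: Backend) -> str:
    # Application code never mentions a vendor.
    return backend.first_rows("orders", 10)

print(recent_orders_sql(get_backend("mysql")))
print(recent_orders_sql(get_backend("oracle")))
```

Supporting a third platform then means adding one more `Backend` subclass, without touching the application code.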

For situation #2, you would use a similar tactic, but won’t necessarily have to implement all the routines in that module for the alternate platform.  Just keep those features, and differences in mind during the development phase.  Where possible document those differences, and comment code liberally.  This will go a long way towards preparing you if you do decide to go for a different database backend.

In situation #3, this may be more of a philosophical concern.  Don’t get overly dragged down by it, as it’s hypothetical at this stage.  Sometimes developers labor under this concern because of previous bad experiences migrating to a new database platform.  But to some degree this is the nature of the beast.  Database platforms include a myriad of different features, datatypes, storage methods, and coding languages.  In many ways this is where their power lies.

Sean Hull discusses on Quora: What is a database migration and why is it important?

Silicon Alley is blooming again

A very nice article just appeared in Crain’s New York Business this past week covering interesting people to watch in Gotham’s tech scene.

With such a long list of tech startups, Silicon Alley is nearly bursting at the seams!

  • Huffington Post
  • Foursquare
  • Rent the Runway
  • BankSimple
  • Knewton Inc.
  • GroupMe
  • Kickstarter
  • Tumblr
  • Tremor Media Inc.
  • MyCityWay
  • Bitly
  • Yodle
  • Boxee Inc.
  • Tabula Digita
  • AppNexus
  • Yext
  • Lot18
  • Tapad
  • Etsy
  • SecondMarket
  • Gawker Media

Backups – What are they and why are they important?

Backups are obviously a crucial component in any enterprise application.  Modern internet components are prone to failure, and backups keep your bases covered.  Here’s what you should consider:

  1. Is your database backed up, including object structures, data, stored procedures, grants, and logins?
  2. Is your webserver doc-root backed up?
  3. Is your application source code in version control and backed up?
  4. Are your server configurations backed up?  Relevant config files might include those for apache, mysql, memcache, php, email (postfix or qmail), tomcat, Java, solr or any other software your application requires.
  5. Are your cron or supporting scripts and jobs backed up?
  6. Have you tested all of these components and your overall documentation with a fire drill?  This is the proof that you’ve really covered all the angles.

If you do your backups right, you should be able to restore without a problem.
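A fire drill for the database piece can be as simple as "dump, restore into an empty database, compare".  The sketch below uses SQLite only because it ships with Python and keeps the example self-contained; the table, the data, and the helper names are made up, and a real drill against MySQL would use its own dump and restore tooling and deeper checks than row counts.

```python
import sqlite3

def backup_sql(conn: sqlite3.Connection) -> str:
    """Logical backup: schema and data as one SQL script."""
    return "\n".join(conn.iterdump())

def restore(script: str) -> sqlite3.Connection:
    """Restore the script into a fresh, empty database."""
    fresh = sqlite3.connect(":memory:")
    fresh.executescript(script)
    return fresh

def row_counts(conn: sqlite3.Connection) -> dict:
    """Row count per table, a cheap sanity check on a restore."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0]
            for t in tables}

# Stand-in for the production database.
prod = sqlite3.connect(":memory:")
prod.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
prod.executemany("INSERT INTO users (email) VALUES (?)",
                 [("a@example.com",), ("b@example.com",)])
prod.commit()

restored = restore(backup_sql(prod))
drill_passed = row_counts(prod) == row_counts(restored)
print("fire drill passed:", drill_passed)
```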

Sean Hull asks on Quora – What are backups and why are they important?

High Availability – What is it and why is it important?

Highly available systems build redundancy into the application and the architecture layers to mitigate against disasters.  Since computing systems are made from commodity hardware and components which are prone to failure, having redundancy at every layer is key.

Redundant switches, network interfaces, and load balanced webservers are fairly straightforward and run-of-the-mill.  But clustering your database tier is another matter entirely.  With MySQL, master-master active-passive can work quite well, running circular replication to send all changes to both nodes.  Both nodes are able to handle production traffic, and you pick the one that is active simply by configuring your application to point to it.  Use a technology like MMM or Pacemaker to front your database cluster with a virtual IP (VIP), so no application or webserver changes are required to switch which node takes on the master role.

Redundant components are important in a single datacenter, but what if that datacenter goes out or gets hit by a natural disaster?  Is your whole business out?  That’s where geographic redundancy and geo load balanced DNS come in.  Having redundant copies of your whole site on both the east and west coast with geo-dns provides the next level of high availability.

Sean Hull discusses on Quora – What is High Availability and why is it important?

Degrade Gracefully – What is it and why is it important?

Websites and web applications have traffic patterns that are often unpredictable.  After all, growth in traffic is really what we’re after.  However, even with the best stress testing, it’s sometimes difficult to predict what areas of the site will get inundated, or how the site will scale.

Degrade gracefully describes an architecture built specifically to unwind in a smooth manner without any real site-wide outage.  What do we mean by that?  We mean build in operational switches to turn off components in the site.  Have a star rating on pages?  Build an on/off switch for your operations team to disable it if necessary.  Have site-wide comments, or robust search?  Allow those features to be disabled.  If possible, architect in a read-only mode for your site that you can turn on in a really difficult situation.  By operationalizing these components, you give more flexibility to the operations team, and reduce the likelihood of having a complete outage.
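Here is one way such switches might look in application code.  This is only a sketch: the `FeatureSwitches` class and the page fragments are hypothetical, and in practice the flags would live somewhere the operations team can change at runtime, such as a config file or a shared cache.

```python
class FeatureSwitches:
    """On/off switches the operations team can flip under load."""

    def __init__(self, **defaults: bool):
        self._flags = dict(defaults)

    def enable(self, name: str) -> None:
        self._flags[name] = True

    def disable(self, name: str) -> None:
        self._flags[name] = False

    def is_on(self, name: str) -> bool:
        # Unknown features count as off, the safer default.
        return self._flags.get(name, False)

switches = FeatureSwitches(star_ratings=True, comments=True, read_only=False)

def render_page(article: str) -> list:
    parts = [article]
    if switches.is_on("star_ratings"):
        parts.append("[star ratings]")
    # Comments involve writes, so read-only mode hides them too.
    if switches.is_on("comments") and not switches.is_on("read_only"):
        parts.append("[comment form]")
    return parts

normal = render_page("article body")

# Under heavy load, ops sheds the expensive extras instead of going down.
switches.disable("star_ratings")
switches.enable("read_only")
degraded = render_page("article body")

print(normal)
print(degraded)
```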

Sean Hull asks on Quora: What does degrade gracefully mean, and why is it important?

Auto-scaling – What is it and why is it important?

With cloud-based hosting solutions, new servers can be provisioned and “spun up” with a few options on the command line.  This opens a whole new dimension for infrastructure, allowing software scripts to bring new computing power into your web infrastructure.

Internet based applications often exhibit seasonal traffic patterns where traffic stays steady or grows slowly over a period, but then experiences a sharp spike in demand requiring much higher computing resources to meet customer demand.

Enter auto-scaling, an even more powerful feature of cloud-based offerings.  Define roles for your webservers and database servers, set capacity rules that control how much traffic will trigger new servers to be rolled out, and watch your infrastructure scale automatically to meet the needs of your internet application.
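The heart of a capacity rule can be sketched in a few lines.  The function below is a hypothetical illustration, not any cloud provider's API: it assumes you know roughly how many requests per second one server can handle, and that you want to run servers at about 60% of that figure.

```python
import math

def desired_servers(requests_per_sec: int,
                    capacity_per_server: int,
                    target_utilization_pct: int = 60,
                    min_servers: int = 2,
                    max_servers: int = 20) -> int:
    """How many servers to run so that each sits near the target utilization."""
    usable_per_server = capacity_per_server * target_utilization_pct
    needed = math.ceil(requests_per_sec * 100 / usable_per_server)
    # Never drop below the redundancy floor or above the budget ceiling.
    return max(min_servers, min(max_servers, needed))

quiet = desired_servers(100, capacity_per_server=100)
busy = desired_servers(900, capacity_per_server=100)
viral = desired_servers(5000, capacity_per_server=100)
print(quiet, busy, viral)
```

The floor keeps some redundancy even when traffic is quiet, and the ceiling keeps a runaway spike from producing a runaway bill.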

Mobile Scalability – What is it and why is it important?

We talked about Scalability previously.  So what is mobile scalability?  Mobile devices and smartphones run applications just like your laptop or home computer.  However these applications have some special requirements such as location-based search.  They also are typically not as weighty as their desktop counterparts, as memory and computing cycles are limited on a mobile device.  What’s more they should have a reduced network requirement and make fewer roundtrips between the device and the server.

Just like for other web applications, mobile scalability involves a few key areas:

  • load balancing the webserver tier
  • load balancing the database tier

It also introduces these new requirements:

  • fast location based search and lookups
  • minimal network usage

If you are on the operations team tasked with optimizing a mobile application, pay particular attention to the measured amount of data that moves back and forth on each page.  Optimize images and/or remove them if possible.  Adjust layout for the most popular devices, and spend extra time testing for those.

To address the location based search requirements, look at what’s happening on your database tier.  If you’re running MySQL, enable the slow query log, and watch for heavy queries doing location-based lookups.  If you’re not already using spatial indexes, definitely consider those as a way to speed up such lookups.

Also consider creative ways of looking at the business requirements to reduce actual computational power.  What do we mean by this?  Chances are the business team discusses what the application needs to do and the developers then go about figuring out how to do it.  But what happens, as in the case of a location search, when the how is very expensive and slow?  It may be that customers think about “near me” in a much looser way than the technical folks do.  They may not care at all that the businesses you offer them are exactly within a one-mile radius.  They may be happy enough with a search that puts them inside a single zipcode, or alternatively with one that breaks a city area into adjacent boxes instead of circles.  A square or rectangular area can be at least an order of magnitude (10x) faster, and possibly as much as 100x faster, because you can index directly on latitude and longitude, then simply compare the lat and long of nearby businesses against the edges of the box.

This creative solution may fit business requirements and will bring huge speedups to your database tier, and thus your overall mobile application.  Scalability indeed!
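The box search could be sketched like this.  It assumes roughly 69 miles per degree of latitude, ignores edge cases near the poles and the antimeridian, and uses made-up business names and coordinates; the helper names are hypothetical.

```python
import math

MILES_PER_DEGREE_LAT = 69.0   # rough approximation

def bounding_box(lat: float, lon: float, miles: float):
    """Lat/long box that contains the circle of the given radius."""
    dlat = miles / MILES_PER_DEGREE_LAT
    # Longitude lines converge away from the equator, so widen accordingly.
    dlon = miles / (MILES_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon

def nearby(businesses, lat: float, lon: float, miles: float = 1.0) -> list:
    """Names of businesses inside the box.

    In SQL this is just range comparisons on two columns, e.g.
    WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?
    """
    lat_min, lat_max, lon_min, lon_max = bounding_box(lat, lon, miles)
    return [name for name, blat, blon in businesses
            if lat_min <= blat <= lat_max and lon_min <= blon <= lon_max]

businesses = [
    ("coffee shop", 40.7500, -73.9900),   # a few blocks away
    ("bakery", 40.8000, -73.9500),        # several miles uptown
]
found = nearby(businesses, lat=40.7484, lon=-73.9857, miles=1.0)
print(found)
```

Because the box is defined by plain minimum and maximum values, no distance formula has to be evaluated per row.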

Sean Hull asks on Quora: What is mobile scalability and why is it important?

What is disaster recovery and why is it important?

Disaster recovery involves anticipating a major business outage, and the contingency planning needed to avoid loss of revenue, customers or sales.

All of the technology components that make up your enterprise applications should be carefully considered against loss.  What happens if this database server disappears?  Do we have all the data backed up somewhere?  Have we tested that backup to restore it?  How long does it take to restore?  Can we reconnect the application to said database?  What if the network goes down?  How about if the whole datacenter goes out?

Planning for disaster recovery is important whether you’re hosted in-house or with a hosting provider.  Consider Amazon’s EC2 outage in April.  Various availability zones went out.  Affected customers who had their databases backed up properly, with tested offsite copies, and who also had backups of other components such as webserver document roots and software configurations, would have been able to rebuild their entire infrastructure in an alternate availability zone or region.  Remember it was only a small component of Amazon Web Services which was out.

Sean Hull asks on Quora: Disaster Recovery – What is it and why is it important?