MySQL Consulting & Migrations, EC2 Deployments, Scalability & Performance

Heavyweight Internet Group +1-212-533-6828
MySQL Expert, Linux, EC2 & Scalability Consulting NYC

31 Jan 2011

iHeavy Insights 76 – Scale By Design

On a recent trip to Southeast Asia, I visited the Malaysian city of Kuala Lumpur.  It is a sprawling urban area, more like Los Angeles than New York.  With all the congestion and constant traffic jams, the question of city planning struck me.  On a more abstract level this is the same challenge that faces web application and website designers.  Do you architect and bake the quality into the framework, or hack it from start to finish?

Urban Un-Planning

Looking at cities like Los Angeles you can't help but think that no one imagined there would ever be this many cars.  You think the same thought when you are in Kuala Lumpur.  The traffic reaches absurd levels at times.  A local friend told me that when the delegates travel through the city, they have a cavalcade of cars, and a complement of traffic cops to literally move the traffic out of the way.  It's that bad!

Of course predicting how traffic will grow is no exact science.  But cities can still be planned.  Take a mega-city like New York for example.  The grid helps with traffic.  With a system of one-way streets and a few main arteries, travelers and taxis alike can make better informed decisions about which way to travel.  What's more, the city core is confined to an island, so new space is built upward rather than outward.  Suddenly the economics of closeness win out.  You can walk between many buildings in midtown, or at most take a quick taxi ride.  Suddenly a car becomes a burden.  What's more, the train system, a spider web of subways and regional transit, branches north to upstate New York, northeast to Connecticut, east to Long Island, and west to New Jersey.

If you've lived in the New York metropolitan region and bought a home, or work in real estate, you know that proximity to a major train station affects the prices of homes.  This is the density of urban development working for us.  It is tough to add this sauce to a city that has already sprawled.  And so it is with architecting websites and applications.

Architecting for the Web

Traffic to a website can be as unpredictable as traffic within the confines of an urban landscape.  And the spending that goes into such infrastructure is just as delicate.  Spend too much and you risk building for people who will never arrive.  What's more, while site traffic remains moderate, it is difficult to predict the patterns of larger volumes of users.  What areas of the site will they be most interested in?  Have we done sufficient capacity planning around those functions?  Do those particular functions cause bottlenecks around the basic functioning of the site, such as user logins and tracking?

Baking in the sauce for scalability will never be an exact science of course.  In urban planning you try to learn from the mistakes of cities that did things wrong, and try to replicate some of the things that you see in cities doing it right.  Much the same can be said for websites and scalability.

For instance it may be difficult to do bulletproof stress testing and functional testing that covers every single possible combination.  But there are best practices for architecting an application that will scale.  Start with basics such as using version control.  It sounds obvious, but I have seen clients who don't.  There are a few options to choose from, but they all provide versioning and self-document your development process.  Next, build redundancy into the mix.  Load balance your application servers of course, and build various levels of caching: reverse proxy caching such as Varnish, and a key-value caching system like memcached.  Build redundancy into the database layer, even if you aren't adding all those servers just yet.  Your application should be multi-database aware.  Either use an abstraction layer, or organize your code around write queries and read-only queries.  If possible build in checks for stale data.
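To make "organize your code around write queries and read-only queries" a little more concrete, here is a minimal sketch in Python.  The `QueryRouter` class and the server names are hypothetical and only illustrate the idea; a real application would hand back database connections rather than strings, and would likely lean on its framework or driver for this.

```python
import itertools


class QueryRouter:
    """Send writes to the primary and spread reads across replicas."""

    # Statements treated as writes in this sketch (an assumption, not a complete list).
    WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP"}

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas yet, every read simply goes to the primary.
        self._read_cycle = itertools.cycle(replicas or [primary])

    def is_write(self, sql):
        words = sql.split()
        return bool(words) and words[0].upper() in self.WRITE_VERBS

    def target_for(self, sql):
        """Return the server that should run this statement."""
        if self.is_write(sql):
            return self.primary
        return next(self._read_cycle)  # round-robin over the read-only servers


router = QueryRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.target_for("SELECT * FROM users WHERE id = 7"))
print(router.target_for("UPDATE users SET name = 'x' WHERE id = 7"))
```

Because the router already copes with an empty replica list, the application can be multi-database aware from day one and pick up read servers later without code changes.  Statements such as SELECT ... FOR UPDATE would need to be treated as writes; this sketch does not handle them.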

Also consider various cloud providers to host your application, such as Amazon's Elastic Compute Cloud.  These environments allow you to script your infrastructure, and build further redundancy into the mix.  Not only can you take advantage of features like auto-scaling to support dynamic growth in traffic, but you can scale servers in place, moving your server images from medium to large, to x-large servers with minimal outage.  In fact with MySQL multi-master active/passive replication on the database tier, you could quite easily switch to larger instances or from larger to smaller instances dynamically, without *any* downtime to your application.

Conclusion

Just as no urban planner would claim they can predict the growth of a city, a devops engineer won't claim they can predict how traffic to your website will grow.  What we can do is prepare for that growth: build in quality by putting up scaffolding so the site can grow organically, then monitor, collect metrics and do basic capacity planning.  A small amount of design up front will pay off over and over again.

Book Review: How To Disappear by Frank M Ahearn

With such an intimidating title you might think at first glance that this is a book only for the paranoid or criminally minded.  Now granted Mr Ahearn is a Skip Tracer, and if you were one already you certainly wouldn't need this book.  Still Skip Tracers have a talent for finding people, just as an investigator or a detective has of catching the bad guys.  And what a person like this can teach us about how they find people is definitely worth knowing.

If you've had your concerns about privacy, what companies have your personal information and how they use it, this is a very interesting real-world introduction to the topic.  Of particular interest might be the chapter on identity thieves and another on social media.  All-in-all a quick read and certainly one-of-a-kind advice!

View on Amazon - How To Disappear

30 Dec 2010

How To Build Highly Scalable Web Applications For The Cloud

Scalability in the cloud depends a lot on application design.  Keep these important points in mind when you are designing your web application and you will scale much more naturally and easily in the cloud.

** Original article -- Intro to EC2 Cloud Deployments **

1. Think twice before sharding

  • it increases your infrastructure and application complexity
  • it reduces availability - more servers mean more outages
  • you have to worry about globally unique primary keys
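To illustrate the globally unique primary key problem, here is a small sketch of one common workaround: give each shard an interleaved id sequence so no two shards ever produce the same value.  The function name and shard numbering are hypothetical.  MySQL's auto_increment_increment and auto_increment_offset settings apply the same idea on the server side.

```python
def interleaved_ids(shard_number, shard_count, how_many):
    """Return the first `how_many` ids owned by one shard.

    Shard 1 of 3 gets 1, 4, 7, ...; shard 2 of 3 gets 2, 5, 8, ...
    so ids never collide across shards.
    """
    if not 1 <= shard_number <= shard_count:
        raise ValueError("shard_number must be between 1 and shard_count")
    return [shard_number + n * shard_count for n in range(how_many)]


shard_1 = interleaved_ids(1, 3, 4)
shard_2 = interleaved_ids(2, 3, 4)
print(shard_1, shard_2)
```

The catch is that the shard count is baked into every key, so adding a shard later means re-planning the sequences.  That is one more reason to think twice before sharding.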

2. Bake read/write database access into the application

  • allows you to check for stale data and fall back to the write master
  • creates higher availability for read-only data
  • lets you gracefully degrade to read-only website functionality if the master goes down
  • horizontal scalability melds nicely with cloud infrastructure and IaaS
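The stale data check and the read-only fallback can be sketched roughly as follows.  Everything here is an assumption made for illustration: the `Replica` record, the idea that each replica reports the timestamp of the last change it applied, and the `max_lag_seconds` threshold.

```python
import time
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    last_applied_at: float  # epoch seconds of the newest change this replica has applied


def choose_read_source(replicas, master_up, max_lag_seconds, now=None):
    """Return the name of the server a read should go to."""
    now = time.time() if now is None else now
    fresh = [r for r in replicas if now - r.last_applied_at <= max_lag_seconds]
    if fresh:
        return fresh[0].name          # a replica that is fresh enough
    if master_up:
        return "master"               # stale replicas: fall back to the write master
    if replicas:
        # Master is down: serve the least stale replica rather than nothing.
        return max(replicas, key=lambda r: r.last_applied_at).name
    raise RuntimeError("no database server available")


def site_mode(master_up):
    """Writes are only possible while the master is reachable."""
    return "read-write" if master_up else "read-only"


replicas = [Replica("replica-1", 990.0), Replica("replica-2", 940.0)]
print(choose_read_source(replicas, master_up=True, max_lag_seconds=30, now=1000.0))
print(site_mode(master_up=False))
```

The point of the design is that the decision lives in the application, so a master outage turns into a degraded read-only site instead of an error page.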

3. Save application state in the database

  • avoid in-memory locking structures that won't scale with multiple web application servers
  • consider a database field for managing application locks
  • consider stored procedures for isolating and insulating developers from db particulars
  • a last updated timestamp field can be your friend

4. Consider Dynamic or Auto-scaling

  • great feature of the cloud, spin up new servers to handle load on-demand
  • lean towards being proactive rather than reactive and measure growth and trends
  • watch the procurement process closely lest it come back to bite you

5. Set Up Monitoring and Metrics

  • see trends over time
  • spot application trouble and bottlenecks
  • determine if your tuning efforts are paying off
  • review a traffic spike after the fact
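As one illustration of seeing trends over time and being proactive rather than reactive, the sketch below fits a straight line to evenly spaced metric samples and estimates how many periods remain before a capacity ceiling is reached.  The sample numbers and the ceiling are made up, and a straight-line fit is only an assumption about how the metric grows.

```python
def linear_trend(samples):
    """Least-squares slope and intercept for evenly spaced samples (x = 0, 1, 2, ...)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until(samples, capacity):
    """Periods after the last sample until the trend line hits `capacity`.

    Returns None when the metric is flat or shrinking.
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    x_at_capacity = (capacity - intercept) / slope
    return max(0.0, x_at_capacity - (len(samples) - 1))


weekly_peak_connections = [100, 120, 140, 160]  # hypothetical weekly peaks
print(periods_until(weekly_peak_connections, capacity=200))
```

Run against real monitoring data, a projection like this gives you lead time to add servers before the spike rather than after it.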

The cloud is not a silver bullet that can automatically scale any web application.  Software design is still a crucial factor.  Bake in these features with the right flexibility and foresight, and you'll manage your website's growth patterns with ease.

Have questions or need help with scalability?  Call us:  +1-212-533-6828

27 Dec 2010

Metrics Bridge Gap Between IT & Business Units

On the business side we've all seen requests for hardware purchases that seem astronomical, or somehow out of proportion to the project at hand.  And on the IT side we've been faced with the challenge of selling capital expenditures on technology, as demands grow.

Collecting statistics on real usage of server systems, and then connecting the dots to business metrics, is an excellent way to bridge the gap.  This allows IT to draw a concrete connection between technology investment and reaching business goals.

Using metrics to draw the dotted line in this way also educates folks on both sides of the tracks.  It educates technologists on exactly how technology purchases can be justified by their direct return to the business.  And it educates finance and business executives on how those hardware purchases directly contribute to business growth.
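A back-of-the-envelope sketch of connecting server statistics to a business metric might look like this.  Every figure below is invented for illustration; the point is only the shape of the calculation.

```python
import math


def cost_per_order(monthly_server_cost, monthly_orders):
    """Infrastructure dollars spent per order taken."""
    return monthly_server_cost / monthly_orders


def servers_needed(target_orders_per_hour, orders_per_server_per_hour):
    """Servers required to carry a business target, rounded up to whole machines."""
    return math.ceil(target_orders_per_hour / orders_per_server_per_hour)


# Hypothetical numbers: $3,000 per month of servers carrying 60,000 orders,
# and measurements showing one server handles about 400 orders per hour.
print(cost_per_order(3000, 60000))
print(servers_needed(2500, 400))
```

Stated this way, a hardware request stops being "we need more servers" and becomes "the sales target of 2,500 orders per hour needs this many servers at this cost per order".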

25 Dec 2010

Review: Cloud Application Architectures

George Reese's book doesn't have the catchiest title, but the book is superb.  One thing to keep in mind, it is not a nuts and bolts or howto type of book.  Although there is a quick intro to EC2 APIs etc, you're better off looking at the AWS docs, or Jeff Barr's book on the subject.  Reese's book is really all about answering difficult questions involving cloud deployments.

24 Dec 2010

iHeavy Insights 75 – Recognizing Quality

Finding good vendors who provide professional services may have a lot in common with finding good restaurants.  There may be an abundance of them, while the best ones remain difficult to find.

A long line does not mean quality food

Some restaurants have a long line because they have slow service.  If that's because you're getting quality personalized service, great.  But if it's because of incompetence and general disorganization or because they can't keep quality help, that's another story.

Hype and marketing can bring a lot of customers to a new restaurant.  Sometimes it's a celebrity chef or architect.  If that's what you're after then you may be at the right place.  If you're looking for the best home cooked meal, you may have to keep looking.

Convenience and location can also bring long lines.  A restaurant on the main street or square is usually not the one with the best food.

A better way to find quality

Take a look at how long the restaurant has been around.  A service provider who has been in business for a long time has obviously been successful at acquiring customers, solving their problems, and charging a fee that matches both their needs and those of their customers.

Check the testimonials of your provider.  If their website doesn't list some, ask for one or two customers that they've worked with recently.

Pay attention to service.  If you are a small fish for your vendor, it's likely that service will be affected.  If on the other hand you are one of your vendor's bigger clients, they'll likely give you much more attention.  Notice how regular customers at a restaurant or lounge tend to get the best service.

Book Review:  The Power of Pull by Hagel, Brown & Davison

A lot of really influential people like this book.  Joichi Ito, Richard Florida and Eric Schmidt to name a few.  Enterprises are faced with a bewildering array of challenges from finding good people, to retaining them, and putting them to work in the most creative ways.  This book brings another new and welcome perspective on the future of building and growing successful organizations.

18 Dec 2010

Review: Host Your Web Site In The Cloud, Amazon Web Services Made Easy

Jeff Barr's book on AWS is a very readable howto and a quick way to get started with EC2, S3, CloudFront, CloudWatch and SimpleDB.  It is short on theory, but long on all the details of really getting your hands dirty.  Learn how to:

  • get started using the APIs to spin up servers
  • create a load balancer
  • add and remove application servers
  • build custom AMIs
  • create EBS volumes, attach them to your instances & format them
  • snapshot EBS volumes
  • use RAID with EBS
  • set up CloudWatch to monitor your instances
  • set up triggers with CloudWatch to enable AutoScaling

I would have liked to see examples in Chef rather than PHP, but hey you can't have everything!

Review: Host Your Web Site In The Cloud by Jeff Barr

14 Dec 2010

Introduction to EC2 Cloud Deployments

Cloud Computing holds a lot of promise, but there are also a lot of speed bumps in the road along the way.

In this six part series we're going to cover a lot of ground.  We don't intend this series to be an overly technical nuts and bolts howto.  Rather we will discuss high level issues and answer questions that come up for CTOs, business managers, and startup CEOs.

Some of the tantalizing issues we'll address include:

  • How do I make sure my application is built for the cloud with scalability baked into the architecture?
  • I know disk performance is crucial for my database tier.  How do I get the best disk performance with Amazon Web Services & EC2?
  • How do I keep my AWS passwords, keys & certificates secure?
  • Should I be doing offsite backups as well, or are snapshots enough?
  • Cloud providers such as Amazon seem to have poor SLAs (service level agreements).  How do I mitigate this using availability zones & regions?
  • Cloud hosting environments like Amazon's provide no perimeter security.  How do I use security groups to ensure my setup is robust and bulletproof?
  • Cloud deployments change the entire procurement process, handing a lot of control over to the web operations team.  How do I ensure that finance and ops are working together, and a ceiling budget is set and implemented?
  • Reliability of Amazon EC2 servers is much lower than traditional hosted servers.  Failure is inevitable.  How do we use this fact to our advantage, forcing discipline in the deployment and disaster recovery processes?  How do I make sure my processes are scripted & fire-drill tested?
  • Snapshot backups and other data stored in S3 are somewhat less secure than I'd like.  Should I use encryption to protect this data?  When and where should I use encrypted filesystems to protect my more sensitive data?
  • How can I best use availability zones and regions to geographically disperse my data and increase availability?

As we publish each of the individual articles in this series we'll link them to the titles below.  So check back soon!

  1. Building Highly Scalable Web Applications for the Cloud
  2. Managing Security in Amazon Web Services
  3. MySQL Databases in the Cloud - Best Practices
  4. Backup and Recovery in the Cloud - A Checklist
  5. Cloud Deployments - Disciplined Infrastructure
  6. Cloud Computing Use Cases
29 Nov 2010

Newsletter 74 – Design For Failure

It may sound like a pessimistic view of computing systems, but the fact is that all of the components that make up the modern internet stack have a certain failure rate.  So looking at that realistically, in essence planning to stumble so you can manage it better, is essential.

Traditional Datacenter

In your own datacenter, or that of your managed hosting provider, sit racks and racks of servers.  Typically a proactive system administrator will keep a lot of spare parts around: hard drives, switches, additional servers etc.  Although you don't need them now, you don't want to be in a position of having to order new equipment when something fails.  That would increase your recovery time dramatically.

Besides keeping extra components lying around, you also typically want to avoid the so-called single point of failure.  That means dual power systems, switches, database servers, webservers etc.  We also see RAID as more or less standard now in all modern servers, because the loss of a commodity SATA drive is so common.  Yet this redundancy makes it a non-event.  We are expecting it and so design for it.

And while we are prudent enough to perform backups regularly and document the layout of systems, rarely is the environment in a traditional datacenter completely scripted.  Although attempts to test backups and restore the database may be common, a full fire drill to rebuild everything is much rarer.

Cloud Hosting

In the last decade we saw Linux on commodity hardware take over as the internet platform of choice because of the huge cost differential as compared to traditional hardware such as Sun or HP.  The hardware was more likely to fail, but being 1/10th the price meant you could build redundancy in to cover yourself and still save money.

The latest wave of cloud providers is bringing the same types of cost savings again.  But cloud hosted servers, for instance in Amazon EC2, are much less reliable than the typical rack mounted servers you might have in your datacenter.

We agree that planning for disaster recovery is a really good idea, but sometimes it gets pushed aside by other priorities.  In the cloud it moves front and center as an absolute necessity.  This forces a new, more robust approach to rebuilding your environment, with scripts documenting and formalizing your processes.

This is all a good thing, as hardware failure then becomes more of a managed, expected occurrence.  Design For Failure indeed!
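A scripted fire drill can start very small.  The sketch below backs up a database, rebuilds a fresh copy from that backup, and compares row counts table by table.  SQLite stands in for the production database purely so the example is self-contained; the `fire_drill` function and the `orders` table are hypothetical, and a real drill would restore onto a newly launched server.

```python
import sqlite3


def fire_drill(live):
    """Restore `live` into a brand new database and compare row counts per table."""
    restored = sqlite3.connect(":memory:")
    live.backup(restored)  # stand-in for restoring the latest backup elsewhere
    tables = [
        row[0]
        for row in live.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    ]
    report = {}
    for table in tables:
        count_sql = f'SELECT COUNT(*) FROM "{table}"'
        report[table] = (
            live.execute(count_sql).fetchone()[0],
            restored.execute(count_sql).fetchone()[0],
        )
    restored.close()
    return report


live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
live.executemany("INSERT INTO orders (total) VALUES (?)", [(9.5,), (20.0,), (3.25,)])
live.commit()

for table, (live_rows, restored_rows) in fire_drill(live).items():
    status = "OK" if live_rows == restored_rows else "MISMATCH"
    print(table, live_rows, restored_rows, status)
```

Once a check like this runs on a schedule, a failed restore shows up as a routine alert rather than as a surprise during a real outage.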

Book Review: Cloud Application Architectures by George Reese

I originally picked up this book expecting a very hands-on guide to cloud deployments, especially on EC2.  That is not what this book is though.  It's actually a very good CTO-targeted book, covering difficult questions like cost comparisons between cloud and traditional datacenter hosting, security implications, disaster recovery, performance and service levels.  The book is very readable, and not overly technical.
1 Nov 2010

iHeavy Insights 73 – It’s Easy

In the business of technology consulting, there are times when I've heard this statement.  It's Easy!  Perhaps the single biggest thing I've learned through a decade and a half of consulting is, people use this phrase when they are feeling overly confident.

What do I mean by that?  Well it turns out in psychology there are all sorts of things we communicate with our spoken language & body language.  Some of those things we aren't even conscious of.  In the case of the statement "It's easy" your first thought may be about all the intricate details that have yet to be ironed out, all the hiccups that may happen along the way.  Or you may just simply be thinking of Murphy's Law that always seems to rear its ugly head at the worst time.

Truth is when you hear this statement, you may also be inclined to think of it as a statement of fact.  The person is saying that they have reviewed all the facts and ascertained that task x is in fact a trivial one.

Of course one doesn't want to be the naysayer either, but you can raise concerns while still acknowledging both sides.  My tack is first to repeat what the person said in more detailed language.  By reiterating all of the details, it can sometimes illustrate right there some hidden complexity and weaken the sense of triviality to the task at hand.

A Software Developer

A few years back I had a subcontractor developer working on a project.  We went over some details about what changes needed to happen.  A web-based analytical tool needed some additional search functionality.  We went over how that search would index documents in the site.  The developer explained to me "That's easy.  No problem".  I was suspicious.

As development unfolded we hit a bump in the road.  Besides the database indexing, additional XML documents needed to be indexed in order for the search function to work properly.  That added quite a bit of additional complexity because the search solution the developer had envisioned couldn't deal with that XML data.

A Business Prospect

I was recently reviewing a contract with a prospect, and going over items and deliverables.  They explained that for the database portion we'll just use Amazon RDS, instead of installing MySQL and configuring the server manually.  "This piece will be easy".  Unfortunately using Amazon's solution is still not push-button.  These types of oversimplifications are fine if you're working on a time and materials basis because the complexity of the project will unfold organically, and the process will educate everyone involved.  But if you are trying to do a fixed fee project, these can be a harbinger of trouble later on.

Conclusion

When you hear people say "That's Easy" understand that they are only expressing their confidence, despite their assurances.  If you are not equally confident,  you'll both need to discuss details until you reach a middle point.  If it scares you to hear someone say something is EASY, think of it as a warning flag that you are not both on the same page.  Then remedy the situation with ample communication.

BOOK REVIEW:  Satyajit Das - Traders, Guns & Money

The Financial Times has very high standards, and with an endorsement by Gillian Tett on the back, you know you are on the right track to some excellent material.  Das' exposé explores the inside world of derivatives, the so-called WMD of the financial world.  Along the way you'll enjoy wacky stories of rogue traders, $70,000 meals, LIBOR numbers, delusional thinking, and even more about financial risk.  It illustrates exactly why Warren Buffett said "You only find out who is swimming naked when the tide goes out."

30 Sep 2010

5 Steps to Cloud Computing

Believe it or not you can actually start playing around with virtual servers that are as real and powerful as the physical servers you're already used to deploying.  And you can do it for literally pennies per month.

  1. Sign up for an Amazon account or use the one you buy books with.
  2. Browse over to http://aws.amazon.com & click Sign Up Now
  3. Navigate to the AWS Management Console, follow the Amazon EC2 link, and click Launch Instance
  4. Download Elasticfox or the API tools & configure your credentials for easy browser or command line control of your virtual infrastructure and deployments.
  5. Terminate instances & delete volumes & snapshots so you'll have no recurring charges.

At a mere 8 and 1/2 cents per hour, you can play around with the technology with no real ongoing costs.  And you can do it with your existing Amazon account and credit card info.
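To see why step 5 matters, here is the arithmetic on the 8.5 cents per hour quoted above as a short sketch.  The rate is taken from this article and will differ by instance type and over time; the function name is just for illustration.

```python
from decimal import Decimal

HOURLY_RATE = Decimal("0.085")  # dollars per instance-hour, as quoted in this article


def instance_cost(hours, instances=1):
    """Dollars charged for running `instances` servers for `hours` hours each."""
    return HOURLY_RATE * hours * instances


print("3 hour experiment:", instance_cost(3))
print("left running for 30 days:", instance_cost(24 * 30))
```

An afternoon of experimenting costs about a quarter, while one forgotten instance left running all month comes to roughly $61.  Terminating what you launch is what keeps the bill at pennies.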

Good stuff!