Tag Archives: iaas

Review – Test Driven Infrastructure with Chef – Stephen Nelson-Smith

In search of a good book on Chef itself, I picked up this new title from O’Reilly.  It’s one of their new format books, small in size at only 75 pages.

There was some very good material in this book.  Mr. Nelson-Smith’s writing style is good, readable, and informative.  The discussion of risks of infrastructure as code was instructive.  With the advent of APIs to build out virtual data centers, the idea of automating every aspect of systems administration, and building infrastructure itself as code is a new one.  So an honest discussion of the risks of such an approach is bold and much needed.  I also liked the introduction to Chef itself, and the discussion of installation.

Chef isn’t really the main focus of this book, unfortunately.  The book spends a lot of time introducing us to Agile Development, and specifically test driven development.  While these are lofty goals, and the first time I’ve seen treatment of the topic in relation to provisioning cloud infrastructure, I did feel too much time was spent on that.

Amazon Web Services – What is it and why is it important?

Amazon Web Services is a division of Amazon the bookseller, but this part of the business is devoted solely to infrastructure and internet servers.  These are the building blocks of data centers, the workhorses of the internet.  AWS’s offering of Cloud Computing solutions allows a business to set up, or “spin up” in the jargon of cloud computing, new compute resources at will.  Need a small single-CPU 32-bit Ubuntu server with two 20 GB disks attached?  One command and 30 seconds later, you can have that!

As we discussed previously, Infrastructure Provisioning has evolved dramatically over the past fifteen years, from something that took time and cost a lot to the fast, automatic process it is today with cloud computing.  This has also brought with it a dramatic culture shift in the way that systems administration is done: from a fairly manual process of physical machines and software configuration, one that took weeks to set up new services, to a scriptable and automatable process that can take seconds.

This new realm of cloud computing infrastructure and provisioning is called Infrastructure as a Service or IaaS, and Amazon Web Services is one of the largest providers of such compute resources.  They’re not the only ones of course.  Others include:

  • Rackspace Cloud
  • Joyent
  • GoGrid
  • Terremark
  • 3Tera
  • IBM
  • Microsoft
  • Enomaly
  • AT&T

Cloud Computing is still in its infancy, but is growing quickly.  Amazon themselves had a major data center outage in April that we discussed in detail.  It sent some hot internet startups into a tailspin!

More discussion of Amazon Web Services on Quora – Sean Hull

Devops – What is it and why is it important?

Devops is one of those fancy contractions that tech folks just love.  One part development or developer, and another part operations.  It imagines a blissful marriage where the team that develops software and builds features that fit the business, works closely and in concert with an operations and datacenter team that thinks more like developers themselves.

In the long tradition of technology companies, two separate cultures comprise these two roles.  Developers, focused on development languages, libraries, and functionality that match the business requirements, keep their gaze firmly in that direction.  The servers, network, and resources that those software components consume are left for the ops teams to think about.

So too, ops teams are squarely focused on uptime, resource consumption, performance, availability, and always-on.  They will be the ones woken up at 4am if something goes down, and are thus sensitive to version changes, unplanned or unmanaged deployments, and resource-heavy or resource-wasteful code and technologies.

Lastly, there are the QA teams tasked with quality assurance, testing, and making sure the ongoing stream of new features doesn’t break anything previously working or introduce new show stoppers.

Devops is a new and, I think, growing area where the three teams work more closely together.  But devops also speaks to the emerging area of cloud deployments, where servers can be provisioned with command line API calls, and completely scripted.  In this new world, infrastructure components all become components in software, and thus infrastructure itself, long the domain of manual processes and labor-intensive tasks, becomes repeatable and amenable to the techniques of good software development.  Suddenly version control, configuration management, and agile development methodologies can be applied to operations, bringing a whole new level of professionalism to deployments.

Sean Hull asks on Quora – What is devops and why is it important?

Auto-scaling – What is it and why is it important?

With cloud-based hosting solutions, new servers can be provisioned and “spun up” with a few options on the command line.  This opens a whole new dimension for infrastructure, allowing software scripts to bring new computing power into your web infrastructure.

Internet based applications often exhibit seasonal traffic patterns where traffic stays steady or grows slowly over a period, but then experiences a sharp spike in demand requiring much higher computing resources to meet customer demand.

Enter auto-scaling, an even more powerful feature of cloud-based offerings.  Define roles for your webservers and database servers, set capacity rules that control how much traffic will trigger new servers to be rolled out, and watch your infrastructure scale automatically to meet the needs of your internet application.
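To make the idea of a capacity rule concrete, here is a minimal sketch in Python.  It is not any provider's API: the `ScalingRule` class, its field names, and the traffic figures are all assumptions made up for illustration.  It only shows the arithmetic a rule performs, turning a traffic level into a server count within a floor and a ceiling.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingRule:
    """Hypothetical capacity rule for one server role (not a real provider API)."""
    role: str
    requests_per_server: int  # traffic one server is assumed to handle
    min_servers: int = 1      # never scale below this
    max_servers: int = 10     # ceiling that keeps the bill bounded


def desired_servers(rule: ScalingRule, requests_per_second: float) -> int:
    """Return how many servers the rule asks for at the given traffic level."""
    # Round up: a partial server's worth of traffic still needs a whole server.
    needed = math.ceil(requests_per_second / rule.requests_per_server)
    # Clamp the answer between the configured floor and ceiling.
    return max(rule.min_servers, min(rule.max_servers, needed))


web = ScalingRule(role="web", requests_per_server=200, min_servers=2, max_servers=8)
for traffic in (150, 900, 5000):
    print(traffic, "req/s ->", desired_servers(web, traffic), "servers")
```

A real deployment would feed a rule like this from monitoring data and hand the result to the provider's own provisioning tools.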

Cloud Computing – What is it and why is it important?

Cloud Computing has a few varied meanings, from API services such as Twitter to web-based (read cloud-based) email services such as Gmail and Yahoo.

An even bigger tectonic shift is happening though, in the area of infrastructure and hosting, to cloud based solutions.  No longer is provisioning a slow ordering process, followed by a multi-year contract and commitment with an associated high price tag.  Now computing resources can be provisioned and “spun up” in seconds, even allowing for auto-scaling, bringing new computing resources online dynamically as seasonal traffic patterns demand.

  • uniquely suited to applications with seasonal traffic requirements
  • supports disaster recovery effectively for free
  • allows temporary provisioning of test environments
  • facilitates auto-scaling of virtual servers
  • no huge budgetary outlay, pay for only what you use
  • bring up resources in seconds – supports true agile development

What’s more, since cloud resources are all provisioned in software through an API, it encourages the treatment of infrastructure as a whole as software.  Now the scripts to completely rebuild all of your systems, from spin-up to package configuration to application configuration, can all be written in software and managed in version control.
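As a rough illustration of what treating infrastructure as software can look like, the sketch below keeps a description of an environment as plain data and expands it into the ordered steps a rebuild would run.  Everything here is hypothetical: the `ENVIRONMENT` layout, the role names, and the step wording are assumptions, and the steps are only printed rather than sent to a real provider.

```python
import json

# Hypothetical description of one environment; this is the part that would
# live in version control alongside the application code.
ENVIRONMENT = {
    "web": {"count": 2, "packages": ["nginx"], "config": {"port": 80}},
    "db": {"count": 1, "packages": ["mysql-server"], "config": {"port": 3306}},
}


def build_plan(environment: dict) -> list:
    """Expand the description into the ordered steps a rebuild would run."""
    steps = []
    for role in sorted(environment):  # sorted so the plan is repeatable
        spec = environment[role]
        for index in range(1, spec["count"] + 1):
            host = f"{role}{index}"
            steps.append(f"spin up {host}")
            for package in spec["packages"]:
                steps.append(f"install {package} on {host}")
            config = json.dumps(spec["config"], sort_keys=True)
            steps.append(f"configure {host}: {config}")
    return steps


for step in build_plan(ENVIRONMENT):
    print(step)
```

Because the plan is derived entirely from the checked-in description, a change to the environment becomes a reviewable diff like any other code change.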

Sean Hull asks the question on Quora: What is Cloud Computing?

How To Build Highly Scalable Web Applications For The Cloud

Scalability in the cloud depends a lot on application design.  Keep these important points in mind when you are designing your web application and you will scale much more naturally and easily in the cloud.

** Original article — Intro to EC2 Cloud Deployments **

1. Think twice before sharding

  • It increases your infrastructure and application complexity
  • it reduces availability – more servers mean more outages
  • have to worry about globally unique primary keys
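The last point is easier to see with an example.  One common way to keep primary keys globally unique is to give each shard its own offset and have every shard step by the total number of shards, which is the same idea behind MySQL's `auto_increment_offset` and `auto_increment_increment` settings.  The Python sketch below is only an illustration of that arithmetic; the function name and the three-shard setup are assumptions.

```python
import itertools


def shard_key_generator(shard_index: int, shard_count: int):
    """Return an endless iterator of IDs that only this shard will produce."""
    if not 0 <= shard_index < shard_count:
        raise ValueError("shard_index must be in range(shard_count)")
    # Shard 0 of 3 yields 1, 4, 7, ...; shard 1 yields 2, 5, 8, ...; and so on.
    return itertools.count(start=shard_index + 1, step=shard_count)


shard_count = 3
for shard_index in range(shard_count):
    ids = shard_key_generator(shard_index, shard_count)
    print("shard", shard_index, [next(ids) for _ in range(4)])
```

Note the cost this bullet warns about: adding a fourth shard later changes the step, so the scheme has to be planned up front.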

2. Bake read/write database access into the application

  • allows you to check for stale data, fall back to the write master
  • creates higher availability for read-only data
  • gracefully degrade to read-only website functionality if master goes down
  • horizontal scalability melds nicely with cloud infrastructure and IaaS
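The points above can be sketched as a small routing class.  This is a simplified illustration, not a production driver: the `ReadWriteRouter` name, the server names, and the lag threshold are assumptions, and replication lag is set by hand here where a real application would measure it.

```python
import random


class ReadWriteRouter:
    """Picks a database for each query; the server names are placeholders."""

    def __init__(self, master, replicas, max_lag_seconds=5.0):
        self.master = master
        self.replicas = list(replicas)
        self.max_lag_seconds = max_lag_seconds
        self.master_up = True
        # Replication lag per replica, in seconds (measured elsewhere).
        self.lag = {name: 0.0 for name in self.replicas}

    def for_write(self):
        """Writes always go to the master; refuse them when it is down."""
        if not self.master_up:
            raise RuntimeError("master is down: site is in read-only mode")
        return self.master

    def for_read(self):
        """Prefer a fresh replica; fall back to the master if all are stale."""
        fresh = [r for r in self.replicas if self.lag[r] <= self.max_lag_seconds]
        if fresh:
            return random.choice(fresh)
        if self.master_up:
            return self.master
        # Master down and every replica stale: serve the least stale data.
        if self.replicas:
            return min(self.replicas, key=self.lag.get)
        raise RuntimeError("no database available")


router = ReadWriteRouter("master", ["replica1", "replica2"])
router.lag["replica1"] = 60.0  # pretend replica1 has fallen behind
print(router.for_read())       # replica2, the only fresh replica
```

Keeping this decision in one place in the application is what makes the read-only degradation possible when the master goes away.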

3. Save application state in the database

  • avoid in-memory locking structures that won’t scale with multiple web application servers
  • consider a database field for managing application locks
  • consider stored procedures for isolating and insulating developers from db particulars
  • a last updated timestamp field can be your friend

4. Consider Dynamic or Auto-scaling

  • great feature of cloud, spin up new servers to handle load on-demand
  • lean towards being proactive rather than reactive and measure growth and trends
  • watch the procurement process closely lest it come back to bite you
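Being proactive means projecting the trend forward rather than waiting for the spike.  The sketch below fits a straight line to recent traffic samples and estimates how many periods remain before the trend reaches current capacity.  The function names and the sample numbers are assumptions, and a straight line is the simplest possible model; real traffic with seasonal patterns would need something richer.

```python
import math


def linear_trend(samples):
    """Least-squares slope and intercept for evenly spaced samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until(samples, capacity):
    """Whole periods after the last sample until the trend reaches capacity.

    Returns None when traffic is flat or falling.
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    last_x = len(samples) - 1
    crossing = (capacity - intercept) / slope  # x where the line hits capacity
    return max(0, math.ceil(crossing - last_x))


weekly_peak = [100, 120, 140, 160]  # made-up peak requests per second
print(periods_until(weekly_peak, capacity=200))
```

A number like this also gives finance and ops something concrete to discuss before new servers start appearing on the bill.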

5. Set Up Monitoring and Metrics

  • see trends over time
  • spot application trouble and bottlenecks
  • determine if your tuning efforts are paying off
  • review a traffic spike after the fact
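As a small example of using metrics to check whether tuning paid off, the sketch below compares a latency percentile before and after a change.  The function names and the latency figures are invented for illustration, and the percentile uses the simple nearest-rank method; a monitoring system would normally compute this for you.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a list of samples."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def tuning_report(before_ms, after_ms, pct=95):
    """Compare a latency percentile before and after a tuning change."""
    before = percentile(before_ms, pct)
    after = percentile(after_ms, pct)
    return {"before": before, "after": after, "improved": after < before}


# Made-up response times in milliseconds.
before = [120, 130, 125, 400, 135, 128, 122, 131, 127, 390]
after = [110, 115, 112, 118, 111, 114, 113, 116, 117, 119]
print(tuning_report(before, after))
```

Looking at a high percentile rather than the average keeps the occasional very slow request from hiding in the numbers.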

The cloud is not a silver bullet that can automatically scale any web application.  Software design is still a crucial factor.  Bake in these features with the right flexibility and foresight, and you’ll manage your website’s growth patterns with ease.


Introduction to EC2 Cloud Deployments

Cloud Computing holds a lot of promise, but there are also a lot of speed bumps in the road along the way.

In this six part series we’re going to cover a lot of ground.  We don’t intend this series to be an overly technical nuts and bolts howto.  Rather we will discuss high level issues and answer questions that come up for CTOs, business managers, and startup CEOs.

Some of the tantalizing issues we’ll address include:

  • How do I make sure my application is built for the cloud with scalability baked into the architecture?
  • I know disk performance is crucial for my database tier.  How do I get the best disk performance with Amazon Web Services & EC2?
  • How do I keep my AWS passwords, keys & certificates secure?
  • Should I be doing offsite backups as well, or are snapshots enough?
  • Cloud providers such as Amazon seem to have poor SLAs (service level agreements).  How do I mitigate this using availability zones & regions?
  • Cloud hosting environments like Amazon’s provide no perimeter security.  How do I use security groups to ensure my setup is robust and bulletproof?
  • Cloud deployments change the entire procurement process, handing a lot of control over to the web operations team.  How do I ensure that finance and ops are working together, and a ceiling budget is set and implemented?
  • Reliability of Amazon EC2 servers is much lower than traditional hosted servers.  Failure is inevitable.  How do we use this fact to our advantage, forcing discipline in the deployment and disaster recovery processes?  How do I make sure my processes are scripted & fire-drill tested?
  • Snapshot backups and other data stored in S3 are somewhat less secure than I’d like.  Should I use encryption to protect this data?  When and where should I use encrypted filesystems to protect my more sensitive data?
  • How can I best use availability zones and regions to geographically disperse my data and increase availability?

As we publish each of the individual articles in this series we’ll link them to the titles below.  So check back soon!

  • Building Highly Scalable Web Applications for the Cloud
  • Managing Security in Amazon Web Services
  • MySQL Databases in the Cloud – Best Practices
  • Backup and Recovery in the Cloud – A Checklist
  • Cloud Deployments – Disciplined Infrastructure
  • Cloud Computing Use Cases

Newsletter 74 – Design For Failure

    It may sound like a pessimistic view of computing systems, but the fact is all of the components that make up the modern Internet stack have a certain failure rate. So looking at that realistically, and planning for a breakdown so you can manage it better, is essential.

    Failures in traditional datacenters

    In your own datacenter, or that of your managed hosting provider, sit racks and racks of servers. Typically a proactive system administrator will keep a lot of spare parts around: hard drives, switches, additional servers, etc. Although you don’t need them now, you don’t want to be in a position to have to order new equipment when it fails.  That would increase your recovery time dramatically.

    Besides keeping extra components lying around, you also typically want to avoid the so-called single point of failure. That means dual power systems, switches, database servers, webservers, etc. We also see RAID as more or less standard now in all modern servers, because the loss of a commodity SATA drive is so common. Yet this redundancy makes it a non-event. We are expecting it and so design for it.

    And while we are prudent enough to perform backups regularly and document the layout of systems, rarely is the environment in a traditional datacenter completely scripted. Although attempts to test backups, and restore the database may be common, a full fire drill to rebuild everything is rarer.

    Failure in the Cloud

    In the last decade we saw Linux on commodity hardware take over as the internet platform of choice because of the huge cost differential as compared to traditional hardware such as Sun or HP.  The hardware was more likely to fail, but being 1/10th the price meant you could build redundancy in to cover yourself and still save money.

    The latest wave of cloud providers is bringing the same types of cost savings. But cloud hosted servers, for instance in Amazon EC2, are much less reliable than the typical rack mounted servers you might have in your datacenter.

    We all agree that planning for disaster recovery is a really good idea, but sometimes it gets pushed aside by other priorities. In the cloud it moves front and center as an absolute necessity. This forces a new, more robust approach to rebuilding your environment, with scripts documenting and formalizing your processes.

    This is all a good thing, as hardware failure then becomes an expected occurrence. Failures are a given; it’s how quickly you recover that makes the difference.

    Book Review:

    Cloud Application Architectures by George Reese
    I originally picked up this book expecting a very hands-on guide to cloud deployments, especially on EC2. That is not what this book is though. It’s actually a very good CTO-targeted book, covering difficult questions like cost comparisons between cloud and traditional datacenter hosting, security implications, disaster recovery, performance and service levels. The book is very readable, and not overly technical.