Software development has always made use of libraries, off-the-shelf components that are shared between different projects. These allow you to stand on the shoulders of others and build bigger things. Frameworks do the same thing: they provide a context to build on. Ruby on Rails, for example, provides a great starting framework for building web applications, managing sessions in an elegant way.
Unfortunately, the same approach does not work well when applied to database access. Here you use a framework, an object-relational mapper (ORM), that might for example treat tables as objects. As you build code that uses those objects, the ORM dynamically constructs the SQL to get data into and out of the database. Secretly no one likes writing SQL, and any layer of software that will do it for me has got to be a good thing!
Unfortunately for scalability it is decidedly *not* a good thing. From Hibernate to ActiveRecord, these types of abstraction layers build SQL that is inefficient and, worse, tough to get at when you want to optimize it. Hiding that complexity is what frameworks do well, but beware of what it costs you in future scalability.
Instead, isolate your queries in one place, and use separate database handles for reads and for writes.
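As a rough illustration of keeping queries in one place behind separate handles, here is a minimal sketch using Python's built-in `sqlite3` module. The `UserQueries` class, the `users` table, and the method names are all hypothetical, invented for this example. Both handles point at the same in-memory database here so the sketch runs on its own; in a real deployment the read handle would presumably point at a replica.

```python
import sqlite3


class UserQueries:
    """All SQL touching the (hypothetical) users table lives in this one class."""

    def __init__(self, read_conn, write_conn):
        self.read_conn = read_conn    # assumption: a replica in production
        self.write_conn = write_conn  # assumption: the primary in production

    def create_schema(self):
        # Schema changes are writes, so they go through the write handle.
        with self.write_conn:
            self.write_conn.execute(
                "CREATE TABLE IF NOT EXISTS users "
                "(id INTEGER PRIMARY KEY, email TEXT UNIQUE)"
            )

    def add_user(self, email):
        # The `with` block commits on success and rolls back on error.
        with self.write_conn:
            cur = self.write_conn.execute(
                "INSERT INTO users (email) VALUES (?)", (email,)
            )
        return cur.lastrowid

    def find_email(self, user_id):
        # Reads only ever use the read handle.
        row = self.read_conn.execute(
            "SELECT email FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


# For this sketch, one in-memory database stands in for both handles.
conn = sqlite3.connect(":memory:")
queries = UserQueries(read_conn=conn, write_conn=conn)
queries.create_schema()
new_id = queries.add_user("a@example.com")
found = queries.find_email(new_id)
```

Because every statement is hand-written and sits in one class, a slow query is easy to find and tune, and pointing reads at a different server later means changing one constructor argument.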
2. Serialization & Synchronous Processes
Synchronous processes usually block until a change makes its way to more than one place. This is dangerous as your traffic grows, because it becomes a real bottleneck. Imagine a business where the boss has not learned to delegate tasks to underlings. Everything must be ok’d by the big boss. As the business grows larger, that one point (in software, a serialization point) becomes the point of contention. If you can’t get ahold of the big boss, business slows to a standstill.
Better to delegate those tasks to direct reports: managers you have given the authority to make certain decisions or sign checks, for example. In infrastructure terms, these are components that are “eventually consistent” and can answer questions about the data, or services whose answers are sufficient.
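One common way to delegate like this in code is to hand non-critical work to a background worker through a queue, so the request path does not block on it. The sketch below uses only Python's standard `queue` and `threading` modules. The names `handle_request`, `audit_log`, and the `"order_recorded"` event are hypothetical, and a production system would more likely use a durable message queue than an in-process one.

```python
import queue
import threading

audit_log = []         # stand-in for a slow secondary store (hypothetical)
tasks = queue.Queue()  # work handed off by the request path


def worker():
    """Drain the queue in the background until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        audit_log.append(item)  # the slow write happens off the request path
        tasks.task_done()


def handle_request(order_id):
    # Enqueue the follow-up work and return right away instead of blocking.
    tasks.put(("order_recorded", order_id))
    return "accepted"


thread = threading.Thread(target=worker, daemon=True)
thread.start()

responses = [handle_request(i) for i in range(3)]

tasks.join()     # only this demo waits; a real request handler would not
tasks.put(None)  # tell the worker to stop
thread.join()
```

Between the moment `handle_request` returns and the moment the worker catches up, `audit_log` is behind the truth. That window is the "eventually consistent" trade-off: callers get an answer immediately, and the secondary store converges shortly after.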
3. Complex Infrastructures
Look at your network diagram. Can you grok it quickly? Can you explain it to new hires without struggling? Or are there complex relationships, dependencies or assumptions built into it? Simplify the stack and keep functional components on their own servers. Simplicity is a discipline, like traveling cross-country without packing huge bags. It requires you to constantly trim the fat and eliminate unnecessary pieces or parts.
4. Single Database Assumption
Even if you’re not at the point where you need to scale your database tier in the beginning, keep it in the back of your mind as you architect your application. Don’t assume you will talk to one backend datastore. Build for a read handle and a write handle. Then you will have already done much of the work to scale later, since most applications are read-heavy, roughly 80% reads and 20% writes. You can also easily load balance over multiple read-only slaves if necessary.
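To make the load-balancing idea concrete, here is a minimal sketch of a router that hands out one write handle and rotates read handles across replicas in round-robin order. The `ConnectionRouter` class and the server names are hypothetical, and plain strings stand in for real database connections so the example runs without any servers.

```python
import itertools


class ConnectionRouter:
    """Hands out a single write handle and round-robins read handles."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary,
        # so calling code never needs to change as the tier grows.
        self._replicas = itertools.cycle(replicas or [primary])

    def write_handle(self):
        return self.primary

    def read_handle(self):
        return next(self._replicas)


# Strings stand in for real connections in this sketch.
router = ConnectionRouter("primary-db", ["replica-1", "replica-2"])
reads = [router.read_handle() for _ in range(4)]
writer = router.write_handle()

# Day one: a single server, same calling code.
single = ConnectionRouter("primary-db", [])
single_reads = [single.read_handle() for _ in range(2)]
```

An application written against `read_handle()` and `write_handle()` from the start can add replicas later by changing configuration rather than rewriting queries. Note that this sketch ignores replication lag, which a real router would need to account for when a read must see a write that just happened.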
5. Throwing Hardware at a Problem
If you try to sidestep the hard, stoic work of tuning your application, optimizing the work it is doing, the disk I/O it is performing, or the network roundtrips you are making, it will come back to bite you. Thinking you can just buy bigger iron and scale vertically will only get you so far. One simply cannot compete with the other. Buying hardware can get you 2x, 5x, maybe 10x. Tuning SQL can get you a 10,000x speedup.
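A small, hedged illustration of what tuning SQL can look like: the sketch below uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN` to compare how the same query is executed before and after adding an index. The `orders` table, its columns, and the index name are made up for the example. It shows the change in access path only and makes no claim about the size of the speedup on any real workload.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10000)],
)


def plan(sql):
    """Return SQLite's description of how it intends to run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 is the detail text


query = "SELECT total FROM orders WHERE customer_id = 42"

before = plan(query)  # no index yet: SQLite must scan the whole table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)   # with the index: SQLite can seek straight to the rows

matches = conn.execute(query).fetchall()
print("before:", before)
print("after: ", after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan reports a scan of `orders` and the second reports a search using `idx_orders_customer`. A full scan costs time proportional to the table size, while an index lookup grows far more slowly, which is why this kind of change can outpace anything a hardware upgrade delivers as the data grows.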