Your site is running fine, right? You have 1,000 customers, and it usually runs smoothly. Just this one lingering question: why does it take five high-performance EC2 instances to run the database, all on flash drives? Good question!
The truth is, one of the highest-trafficked sites I’ve managed pulled in 100 million uniques a month and used only three backend databases. That site was one of those wildly popular celebrity gossip sites, the ultimate guilty pleasure when you’re at the office and can’t watch reality TV!
Snickering aside, this is huge traffic. And all of the above was built on Drupal, with no ORM in the mix. It could even run, albeit noticeably slower, with memcache disabled.
1. Servers with solid state drives
I’m very excited to see Amazon introduce servers with SSD drives. They can bring you a 100x improvement in disk I/O, and that, my friends, is the be-all and end-all for databases. So why complain?
If you deploy on these boxes right out of the gate, it may be like using a crutch. You become dependent on the hardware and ignore real performance tuning. Solid state drives still won’t obviate that ORM middleware you’re using.
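Before you reach for faster disks, find out what’s actually slow. Here’s a minimal sketch, assuming MySQL and the mysql-connector-python driver (connection details are placeholders), that flips on the slow query log so the offending queries have nowhere to hide:

```python
# Surface the slow queries that fast disks would otherwise hide.
# Sketch only: host, user, and password are placeholders.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="admin", password="secret")
cur = conn.cursor()

# Log any statement that takes longer than one second (needs SUPER privilege).
cur.execute("SET GLOBAL slow_query_log = 'ON'")
cur.execute("SET GLOBAL long_query_time = 1")

# Find out where the log lands so you can review it later, e.g. with mysqldumpslow.
cur.execute("SHOW VARIABLES LIKE 'slow_query_log_file'")
print(cur.fetchone())

cur.close()
conn.close()
```

Tune what shows up in that log first; then the SSDs are gravy instead of a crutch.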
2. Memcache saving your bad queries
Memcache is also a powerful tool. It sits between the database and your webservers, reducing load on the database by as much as 10x. That’s a great way to get better response times and reduce drag on your db tier. But it’s still worthwhile to do your performance tuning without it.
Why? If you can get your site to run without caching, it will run blazingly fast *with* it. Don’t use it as a crutch, use it as rocket fuel for your well tuned site.
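Here’s what the cache-aside pattern looks like in practice. A minimal sketch, assuming a local memcached and the pymemcache library; the articles table and its columns are made up for illustration:

```python
# Cache-aside: check memcache first, fall back to the database on a miss,
# then warm the cache so repeat reads skip the db tier entirely.
import json
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))

def get_article(article_id, db_cursor):
    key = f"article:{article_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)          # served from memcache, no db hit

    # Cache miss: run the real query against the database.
    db_cursor.execute("SELECT title, body FROM articles WHERE id = %s", (article_id,))
    row = db_cursor.fetchone()

    # Keep the result warm for five minutes.
    cache.set(key, json.dumps(row), expire=300)
    return row
```

The point of the post stands: make the query under that fallback path fast first, and the cache becomes rocket fuel rather than a bandage.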
Read this: Do startups need techops?
3. A legion of read slaves
I’ve seen smaller sites using a ton of read slaves, all of them deployed to cover up slow & redundant queries pouring out of an ORM middleware layer, in this case CakePHP.
Again, read slaves are great, but tune & test with less hardware, and get the performance up the hard way. With elbow grease!
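The usual culprit is the N+1 pattern: the ORM fetches a list, then fires one more query per row, and you buy read slaves to soak it up. The fix is a single query the database can plan. The pattern is language-agnostic; here’s a sketch in Python against a hypothetical authors/posts schema, using any DB-API cursor:

```python
# The classic ORM anti-pattern: one query per row (N+1).
def authors_with_post_counts_slow(cur):
    cur.execute("SELECT id, name FROM authors")
    results = []
    for author_id, name in cur.fetchall():
        # One extra round trip per author: 10,000 authors means 10,001 queries.
        cur.execute("SELECT COUNT(*) FROM posts WHERE author_id = %s", (author_id,))
        results.append((name, cur.fetchone()[0]))
    return results

# The same answer in a single query the database can optimize.
def authors_with_post_counts_fast(cur):
    cur.execute("""
        SELECT a.name, COUNT(p.id)
        FROM authors a LEFT JOIN posts p ON p.author_id = a.id
        GROUP BY a.id, a.name
    """)
    return cur.fetchall()
```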
4. Really really big memory
64G, 128G, 256G of main memory? If I wax on about the days when you’d get excited over 64K, I’ll sound like an old-timer. But with those extreme limitations, you had to write tight code. Otherwise it just wouldn’t do anything.
The really, really big memory in today’s servers lets us get lazy. I hear developers say, “Hey, the database is 10G of data, and we have 64G of main memory, so the whole thing will fit in memory. Problem solved!”
Duhhh… no. Why not? Because you still have to slice and dice that data. You still have to scan through it for the bits & pieces that aren’t indexed, then sort and organize them in temporary memory space. In DBA speak, you’re still doing a ton of logical I/Os.
Picture it another way: imagine the days when you’re on horseback, riding across the West. You travel light because, frankly, your horse can only carry so much. Then along come cars, and you start loading up the trunk. You add the kitchen sink, and the rear end is dragging on the ground. All seems fine until you hit a steep mountain and your car is barely doing 20mph. If you had only carried the same load as you did on horseback, you’d be speeding across the country at a lightning pace.
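Translating the analogy back to the database: even when everything fits in RAM, a filter on an unindexed column is still a full scan plus a sort, i.e. a pile of logical I/Os. Here’s a minimal sketch with mysql-connector-python, using made-up table and column names, showing how to catch it with EXPLAIN and fix it with an index:

```python
# Even with the whole table in memory, an unindexed filter is a full scan
# plus a filesort. Connection details and schema are placeholders.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="app", password="secret",
                               database="blog")
cur = conn.cursor()

# Ask the optimizer what it will do: expect type=ALL (full table scan)
# and "Using filesort" before the index exists.
cur.execute("EXPLAIN SELECT * FROM comments WHERE status = 'pending' "
            "ORDER BY created_at DESC LIMIT 20")
for row in cur.fetchall():
    print(row)

# Add a composite index so the same query becomes a short range read.
cur.execute("CREATE INDEX idx_comments_status_created "
            "ON comments (status, created_at)")
conn.commit()
```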
5. Deploying poor code
Deadlines are looming, and new features must be deployed. So performance testing can wait until later. The code works, after all.
Been there, done that. Code gets deployed, and all of a sudden there are spikes in server load in the evening. Ops & DBA teams are screaming, “Who wrote this code?”
Load testing should be a part of everyday QA & testing. It’s the only way to keep scalability problems from growing unnoticed.
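You don’t need a fancy rig to start. A purpose-built tool like sysbench is the usual choice, but even a bare-bones script that hammers your hottest query with concurrent clients will surface trouble before your users do. A sketch, with placeholder connection details and a placeholder query:

```python
# Bare-bones load test: run the hot query from many concurrent clients
# and report a p95 latency. Everything named here is a placeholder.
import time
import threading
import mysql.connector

QUERY = "SELECT COUNT(*) FROM orders WHERE status = 'open'"
THREADS, ITERATIONS = 20, 50
timings = []

def worker():
    conn = mysql.connector.connect(host="localhost", user="qa",
                                   password="secret", database="shop")
    cur = conn.cursor()
    for _ in range(ITERATIONS):
        start = time.monotonic()
        cur.execute(QUERY)
        cur.fetchall()
        timings.append(time.monotonic() - start)
    conn.close()

threads = [threading.Thread(target=worker) for _ in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

timings.sort()
print(f"p95 latency: {timings[int(len(timings) * 0.95)]:.3f}s under {THREADS} clients")
```

Run it against a staging copy of production data as part of QA, and compare the numbers release over release, so the load spikes show up in test instead of in the evening on-call.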
Check this: Are SQL databases dead?