3 ways your MySQL migration project can shake you up
Once a development or operations team gets over the hurdle of open source and starts to feel comfortable with the way software works outside of the enterprise world, it will likely settle in. Best not to get too cushy, though, for there are more surprises hiding around the corner. Here are a few of the biggest ones.
ORMs are popular among developers but not among performance experts. Why is that? Primarily, the two kinds of engineers experience a web application from entirely different perspectives. The developer is building functionality and delivering features, and results are measured on fitting business requirements. Performance and scalability are often low priorities at this stage. ORMs allow developers to be much more productive, abstracting away the SQL difficulties of interacting with the backend datastore and allowing them to concentrate on building the features and functionality.
Scalability is about application, architecture and infrastructure design, and careful management of server components.
On the performance side, the picture is a bit different. By leaving SQL query writing to an ORM, you are faced with complex queries that the database cannot optimize well. What’s more, ORMs don’t allow easy tweaking of queries, slowing down the tuning process further.
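To make the contrast concrete, here is a minimal sketch of one well-known way generated data access can hurt: the "N+1" pattern, where code fetches parent rows and then issues one more query per row, compared with a single hand-written JOIN. This is an illustration and not a claim about any particular ORM. The tables, the data, and the `CountingConnection` helper are all hypothetical, and SQLite stands in for MySQL so the example is self-contained.

```python
import sqlite3


class CountingConnection:
    """Hypothetical wrapper that counts how many statements are sent."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def execute(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params)


def make_db():
    # Illustrative schema and data; none of this comes from a real application.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
        INSERT INTO posts VALUES (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'third');
    """)
    return conn


def titles_one_query_per_author(db):
    """The N+1 pattern: one query for authors, then one more per author."""
    result = {}
    authors = db.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = db.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def titles_single_join(db):
    """Hand-written SQL: the same data in one round trip."""
    result = {}
    query = """
        SELECT a.name, p.title
        FROM authors a JOIN posts p ON p.author_id = a.id
        ORDER BY a.id, p.id
    """
    for name, title in db.execute(query):
        result.setdefault(name, []).append(title)
    return result


looped = CountingConnection(make_db())
joined = CountingConnection(make_db())
looped_result = titles_one_query_per_author(looped)
joined_result = titles_single_join(joined)
print(looped.count, joined.count)
```

Both functions return the same dictionary, but the loop's query count grows with the number of authors while the JOIN stays at one statement. With only two authors the difference is trivial; with thousands of rows and a network hop per query, it usually is not.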
2. Synchronous, Serial, Coupled or Locking Processes
Locking in a web application operates something like traffic lights in the real world. Replacing a traffic light with a traffic circle often speeds up traffic dramatically. That’s because when you’re out somewhere in the country where there’s very little traffic, no one is waiting idly at a traffic light for no reason. What’s more, even when there’s a lot of traffic, a traffic circle keeps things flowing. If you need locking, it is better to use InnoDB tables, which offer granular row-level locking, rather than MyISAM tables, which lock the whole table.
Avoid things like semi-synchronous replication that will wait for a message from another node before allowing the code to continue. Such waits can add up in a highly transactional web application with many thousands of concurrent sessions.
Avoid any type of two-phase commit mechanism, which we see quite often in clustered databases. Multi-phase commit provides a serialization point so that multiple nodes can agree on what the data looks like, but it is toxic to scalability. Better to use technologies that employ an eventually consistent algorithm.
Without replication, you rely on only one copy of your database. In this configuration, you limit all of your webservers to using a single backend datastore, which becomes a funnel or bottleneck. It’s like a highway that is under construction, forcing all the cars to squeeze into one lane. It’s sure to slow things down. Better to build parallel roads to start with, and allow the application, aka the drivers, to choose alternate routes as their schedule and itinerary dictate.
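One way the application can "choose alternate routes" is to send writes to the primary and spread reads across replicas. The sketch below shows only the routing decision. The `ReplicaRouter` class and the host names are hypothetical, and classifying a statement by its first keyword is a deliberate simplification.

```python
import itertools


class ReplicaRouter:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    READ_VERBS = ("SELECT", "SHOW")

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in self.READ_VERBS:
            return next(self._replicas)  # round-robin across read copies
        return self.primary  # anything that may modify data


router = ReplicaRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM users"))
print(router.route("UPDATE users SET name = 'x' WHERE id = 1"))
```

Keep in mind that replicas can lag behind the primary, so a read that must see a write made a moment ago may still need to go to the primary.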
Having no metrics in place is toxic to scalability because you can’t visualize what is happening on your systems. Without this visual cue, it is hard to get business units, developers and operations teams all on the same bandwagon about scalability issues. If teams are having trouble grokking this, realize that these tools simply provide analytics for infrastructure.
There are tons of solutions, too, that use SNMP and are non-invasive. Consider Cacti, Munin, OpenNMS, Ganglia and Zabbix, to name a few. Metrics collection can involve business metrics like user registrations, accounts or widgets sold. And of course it should also include low-level system CPU, memory, disk and network usage, as well as database-level activity like buffer pool, transaction log, locking, sorting, temp table and queries-per-second activity.
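Many database-level metrics are exposed as ever-increasing counters, so a rate such as queries per second comes from the difference between two samples. Here is a minimal sketch of that arithmetic. `Questions` and `Created_tmp_tables` are real MySQL status counter names (visible via `SHOW GLOBAL STATUS`), but the sample values and the `per_second_rates` helper are made up for illustration.

```python
def per_second_rates(earlier, later, elapsed_seconds):
    """Turn two snapshots of cumulative counters into per-second rates."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    rates = {}
    for name, new_value in later.items():
        if name not in earlier:
            continue  # counter only present in the later sample
        delta = new_value - earlier[name]
        if delta < 0:
            continue  # counter was reset (e.g. server restart); skip this interval
        rates[name] = delta / elapsed_seconds
    return rates


# Hypothetical samples taken 60 seconds apart.
sample_a = {"Questions": 1000, "Created_tmp_tables": 10}
sample_b = {"Questions": 1600, "Created_tmp_tables": 13}
rates = per_second_rates(sample_a, sample_b, 60)
print(rates)
```

The tools named above do this kind of sampling and graphing for you; the sketch only shows what the numbers on those graphs mean.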
Applications built without feature flags make it much more difficult to degrade gracefully. If your site gets bombarded by a spike in web traffic and you aren’t magically able to scale and expand capacity, having inbuilt feature flags gives the operations team a way to dial down the load on the servers without the site going down. This can buy you time while you scale your webservers and/or database tier or even retrofit your application to allow multiple read and write databases.
Without these switches in place, you limit scalability and availability.
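As a rough illustration of such a switch, the sketch below keeps flags in memory and lets a page skip its expensive sections when a flag is off. The `FeatureFlags` class, the flag names, and the page sections are all hypothetical. A real deployment would more likely read flags from a config store that operations can change without a release.

```python
class FeatureFlags:
    """Hypothetical in-memory flag store; unknown flags count as off."""

    def __init__(self, defaults):
        self._flags = dict(defaults)

    def is_enabled(self, name):
        return self._flags.get(name, False)

    def enable(self, name):
        self._flags[name] = True

    def disable(self, name):
        self._flags[name] = False


def render_product_page(flags, product_name):
    """Return the sections to render; optional ones are guarded by flags."""
    sections = [f"product: {product_name}"]  # the core content always renders
    if flags.is_enabled("recommendations"):
        sections.append("recommendations")  # assumed to be query-heavy
    if flags.is_enabled("reviews"):
        sections.append("reviews")
    return sections


flags = FeatureFlags({"recommendations": True, "reviews": True})
normal_page = render_product_page(flags, "widget")

# During a traffic spike, operations dials down the heaviest feature.
flags.disable("recommendations")
degraded_page = render_product_page(flags, "widget")
print(normal_page, degraded_page)
```

The site stays up with a leaner page, and the feature can be switched back on once capacity catches up.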
Deploying new code that includes changes to your database schema doesn’t have to be a process fraught with stress and burned fingers. Follow these five tips and enjoy a good night’s sleep.
1. Deploy with Roll Forward & Rollback Scripts
When developers check in code that requires schema changes, that release should also include two scripts to perform database changes. One script applies those changes: it alters tables to add columns, changes data types, seeds data, cleans data, and creates new tables, views, stored procedures, functions, triggers and so forth. The other is a rollback script, which returns tables to their previous state.
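A minimal sketch of such a pair of scripts follows. The table and index names are hypothetical, and SQLite stands in for MySQL so the example runs anywhere. The point is only that the rollback script undoes each roll-forward step, in reverse order, and that you can verify this by comparing the schema before and after.

```python
import sqlite3

# Roll-forward script: the schema changes this (hypothetical) release needs.
ROLL_FORWARD = """
    CREATE TABLE user_preferences (
        user_id INTEGER,
        pref_key TEXT,
        pref_value TEXT
    );
    CREATE INDEX idx_users_email ON users (email);
"""

# Rollback script: undoes the steps above in reverse order.
ROLLBACK = """
    DROP INDEX idx_users_email;
    DROP TABLE user_preferences;
"""


def schema_objects(conn):
    """List user-created tables and indexes, to compare before and after."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE name NOT LIKE 'sqlite_%' ORDER BY name"
    )
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

before = schema_objects(conn)
conn.executescript(ROLL_FORWARD)
after_forward = schema_objects(conn)
conn.executescript(ROLLBACK)
after_rollback = schema_objects(conn)
print(before, after_forward, after_rollback)
```

Running both scripts against a copy of production before release day is a cheap way to find out whether the rollback really returns the schema to where it started. Note that in MySQL, DDL statements commit implicitly, so a half-applied script cannot simply be undone with a transaction rollback, which is one more reason to have the explicit rollback script.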
Software development has always made use of libraries, off-the-shelf components that are shared between different projects. These allow you to stand on the shoulders of others and build bigger things. Frameworks do the same thing: they provide a context from which to build. Ruby on Rails, for example, provides a great starting framework from which to build web applications, managing sessions in an elegant way.
1. This page or area of the website is very slow, why?
There are a lot of components that make up modern internet websites, and a lot of places to get stuck in the mud. Website performance starts with the browser: what caching it is doing and how much bandwidth the user has to your server. Next comes what the webserver is doing (caching or not, and how) and whether it has sufficient memory, then what the application code is doing, and lastly how it is interacting with the backend database.
With the fast growth of virtualized data centers, and companies like Google, Amazon and Facebook, it’s easy to forget how much is built on open-source components, aka commodity software. In a very real way open-source has enabled the huge explosion of commodity hardware, the fast growth of the internet itself, and now the further acceleration through cloud services, cloud infrastructure, and virtualization of data centers.
Your typical internet stack and application now stands on the shoulders of tens of thousands of open source developers and projects. Let’s look at a few of them.
One very strong case for cloud computing is that it can satisfy applications with seasonal traffic patterns. One way to test the advantages of the cloud is through a hybrid approach.
Cloud infrastructure can be built completely through scripts. You can spin up specific AMIs or machine images, automatically install and update packages, install your credentials, start up services, and you’re running.
All of these steps can be performed in advance of your need at little cost. Simply build and test. When you’re finished, shutdown those instances. What you walk away with is scripts. What do we mean?
The power here is that you carry zero costs for that burst capacity until you need it. You’ve already built the automation scripts and have them in place. When your capacity planning warrants it, spin up additional compute power and watch your internet application scale horizontally. Once your busy season is over, scale back and disable your usage until you need it again.
Shoe leather cost is similar to opportunity cost. It refers to the cost of counteracting inflation by keeping less of your assets in cash. Your strategy would require more trips to the bank and more walking, and incur a cost in the wearing out of the leather in your shoes.
All joking aside, it’s an interesting idea. It highlights how there are all sorts of hidden costs to different strategies. There are hidden costs to using coupons, loyalty cards, frequent flyer miles, managing assets & investments, hiring resources and in general running a business. Let’s look at a few.