Tag Archives: horizontal scaling

Semi-Synchronous Replication – What is it and why is it important?

Replication in MySQL allows you to copy and replay changes from your primary database to an alternate backup or slave database.  This facility in MySQL is an asynchronous process, which means it does not happen at the time it occurs on the primary.  It could happen a second later, or minutes later.  In fact sometimes the secondary database can get bogged down by heavy load because transactions are applied serially, while they execute in parallel sessions on production. You can find out how far behind the master you are with SHOW SLAVE STATUS, and look at:

Seconds_Behind_Master: 8

If you are sending SELECT or the READ traffic from your website to the slave database, you may experience phantom reads.  For instance if you comment on a blog posting, and refresh the page within 8 seconds on the server above, it would not display the comment just posted!

As it turns out the Maatkit toolkit has a tool called mk-slave-prefetch which can help with slow performance of the slave.  Since most of the work of doing inserts, updates and deletes involves fetching the right rows, running a similar SELECT query in advance of running the actual transaction will warm up the caches, and speed things up dramatically and may be enough for your needs.  Test it first and find out.

Semi-Synchronous Replication comes to the rescue if you really need this type of guarantee, but it comes at a cost.  You enable it on the master, then on the slave and restart the slave.  Whenever the master commits a transaction, it will block until one of two things happen.  It must either get an acknowledgement from at least one slave that the transaction has been applied downstream or it must reach the timeout threshold.

This type of arrangement may sound fine in theory as such blocking would often be less than a second.  However in the microscopic world of high speed, high transaction, high traffic websites, this may be an eternity, and one which can slow the database down substantially.  So test first before assuming it’s a solution that will help you.

Quora discussion of Semi-synchronous Replication

Scalability – What is it and why is it important?

Scaling comes in a few different flavors.  Vertical scaling involves growing the computing power of a single server, adding memory, faster or more CPUs and/or faster disk I/O.

Horizontal scaling involves adding additional computing resources or servers in parallel and then load balacing across them.

Scalability refers to applications which facilitate scaling well.  With web applications, the middle tier aka the webservers are fairly easy to scale horizontally and most enterprise class applications already do this with commercial load balancers – with either hardware or software.

Doing the same with the database tier, however can be trickier.  Enter MySQL replication to facilitate a fairly painless horizontal scalability.  Build your application architecture with read-only transactions, and write/update transactions segmented apart, and you can send the latter to one master database, and the former to a handful of replicated slaves.  With a typical web application that is less than 10% writes, and 90% reads, there is the potential to add as many as 5-10 servers horizontally to increase application throughput by as much as 500-1000%.

Sean Hull asks on Quora: What is scalability and why is it important?