Tag Archives: database migration

Migration to MySQL – What is it and why is it important?

MySQL is a relational database that backs many internet websites and enterprise applications.  Like all enterprise software, it has a whole complement of features which are well documented, such as data types, storage engines, transactional behaviors and so forth.  It also has a set of processes, many of which involve how software operates on Linux servers, such as how it gets installed, where binaries and libraries will get placed, where to find logfiles, and how to move directories and set permissions.  Thirdly it is important to understand the culture, in this case Unix-based, forum discussions and community contributions as an open-source project.

MySQL can do much of the workhorse kind of stuff you see in databases like Oracle or SQL Server, but sometimes it achieves those goals in very different ways.  For instance there are many open-source projects that support and surround the database, such as mysqltuner an analysis script, innotop a unix top-like utility for monitoring on-going activity in the database, and maatkit a whole suite of tools that build on and expand the features already present in the MySQL database.

Some Limitations in MySQL

  1. Complex queries and subqueries specifically can be problematic in MySQL.  If you’re used to writing huge queries in Oracle, and having the CBO figure everything out for you, you’ll be in for a surprise with MySQL.  Keep your queries simple, proper columns indexed and avoid complex joins where possible.  The EXPLAIN facility is available to you and at your disposal.  Use it!
  2. Vertical Scalability problems – primarily addressed in 5.5, the latest version of MySQL, previously the database did not scale well on greater than four processor boxes.  SMP or Symmetric Multiprocessing servers were less common 10-15 years ago when MySQL was in it’s infancy, and development is slowly catching up with the big iron of today.
  3. There is no flashback table, tablespace or database that you might find in other databases such as Oracle.  You can achieve the same thing with point-in-time recovery, so keep regular backups of your database, and also backup the transaction logs.
  4. MySQL can do JOINs, but only with the nested loops algorithm.  It can’t do sort merge join or hash join.
  5. MyISAM is the default table type and storage engine.  It is not crash safe and not transactional.  On new installations it’s recommended that you change this to InnoDB and use InnoDB for most if not all of your tables.  It’s very reliable and very fast!
  6. There is a query cache, but it caches result sets not query plans!  It also has some performance issues and shows some erratic behavior on larger SMP boxes.  Query plans are cached on a session basis, but when a session is closed and reopened, MySQL must reparse and reexecute that query.
  7. MySQL does not have a facility like Oracle’s Real Application Clusters.  It does have NDB Cluster which is an all-in-memory clustering solution.  Despite it’s promise, it tends to have very serious performance problems with any type of join, and is mainly good for single table index-based lookups.  If managed well it can increase availability but will probably reduce performance.
  8. MySQL’s default replication solution is statement based.  Although it is easy to setup, it breaks almost as easily, sometimes with resolvable errors, and sometimes silently.  Consider row-based replication, and definitely make use of Maatkit’s mk-table-checksum and mk-table-sync tools.  Also be sure to do thorough and regular monitoring of your replication setup.
  9. There are no in-built materialized views or snapshots in MySQL.  There is an open-source project called Flexviews by Justin Swanhart that provides this facility to the MySQL community.
  10. MySQL provides stored procedures, triggers and functions as a regular feature to the database.  However I would use them with caution.  They are very difficult to edit, troubleshoot and diagnose when they are causing troubles.  Also as with the query plan caching, stored procedures are cached at the session level, so they can be expensive to execute over and over again in different areas of your application.  They can cause real performance problems.
  11. There is no in-built mechanism for auditing that you find in relational databases such as Oracle 11g.
  12. Only b-tree indexes are supported, no bitmap indexes, index-organized tables, clustered indexes or other more exotic index types.
  13. ALTER TABLE is generally a locking and blocking operation.  For example if you add a new column or change a columns data type, the entire table will be locked for the duration of the operation.  This will be a surprise coming from the Oracle world where these type of operations can routinely be done online.

MySQL’s Strengths Are Numerous

  1. Install with an RPM using Yum or Aptget.  Fast & simple!
  2. Works great in the cloud, using MySQL Community distro, Percona distro, or Amazon’s own RDS solution.
  3. Comes out-of-the-box with an excellent command line shell providing all sorts of features and power that are constant frustrations on the Oracle side.  Command history, standard input/output redirection support, a full compliment of features and options, and easy autologin with a user level my.cnf file which fits in nicely with the global settings as well.
  4. A simpler mechanism to serve unique id columns with the auto-increment data type.  Although Oracle’s sequence method is extremely scalable, for many many developers it is troublesome and confusing.
  5. Good support of the LIMIT clause allowing an easier method for developers to fetch a subset of data.
  6. A huge community of users, forums, and support in third party applications such as monitoring (Nagios etc…) as well as metrics collection (Munin, Cacti, OpenNMS, Ganglia etc.)
  7. Great visibility of system variables with SHOW VARIABLES.  Many can be changed dynamically as well, just like Oracle.
  8. Great visibility of internal system state with SHOW PROCESSLIST.
  9. System counters for all sorts of internal instrumentation data using SHOW STATUS and SHOW INNODB STATUS.  Ultimately it is not as comprehensive as Oracle’s own data dictionary and millions of instrumentation counts.  However Oracle could take a huge page out of the MySQL book in terms of usability.  The obfuscation of Oracle’s internal kernel state makes it all but unusable by most.
  10. innotop, the utility much like the unix TOP facility that all Unix & Linux folks love, it provides instant visibility into what queries are running, what work is being done, and what is blocking.  Oracle could really take a page from this playbook, as this tool is so invaluable.
  11. The incredible Maatkit, a veritable goldmine of great community contributed powertools.  Query analyzers, profilers, log tools, replication tools, data archiver, a find facility, and a whole lot more!

Sean Hull discusses further on Quora – What considerations are important when migrating to MySQL?

Database Migration – What is it and why is it important?

Migration in the context of enterprise and web-based applications means moving from one platform to another.  Database Migrations are particularly complicated as you have all the challenges of changing your software platform, where some old features are missing, or behave differently and some new features are available and you’d like to take advantage of those.

In the world of databases, some developers try to build database independent applications, especially using ORMs (object relational mappers).  On the surface this seems like a great option, build your application to use only standard components and features, and then you can easily move to a different platform when requirements dictate.  Unfortunately things are not quite that simple.

Database independent applications necessarily shoot for the lowest common denominator of all of your database platforms, thus lowering the bar on what high-performance features you might take advantage of on the platform you are currently using.

Here are some scenarios:

  1. Building an application which needs to support multiple database backends for customer sites
  2. Building an application in dev and test for proof of concept.  May port to an alternate database in the future.
  3. Don’t want to be locked into one vendor, but have plans for only one platform currently.

These are all good reasons to think about features, and database platforms from the outset.

For situation #1, you need to be most serious about cross-platform compatibility from the start.  Build modules for each database platform, with platform specific code isolated in that module.  If the particular feature you want to use is available only on one of the two platforms, the alternate platform will have to include its implementation of that feature in the database specific module.  Also by isolating all database specific interactions to one module, you have also put boundaries around that code.  If you choose to support another database platform in the future, you merely need to rewrite that database interaction module.

For situation #2, you would use a similar tactic, but won’t necessarily have to implement all the routines in that module for the alternate platform.  Just keep those features, and differences in mind during the development phase.  Where possible document those differences, and comment code liberally.  This will go a long way towards preparing you if you do decide to go for a different database backend.

In situation #3, this may be more of a philosophical concern at this stage.  Don’t get overly dragged down by this, as it’s hypothetical at this stage.  Sometimes developers labor under this concern from previous bad experiences migrating to a new database platform.  But to some degree this is the nature of the beast.  Database platforms include a myriad of different features, datatypes, storage methods, and coding languages.  In many ways this is where their power lies.

Sean Hull discusses on Quora: What is a database migration and why is it important?