Category Archives: All

Migrating MySQL to Oracle Guide

Also find Sean Hull’s ramblings on twitter @hullsean.

Migrating from MySQL to Oracle can be as complex as picking up your life and moving from the country to the city.  Things in the MySQL world are often just done differently than they are in the Oracle world.  Our guide will give you a bird’s-eye view of the differences to help you determine what is the right path for you.

** See also: Oracle to MySQL Migration Considerations **

MySQL comes from a more open-source, do-it-yourself background, one where Unix and Linux administrators, and even developers, carry the responsibilities of a DBA.

  1. Installation & Administration Considerations
  2. Query and Optimizer Differences
  3. Security Strengths and Weaknesses
  4. Replication & High Availability
  5. Table Types & Storage Engines
  6. Applications, Connection Pooling, Stored Procedures and More
  7. Backups & Disaster Recovery
  8. Community – MySQL & Oracle Differences
  9. TCO, Licensing, and Cloud Considerations
  10. Advanced Oracle Features – Missing in MySQL

Check back soon as we update each of these sections.

Oracle to MySQL Migration Considerations

There are a lot of forms of transportation, from walking to bike riding, motorcycles and cars to buses, trains and airplanes.  Each mode of transport will get you from point A to point B, but one may be faster, another more comfortable, and another more cost effective.  It’s important to keep in mind when comparing databases like Oracle and MySQL that there are indeed a lot of feature differences, a lot of cultural differences, and a lot of cost differences.  There are also a lot of impassioned people on both sides arguing at the tomfoolery of the other.  Hopefully we can dispel some of the myths and discuss the topic fairly.

** See also: Migrating MySQL to Oracle Guide **

As a long time Oracle DBA turned MySQL expert, I’ve spent time with clients running both database engines and many migrating from one to the other.  I can speak to many of the differences between the two environments.  I’ll cover the following:

  1. Query & Optimizer Limitations
  2. Security Differences
  3. Replication & HA Are Done Differently
  4. Installation & Administration Simplicity
  5. Watch Out – Triggers, Stored Procedures, Materialized Views & Snapshots
  6. Huge Community Support – Open-source Add-ons
  7. Enter The Cloud With MySQL
  8. Backup and Recovery
  9. Miscellaneous Considerations

Check back again as we edit and publish the various sections above.

iHeavy Insights 77 – What Consultants Do

 

What Do Consultants Do?

Consultants bring a whole host of tools and experiences to bear on solving your business problems.  They can fill a need quickly, look in the right places, reframe the problem, communicate and get teams working together, and bring to light problems on the horizon.  And they tell stories of challenges they faced at other businesses, and how they solved them.

Frame or Reframe The Problem

Oftentimes businesses see the symptoms of a larger problem, but not the cause.  Perhaps their website is sluggish at key times, causing them to lose customers.  Or perhaps it is locking up inexplicably.  Framing the problem may involve identifying the bottleneck and pointing to a particular misconfigured option in the database or webserver.  Or it may mean looking at the technical problem you’ve chosen to solve and asking if it meets or exceeds what the business needs.

Tell Business Stories

Clients often have a collection of technologies and components in place to meet their business needs.  But day-to-day running of a business is ultimately about bringing a product or service to your customer.  Telling stories of the challenges and solutions of past customers helps illustrate, educate, and communicate the problems you’re facing today.

Fill A Need Quickly

If you have an urgent problem, and your current staff is overextended, bringing in a consultant to solve a specific problem can be a net gain for everyone.  They get up to speed quickly, bring fresh perspectives, and review your current processes and operations.  What’s more, they can be used in a surgical way, to augment your team for a short stint.

Get Teams Communicating

I’ve worked at quite a number of firms over the years, and been tasked with solving a specific technical problem only to find the problem was a people problem to begin with.  In some cases the firm already has the knowledge and expertise to solve a problem, but some members are blocking.  This can be because some folks feel threatened by a new solution which will take away responsibilities they formerly held.  Or it can be because they feel some solution will create new problems which they will then be responsible for cleaning up.  In either case, bridging the gap between business needs and the operations teams that solve those needs can mean communicating to each team in ways that make sense to them.  A technical, detail-oriented focus makes most sense when working with the engineering teams; a business and bottom-line focus when communicating with the management team.

Highlight Or Bring To Light Problems On The Horizon

Is our infrastructure a ticking timebomb?  Perhaps our backups haven’t been tested and are missing some crucial component?  Or we’ve missed some security consideration, left some password unset, left the proverbial gate open to the castle.  When you deal with your operations on a day-to-day basis, little details can be easy to miss.  A fresh perspective can bring needed insight.

BOOK REVIEW – Jaron Lanier – You Are Not a Gadget

Lanier is a programmer, musician, the father of VR way back in the ’90s, and a wide-ranging thinker on topics in computing and the internet.

His new book is a great, if at times meandering read on technology, programming, schizophrenia, inflexible design decisions, marxism, finance transformed by cloud, obscurity & security, logical positivism, strange loops and more.

He opposes the thinking-du-jour among computer scientists, leaning in a more humanist direction summed up here:  “I believe humans are the result of billions of years of implicit, evolutionary study in the school of hard knocks.”    The book is worth a look.

Disaster Recovery Services

Call Us Daily 11-11 EST:  +1-212-533-6828

Planning and implementing a bulletproof disaster recovery strategy forms a large piece of your business continuity plans.  We can:

  • Review your entire web operations & cloud environment
  • Perform fire drills to test backups, scripts, and processes
  • Examine security of operations
  • Provide feedback on your current environment
  • Work closely with your team

Emergency Services

  • Performance related outages
  • MySQL database crash
  • Upgrade related problems
  • Server & Hardware outages

Backup and Recovery in EC2 – 5 Point Checklist

Best practices for backups and disaster recovery aren’t tremendously different in the cloud than in a managed hosting environment.  But they are more crucial, since cloud servers are less reliable than physical servers.  Also, the security aspect may play a heightened role in the cloud.  Here are some points to keep in mind.

Read the original article – Intro to EC2 Cloud Deployments.

1. Perform multiple types of backups
2. Keep non-proprietary backups offsite
3. Test your backups – perform firedrills
4. Encrypt backups in S3
5. Perform Replication Integrity Checks

Deploying MySQL on Amazon EC2 – 8 Best Practices

Also find Sean Hull’s ramblings on twitter @hullsean.

There are a lot of considerations for deploying MySQL in the Cloud.  Some concepts and details won’t be obvious to DBAs used to deploying on traditional servers.  Here are eight best practices which will certainly set you off on the right foot.

This article is part of a multi-part series Intro to EC2 Cloud Deployments.

1. Replication

Master-Slave replication is easy to setup, and provides a hot online copy of your data.  One or more slaves can also be used for scaling your database tier horizontally.

Master-Master active/passive replication can also be used to bring higher uptime, and allow some operations such as ALTER statements and database upgrades to be done online with no downtime.  The secondary master can be used for offloading read queries, and additional slaves can also be added as in the master-slave configuration.

Caution: MySQL’s replication can drift silently out of sync with the master. If you’re using statement based replication with MySQL, be sure to perform integrity checking to make your setup run smoothly. Here’s our guide to bulletproofing MySQL replication.

2. Security

You’ll want to create an AWS security group for databases which opens port 3306, but not to the internet at large – only to your AWS-defined webserver security group.  You may also decide to use a single box and security group which allows port 22 (ssh) from the internet at large.  All ssh connections will then come in through that box, and internal security groups (database & webserver groups) should only allow port 22 connections from that security group.
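A sketch of that layout using the AWS command line tools follows.  The group names, and the use of the modern `aws` CLI rather than the older EC2 API tools, are assumptions – adapt to your own tooling:

```shell
# Webserver tier: reachable from the internet on port 80
aws ec2 create-security-group --group-name web --description "webserver tier"
aws ec2 authorize-security-group-ingress --group-name web --protocol tcp --port 80 --cidr 0.0.0.0/0

# Database tier: port 3306 open only to the web security group, not the internet
aws ec2 create-security-group --group-name db --description "database tier"
aws ec2 authorize-security-group-ingress --group-name db --protocol tcp --port 3306 --source-group web

# Bastion box: ssh from the internet; web and db accept 22 only from it
aws ec2 create-security-group --group-name bastion --description "ssh gateway"
aws ec2 authorize-security-group-ingress --group-name bastion --protocol tcp --port 22 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-name web --protocol tcp --port 22 --source-group bastion
aws ec2 authorize-security-group-ingress --group-name db --protocol tcp --port 22 --source-group bastion
```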

When you set up replication, you’ll be creating users and granting privileges.  You’ll need to grant to the wildcard ‘%’ hostname designation, as your internal and external IPs will change each time a server dies.  This is safe since you expose your database server’s port 3306 only to other AWS security groups, and not to internet hosts.
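Creating the replication user with that wildcard host might look like the following.  The user name and password are placeholders:

```shell
mysql -u root -p -e "
  CREATE USER 'repl'@'%' IDENTIFIED BY 'choose-a-strong-password';
  GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
"
```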

You may also decide to use an encrypted filesystem for your database mount point, your database backups, and/or your entire filesystem.  Be particularly careful of your most sensitive data.  If compliance requirements dictate, choose to store very sensitive data outside of the cloud and secure network connections to incorporate it into application pages.

Be particularly careful of your AWS logins.  The password recovery mechanism in Amazon Web Services is all that prevents an attacker from controlling your entire infrastructure, after all.

3. Backups

There are a few ways to backup a MySQL database.  By far the easiest way in EC2 is using the AWS snapshot mechanism for EBS volumes.  Keep in mind you’ll want to encrypt these snapshots as S3 may not be as secure as you might like.   Although you’ll need to lock your MySQL tables during the snapshot, it will typically only take a few seconds before you can release the database locks.

Now snapshots are great, but they can only be used within the AWS environment, so it also behooves you to be performing additional backups, and moving them offsite either to another cloud provider or to your own internal servers.  For this your choices are logical backups or hotbackups.

mysqldump can perform logical backups for you.  These backups perform SELECT * on every table in your database, so they can take quite some time, and really destroy the warm blocks in your InnoDB buffer cache.   What’s more rebuilding a database from a dump can take quite some time.  All these factors should be considered before deciding a dump is the best option for you.
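Before shipping a logical backup offsite, compress and encrypt it.  Here is a minimal sketch with openssl – the file names and passphrase file are assumptions, and the mysqldump line is commented out so the example stands alone with a stand-in payload:

```shell
BACKUP=backup.sql.gz
PASSFILE=backup.pass

# In real use: mysqldump --single-transaction --all-databases | gzip > "$BACKUP"
printf 'stand-in for a real dump\n' | gzip > "$BACKUP"
printf 'choose-a-strong-passphrase\n' > "$PASSFILE"

# Encrypt with AES-256 before uploading to S3 or another provider
openssl enc -aes-256-cbc -salt -pbkdf2 -in "$BACKUP" -out "$BACKUP.enc" -pass "file:$PASSFILE"

# Always verify the round-trip before trusting the encrypted copy
openssl enc -d -aes-256-cbc -pbkdf2 -in "$BACKUP.enc" -out "$BACKUP.dec" -pass "file:$PASSFILE"
cmp "$BACKUP" "$BACKUP.dec" && echo "round-trip OK"
```

The round-trip check is the important habit here: an encrypted backup you have never decrypted is not yet a backup.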

xtrabackup is a great open source tool available from Percona.  It can perform hotbackups of all MySQL tables including MyISAM, InnoDB and XtraDB if you use them.  This means the database will remain online, not locking tables, with smarter, less destructive hits to your buffer cache and database server as a whole.  The hotbackup will build a complete copy of your datadir, so bringing up the server from a backup involves setting the datadir in your my.cnf file and starting.
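A typical invocation looks like this – the backup directory is illustrative, and older releases wrap these steps in the innobackupex script instead:

```shell
# Take a hot backup of the running server's datadir
xtrabackup --backup --target-dir=/backups/mysql-$(date +%F)

# Prepare the backup so it is consistent and ready to start a server from
xtrabackup --prepare --target-dir=/backups/mysql-$(date +%F)
```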

We wrote a handy guide to using hotbackups to setup replication.

4. Disk I/O

Obviously Disk I/O is of paramount importance for the performance of any database server, including MySQL.  In AWS you do not want to use instance-store storage at all.  Be sure your AMI is built on EBS, and further, use a separate EBS mount point for the database datadir.

An even better configuration than the above, but slightly more complex to configure is a software raid stripe of a number of EBS volumes.  Linux’s software raid will create an md0 device file which you will then create a filesystem on top of – use xfs.  Keep in mind that this arrangement will require some care during snapshotting, but can still work well.  The performance gains are well worth it!
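A sketch of that stripe, assuming four EBS volumes already attached at the device names shown:

```shell
# Stripe four attached EBS volumes into one RAID0 device
mdadm --create /dev/md0 --level=0 --raid-devices=4 \
      /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi

# xfs on top, mounted as the MySQL datadir
mkfs.xfs /dev/md0
mkdir -p /var/lib/mysql
mount -t xfs -o noatime /dev/md0 /var/lib/mysql
```

Remember the snapshotting caveat: with data striped across volumes, all four snapshots must be taken as a consistent set.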

5. Network & IPs

When configuring Master & Slave replication, be sure to use the internal or private IPs and internal domain names so as not to incur additional network charges.  The same goes for your webservers which will point to your master database, and one or more slaves for read queries.

6. Availability Zones

Amazon Web Services provides a tremendous leap in options for high availability.  Take advantage of availability zones by putting one or more of your slaves in a separate zone where possible.  Interestingly if you ensure the use of internal or private IP addresses and names, you will not incur additional network charges to servers in other availability zones.

7. Disaster Recovery

EC2 servers are out of the gates *NOT* as reliable as traditional servers.  This should send shivers down your spine if you’re trying to treat AWS like a traditional hosted environment.  You shouldn’t.  It should force you to get serious about disaster recovery.  Build bulletproof scripts to spinup your servers from custom built AMIs and test them.  Finally you’re taking disaster recovery as seriously as you always wanted to.   Take advantage of Availability Zones as well, and various different scenarios.

8. Vertical and Horizontal Scaling

Interestingly vertical scaling can be done quite easily in EC2.  If you start with a 64bit AMI, you can stop such a server, without losing the root EBS mount.  From there you can then start a new larger instance in EC2 and use that existing EBS root volume and voila you’ve VERTICALLY scaled your server in place.  This is quite a powerful feature at the system administrator’s disposal.  Devops has never been smarter!  You can do the same to scale *DOWN* if you are no longer using all the power you thought you’d need.  Combine this phenomenal AWS feature with MySQL master-master active/passive configuration, and you can scale vertically with ZERO downtime.  Powerful indeed.
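With the modern `aws` CLI the whole resize-in-place dance is a few commands.  The instance id and target instance type below are placeholders:

```shell
# Stop the instance (the EBS root volume persists across the stop)
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0

# Change the instance type in place, then start it back up
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 \
    --instance-type '{"Value": "m5.xlarge"}'
aws ec2 start-instances --instance-ids i-0123456789abcdef0
```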

We wrote an EC2 Autoscaling Guide for MySQL that you should review.

Along with vertical scaling, you’ll also want the ability to scale out, that is add more servers to the mix as required, and scale back when your needs reduce.  Build in smarts in your application so you can point SELECT queries to read-only slaves.  Many web applications exhibit the bulk of their work in SELECTs, so being able to scale those horizontally is very powerful and compelling.  By baking this logic into the application you also allow the application to check for slave lag.  If your slave is lagging slightly behind the master, you can see stale data, or missing data.  In those cases your application can choose to go to the master to get the freshest data.

What about RDS?

Wondering whether RDS is right for you? It may be. We wrote a comprehensive guide to evaluating RDS over MySQL.

If you read this far, you should grab our newsletter!

Managing Security in Amazon Web Services

Security is on everyone’s mind when talking about the cloud.  What are some important considerations?

For the web operations team:

  1. AWS has no perimeter security, should this be an overriding concern?
  2. How do I manage authentication keys?
  3. How do I harden my machine images?

** Original article — Intro to EC2 Cloud Deployments **

Amazon’s security groups can provide strong security if used properly.  Create security groups with specific minimum privileges, and do not expose your sensitive data – i.e. your database – to the internet directly, but only to other security groups.  On the positive side, AWS security groups mean there is no single point to mount an attack against, as with a traditional enterprise’s network security.  What’s more, there is no opportunity to accidentally erase network rules, since they are defined in groups in AWS.

Authentication keys can be managed in a couple of different ways.  One way is to build them into the AMI.  From there any server spinup based on that AMI will be accessible by the owner of those credentials.  Alternatively a more flexible approach would be to pass in the credentials when you spinup the server, allowing you to dynamically control who has access to that server.

Hardening your AMIs in EC2 is much like hardening any Unix or Linux server.  Disable user accounts, ssh password authentication, and unnecessary services.  Consider a tool like AppArmor to fence applications in and keep them out of areas they don’t belong.  This can be an ongoing process that is repeated if the unfortunate happens and you are compromised.
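The ssh portion of that hardening might look like the following.  It is demonstrated here on a sample config file so the example is self-contained; on a real AMI you would apply the same sed lines to /etc/ssh/sshd_config and restart sshd:

```shell
# Sample config standing in for /etc/ssh/sshd_config
cat > sshd_config.sample <<'EOF'
#PasswordAuthentication yes
#PermitRootLogin yes
UsePAM yes
EOF

# Key-only ssh logins, and no direct root access
sed -E -i 's/^#?PasswordAuthentication.*/PasswordAuthentication no/' sshd_config.sample
sed -E -i 's/^#?PermitRootLogin.*/PermitRootLogin no/' sshd_config.sample
```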

You should also consider:

  • AWS password recovery mechanism is not as secure as a traditional managed hosting provider.  Use a very strong password to lock down your AWS account and monitor its usage.
  • Consider encrypted filesystems for your database mount point.  Pass in decryption key at server spinup time.
  • Consider storing particularly sensitive data outside of the cloud and expose through SSL API call.
  • Consider encrypting your backups.  S3 security is not proven.

For CTOs and Operations Managers:

  1. Where is my data physically located?
  2. Should I rely entirely on one provider?
  3. What if my cloud provider does not sufficiently protect the network?

Although you do not know where your data is physically located in S3 and EC2, you have the choice of whether or not to encrypt your data and/or the entire filesystem.  You also control access to the server.  So from a technical standpoint it may not matter whether you control where the server is physically.  Of course laws, standards and compliance rules may dictate otherwise.

You also don’t want to put all your eggs in one basket.  There are all sorts of things that can happen to a provider, from going out of business, to lawsuits that directly or indirectly affect you to even political pressure as in the wikileaks case.  A cloud provider may well choose the easier road and pull the plug rather than deal with any complicated legal entanglements.  For all these reasons you should be keeping regular backups of your data either on in-house servers, or alternatively at a second provider.

As a further insurance option, consider host intrusion detection software.  This will give you additional peace of mind against the potential of your cloud provider not sufficiently protecting their own network.

Additionally consider that:

  • A simple password recovery mechanism in AWS is all that sits between a hacker and your infrastructure.  Choose a very secure password, and monitor its usage.
  • EC2 servers are not nearly as reliable as traditional physical servers.  Test your deployment scripts, and your disaster recovery scenarios again and again.
  • Responding to a compromise will be much easier in the cloud.  Spinup the replacement server, and keep the EBS volume around for later analysis.

As with any new paradigm there is an element of the unknown and unproven which we are understandably concerned about.  Cloud hosted servers and computing can be just as secure if not more secure than traditional managed servers, or servers you can physically touch in-house.

iHeavy Insights 76 – Scale By Design

In a recent trip to Southeast Asia, I visited the Malaysian city of Kuala Lumpur.  It is a sprawling urban area, more like Los Angeles than New York.  With all the congestion and constant traffic jams the question of city planning struck me.  On a more abstract level this is the same challenge that faces web application and internet website designers.  Architect and bake the quality into the framework, or hack it from start to finish?

Urban Un-Planning

Looking at cities like Los Angeles you can’t help but think that no one imagined there would ever be this many cars.  You think the same thought when you are in Kuala Lumpur.  The traffic reaches absurd levels at times.  A local friend told me that when the delegates travel through the city, they have a cavalcade of cars, and a complement of traffic cops to literally move the traffic out of the way.  It’s that bad!

Of course predicting how traffic will grow is no science.  But still cities can be planned.  Take a mega-city like New York for example.  The grid helps with traffic.  With a system of one way streets and a few main arteries, travelers and taxis alike can make better informed decisions about which way to travel.  What’s more, the city core is confined to an island, so new space is built upward rather than outward.  Suddenly the economics of closeness wins out.  Many buildings in midtown you can walk between, or at most take a quick taxi ride.  Suddenly a car becomes a burden.  What’s more, the train system is a spider web of subways and regional transit, branching north to upstate New York, northeast to Connecticut, east to Long Island, and west to New Jersey.

If you’ve lived in the New York metropolitan region and bought a home, or work in real estate you know that proximity to a major train station affects the prices of homes.  This is the density of urban development working for us.  It is tough to add this sauce to a city that has already sprawled.   And so it is with architecting websites and applications.

Architecting for the Web

Traffic to a website can be as unpredictable as traffic within the confines of an urban landscape.   And the spending that goes into such infrastructure is just as delicate.  Spend too much and you risk building for people who will never arrive.  What’s more, while the site traffic remains moderate, it is difficult to predict patterns of larger volumes of users.  What areas of the site will they be most interested in?  Have we done sufficient capacity planning around those functions?  Do those particular functions cause bottlenecks around the basic functioning of the site, such as user logins, and tracking?

Baking in the sauce for scalability will never be an exact science of course.  In urban planning you try to learn from the mistakes of cities that did things wrong, and try to replicate some of the things that you see in cities doing it right.  Much the same can be said for websites and scalability.

For instance it may be difficult to do bulletproof stress testing and functional testing to cover every single possible combination.  But there are best practices for architecting an application that will scale.  Basics such as using version control – of course, but I have seen clients who don’t.  There are a few options to choose from, but they all provide versioning, and self-document your development process.  Next build redundancy into the mix.  Load balance your application servers of course, and build various levels of caching – reverse proxy caching such as varnish, and a key-value caching system like memcache.  Build redundancy into the database layer, even if you aren’t adding all those servers just yet.  Your application should be multi-database aware.  Either use an abstraction layer, or organize your code around write queries, and read-only queries.  If possible build in checks for stale data.

Also consider various cloud providers to host your application, such as Amazon’s Elastic Compute Cloud.  These environments allow you to script your infrastructure, and build further redundancy into the mix.  Not only can you take advantage of features like auto-scaling to support dynamic growth in traffic, but you can scale servers in place, moving your server images from medium to large, to x-large servers with minimal outage.  In fact with MySQL multi-master active/passive replication on the database tier, you could quite easily switch to larger instances or from larger to smaller instances dynamically, without *any* downtime to your application.

Conclusion

Just as no urban planner would claim they can predict the growth of a city, a devops engineer won’t claim they can predict how traffic to your website will grow.  What we can do is prepare for that growth: build in quality, build scaffolding so the site can grow organically, and then monitor, collect metrics and do basic capacity planning.  A small amount of design up front will pay off over and over again.

Book Review: How To Disappear by Frank M Ahearn

With such an intimidating title you might think at first glance that this is a book only for the paranoid or criminally minded.  Now granted Mr Ahearn is a Skip Tracer, and if you were one already you certainly wouldn’t need this book.  Still Skip Tracers have a talent for finding people, just as an investigator or a detective has for catching the bad guys.  And what a person like this can teach us about how they find people is definitely worth knowing.

If you’ve had your concerns about privacy, what companies have your personal information and how they use it, this is a very interesting real-world introduction to the topic.  Of particular interest might be the chapter on identity thieves and another on social media.  All-in-all a quick read and certainly one-of-a-kind advice!

View on Amazon – How To Disappear

How To Build Highly Scalable Web Applications For The Cloud

Scalability in the cloud depends a lot on application design.  Keep these important points in mind when you are designing your web application and you will scale much more naturally and easily in the cloud.

** Original article — Intro to EC2 Cloud Deployments **

1. Think twice before sharding

  • it increases your infrastructure and application complexity
  • it reduces availability – more servers mean more outages
  • you have to worry about globally unique primary keys

2. Bake read/write database access into the application

  • allows you to check for stale data, fallback to write master
  • creates higher availability for read-only data
  • gracefully degrade to read-only website functionality if master goes down
  • horizontal scalability melds nicely with cloud infrastructure and IAAS
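The stale-data check above can be sketched in shell: parse Seconds_Behind_Master out of the slave’s status, and fall back to the master whenever the slave is lagging or replication is broken.  The hostnames and the 5-second threshold are illustrative assumptions:

```shell
# Pick the host for a read query based on replication lag.
# stdin: output of `SHOW SLAVE STATUS\G`; args: slave host, master host
slave_read_host() {
  lag=$(awk -F': ' '$1 ~ /Seconds_Behind_Master/ { gsub(/ /, "", $2); print $2 }')
  if [ -n "$lag" ] && [ "$lag" != "NULL" ] && [ "$lag" -le 5 ]; then
    echo "$1"    # slave is current enough to serve the read
  else
    echo "$2"    # lagging, NULL, or broken: send the read to the master
  fi
}

# In production:
# mysql -h db-slave.internal -e 'SHOW SLAVE STATUS\G' \
#   | slave_read_host db-slave.internal db-master.internal
```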

3. Save application state in the database

  • avoid in-memory locking structures that won’t scale with multiple web application servers
  • consider a database field for managing application locks
  • consider stored procedures for isolating and insulating developers from db particulars
  • a last updated timestamp field can be your friend
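One way to use such a lock field: treat a row as the lock, and claim it with a single atomic UPDATE, so exactly one contender wins.  The table name, lock name, and holder token below are hypothetical:

```shell
TOKEN="$(hostname)-$$"   # unique-ish holder id for this process

# Claim the lock: the UPDATE matches only while holder is NULL, so only
# one contender's statement changes the row
mysql -e "UPDATE app_locks
          SET holder = '$TOKEN', acquired_at = NOW()
          WHERE name = 'nightly_report' AND holder IS NULL;"
# Check the affected-row count to know whether you actually got the lock.

# Release it when done
mysql -e "UPDATE app_locks
          SET holder = NULL
          WHERE name = 'nightly_report' AND holder = '$TOKEN';"
```

The acquired_at timestamp doubles as a safety valve: a sweeper job can clear locks held past some sane maximum.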

4. Consider Dynamic or Auto-scaling

  • great feature of cloud, spinup new servers to handle load on-demand
  • lean towards being proactive rather than reactive and measure growth and trends
  • watch the procurement process closely lest it come back to bite you

5. Setup Monitoring and Metrics

  • see trends over time
  • spot application trouble and bottlenecks
  • determine if your tuning efforts are paying off
  • review a traffic spike after the fact

The cloud is not a silver bullet that can automatically scale any web application.  Software design is still a crucial factor.  Bake in these features with the right flexibility and foresight, and you’ll manage your website’s growth patterns with ease.

Have questions or need help with scalability?  Call us:  +1-212-533-6828

Metrics Bridge Gap Between IT & Business Units

On the business side we’ve all seen requests for hardware purchases that seem astronomical, or somehow out of proportion to the project at hand.  And on the IT side we’ve been faced with the challenge of selling capital expenditures on technology, as demands grow.

Collecting statistics on real usage of server systems, and then connecting the dots to business metrics is an excellent way to bridge the gap.  This allows IT to draw concrete connection between technology investment, and reaching business goals.

Metrics and drawing the dotted line in this way also educates folks on both sides of the tracks.  It educates technologists on exactly how technology purchases can be justified, by their direct return to the business.  And it educates finance and business executives on how those hardware purchases directly contribute to business growth.