Tag Archives: oracle

Why Oracle won’t kill MySQL

oracle mysql database

Join 15,000 others and follow Sean Hull on twitter @hullsean.

1. MySQL does not compete with Oracle

It’s a myth that MySQL somehow poses a threat to Oracle. Oracle’s customers tend to be large enterprises running apps like e-business suite. These are certified to run on Oracle, and further they sit close to finance.

MySQL tends to be a choice of scrappy but nimble startups for their web-facing applications. They want to deploy in the cloud, and don’t want to deal with licenses. Plus they have the techops chops to handle the bushwacking of open source.

Related: Why I wrote the book on Oracle & Open Source

2. Oracle bought Sun for the hardware business

Remember when Oracle acquired Sun? A lot of folks assumed Larry was after MySQL. Grab it & slowly smother it. But actually it was more frosting on the cake. Larry had for years expressed interest in cubes and clusters, and building an Oracle appliance. Whether this ever came to profitable fruition in the form of Exadata remains to be seen. But buying Sun for a song helped him do this.

Also: Why bemoaning AWS performance sounds like Linux detractors circa 1999

3. Larry blows with the wind on open source

He’s money minded, so you’ll see in his decisions that comes first.

In the late 90’s when a customer might spend $100k on Sun and $100k on Oracle licenses, Larry realized porting to Linux and pushing commodity hardware would be a win. So he pushed Linux, and customers could now spend $20k on commodity hardware and $180k on Oracle licenses for them. Imagine the 10million dollar budget if you’re having trouble with the math here.

He also eventually moved the middle tier to Apache for similar reasons. I would argue Oracle corp overall pays lip service to contributing to open source, but they do that to some degree.

Read: Why MySQL dbas are so hard to find

4. MySQL support business is real

What’s more, just as adopting Linux, and then offering their “unbreakable Linux” distro, and pricey support along with it, they’re doing similar things with MySQL. For enterprise customers, and those already comfortable with making the call to Redwood Shores, sales folks will happily direct them support contracts and enterprise add-ons. Naturally.

Read: Why your startup needs real techops

5. There are real viable alternatives to keep balance

And let’s not forget folks, there are already a bunch of forks. There’s the popular and every growing Mariadb which Google has put their muscle behind.

Of course let’s not forget the very popular, very capable, and very bulletproof Percona distribution, along with the Percona toolkit and xtrabackup for real hotbackups.

And for those looking to experiment, there’s Drizzle a work in progress, complete rewrite, and one that’s unfortunately not a drop-in replacement.

Read this: What’s the four letter word dividing dev and ops?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Why I Wrote the Book – Oracle and Open Source

Back in the late 90’s New York City was deep in the dot-com boom. Silicon Alley was being born, and a thousand internet startups were sprouting. Everyone was hiring, it was an exciting time to work in technology!

Join 11,500 others and follow Sean Hull on twitter @hullsean.

Trend Spotting Circa 2000

As an independent consultant, I had the opportunity to work at quite a few startups. The technology stack was identical at almost all of them. Sun Microsystems hardware, Apache webservers, and Oracle on the backend. The database was always the sticking point, and developers struggled to get their queries right.

It was an interesting role to hold. Most career DBAs worked at large fortune 500 firms, the old stodgy kind where nothing ever changes. Few of the Oracle old guard, the kind you’d meet at User Groups or conferences, had much exposure to Linux, and they certainly didn’t trust it.

Also: Here’s how to do a scalability performance review

Meanwhile in the startup scene in NYC I was seeing the cutting edge uses of the technology, with more and more shops switching to Linux and commodity hardware. There was even talk of *gasp* Oracle porting to Linux. There was a real rumor mill around all of this.

Oracle and Open Source Published – 2001

Seeing this shift towards commodity hardware, and the tremendous demand for Oracle married with open source technologies, I pitched O’Reilly and Associates with a book idea. Let’s talk about what’s happening in the trenches. How and when does Oracle – the most commercial of relational databases, work with Open Source technologies? What is in the mix? What are real firms using it for? What tools and technologies can help firms grow faster?

Related: Oracle DBA Interview questions for managers, candidates & recruiters alike

These were the questions my co-author and I sought to answer, and to judge from the response I think we did a very good job. As that push continued, Oracle eventually ported it’s enterprise database to Linux. This was a seismic shift that meant existing Oracle customers would spend a lot less on hardware, and thus have more to spend on Oracle licenses. Win-win except for Sun. The trend continued with Oracle pushing Apache into the mix as well.

Fast Forward a Decade

Now a decade later, Oracle has bought it’s former partner Sun, and in so doing owns MySQL too.

Read this: Top MySQL Interview questions for Devops, managers & recruiters

What new trends are happening? We hear an incessant drum of hype around cloud computing. In many ways the trend parallels what happened a decade ago. See our related piece a history lesson for cloud detractors. How so?

[quote]Commoditization: push towards new platforms, driven by cost. [/quote]

But this is slowed by an equally large stumbling block.

[quote]Performance: new cloud servers can’t compete with their big iron cousins. Not yet at least.[/quote]

Interested in Amazon EC2? We wrote an Intro to EC2 Cloud Deployments article which digs in deeper.

What’s Next for Datacenters

Commiditization will continue, driving costs downward. This will provide more gravity to cloud migrations for firms big and small.

Performance will improve. Cloud services like Amazon EC2 will get bigger & better, as will the all important network & disk subsystems.

Also: 5 things toxic to scalability

Big enterprises are already dipping their feet in the water with VPC technology, tying their existing datacenter to a cloud. They can grow elastically while still having feet firmly planted on the ground.

As large enterprises begin to get experience behind the wheel, it’ll chip away at the stranglehold of Oracle and the huge taxation type licensing that firms struggle with today. Where salesforce.com had a huge impact, workday.com will be even bigger.

[quote]The cloud will finally disrupt the last old guard industry – enterprise software.[/quote]

Read this far? Get us monthly in your inbox. Grab our scalable startups newsletter!

Best of Guide – Highlights of Our Popular Content

We cherry pick the top 5 most popular posts of various topics we’ve covered in recent months.

Oracle to MySQL – prepare to bushwhack through the open source jungle

oracle to mysql

I was recently approached by a healthcare company for advice on suitable database solutions capable of executing its new initiative. The company was primarily an Oracle shop so naturally, they began by shopping for possible Oracle solutions.

The CTO relayed his conversation with the Oracle sales rep, who at first recommended an Oracle solution that, expensive as it may have been, ultimately aligned with the company’s existing technology and experience. Unfortunately this didn’t match their budget and so predictably, the Oracle sales rep whipped out a MySQL-based solution as an alternative.

Having worked as an Oracle DBA throughout the dot-com years, I know the technology well. I also know the cultural differences between enterprises that choose Oracle solutions and those that choose open-source ones.

This encounter with the healthcare firm struck me as a classic conundrum for today’s companies who are under pressure to meet business targets under a tight budget, and in a very short time.

Can an open-source solution like MySQL be the answer to such huge demands?

The Oracle sales rep will likely nod excitedly and say no sweat. But as a consultant I could only manage an equivocal yes.

As the healthcare CTO rattled off the list of products he wanted to use, specific RTOs and RPOs (recovery time objective + recovery point objective – all I could think was to react with concern.

In my experience with startup after startup I’ve seen plenty of different MySQL installations but I’d never heard of one with the technology stack he described. What’s more I’d never heard of these solutions described with the Oracle Corp titles.

On one hand I wanted to discuss the merits of the solution he was keen to implement, while on the other, I was expressing concern over possible directions and paths we might take.

An Oracle cluster is not a MySQL cluster

The solution Oracle suggested was a MySQL Cluster. The term cluster unfortunately means different things to different people. Such loose usage of the word dilutes its meaning. In particular a lot of Oracle technologists expect that this solution might be similar to Oracle’s Real Application Cluster technology. It’s not. There are a lot of limitations, and frankly it’s really just a different beast.

The list also included various management dashboards which Oracle likes to push, but which I rarely see in my consulting assignments. What’s more I heard nothing about replication integrity considering that replication problems are an ongoing concern for real-world MySQL installations due to the particular technology used under the hood. There are reliable solutions to this problem but none yet available from Oracle. In fact, this is a big problem but one that may be completely off the sales guys’ radar.

Don’t let sales frame your architecture

Honestly, I don’t have a particularly large axe to grind with the sales guys. They have a job to do, and providing solutions which bring revenue to their firm and commissions for themselves is what puts food on their tables. Each party is motivated in different ways. But as a company shopping for solutions, this should be kept clearly in mind when starting down that road.

Beware prescribed architectural frameworks that appear too easy because they almost always don’t do what they say on the tin. Unfortunately sales folks don’t have experiencing designing architectures in the real world, so they can’t really know how the technologies work beyond the data sheet with feature bullet points.

As we all know in the technology space, all software come with bugs and real-world experience does not match the feature lists in the brochures. In law they have de jure and de facto. The former describes what is written and the latter, what’s practiced. For technology solutions, its never just adding water for something to work.

Do your homework

Before you embark on a new trip through the open source technology jungle, do some due diligence. Read up on real-world solutions, and how other large firms are using the technology. What configurations are they having success with? Which are causing trouble for a lot of people.

One of the great advantages of open-source are the very vibrant communities, forums and discussion groups where people are glad to share their experiences and offer advice.

Allow sufficient time to test and
bring your team up to speed

This is very important one. Shifting from an enterprise that relies primarily on Oracle for it’s relational database solution over to one that relies on open source technologies is a very big step indeed. Open-source technologies tend to be much more do-it-yourself and roll your own. Oracle solutions tend much more toward predefined paths and solutions and prescriptions for customers.

There are merits to each of these paths, with attendant pros and cons. But they are decidedly different. It’s likely that your team will also require time to get up to speed, not just with the particular software components, but with the new process by which things happen in the open-source space. Allow sufficient time for this shift to take place, lest you create more problems than solutions.

Extract Transform & Load – What is it and why is it important?

So-called ETL relates to moving data from external sources into and out of relational databases or data warehouses.


Source systems may store data in an infinite variety of formats.  Extracting involves getting that data into common files for moving to the destination system.  CSV file also known as comma separated values is named because each of the records is stored as one line in the file, and fields are separated by commas, and often surrounded by quotes as well.  In MySQL INTO OUTFILE syntax can perform this function.  If you have a lot of tables to work with, you can script the process using the data dictionary as a lookup for table names, and create a .mysql script to then run with the mysql shell.  In Oracle you would use the spool command in SQL*Plus the command line shell.  Spool sends subsequent output from the screen also to a file.


This step involves modifying the extracted data in preparation for moving it into the target database server.  It may involve sweeping out blank records, or rearranging columns, or breaking files into smaller subsets of data.  You might also map values differently for instance if one column in the source database was gender with values M/F you might transform those to the strings “Male” and “Female” if that is more useful for your target database server.  Or you might transform those to numerical values, for instance Male & Female might be 0/1 in your target database.

Although I myriad of high level GUI tools exist to perform these functions, the Unix operating system includes a plethora of very powerful tools that every experience System Administrator is familiar with.  Those include grep & sed which operate on regular expressions and can perform data transformation at lightening speed.  Then there is sort which can sort data and send the results to stdout or the file of your choosing.  Other tools include wc – word count, cut which can remove columns and so forth.

This final step involves moving the data into the database server, and it’s final target tables.  For instance in MySQL this might be done with the LOAD DATA INFILE syntax, while in Oracle you might use SQL*Loader, which is a very fast flat file dataloader.

Quora discussion by Sean Hull – What is ETL?

Point-in-time Recovery – What is it and why is it important?

Web-facing database servers receive a barrage of activity 24 hours a day.  Sessions are managed for users logging in, ratings are clicked and comments are added.  Even more complex are web-based ecommerce applications.  All of this activity is organized into small chunks called transactions.  They are discrete sets of changes.  If you’re editing a word processing document, it might autosave every five minutes.  If you’re doing something in excel it may provide a similar feature.  There is also an in-built mechanism for undo and redo of recent edits you have made.  These are all analogous to transactions in a database.

These are important because all of these transactions are written to logfiles.  They make replication possible, by replaying those changes on another database server downstream.

If you have lost your database server because of hardware failure or instance failure in EC2, you’ll be faced with the challenge of restoring your database server.  How is this accomplished?  Well the first step would be to restore from the last full backup you have, perhaps a full database dump that you perform everyday late at night.  Great, now you’ve restored to 2am.  How do I get the rest of my data?

That is where point-in-time recovery comes in.  Since those transactions were being written to your transaction logs, all the changes made to your database since the last full backup must be reapplied.  In MySQL this transaction log is called the binlog, and there is a mysqlbinlog utility that reads the transaction log files, and replays those statements.  You’ll tell it the start time – in this case 2am when the backup happened.  And you’ll tell it the end time, which is the point-in-time you want to recover to.  That time will likely be the time you lost your database server hardware.

Point-in-time recovery is crucial to high availability, so be sure to backup your binlogs right alongside your full database backups that you keep every night.  If you lose the server or disk that the database is hosted on, you’ll want an alternate copy of those binlogs available for recovery!

Quora discussion on Point-in-time Recovery by Sean Hull

Data warehousing – What is it and why is it important?

A data warehouse is a special type of database.  It is used to store large amounts of data, such as analytics, historical, or customer data, and then build large reports and data mining against it.  It is markedly different from a web-facing or high-transaction database, which typically has many many small transactions or pieces of data that are constantly changing, through many 100’s or 1000’s or small user sessions.  These typically execute in speeds on the order of 1/100th of a second, while in data warehouse you have fewer large queries which can take minutes to execute.

Data warehouses are tuned for updates happening in bulk via batch jobs, and for large queries which need big chunks of memory to sort and cross-tabulate data from different tables.  Often full table scans are required because of the specialized one-off nature of these reports.  The same queries are not executed over and over.

It’s important not to mix data warehousing databases with transactional databases in the same instance, whether you are dealing with MySQL or Oracle.  That’s because they are tuned totally differently.  It would be like trying to use the same engine for commuting to work, and a container ship traveling around the world.  Different jobs require different databases or databases that with their dials set for different uses.

Quora discussion of data warehousing – Sean Hull

Relational Database – What is it and why is it important?

A relational database is the warehouse of your data.  Your crown jewels.  It’s your excel spreadsheet or filing cabinet writ large.  You use them everyday and may not know it.  Your smartphone stores it’s contact database in a relational database, most likely sqlite – the ever present but ever invisible embedded database platform.  Your online bank at Citibank or Chase stores all your financial history, statements, contact info, personal data and so forth, all in a relational database.

  • organized around records
  • data points are columns in a table
  • relationships are enforced with constraints
  • indexing data brings hi-speed access
  • SQL is used to get data in and out of the db
  • triggers, views, stored procs & materialized views may also be supported

Like excel, relational databases are organized around records.  A record is like a 3×5 card with a number of different data points on it.  Say you have 3×5 cards for your addressbook.  Each card holds one address, phone number, email, picture, notes and so forth.  By organizing things nicely on cards, and for each card predictable fields such as first name, last name, birthday etc, you can then search on those data points.  Want all the people in your addressbook with birthday of July 5th, no problem.

While relational databases have great advantages, they require a lot of work to get all of your information into neatly organized files.  What’s more the method for getting things into and out of them – SQL is a quirky and not very friendly language.  What’s more relational databases have trouble clustering, and scaling horizontally.  NOSql database have made some headway in these departments, but at costs to consistency and reliability of data.

As servers continue to get larger, it becomes rarer that a single web-facing database really needs more than one single server.  If it’s tuned right, that is.  Going forward and looking to the future, the landscape will probably continue to be populated by a mix of traditional relational databases, new nosql type databases, key-value stores, and other new technologies yet to be dreamed up.

Sean Hull asks on Quora – What is an rdbms and why are they important?

Query Profiling – What is it and why is it important?

Queries are so-named because of the lovely language we call SQL – simplified query language.  That’s a bit of sarcasm on my part, I’ve never found it particularly simple or elegant.  Profiling them involves finding how they are spending their time, and what they are asking the server to do.   In this way you can make them faster, and improve performance of the whole server, and thus your website.

At any rate queries ask the database for information.  If they are on the simpler side, something like give me all the users whose name starts with “hu” for example, and last name is indexed, that will run very fast.  The database will lookup in the index the last name field, and find the subset of ones starting with those letters, then go lookup the records by id and return them to you.  Cache the index blocks, cache the data blocks.  Great!  However, say those are customers, and you want their calling cellphone calling history.  Ok, now you have to join on another table, matching by some key, hold all those records in memory, shuffle them around, and so on.

So queries are effectively little jobs or bits of work you ask your database server to perform on your behalf.  With websites you typically have hundreds of concurrently running sessions all sending their own little SQL jobs to the server to get processed, and records returned.  And blip in the radar slows everyone down, so you want them to all run quickly.

That’s where profiling comes in.  MySQL, Oracle, and SQL Server alike all have EXPLAIN type facilities for showing the plan with which the database will use to fetch your data.  It shows indexes, sorting, joins, order, and so forth.  All of this contributes to the overall execution time and resources used on the database server.

Quora discussion by Sean Hull – What is query profiling and why is it important?

Oracle to MySQL Migration Considerations

There are a lot of forms of transportation, from walking to bike riding, motorcycles and cars to busses, trains and airplanes.  Each mode of transport will get you from point a to point b, but one may be faster, or more comfortable and another more cost effective.  It’s important to keep in mind when comparing databases like Oracle and MySQL that there are indeed a lot of feature differences, a lot of cultural differences, and a lot of cost differences.  There are also a lot of impassioned people on both sides arguing at the tomfoolery of the other.  Hopefully we can dispel some of the myths and discuss the topic fairly.

** See also: Migrating MySQL to Oracle Guide **

As a long time Oracle DBA turned MySQL expert, I’ve spent time with clients running both database engines and many migrating from one to the other.  I can speak to many of the differences between the two environments.  I’ll cover the following:

  1. Query & Optimizer Limitations
  2. Security Differences
  3. Replication & HA Are Done Differently
  4. Installation & Administration Simplicity
  5. Watch Out – Triggers, Stored Procedures, Materialized Views & Snapshots
  6. Huge Community Support – Open-source Add-ons
  7. Enter The Cloud With MySQL
  8. Backup and Recovery
  9. Miscellaneous Considerations

Check back again as we edit and publish the various sections above.