I like the image of the city of Mesa Verde. It’s fascinating to see how ancient cities were built, especially as an inhabitant of one of the world’s largest cities today, New York.
I’m a long-time relational database guy. I worked at scores of dot-coms in the 90’s as an old-guard Oracle DBA, and pivoted to MySQL into the new century. Would a guy like me who’s seen 20 years of relational database dominance really believe they could be dying?
There’s a lot to be excited about in this new realm of db, and some interesting bigger trends that are pushing things in a new way.
Why SQL Database Is Still Alive?
Here are the reasons why SQL database isn’t dead yet:
1. Growing Use of ORMs
ORM probably sounds like some strange fossil archeologists just dug up in the ancient city of Mesa Verde. But they’re important. You may know them by their real-life names, Hibernate, Active Record, SQL Alchemy, and Cake. There are many others. Object Relational Modelers provide middleware between developers and the SQL of your chosen relational database. They abstract away the nitty gritty and encapsulate it into a library.
In a way, they’re like code generators. Mark Winand talks about them in SQL Performance Explained warning of the “eager fetching” problem. This is DBA speak for specifying all columns (SELECT *) or fetching all rows when you don’t need them all. It’s inefficient in terms of asking the database to read & cache all that data, but also to send it across the network and then discard it on the webserver side. Like a lazy housekeeper, the clutter & dust will grow to overwhelm you.
Martin Fowler is the author of the great book NoSQL Distilled. He tries to walk the fence in his post ORM Hate, trying to balance developers love of ORMs, and the obvious need for scalability. Ted Neward calls ORMs the Vietnam of Computer Science.
Mattias Geniar points out that BAD ORMs are infinitely worse than bad SQL and another on High Scalability by Drewsky The Case Against ORM Frameworks.
If you agree that the ORM conversation is still a huge mess, you’ll be excited to know that NoSQL sidesteps it completely. They’re built out of the box to interface more like data structures, than reading rows and columns. So, you eliminate the scalability problems they introduce when you go to NoSQL. That makes developers happy and pleases DBAs and techops too. Win!
Read: Why Oracle won’t kill MySQL
2. Widening the Field of Options
NoSQL databases are not simply key-value stores, though some like Memcache and Riak do fit that mold.
MongoDB offers configurable consistency & durability & the advantages of document storage, no need for an ORM here. You also have a mix of indexing options, that go a little deeper than other NoSQL solutions. A sort of middle-ground solution that offers the best of both worlds.
Cassandra, a powerful db that is clustered out of the box. All nodes are writeable, and there are various ways to handle conflict resolution to suit your needs. Cassandra can grow big, and naturally takes advantage of cloud nodes. It also has a nice feature to naturally age out data, based on settings you control. No more monumental archiving jobs.
HBase is the database part of Hadoop, based on Google’s seminal Bigtable paper.
Redis is another option with growing popularity. It’s a key-value store but allows more complex data in its buckets, such as hashes, lists, sets, and sorted sets. Developers should be salivating at this one.
3. Lowering the Bar
The old world of relational databases treats data as sacrosanct. DBAs are tasked with protecting their integrity & consistency. They manage backups to protect against disaster. In this world, every bit of data written is as sacred as any other, whether it’s your bank account balance or a comment added to a Facebook discussion.
However modern non-relational databases introduce the idea of eventually consistency. DBAs and architects would say we are relaxing our durability requirements. What they mean is data can get slightly out of sync and we’re ok with that. We’ll build our web applications to plan for that, or even in the case of Riak expose the levers of durability directly to the developers, allowing them to make some changes instant, while others are laxer and lazier.
Check this: Why high availability is so very hard to deliver
4. Cloud Demands
Virtualized environments like Amazon EC2, give easy access to legions of servers. Availability zones & regions only widen the deployment options. So, deploying a single writeable master, the way traditional relational databases work best, is not natural.
Databases like Cassandra, Mongo & Redis are clustered right out of the box. They grew up in this virtual data center environment and feel comfortable there.
Related: Why I wrote the book on Oracle & Open Source
5. Only DBAs Understand Them
Devs may whine at this statement, and to be fair it’s a generalization. The popularity of ORMs speaks volumes here. Anything to eliminate the dreaded SQL writing. Meanwhile, DBAs bemoan the use of ORMs for they represent everything they’re trying to fix.
SQL is hard enough, but the ugly truth is each database vendor has their own implementation, their own optimizations, their own optimal tweaks. Even between database versions, SQL code may not perform consistently.
Identifying slow SQL and tweaking it remains one of the primary tasks of performance tuning, for this reason. It hasn’t changed much in my two decades on the job. That’s all for today. Thanks for reading!