Category Archives: All

MySQL for Devs, DBAs and Debutantes

Join 6500 others and follow Sean Hull on twitter @hullsean.

I just received my copy of the 5th Edition of Paul DuBois’ MySQL tome. Weighing in at 1153 pages, it’s a solid text, with a very thorough introduction to the topic of administering MySQL databases.

Buy the book here: MySQL 5th Edition by Paul DuBois

A book for a broad audience

When I say debutantes, it’s a nod to beginners, for this book forges a very solid and complete introduction to the topic of MySQL. Start with installing the software & setting up your environment, and then move on to really understanding the SQL language, from commands to create objects, to ones for adding & modifying data, and then writing code around it.

See also: 5 more things deadly to Scalability

There’s a thorough discussion of datatypes, stored procedures, functions and views.

[quote]
Paul DuBois’ definitive reference makes an excellent complement to High Performance MySQL. They should sit alongside each other on your database bookshelf.
[/quote]

For developers there are chapters on writing applications in C, another for Perl and a third for PHP.

For DBAs there are chapters on security, backups, replication, understanding the data directory and general server administration. There is also good coverage of both 5.5 and the newly released 5.6 of MySQL.

What I like about this book

You can think of this book as a definitive reference to MySQL. It includes much of the online documentation that you would find at Oracle’s site, such as command & variable reference, and detailed explanation of how to use the client tools.

DuBois also goes beyond the online documentation, giving background on the concepts and a broader, more complete discussion.

Read this: Two Part DBA Interview Guide for Managers & Candidates alike

He also lays out the material in a very logical, stepwise way, so someone new to the MySQL world with time on their hands could read the 1153 pages straight through.

Why No Mention of Percona Toolkit?

I have to admit I was a bit surprised there was no mention of Percona Toolkit. Perhaps it was buried in some dark corner of the text I missed, but it does not appear in the index at all.

Percona Toolkit of course is a tool that every DBA should be familiar with. It is really an essential toolkit and fills the gaps that the prepackaged tools can’t help you with.

Want to checksum your tables to compare data on master & slave? pt-table-checksum does the trick.

Check this: AirBNB didn’t have to fail during the Amazon AWS Outage

Want to find out how far your slaves *really* are behind? pt-heartbeat is your friend.

Want to analyze your slow query log to produce a useful summary report? pt-query-digest to the rescue.

I also see no mention of innotop, which I would also say is an essential tool. These aren’t really advanced topics, so it’s unclear why they are missing. In the real world you need these tools to do your job.

Other Criticisms

My more general criticism is that the book lacks real-world advice from a seasoned DBA. At times the writing reads like the official line on how things work. But in day-to-day devops and operations, things can be quite different.

Also: Bulletproofing MySQL Replication with Checksums

For example, stored procedures. MySQL supports them, however using them brings real performance challenges, and they’re not always compatible with replication. Given all of that, why include a whole chapter discussing them at length without strong reservations? It could lead a novice user or developer to build them into an application, only to be surprised by the problems they bring.

Another example, looking through the system variables reference, I see the sync_binlog option. There is a short caution “…lower values provide greater safety in the event of a crash, but also affect performance more adversely”. Now reading this as a novice DBA I might think great, crash protection. But having tried this parameter in production, I found a huge impact on performance and had to disable it. What’s the advice here? It’s a bit confusing.

Conclusions

This is a really great book as an introduction to MySQL that also delves into intermediate topics. I would sit it on your bookshelf alongside High Performance MySQL. What this book lacks in advice you can find in the latter, and what High Performance MySQL lacks in introductory material this book covers in spades. They make a great complement to each other.

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Here’s a sample

How to Optimize MySQL UNION For High Speed

There are two ways to speed up UNIONs in a MySQL database. First, use UNION ALL if at all possible, and second, try to push down your conditions.

1. UNION ALL is much faster than UNION

How does a UNION work? Imagine you have two tables for shirts. The short_sleeve table looks like this:

[code]
blue
green
gray
black
[/code]

And another, long_sleeve, that looks like this:

[code]
red
green
yellow
blue
[/code]

Related: Why Generalists are Better at Scaling the Web

If you UNION those two tables, first MySQL will sort the combined set into a temp table like this:

[code]
black
blue
blue
gray
green
green
red
yellow
[/code]

Once it’s done this sort, it can easily remove the duplicate blue & duplicate green for this resulting set:

[code]
black
blue
gray
green
red
yellow
[/code]

See also: Mythical MySQL DBA – the talent drought.

Why does it do this? UNION is defined that way in SQL. Duplicates must be removed and this is an efficient way for the MySQL engine to remove them. Combine results, sort, remove duplicates and return the set.

[quote]
Queries with UNION can be accelerated in two ways. Switch to UNION ALL or try to push ORDER BY, LIMIT and WHERE conditions inside each subquery. You’ll be glad you did!
[/quote]

What if we did UNION ALL? The result would look like this:

[code]
blue
green
gray
black
red
green
yellow
blue
[/code]

Read this: MySQL DBA Interview & Hiring Guide.

It doesn’t have to sort, and doesn’t have to remove duplicates. If you imagine combining two 10 million row tables, and don’t have to sort, this speedup can be HUGE.
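To see the difference in row counts for yourself, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the column name `color` is my own assumption since the tables above are shown without one. It only demonstrates the semantics (deduplicated versus raw rows), not MySQL's timing.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE short_sleeve (color TEXT);
    CREATE TABLE long_sleeve (color TEXT);
    INSERT INTO short_sleeve VALUES ('blue'), ('green'), ('gray'), ('black');
    INSERT INTO long_sleeve VALUES ('red'), ('green'), ('yellow'), ('blue');
""")

# UNION must remove duplicates, so blue and green appear once each.
union_rows = conn.execute(
    "SELECT color FROM short_sleeve UNION SELECT color FROM long_sleeve"
).fetchall()

# UNION ALL simply appends one result to the other, duplicates included.
union_all_rows = conn.execute(
    "SELECT color FROM short_sleeve UNION ALL SELECT color FROM long_sleeve"
).fetchall()

print("UNION rows:", len(union_rows))
print("UNION ALL rows:", len(union_all_rows))
```

If your application can tolerate the duplicates, or you know the two sets cannot overlap, UNION ALL returns the same data without the deduplication step.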

2. Use Push-down Conditions to speed up UNION in MySQL

Imagine with our example above the shirts have a design date, the year they were released. Yes we’re keeping this example very simple to illustrate the concept.

Here is the short_sleeve table:
[code]
blue 2013
green 2013
green 2012
gray 2011
black 2009
black 2011
[/code]

And long_sleeve table looks like this:

[code]
red 2012
red 2013
green 2011
yellow 2010
blue 2011
[/code]

For 2013 designs, we could combine them like this:

[code]
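-- `release` is backtick-quoted because RELEASE is a reserved word in MySQL
SELECT type, `release` FROM (
  (SELECT type, `release` FROM short_sleeve)
  UNION
  (SELECT type, `release` FROM long_sleeve)
) AS shirts
WHERE `release` >= 2013;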
[/code]

See also: 5 More Things Deadly to Scalability and the original 5 Things Toxic to Scalability.

Here the WHERE clause works on this 11 record temp table:

[code]
black 2009
black 2011
blue 2011
blue 2013
gray 2011
green 2013
green 2012
green 2011
red 2012
red 2013
yellow 2010
[/code]

But it would be much faster to move the WHERE inside each subquery like this:

[code]
(SELECT type, `release` FROM short_sleeve WHERE `release` >= 2013)
UNION
(SELECT type, `release` FROM long_sleeve WHERE `release` >= 2013);
[/code]

That would be operating on a combined 3 record table. Faster to sort & remove duplicates. Smaller result sets cache better too, providing a pay forward dividend. That’s what performance optimization is all about!
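Here is a small sketch that checks both forms return the same rows. Again it uses Python's sqlite3 module as a stand-in for MySQL, and the column names `color` and `design_year` are hypothetical names I chose for the demo. It confirms the two queries are equivalent, it does not measure speed.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE short_sleeve (color TEXT, design_year INTEGER);
    CREATE TABLE long_sleeve (color TEXT, design_year INTEGER);
    INSERT INTO short_sleeve VALUES
        ('blue', 2013), ('green', 2013), ('green', 2012),
        ('gray', 2011), ('black', 2009), ('black', 2011);
    INSERT INTO long_sleeve VALUES
        ('red', 2012), ('red', 2013), ('green', 2011),
        ('yellow', 2010), ('blue', 2011);
""")

# Filter applied after the UNION: all 11 rows are combined first.
outer_filter = """
    SELECT color, design_year FROM (
        SELECT color, design_year FROM short_sleeve
        UNION
        SELECT color, design_year FROM long_sleeve
    ) WHERE design_year >= 2013
"""

# Filter pushed down: each side is trimmed before the rows are combined.
pushed_down = """
    SELECT color, design_year FROM short_sleeve WHERE design_year >= 2013
    UNION
    SELECT color, design_year FROM long_sleeve WHERE design_year >= 2013
"""

outer_rows = sorted(conn.execute(outer_filter).fetchall())
pushed_rows = sorted(conn.execute(pushed_down).fetchall())
print(outer_rows == pushed_rows, pushed_rows)
```

On your own data, compare the two forms with EXPLAIN and real timings before committing to the rewrite.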

Read this: RDS or MySQL – 10 Use Cases.

Remember multi-million row sets in each part of this query will quickly illustrate the optimization. We’re using very small results to make visualizing easier.

You can also use this optimization for ORDER BY and for LIMIT conditions. By reducing the number of records returned by EACH PART of the UNION, you reduce the work that happens at the stage where they are all combined.

If you’re seeing some UNION queries in your slow query log, I suggest you try this optimization out and see if you can tweak

The Most Important AWS Feature for Performance and Scalability

The Foundation of Speed

All servers use disk to store files. Operating system libraries, webserver & application code, and most importantly databases all use disk constantly.

So disk speed is crucial to server speed.

[quote]
Disk speed is crucial for MySQL databases. It has been a real challenge in multi-tenant environments like Amazon’s EBS. The provisioned IOPS feature addresses this head on, allowing customers to lock in great MySQL database performance!
[/quote]

Also check out: Five more things Deadly to Scalability.

Disk Performance on Multi-tenant EBS

Amazon’s EBS, or Elastic Block Store, is a virtualized network storage solution. You can think of it as RAIDed disks, but accessed & provisioned over a high speed network.

Related: Why Generalists are Better at Scaling the Web

Since Amazon is a multi-tenant environment, other customers are using that same network, and hitting those same disks. So if your neighbors are seeing a lot of traffic to disk, your web application can slow down. Not good!

What is Provisioned IOPS?

We’ll agree that it’s one of the worst branded features ever, but you should know about it and use it, especially for your MySQL databases.

Provisioned means that you’ll lock in performance in advance, and IOPS stands for I/O operations per second. Think of it as google juice for your cloud database servers!

Also: How I increased my blog pagerank to 5

Five More Things Deadly to Scalability

The.Rohit - Flickr

1. Slow Disk I/O – RAID 5 – Multi-tenant EBS

Disk is the grounding of all your servers, and the base of their performance. True, with larger and larger main memory much is available in cache, but a server still needs to constantly read from disk and flush things from memory. So it’s a very, very important component of performance and scalability.

What’s wrong with RAID 5?

RAID 5 was designed to give you more space using fewer disks. It’s often used in a server with few slots, or because ops misunderstand how badly it will impact performance. On a database server it can be particularly bad.

All writes see a performance hit. What’s worse, if you lose a disk, the RAID, though technically still online, will perform SO SLOWLY as to be effectively offline. And a rebuild takes many hours. Worse still is the risk of losing another drive during that rebuild. What if you have to order a drive and it takes a couple of days?

RAID 10 is the solution. Mirror each set of two drives, then stripe over those. Even with only four slots available, it’s worth it. Good read performance, good write performance, and protection.
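To put rough numbers on that tradeoff, here is a back-of-the-envelope sketch. The function name and the 1 TB disk size are my own assumptions for illustration. The write costs are the commonly cited textbook figures for small random writes, not measurements from any particular controller.

```python
def raid_summary(level, disks, disk_tb):
    """Return (usable_tb, write_ios) for a RAID level.

    write_ios is the commonly cited number of disk I/Os behind one small
    random write: RAID 5 reads old data and old parity, then writes both (4).
    RAID 10 writes the block to both halves of a mirror (2).
    """
    if level == 5:
        # One disk's worth of space goes to parity.
        return (disks - 1) * disk_tb, 4
    if level == 10:
        if disks % 2:
            raise ValueError("RAID 10 needs an even number of disks")
        # Half the disks hold mirror copies.
        return (disks // 2) * disk_tb, 2
    raise ValueError("only RAID 5 and RAID 10 are modeled here")


for level in (5, 10):
    usable, write_ios = raid_summary(level, disks=4, disk_tb=1)
    print(f"RAID {level}: {usable} TB usable, {write_ios} disk I/Os per small write")
```

With four slots you give up one disk of capacity by choosing RAID 10, and in exchange each small write costs about half the disk I/O.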

What the heck is multi-tenant?

In the cloud, you share servers, network & disk just like you do apartments in a building. Hence the name. Amazon’s EBS, or Elastic Block Store, extends this metaphor, offering you the welcome flexibility of a storage network. But your bottleneck can be fighting with other tenants on that same network.

Default servers do have this problem, however AWS has addressed this serious problem with a little known but VERY VERY useful feature called Provisioned IOPS. It’s a technical name, but means you can lock in reliable disk I/O. Just what the scalability doctor ordered.

Check out our original post: 5 Things Toxic to Scalability

2. Using the database for Queuing

MySQL is good at a lot of things, but it’s not ideal for managing application queues. Do you have a table like JOBS in your database, with a status column including values like “queued”, “working”, and “completed”? If so you’re probably using the database to queue work in your application.

It’s not a great use of MySQL because of locking problems that come up, as well as the search and scan to find the next task.
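For illustration, here is roughly what that pattern looks like. This is a sketch using Python's sqlite3 module as a stand-in for MySQL, and the `jobs` columns and the `claim_next_job` helper are hypothetical. Every worker has to run this scan-then-update step each time it wants work, and that is where the contention comes from once many workers poll the same table.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (
        id INTEGER PRIMARY KEY,
        payload TEXT,
        status TEXT DEFAULT 'queued'
    );
    INSERT INTO jobs (payload)
    VALUES ('resize image'), ('send email'), ('build report');
""")


def claim_next_job(conn):
    """Scan for the oldest queued job and mark it as working."""
    with conn:  # commit the update, or roll back if something fails
        row = conn.execute(
            "SELECT id, payload FROM jobs "
            "WHERE status = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # nothing left to do
        conn.execute("UPDATE jobs SET status = 'working' WHERE id = ?", (row[0],))
        return row


first = claim_next_job(conn)
second = claim_next_job(conn)
print(first, second)
```

A dedicated queue hands each message to one consumer for you, so the application does not need the scan or the status bookkeeping.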

Luckily there are great solutions for developers. RabbitMQ is a great queuing solution, as is Amazon’s SQS. What’s more, as external services they’re easier to scale.

[quote]
Scalability becomes key to your business as your customer base grows. But it doesn’t have to be impossible. Disk I/O, caching, queuing and searching are all key areas where you can make a big dent, in a manageable way. Juggle your technical debt too, and you’re golden!
[/quote]

Also take a look at: Why Generalists are Better at Scaling the Web

3. Using Database for full-text searching

Oracle has full text search support, why shouldn’t we assume the same in MySQL? Well MySQL *does* have this, but in many versions only with the old MyISAM storage engine. It has its own set of corruption problems, and isn’t really very performant.

Better to use a proven search solution like Apache Solr. It is built specifically for search, includes excellent library support for developers of most modern web languages and best of all is easy to SCALE! Just add more servers on your network, or distributed globally.

For folks interested in the bleeding edge, full-text search is coming to InnoDB, the crash-safe & transactional storage engine, in 5.6. That said, you’re still probably better off going with an external solution like Solr, or Sphinx with the MySQL Sphinx SE plugin.

How to find A Mythical MySQL DBA

4. Insufficient Caching at all layers

Cache, cache, and cache some more. Your webservers should use a solid memcache or other object cache between them & the database. All those little result sets will sit in resident memory, waiting for future web pages that need them.
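The usual way to wire this up is to check the cache first and only hit the database on a miss. Here is a minimal sketch of that idea, with a plain dict standing in for memcache and a made-up `query_database` function standing in for the real query. All of the names are hypothetical.

```python
cache = {}                # stand-in for memcache (an assumption for the demo)
stats = {"db_calls": 0}   # counts how often we actually reach the database


def query_database(user_id):
    """Hypothetical expensive database lookup."""
    stats["db_calls"] += 1
    return {"id": user_id, "name": f"user{user_id}"}


def get_user(user_id):
    """Serve from the cache when possible, fall back to the database."""
    key = f"user:{user_id}"
    if key in cache:
        return cache[key]           # cache hit: no database work at all
    row = query_database(user_id)   # cache miss: do the expensive lookup once
    cache[key] = row                # keep it for the next page that needs it
    return row


first = get_user(42)
second = get_user(42)
print(stats["db_calls"])
```

A real deployment also needs expiry times and a way to invalidate entries when the underlying row changes, which this sketch leaves out.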

Next use a page cache such as Varnish. This sits in front of the webserver; think of it as a mini-webserver that handles very simple pages, but in a very high speed way. Like a pack of motorbikes riding down an otherwise packed freeway, simple pages zip through, freeing up your webserver to do more complex work.

Browser caching is also important. But you can’t get at your customers’ browsers, or can you? Well, not directly, but you can instruct them what to cache. Do that with proper Expires headers. Have your system administrator configure Apache to support this.

Also: Tweaking Disqus to Find Experts & Drive Traffic

5. Too much technical debt

Technical debt can bite. What is it? As you’re developing a new idea, you’ll build prototypes. As those get deployed to customers, change gets harder, and things you glossed over in the past become problems. One team leaves, another inherits the application, and the problems multiply. Over time you build up technical debt as your team spends more time supporting old code and fixing bugs, and less time building new features. At some point a rewrite of problem code becomes necessary.

Also: How I increased my blog pagerank to 5

The Needle in Big Data Noise

Also take a look at: I hacked Disqus Digests to discover new blogs

Who the heck is Bayes

Thomas Bayes was a scientist & thinker, Fellow of the Royal Society, and back in 1763 author of “An Essay towards solving a Problem in the Doctrine of Chances”. His method advocated learning by approximation, to get closer and closer to the truth by gathering more information, and factoring that into probabilities & predictions.

[quote]
What isn’t acceptable under Bayes’s theorem is to pretend that you don’t have any prior beliefs. You should work to reduce your biases, but to say you have none is a sign that you have many.
[/quote]
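To make "closer and closer to the truth" concrete, here is a small sketch of a Bayesian update applied three times in a row. The starting belief of 5% and the two likelihoods are numbers I made up for illustration, and the function name is hypothetical.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return the posterior probability after seeing one piece of evidence."""
    numerator = p_evidence_if_true * prior
    # Total probability of seeing the evidence, whether the claim is true or not.
    evidence = numerator + p_evidence_if_false * (1 - prior)
    return numerator / evidence


belief = 0.05   # assumed prior: we start out skeptical
beliefs = []
for _ in range(3):
    # Each observation is 8x more likely if the claim is true (0.8 vs 0.1).
    belief = bayes_update(belief, 0.8, 0.1)
    beliefs.append(belief)

print([round(b, 3) for b in beliefs])
```

Notice that the prior has to be stated up front. That is the point of the quote above: you cannot run the update without admitting what you believed going in.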

Why should you care?

From hurricane & earthquake warnings, to financial storms or terrorism, prediction is more important than ever. Epidemiologists can make use of Bayesian techniques to protect populations, gamblers can use it in sport, and investors for markets.

See also Amp up your blog traffic by improving your pagerank

Why Nate Silver is different

Nate is famous for predicting the 2012 presidential election with uncanny accuracy. So the book is an in depth look at how he thinks, and how he works with data. He talks of Hedgehogs – those who believe in big ideas and work from large principles, versus Foxes who see the world as messy, often inconsistent and unpredictable, but who nevertheless tend to present better though less definitive predictions. The philosophy is less of modeling, and more of testing, and adjusting along the way to get closer to the truth.

See also Sales sucks, but a bear market offers hard lessons

For engineers & startups

Nate interviewed John Sanders, a scout for the LA Dodgers. He identified five abilities and characteristics that predict success in baseball. Looking at them together, I think they can well predict success in Startup land too.

1. Preparedness & work ethic
2. Concentration & focus
3. Competitiveness & self-confidence
4. Stress management & humility
5. Adaptiveness & learning ability

The book is a bit technical and sometimes long-winded. But it is chock-full of real insight and wisdom that we can all put to use in our careers and businesses.

Also: AirBNB didn’t have to fail – AWS outages be damned!

My DIY Disqus hack for blog discovery

I discovered disqus about a year ago while enjoying one of my favorite blogs, Fred Wilson’s AVC.

Believe it or not, for a while I had it installed on my WordPress blog and thought it was pronounced DISK-OUS.

What disqus does beautifully

Disqus does a lot of things well. The first thing you realize is that they remove a huge hurdle for users across the web: managing multiple logins on blogs here and there, when you just want to comment. This is your first of many wins.

Bloggers can count on an increase in discussion, commenting & overall engagement. What’s more it reduces spam. Great!


Bloggers want traffic, that’s one reason they spend their valuable time sharing their knowledge. Jumping into Disqus is one great way to get it. More robust discovery can push this much further. Driving traffic for all of us will drive adoption of disqus across the web.

Disqus provides a one-stop dashboard for all of this, and it’s wonderful for bloggers.

What I wanted more of…

I found myself using disqus, but wondering…

o bloggers – who are the big shots?
o how do I find opinion minded people?
o how do I find intelligent discourse?
o can I encourage more discussions on my blog?
o I want web audiences discovering Sean Hull’s Scalable Startups
o How do I search – for this article, a comment that I posted?

I found myself keeping a list of disqus blogs. I would follow these blogs around the web, and thought – Why am I doing this? Why isn’t this part of the software? What am I intuitively searching for?

Why is database administration talent in short supply? They are the Mythical MySQL DBAs

A call for @disqushelp on twitter

I posted a request for info on twitter. @disqushelp was quick to point me to their Disqus Gravity Project. As I commented on the designing disqus gravity blog post it is a wonderful tool and proof of concept. It sure illustrates where disqus is taking things and the important visualization possible. But unfortunately it wasn’t helping me. :-(

Also take a look at: Why Generalists are Better at Scaling the Web

How I hacked disqus digest emails

I was receiving the disqus digest emails. I think when you signup you automatically get those. I was mostly just deleting them, as they didn’t have much of interest in them. Then I started clicking through, and realized – hey wait, Disqus is kind of doing what I want already. They just need a little help.

I decided to go to some of my favorite blogs. I visited AVC, RWW, Wired, HBR, businessweek, computerworld, chrisbrogan.com and scrolled down to disqus comments. I then clicked “community” tab. Along the right side you’ll see the most active commenters. I then clicked through to their disqus profiles, and “followed” them just like you might do on Twitter.

Also: How I increased my blog pagerank to 5

After doing this for the top 5 commenters on ten to fifteen blogs, my disqus digest emails started bringing me new blogs! This is super cool. I’ve discovered some venture, some technical and some iPhone blogs I never knew about.

What was missing – discovery

Discovery is tech vernacular for what I was doing: scouring the web for subject matter experts. Picking the ones that used disqus allowed me to share my thoughts and weigh in across the spectrum of topics I knew well.

Disqus digests came up short for some people. But after I started using the follow feature, suddenly blogs and authors were popping up on my radar. Exactly what I wanted.

Keep up the good work guys. Would love to see the iPhone app if in fact it’s under development!

NYC Tech Firms Are Hiring – Map

If you haven’t noticed how much the NYC tech scene has grown recently, I’m afraid you’ve been hiding under a rock. It’s simply incredible.

Take a look at Mapped In NY, a Google Maps mashup of the growing list popularized by the NY Tech Meetup called Made In New York.

Having been around during the first dot-com boom back in the late 1990s, I find this even more exciting to see. Despite the recession, New York’s economy is truly thriving!

[quote]
New York’s Startup scene is truly thriving with a whopping 1263 firms, many of which are hiring.
[/quote]

Why is database administration talent in short supply? They are the Mythical MySQL DBAs

Also take a look at: Why Generalists are Better at Scaling the Web

A Pagerank of 5 Is Possible – Here's How

A highly trafficked website is a valuable asset indeed. For a services business it helps you build reputation and reach prospects.

Here’s how to get there.

1. Longevity

We’ve been around for a while, as you can see from a quick whois search below. I’ve owned the web property (aka domain name) iheavy.com and used it for the same purpose, since July 1999! Google notices this and ranks accordingly.

Until 2011 I wasn’t blogging much. I had a pagerank of 3 though. That’s attributable to two factors:

o 12 years owning the domain at that time

o Writing a book for O’Reilly which got a strong backlink

[code]
$ whois iheavy.com

Registered through: GoDaddy.com, LLC (http://www.godaddy.com)
Domain Name: IHEAVY.COM
Created on: 14-Jul-99
Expires on: 14-Jul-15
Last Updated on: 18-Feb-13

Registrant:
iHeavy, Inc.
Box 5352
New York, New York 10185
United States

Administrative Contact:
Hull, Sean hullsean@gmail.com
iHeavy, Inc.
Box 5352
New York, New York 10185
United States
+1.2125336828
[/code]

2. Authored a Technical Book (pagerank 3)

I authored a book for O’Reilly in 2001 called Oracle and Open Source. This bumped up our ranking from a flat 1 because we got backlinks from O’Reilly’s author blog, a strong authoritative signal to Google.

Why is database administration talent in short supply? They are the Mythical MySQL DBA

Here’s Why I Wrote the Book on Oracle & Open Source.

[quote]
Consistent ownership and use of a domain name, along with backlinks from other authorities in your area of expertise weigh strongly in your favor.
[/quote]

3. Started blogging weekly

In Spring of 2011 I started blogging regularly. This was an effort to build out my services business, solidify my voice, and bring prospects and customers to my site.

4. Installed Google Analytics & Feedburner

It might seem crazy but to that point I didn’t track much. Without metrics you don’t know which pages users are visiting, how long they’re staying, or where they’re converting.

Also take a look at: Why Generalists are Better at Scaling the Web

A conversion – for those out of the analytics loop – is when a user does something you want them to do. For an e-commerce site, they buy or start the process of buying. For a services website it could be visiting your about page, downloading a pdf or e-book, or signing up for a newsletter.

5. WordPress SEO plugin

WordPress is a great publishing platform. Among the many plugins to choose from, Yoast SEO is a very important one to include. It exposes all the hidden SEO fields and functions in a powerful way. Edit your short description, keywords & categories, and a lot more.

Check out: A CTO Must Never Do This

It also helps you frame and think about how your content is seen both by search engines, and searchers alike.

6. Keyword research

A little keyword research goes a long way. You might be a subject matter expert in a given field, but if you don’t know how your customers search, you can’t help them find you. Remember they don’t know what you do, so they likely don’t know the jargon or vernacular your field uses.

SEO Moz has some great tools to help you, as does Wordtracker, and Google has a keyword research tool for AdWords.

[quote]
Strong titles should make you click to open the post. A dash of keyword research and regularly watching your analytics should be revealing. Give your readers what they want!
[/quote]

See also: My Blog Traffic is Growing Using these 5 Killer Tactics

7. Watch your analytics (pagerank 4)

After about six months of regular blogging, and a few viral hits, our pagerank went up to 4. What was I doing? All of the above, plus watching analytics closely. I asked myself questions about visitors:

o Which pages do they like and why?
o What causes them to stick around?
o What causes bounce rate to go down?
o What causes them to convert?

I found that adding links to relevant content right in the text helped reduce bounce rate right away. This was a real discovery that I could apply everyday.

Hiring a Cloud Engineer? Get our 8 Questions to Ask an AWS Expert for Recruiters, Managers & candidates alike

I also noticed that good content helped, but directly imploring readers to signup to the newsletter got regular conversions daily. Huh, that was a surprise since all along I had the signup form along the right column. Go figure.

8. Guest posting

Guest posting is great. It allows you to work with real publications who have paid editors. These folks will provide you with a more professional view, and that is great for your own writing and for understanding your audience. The hardest thing to learn is how to write for a broad audience.

You’ll also of course get a backlink, which is a major authority signal to the search engines. You might get paid a bit too, but your mileage may vary.

I managed to do some regular writing for INFOWORLD and Database Journal. I wrote one piece for ChangeThis.com called Get Out of the Technology Hex.

From there I signed a syndication deal with Developer Zone. Since I have embedded links to content, that brings me regular traffic, even besides my profile, and the authoritative backlink.

Lately I’m working on some stuff for Gigaom and ACM’s Queue. Steady as she goes!

9. Get on the aggregators

Most likely your industry has some sort of aggregator site which will carry your RSS syndication feed. Get on those. That will drive regular traffic and RSS feedburner subscribers. We’re on Planet MySQL and it’s been great!

10. Patience, rinse and repeat

Easier said than done, I know. If you want this to happen overnight, you had better get onto the real world celebrity track. Otherwise work on your content, work on your voice, write clicky titles and keep your audience interested with solid content. And watch your traffic grow!

Make MySQL clustering work for you

We’ve told you all about MySQL multi-master replication’s limitations. If you write to two masters, it is bound to fail for myriad reasons.

Now what? Do what the pros do, that’s what.

A. Don’t write to both masters

Using multi-master replication works great as long as you do so in active-passive mode. Never write to two masters at the same time. When you promote a new side to being a master, do it carefully:

o Put application into read-only mode temporarily (disable editing)
o Set current master to read-only mode
o Change all webservers to point to new master
o Change new master to read-write mode
o Turn application edit back on
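The ordering of those steps matters, so it helps to script them rather than run them by hand. Here is a rough sketch of the sequence. The three callables (`run_sql`, `set_app_read_only`, `repoint_webservers`) and the host names are hypothetical hooks you would implement for your own environment. The only MySQL-specific part is the `SET GLOBAL read_only` statement.

```python
def promote_new_master(run_sql, set_app_read_only, repoint_webservers,
                       old_master, new_master):
    """Run the promotion steps in order. All hooks are caller-supplied."""
    set_app_read_only(True)                            # 1. disable editing in the app
    run_sql(old_master, "SET GLOBAL read_only = ON")   # 2. current master goes read-only
    repoint_webservers(new_master)                     # 3. webservers point at new master
    run_sql(new_master, "SET GLOBAL read_only = OFF")  # 4. new master accepts writes
    set_app_read_only(False)                           # 5. turn editing back on


# Dry run: record the steps instead of touching any real servers.
log = []
promote_new_master(
    run_sql=lambda host, sql: log.append((host, sql)),
    set_app_read_only=lambda flag: log.append(("app_read_only", flag)),
    repoint_webservers=lambda host: log.append(("webservers", host)),
    old_master="db1",
    new_master="db2",
)
for step in log:
    print(step)
```

A production version would also wait for the new master to catch up on replication before step 3, and stop on the first failed step rather than pressing on.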

B. Use Statement based replication

Statement based replication has been around forever, it’s proven and its limitations are well known. It’s been tested for many, many edge cases. You know what you’re getting.

o supports online schema changes

Perform alter tables, add or drop columns or modify indexes on the inactive master. Once those changes are complete, promote the inactive side to being primary master and perform the changes again on that master. All with zero downtime to your application. Statement based replication makes this easy as differences to columns, column order and so forth won’t break things.

o facilitates point-in-time recovery

With the SQL of all your queries being written directly to binlogs, the forensic process of reconstructing things during point-in-time recovery becomes much easier.

o perform regular checksums against current master

Use the pt-table-checksum tool to verify data. Integrity checking will help you avoid any data drift and keep everything tightly in sync.

C. Degrade gracefully – build for a read-only mode

- facilitates failover
- facilitates maintenance
- insurance plan
- disaster recovery
- levers & dials for the operations team

D. Put Memcache between application and database

- reduces load on database
- reduces latency for remote write master
- key value stores are easier to scale
- continue to get fast application response

E. Misc recommendations

o use provisioned IOPS for the database servers
o use percona server 5.6
o use multi-threaded slaves
o use semi-synchronous replication
o use percona toolkit checksum tool to provide data integrity checks
o use percona toolkit heartbeat to check slave lag
o use percona xtrabackup to do hotbackups
o perform firedrills to restore backups
o perform firedrills to do point-in-time recovery

Sales sucks, but then I learned

Are you a developer or startup entrepreneur? Have you ever been frustrated with some of the claims made by the sales team or lacked the patience or ability to communicate across departments?

Just out of college

Just out of college I got a job as a Macintosh Software Developer for a small firm outside of University at Buffalo. It was a ten person company, and half of us were on the technology side of the house. I was doing C++ & Graphical Interface design & coding.

Why is it so hard to find operations & devops talent? Enter the Mythical MySQL DBA!.

Sales is “ahead” of engineering

Besides coding, I also fielded support calls from customers which brought me perspective on both what they wanted, and where they struggled with the software. Our app helped consumers and nutritionists track diet & exercise.

[quote]
The sales team made promises of technology the company wasn’t capable of delivering. Meanwhile the engineering team was sent scrambling to answer to those promises.
[/quote]

Soon I was fielding questions from customers asking when the new heart rate monitoring would be available. I followed up by talking with the team lead & chief architect. He had no plans of building such a feature, nor did we even know how it would be possible!

Searching for a database expert? MySQL DBA Interview Guide.

We checked in at our weekly meetings, and the CEO explained that the sales team was simply “ahead” of engineering. Years ahead apparently even of the technology that was possible at the time!

Fast forward 5 years to professional services

A half decade later I’m doing independent consulting for dot-coms. Much of my business came from word of mouth. Help a firm out of a pinch, speed up their site so they can handle 10x the customers on the same servers, and suddenly everyone is your friend!

Too many customers is a good problem to have, right? For hyper growth companies there are 5 Things Toxic To Scalability.

But all is not smooth sailing in the freelance consulting world. The dot-com crash comes along and budgets are squeezed tighter. Business spend is reduced and every dollar is scrutinized. I learned to speak to prospects about savings and personalized service, advantages of lower overhead, and real return I could provide. At the end of the day if they’re not buying, your services aren’t worth their cost!

[quote]
The sales process should inform the business about what customers really want. In a successful startup there is communication back and forth with engineering and business units so all are working in harmony.
[/quote]

Full Circle

Now coming full circle I have a wide perspective on business. I understand the engineering fundamentals, and the limitations of technology. I also have a grasp of product, and how business units must manage the bottom line, and deliver to customers or else perish in the marketplace.

Looking for a top flight cloud engineer? Grab our Amazon EC2 Interview Guide.

For the two to achieve a happy marriage, you must bring a balance of execution & technical debt, with satisfying a real customer need in the marketplace. And therein lies the innovation & startup sweet spot!

You might also like our piece Why generalists are better at scaling the web
