Tag Archives: hiring

Locking down cloud systems from disgruntled engineers

I worked at a customer last year, on a short term assignment. A brilliant engineer had built their infrastructure, automated deployments, and managed all the systems. Sadly, despite all the sleepless nights and dedication, they hadn’t managed to build up a good rapport with management.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

I’ve seen this happen so many times, and I do find it a bit sad. Here’s an engineer who’s working his butt off and really wants the company to succeed. He really cares about the systems. But he doesn’t connect well with people, and is often dismissive, disrespectful or talks down to people like they’re stupid. All of that burns bridges, and there are a lot of bad feelings between all parties.

So how do you manage the exit process? Here’s a battery of recommendations for changing credentials & logins so that systems can’t be accessed anymore.

1. Lock out API access

You can do this by removing the administrator permissions, and any other policies or group memberships, from their IAM user. That way you keep the account around *just in case*. This will also prevent them from doing anything on the console, but you can still see if they attempt any logins.
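Here is a minimal sketch of that step in Python. It assumes a client that behaves like boto3's IAM client, and the helper name `strip_iam_permissions` and the user name `jdoe` are made up for illustration. It ignores pagination and inline policies, so treat it as a starting point rather than a complete lockout.

```python
def strip_iam_permissions(iam, user_name):
    """Detach managed policies and group memberships from one IAM user.

    `iam` is expected to behave like boto3.client("iam"). The user itself
    is left in place so the account stays around for auditing.
    """
    removed = []

    # Managed policies attached directly to the user (e.g. AdministratorAccess).
    attached = iam.list_attached_user_policies(UserName=user_name)
    for policy in attached["AttachedPolicies"]:
        iam.detach_user_policy(UserName=user_name, PolicyArn=policy["PolicyArn"])
        removed.append(("policy", policy["PolicyArn"]))

    # Groups can grant permissions too, so drop those memberships as well.
    groups = iam.list_groups_for_user(UserName=user_name)
    for group in groups["Groups"]:
        iam.remove_user_from_group(GroupName=group["GroupName"], UserName=user_name)
        removed.append(("group", group["GroupName"]))

    return removed
```

With boto3 installed and credentials configured, you would call it as `strip_iam_permissions(boto3.client("iam"), "jdoe")` and keep the returned list as a record of what was removed. Inline policies and access keys on the same user deserve the same review.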

Also: Is AWS too complex for small dev teams?

2. Lock out of servers

They may have the private keys for various servers in your environment. So to lock them out, scan through all the security groups, and make sure their whitelisted IPs are gone.

Are you using a bastion box for access? That’s ideal because then you only have one access point. Eliminate their login and audit access there. Then you’ve covered your bases.
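For the security group scan, something like the following could help. It is only a sketch: it assumes you already have the `SecurityGroups` list in the shape boto3's EC2 `describe_security_groups()` call returns, the function name `rules_allowing` is made up, and it only looks at IPv4 ingress ranges.

```python
import ipaddress

def rules_allowing(security_groups, address):
    """Return (group_id, cidr) pairs whose ingress rules admit `address`.

    `security_groups` is the "SecurityGroups" list from the EC2
    describe_security_groups() response; only IPv4 ranges are checked.
    """
    target = ipaddress.ip_address(address)
    hits = []
    for group in security_groups:
        for permission in group.get("IpPermissions", []):
            for ip_range in permission.get("IpRanges", []):
                cidr = ip_range["CidrIp"]
                if target in ipaddress.ip_network(cidr, strict=False):
                    hits.append((group["GroupId"], cidr))
    return hits
```

Feed it the departing engineer's home or office IP. Note that a wide-open rule such as `0.0.0.0/0` will also show up as a hit, which is probably worth knowing about anyway.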

Related: Does Amazon eat its own dogfood?

3. Update deployment keys

At one of my customers the outgoing op had set up many moving parts & automated & orchestrated all the deployment processes beautifully. However he also used his personal GitHub key inside Jenkins. So when it went to deploy, it used those credentials to get the code from GitHub. Oops.

We ended up creating a company GitHub account, then updating Jenkins with those credentials. There were of course other places in the Capistrano bits that also needed to be reviewed.

Read: Is aws a patient that needs constant medication?

4. Dashboard logins

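Monitoring with NewRelic or Nagios? Perhaps you have a centralized dashboard for your internal apps? Or you’re using Slack? Each of these keeps its own list of users, so disable or remove the outgoing engineer’s login in every one.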

Also: Is Amazon too big to fail?

5. Non-key based logins

Have some servers outside of AWS in a traditional datacenter? Or even servers in AWS that are using usernames & passwords? Be sure to audit the full list of systems, and change passwords or disable accounts for the outgoing sysop.

Also: When hosting data on Amazon turns bloodsport?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

NYC Tech Firms Are Hiring – Map

Made In NY - Startups Hiring

If you haven’t noticed how much the NYC tech scene has grown recently, I’m afraid you’ve been hiding under a rock. It’s simply incredible.

Take a look at Mapped In NY, a Google Maps mashup of the growing list popularized by the NY Tech Meetup called Made In New York.

Having been around during the first dot-com boom back in the late 1990s, this is even more exciting to see. Despite the recession, New York’s economy is truly thriving!

[quote]
New York’s Startup scene is truly thriving with a whopping 1263 firms, many of which are hiring.
[/quote]

Why is database administration talent in short supply? They are the Mythical MySQL DBAs

Also take a look at: Why Generalists are Better at Scaling the Web

Cloud DBA and Management Interview

What does a cloud computing expert need to know? This is the last of a three part guide to interviewing for a cloud operations position. You can find them here – part one Operations Interview and part two Deployment Interview.

Here’s my guide to do just that.

1. Database administration experience

Although in some shops the DBA role is a completely separate one, there are many others where the Linux and Operations teams manage these services as well. We do have some other material: Oracle DBA Interview questions and MySQL DBA Interview Guide. Here’s a taste of what to expect.

o What is RAID? Which type is best?

RAID is a way to spread work across a whole bunch of disks on one server. Databases like Oracle or MySQL do a lot of writing and reading from disk. If there are more disks sharing this work, it’s like you have more waiters in your restaurant. Faster service.

Although some folks still hang onto RAID 5 as an option, it’s generally a very bad one. It has a serious write penalty because of the parity it must calculate and update on every write. Most databases do a lot of writing, even when user transactions are not doing INSERT or UPDATE. What’s more, if a disk fails, RAID 5, although technically online, will be so slow as to be effectively unusable while the long slow rebuild happens.

What’s the answer then? RAID 10! It mirrors each volume, and then stripes across those mirrored sets. Fast I/O, fast recovery. Done & done.
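To put rough numbers on the write penalty, here is a back-of-the-envelope sketch. It uses the commonly quoted rule of thumb that a small RAID 5 write costs four disk I/Os (read data, read parity, write data, write parity) while a RAID 10 write costs two (one per mirror). The eight disks, 150 IOPS per disk and 50% write mix are made-up figures, not measurements.

```python
RAID5_WRITE_PENALTY = 4   # rule of thumb: read data + read parity + write data + write parity
RAID10_WRITE_PENALTY = 2  # rule of thumb: one write to each side of the mirror

def effective_iops(disks, iops_per_disk, write_fraction, write_penalty):
    """Estimate the IOPS an application sees from an array, given its write mix."""
    raw = disks * iops_per_disk
    read_fraction = 1.0 - write_fraction
    # Each application write costs `write_penalty` back-end I/Os; each read costs one.
    return raw / (read_fraction + write_fraction * write_penalty)

raid5 = effective_iops(8, 150, 0.5, RAID5_WRITE_PENALTY)
raid10 = effective_iops(8, 150, 0.5, RAID10_WRITE_PENALTY)
print(f"RAID 5:  {raid5:.0f} IOPS")   # 480
print(f"RAID 10: {raid10:.0f} IOPS")  # 800
```

Same disks, same workload, and the RAID 10 array comes out well ahead once writes make up a good share of the traffic.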

o What are the tradeoffs with more indexes versus fewer?

In all relational databases, you build indexes on data. Indexes are just like the ones you think of in the yellow pages, phonebooks of yore. An index on first name means you can look up Obama by Barack as well. An index on street addresses means you can look up the White House. So the more indexes you have, the more different ways you can search for & fetch what you want.

On the other hand the penalty here, is that whenever you add new data & records to this database, all those indexes must be updated. That’s overhead, which slows down writes.

So the tradeoff is more indexes – faster fetching, slower writing. Fewer indexes slower fetching, faster writing.
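The tradeoff can be sketched with a toy phone book in Python. This is not how a real database stores indexes; the `PhoneBook` class and its fields are invented purely to count the extra work each insert has to do.

```python
class PhoneBook:
    """Toy table with optional secondary indexes kept as dictionaries."""

    def __init__(self, indexed_fields):
        self.rows = []
        self.indexes = {field: {} for field in indexed_fields}
        self.index_updates = 0  # extra work that writes have cost so far

    def insert(self, row):
        self.rows.append(row)
        position = len(self.rows) - 1
        # Every index has to learn about the new row: the write penalty.
        for field, index in self.indexes.items():
            index.setdefault(row[field], []).append(position)
            self.index_updates += 1

    def find(self, field, value):
        if field in self.indexes:
            # Indexed lookup: jump straight to the matching rows.
            return [self.rows[i] for i in self.indexes[field].get(value, [])]
        # No index: scan every row.
        return [row for row in self.rows if row[field] == value]


book = PhoneBook(indexed_fields=["first", "last", "street"])
book.insert({"first": "Barack", "last": "Obama", "street": "1600 Pennsylvania Ave"})
book.insert({"first": "Sean", "last": "Hull", "street": "Broadway"})

print(book.find("first", "Barack"))  # found via the first-name index
print(book.index_updates)            # 2 rows x 3 indexes = 6 index updates
```

Drop the indexes and `index_updates` stays at zero, but every `find` has to walk the whole list.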

o What do NoSQL databases eliminate? How do they achieve great speed?

There are quite a few different types of NoSQL databases. So I’m generalizing quite a lot here. One thing NoSQL databases eliminate is the ability to JOIN data across different tables. By removing this great feature of relational databases, they dramatically simplify the underlying implementation. No free lunch!

What else? Many of these databases cut corners on what’s called durability. What is durability? Imagine you are in a lecture hall and bring your notebook or are waiting tables, and taking orders. It might be quicker to do so without writing things down. You keep it all in your head. Great, but what if you forget something? You have to go ask for the order again! It may be faster, but more prone to error. Losing data is not something to be taken lightly. NoSQL databases don’t always flush data to permanent storage.
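To make "flushing to permanent storage" concrete, here is a small sketch using a plain log file. It is not how any particular NoSQL database works; the `record_order` helper and file name are made up. It just shows the difference between handing data to the operating system and insisting it reaches the disk.

```python
import os
import tempfile

def record_order(path, order, durable):
    """Append one order to a log file, optionally forcing it to disk."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(order + "\n")
        if durable:
            log.flush()             # push Python's buffer to the operating system
            os.fsync(log.fileno())  # ask the OS to write it through to the device

log_path = os.path.join(tempfile.mkdtemp(), "orders.log")
record_order(log_path, "table 4: two coffees", durable=True)   # slower, but written down
record_order(log_path, "table 7: one bagel", durable=False)    # faster, the OS may hold it in memory for a while
```

Both orders show up if you read the file back right away. The difference only matters if the machine loses power before the operating system gets around to writing the second one.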

[quote]
Whether or not a web operations candidate uses command line may seem like a small issue. But it speaks to what their DNA is, and the strength of their foundation. Strength and comfort on the command line is key.
[/quote]

o What is Amazon RDS? When should I use it?

Amazon has a managed relational database solution called RDS. It’s basically MySQL, Oracle or SQL Server, but modified so you can’t shoot yourself in the foot. Administrative tasks are simplified, but so are your configuration options.

I wrote an in-depth Amazon RDS use cases article. It mostly covers MySQL, but the general rules apply to Oracle & SQL Server. At the end of the day RDS is a lot less configurable and flexible. But if you don’t have a regular DBA on staff, it will probably simplify your administration of these servers.

o What are read-replicas? What about Multi-az?

Read-replicas are read-only copies of your data. Using MySQL these are fairly stock master-slave configurations. Note that since they’re the standard technology, they’re still asynchronous. So yes, the read-replica can lag behind.

Multi-az is a proprietary technology, and Amazon doesn’t disclose what’s under the hood. However it’s likely running on top of something like DRBD, which is a distributed replicated block device. This allows the underlying disk I/O to be mirrored across the network to another availability zone. You’ll enjoy synchronous copies of your data, and no data consistency problems. Keep in mind though that the alternate server is offline or cold and can take time to come online.

o What is the primary bottleneck of hosting databases in the cloud? How has Amazon recently addressed this?

As I explained above, disk I/O remains the largest bottleneck for relational databases, even if the entire dataset fits in memory. Why? Because sorting, joining, and rearranging data can take orders of magnitude more memory than the data itself, so that work spills to disk. And that’s not even talking about durability guarantees.

The cloud has traditionally lagged quite a lot behind physical servers in terms of disk I/O, so some internet firms have shied away from moving to the cloud. EBS volumes were typically limited to a few hundred IOPS.

Amazon recently announced Provisioned IOPS. It’s a mouthful of a name for a very big development. It means you can provision how fast you want those virtual disks to be. For individual volumes the limit seems to be 2000 IOPS but you can also software RAID across many of those virtual disks. For Amazon RDS the limit is reportedly 10,000 IOPS. This new feature will make a huge difference for hosting large high I/O databases in Amazon’s cloud.
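As a quick sizing sketch, assuming the 2000 IOPS per-volume figure above and that striping scales linearly (an idealization, real arrays lose some throughput to overhead), the number of volumes to stripe across works out like this. The `volumes_needed` helper is made up for illustration.

```python
import math

def volumes_needed(target_iops, per_volume_limit=2000):
    """How many provisioned volumes to stripe across for a target IOPS figure."""
    return math.ceil(target_iops / per_volume_limit)

print(volumes_needed(10000))  # 5
print(volumes_needed(7500))   # 4
```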

2. Architecture & Management Questions

o Why does the API battle between Amazon & Eucalyptus (FOSS) matter?

As large applications are architected to provision hardware components and resources in the cloud, the API they work through becomes key. Sticking to an open standard for this API means you can change cloud vendors and/or build on multiple ones. We talked about this multi-cloud solution as a key way to avoid outages like those Airbnb and Reddit experienced when AWS had an outage.

Following on the heels of that article, we were quoted about multi-cloud by Brandon Butler in his Network World piece.

o Do you use command line tools? Why?

A good web operations candidate should be very comfortable with command line tools. Everything in Linux is command line. It’s like broadway acting to movie acting, or literature to books. It’s the original source and much more powerful. What’s more, it indicates and requires much stronger theoretical knowledge of the underlying systems being managed.

o What can go wrong with backups? How do we test them?

Everything can go wrong with them. They can fail to complete. They can be backups of the wrong service or resource. Even the backup software itself can have bugs. The only way to sleep well at night is if you run fire drills and restore your application and data top to bottom.
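A fire drill in miniature might look like the sketch below. It uses SQLite from Python's standard library as a stand-in for your real database, and the table and the `fingerprint` helper are invented for the example. The point is simply that you open the backup copy and compare it against the source, rather than trusting that the backup job exited cleanly.

```python
import sqlite3

def fingerprint(conn):
    """Summarize a database as {table_name: row_count} for a quick comparison."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0] for t in tables}

# A stand-in "production" database with a little data in it.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
live.executemany("INSERT INTO orders (item) VALUES (?)", [("coffee",), ("bagel",)])
live.commit()

# Take the backup into a separate database, then actually look inside it.
backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)

drill_passed = fingerprint(live) == fingerprint(backup_copy)
print(drill_passed)
```

Row counts are a weak check on their own. A real drill would also start the application against the restored copy and run a few known queries.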

o Should we encrypt filesystems in the cloud? What are the risks?

This depends on your environment and how sensitive your data is. If you’re collecting credit card data for instance, it may be key. However some surprising blips may push other applications to encrypt as well. Bugs in the hypervisor could potentially make your data vulnerable. What’s more, if the cloud provider gets subpoenaed, your server and data may well get swept up in the net. Better safe than sorry. Remember you don’t know where your data actually resides, but you do control who has access if you’re encrypted.

We wrote a very in-depth piece on Deploying on Amazon EC2 where we discuss questions such as encryption in more depth.

o Should we use offsite backups?

It’s definitely worth doing this. One more layer of insurance.

o What is load balancing? Why is it difficult with databases?


Load balancing puts a digital traffic circle into your infrastructure, giving you two roads or paths to resources. However those resources have to be exactly the same. With databases you are constantly writing to tables, and updating records. When you scale those horizontally, it becomes very difficult to keep every copy in step with those changes.

[quote]
Relational databases are inherently difficult to scale. Most environments scale a single authoritative master vertically, and add multiple read-only slaves horizontally to allow the application to serve more customers.
[/quote]
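The master-plus-read-only-slaves pattern in that quote can be sketched as a tiny router. This is an illustration only: the `ReadWriteRouter` class and host names are hypothetical, and deciding by the first SQL keyword is a deliberate simplification.

```python
import itertools

class ReadWriteRouter:
    """Send writes to the single master, spread reads across replicas round-robin."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb == "SELECT":
            return next(self._replicas)
        # INSERT, UPDATE, DELETE and anything else goes to the authoritative copy.
        return self.master

router = ReadWriteRouter("db-master", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM users"))          # db-replica-1
print(router.route("UPDATE users SET name = 'x'"))  # db-master
print(router.route("SELECT 1"))                     # db-replica-2
```

Because the replicas are asynchronous, a read that follows a write may not see it yet, so some applications pin reads to the master for a short while after writing.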


o Why use a package manager? Can we install from source?

Package managers simplify the installation of software components. A team such as Red Hat, Ubuntu or Debian builds a distribution, and compiles all components, storing them in a repository. Installing packages this way allows your setup to be standard across servers. This allows more automation, and makes it simpler for another admin to figure out what you have, down the line when it passes to someone else’s shoulders.

Installing from source is generally a bad idea. Although it allows you to tweak and configure each piece of software the way you want, tightly and efficiently, it also means everything is custom. No commoditization advantages.

o What is horizontal scalability?

This involves adding more hardware, more individual servers to service the same application and users.

o What is vertical scalability?

This means scaling up or growing your existing single server, so it is larger, has more memory, cpu or faster disk.

o What can go wrong with automatic failover?

Just about everything. Applications and services can stall, disks can fail, servers can hang. What’s more networks can exhibit latency. Automatic failover is ultimately a piece of software or algorithm trying to diagnose and handle situations. And it does so based on a very small list of rules or heuristics. The real world is messy, so this can often lead to false failure detection, and potentially loss of data.
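Here is the kind of small heuristic such software relies on, as a sketch. The function name and the three-misses threshold are made up. It waits for several consecutive failed health checks before declaring the primary dead, which cuts down on false alarms from a single network blip but also delays a genuine failover.

```python
def should_fail_over(heartbeats, misses_required=3):
    """Decide on failover only after several consecutive missed heartbeats.

    `heartbeats` is a list of booleans, oldest first: True means the
    primary answered the health check, False means it timed out.
    """
    recent = heartbeats[-misses_required:]
    return len(recent) == misses_required and not any(recent)

print(should_fail_over([True, False, True, False]))   # False: blips, not an outage
print(should_fail_over([True, False, False, False]))  # True: three misses in a row
```

Notice what the rule cannot tell apart: a dead server and a healthy server behind a slow network look identical from here, which is exactly how false failure detection happens.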

o How do cloud vendors implement vertical scalability?

This may vary dramatically between cloud providers. Ultimately, however, since virtualization allows you to boot a disk image onto any hardware, you can snapshot your current root volume or disk and then boot it on another server, one that is larger, smaller and so forth. About the only thing you need to watch out for is 32-bit versus 64-bit questions.

If you haven’t already, don’t forget to check out the rest of this series – part one Operations Interview and part two Deployment Interview.

Why do people leave consulting?

As a long time freelancer, it’s a question that’s intrigued me for some time. I do have some theories…

First, definitions… I’m not talking about working for a large consulting firm. Although this role may be called “consultant”, my meaning is consultant as sole proprietor, entrepreneur, gun for hire or lone wolf.

1. Make more money in a fulltime role

I’ve met a lot of people who fall into this trap. They take a fulltime role simply because it pays better. That raises a lot of questions…

o Are you pricing right?

You could be pricing too high to get *enough* work. You may also be pricing too low to cover benefits, health insurance and so forth. Or perhaps you can’t sell to your rate. You can be smart skills-wise, but do you feel your client’s pain? Are you good at being a businessman? Consistent?

o Can you sell, and put together an appealing proposal?

o Can you execute to the client’s satisfaction?

o Can you follow up consistently while accounts payable gets tied up in knots?

o Can you follow up if your client executes past their spend?

Running a business is complicated, and a lot of expenses can be hard to juggle. You will find times when a client may have spent a little faster than their revenue, and have trouble finding money when the invoice arrives. Follow-up, patience and persistence are key.

Read: Why high availability is so very hard to deliver

Want more? We wrote an in depth 3 part guide to consulting.

2. Make a consistent paycheck in a fulltime position

o Are you networking enough?

If you take a longterm gig and get comfortable, your pipeline can dry up. And your pipeline is the key to your longterm strength, and regular business. You must get out there, and let people know about you, your services, and your availability.

If you don’t network regularly, post across the web, engage on social media channels, blog regularly and so forth, you’ll likely just land a series of 6-12 month fulltimeish gigs through recruiters or headshops.

Related: 5 ways to evaluate independent consultants

[quote]Being a freelancer or entrepreneur involves wearing many hats. Finding business involves networking & marketing. Delivering to their needs involves emotional intelligence. And actually getting paid on time is a whole artform in itself. Leave a good taste in their mouth and your reputation will spread quickly by word of mouth.[/quote]

o Do you really *LIKE* being an entrepreneur?

Are you consistent? Consulting is like running a marathon, if you burn out you may give up!

Have a large web property or application which is experiencing some growing pains? Take a look at how we do performance reviews. It may be just what you’re looking for.

Related: MySQL interview guide for managers and candidates alike

3. Do you like the lifestyle of larger corporate environments?

o Fulltime roles allow for much more jedi sword play. Maneuvering up the ranks involves relationship building as much as consulting, but with a more well defined ladder to climb.

o Sometimes you’ll find passing the buck and finger-pointing quite common.

o There are roles involving managing people and processes. These less often lend themselves to short term or situational consulting arrangements. If you lean towards those roles, a fulltime position may suit you better.

Trying to hire top tech talent? Here’s our MySQL DBA hiring guide & interview questions

[quote]Working as a sole proprietor for a couple of decades has taught me to be very entrepreneurial. It is every bit about building a real-world startup[/quote]

4. Want to do more cutting edge & at the keyboard work

Consulting can and often does allow you to bump into the latest technologies, and get your feet wet with what cutting edge firms are doing. However in a fulltime role you can more completely immerse yourself in the technology, and those long term solutions.

Also: Why devops talent is in short supply

o You can take part in R&D – Google’s 20% projects, for example

o You can build hypothetical projects

o You can work in more idealistic environments, operations and even lectures & training

Though you can certainly do all of this as a freelancer, you have to build enough capital, and so forth to make it work.

Juggling job roles as a consultant isn’t easy. What a CTO must never do.

5. Don’t like running a small business

Consulting as a sole proprietor and staying in business for almost twenty years, I’ve learned that it is every bit about running a small business or startup.

A. Acquiring customers, networking, marketing
B. Understanding their needs and delivering to improve their position
C. Pricing in a way your customers understand
D. Offering value to your customers, at a competitive price
E. Managing relationships so your brand or reputation precedes you
F. Making sure payments and invoicing aren’t a hurdle, and following up
G. Pacing yourself like a marathon runner – keep doing what you’re doing right

Read this far? Get our scalable startups monthly newsletter. We cover these topics in detail, year in and year out.

Hiring is a numbers game


On a recent twitter chat (#hfchat) I posted some comments about hiring. Some folks were complaining that they had applied to various jobs, and not heard back.

I commented…

[quote]Apply for a job and don’t hear back, it’s nothing personal[/quote]

In today’s market, there are hundreds of job applicants for every position. Sad to say, but that means things become a blur after a while. There’s less chance to sift each candidate and find out who they really are or what they really know. It’s more about keywords, and buzzwords if you must, to get your foot in the door.

But there is a flip side to this coin, which I think many job seekers forget sometimes.

[quote]job seekers: apply to enough positions so that you forget to followup sometimes.[/quote]

Imagine that: you’ve applied to so many positions and heard back from a bunch, so you take it for granted a bit that you’ll surely hear back from others.

We might also argue that to some degree, especially early on when you are building your reputation, this numbers game is at play in consulting too. The more people you get in front of the more you’ll practice honing your message. At the same time more people will find out about you, and talk about you. We have a consulting 101 guide we know you’ll enjoy.

If I were to offer a few other nuggets of advice I’d suggest:

o Hone your resume for keywords and search
o Test your linkedin profile – search those keywords
o Edit your cover letter to be short & punchy!
o Throw in some buzzwords – a little rockstar this and agile that!

Looking for specific advice for tech jobs? We wrote a hiring guide for a MySQL DBA. These are equally helpful to job candidates, and those who are interviewing them. Anyone know why operations & MySQL DBAs are so hard to find these days?

Why you should attend Percona Live 2012

What I loved about Percona Live 2011

Last year I was excited to go to Percona Live for the first time in NYC. I arrived just in time to hear Harrison Fisk from Facebook speak about some of the awesome tweaks they’re running with MySQL there. It’s not every day that you get to hear from top MySQL engineers how they’re using the technology and what their biggest challenges are. If they can make MySQL hum, so can the rest of us!

Afterward, outside in the foyer, I ran into all sorts of luminaries in the MySQL space. Percona folks like Peter Zaitsev & Vadim Tkachenko, plus other big names like Baron Schwartz, Harrison, and Ronald Bradford. I ran into people from firms like Yahoo, Google, Daniweb, Pythian, SkySQL & Palomino.

You might also like our Setup MySQL Replication with Hotbackups as well as How to deploy MySQL on Amazon EC2 servers articles.

What to expect at Percona Live NYC 2012

This year’s event next month features rockstar engineers from an incredible lineup of firms including Etsy, New Relic, YouTube, PayPal, Tumblr, SugarCRM, Square, and of course a few from Percona themselves. I promise you this, these talks won’t be salesy or in any way a waste of your time and money. They will be thoroughly technical talks, with cutting edge insights and advice from those in the trenches using the technology every day.

If I wasn’t heading to Oracle Open World for the publishers seminar & MySQL Connect, I would most certainly be there. In fact I had originally been slated to talk about point-in-time recovery in MySQL. Oh well, I’m sure I’ll catch you at the Percona Live in April 2013.

If you do decide to attend please enjoy a 15% discount with code “SeanHull”!

Looking to hire top MySQL talent? Check out our MySQL DBA Hiring Guide with advice for managers, recruiters, and candidates too! We also have an enduringly popular article about the mythical MySQL DBA and why they’re hard to find.

Road War Story – Hacking Inflight Solutions

The 2am phone call

Last summer I got my call from the president at 2am.  Actually it was my former boss at Hollywood Reporter.  I had worked there three months previously, and they had since hired an outsourced DBA solution.  Big outsource, big chops.  And big fail.

12 hours to liftoff

I was scrambling to pack my luggage to go on summer vacation.  I was bound for SF at the moment and my flight was leaving in the morning.  I was trying to wrap up loose ends and my former boss was entreating me – “Can you help us?  Our replication setup has just melted down.  We need you to cleanup the mess.”

The so-called pain point

After a few more early am Skype calls and chats, the team retired for the night and I finished packing my bags.   I snuck in an hour of sleep then headed straight for the airport.  Once through airport security, I bust out my laptop and start logging into the servers.

Although the exact cause of the replication failure remained opaque, I was asked to scan both databases and determine differences.  Out of my toolbox comes the perfect tool for the job, pt-table-checksum, and I run scans on both databases.  (For the curious, here is how) I find countless records different between the two databases.
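For a feel of what a checksum comparison does, here is a simplified sketch in Python. It is not how pt-table-checksum is implemented, and the real tool is considerably more involved. The function names, chunk size and sample rows are all made up. The idea is just to hash rows in chunks on each side and flag the chunks whose hashes disagree.

```python
import hashlib

def chunk_checksums(rows, chunk_size):
    """Hash rows in fixed-size chunks; rows are (primary_key, value) tuples sorted by key."""
    checksums = {}
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        digest = hashlib.sha256(repr(chunk).encode("utf-8")).hexdigest()
        checksums[chunk[0][0]] = digest  # keyed by the first primary key in the chunk
    return checksums

def differing_chunks(primary_rows, replica_rows, chunk_size=2):
    """Return the starting keys of chunks whose checksums differ between the two copies."""
    a = chunk_checksums(primary_rows, chunk_size)
    b = chunk_checksums(replica_rows, chunk_size)
    return sorted(key for key in a.keys() | b.keys() if a.get(key) != b.get(key))

primary = [(1, "alice"), (2, "bob"), (3, "carol"), (4, "dave")]
replica = [(1, "alice"), (2, "bob"), (3, "carol"), (4, "DAVE")]
print(differing_chunks(primary, replica))  # [3]
```

Comparing chunk hashes means you only have to dig into the chunks that differ, instead of shipping every row across the wire.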

Now my flight is boarding, so I pack up the laptop and find my seat.  As soon as the seat belt lights flash off, I’m flipping open my macbook and getting inflight wifi working.  Through the flight I’m on Skype with the team, with command line terminals open to the servers.  Discuss, debug, troubleshoot – rinse, repeat.

From there I write up a report and explain to the team & CTO the problem.  Syncing that many different records is too risky.  We’d have to review all the statements one-by-one.  I’d rather rebuild replication from scratch.

From there the CTO gives the go ahead, and with the help of Percona’s xtrabackup to do online hotbackups, we are able to fix replication without downtime. Amen to that!

Now with our primary MySQL database and secondary read-only one back online, things calm down a lot.  Traffic returns to a smooth predictable 2 million pageviews per day.  That’s smooth and predictable on a site that gets 50 million a month!   The database loads are calm and steady, as are all of our nerves.   In the coming days we continue to monitor the situation, and write up a lengthy root cause analysis of the situation.

Freelancers & Consultants take note

To my recent Consulting 101 article I would add the following bullets:

  • Responsiveness is crucial
  • Be there when a client needs you, and your value goes up.  Be reliable, and loyal to those you’ve worked with.

  • Be an integral part of your team
  • Everyone knows each other, virtually or in real life, and is comfortable with the parts they play.  A team that can work together is crucial, whether it’s all fulltime folks, some consultants, some outsourced or wherever they may be.  Each has a role to play, and communication and team work bring it all together.

  • Have laptop will travel
  • I never turn down a job.  There will be plenty of time for vacations and rest when the dust settles.

  • Don’t break things
  • If there is any doubt in your mind, test, and test again.  Always err on the side of caution.  Check thrice and cut once!  If you haven’t done an operation ten, twenty or fifty times before, experiment a few more times with options to be sure.  And most importantly, if you don’t login to the systems you’re working on regularly, you better make damn sure you’re on the right box, flipping the right switch, and moving the right dials.  With modern internet infrastructure, there are a hundred ways to push the wrong red button!

CTOs and Directors of Operations take note

  • Small & Nimble wins the day
  • I’ve used this value proposition before when speaking to prospects.  You can hire a big firm, and be a small fish to them.  Small fish means you’re gonna get less attention.  OR you can hire a small firm or contractor.  Then you’ll be a big fish to him or her.  Guess what?  If you’re their big fish, they’re gonna pay extra attention to every move they make, and ensure things don’t break.  They can’t afford mistakes, not to their reputation or their bottom line.  Not like the big boys can.

  • Choose passionate, yet conservative & risk averse operations folks
  • With developers you’re building technology, features, and forging ahead into new solutions.  The role is more to create waves, and break barriers.  How can we enable new business processes and so forth?

    In hiring operations personnel you want stability.  Look for individuals who are more risk averse.  This conservative streak is a countering force.  Ops teams are tasked with the job of bringing a steady state to your business services.  They don’t want to wake up at 2am.

Best of Guide – Highlights of Our Popular Content

We cherry pick the top 5 most popular posts of various topics we’ve covered in recent months.

Top MySQL DBA interview questions (Part 1)

Also find Sean Hull’s ramblings on twitter @hullsean.

MySQL DBAs are in greater demand now than they’ve ever been. While some firms are losing the fight for talent, promising startups with a progressive bent are getting first dibs with the best applicants. Whatever the case, interviewing for a MySQL DBA is a skill in itself so I thought I’d share a guide of top MySQL DBA interview questions to help with your screening process.
It’s long and detailed with some background to give context so I will be publishing this in two parts.

The history of the DBA as a career

In the Oracle world of enterprise applications, the DBA has long been a strong career path. Companies building their sales staff required Peoplesoft or SAP, and those deploying the financial applications or e-business suite needed operations teams to manage those systems. At the heart of that operations team were database administrators or DBAs, a catchall title that included the responsibility of guarding your business’s crown jewels. Security of those data assets, backups, management and performance were all entrusted to the DBA.

In the world of web applications, things have evolved a bit differently. Many a startup is driven only by developers. In those smaller shops, Operations tasks are delegated to one developer who takes on the additional responsibility of managing systems. In that scenario, Operations or DBA duties become a sort of secondary role to the primary one of building the application. Even in cases where the startup creates a specific operations role with one person managing systems administration, chances are they don’t also have DBA experience. Instead, these startups are more likely to manage the database as a typical Linux application.

When I grow up I (don’t) want to be a MySQL DBA

Where do they come from, and why don’t a lot of computer science folks gravitate towards operations, and DBA? This may be in part due to the romance of certain job roles which we discussed in a past article, The Mythical MySQL DBA. This pattern appeared a lot in the Oracle world as well. Many folks who were career DBAs actually moved to that role from the business side. In fact you’d find that many didn’t have a computer science or engineering background in the first place. In my experience I saw many Linux and Unix administrators with a stronger foundation who would fit into the DBA role but were simply not interested in it. The same can be said of the MySQL side of the house. Computer science grads don’t get out of school aiming for a career in ops or as a DBA because it has never been regarded as the pinnacle. It’s typically the PROGRAMMERS who become the rockstars in a cool startup.

But as the Internet grows into a richer and more complex medium, things are changing. People talk about scalability, high availability, zero downtime and performance tuning. When brief outages cost millions in losses, expectations are very high and that requires skilled, experienced DBAs.

We’ve made a list comprised of skill questions, general questions and ‘good-to-know’ questions. Have fun grilling your candidate with them, although bear in mind that with interviews it’s not about knowing it all, rather how the person demonstrates critical thinking skills.

Skills Questions

  1. Why are SQL queries so fundamental to database performance?

    This is the one question which a DBA should have an answer to. If they can’t answer this question, they’re unlikely to be a good candidate.

    After a MySQL server is set up and running, with many of the switches and dials set to use memory, and play well with other services on the Linux server, queries remain an everyday challenge. Why is this?

    SQL queries are like little programs in and of themselves. They ask the database server to collect selections of records, cross tabulate them with other collections of records, then sort them, and slice and dice them. All of this requires MySQL to build temporary tables, perform resource intensive sorts and then organize the output in nice bite size chunks.

    Unfortunately there are many ways to get the syntax and the results right, yet not do so efficiently. This might sound like a moot point, but with modern websites you may have 5000 concurrent users on your site, each hitting pages that have multiple queries inside them.

    What makes this an ongoing challenge is that websites are typically a moving target, with business requirements pushing new code changes all the time. New code means new queries, which pose ongoing risks to application stability.
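
Here’s a rough sketch of what "right results, wrong approach" can look like. It uses Python’s built-in sqlite3 module as a stand-in for MySQL (an assumption, purely so the example runs anywhere), and the users table, its columns and the index name are all hypothetical. Both queries return the same rows, but wrapping the indexed column in a function stops the engine from using the index. On a MySQL server you would inspect the same thing with EXPLAIN.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table, columns and index are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, last_name TEXT)")
conn.execute("CREATE INDEX idx_users_last_name ON users (last_name)")
names = ["Hull", "Hughes", "Ivanov", "Smith", "Harris"]
conn.executemany(
    "INSERT INTO users (last_name) VALUES (?)",
    [(names[i % len(names)],) for i in range(1000)],
)

# The same question asked two ways: whose last name starts with "H"?
wrapped_sql = "SELECT id FROM users WHERE substr(last_name, 1, 1) = 'H'"
range_sql = "SELECT id FROM users WHERE last_name >= 'H' AND last_name < 'I'"


def plan(sql):
    """Return the human-readable steps of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


wrapped_ids = sorted(row[0] for row in conn.execute(wrapped_sql))
range_ids = sorted(row[0] for row in conn.execute(range_sql))

print("same rows returned:", wrapped_ids == range_ids)
# The function call hides the column from the index, so every row is visited.
print("function on column:", plan(wrapped_sql))
# The plain range comparison lets the index narrow the rows first.
print("plain range:       ", plan(range_sql))
```

With a thousand rows the difference is invisible. With millions of rows and thousands of concurrent users, it is the difference between a snappy page and an outage.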

  2. Indexes – too many, too few; what’s the difference?

    Indexes are very important to the smooth functioning of a relational database. Imagine your telephone book of yore. I can look up all the people with last name of “Hull” in Manhattan because I have the proper index. But most phone books don’t include an index for *first* names even though they might occasionally come in handy, for example with the names “Star” or “Persephone”.

    You can imagine that, if you had a phone book which you maintain and update, every time you add or remove a name you also have to update the index. That’s right, and the same goes for your relational database.

    So therein lies the trade off, and it’s an important one. When you are *modifying* your data, adding, updating or removing records, you must do work to keep the index up to date. More indexes mean more work. However when you’re looking up data or *querying* in SQL speak, more indexes mean more ways of looking up data fast. One more trade off is that indexes take up more pages in your phonebook, and so too they take up more space on disk.
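
The phone book analogy can be sketched in a few lines. Again this assumes SQLite (via Python’s built-in sqlite3 module) as a stand-in for MySQL, and the phonebook table, its data and the index name are made up for illustration. It shows both sides of the trade-off: the indexed last-name lookup becomes a search, the unindexed first-name lookup stays a full scan, and the index itself costs extra pages of storage.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the phonebook table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phonebook (first_name TEXT, last_name TEXT, number TEXT)")
conn.executemany(
    "INSERT INTO phonebook VALUES (?, ?, ?)",
    [(f"First{i}", f"Last{i % 500}", f"555-{i:04d}") for i in range(5000)],
)
conn.commit()


def plan(sql):
    """Return the human-readable steps of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def page_count():
    """Number of storage pages the database currently occupies."""
    return conn.execute("PRAGMA page_count").fetchone()[0]


pages_before = page_count()
conn.execute("CREATE INDEX idx_last_name ON phonebook (last_name)")
conn.commit()
pages_after = page_count()

by_last = plan("SELECT number FROM phonebook WHERE last_name = 'Last42'")
by_first = plan("SELECT number FROM phonebook WHERE first_name = 'First42'")

print("extra pages used by the index:", pages_after - pages_before)
print("lookup by last name: ", by_last)   # uses idx_last_name
print("lookup by first name:", by_first)  # no index, so the whole table is read
```

Every further index added to this table would grow the page count again and add one more structure to update on each insert, update or delete.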

  3. Backup & Recovery – explain various types & scenarios for restore

    Backups come in a few different flavors that the DBA should be familiar with.

    Cold backups involve shutting down the database server (mysqld) and then backing up all the data files by making a copy of them to another directory. To be really thorough, the entire datadir including binlogs, log files, and the /etc/my.cnf config file should also be backed up. The cold backup is a database in itself, and can be copied to an alternate server and mounted as-is.

    Logical backups involve using the mysqldump tool. By default this locks tables while it runs to maintain consistency of changing data, and can cause downtime. The resulting dump file contains CREATE DATABASE and CREATE TABLE statements (index definitions included) plus the INSERT statements that reload the data. Note the file itself is not a database, but rather a set of instructions which can tell a MySQL server *HOW* to reconstruct the database. Important distinction here.

    Hot backups are a great addition to the mix as they allow the physical database data files to be backed up *WHILE* the server is up and running. In MySQL this can be achieved with the xtrabackup tool, available from Percona. Despite the name, it works very well with MyISAM and InnoDB tables too, so don’t worry if you’re not using xtradb tables.

    There are a few different restore scenarios, and the candidate should be able to describe how these various backups can be restored, and what the steps to do so would be. In addition they should understand what point-in-time recovery is, and how to perform that as well. After restoring one of the above three backup types, the DBA would use the mysqlbinlog utility to apply any subsequent transactions from the binary logs. So if the backup was made at 2am last night, and you restore that backup, the mysqlbinlog tool would be used to dig up transactions since 2am, and apply them to that restored database.
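
To see why a logical backup is "a set of instructions" rather than a database, here is a small sketch. It is not mysqldump: it assumes SQLite as a stand-in, using the iterdump() method from Python’s built-in sqlite3 module, which produces the same kind of SQL text dump. The accounts table and its rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite's iterdump() plays the role mysqldump plays for MySQL.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT)")
source.execute("CREATE INDEX idx_accounts_owner ON accounts (owner)")
source.executemany("INSERT INTO accounts (owner) VALUES (?)", [("alice",), ("bob",)])
source.commit()

# The "backup" is plain text: the SQL statements needed to rebuild the data.
dump_sql = "\n".join(source.iterdump())
print(dump_sql)

# Restoring means replaying those statements against an empty server.
restored = sqlite3.connect(":memory:")
restored.executescript(dump_sql)
restored_rows = restored.execute(
    "SELECT id, owner FROM accounts ORDER BY id"
).fetchall()
print(restored_rows)
```

A restore from a logical backup therefore takes as long as it takes to re-run every statement, which is worth asking a candidate about when the database is large.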

  4. Troubleshooting Performance

    Since this is an ongoing challenge with relational databases, a good grasp of it is crucial. One way to challenge the candidate would be to describe a recent performance problem you experienced with your infrastructure, and ask them how they would go about resolving it.

    If they struggle with the particulars of what you ran into, ask them to describe a big performance challenge they solved, what the cause was, and how they performed analysis.

    Typically, first steps involve mitigating the immediate problem by finding out what changed in the environment, either operationally or in the code. If there is a bug that was hit, or some other strange performance anomaly, the first stop is usually looking at log files. The MySQL server error log and the slow query log are key files. From there, analyzing those files during the timeframe where problems occurred should yield some clues.

    You might also hope to hear some comment about metrics collection in this discussion. Tools such as Cacti, Munin, OpenNMS, or Ganglia are invaluable for drilling down on a past event or outage, and sifting through server stats to find trouble.

  5. Joins – describe a few kinds and how the server performs them

    A basic understanding of INNER JOIN and OUTER JOIN would be a great start. A simple example might be employees and departments. If you have four employees and two departments, an INNER JOIN of these tables together will give you the departments employees belong to. Add another employee without assigning her to a department, and the inner join won’t display her. Likewise, a new department which doesn’t yet contain employees won’t be displayed either. However, an OUTER JOIN will give you those rows, with null in the department fields and null in the employee fields respectively: a LEFT OUTER JOIN from employees keeps the unassigned employee, and a LEFT OUTER JOIN from departments keeps the empty department.

    For another example, take a credit card company. One table contains the cardholder’s identity, card number, address, and other personal information. A second table contains their account activity. When they first join, they don’t have any monthly statements, so an INNER JOIN of cardholders with statements will yield no rows. However, a LEFT OUTER JOIN from cardholders to statements will yield a record, with nulls in the statements columns.
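
The employees and departments example is easy to try out. The sketch below assumes SQLite (via Python’s built-in sqlite3 module) in place of MySQL, and the table layout, names and rows are made up for illustration. It loads four assigned employees in two departments, then adds one unassigned employee and one empty department.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales'), (3, 'Legal');
INSERT INTO employees VALUES
  (1, 'Ann', 1), (2, 'Bob', 1), (3, 'Cy', 2), (4, 'Di', 2), (5, 'Eve', NULL);
""")

# INNER JOIN: only rows with a match on both sides. Eve and Legal are missing.
inner = conn.execute("""
    SELECT e.name, d.name FROM employees e
    INNER JOIN departments d ON d.id = e.department_id
    ORDER BY e.id
""").fetchall()

# LEFT OUTER JOIN from employees: Eve is kept, with NULL for her department.
all_employees = conn.execute("""
    SELECT e.name, d.name FROM employees e
    LEFT OUTER JOIN departments d ON d.id = e.department_id
    ORDER BY e.id
""").fetchall()

# LEFT OUTER JOIN from departments: Legal is kept, with NULL for its employee.
all_departments = conn.execute("""
    SELECT d.name, e.name FROM departments d
    LEFT OUTER JOIN employees e ON e.department_id = d.id
    ORDER BY d.id, e.id
""").fetchall()

print(inner)
print(all_employees)
print(all_departments)
```

The same three statements run unchanged on MySQL. A good follow-up for the candidate is how they would get both the unassigned employee and the empty department in a single result, since MySQL has no FULL OUTER JOIN.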

Feeling like a MySQL expert yet? In Part 2 of Top MySQL DBA Interview Questions we’ll walk through four more questions plus a bonus.

How to hire a developer that doesn't suck

[Comic: xkcd "Good Code". Strip by Randall Munroe; xkcd.com]

First things first. This is not meant to be a beef against developers. But let’s not ignore the elephant in the living room that is the divide between brilliant code writers and the risk averse operations team.

By the way we also have a MySQL DBA Interview Questions article which is quite popular.

Also take a look at our AWS & EC2 Interview questions piece.

Lastly we have a great Oracle DBA Hiring Guide.

It is almost by default that developers are disruptive with their creative coding while the guys in operations, those who deploy the code, constantly cross their fingers in the hope that application changes won’t tilt the machine. And when you’re woken up at 4am to deal with an outage or your sluggish site is costing millions in losses, the blame game and finger-pointing starts.

If you manage a startup you may be faced with this problem all the time. You know your business and you know what you’re trying to build, but how do you find people who can help you build and execute your ideas with minimal risk?

Ideally, you want people who can bridge the mentality divide between the programmers eager to see feature changes, the business units pushing for them, and the operations team resistant to changes for the sake of stability.