Tag Archives: linux

Top questions to ask a devops expert when hiring or preparing for job & interview

Strip by Randall Munroe; xkcd.com

Whether you’re a hiring manager, head of HR or recruiter, you are probably looking for a devops expert. These days good ones are not easy to find. The spectrum of tools & technologies is broad. To manage today’s cloud you need a generalist.

Join 33,000 others and follow Sean Hull on twitter @hullsean.

If you’re a devops expert and looking for a job, these are also some essential questions you should have in your pocket. Be able to elaborate on these high-level concepts, as they’re crucial in today’s agile startups.

Check out: 8 questions to ask an aws ec2 expert

Also new: Top questions to ask on a devops expert interview

And: How to hire a developer that doesn’t suck

1. How do you automate deployments?

A. Get your code in version control (git)

Believe it or not, there are small one-person teams that haven’t done this. But even for those, there’s real benefit. Get on it!

B. Evolve to a one-script, push-button deploy (script)

If deploying new code involves a lot of manual steps (move a file here, set a config there, set a variable, set up an S3 bucket, etc.), then start scripting. That midnight deploy process should be one master script which includes all the logic.

It’s a process to get there, but keep the goal in sight.
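As a rough illustration of the one-master-script idea, here is a minimal Python sketch. The three steps are placeholders I made up (each just prints a message); a real deploy would swap in its own commands. The point is only the shape: every step lives in one list, runs in order, and the script stops at the first failure.

```python
import subprocess
import sys

# Hypothetical deploy steps; replace each command with what your deploy needs.
DEPLOY_STEPS = [
    ("run tests", [sys.executable, "-c", "print('tests ok')"]),
    ("copy files", [sys.executable, "-c", "print('files copied')"]),
    ("restart service", [sys.executable, "-c", "print('service restarted')"]),
]


def run_deploy(steps):
    """Run each step in order; stop at the first one that fails.

    Returns (names of completed steps, name of the failed step or None).
    """
    completed = []
    for name, command in steps:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"FAILED at '{name}': {result.stderr.strip()}")
            return completed, name
        completed.append(name)
    return completed, None


completed, failed_step = run_deploy(DEPLOY_STEPS)
print("completed:", completed, "failed:", failed_step)
```

Once the whole deploy is expressed this way, handing the same list of steps to a CI tool later is a much smaller jump.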

C. Build confidence over many iterations (team process & agile)

As you continue to deploy manually with a master script, you’ll iron out more details, contingencies, and problems. Over time you’ll gain confidence that the script does the job.

D. Employ continuous integration tools to formalize process (CircleCI, Jenkins)

Now that you’ve formalized your deploy in code, putting these CI tools to use becomes easier, because they’re custom built for you at this stage!

E. 10 deploys per day (long term goal)

Your longer term goal is 10 deploys a day. After you’ve automated tests, team confidence will grow around developers being able to deploy to production. On smaller teams of 1-5 people this may still be only 10 deploys per week, but still a useful benchmark.

Also: Top serverless interview questions for hiring aws lambda experts

2. What are microservices?

Microservices is about two-pizza teams: small enough that there’s little bureaucracy, able to be agile and focus on one business function, and able to iterate quickly without logjams with other business teams & functions.

Microservices interact with each other through APIs, deploy their own components, and use their own isolated data stores.

Function as a service (AWS Lambda, for example), also called serverless computing, enables microservices in a huge way.

Related: Which engineering roles are in greatest demand?

3. What is serverless computing?

Serverless computing is a model where you do not provision or manage servers & infrastructure yourself. Only the code is deployed, and the platform, AWS Lambda for example, takes care of provisioning containers & VMs on demand when the code gets called.

Events within the cloud environment, such as a file added to an S3 bucket, trigger the serverless functions. API Gateway endpoints can also trigger the functions to run.
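To make the event-trigger idea concrete, here is a small sketch of a Python Lambda handler that reads an S3 notification. The bucket name, key and the hand-built sample event are made up for illustration; the handler only collects which objects arrived, where a real function would go on to process them.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Entry point Lambda calls; collects each object named in an S3 event."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    # A real function would do its work here (resize an image, parse a log...).
    return {"processed": len(objects), "objects": objects}


# Local smoke test with a hand-built event; bucket and key are made up.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-uploads"},
                "object": {"key": "reports/q1+summary.csv"}}}
    ]
}
result = handler(sample_event, None)
print(result)
```

Notice there is no server setup anywhere in the code. That is the whole appeal.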

Authentication services such as Auth0 or Amazon Cognito are used for user login & identity management. The backend data store could be DynamoDB or Google’s Firebase, for example.

Read: Can on-demand consulting save startups time & money?

4. What is containerization?

Containers are like faster-deploying VMs. They have all the advantages of an image or snapshot of a server. Why is this useful? Because you can containerize your microservices, so each one does one thing. One has a webserver, with a specific version of xyz.

Containers can also help with legacy applications, as you isolate older versions & dependencies that those applications still rely on.

Containers enable developers to set up environments quickly, and be more agile.

Also: 30 questions to ask a serverless fanboy

5. What is CloudFormation?

CloudFormation formalizes all of your cloud infrastructure into JSON or YAML template files. Want to add an IAM user, S3 bucket, RDS database, or EC2 server? Want to configure a VPC, subnet or access control list? All these things can be formalized into CloudFormation templates.

Once you’ve started down this road, you can check your infrastructure definitions into version control, and manage them just like you manage all your other code. Want to do unit tests? Have at it. Now you can test & deploy with more confidence.
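Here is a minimal sketch of what infrastructure in a file looks like, built as a Python dictionary and dumped to JSON. The logical name UploadsBucket and the helper missing_types are my own inventions for illustration; the template itself declares one versioned S3 bucket and nothing else.

```python
import json


def build_template():
    """Return a minimal CloudFormation template describing one S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Example stack: one versioned S3 bucket",
        "Resources": {
            # "UploadsBucket" is a hypothetical logical name.
            "UploadsBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "VersioningConfiguration": {"Status": "Enabled"}
                },
            }
        },
    }


def missing_types(template):
    """A tiny unit-test style check: resources that forgot to declare a Type."""
    return [name for name, body in template["Resources"].items()
            if "Type" not in body]


template = build_template()
template_json = json.dumps(template, indent=2)  # the text you would commit
print(template_json)
```

Because the definition is just data, a check like missing_types can run in your test suite before anything touches a real AWS account.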

Terraform, a separate tool from HashiCorp, takes the same infrastructure-as-code approach and also works with cloud providers beyond AWS.

Also: What can startups learn from the DYN DNS outage?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

What engineering roles are most in demand at startups?


I was just reading over Stack Overflow’s 2017 Developer Survey. As it turns out there were some surprising findings.

One that stood out was databases. In the media, one hears more and more about NoSQL databases like Cassandra, Dynamo & Firebase. Despite all that, MySQL seems to remain the most popular database by a large margin. Legacy indeed!

1. Databases

MySQL is still the most popular db by a large margin at 56%, followed by SQL Server at 39%, and SQLite and Postgres at 27% each.

Related: Is Amazon too big to fail?

2. Most popular language

JavaScript sits at number one for web developers, sysadmins & data scientists alike, followed by SQL.

Read: Are SQL Databases dead?

3. Most popular framework

Node.js at 47%. It’s followed by AngularJS at 44%.

Also: 5 ways to move data to Amazon Redshift

4. Most loved database

Redis sits at number one here at 65%, followed by Postgres & Mongo.

Also: Myth of five nines – why HA is overrated

Does Linux tell the Gilgamesh story of hacker culture?

Is the command line still essential?
Was Stephenson right about Linux?

It’s been a while since I read Stephenson’s essay on Linux. It’s one of those pieces that’s so well written, we need to go back to it now & then.

This quote caught my eye right away.

“…as living in a commune, where much lip service was paid to ideals of peace, love and harmony, had deprived them of normal, socially approved outlets for their control freakdom, it tended to come out in other invariably more sinister ways. Applying this to the case of Apple Computer will be left as an exercise for the reader, and not a very difficult exercise.”

Anyone who has read about Steve Jobs will chuckle at this one.

1. The Hole Hawg of the internet

When Stephenson wrote this it was 1999. Linux adoption was growing at internet startups, where cost was everything, and risks could be taken. Remember, this was when Google was barely a year old and Amazon Web Services was still years away. Without Linux, neither would be what it is today!

Linux was and is today more like a Hole Hawg for the internet, powerful, but dangerous in the wrong hands. 🙂

“The Hole Hawg is like the genie of the ancient fairy tales, who carries out his master’s instructions literally and precisely and with unlimited power, often with disastrous, unforeseen consequences.”

Also: Why I like Etsy’s site performance report

2. Unix as oral history, our Gilgamesh

“Unix, by contrast is not so much a product as it is a painstakingly compiled oral history of the hacker subculture. It is our Gilgamesh. What made old epics like Gilgamesh so powerful and so long-lived was that they were living bodies of narrative that many people knew by heart, and told over and over again — making their own personal embellishments whenever it struck their fancy.”

Also: Are SQL Databases dead?

3. The bizarre Trinity: Torvalds, Stallman & Gates

“In trying to understand the Linux phenomenon, then, we have to look not to a single innovator but to a sort of bizarre Trinity, Linus Torvalds, Richard Stallman and Bill Gates. Take away any of these three & Linux would not exist.”

And indeed we must thank all three of these characters for where the internet stands today. The cloud is possible because of Linux & cheap Intel hardware, and the GNU free software to go along with it.

Related: Did MySQL & Mongo have a beautiful baby called Aurora?

4. On the meaning of “Open Source”

“Source files are useless to your computer, and of little interest to most users, but they are of gigantic cultural & political significance, because Microsoft & Apple keep them secret, while Linux makes them public. They are the family Jewels. They are the sort of thing that in Hollywood thrillers is used as a McGuffin: the plutonium bomb core, the top-secret blueprints, the suitcase of bearer bonds, the reel of microfilm.”

Read: When hosting data on Amazon turns bloodsport

5. What about Apple today?

“The ideal OS for me would be one that had a well-designed GUI that was easy to set up and use, but that included terminal windows where I could revert to the command line interface and run GNU software when it made sense.”

Stephenson wrote this before Apple had rebuilt their OS to sit on top of Unix. And that’s where we are today with Mac OS X!

Also: Are we fast approaching cloud-mageddon??

Is automation killing old-school operations?

I was shocked to find this article on ReadWrite: The Truth About DevOps: IT Isn’t Dead; It’s not even Dying. Wait a second, do people really think this?

Truth is I have heard whispers of this before. I was at a meetup recently where the speaker claimed “With more automation you can eliminate ops. You can then spend more on devs”. To an audience of mostly developers & startup founders, I can imagine the appeal.

1. Does less ops mean more devs?

If you’re listening to a platform service sales person or a developer who needs more resources to get his or her job done, no one would be surprised to hear this. If we can automate away managing the stack, we’ll be able to clear the way for the real work that needs to be done!

This is a very seductive perspective. But it may be akin to taking on technical debt, ignoring the complexity of operations and the perspective that can inform a longer view.

Puppet Labs’ Luke Kanies says “Become uniquely valuable. Become great at something the market finds useful.” I couldn’t agree more.

Read: Are SQL Databases Dead?

2. What happens when developers leave?

I would argue that ops have a longer view of product lifecycle. I for one have been brought in to many projects after the first round of developers have left, and teams are trying to support that software five years after the first version was built.

That sort of long term view, of how to refresh performance, and revitalize code is a unique one. It isn’t the “building the future” mindset, the sexy products, and disruptive first mover “we’re changing the world” mentality.

It’s a more stodgy & conservative one. The mindset is of reliability, simplicity, and long term support.

Also: How to hire a developer that doesn’t suck

3. What’s your mandate?

From what I’ve seen, devs & ops are divided by a four-letter word.

That word, I believe, is “risk”. Devs have a mandate from the business to build features & directly answer customer requests today. Ops have a mandate for reliability, working against change and thinking in terms of making all that change manageable.

Different mandates mean different perspectives.

Related: What is Devops & why is it important?

4. Can infrastructure live as code?

Puppet, along with other infrastructure automation & configuration management tools like Chef, offers the promise of fully automated infrastructure. But the truth is much, much more complex. As typical technology stacks expand from load balancer, webserver & database to multiple databases, caching servers, search servers, puppet masters, package repositories, monitoring & metrics collection & jump boxes, we’re all reaching a saturation point.

Yes automation helps with that saturation, but ultimately you need people with those wide ranging skills, to manage the complex web of dependencies when things fail.

And fail they will.

Check out: Why are MySQL DBA’s and ops so hard to find?

5. ORMs and architecture

If you aren’t familiar, ORM (object-relational mapper) is a rather dry-sounding name for a component that is regularly overlooked. An ORM is middleware sitting between application & database, and it drastically simplifies developers’ lives. It helps them write better code and get on with the work of delivering to the business. It’s no wonder ORMs are popular.

But as Ward Cunningham eloquently explains, they are surely technical debt that eventually must get paid. Indeed.

There is broad agreement among professional DBAs. Each query should be written, each one tuned, and each one deployed, just like any other bit of code. Handing that process to a library is doomed to failure. Yet ORMs are still evolving, and the dream still lives on.

And all that because devs & ops have a completely different perspective. We need both of them to run modern internet applications. Let’s not forget, folks. 🙂

Read this: Do managers and CTO’s underestimate operational costs?

Scalability Happiness – A Quiet Query Log

Peter Van Allen - Pin Drop

There’s a lot of talk on the web about scalability. Making web applications scale is not easy. The modern web architecture has so many moving parts. How can we grapple with the underlying problem?

Also: Why Are MySQL DBAs So Hard to Find?

The LAMP stack scales well

That’s only half right. True, there are a lot of moving parts, and a lot to set up. But the internet stack made up of Linux, Apache, MySQL & PHP, or LAMP as it’s called, was built to be resilient, dynamic, and scalable. It’s essentially why Amazon works, and why what they’re doing is possible. Windows & .NET, for example, don’t scale well. Strange to see Oracle mating with them, but I digress…

Linux, and the LAMP stack built on top of it, are highly scalable and dynamic to begin with.

Also: AirBNB Didn’t Have to Fail During an AWS Outage

Ok, so what’s this got to do with MySQL? Well a LOT.

The webserver tier, the caching layers like memcache & varnish, and the search tier (Solr) all scale fairly easily because their assets are fixed, or almost so.

The database tier is different. So what affects performance of a database server? Server size? Main memory? Disk speed? The truth is all of those. But that is not the whole story.

Also check out: The Sexiest New Feature of AWS Speeds Up EBS

After you set up the server, with memory settings and so forth, it’s a fairly fixed object. True, there are parameters to tweak, but on the whole there isn’t a ton of day-to-day tuning to do.

Well, if that’s true, why does performance take a hit?! As applications grow, the db server slows down. Don’t we need to tweak server settings? Do we need new hardware?

Read this: A CTO Must Never Do This

The answer is possibly, but 9 times out of 10 what really needs to happen is queries must be tuned.

In 17 years of consulting that is the single largest cause of scalability problems. Fix those queries and your problems are over.

The Elephant in the Room – Query Tuning

I was talking with a colleague today at AppNexus. He said, so should we do some of that work inside the application, instead of doing a huge UNION or a large JOIN? I said yes you can move work onto the application, but it makes the application more complex. On the flip side the webserver tier is easier to scale. So there are tradeoffs.

I said this:

By and large, if scalability is our goal, we should work to quiet the activity in the slow query log. This is an active project for developers & DBAs. Keep it quiet and your server will run well.

Also: Top MySQL DBA Interview Questions for Candidates, Hiring Managers & Recruiters

Yet I still talk to teams where this is mysterious. It’s unclear. There’s no conviction there. And that’s where I think DBAs are failing. Because this is our subject matter expertise, and if we haven’t convinced developer teams of this, we’re not working together enough. API teams aren’t separate from DBA and operations. Siloing technology departments is a killer…


As you roll out new code, if some queries show up, then those need attention. Tweak the code until the queries drop out. This is the primary project of scalability.
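To show what watching the slow query log can look like in practice, here is a small Python sketch that groups slow log entries by query shape and totals their time. The sample entries are made up, and the parsing only handles the common "# Query_time:" header followed by the statement; treat it as a starting point, since established log analysis tools do this far more thoroughly.

```python
import re
from collections import defaultdict

QUERY_TIME = re.compile(r"^# Query_time: ([\d.]+)")


def normalize(sql):
    """Replace literals with ? so queries of the same shape group together."""
    sql = re.sub(r"'[^']*'", "?", sql)
    sql = re.sub(r"\b\d+\b", "?", sql)
    return re.sub(r"\s+", " ", sql).strip()


def summarize_slow_log(lines):
    """Return (query shape, count, total seconds), slowest total first."""
    entries = []  # each entry: [query_time, [statement lines]]
    for raw in lines:
        line = raw.strip()
        match = QUERY_TIME.match(line)
        if match:
            entries.append([float(match.group(1)), []])
        elif not line or line.startswith("#"):
            continue
        elif line.lower().startswith(("set timestamp", "use ")):
            continue
        elif entries:
            entries[-1][1].append(line)
    totals = defaultdict(lambda: [0, 0.0])
    for query_time, statement in entries:
        if statement:
            bucket = totals[normalize(" ".join(statement))]
            bucket[0] += 1
            bucket[1] += query_time
    return sorted(((sql, count, total) for sql, (count, total) in totals.items()),
                  key=lambda item: item[2], reverse=True)


# Made-up slow log entries for illustration.
sample_log = """\
# Time: 2017-05-01T12:00:00.000000Z
# User@Host: app[app] @ localhost []  Id:    12
# Query_time: 3.500000  Lock_time: 0.000100 Rows_sent: 10  Rows_examined: 500000
SET timestamp=1493640000;
SELECT * FROM orders WHERE customer_id = 42;
# Time: 2017-05-01T12:00:05.000000Z
# User@Host: app[app] @ localhost []  Id:    13
# Query_time: 2.000000  Lock_time: 0.000100 Rows_sent: 1  Rows_examined: 90000
SET timestamp=1493640005;
SELECT name FROM users WHERE email = 'a@example.com';
# Time: 2017-05-01T12:00:09.000000Z
# User@Host: app[app] @ localhost []  Id:    14
# Query_time: 1.500000  Lock_time: 0.000100 Rows_sent: 3  Rows_examined: 500000
SET timestamp=1493640009;
SELECT * FROM orders WHERE customer_id = 7;
""".splitlines()

summary = summarize_slow_log(sample_log)
for sql, count, total in summary:
    print(f"{total:6.1f}s  x{count}  {sql}")
```

Run something like this after each release. Any new query shape at the top of the list is the one that needs attention first.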

When should I think about upgrading hardware?

If your code is stable, but you’re seeing a steadily rising line on the server’s load average, *THEN* go up in hardware. A rising load average means CPU & disk are being taxed. The server can’t keep up.
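If you want something firmer than eyeballing a graph, one rough way to judge a steady rise is to fit a straight line to recent load average samples and look at the slope. The sample values and the 0.05 threshold below are assumptions of mine for illustration, not recommended numbers.

```python
def load_slope(samples):
    """Least-squares slope of evenly spaced samples (change per interval)."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def steadily_rising(samples, min_slope=0.05):
    """True when the fitted trend climbs by at least min_slope per interval."""
    return len(samples) >= 3 and load_slope(samples) >= min_slope


# Hypothetical 1-minute load averages collected once an hour,
# for example from os.getloadavg()[0] on the database server.
flat_day = [1.9, 2.1, 2.0, 2.2, 1.8, 2.0]
busy_day = [2.0, 2.6, 3.1, 3.9, 4.4, 5.2]

print("flat day rising?", steadily_rising(flat_day))
print("busy day rising?", steadily_rising(busy_day))
```

A trend like busy_day on a release that did not change is the signal to look at hardware. The same trend right after a deploy points back at the queries.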

Related: Should I use RDS or build a MySQL server on AWS?

Devops means work together!

I close with a final point. Devops means bringing dev & ops together! Don’t silo them off in different wings. Communicate. DBAs, it’s your job to educate developers about scalability and help with query tuning. Devs, profile new SQL code, test with large datasets & for god’s sake don’t use an ORM; it’s one of 5 things toxic to scalability. Run EXPLAIN and be sure to index all the right columns.
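Here is a self-contained sketch of the explain-then-index habit. To keep it runnable anywhere it uses SQLite from Python’s standard library, with EXPLAIN QUERY PLAN standing in for MySQL’s EXPLAIN; the table, column and index names are made up. The output format differs from MySQL, but the workflow is the same: look at the plan, add the index, look again.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as one string (last column is the detail)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

lookup = "SELECT * FROM users WHERE email = ?"
args = ("user42@example.com",)

plan_before = query_plan(conn, lookup, args)   # full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, args)    # now uses the index

print("before:", plan_before)
print("after: ", plan_after)
```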

Together we can tackle this scalability thing!

10 ways I avoid trouble in database operations

1. Avoid destructive commands

From time to time I’m working with new recruits and bringing them up to speed in operations. The first thing I emphasize is care with destructive commands.

What do I mean here? Well, there are all sorts of them. SQL commands such as DROP TABLE & DROP DATABASE, but also TRUNCATE and DELETE, are all destructive. They’re easy to execute but harder to undo. Think of all the steps it would take to restore from your backup.

If you are logged in as root there are many, many ways to shoot your own foot. I hope you know this, right? rm has lots of options that can be very difficult to step back from, like -r (recursive) and -f (force). Better to not use the command at all and just move the file or directory you’re working on by renaming it. You can always delete later.
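The rename-instead-of-delete habit can be wrapped in a tiny helper. This is a sketch under my own assumptions: the function name set_aside and the holding directory are hypothetical, and the demo runs inside a temporary directory so it touches nothing real.

```python
import shutil
import tempfile
import time
from pathlib import Path


def set_aside(path, holding_dir):
    """Move a file or directory into a holding area instead of deleting it."""
    source = Path(path)
    holding = Path(holding_dir)
    holding.mkdir(parents=True, exist_ok=True)
    # Timestamp the name so repeated moves of the same file don't collide.
    target = holding / f"{source.name}.{time.strftime('%Y%m%d-%H%M%S')}"
    shutil.move(str(source), str(target))
    return target


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    doomed = Path(workdir) / "old_export.csv"
    doomed.write_text("id,name\n")
    moved_to = set_aside(doomed, Path(workdir) / "to-delete")
    original_gone = not doomed.exists()
    copy_kept = moved_to.exists()

print("original gone:", original_gone, "copy kept:", copy_kept)
```

Clean out the holding area on a schedule, once you are sure nobody misses what is in it.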

2. Set your command prompts

When working on the command line, your prompt is crucial. You check it over and over to make sure you’re working on the right box. At the OS, your prompt can tell you if you’re root or not, what directory you’re sitting in, and what’s the hostname of the box. With a few different terminals open, it’s very easy to execute a heavy loading command or destructive command on the wrong box. Check thrice, cut once!

You can set your mysql prompt too. This can provide similar insurance. It can tell you your default database schema and the user you’re logged in as. Hostname or localhost too. It is one more piece in the risk aversion puzzle.

3. Perform backups & test them

I know I know, we’re all doing backups already. Well I sure hope so. But if you’re getting on a system for the first time, it should be your very first impulse to check and find out what types of backups are being done. If they’re not, you should set them up. I don’t care how big the database is. If it’s an obstacle, you need to sell or educate management on what might happen without them. Paint some ugly scenarios. It’s not always easy to see urgency in these things without a good war story or two.

We wrote a guide to using xtrabackup for hot backups. These can be done online even while your production database is serving customers, without table locking or other downtime.

4. Stay off production machines

This may sound funny to some of you, but I live by it. If it ain’t broke, don’t go and try to fix it! You don’t need to be on all these boxes all the time. That goes for other folks too. Don’t give devs access to every production box. Too many hands in the pie so to speak. Also limit root users. But again if those systems are running well, you don’t have to log in to them and poke around every five minutes. This just brings more chances for operator error.

5. Avoid change as much as possible

This one might sound controversial but it’s saved me more than once.

I worked at one firm a few years back managing the MySQL servers. The Oracle DBA was going on vacation for a few weeks so I was picking up the reins for a bit. I met with the DBA for some brain dump sessions, and he outlined the main things that can and do go wrong. He also asked that I avoid any table alterations.

Sure enough ten days into his vacation, a problem arose in the application. One page on the site was failing silently. There was a missing field which needed to be added. I resisted. A fight ensued. Suddenly a lot of money was at stake if this change wasn’t pushed through. I continued to resist. I explained that if such a change were not done correctly, it very likely would break replication, pushing a domino of other things to break and causing an unpredictable mess.

I also knew I only had to hold on for a few more days. The resident DBA would be returning and he could juggle the change. You see, Oracle was set up to use multi-master replication, so those changes needed to go through a rather complex process to be applied. Done incorrectly, the damage would have taken days to clean up and caused much more financial damage.

The DBA was very thankful for my resistance, and management somewhat magically found a solution to the application & edit problem.

Push back is very important sometimes.

Many of these ten tips are great characteristics to select for in the DBA hiring process. If you’re a candidate, emphasize your caution and track record with uptime. If you’re a manager, ask candidates about how they handle these situations. We wrote a MySQL DBA hiring guide too.


6. Monitor important things

You should monitor your OS syslog and MySQL error log for starters. Also watch your slow query log for new activity, analyze it, and send the reports along to devs. Monitor your partitions; you don’t ever want disks to fill up. Monitor load average, and have a check that the database login or some other simple transaction can succeed. You can even monitor your backups to make sure they complete without error. Use your judgement to decide what checks satisfy these requirements.
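As one example of such a check, here is a sketch of a partition monitor built on Python’s shutil.disk_usage. The 80% threshold and the mount points are assumptions for illustration. The usage parameter exists only so the demo can feed in fake numbers instead of reading real disks.

```python
import shutil
from collections import namedtuple


def partition_alerts(mount_points, warn_percent=80.0, usage=shutil.disk_usage):
    """Return (mount point, percent used) for partitions at or over the threshold."""
    alerts = []
    for mount in mount_points:
        stats = usage(mount)  # has .total, .used and .free, in bytes
        percent = 100.0 * stats.used / stats.total
        if percent >= warn_percent:
            alerts.append((mount, round(percent, 1)))
    return alerts


# Demo with fake numbers; in production call partition_alerts(["/", "/var/lib/mysql"]).
Usage = namedtuple("Usage", "total used free")
fake_disks = {
    "/": Usage(total=100, used=50, free=50),
    "/var/lib/mysql": Usage(total=200, used=190, free=10),
}
demo_alerts = partition_alerts(list(fake_disks), usage=fake_disks.get)
print(demo_alerts)
```

Wire the result into whatever alerting you already have, so a filling disk pages someone before MySQL stops writing.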

7. Use one or more slaves & checksum

MySQL slave databases are a great way to provide insurance. You can use a lagging slave to provide insurance against operator error, or one of those destructive commands we mentioned above. Have it lag a few hours behind so you’ll have that much insurance. At night this slave may be fresh enough to use for backups.

Also, when MySQL uses statement-based replication (the default in older versions), data can get out of sync over time. Those problems may or may not flag errors. So use a tool to compare your master and slave for data consistency. We wrote a howto on using checksums to do just that.
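The idea behind a checksum comparison can be sketched in a few lines of Python. This is only a conceptual illustration with made-up rows held in memory: it hashes each table’s rows on both sides and reports tables whose hashes differ. Real tools work in chunks against live servers and are what you should use in production.

```python
import hashlib


def table_checksum(rows):
    """Hash rows in order; both servers must return them in the same order."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(tuple(row)).encode("utf-8"))
    return digest.hexdigest()


def tables_out_of_sync(master, slave):
    """Names of tables whose checksum differs between the two servers."""
    return sorted(
        name for name, rows in master.items()
        if table_checksum(rows) != table_checksum(slave.get(name, []))
    )


# Made-up rows standing in for "SELECT ... ORDER BY id" results from each server.
master_data = {
    "users": [(1, "ann"), (2, "bob")],
    "orders": [(10, 1, 19.99), (11, 2, 5.00)],
}
slave_data = {
    "users": [(1, "ann"), (2, "bob")],
    "orders": [(10, 1, 19.99), (11, 2, 4.00)],  # one drifted row
}
drifted = tables_out_of_sync(master_data, slave_data)
print("out of sync:", drifted)
```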

8. Be very careful of automatic failover

Automation is wonderful when it works. We dream of a data center that works like clockwork, with robots that never sleep. We can work towards this ideal, and in some cases get close. But it’s important to also understand that failure is by nature *not* what we predicted. The myriad ways that complex systems can fail boggles the mind, and surprises even seasoned veterans of operations. So maintain a healthy suspicion of this type of automation. Understand that if you automate things to happen in this crucial time, you can potentially put yourself in an even *more* compromised position than simply failing.

Sometimes monitoring, alerting, and manual intervention are the more prudent path. Your mileage may vary of course.

9. Be paranoid

It takes many years of doing ops to realize you can never be paranoid enough. Already checked that you’re on the right host, and about to execute some command? Quit the shell prompt and check again. Go back and ask the team if that table really needs to be dropped. Try to rephrase what you’re about to do in different words. Email out again to the team and wait some time before you pull the trigger. Check one more time that you have a fresh backup.

Delay that destructive command as long as you possibly can.

10. Keep it simple

I know I know, we all want to use that new command or tool, or jump on the latest hardware and take it for a spin. We want to build beautiful architectures that perform great feats of magic. But the fewer moving parts, the fewer things that can go wrong. And in ops, your job is stability and availability. Can you avoid using multi-master replication and go with just basic master-slave replication in MySQL? That’s simpler. Can you have fewer schemas or fewer filter rules? Can you skip the complicated HA layer, and use monitoring and manual failover?

Macrowikinomics book review by Tapscott & Williams

Macrowikinomics follows on the success of the best selling Wikinomics. It hits on a lot of phenomenal success stories, such as the Linux project, which has, over a roughly twenty-year history, produced 2.7 million lines of code per year and would have cost an estimated 10.8 billion dollars (that’s billion with a b) to create by conventional means. What’s more, it’s estimated the Linux economy is roughly 50 billion. With huge companies like Google and Amazon Web Services built on datacenters driven principally by Linux, it’s no wonder.

They also draw on the successes of companies like Local Motors who use collaboration and the internet in new and innovative ways.

In total this book speaks to the disruptive power of the internet and new technologies, and offers a lot of hopeful stories and optimism about where they are taking us.  Food for thought.

Configuration Management – What is it and why is it important?

Every software service or component on a server requires configurations. In your desktop applications you set preferences for what your default page will be, how you’d like your margins set, or whether to save and restore cookies each time you restart.

Enterprise applications also require complex configuration settings. Want to monitor a webserver and a database with Nagios? That’s set in the config file. Want to start MySQL with 8G of memory for InnoDB? That’s also set in a config file. What’s more, config files contain server-specific settings, based on IP address or the server’s role, webserver or database for example. The webserver may also have memcache and outbound email services running.

With more traditional deployments, the systems administrator will set up each physical box, and configure those services based on the business needs. As you bring tens or hundreds of servers online, however, you can quickly see how labor intensive this process would be, and also how much redundancy there is.

Enter configuration management. Previously I blogged about tools like Puppet that can bring great new best practices to the table. There is also CFEngine, and the newer Chef, which brings cloud deployments into the mix as well. Configuration management allows you to remotely administer servers, install packages, manage dependencies, install configurations based on a central copy, and even define roles and templates for new servers. This brings a whole new level of professionalism to deployments, and also newfound power and flexibility.
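To give a feel for configurations based on a central copy plus roles, here is a toy Python sketch that renders a MySQL config from one template. The roles, IP address and the 512M staging value are made up; tools like Puppet and Chef do this with their own template systems, plus package installs and much more.

```python
from string import Template

# Central copy of the config; $ip and $buffer_pool are filled in per server.
MY_CNF = Template(
    "[mysqld]\n"
    "bind-address = $ip\n"
    "innodb_buffer_pool_size = $buffer_pool\n"
)

# Hypothetical roles and the settings each role implies.
ROLE_SETTINGS = {
    "database": {"buffer_pool": "8G"},
    "staging": {"buffer_pool": "512M"},
}


def render_config(role, ip):
    """Render the central template for one server, based on its role and IP."""
    return MY_CNF.substitute(ROLE_SETTINGS[role], ip=ip)


production_cnf = render_config("database", "10.0.0.5")
print(production_cnf)
```

Change the template once and every server of that role picks it up on the next run, instead of someone editing files box by box.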

We’ll be writing more about configuration management, especially in the context of cloud deployments such as Amazon EC2 so please stay tuned.

Sean Hull asks on Quora – What is configuration management and why is it important?

Open Source – What is it and why is it important?

Open Source is a term understood well by the technology set, but not well enough by everyone else.

Open Source for the software industry is like generic drugs for the pharmaceutical industry. It enables more players to come to the table, and it is a huge driving force behind internet infrastructures, which are built on Linux, Apache and many other technologies. It is the backbone of companies like Google, and facilitates cloud services from the likes of Amazon EC2, Joyent, Rackspace and many others.

It is the rising tide that lifts all boats, if you will.

Sean Hull’s writing on Quora.

Dummy's Guide to Linux firewalls

Security experts will probably tell you it’s not a good idea to be a dummy and also in charge of your own firewall. They’re probably right, but it’s a catchy title. In this article, I’ll quickly go over some common firewall rules for iptables under Linux.

First things first. If you don’t have the right kernel, you’re not going to get anywhere. A quick way to find out if all the right pieces are in place is to try to load the iptables kernel module.

$ modprobe iptable_nat

If you get errors, you may need to compile various support into your kernel, and of course you may need to compile the iptable_nat module itself. The easiest way is to download the kernel source RPM for your installed distribution, and do ‘make menuconfig’ with its default configuration. That way all the things that are currently working with your kernel won’t break when you forget to select them. For details see the Linux Firewall using IPTables HOWTO.

Once the module is loaded, start the service:

$ /etc/rc.d/init.d/iptables start

You will also have to have your interfaces up. I did this as follows:

# startup dhcp

/usr/sbin/dhcpd eth0

# bring up twc cable connection to internet

ifup eth1

You’ll need to set some rules. Be sure to get your internet interface and local network interface right in these commands. First, set up masquerading, which allows multiple machines behind your firewall to share the single dynamically assigned IP address from your internet provider:

$ iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE

On my firewall, eth1 is the device which talks to the ISP, and gets the IP address we’ll use on the internet. The other interface, eth0 is for my local internal network.

Next, if you have a VPN connection to your office, be sure to allow IPsec VPN traffic through the firewall, that is IP protocols 50 (ESP) and 51 (AH) plus UDP port 500 (IKE):

iptables -A INPUT -s -p 50 -j ACCEPT

iptables -A INPUT -s -p 51 -j ACCEPT

iptables -A INPUT -s -p udp --dport 500 -j ACCEPT

Lastly enable ip forwarding:

echo 1 > /proc/sys/net/ipv4/ip_forward

Of course you don’t really want to be a dummy forever, so you should read up on the Linux Firewall HOWTO and other Linux docs.