How to build an operational datastore on AWS with S3 & Redshift

via GIPHY

You’re building your data warehouse, and getting data into Redshift. You’ve got your ETL pipeline running, and presentation layer talking to the warehouse. Great.

But how to get access to that source data? Wouldn’t it be nice if that was close by too?

Join 35,000 others and follow Sean Hull on twitter @hullsean.

It may be you have 10-zillion rows of source data and don’t want or need to get all of that into Redshift and keep it there. But it would be nice to have access to it when you do.

Enter EXTERNAL tables, aka Spectrum. Now you can keep all your raw data in S3, an in place operational datastore of data before it’s been reworked and transformed. Use SQL to access it right where it sits.

Get all the advantages of lifecycle management in S3, and don’t pay all the redshift costs for data you don’t need all the time. Cool!

Let’s see how it works.

What is an EXTERNAL table?

Spectrum is Amazon’s rebranding of an old database technology called EXTERNAL TABLES. Back in the 90’s Oracle pioneered this work, allowing you to essentially map a CSV file, that sits outside the database proper. This means you can query all that juicy data sitting in flat files. Cool!

Athena allows you to query this stuff as a service, native to AWS. Spectrum allows you to create those external tables inside of Redshift.

Also: Top serverless interview questions for hiring aws lambda experts

Give Redshift permissions

Go into IAM and create a new role called “SeanSpectrumRole”. Assign the policy AmazonS3ReadOnlyPolicy. It looks like this:


{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:Get*",
"s3:List*"
],
"Resource": "*"
}
]
}

If you’re using the dashboard you just pick the policy from the named list. However if you’re using CloudFormation, you’ll use the code above.

Now navigate your aws console to the Redshift dashboard, click clusters, and click the checkbox for your cluster. Probably there’s only one.

Now click the “Manage IAM Roles” button, and a dialog should popup.. Select the role you created earlier, SeanSpectrumRole. Then click “Apply Changes”.

The beauty of the AWS world is that servers themselves can have API permissions. In this case we gave the redshift cluster or server itself, access to S3 for our use below!

Related: Which engineering roles are in greatest demand?

Create your spectrum schema

First you must create a spectrum schema. Here’s the syntax:


create external schema spectrum
from data catalog
database 'sean'
region 'us-east-1'
iam_role 'arn:aws:iam::9999999999999:role/SeanSpectrumRole';

Read: Can on-demand consulting save startups time & money?

Upload your data to S3 bucket

Here we create an s3 bucket called sean_spectrum, then upload one csv file named sean_numbers.txt.


$ aws s3api create-bucket --bucket sean_spectrum --region us-east-1
{
"Location": "/sean_spectrum"
}
$ cd spectrum/
$ cat sean_numbers.txt
21,Dr.,Who,44-22-55-77-88
35,Bat,Man,317-222-4777
15,Wonder,Woman,999-324-7878
99,Storm,Cloud,367-399-6767
75,Marvel,Girl,222-333-9595
32,Quick,Silver,22-33-77-99
12,Scarlet,Witch,23-35-47-555
$ aws s3 cp sean_numbers.txt s3://sean_spectrum/
upload: ./sean_numbers.txt to s3://sean_spectrum/sean_numbers.txt
$ aws s3 ls s3://sean_spectrum/
2017-05-18 20:28:41 193 sean_numbers.txt
$

Note the names. The table name won’t turn out to be sean_numbers. It will be called sean_spectrum, and all files inside that directory will be queried. So make sure they have consistent formats!

Also: 30 questions to ask a serverless fanboy

Create & query your external table

Here’s how you create your external table. Note this is just a map to data. The data is still stored in S3. it is not brought into Redshift except to slice, dice & present.


mydb=# create external table spectrum_schema.sean_numbers(
id int,
fname string,
lname string,
phone string)
row format delimited
fields terminated by ','
stored as textfile
location 's3://sean_spectrum/';

Here’s how you query it:


mydb=# select * from spectrum_schema.sean_numbers order by id;
id | fname | lname | phone
----------------+---------------
12 | Scarlet | Witch | 23-35-47-555
15 | Wonder | Woman | 999-324-7878
21 | Dr. | Who | 44-22-55-77-88
32 | Quick | Silver | 22-33-77-99
35 | Bat | Man | 317-222-4777
75 | Marvel | Girl | 222-333-9595
99 | Storm | Cloud | 367-399-6767

Cool. We reordered data read from an S3 file!!!

Although you can’t create a view over a redshift table *AND* an S3 external table, you can query them together.

So for example if I have a table in redshift with addresses, I can join them together:

mydb=# select a.id, a.fname, a.lname, b.address from spectrum_schema.sean_numbers a, sean_addresses b
where a.id = b.id order by id;

id | fname | lname | phone | address
----------------+----------------------------
12 | Scarlet | Witch | 23-35-47-555 | 10 main st
15 | Wonder | Woman | 999-324-7878 | 25 center st
21 | Dr. | Who | 44-22-55-77-88 | 32 broadway
32 | Quick | Silver | 22-33-77-99 | 1 first st
35 | Bat | Man | 317-222-4777 | 99 west st
75 | Marvel | Girl | 222-333-9595 | 66 East Ave
99 | Storm | Cloud | 367-399-6767 | 50 North st

Also: What can startups learn from the DYN DNS outage?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

A roughneck walk down database alley

via GIPHY

I was just responding to some Disqus comments on a recent blog post. Admittedly it had a provocative title Will SQL databases just die already. What do you think?

Join 34,000 others and follow Sean Hull on twitter @hullsean.

A reader pointed out that some No-SQL databases do support joins. Huh? My face contorts in total confusion. How? Why?

For years SQL was misunderstood & unloved

Relational databases have been around for decades. 43 years according to the StackExchange article. That’s a lot of years. I’ve spent a few years as a dba, aka database administrator. The role can be distilled down as a herder of sorts. Keep all the data bits in the right boxes with the right labels. A digital librarian, that makes sure the books don’t get lost.

Of course patrons don’t always put books back where they should, and strict rules get put in place to avoid losing that one volume of shakespear in miles of shelves.

In the fast moving world of web + mobile, product is king, and agile rules the day. And anything that can make us more agile also wins. SQL, much maligned & misunderstood, was not one of those things.

Also: Top serverless interview questions for hiring aws lambda experts

NoSQL burst on the scene with much fanfare

With all that pressure, it was no wonder engineers thought, there must be a better way. Then along comes the No SQL database. I mean just the name speaks volumes about the design goal.

We’ll sacrifice anything, just please don’t make me write SQL!

The promises…

1. Never have to deal with pesky SQL that we don’t understand!
2. Interact with the database like any other data structure in our code!
3. Be schemaless! Crotchety Database Administrators be damned!
4. Be distributed. Be everywhere consistent! Be indistinguishable from magic!
5. Always be fast.

In that rush into the abyss, we lost track of durability. And down the rabbithole we went!

Related: Which engineering roles are in greatest demand?

Relational databases tried to be key-value

Then I started hearing about crazy things, like MySQL providing a memcache plugin, so you could use it albeit lightening fast, as a key-value store. You could sidestep that pesky SQL engine, and get right down to the bare metal. But why? Memcache & Redis were already doing that & purpose built. Why indeed?

I started to argue maybe we shouldn’t be muddying the waters. I mean stick with what you know!

Read: Can on-demand consulting save startups time & money?

War was won, success declared

Around this I think was when Mongodb was declaring the war won. We had finally left SQL databases in the dustheap of history. It may or may not have inspired this popular youtube skit…

Also: 30 questions to ask a serverless fanboy

Meanwhile hadoop is losing ground. Bigquery & Redshift both speak SQL

But then something funny started to happen. It seemed there was a backlash against Mongodb. A lot of customers were losing data. (Yep that’s what durability means guys…) And the hype started reversing. Even the mighty hadoop has been losing popularity of late. How long does it take to write an EMR job versus an SQL query. Let’s be honest?

I asked myself, Is Hadoop losing ground to SQL warehouses like Redshift & Bigquery?. I wonder.

Also: What can startups learn from the DYN DNS outage?

NoSQL databases are looking for JOINs?

Recently I bumped into some interesting blog comments & discussions about how Orientdb was trying to add joins to their product.

As certain relational databases try to become No SQL databases, other No SQL databases are trying to add more complex SQL, because well somehow their product is missing something.

Also: What can startups learn from the DYN DNS outage?

Engineering truth versus fashion

43 years is a lot of years. And when we drop all the fashion trends in tech, and the new database du jour, what do we find?

There is room for No SQL databases. Yep. And the do certain things, and solve certain types of problems well. But their not general workhorses, nor can they slice and dice your data however you like. And when you get to that point in your project, you’re going to want to ask interesting questions of your data.

And surely that’s where SQL excels. It ain’t going anywhere, folks!

Also: What can startups learn from the DYN DNS outage?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Will SQL just die already?

With tons of new No-SQL database offerings everyday, developers & architects have a lot of options. Cassandra, Mongodb, Couchdb, Dynamodb & Firebase to name a few.

Join 33,000 others and follow Sean Hull on twitter @hullsean.

What’s more in the data warehouse space, you have Hadoop, which can churn through terabytes of data and get you results back before lunchtime!

So when I stumbled on this article SQL is 43 years old, I was intrigued.

Answer the questions you haven’t thought of

No-SQL databases are great if you know how you want to access the data. Users come from the users table, and that’s that!

But if later on you want to ask questions like, which users watched this video, which users are active, which users spent $100 in January? These questions may not be possible because NoSQL can’t join those other tables.

Relational databases shine when you need to aggregate your data, reorganize it, or ask unanticipated questions. And aren’t those most of the interesting questions?

Also: Top serverless interview questions for hiring aws lambda experts

Big Query, Redshift & even Hive speak SQL

I wrote that despite recent popularity in Hadoop, Redshift seems to be eating their lunch. And what would you know, surprise surprise, Amazon’s newish data warehousing solution, speaks SQL! What’s more there’s Apache Hive, which allows you to query Hadoop with, drumroll please… SQL!

Bigquery is the other major bigdata offering from none other than Google. And it too uses SQL!

Related: Which engineering roles are in greatest demand?

Still dominant

If you look at Stackoverflow’s developer survey, you’ll see that SQL is the second most popular language. Why might that be? For one thing it’s simple to learn. Enough that even business users can write simple requests, join & aggregate data.

Read: Can on-demand consulting save startups time & money?

Rugged, Proven & Open

SQL having been around so long is a fairly open standard. Sure there are extensions of it, but most of the basic stuff is there in all the products. That means you learn it once, and can interact with databases across the spectrum. That’s a win for everybody.

Also: 30 questions to ask a serverless fanboy

Business users can write it

Another under appreciated feature though is that basic queries are easy to write. They don’t require complex syntax like a hadoop job, or your favorite imperative programming language. The queries are readable, almost english-like sentences.

Given all that, it seems SQL is likely to be around for a long time to come!

Also: What can startups learn from the DYN DNS outage?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Is Amazon about to disrupt your data warehouse?

via GIPHY

Amazon is about to launch a product called glue. As you can see below, this is the last piece in the data warehousing puzzle. With that in place, Amazon will own you! Or at least have push button products to meet all of enterprises varying needs.

Even if you’re a small startup, you can do big-shot big enterprise data warehousing. That means everyone can use cutting edge data driven techniques for product & business decisions.

Join 33,000 others and follow Sean Hull on twitter @hullsean.

What is Redshift

Redshift is like the OLAP databases of years past, the Oracle’s of the world purpose built for warehousing data. Obviously without the crazy licensing model Oracle was famous for. With Amazon you can get enterprise class data warehouse for modest hourly prices.

If my recent conversations with recruiters about Redshift demand are any indication, there’s been a sudden uptick in startups looking for redshift expertise.

Also: Top serverless interview questions for hiring aws lambda experts

What is Spectrum?

Spectrum is a very new extension of Redshift allowing you to access & query S3 file data directly. This means you can have petabytes of data that you can access pre-load time. So you will ETL and load portions of it, but with Spectrum you can still access the offline data too.

In the old Oracle days this was called an EXTERNAL TABLE. I mention this only to say that Amazon isn’t doing anything that hasn’t been done before. Rather they’re bringing these advanced features within reach of everyday startups. That’s cool.

Related: Which engineering roles are in greatest demand?

What is glue?

Glue is still in beta, but if the RE:Invent talk above is any indication, it’s set to disrupt an entire industry. Wow!

Glue first catalogs your data sources. What does this mean, it scans them & models their schemas.

It then generates sample python ETL code. Modify it, or write your own. Share your code on Git. Or borrow other open source pieces, that already address your specific ETL use case!

Lastly it includes a job scheduler which handles dependencies. Job A must be completed before B can run and so forth. Error handling & logging are also all included.

Since these are native Amazon services, of course they’re going to integrate with their dangerously fast Redshift warehouse.

Read: Can on-demand consulting save startups time & money?

What is serverless?

I’ve written about how to throw fastballs at a serverless fanboy and even how to hire a serverless expert. But really what is it?

Serverless means deploying functions directly into the cloud. No servers, no configuration. All the systems administration & automation is hidden. No more devops to argue with! Amazon’s own offering is called Lambda.

Also: 30 questions to ask a serverless fanboy

What is Quicksight?

Amazon’s even jumped into the fray at the presentation layer. Quicksight is a BI tool along the lines of mode, domo, looker or Tableau.

Now it’s possible to stay completely within the cozy Amazon ecosystem even for business insight and analytics.

Also: What can startups learn from the DYN DNS outage?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Top questions to ask a devops expert when hiring or preparing for job & interview

xkcd_goodcode
Strip by Randall Munroe; xkcd.com

Whether your a hiring manager, head of HR or recruiter, you are probably looking for a devops expert. These days good ones are not easy to find. The spectrum of tools & technologies is broad. To manage today’s cloud you need a generalist.

Join 33,000 others and follow Sean Hull on twitter @hullsean.

If you’re a devops expert and looking for a job, these are also some essential questions you should have in your pocket. Be able to elaborate on these high level concepts as they’re crucial in todays agile startups.

Check out: 8 questions to ask an aws ec2 expert

Also new: Top questions to ask on a devops expert interview

And: How to hire a developer that doesn’t suck

1. How do you automate deployments?

A. Get your code in version control (git)

Believe it or not there are small 1 person teams that haven’t done this. But even with those, there’s real benefit. Get on it!

B. Evolve to one script push-button deploy (script)

If deploying new code involves a lot of manual steps, move file here, set config there, set variable, setup S3 bucket, etc, then start scripting. That midnight deploy process should be one master script which includes all the logic.

It’s a process to get there, but keep the goal in sight.

C. Build confidence over many iterations (team process & agile)

As you continue to deploy manually with a master script, you’ll iron out more details, contingencies, and problems. Over time You’ll gain confidence that the script does the job.

D. Employ continuous integration Tools to formalize process (CircleCI, Jenkins)

Now that you’ve formalized your deploy in code, putting these CI tools to use becomes easier. Because they’re custom built for you at this stage!

E. 10 deploys per day (long term goal)

Your longer term goal is 10 deploys a day. After you’ve automated tests, team confidence will grow around developers being able to deploy to production. On smaller teams of 1-5 people this may still be only 10 deploys per week, but still a useful benchmark.

Also: Top serverless interview questions for hiring aws lambda experts

2. What is microservices?

Microservices is about two-pizza teams. Small enough that there’s little beaurocracy. Able to be agile, focus on one business function. Iterate quickly without logjams with other business teams & functions.

Microservices interact with each other through APIs, deploy their own components, and use their own isolated data stores.

Function as a service, Amazon Lambda, or serverless computing enables microservices in a huge way.

Related: Which engineering roles are in greatest demand?

3. What is serverless computing?

Serverless computing is a model where servers & infrastructure do not need to be formalized. Only the code is deployed, and the platform, AWS Lambda for example, takes care of instant provisioning of containers & VMs when the code gets called.

Events within the cloud environment, such a file added to S3 bucket, trigger the serverless functions. API Gateway endpoints can also trigger the functions to run.

Authentication services are used for user login & identity management such as Auth0 or Amazon Cognito. The backend data store could be Dynamodb or Google’s Firebase for example.

Read: Can on-demand consulting save startups time & money?

4. What is containerization?

Containers are like faster deploying VMs. They have all the advantages of an image or snapshot of a server. Why is this useful? Because you can containerize your microservices, so each one does one thing. One has a webserver, with specific version of xyz.

Containers can also help with legacy applications, as you isolate older versions & dependencies that those applications still rely on.

Containers enable developers to setup environments quickly, and be more agile.

Also: 30 questions to ask a serverless fanboy

5. What is CloudFormation?

CloudFormation, formalizes all of your cloud infrastructure into json files. Want to add an IAM user, S3 bucket, rds database, or EC2 server? Want to configure a VPC, subnet or access control list? All these things can be formalized into cloudformation files.

Once you’ve started down this road, you can checkin your infrastructure definitions into version control, and manage them just like you manage all your other code. Want to do unit tests? Have at it. Now you can test & deploy with more confidence.

Terraform is an extension of CloudFormation with even more power built in.

Also: What can startups learn from the DYN DNS outage?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Top Amazon Lambda questions for hiring a serverless expert

via GIPHY

If you’re looking to fill a job roll that says microservices or find an expert that knows all about serverless computing, you’ll want to have a battery of questions to ask them.

Join 33,000 others and follow Sean Hull on twitter @hullsean.

For technical interviews, I like to focus on concepts & the big picture. Which rules out coding exercises or other puzzles which I think are distracting from the process. I really like what what the guys at 37 Signals say

“Hire for attitude. Train for skill.”

So let’s get started.

1. How do you automate deployment?

Programming lambda functions is much like programming in other areas, with some particular challenges. When you first dive in, you’ll use the Amazon dashboard to upload a zipfile with your code. But as you become more proficient, you’ll want to create a deployment pipeline.

o What features in Amazon facilitate automatic deployments?

AWS Lambda supports environment variables. Use these for credentials & other data you don’t want in your deployment package.

Amazon’s serverless offering, also supports aliases. You can have a dev, stage & production alias. That way you can deploy functions for testing, without interrupting production code. What’s more when you are ready to push to production, the endpoint doesn’t change.

o What frameworks are available for serverless?

Serverless Framework is the most full featured option. It fully supports Amazon Lambda & as of 1.0 provides support for other platforms such as IBM Openwhisk, Google Cloud Functions & Azure functions. There is also something called SAM or Serverless Application Model which extends CloudFormation. With this, you can script changes to API Gateway, Dynamo DB & Cognos authentication stuff.

If you’re using Auth0 instead of Cognito or Firebase instead of Dynamodb, you’ll have to come up with your own way to automate changes there.

Also: Is the difference between dev & ops a four-letter word?

2. What are the pros of serverless?

Why are we moving to a serverless computing model? What are the advantages & benefits of it?

o easier operations means faster time to market
o large application components become managed
o reduced costs, only pay while code is running
o faster deploy means more experimentation, more agile
o no more worry about which servers will this code run on?
o reduced people costs & less infrastructure
o no chef playbooks to manage, no deploy keys or IAM roles

Related: Is automation killing old-school operations?

3. What are the cons of serverless?

There are a lot of fanboys of serverless, because of the promise & hope of this new paradigm. But what about healthy criticism? A little dose of reality can identify a critical & active mind.

o With Lambda you have less vendor control which could mean… more downtime, system limits, sudden cost changes, loss of functionality or features and possible forced API upgrades. Remember that Amazon will choose the needs of the many over your specific application idiosyncracies.

o There’s no dedicated hardware option with serverless. So you have the multi-tenant challenges of security & performance problems of other customers code. You may even bump into problems because of other customers errors!

o Vendor lock-in is a real obvious issue. Changing to Google Cloud Functions or Azure Functions would mean new deployment & monitoring tools, a code rewrite & rearchitect, and new infrastructure too. You would also have to export & import your data. How easy does Amazon make this process?

o You can no longer store application & state data in local server memory. Because each instantiation of a function will effectively be a new “server”. So everything must be stored in the database. This may affect performance.

o Testing is more complicated. With multiple vendors, integration testing becomes more crucial. Also how do you create dev db instance? How do you fully test offline on a laptop?

o You could hit system wide limits. For example a big dev deploy could take out production functions by hitting an AWS account limit. You would thus have DDoS yourself! You can also hit the 5 minute execution time limit. And code will get aborted!

o How do you do zero downtime deployments? Since Amazon currently deploys function-by-function, if you have a group of 10 or 20 that act as a unit, they will get deployed in pieces. So your app would need to be taken offline during that period or it would be executing some from old version & some from new version together. With unpredictable results.

Read: Do managers underestimate operational cost?

4. How does security change?

o In serverless you may use multiple vendors, such as Auth0 for authentication, and perhaps Firebase for your data. With Lambda as your serverless platform you now have three vendors to work with. More vendors means a larger area across which hackers may attack your application.

o With the function as a service application model, you lose the protective wall around your database. It is no longer safely deployed & hidden behind a private subnet. Is this sufficient protection of your key data assets?

Also: Is the difference between dev & ops a four-letter word?

5. How do you troubleshoot & debug microservices?

o Monitoring & debugging is still very limited. This becomes a more complex process in the serverless world. You can log error & warning messages to CloudWatch.

o Currently Lambda doesn’t have any open API for third party tooling. This will probably come with time, but again it’s hard to see & examine a serverless function “server” while it is running.

o For example there is no New Relic for serverless.

o Performance tuning may be a bit of a guessing game in the serverless space right now. Amazon will surely be expanding it’s offering, and this is one area that will need attention.

Also: Is the difference between dev & ops a four-letter word?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

What engineering roles are most in demand at startups?

via GIPHY

I was just reading over StackOverflow’s 2017 Developer survey. As it turns out there were some surprising findings.

Join 33,000 others and follow Sean Hull on twitter @hullsean.

One that stood out was databases. In the media, one hears more and more about NoSQL databases like Cassandra, Dynamo & Firebase. Despite all that MySQL seems to remain the most popular database by a large margin. Legacy indeed!

1. Databases

MySQL is still the most popular db by a large margin 56%. Followed by SQL Server 39%, SQLite 27% and Postgres 27%.

Related: Is Amazon too big to fail?

2. Most popular language

Javascript sits at number one for Web developers, sysadmins & Data Scientists alike. Followed by SQL.

Read: Are SQL Databases dead?

3. Most popular framework

Node.js at 47%. It’s followed by AngularJS at 44%.

Also: 5 ways to move data to Amazon Redshift

4. Most loved database

Redis sits at number one here at 65%, followed by Postgres & Mongo.

Also: Myth of five nines – why HA is overrated

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Is on-demand consulting the answer to your hiring woes?

via GIPHY

A consultant costs more per hour than a developer you can hire right? That depends!

Join 33,000 others and follow Sean Hull on twitter @hullsean.

A big firm may cost a few thousand per day. But a smaller firm or one-man shop can bring you savings in line with a team hire.

1. You’re still looking

Have you been looking for 3 months? 6 months? You might find someone. But maybe never? I

Also: Is the difference between dev & ops a four-letter word?

2. On-boarding takes forever

***

Related: Is automation killing old-school operations?

3. Fulltime hires quit

I’ve worked at a few firms where the fulltime hires quit within a few months. Why? One was a very mismanaged team. They were juggling a lot of technical debt & lacked leadership direction. Devs were frustrated and morale was suffering.

At another firm the CTO left. A new one replaced him who started throwing his weight around. Many of the old team members got fed up & left.

In all these cases a consultant will still be there, working day-by-day, getting things done. I wrote about this How do we measure devotion.

Read: Do managers underestimate operational cost?

4. Halftime need

Smaller demand? Perhaps your capacity isn’t a full 40-hour week. Then an on-demand hire is really ideal.

Also: Is the difference between dev & ops a four-letter word?

5. Hit the ground running

Of course the biggest advantage is quicker on-boarding. You can expect productive work right away. That’s because a solo consultant has a lot of experience jumping right into the fray, and making an impact right away.

Also: Is the difference between dev & ops a four-letter word?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Key lessons from the Devops Handbook

I picked up a copy of the DevOps Handbook.

This is not a book about how to setup Amazon servers, how to use git, codePipeline or Jenkins. It’s not about Chef or Ansible or other tools.

Join 33,000 others and follow Sean Hull on twitter @hullsean.

This is a book about processes & people. It’s about how & why automation & world-class infrastructure will make your business more agile, raise quality & increase productivity.

1. Infrastructure in version control

With technologies like Terraform and CloudFormation, the entire state of your infrastructure can be captured. That means you can manage it just like any other code.

Also: Myth of five nines – Why high availability is overrated

2. Pushbutton builds

You’ve heard it before. Automate your builds. That means putting everything in version control, from environment building scripts, to configs, artifacts & reference data. Once you can do that, you’re on your way to automating production deploys completely.

Related: 5 ways to move data to amazon redshift

3. Devs & Ops comingled

In the devops world, devs should learn about operations, infrastructure, performance & more. What’s more operations teams should work closely with devs.

Read: Why were dev & ops siloed job roles?

4. Servers as cattle not pets

In the old days, we logged into servers & provided personal care & feeding. We treated them like pets.

In the new world of devops, we should treat servers like cattle. When it begins to fail, take it out back and shoot it. (tbh i don’t love the analogy, but it carries some meaning…)

Also: Are SQL databases dead?

5. Open to learnings & failures

Organizations that are open to failures, without playing the blame game, learn quicker & recover from problems faster.

Also: Is Amazon too big to fail?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Essential links this week

via GIPHY

Here’s some links & interesting stuff I’ve stumbled on this week. Enjoy!

Join 33,000 others and follow Sean Hull on twitter @hullsean.

1. Start coding

Looking to start coding? Take a look at Open source for beginners. It’s a graphical list of projects on github, great for beginners!

Also: 30 questions to ask a serverless fanboy

2. DIY Serverless

Interested in serverless & wanna dig past the hype? Take a look at this Functions as a Service howto which shows how to build lambda type offering in Kubernetes or Docker Swarm. Cool yo!

Related: Learning from the Dyn DNS outage

3. Serverless Use Cases

Curious when & where Amazon Lambda might make sense? Any and all microservices? Here’s a newstack article on viable use cases for serverless computing.

Read: Does Amazon Redshift have a dirty little secret?

4. Origami design software

Random, weird, and kinda cool! Robert Lang has designed some Origami software called TreeMaker. It replaces the pencil & paper method of designing new origami figures. Use the software to push the limits of paper folding further!

Also: My DIY Disqus.com hack for blog discovery

5. A distributed relational database that works?

Bloomberg LP has designed a relational database called Comdb2. Unlike many of it’s NoSQL peers, this distributed database is relational, speaks SQL, and is also highly available. Amazing!

Also: Are SQL databases dead?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters