Category Archives: CTO/CIO

Does FedRAMP formalize what good devops already do?

Amazon’s GovCloud provides a specialized region within Amazon’s global footprint of datacenters. These are hosted within the United States, and provide a subset of the full Amazon cloud functionality.

Join 32,000 others and follow Sean Hull on twitter @hullsean.

However, hosting within GovCloud is not the whole story. Beyond this, you’ll want to implement FedRAMP compliant procedures & policies.

Are these policies new? As a seasoned systems administrator of Unix & Linux networks, you’ll likely find these very familiar best practices. What FedRAMP does, however, is formalize them into a set of procedures for testing compliance. And that’s a good thing.

1. Use a bastion box

A bastion box is a single point of entry for all your SSH activity. Instead of allowing SSH access to any of your servers from *anywhere* on the internet, you limit it to one box. This box is hardened with multi-factor authentication for security, only opens port 22, monitors & logs access, and funnels movement to all your other boxes. Thus you gain a virtual perimeter that you’re already familiar with in more traditional firewall setups.
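One way to keep that perimeter honest is to audit your firewall rules in code. The sketch below is only an illustration: the rule format (a dict with `port` and `source` keys) and the bastion address are assumptions made up for the example, not the AWS API. It flags any rule on your other servers that opens port 22 to something other than the bastion.

```python
import ipaddress

# Hypothetical address of the bastion box; an assumption for illustration.
BASTION_CIDR = "10.0.0.5/32"


def ssh_rule_violations(rules, bastion_cidr=BASTION_CIDR):
    """Return inbound rules that open port 22 to anything except the bastion."""
    bastion = ipaddress.ip_network(bastion_cidr)
    violations = []
    for rule in rules:
        if rule["port"] != 22:
            continue
        source = ipaddress.ip_network(rule["source"])
        # Allowed only if the source range sits entirely inside the bastion range.
        if not source.subnet_of(bastion):
            violations.append(rule)
    return violations


rules = [
    {"port": 22, "source": "0.0.0.0/0"},    # SSH open to the world: flagged
    {"port": 22, "source": "10.0.0.5/32"},  # SSH from the bastion: fine
    {"port": 443, "source": "0.0.0.0/0"},   # HTTPS is not our concern here
]
for bad in ssh_rule_violations(rules):
    print("SSH exposed to", bad["source"])
```

In a real setup you would feed this from whatever describes your security groups, and run it as part of a regular compliance check.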

Also: Ward Cunningham explains the high cost of technical debt (video)

2. Monitor & scan for vulnerabilities

Monitoring, scanning & logging are all key facilities for security management. Regular patch management of each of your servers is essential to protect from newly discovered vulnerabilities. FedRAMP also requires scanning by tools such as Nessus or Retina.

Also, centralizing your authorization, access & error logs allows easy monitoring & alerting of threats & improper access attempts.
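Once logs are in one place, alerting can start very simply. Here is a minimal sketch, assuming your central log contains the usual OpenSSH "Failed password" lines; the threshold and sample lines are made up for the example.

```python
import re
from collections import Counter

# Matches the "Failed password" lines OpenSSH writes to the auth log.
FAILED_LOGIN = re.compile(r"Failed password for (?:invalid user )?\S+ from (\S+)")


def suspicious_sources(log_lines, threshold=3):
    """Return source addresses with at least `threshold` failed SSH logins."""
    failures = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            failures[match.group(1)] += 1
    return {ip: count for ip, count in failures.items() if count >= threshold}


sample = [
    "sshd[101]: Failed password for root from 203.0.113.9 port 4242 ssh2",
    "sshd[102]: Failed password for invalid user admin from 203.0.113.9 port 4243 ssh2",
    "sshd[103]: Failed password for root from 203.0.113.9 port 4244 ssh2",
    "sshd[104]: Accepted publickey for deploy from 198.51.100.7 port 5151 ssh2",
    "sshd[105]: Failed password for deploy from 198.51.100.7 port 5152 ssh2",
]
print(suspicious_sources(sample))
```

A dedicated log management tool will do far more, but the idea is the same: count improper access attempts per source and raise a flag past a threshold.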

Related: Do managers underestimate the cost of operations?

3. Policy of least privilege

The policy of least privilege is an old friend in computing & managing unix systems. It means first to eliminate all privileges (default to none) and then grant only those a user requires to do his or her work.

In Amazon it means not using the root account for provisioning infrastructure. It means a clear separation of dev, test & production environments. It limits who can access production & especially who can make changes there. It limits who can see sensitive data.

As well, you’ll use Access Control Lists (ACLs) and security groups to control which servers can reach which other servers, who on the internet can touch specific servers & ports, and so forth. These are the Amazon Cloud equivalent of perimeter security you may be familiar with in more traditional firewalls.
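The "default to none" idea is easy to show in a few lines. This is a toy sketch with hypothetical users, environments and actions; it is not how IAM is implemented, just the shape of a default-deny check.

```python
# Hypothetical grants; anything not listed here is denied.
GRANTS = {
    "alice": {("staging", "deploy"), ("production", "read")},
    "bob": {("staging", "read")},
}


def is_allowed(user, environment, action, grants=GRANTS):
    """Default to deny: allow only an explicitly granted (environment, action)."""
    return (environment, action) in grants.get(user, set())


print(is_allowed("alice", "production", "read"))    # True
print(is_allowed("alice", "production", "deploy"))  # False
print(is_allowed("mallory", "staging", "read"))     # False
```

Note that an unknown user gets nothing at all, which is the point.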

Read: When hosting data on Amazon turns bloodsport

4. Encrypt your data

If you want to be truly secure, you’ll want to encrypt your data at rest. You can do this by using encrypted filesystems in Linux. That way data is in a digital envelope, even on disk. Only when data is read into memory is it unencrypted. This provides additional insurance, because your EBS snapshots, backups & so forth are all hidden from prying eyes.

Also: Why dropbox didn’t have to fail

5. Conclusion

Amazon’s GovCloud provides access to a subset of their cloud offerings, including EC2 (their Elastic Compute Cloud virtual servers), EBS (Elastic Block Store, their own storage area network), S3 for file storage, VPC, IAM, RDS, ElastiCache & Redshift.

FedRAMP formalizes what good systems administrators do already: secure systems, deliver reliability & high availability, & protect from unauthorized entry.

Also: Is Amazon too big to fail?

Get more. Grab our exclusive monthly Scalable Startups. We share tips and special content. Our latest Why I don’t work with recruiters

Why Dropbox didn’t have to fail

dropbox outage dec 2015

Dropbox is currently experiencing a *major* outage. See the dropbox status page to get an update.

I’ve written about outages a lot before. Are these types of major failures avoidable? Can we build better, with redundant services so everything doesn’t fall over at once?

Here’s my take.

1. Browse only mode

The first thing Dropbox can do to be more resilient is to build a browsing only mode into the application. Often we hear about this option for performing maintenance without downtime. But it’s even more important during a real outage like Dropbox is currently experiencing.

Not if but *when* it happens, you don’t have control over how long it lasts. So browsing only can provide you with real insurance.

For a site like Dropbox it would mean that the entire website is still up and operating. Customers can browse their documents, view listings of files & download those files. However they would not be able to *add* or change files during the outage. Thus only a very small segment of customers is interrupted, and it becomes a much smaller PR problem to manage.

Facebook has experienced outages of service. People hardly notice because they’ll often only see a message when they try to comment on someone’s wall post, send a message or upload a photo. The site is still operating, but not allowing changes. That’s what a browsing only mode affords you.

A browsing only mode can make a big difference, keeping most of the site up even when transactions or publishing are blocked.

Drupal is an open source platform that powers big publishing sites like Adweek, hollywoodreporter.com & economist.com. It supports a browsing only mode out of the box. An outage like this one would only stop editors from publishing new stories temporarily. It would be a huge win to sites that get 50 to 100 million with-an-m visitors per month.
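To make the idea concrete, here is a toy sketch of a browsing only switch. The class and method names are hypothetical and the store is in-memory; it says nothing about how Dropbox or Drupal actually implement it.

```python
class ReadOnlyModeError(Exception):
    """Raised when a write is attempted while the site is in browse-only mode."""


class DocumentStore:
    """Toy in-memory store with a browse-only switch (hypothetical names)."""

    def __init__(self):
        self.read_only = False
        self._files = {}

    def download(self, name):
        # Reads keep working whether or not the switch is on.
        return self._files[name]

    def upload(self, name, data):
        # Writes are refused while the switch is on.
        if self.read_only:
            raise ReadOnlyModeError("uploads are paused, browsing still works")
        self._files[name] = data


store = DocumentStore()
store.upload("report.txt", b"q4 numbers")
store.read_only = True  # flipped by operations during an outage
print(store.download("report.txt"))
try:
    store.upload("new.txt", b"more")
except ReadOnlyModeError as err:
    print("rejected:", err)
```

The key design choice is that the read path never consults the switch, so browsing & downloads survive whatever is wrong with the write path.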

Also: Is Amazon too big to fail

2. Redundancy

There are lots of components to a web infrastructure. Two big ones are webservers & databases. Turns out Dropbox could make both tiers redundant. How do we do it?

On the database side, you can take advantage of Amazon’s RDS & either read-replicas or Multi-AZ. Each has different service characteristics, so you’ll need to evaluate your app to figure out what works best.

You can also host MySQL, Percona or MariaDB directly on Amazon instances yourself & then use replication.


Using redundant components like placing webservers and databases in multiple regions, Dropbox could avoid a major outage like they’re experiencing this weekend.

Wondering about MySQL versus RDS? Here are some use cases.

Now that you’re using multiple zones & regions for your database, the hard work is completed. Webservers can be hosted in different regions easily, and don’t require complicated replication to do it.

Related: Are SQL databases dead?

3. Feature flags

On/off switches are something we’re all familiar with. We have them in the fuse box in our house or apartment. And you’ll also find a bigger master shutoff in the basement.

Individual on/off switches are valuable because they allow us to disable inessential features. We can build them into heavier parts of a website, allowing us to shut down features in an emergency. Host components in multiple availability zones for extra peace of mind.
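In code, a feature flag can be as small as a dictionary lookup. A minimal sketch, with made-up flag names; in practice the flags would live somewhere operations can change them without a deploy.

```python
# Hypothetical flag names; in practice these might live in a config store.
FLAGS = {"search": True, "recommendations": True, "thumbnails": True}


def is_enabled(feature, flags=FLAGS):
    """Unknown features are treated as off."""
    return flags.get(feature, False)


def render_page(flags=FLAGS):
    """Build the page from whichever optional features are switched on."""
    sections = ["file listing"]  # the essential part is always rendered
    for feature in ("search", "recommendations", "thumbnails"):
        if is_enabled(feature, flags):
            sections.append(feature)
    return sections


print(render_page())
FLAGS["recommendations"] = False  # emergency shutoff for a heavy feature
print(render_page())
```

The essential part of the page never depends on a flag, so flipping every switch off still leaves a working site.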

Read: 5 Things toxic to scalability

4. Simian armies

Netflix has taken a more progressive & proactive approach to outages. They introduce their own! Yes, that’s right: they bake redundancy & automation right into all of their infrastructure, then have a loose cannon piece of software called Chaos Monkey that periodically kills servers. Did I hear that right? Yep, it actually knocks components offline, to actively test the system for resiliency.

Take a look at the Netflix blog for details on intentional load & stress testing.
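The principle is simple enough to simulate. The sketch below is not Chaos Monkey itself, just a toy model with hypothetical server names: one function kills a random server, another stands in for the automation that replaces it.

```python
import random


def kill_random_server(pool, rng):
    """Remove one server at random, as a stand-in for terminating an instance."""
    victim = rng.choice(sorted(pool))
    pool.remove(victim)
    return victim


def replace_missing(pool, desired_size, next_id):
    """Stand-in for automation that launches replacements up to desired_size."""
    while len(pool) < desired_size:
        pool.add(f"web-{next_id}")
        next_id += 1
    return next_id


pool = {"web-1", "web-2", "web-3"}
rng = random.Random(42)  # seeded so a run is repeatable
next_id = 4
for _ in range(5):
    victim = kill_random_server(pool, rng)
    next_id = replace_missing(pool, desired_size=3, next_id=next_id)
    print("killed", victim, "pool is now", sorted(pool))
```

If the pool ever fails to return to its desired size, you have found a gap in your automation before a real outage found it for you.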

Also: When hosting data on Amazon turns bloodsport

5. Multiple clouds

If all these suggestions aren’t enough for you, taking it further you could do what George Reese of enstratus recommends and use multiple cloud providers. Not being dependent on one company could help in many situations, not just the ones described here.

Basic Amazon EC2 best practices require building redundancy into your infrastructure. Virtual servers & on-demand components are even less reliable than commodity hardware we’re familiar with. Because of that, we must use Amazon’s automation to insure us against expected failure.

Also: Why I like Etsy’s site performance report

Five things I learned at NY CTO Summit 2015

cto summit 2015

Enjoyed attending the New York CTO Summit yesterday with a notable list of presenters. Looking forward to the slides. Links to follow.

1. Product is a reflection of teams

Conway’s law was repeated by three different presenters!

Also: Is the difference between dev & ops a four-letter word?

2. Agile government

Government efficiency can be tackled with startup efficiencies!

Related: Is AWS enabling Angellist to boil the VC business?

3. Learning culture

There are lots of benefits to building a learning culture, not least of which is helping the business succeed.

Read: Do managers underestimate operational cost?

4. Don’t report to finance

Let’s remember how much it matters which teams report to whom. It can make or break your technology initiatives.

Also: Is Amazon too big to fail?

5. Course correction & size

The cost of changing course gets bigger as your org does.

Also: Airbnb didn’t have to fail

Are software benchmarks to blame for Volkswagen’s woes?

volkswagen emissions

With the recent media attention Volkswagen has gotten, a lot of folks are wondering, how could that happen? Aren’t there checks & balances?

Then I ran across this observation on Todd Hoff’s brilliant blog High Scalability:


Is what Volkswagen did really any different that what happens on benchmarks all the time? Cheating and benchmarks go together like a clear conscience and rationalization. Clever subterfuge is part of the software ethos. There are many many (search google) examples. Cars are now software is a slick meme, but that transformation has deep implications. The software culture and the manufacturing culture are radically different.

What exactly does all of this mean?

1. MySQL & Aurora

I was recently chatting with a colleague of mine, Bret Miller, who runs DeepSQL, an adaptive database platform compatible with MySQL. He said:

“We’re actually doing testing against Aurora, but we recently had a couple customers do it independently with more challenging loads. Didn’t see the performance stated in the marketing stuff.”

My response was… “Yeah. Aurora looks to be a win on the HIGH AVAILABILITY front.

On the scalability front, MySQL has certain limitations in its core. So I’m not surprised that the marketing material was grandiose in its promises.

The best way to improve MySQL performance is to tune queries, both as you’re writing your application and later when you want to boost performance.”

And so it goes.

Also: Can hosting data on Amazon turn bloodsport?

2. Redis & Memcached

Then I stumbled upon Salvatore Sanfilippo. He is the author of the brilliant & phenomenally successful NoSQL database called Redis. Turns out that another famous blogger was making some sweeping statements about Memcached & Redis and Salvatore ended up defending Redis in a blog post titled Clarifications about Redis.

The topic turned to benchmarks. Which led me to another post titled Why we don’t have benchmarks.

Heard this before?

Related: Did Airbnb have to fail?

3. Is Mongo webscale?

When Mongo was first releasing its benchmarks, the media went wild. And DBAs were scratching their heads. This fabulous video captures the sentiment of the time. :)

Read: Is Amazon too big to fail?

4. Oracle meets David DeWitt

In the 1980s Oracle began to forbid publishing benchmarks. After seeing a research paper by David DeWitt, Larry Ellison amended the end-user license agreement to include the DeWitt Clause. Later other database vendors followed.

It’s easy to see why. Benchmarks by their very nature depend on so many factors. It’s inevitable that those factors will be carefully picked by each platform to highlight its strengths.

Also: Are SQL databases dead?

5. Product versus disks

It is inevitable that all of this continues. When we reside at the level of the business, we perceive the product & its performance through that lens.

When we dive down to the level of disks, buses, cpus, network latency, multi-tenant clouds and a myriad of other factors, the waters are never so clear.

So remember: “your mileage may vary” and “buyer beware” are as true today as they ever were.

Also: 5 Things are toxic to scalability

When hosting data on Amazon turns bloodsport

reddit aws outage

There’s a strong trend toward automation across the cloud. That’s a great thing for startups because it reduces operational headaches & lets them focus on building products.

But as that trend begins to touch the database tier, all sorts of complications emerge. Let’s take a look at some of the tradeoffs.

1. Database as a service trend

I was recently reading Baron Schwartz’s article on the trend to database as a service.

I work with a lot of venture backed startups & pay close attention to what’s happening in New York & SF. From where I’m standing I see a similar trend. As automation simplifies management across the application stack, from load balancers to web & search servers, the same advantages are moving to database management.

Also: How to automate MySQL analysis on Amazon RDS

2. How Amazon RDS helps

Amazon’s RDS offers firms a data solution for Oracle & SQL Server as well as MySQL. For those just starting, it offers a long list of advantages.

o quick push-button deployment in minutes
o standardized parameter settings that just work
o ability to scale up or down from the dashboard
o automated backups
o Multi-AZ so you can sleep at night

This brings a huge advantage to startups. Many have a team of developers but aren’t large enough to need an operations team and can’t afford a dedicated database administrator.

Amazon is obviously helping these firms raise the bar. And that’s a good thing.

Related: RDS or MySQL 10 use cases

3. How Amazon RDS hurts

As you get bigger, your needs will grow too. You’ll have tens of millions of customers, and with more customers comes an even higher bar. Zero downtime becomes critical. It’s then that Amazon’s solution starts to become frustrating.

Unpredictable upgrades

MySQL upgrades on RDS are a messy activity. Amazon will restart the instance, back up the instance, perform the upgrade, then restart again. Each of these restarts takes a few minutes. The whole operation may have you down for ten minutes. This becomes more frustrating when your hands are completely tied. You don’t know when or what will happen!

When you roll-your-own instance, an upgrade can be performed in a matter of seconds. No instance restarts are necessary and you can monitor the process to know exactly where you are. This is the kind of control you’re going to want if you have millions of customers relying on your site & uptime.

Unnecessary slow restarts

When you apply parameter changes on RDS, some require a MySQL restart. Amazon forces the whole server to restart, increasing this downtime from a few seconds (when you roll your own) to many minutes. And while some parameters can be changed online, Amazon can provoke some strange behavior that is not always predictable.

With the frequency of these types of changes, you’ll quickly grow tired and frustrated with RDS.

EBS Snapshots are not portable

As mentioned above, Amazon uses its standard filesystem snapshot technology to perform backups. While this works well, it can be slow & unpredictable in a multi-tenant environment.

When you roll your own, you can take advantage of xtrabackup, and perform hot backups against your database with zero downtime. This is a real godsend. What’s more, they are portable, and can be moved to any other server, even ones not hosted in Amazon’s cloud!

Promoting a read-replica is slow too!

One feature that Amazon touts is creating copies or “read replicas” of your data. These are great and can facilitate easy copying of data. However, promoting these again brings unnecessary restarts, which are slow.

When you roll your own, you can promote a read-replica or read-only slave in seconds. A few seconds can seem invisible to end users, while minutes will be perceived as a real outage or downtime.

Read: Is zero downtime even possible with RDS?

4. Is migration an option?

So what to do? As I mentioned above, there are real advantages to startups deploying their first database. It really does help. I would argue for many it can be a good place to start.

If you’re starting to outgrow RDS and frustrated with the limitations, performance tuning headaches & unneeded downtime, luckily you have options.

Migrating off of RDS onto a server you manage yourself can be done in a number of ways.

o slave off of the master

Here you build a MySQL slave on a standard EC2 instance, with your RDS instance as the master. When you’re caught up, bring your site down temporarily. Reset the slave & set to read-write mode. Then point your webservers at your new EC2 instance and bring the site back up. If done carefully 10 to 20 seconds of downtime should be plenty.

Don’t forget to run through the process with a fire drill first!

o dump & import

Another way to move your data may be mysqldump. This option would be slower & bring a lot more downtime, but is possibly necessary in some cases.
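For the first option, the "when you’re caught up" step is worth checking in code rather than by eye. Here is a minimal sketch, assuming you have already fetched one row of SHOW SLAVE STATUS into a dict; the function name and the sample rows are made up for illustration.

```python
def ready_for_cutover(slave_status, max_lag_seconds=0):
    """Decide whether the new slave has caught up with the RDS master.

    `slave_status` is assumed to be one row of SHOW SLAVE STATUS as a dict.
    """
    lag = slave_status.get("Seconds_Behind_Master")
    threads_ok = (
        slave_status.get("Slave_IO_Running") == "Yes"
        and slave_status.get("Slave_SQL_Running") == "Yes"
    )
    # A lag of None (NULL in MySQL) means replication is not running.
    return threads_ok and lag is not None and lag <= max_lag_seconds


caught_up = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes", "Seconds_Behind_Master": 0}
lagging = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes", "Seconds_Behind_Master": 42}
broken = {"Slave_IO_Running": "No", "Slave_SQL_Running": "Yes", "Seconds_Behind_Master": None}

for name, status in [("caught_up", caught_up), ("lagging", lagging), ("broken", broken)]:
    print(name, ready_for_cutover(status))
```

Only bring the site down & repoint your webservers once this returns true, and rehearse the whole sequence in your fire drill.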

Also: 5 Reasons to move data to Amazon Redshift

5. Speed: It’s the database

Fred Wilson says speed is the number one feature of a web application. If customers are frustrated & waiting, they may leave & not come back. On the web it can be everything.

Many firms are rushing to database as a service to simplify administration. While that’s wonderful at the beginning, as you grow performance will become more of a day-to-day concern. And when it does, the database is going to be big on your list of headaches.

Web application performance inevitably involves the database, and when it does, your decision to choose database as a service may come into question. Don’t be afraid to bite the bullet and manage things yourself when that time comes.

Also: Is upgrading RDS like a shit-storm that will not end?

What Deborah Tannen taught me about conversation & interruption

tannen you just dont understand

I was recently invited to attend a charity event in Washington DC. Dinner was a catered affair of 300 with a few senators & Muhammad Yunus there to talk about micro financing.

After dinner we broke up into some smaller groups, and had great conversations into the night. It was interesting to me as I don’t often rub elbows with lobbyists & political animals. While we were all talking, the subject of language came up, and in particular how different people’s styles affect how they come off.

I became really engaged, as this topic has always interested me. I was introduced to the ideas of Deborah Tannen. She’s a professor of linguistics from Georgetown University, and an expert on the topic.

Afterward, I went straight to my kindle & bought her seminal book “You Just Don’t Understand”.

Boy do I understand a lot more now.

1. Conversational style varies by culture & gender

Across cultures, from Europeans to Asians, North to South Americans, conversational styles vary. Some pause longer between breaths, while others make briefer pauses. Some deem conversation more like judge & jury, where each should be carefully afforded the chance to take the stage, while others prefer the casual chance to jump in, and constant overlap.

These differences lead to the sense of pushiness versus interest, interruption versus dominance. Interest versus boredom. Since all these cultures have a different style, it can get rather complicated interpreting someone’s intentions if you’re not from that culture.

What’s more, these vary quite a bit between men & women.

Also: What I learned from Jay Heinrichs about click worthy blog titles

2. Report & rapport talk are different

Report talk is in public, perhaps at a lecture, or out with a large group of friends around the dinner table. There, stories & conversation revolve around a larger group.

Rapport talk on the other hand is at home, among intimates.

She says that women tend to prefer the latter while men prefer the former. So in different circumstances it can appear that one or the other has “nothing to say”, when it actually revolves around their preferences of when to speak.

Related: Is automation killing old-school operations?

3. Like & respect

Women’s behavior & style of speaking is rooted in the goal of being liked. So there are many cases where they may downplay themselves, to reach a more equal state with those around them.

Men’s behavior & conversational style is based around seeking respect. This can often mean emphasizing differences, and not parity.

Read: Do managers underestimate operational cost?

4. Contest or connection?

Men often see the world through the lens of contest, especially in relationships with others. Women on the other hand tend to see it as an interconnected network. By building bonds you strengthen that network.

These two styles inform dramatically different behaviors in similar situations.

Also: Is Reid Hoffman right about career risk?

5. Interest or independence

Here’s another example of how men & women may see things differently.


When men change the subject, women think they are showing a lack of sympathy — a failure of intimacy. But the failure to ask probing questions could just as well be a way of respecting the other’s need for independence.

So it seems styles & priorities inform intention & interpretation of a lot in conversations.

Although all of this doesn’t resolve or put to rest these differences, being informed can certainly help a lot towards understanding.

Also: What I learned from the 37 Signals team about work & startups

Why I use Airbnb chat even when texting is easier

airbnb

If you’ve ever traveled & stayed with an Airbnb host, you know that once you book you can easily switch to text messaging. Sometimes this is easier. But as I found out, it’s smarter to stick with a channel that we all can share.

I had a similar experience with a recent consulting customer. The lesson was much the same.

Choose your communication channels wisely, for you may need them for other reasons later on.

1. Not what I paid for

I’ve been hosting travelers off & on through Airbnb for some time. It’s a fun pastime, as you can meet some interesting people, and share a little bit of *your* city. That and there’s a little bit of extra income too, which doesn’t hurt.

One visitor I had wasn’t particularly happy with the setup. I’ve hosted dozens of people before, so I know that the space is popular with most. However this one guy seemed unhappy from the start. He didn’t read the fine print that it was a shared space with separate rooms. He was unhappy with the specific location too. And later he complained about a bicycle I had loaned him.

Also: Is Amazon too big to fail?

2. How Airbnb chat helped

At the end of his visit he asked for some of the fee to be refunded.

As I dug through our Airbnb chat, I copy/pasted our various communications, and in the end this helped clarify & remedy the situation. It also didn’t hurt that Airbnb themselves were there behind the scenes and could review all these messages as well.

Having a third party to arbitrate can make a big difference. Let’s hope it doesn’t come to that, but if it does, you want a communication channel they can also see.

Related: 5 Reasons to move data to Amazon Redshift

3. Consulting engagements & corporate emails

Over the years I generally use my own email for projects & engagements. However recently I took on a longer engagement. At the start there was some insistence on using an internal email for communications. I was hesitant, but eventually conceded as it tied in with google calendaring and various internal aliases.

As the months went by, I tried & failed to use both emails for correspondence. It was a habit that was hard to change. What’s more forwarding *all* emails to my own was also difficult. With an ongoing barrage of all@company.com messages numbering in the hundreds, it simply blew up my email account. That wasn’t sustainable either.

Read: Do managers underestimate operational cost?

4. After you leave

You may not be thinking of after your consulting assignment at the start of it. But you should be. You’ll engage in many communications, about a lot of different topics. Some about what is & isn’t in scope. Some about deliverables & timelines.

You’ll also have communications as things unfold, and as they are delivered. All of these are crucial to the engagement, as evidence of what was done when. If after you leave, all those emails are gone (at least that you can reach), it can be problematic.

What’s more once you set a precedent communicating one way, it’s hard to change habits. Best to set the precedent strongly up front.

Also: Are we fast approaching cloud-mageddon

5. Your channel is your paper trail

In today’s mobile-heavy world, there are tons of channels we can use to communicate. From Whatsapp to Slack, Hipchat to email & text. They all have their strengths & weaknesses.

But sometimes we need to choose based on future needs. Leaving a paper trail can be important. Having future control over those past communications can bring legal benefit.

And all of these communications can help avoid misunderstandings if they’re available for review later.

Also: Which tech do startups use most?

Is Amazon too big to fail?

aws fault tolerance

Amazon is the huge online retailer everyone knows well. However there is another side of Amazon, namely Amazon Web Services, that hosts many of the internet’s largest websites.

In the infrastructure & operations world, Amazon is the Citibank, JP Morgan or Goldman Sachs of cloud providers.

1. Outage takes down Yelp & Netflix

As reported on Thousand Eyes among other places, Amazon had a major outage yesterday.

Amazon experienced a problem with how they route data over the network. Routing is the technical term for how the internet moves data around. When routing goes wrong at a provider like Amazon, the websites they host will go down too.

Also: Are we fast approaching cloud-mageddon?

2. Automation can’t save you

Netflix is famous for their great streaming service, and shows like House of Cards.

On the technology side they’re also pretty famous. They deploy legions of Amazon servers to stream movies, and test them with Chaos Monkey. This open source tool helps them remain resilient even if individual servers or components go offline.

Yet a heavy reliance on Amazon itself meant a wider outage for Amazon was also an outage for Netflix.

Related: What tech do startups use most?

3. Of cloud monopolies

Amazon’s dominance in the cloud hosting space is incredible. There are providers that can beat them in compute power, speed & price. But with their incredible reach of global datacenters & relentless growth they are still the first choice for most internet shops.

What is the downside of such dominance? What happened yesterday illustrates it clearly. When Amazon goes down, so do financial companies like Experian,

Read: Do managers underestimate operational cost?

4. Diversify your data portfolio

In the banking world we can put together legislation, regulating banks. We can enact capital requirements or consider breaking up the largest ones. For investors & consumers you can diversify your portfolio, putting money in different asset classes & institutions. If one fund fails, others will balance it out.

We can do the same with cloud hosting. For larger internet applications, deploying on multiple clouds can be very beneficial. In that case an outage at Amazon would merely mean your global load balancer kicks in, sending traffic to your plan B servers.
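The decision a global load balancer makes can be sketched in a few lines. The provider names and URLs below are hypothetical, and the health results are passed in as a plain dict rather than gathered by real health checks.

```python
# Hypothetical endpoints, listed in order of preference.
PROVIDERS = [
    ("aws", "https://app.aws.example.com"),
    ("plan-b", "https://app.planb.example.com"),
]


def pick_endpoint(health, providers=PROVIDERS):
    """Return the first provider whose health check passed, or None."""
    for name, url in providers:
        if health.get(name, False):
            return url
    return None


print(pick_endpoint({"aws": True, "plan-b": True}))
print(pick_endpoint({"aws": False, "plan-b": True}))
```

The hard part in practice is not this choice but keeping data in sync on the plan B side, so that there is something useful to fail over to.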

Also: Replicate big data to Amazon Redshift with Tungsten

Are we fast approaching cloud-mageddon?

storm coming

One look at StackShare’s trending technologies, and you’ll discover the exploding growth of languages, webservers, load balancers, databases, caching servers, automation & monitoring tools, continuous integration suites & a broad spectrum of Software as a service solutions.

The choices today boggle the mind. Choice is good, but too much choice can mean trouble too.

1. What am I actually using?

Erich Schubert wrote a superb piece about the sad state of the sysadmin in the age of containers. Here’s what caught my eye:

Stack is the new term for “I have no idea what I’m actually using”.

That definitely rings true for me. The customers I’m seeing these days have such complicated stacks, that nobody really knows what’s installed. That’s dangerous!

Also: Do today’s startups assemble at their own risk?

2. Embrace failure more broadly

Recently I wrote a blog post asking Is AWS the patient that needs constant medication?. It got a lot of traction, and here’s why I think that happened.

AWS uses very commodity, cheapo components. The assumption is, with an infinitely redundant datacenter, component failure is ok. It’s ordinary & everyday.

Unfortunately most startups, even ones that employ some Ansible & devops, still don’t have Netflix grade automation.

Those regular, everyday failures are still getting detected by old-school manual monitoring. And that’s a recipe for trouble.

Also: 5 Things toxic to scalability

3. What are complex systems?

In this excellent deck, James Urquhart talks about emergent behavior in complex systems. It’s worth a quick read.

Read: How I find entrepreneurial focus

4. What to do? Do you like boring?

Dan McKinley, formerly principal engineer at Etsy & now with Stripe, wrote a brilliant essay arguing for boring technology.

This comes as a shock to many in the startup world. It sort of flies in the face of open source, or does it?

I worked in the enterprise space as an Oracle DBA for a decade starting in the mid-nineties. Among DBAs there was always a chuckle when a new version of Oracle came out. No one wanted to touch it with a ten foot pole. Sure we’d install it on test boxes, start learning the new features and so forth. But deploy it? No way, wait a good 2 or even 3 years before upgrading.

Meanwhile management was eager for the latest software. Don’t we want the newest? The Oracle sales guys would be selling the virtues of all sorts of features that nobody needed right away anyway.

Choosing boring components takes discipline to fight sexy new technologies & bleeding edge versions. But staid & stodgy wins every day in operations uptime.

Related: Is automation killing old-school operations?

5. Use tried & tested components

Do you find your application or stack contains Java, Ruby, Python & PHP? Choose one.

One webserver like nginx, one caching server like memcache or redis, one search server like solr or elasticsearch, one database like MySQL or postgres. Standardize all your components on one image, so you can use that image for all your servers, regardless of which role each one plays.

Fewer components will mean fewer interdependencies, less maintenance, & less chaos.

Also: What’s the luckiest thing that’s happened in your career?

What happens when entrepreneurs treat data as a product?

Sneachta Pix

I’ve been reading DJ Patil’s thoughts on building data products. As the Chief Data Scientist of the United States, he knows a thing or two.

I also attended a recent Look & Tell event, where Lincoln Ritter talked about Data Democracy at Animoto. He expressed many of Patil’s lessons.

I took away a few key lessons from these that seem to be repeating refrains…

1. UX of data

UX design involves looking at how customers actually use a product in the real world. What parts of the product work for them, how they flow through that product and so on.

That same design sense can be applied to data. At a high level that means exposing data in a measured, meaningful & authoritative way. Not all the tables & all the data points, but rather key ones that help the business make decisions. Then layering discovery tools like Looker on top to allow biz-ops to make more informed decisions.

Also: 5 Reasons to move data to Amazon Redshift

2. Be iterative

Clean data, presented to business operations in a meaningful way, allows them to explore the data, and find useful trends. What’s more with good discovery tools, biz-ops is empowered to do their own reporting.

All this reduces the need to go to engineering for each report. It reduces friction and facilitates faster iteration. That’s agile!

Related: Is automation killing old-school operations?

3. Be authoritative

Handing the keys to the data kingdom over to business means more eyes on the prize. That may well surface data inconsistencies. Each such case can reduce trust in your data.

Being authoritative means building checks into your data feeds, and identifying where data is amiss. Then fixing it at the source.

Read: Are SQL Databases dead?

4. Spot checks & balances

Spot checks on data are like unit tests on code. They keep you honest. Those rules for how your business works, and what your data should look like, can be captured in code, then applied as tests against source data.
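Here is what such a check might look like. This is a minimal sketch: the order rows and the three business rules (unique ids, no negative totals, every order has a customer) are made up for illustration.

```python
def check_orders(rows):
    """Return a list of human-readable problems found in the order rows."""
    problems = []
    seen_ids = set()
    for row in rows:
        if row["id"] in seen_ids:
            problems.append(f"duplicate order id {row['id']}")
        seen_ids.add(row["id"])
        if row["total"] < 0:
            problems.append(f"order {row['id']} has a negative total")
        if not row["customer_id"]:
            problems.append(f"order {row['id']} has no customer")
    return problems


rows = [
    {"id": 1, "total": 25.0, "customer_id": 7},
    {"id": 2, "total": -5.0, "customer_id": 8},
    {"id": 2, "total": 10.0, "customer_id": None},
]
for problem in check_orders(rows):
    print(problem)
```

An empty list means the feed passed. Anything else is worth treating the way you would treat a failing unit test.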

Also: Is Apple betting against big data?

5. Monitoring for data outages

As data is treated as a product, it should be monitored just like other production systems. A data inconsistency or failed spot check then becomes an “outage”. By taking these very seriously, and fire fighting just as you do other production systems, you can build trust in that data, as those fires become less frequent.

Also: Why Airbnb didn’t have to fail
