I’ve been reading DJ Patil’s thoughts on building data products. As the Chief Data Scientist of the United States, he knows a thing or two.
Join 30,000 others and follow Sean Hull on Twitter @hullsean.
I also attended a recent Look & Tell event, where Lincoln Ritter talked about Data Democracy at Animoto. He expressed many of Patil’s lessons.
I took away a few key lessons from these that seem to be repeating refrains…
1. UX of data
UX design involves looking at how customers actually use a product in the real world: which parts of the product work for them, how they flow through it, and so on.
That same design sense can be applied to data. At a high level, that means exposing data in a measured, meaningful & authoritative way. Not all the tables & all the data points, but rather the key ones that help the business make decisions. Then layer discovery tools like Looker on top, so biz-ops can make more informed decisions.
2. Be iterative
Clean data, presented to business operations in a meaningful way, allows them to explore the data and find useful trends. What’s more, with good discovery tools, biz-ops is empowered to do their own reporting.
All this reduces the need to go to engineering for each report. It reduces friction and facilitates faster iteration. That’s agile!
3. Be authoritative
Handing the keys to the data kingdom over to the business means more eyes on the prize. That may well surface data inconsistencies, and each such case can reduce trust in your data.
Being authoritative means building checks into your data feeds, identifying where data is amiss, and then fixing it at the source.
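Here is a minimal sketch of what a check built into a feed might look like. The `Order` record, its fields, and the two rules are all made up for illustration; your own feed will have its own shape and its own rules. The idea is just to split clean rows from suspect ones, and keep the reason alongside each rejected row so it can be traced back to the source.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical feed record, invented for this example.
    order_id: int
    amount: float
    country: str


def validate(order):
    """Return a list of problems found in one feed record (empty if clean)."""
    problems = []
    if order.amount < 0:
        problems.append("negative amount")
    if len(order.country) != 2:
        problems.append("country is not a 2-letter code")
    return problems


def split_feed(orders):
    """Separate clean records from suspect ones, keeping the reasons."""
    clean, rejected = [], []
    for order in orders:
        problems = validate(order)
        if problems:
            rejected.append((order, problems))
        else:
            clean.append(order)
    return clean, rejected


feed = [Order(1, 19.99, "US"), Order(2, -5.00, "US"), Order(3, 42.00, "USA")]
clean, rejected = split_feed(feed)

for order, problems in rejected:
    print(f"order {order.order_id}: {', '.join(problems)}")
```

Only the clean rows go on to biz-ops. The rejected list is your to-do list for fixing things upstream.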
4. Spot checks & balances
Spot checks on data are like unit tests on code. They keep you honest. Those rules for how your business works, and what your data should look like, can be captured in code, then applied as tests against source data.
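To make that concrete, here is a rough sketch using Python's built-in `sqlite3` module. The `users` and `orders` tables, the sample rows, and the two rules are assumptions for the sake of the example. Each spot check is a query that counts rows breaking a business rule, so zero means the check passes, much like a unit test.

```python
import sqlite3

# In-memory database with made-up sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 3, 15.0);
""")

# Each business rule is a query that counts the rows violating it.
SPOT_CHECKS = {
    "orders reference an existing user": """
        SELECT COUNT(*) FROM orders o
        LEFT JOIN users u ON u.id = o.user_id
        WHERE u.id IS NULL
    """,
    "order totals are not negative": """
        SELECT COUNT(*) FROM orders WHERE total < 0
    """,
}


def run_spot_checks(conn, checks):
    """Run every check and return {check name: number of violating rows}."""
    return {name: conn.execute(sql).fetchone()[0] for name, sql in checks.items()}


results = run_spot_checks(conn, SPOT_CHECKS)
for name, violations in results.items():
    status = "PASS" if violations == 0 else f"FAIL ({violations} rows)"
    print(f"{status}: {name}")
```

In this sample data, order 12 points at a user that doesn't exist, so the first check fails and the second passes. Keeping the rules in a plain dictionary makes it easy to add a new one every time the business teaches you something about what the data should look like.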
5. Monitoring for data outages
When data is treated as a product, it should be monitored just like other production systems. A data inconsistency or failed spot check then becomes an “outage”. By taking these very seriously, and firefighting them just as you would on other production systems, you build trust in that data as those fires become less frequent.
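One way this could look in code is sketched below. The check names, the row counts, and the `alert` hook are all hypothetical; in real life the alert would go to whatever paging or chat tool you already use for production incidents, and the function would run on a schedule. The point is that a failed check, or a check that crashes, gets reported as an outage rather than quietly logged.

```python
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("data_monitor")


def monitor_once(checks, alert):
    """Run every check once and report each failure (or crash) as an outage."""
    outages = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception as exc:
            # A check that cannot run is treated as a failure, not ignored.
            log.error("check %r raised %s", name, exc)
            ok = False
        if not ok:
            outage = {
                "check": name,
                "detected_at": datetime.now(timezone.utc).isoformat(),
            }
            outages.append(outage)
            alert(outage)
    return outages


# Made-up numbers standing in for real row counts from a daily feed.
row_counts = {"yesterday": 10_000, "today": 120}

checks = {
    "daily row count did not drop by more than half":
        lambda: row_counts["today"] >= 0.5 * row_counts["yesterday"],
    "feed is not empty":
        lambda: row_counts["today"] > 0,
}

paged = []  # stand-in for a real paging or chat integration
outages = monitor_once(checks, alert=paged.append)
```

With these numbers the row count check fails and lands in `paged`, while the empty-feed check passes.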