Story's of Tinera

A blog on everyday information challenges

Confusion on results and definitions – Building on trust

The question

“How many products did we sell last month?” sounds like an easy question. Yet many answers are possible. Is every order a sale? Do we include cancellations? Should orders that haven’t been shipped or paid be included?
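To make this concrete, here is a minimal sketch of how one small set of orders can give three different answers. The `Order` fields, the status names and the sample data are all made up for illustration; a real order system will look different.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    quantity: int
    status: str  # hypothetical statuses: "open", "paid", "shipped", "cancelled"


# Invented sample data, only meant to illustrate the point.
orders = [
    Order(1, 2, "shipped"),
    Order(2, 1, "cancelled"),
    Order(3, 5, "open"),
    Order(4, 3, "paid"),
]


def units_sold(orders, counted_statuses):
    """Sum the quantities of the orders whose status counts as a sale."""
    return sum(o.quantity for o in orders if o.status in counted_statuses)


# Definition 1: every order is a sale.
every_order = units_sold(orders, {"open", "paid", "shipped", "cancelled"})

# Definition 2: every order except cancellations.
without_cancellations = units_sold(orders, {"open", "paid", "shipped"})

# Definition 3: only orders that have been paid or shipped.
paid_or_shipped = units_sold(orders, {"paid", "shipped"})

print(every_order, without_cancellations, paid_or_shipped)
```

Same data, same question, three numbers. None of them is wrong until somebody writes down which definition is meant.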

“How many customers did we have last year?” is equally vague. What’s a customer? Are we using calendar or fiscal years, or just a year back from today?
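The three readings of “last year” can be sketched as three different date ranges. The function names are hypothetical, and the fiscal year starting in April is purely an assumption; use whatever your organization has agreed on.

```python
from datetime import date, timedelta


def last_calendar_year(today):
    """1 January to 31 December of the previous calendar year."""
    year = today.year - 1
    return date(year, 1, 1), date(year, 12, 31)


def last_fiscal_year(today, start_month=4):
    """Last completed fiscal year. The April start is an assumption."""
    # Year in which the current (still running) fiscal year started.
    if today.month >= start_month:
        current_start_year = today.year
    else:
        current_start_year = today.year - 1
    start = date(current_start_year - 1, start_month, 1)
    end = date(current_start_year, start_month, 1) - timedelta(days=1)
    return start, end


def last_rolling_year(today):
    """One year back from today, up to and including today."""
    try:
        start = today.replace(year=today.year - 1)
    except ValueError:
        # 29 February has no counterpart in the previous year.
        start = today.replace(year=today.year - 1, day=28)
    return start, today


today = date(2024, 5, 15)
for window in (last_calendar_year, last_fiscal_year, last_rolling_year):
    print(window.__name__, window(today))
```

On the same day, the three functions return three different periods, so a customer count over “last year” can differ without anybody making a calculation error.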

In our day-to-day communication we’re used to these questions, and we assume we understand what is being asked. If we make the wrong assumptions, well, the damage is often pretty limited. Yet when we’re controlling the space shuttle, a plane or a car, we want to be a little more certain that there’s no room for misinterpretation.

When we’re running our business, misinterpretations can lead to erroneous decisions, resulting in a loss of money, stock or customers. The impact could be small, but it could also be disastrous.

The data

Once we truly understand the question in its complete context, we need to look at the data. The data itself can also be open to interpretation. Something like order status “open” (or even worse: order status “2”) can mean a lot of different things. Working out what such values really mean is one of the toughest parts of analysing data.
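One simple way to take the guesswork out of a value like order status “2” is to write its meaning down in one shared place and refuse to guess when a code is unknown. The sketch below assumes a made-up set of codes and descriptions; the real meanings have to come from whoever owns the source system.

```python
# Hypothetical data dictionary: the codes and their meanings are invented here.
ORDER_STATUS_MEANING = {
    1: "open: order received, not yet paid",
    2: "paid: payment received, not yet shipped",
    3: "shipped: goods have left the warehouse",
    9: "cancelled: order withdrawn by the customer",
}


def describe_status(code):
    """Return the agreed meaning of a status code, or fail loudly."""
    try:
        return ORDER_STATUS_MEANING[code]
    except KeyError:
        # An unknown code is a question for the data owner, not a reason to guess.
        raise ValueError(f"No agreed definition for order status {code!r}")


print(describe_status(2))
```

The lookup itself is trivial. The value is in the agreement behind it: everybody who analyses these orders reads status “2” the same way.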

A proper analyst or data scientist needs to understand both the question (the definition of the information that’s required) and the data (the functional meaning of a number in the data). If they make wrong assumptions or draw wrong conclusions, decision makers will get wrong information and make wrong decisions, and trust will be gone. And once trust is gone, it’s hard to win back.

Statistics

“There are lies, damned lies and statistics.” (popularized by Mark Twain)

When we understand the question and the data, there’s still a third battle we need to win: the use of statistics. I guess everybody who works with data should know the quote above. Bad use of statistics can lead to remarkable conclusions that don’t always represent reality.

The best-known piece of wisdom about statistics is that correlation doesn’t equal causation, but there is more to learn about the use of statistics.
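A small simulation can show how correlation appears without causation. In this sketch both ice cream sales and sunburn cases are driven by temperature and do not influence each other at all. The variables, the noise levels and the sample size are invented, and the adjustment uses the known simulated effect of temperature, which you would normally have to estimate.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)  # fixed seed so the run is repeatable

# Temperature is the hidden common cause (standardized, invented data).
temperature = [rng.gauss(0, 1) for _ in range(1000)]
ice_cream = [t + rng.gauss(0, 0.3) for t in temperature]
sunburn = [t + rng.gauss(0, 0.3) for t in temperature]

# The raw correlation is high (roughly 0.9 for these settings) ...
raw = pearson(ice_cream, sunburn)

# ... but it all but disappears once the effect of temperature is removed.
ice_cream_rest = [i - t for i, t in zip(ice_cream, temperature)]
sunburn_rest = [s - t for s, t in zip(sunburn, temperature)]
adjusted = pearson(ice_cream_rest, sunburn_rest)

print(f"raw correlation: {raw:.2f}, after removing temperature: {adjusted:.2f}")
```

Nobody would conclude that ice cream causes sunburn, but with less familiar business data the same pattern is easy to miss.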

Conclusion

Organizational confusion and wrong decisions arise mostly from human interaction. Most analysts follow their own way of working. Not because they want to, but because there are no corporate guidelines available to them.

Just like scientific research, data analysis should follow standards. To prevent misinterpretation of data, misidentified causation and wrong decisions, data analysis should be held to the same standards as scientific research. Analysis should be independent of departmental interests, reproducible, verifiable, consistent and accurate. Of course there are multiple gradations of analysis in organizations, so a smart setup of governance guidelines is needed.

Confusion about results can be hard to deal with, but with some simple and accessible actions trust can be regained. Some basic guidelines, peer reviews and some fundamental MDM (master data management) can already create a more pleasant environment for analysts and their customers. Analysts can then build trust in the organization, grow their knowledge and support decision makers at an accelerating pace, resulting in a proper data-driven culture.

Confusion about the results of analysis, the use of data and definitions? By building on a foundation of quality, trust will grow and result in confidence in your data!