Mining the “Gemba” from Big Data – The Answers are Simpler Than You Think

“People who can’t understand numbers are useless. The gemba* where numbers are not visible is also bad. However, people who only look at the numbers are the worst of all.” – Taiichi Ohno, father of the Toyota Production System (TPS) and the primary architect in rebuilding Toyota after the Second World War


*Gemba:  a Japanese term meaning “the real place.” Japanese detectives call the crime scene gemba, and Japanese TV reporters may refer to themselves as reporting from gemba. In business, gemba refers to the place where value is created; in manufacturing the gemba is the factory floor.

What a fascinating quote. And what a fascinating man. Ohno was a visionary thinker in modern manufacturing, and the philosophy and principles he developed still apply today, decades after his death. If you aren’t familiar with him, I urge you to read a short biography to learn a little more.

But what of the numbers he speaks of? And the gemba? We have all been in situations where “the numbers are bad” but we are not quite sure what they mean or how to fix them. And with big data delivering more and more of them, we can find ourselves buried in numbers with no clear direction or answers. It turns out that some simple analytics can help you understand the gemba and decide where to apply effort in solving these problems.

I recently ran across an analytics technique that uses just a single set of data to break through the fog and uncover the type of problem you might be facing in a production environment. The technique is described by Melinda R. Hodkiewicz and Nicholas A. Hastings in their article “Production Data Analysis for Plant Performance Improvement.”

They suggest that you collect daily production throughput over a period of several months to a year, and then create a simple histogram or frequency plot, like the one shown in their paper.


From this chart and the information it presents, and using only an Excel spreadsheet, you can determine several key metrics and thereby understand where you should be putting your efforts to improve “the numbers.”

The area of the chart from 0 tonnes per day to about 70,000 tonnes per day indicates “Low Production Days” and is the area you should be looking at for a reliability improvement focus. The area of the chart from about 70,000 tonnes per day to 120,000 tonnes per day covers the “normal days.” On normal days, you can still improve by focusing on variability reduction.
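The split described above can also be sketched outside of Excel. The following is a minimal Python sketch, not the method from the paper: the function names, the 10,000-tonne bin width, and the sample figures are all made up for illustration, and the 70,000 tonnes-per-day threshold is simply the approximate boundary read off the chart.

```python
from collections import Counter
from statistics import mean, pstdev


def throughput_histogram(daily_tonnes, bin_width=10_000):
    """Count days per throughput bin; each key is the lower edge of a bin."""
    counts = Counter(int(t // bin_width) * bin_width for t in daily_tonnes)
    return dict(sorted(counts.items()))


def split_days(daily_tonnes, low_threshold=70_000):
    """Separate "low production days" from "normal days" at an assumed threshold."""
    low = [t for t in daily_tonnes if t < low_threshold]
    normal = [t for t in daily_tonnes if t >= low_threshold]
    return low, normal


def summarize(daily_tonnes, low_threshold=70_000):
    """Share of low days (reliability focus) and spread of normal days (variability focus)."""
    low, normal = split_days(daily_tonnes, low_threshold)
    return {
        "low_day_share": len(low) / len(daily_tonnes),
        "normal_mean": mean(normal) if normal else None,
        "normal_std": pstdev(normal) if normal else None,
    }


# Hypothetical daily throughput in tonnes per day (illustrative values only).
sample_days = [0, 35_000, 62_000, 88_000, 95_000,
               101_000, 104_000, 110_000, 97_000, 118_000]

histogram = throughput_histogram(sample_days)
summary = summarize(sample_days)

for lower_edge, days in histogram.items():
    print(f"{lower_edge:>7} t/day: {'#' * days}")
print(summary)
```

A large share of low days points toward the reliability focus, while a wide spread on the normal days points toward the variability focus. Where exactly to draw the threshold is a judgment call that depends on the shape of your own histogram.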

In system terms: if your biggest problem is “low production days,” look at a Downtime solution to find the root causes. If your biggest problem is variability on the “normal days,” use a Real Time OEE solution to find root causes, or an Advanced Process Control solution to tighten up the distribution.

You can do this as a “one-off” with Excel, or you can use a system such as MES software to capture this data on a systematic and automated basis. Here’s an example from a coal producer that has been doing such an analysis for many years:

In this case, the “Error Range” is defined as the difference between target and actual production, so an error range of -5 to +5 is pretty much “on target.” You can see that this distribution is reasonably good, but there is a longish tail of “low production days” and quite a few “higher than target days.” The “higher than target days” are also not ideal, as the customer is pushing equipment past desirable limits, probably increasing maintenance costs, and possibly causing issues downstream (too much inventory, for example).
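An error-range distribution like this one can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the coal producer's actual calculation: the sign convention (actual minus target, so negative means below target), the bin width of 10, the function names, and the sample figures are all hypothetical, and the units are whatever your targets are expressed in.

```python
import math
from collections import Counter


def error_values(actual, target):
    """Assumed convention: actual minus target, so negative means a shortfall."""
    return [a - t for a, t in zip(actual, target)]


def error_histogram(errors, bin_width=10):
    """Bin errors so that one bin spans -5 to +5; keys are lower bin edges.

    Bins are half-open: a value equal to an upper edge falls in the next bin.
    """
    half = bin_width / 2
    counts = Counter(
        math.floor((e + half) / bin_width) * bin_width - half for e in errors
    )
    return dict(sorted(counts.items()))


def classify(errors, tolerance=5):
    """Count days below, on, and above target for a given tolerance band."""
    return {
        "below_target": sum(1 for e in errors if e < -tolerance),
        "on_target": sum(1 for e in errors if -tolerance <= e <= tolerance),
        "above_target": sum(1 for e in errors if e > tolerance),
    }


# Hypothetical daily figures (illustrative values only).
sample_target = [100] * 8
sample_actual = [62, 81, 97, 100, 103, 104, 112, 125]

errors = error_values(sample_actual, sample_target)
error_bins = error_histogram(errors)
day_counts = classify(errors)

print(error_bins)
print(day_counts)
```

Counting the days above target separately keeps the over-production days visible instead of letting them offset the shortfalls in an average.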

This coal producer has been working on these issues for many years and this distribution looks very different today.  There is no “long tail” and the “normal days” are very controlled and on target.

So, look at the numbers, but drill down a little more into the real situation, to the “gemba.”  The analysis will help you figure out where you should be looking more closely for improvement.

Send me your daily production numbers for the last year and I will do an analysis for you!

