Mining For Data Gold


In the last 12 months I have found myself working in a great team on a unique subject matter, which has enhanced my data skills enormously.

Food Stock Loss and Accuracy at a company the size of Marks & Spencer generates millions of records each day as products are transported, sold, wasted and counted across a vast estate of stores in the UK and Ireland.

There is so much data, in fact, that it simply can't fit into an Excel workbook. To be precise, 1,048,576 rows is the most a single worksheet can hold, so we need other tools to support us.

Step in, SQL and PowerBI.

Fortunately, M&S has numerous repositories tracking all manner of stock movements - it just needs someone to collate everything in ways that make insights easier to find.

Like a radio, we need to 'tune in' to pick up something that was there all along.

I find the discovery process addictive and there is gold to be found if you know where to look. It is only after insights have been identified that targeted actions can be taken.


SQL is a language of commands that databases understand, used to search and summarise the records they hold. If we could download everything inside a database onto a spreadsheet, we would, but most of the time we can't, so we need SQL to do some heavy lifting for us.
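To give a feel for that heavy lifting, here is a minimal sketch using Python's built-in sqlite3 module. The table, column names and figures are made up for illustration and are not M&S data or systems. The point is that the database hands back one summary row per store rather than every individual record.

```python
import sqlite3

# Hypothetical table, columns and figures, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock_movements "
    "(store TEXT, product TEXT, movement TEXT, units INTEGER)"
)
rows = [
    ("Leeds", "Sandwich", "sold", 40),
    ("Leeds", "Sandwich", "wasted", 5),
    ("Leeds", "Salad", "wasted", 3),
    ("York", "Sandwich", "sold", 25),
    ("York", "Sandwich", "wasted", 2),
]
conn.executemany("INSERT INTO stock_movements VALUES (?, ?, ?, ?)", rows)

# The database does the summarising: one row per store comes back,
# not every individual movement record.
query = """
    SELECT store, SUM(units) AS wasted_units
    FROM stock_movements
    WHERE movement = 'wasted'
    GROUP BY store
    ORDER BY store
"""
wasted_by_store = conn.execute(query).fetchall()
print(wasted_by_store)
```

With millions of rows instead of five, the same query would still return only one line per store, which is what makes the result small enough to work with.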

Once we have a large set of records, we need to display the results somewhere - seeing a long sheet of information isn't much use to anyone. This is where PowerBI comes in. It can present graphs and tables made up from tens or hundreds of millions of rows. There are limits to this too, of course, if you want to keep your reports quick and easy to use - usually the power of the average person's computer. Most datasets can be grouped in ways that ensure they run efficiently.

For example - if you managed a region of stores in the North of England, you'd probably want a way of reviewing those specific stores rather than seeing information for hundreds of others that you don't manage. Similarly, you probably want to know things you can do something about - data from more than a year ago is likely not relevant (unless it gives a useful reference to determine if something looks unusual). Adding filters and selectors gives report-viewers the control needed to drill down to what they need to see.
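The same idea can be sketched in a few lines of Python. This is only an illustration of what a region filter and a date filter do behind the scenes, not how PowerBI implements them. The records, field names and the `filter_records` helper are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical records, invented for illustration.
records = [
    {"store": "Leeds", "region": "North", "day": date(2024, 5, 1), "units_lost": 12},
    {"store": "York", "region": "North", "day": date(2022, 5, 1), "units_lost": 7},
    {"store": "Exeter", "region": "South", "day": date(2024, 4, 20), "units_lost": 9},
]


def filter_records(records, region, today, max_age_days=365):
    """Keep only records for one region that are no older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [r for r in records if r["region"] == region and r["day"] >= cutoff]


# A manager of the North region, looking back one year from 1 June 2024.
recent_north = filter_records(records, region="North", today=date(2024, 6, 1))
print([r["store"] for r in recent_north])
```

Here only the Leeds record survives: York is in the right region but too old, and Exeter is recent but in a different region.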

Naturally we all just want to know the insights that matter without spending lots of time looking for them! Why can't we build something to just tell us where the problems are and what needs attention?

I'm sure we are not far away from the days of using AI in this way, for example:

"Tell me the Top 100 items lost in my stores this week"

Or perhaps:

"How many times did the team check inventory levels of our Beers & Spirits department last week?"

It is easy to say these things and to expect an equally fast answer but, in most cases, work needs to be done in between. No doubt the answers will come faster as technology advances.
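As a rough idea of the work hiding behind the first question, here is a small Python sketch of a 'top items lost this week' calculation. The data, the product names and the `top_lost_items` helper are assumptions made up for this example. A real version would sit on top of the kind of database queries described earlier.

```python
from collections import Counter
from datetime import date

# Hypothetical loss records: (day, product, units lost).
losses = [
    (date(2024, 6, 3), "Prawn Sandwich", 4),
    (date(2024, 6, 4), "Prawn Sandwich", 6),
    (date(2024, 6, 4), "Fruit Salad", 3),
    (date(2024, 6, 5), "Sparkling Water", 7),
    (date(2024, 5, 20), "Fruit Salad", 50),  # outside the week, so ignored
]


def top_lost_items(losses, week_start, week_end, n):
    """Total the units lost per product within the week and return the top n."""
    totals = Counter()
    for day, product, units in losses:
        if week_start <= day <= week_end:
            totals[product] += units
    return totals.most_common(n)


top_items = top_lost_items(losses, date(2024, 6, 3), date(2024, 6, 9), n=2)
print(top_items)
```

Even in this toy version, someone has to decide what 'this week' means, which stores count as 'my stores' and how a loss is recorded before the question can be answered.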


Understanding what has happened in the past is the job of a Data Analyst, whilst predicting the future is the job of a Data Scientist. The scopes are very different and I personally feel more affinity with the former because the results are factual. There is no doubt, though, that the Holy Grail is to forecast events before they happen. It is one thing to act on something that has happened and quite another to act on something that hasn't (yet).

Ultimately businesses want to avoid problems and to maximise success (however that is measured). The ability to predict the future accurately is no doubt a good sign of how well a business is understood but, even if you can only look in the rear-view mirror, it is still better than running blind!

I have been extremely fortunate to gain sponsorship to do a data apprenticeship (like a mini A-level) run by Cambridge Spark alongside my day job. It is reinforcing techniques I know and developing new skills like Python (another tool that can be used to process large data sources). I have already put this into practice to help M&S identify trends in so-called 'Phantom Inventory' by simulating conditions that help identify stock which probably doesn't exist, unlocking new sales opportunities.

I hope to continue building knowledge in this area and potentially to venture into the world of Machine Learning to help make the 'talking to AI' concept a reality one day.

Thank you in particular to Andrew McCurry for sponsoring me through the process and pushing the limits of my knowledge on a daily basis!

Bye for now

Robert Blackman

Sales Leader - Enterprise & Mid-Market EMEA | New Business Development

8mo

A nice read and very well explained for the ‘non-experts’ like myself. Thanks Adam!
