Three Rules for Comparing Forecast Accuracy

In the last post in the Retail Forecasting Playbook, I explained why Mean Absolute Percentage Error, or MAPE, is the best metric for measuring forecast accuracy.  In this post, I’m going to expand our focus and provide the three rules you and your organization need to follow to compare forecast accuracy.

Why is it necessary to have any rules for comparing forecast accuracy?

Once you start producing forecasts from your workforce management system, someone will challenge you.  He will say, figuratively or literally, “my forecast is better than yours” and produce some number that makes his forecast sound highly accurate.

My experience is that these incredibly accurate forecast metrics are just that: not credible. It’s not that the forecasts are wrong or that the person is trying to deceive you. It’s just that this person is measuring his forecasts differently.

In such cases, it’s good to ask a few questions, but first start by saying, “Wow, that’s impressive.” You don’t want to fight with this person or put him on the defensive. So, there is nothing wrong with buttering him up, but then ask:

        1. What metric did you use to measure forecast accuracy?

        2. Was that a daily or weekly forecast?

        3. What are you forecasting?  Meaning, are you forecasting by store, department, product line, SKU, etc.?

The first time I was faced with this challenge and asked these questions, I was surprised by the answers. I learned that the forecast accuracy measurement that I was being shown was an aggregate of store-level sales by week across all 1,100 stores in the retail chain for a year using Mean Percentage Error (MPE). Meanwhile, the WFM system was producing product line-level sales by day. We were looking at the results for 10 sample stores across four weeks and measuring our forecast accuracy using MAPE. We were comparing apples and oranges!
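To see why those two measurements can't be compared, here is a minimal Python sketch. The sales numbers are made up for illustration, and the function names and the sign convention for MPE (forecast minus actual) are my assumptions, not something taken from any particular WFM system. It shows how the same set of daily forecasts looks under MAPE, under MPE, and after being rolled up to a weekly total.

```python
def mpe(actuals, forecasts):
    """Mean Percentage Error: errors keep their sign, so over- and
    under-forecasts cancel each other out. Sign convention (forecast
    minus actual) is an assumption; some definitions flip it."""
    errors = [(f - a) / a for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors) * 100


def mape(actuals, forecasts):
    """Mean Absolute Percentage Error: every miss counts, regardless
    of direction."""
    errors = [abs(f - a) / a for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors) * 100


# Hypothetical daily sales for one store over one week (made-up numbers).
daily_actuals = [100, 120, 80, 150, 200, 250, 100]
daily_forecasts = [110, 108, 88, 135, 220, 225, 110]

daily_mape = mape(daily_actuals, daily_forecasts)
daily_mpe = mpe(daily_actuals, daily_forecasts)

# Roll the same data up to a single weekly total before measuring.
weekly_error = mape([sum(daily_actuals)], [sum(daily_forecasts)])

print(f"Daily MAPE:       {daily_mape:.2f}%")
print(f"Daily MPE:        {daily_mpe:.2f}%")
print(f"Weekly aggregate: {weekly_error:.2f}%")
```

With these made-up numbers, every daily forecast misses by 10 percent, so the daily MAPE is 10 percent. The daily MPE is only about 1.4 percent because the highs and lows offset, and the weekly aggregate error is 0.4 percent. Same forecasts, three very different stories.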

To compare apples to apples, one of us needed to change the way that we were measuring forecast accuracy.  In this case, my challenger did not have the data to recalculate his forecast accuracy.  So, it was left to me.

I went back to the WFM system and pulled forecasts for all stores from the last year.  Where those forecasts didn’t exist (the system was still new and not all stores had forecasts for the entire year), I had the system generate forecasts as it had for other weeks.  I aggregated the data as my challenger had and calculated forecast accuracy.

My challenger had shown me an incredible 98.2 percent accuracy.  When I used the WFM system-generated forecasts and measured accuracy as he did, I showed that the WFM system produced a 98.8 percent accurate forecast!  Needless to say, I had to step him through all of my calculations, but after this, questions about how other forecasts compared with the WFM system forecast ceased.

All of this leads me to my Three Rules for Comparing Forecast Accuracy.

Rule #1: Use the Same Accuracy Metric.  Whether you follow my advice and use MAPE or you decide to use another measure of forecast accuracy, it is critical that both forecasts use the same metric.
Rule #2: Use the Same Timeframe and Date Range.  When forecasting, there are two dimensions of time that you need to be concerned with.  The first is the length of time the forecast represents.  With WFM, this is typically daily or weekly.  The second dimension is the specific date or date range being evaluated.  Needless to say, both dimensions need to be the same when comparing forecasts.
Rule #3: Use the Same Organization and Driver Detail.  A sales forecast for a store is very different from a sales forecast for a department, product line or individual SKU.  Similarly, any sales forecast is also different from a transaction forecast, an item forecast, a traffic forecast or a freight forecast.  When comparing forecasts, make sure you’re comparing the same organization level and the same drivers.
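
If you want to make the three rules mechanical, one option is to write down how each accuracy number was measured and compare those descriptions before comparing the numbers themselves. The sketch below is only an illustration: `ForecastSpec`, its fields, `broken_rules`, and the dates are all hypothetical names and values I made up, loosely modeled on the story above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ForecastSpec:
    """How an accuracy number was measured (hypothetical structure)."""
    metric: str      # e.g. "MAPE" or "MPE"
    interval: str    # length of time each forecast covers: "daily", "weekly"
    start: date      # first date evaluated
    end: date        # last date evaluated
    org_level: str   # e.g. "chain", "store", "department", "product line"
    driver: str      # e.g. "sales", "transactions", "traffic"


def broken_rules(a, b):
    """Return the rule numbers (1, 2, 3) that a comparison of a and b breaks."""
    broken = []
    if a.metric != b.metric:
        broken.append(1)  # Rule #1: same accuracy metric
    if (a.interval, a.start, a.end) != (b.interval, b.start, b.end):
        broken.append(2)  # Rule #2: same timeframe and date range
    if (a.org_level, a.driver) != (b.org_level, b.driver):
        broken.append(3)  # Rule #3: same organization and driver detail
    return broken


# Illustrative dates only; the original comparison did not record them here.
challenger = ForecastSpec("MPE", "weekly", date(2023, 1, 1), date(2023, 12, 31),
                          "chain", "sales")
wfm = ForecastSpec("MAPE", "daily", date(2023, 11, 5), date(2023, 12, 2),
                   "product line", "sales")

print(broken_rules(challenger, wfm))
```

For the two measurements in my story, this check flags all three rules, which is why the numbers could not be compared until one of us re-measured.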

These three rules allow you to make a fair comparison between two forecasts. Of course, there are reasons why you may want to break one of the rules.  For example, to determine if forecast accuracy is getting worse or better over time, you may want to compare forecasts from different date ranges or perhaps you want to see if one driver produces more accurate forecasts than another.  In these cases, you may want to break one rule but no more.

When you break a rule, you want to know why you’re breaking it. If you don’t have a good reason, you probably won’t understand the result.

So, what happens if their forecast is more accurate than yours?  You’ve got two choices.

First, you can adopt their forecast and import it into your WFM system. Not all WFM systems support this concept, and you’ll still likely need to generate some aspects of the forecast (such as the daily or intraday distribution) using the WFM system. However, this eliminates any argument about who has the better forecast.

Second, you can improve the accuracy of your forecast.  There are a variety of ways to improve the accuracy of your forecast, and I’ll cover that topic in my next post in this series.

This post is part of the Axsium Retail Forecasting Playbook, a series of articles designed to give retailers insight and techniques into forecasting as it relates to the weekly labor scheduling process.