Get Started with Statistical Formulas and Forecasting
Published: July 12, 2016
Author: Oliver Eldredge
Forecasting is a task that can be easy to get right, and easy to get wrong. Having multiple forecasting models in your tool kit can help ensure that you get accurate results, but even then you’ll have to choose between several different forecasting models which can take time.
In this post, we’ll break down your options – including how to determine (with automation!) how accurate each is and which works best for your data set. Let’s dive in.
Applying Standard Deviation to Forecasting Run Rates
Understanding how predictable a dataset is can be a valuable tool in analysis. It can help us to determine how reliable an observation is, how accurate a prediction might be, or how severe a problem is. The =STDEV.P() formula can help accomplish all of those things in different circumstances. Having this formula in your toolkit when you are creating a run rate is a great application of this technique. Let’s take a look at the basic forecasting process first.
Simple Average Spend Technique
A simple but flawed method for calculating run rates is to determine your average daily spend over the completed days of the current month, and then use that average to create a prediction for the remaining days. This formula may work, but it can easily create bad predictions if there’s been a recent change in daily spend, fluctuations across DOW, or daily fluctuations for any other reason.
Even-keeled spend is easy to predict, and hard to get wrong:
In Excel, this looks something like =Cost + (Cost/Completed_Days)*Remaining_Days
Other Models to Account for Varying Spend
When our overall average spend is not a good candidate, we may choose to replace it with a seven-day rolling average, three-day rolling average, yesterday’s spend, or a new number that represents a new budget.
Predicting spend using an overall average, versus a seven-day rolling average:
In Excel the rolling seven-day average looks something like:
=Cost + (Last_7_Day_Cost/7)*Remaining_Days
Determining which of these models is the best fit can be a time-consuming process to complete manually at scale. Going through past performance for a large volume of campaigns is probably not something that anyone looks forward to. However, missing an important change can be a costly error, so skipping the tasks isn’t a great idea either. Our goal in automating this process is to qualify how much trust we should have in a given dataset, and then provide an at-a-glance indicator that helps to evaluate which model provides the most accuracy.
Automating Model Selection
Applying the =STDEV.P() and =AVERAGE() formulas to our daily spend can help us to understand if any of these flaws are likely to occur.
This can occur in countless different formats, but the mocked-up version below shows how this simple formula can automate our expectations around how accurate the overall and seven-day averages are for three different datasets. When combined with conditional formatting, these can provide an at-a-glance run rate.
Applying multiple forecasting models, along with a qualifier for their expected accuracy, can greatly speed up your process for determining expected run rates. You can quickly and confidently determine if your preferred model lacks accuracy, and then select an appropriate alternative or dive into the account to audit the actual cause of volatility.