Your business has been collecting people and business data for a while now, and you’ve recently decided to give predictive analytics a go. But how do you make a good predictive model? What kind of data is needed, and how much and of what quality? Here are the ‘need to know basics’ of HR predictive analytics to help you get started.
What predictive models can and can't do
Predictive models can be quite useful for solving business problems, such as when someone is likely to leave, what absence will look like in the future, and whether a new starter is likely to bolt or not. They can help you predict and prepare for events that have already been an issue so you can take action and address the situation.
That said, predictive models cannot predict what hasn’t already happened, such as black swan events (those rare, unpleasantly surprising, high impact events that tend to be every organisation’s nightmare). When such events occur it’s easy to wrongly assign cause and to believe precautionary steps could have been taken to prevent them, but hindsight is always 20/20. It’s important to have realistic expectations of what predictive analytics can do for you.
What is the right data?
The data you need to get started with predictive analytics depends on what you want your model to predict. For example, if you want to determine when someone is likely to leave your organisation, do you have data on employee start and end dates, performance reviews, capability reports, and other relevant information?
If you don’t have the right data, simply knowing what you want to predict can help you start collecting what you need. (If you’ve got a good reporting tool implemented, this step will be much easier.)
Continuing our example, a good first step in predicting when someone is likely to leave your business is to look at what your average ‘leaver’ looks like, as there are countless factors that could have an impact (like tenure, line manager, time since last promotion, last performance rating, etc.). Next, compile all the necessary data in your analytics tool of choice, keeping an eye out for gaps and potential issues.
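As a rough illustration of that first step, the sketch below averages a few factors for leavers and for stayers, and counts missing values per field. The records, field names (`tenure_years`, `last_rating`, `months_since_promotion`, `left`) and figures are made up for the example; your own data will look different.

```python
from statistics import mean

# Hypothetical employee records; field names and values are illustrative only.
employees = [
    {"id": 1, "tenure_years": 1.5, "last_rating": 3, "months_since_promotion": 18, "left": True},
    {"id": 2, "tenure_years": 6.0, "last_rating": 4, "months_since_promotion": 10, "left": False},
    {"id": 3, "tenure_years": 2.0, "last_rating": None, "months_since_promotion": 24, "left": True},
    {"id": 4, "tenure_years": 4.5, "last_rating": 5, "months_since_promotion": None, "left": False},
    {"id": 5, "tenure_years": 0.5, "last_rating": 2, "months_since_promotion": 6, "left": True},
]

FEATURES = ["tenure_years", "last_rating", "months_since_promotion"]


def profile(records, features):
    """Average each feature over the records, skipping missing values."""
    result = {}
    for feature in features:
        values = [r[feature] for r in records if r[feature] is not None]
        result[feature] = mean(values) if values else None
    return result


def missing_counts(records, features):
    """Count how many records have no value for each feature."""
    return {f: sum(1 for r in records if r[f] is None) for f in features}


leavers = [r for r in employees if r["left"]]
stayers = [r for r in employees if not r["left"]]

leaver_profile = profile(leavers, FEATURES)
stayer_profile = profile(stayers, FEATURES)
gaps = missing_counts(employees, FEATURES)

print("Average leaver:", leaver_profile)
print("Average stayer:", stayer_profile)
print("Missing values per field:", gaps)
```

Comparing the two profiles side by side gives a first feel for which factors differ between leavers and stayers, and the gap counts show where more data needs collecting before modelling.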
Even if you’ve got five years’ historical data on a few thousand employees, if it’s not the right data to answer your question, you won’t have a very good predictive model. That said, you should always do an exploratory analysis on the data you do have: although it might not predict the specific thing you’re after, you may glean new insights or find something else of value to you.
During the exploratory analysis, choose a dependent variable (say, tenure), and look at how it correlates with the other variables in your data set. What is the data telling you?
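One simple way to look at those correlations is the Pearson correlation coefficient, which runs from -1 (the variables move in opposite directions) to +1 (they move together). The sketch below computes it for tenure against two other variables. The variable names and numbers are invented for illustration, and a strong correlation only tells you where to look next, not what causes what.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return None  # correlation is undefined when a variable never changes
    return covariance / (spread_x * spread_y)


# Hypothetical data: one value per employee, in the same order in every list.
tenure_years = [1, 2, 3, 4, 5, 6]  # the dependent variable
other_variables = {
    "last_rating": [2, 3, 3, 4, 4, 5],
    "absence_days": [9, 7, 8, 5, 4, 3],
}

correlations = {
    name: pearson(tenure_years, values)
    for name, values in other_variables.items()
}

for name, value in correlations.items():
    print(f"tenure_years vs {name}: {value:.2f}")
```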
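Extra tip: How do we know if a predictive model is good? For a model that predicts a yes/no outcome (such as leave or stay), look at the confusion matrix and ROC curve to see how well its predictions match what actually happened and, for regression-style models, at the p-values to check whether each predictor is statistically significant.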
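To make two of those checks concrete, here is a minimal sketch that builds a confusion matrix and computes the area under the ROC curve (AUC) for a leave/stay model. The outcomes, scores and the 0.5 cut-off are all made up, and the AUC is calculated by directly comparing every leaver's score with every stayer's score, which is equivalent to the area under the curve but only practical for small data sets.

```python
def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives (1 = left, 0 = stayed)."""
    pairs = list(zip(actual, predicted))
    return {
        "tp": sum(1 for a, p in pairs if a == 1 and p == 1),  # predicted leaver, did leave
        "fp": sum(1 for a, p in pairs if a == 0 and p == 1),  # predicted leaver, stayed
        "fn": sum(1 for a, p in pairs if a == 1 and p == 0),  # predicted stayer, left
        "tn": sum(1 for a, p in pairs if a == 0 and p == 0),  # predicted stayer, stayed
    }


def roc_auc(actual, scores):
    """Chance that a random leaver gets a higher score than a random stayer."""
    leavers = [s for a, s in zip(actual, scores) if a == 1]
    stayers = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum(
        1.0 if l > s else 0.5 if l == s else 0.0
        for l in leavers
        for s in stayers
    )
    return wins / (len(leavers) * len(stayers))


# Hypothetical outcomes and model scores (probability of leaving).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

THRESHOLD = 0.5  # illustrative cut-off for calling someone a likely leaver
predicted = [1 if s >= THRESHOLD else 0 for s in scores]

matrix = confusion_matrix(actual, predicted)
precision = matrix["tp"] / (matrix["tp"] + matrix["fp"])
recall = matrix["tp"] / (matrix["tp"] + matrix["fn"])
auc = roc_auc(actual, scores)

print(matrix)  # {'tp': 3, 'fp': 1, 'fn': 1, 'tn': 3}
print(f"precision={precision:.2f} recall={recall:.2f} auc={auc:.4f}")
```

An AUC of 0.5 is no better than guessing and 1.0 is a perfect ranking, so the closer to 1.0 the better.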
How much data is required?
As a rule, the more relevant data you have, the more accurate your predictive model is likely to be. However, there isn’t a fixed minimum amount of data required, so our advice is this: if you’ve got the right data to answer your question, start using it and continue to add to it as you go along to improve the model’s accuracy.
That said, you should remember to adjust your model over time. New data will continue to affect the model’s output, and features that you’ve used in the past may not be relevant now. Models cannot be used forever!
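One hedged way to tell when a model needs adjusting is to compare its accuracy on recent outcomes with its accuracy when it was first built. In the sketch below the outcomes, predictions, the `needs_refresh` helper and the 0.05 tolerance are all assumptions made for illustration; pick a tolerance that suits your own situation.

```python
def accuracy(actual, predicted):
    """Share of predictions that matched what actually happened."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def needs_refresh(baseline_accuracy, recent_accuracy, tolerance=0.05):
    """Flag the model when accuracy has dropped by more than the tolerance."""
    return (baseline_accuracy - recent_accuracy) > tolerance


# Hypothetical outcomes (1 = left, 0 = stayed) when the model was built...
baseline_actual = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0]
baseline_predicted = [1, 0, 1, 0, 1, 0, 0, 1, 0, 1]

# ...and for the most recent period, scored with the same model.
recent_actual = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
recent_predicted = [1, 0, 0, 0, 1, 1, 1, 0, 0, 0]

baseline_accuracy = accuracy(baseline_actual, baseline_predicted)
recent_accuracy = accuracy(recent_actual, recent_predicted)
refresh = needs_refresh(baseline_accuracy, recent_accuracy)

print(f"baseline={baseline_accuracy:.2f} recent={recent_accuracy:.2f} refresh={refresh}")
```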
Also understand that the model you’ve built is reliant on the data you’ve put into it. If you’ve built a model to predict performance for one department of the business, it will not reliably predict performance for another department whose data isn’t in the model.
And there you have it! The ‘need to know basics’ of HR predictive analytics to help you get started. If you have any more questions or are curious to know more about HR predictive analytics, please give us a shout via phone, email, or through social media networks.