How Callcredit uses machine learning to spot fraudsters and likely loan defaults

Callcredit is using machine learning to understand whether a consumer is likely to default on a loan and spot fraudulent applications.

Tom Macaulay Jan 08th 2018

The company helps businesses assess whether a person applying for credit is likely to pay it back through sophisticated analysis of a variety of datasets.

It has become one of the UK's three main credit reference agencies since it was founded in 2000 and believes that machine learning could give it an advantage over its competitors.

It recently completed a trial of Microsoft Azure Machine Learning that suggested it could save credit card companies millions of pounds in bad debt.

"What we were doing with Microsoft was looking at more advanced analytics than we'd ever used in the past by getting deeper into the machine learning discipline," says Mark Davidson, Callcredit's Chief Data Officer.

Experiments with machine learning

Callcredit wanted to build a new generation of predictive tools that would provide greater certainty on its assessments and a better service to customers.

Around 18 months ago, the company began to explore how machine learning could help it do so.

"We wanted to come at it from a no-assumptions basis," says Davidson. "Every data scientist that you talk to has a preference on a technique that they think is most optimal. We were determined to make sure that we found the best techniques for the right problem domains."

The company did this by testing the predictive accuracy of different machine learning techniques on the services it provides to clients, such as credit rating assessments and anti-fraud monitoring. The best-performing models were then researched in depth to understand how to deploy them.

Most credit reference agencies rely on logistic regression models. Callcredit found that boosted decision trees produced better predictions.
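A no-assumptions comparison of this kind can be sketched in a few lines of Python. The example is illustrative only and is not Callcredit's code: the data is synthetic, the feature names (`utilisation`, `missed`) are made up, and the three candidate scorers are hard-coded stand-ins rather than trained logistic regression or boosted tree models. What it shows is the discipline Davidson describes, in which every candidate is scored on the same held-out applications and ranked by a single metric, here AUC.

```python
import random


def auc(scores, labels):
    """Chance that a random defaulter outscores a random non-defaulter (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_candidates(candidates, rows, labels):
    """Score every candidate on the same held-out rows and return them best AUC first."""
    results = {name: auc([fn(r) for r in rows], labels) for name, fn in candidates.items()}
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic held-out sample (an assumption for illustration, not real credit data).
rng = random.Random(0)
rows, labels = [], []
for _ in range(500):
    utilisation = rng.random()   # share of credit limit in use, 0..1
    missed = rng.randint(0, 3)   # missed payments in the last year
    # Toy ground truth: risk jumps only when both signals are high.
    p_default = 0.6 if (utilisation > 0.7 and missed >= 2) else 0.05
    rows.append((utilisation, missed))
    labels.append(1 if rng.random() < p_default else 0)

# Hypothetical stand-ins for fitted models; each maps a row to a risk score.
candidates = {
    "linear_score": lambda r: 0.5 * r[0] + 0.2 * r[1],
    "rule_score": lambda r: 1.0 if (r[0] > 0.7 and r[1] >= 2) else 0.0,
    "coin_flip": lambda r: 0.5,
}

ranking = rank_candidates(candidates, rows, labels)
for name, score in ranking:
    print(f"{name}: AUC={score:.3f}")
```

In practice the stand-ins would be replaced by fitted models and the synthetic rows by a held-out sample of real applications; the ranking step stays the same.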

The improved predictions could help a business determine the rate at which it offers credit cards depending on its appetite for risk. They could also ensure consumers receive faster and fairer decisions on their application for a credit card or a car loan.

Why Microsoft Azure Machine Learning?

Callcredit spent a year assessing the different techniques and the vendors that provide them. The company reviewed platforms from AWS, SAS and Microsoft before plumping for Microsoft's Azure ML.

"We chose Microsoft simply because it gave us the best roadmap," says Davidson. "We loved where Microsoft was going. The speed of deployment of both the platform and the models within the platform were by far market-leading, plus the fact is we're an existing Microsoft customer.

"Most of our technology stack is based on Microsoft technology, which works extremely well for us, and this then just sits alongside it."

Callcredit embarked on a year-long trial with Microsoft to evaluate the platform's ability to identify potentially fraudulent applications and predict a customer's propensity to repay debts.

The models it tested using boosted decision trees offered at least five percent better predictions than those using logistic regression techniques.

The difference can save millions of pounds in decisions on credit card applications and better protect consumers from fraud.

Using machine learning to save money and spot fraudsters

Callcredit modelled a scenario involving a typical credit card portfolio of 60,000 cards written over a year.

The cards had an average balance of around £3,500, giving a total of £210 million loaned to consumers. Typically, 7 percent of cardholders would not pay back their debt as required.

Using Azure ML to identify who would be unlikely to repay cut the level of default in the portfolio by more than £1 million.
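The arithmetic behind the scenario can be checked with a short calculation. The portfolio size, average balance, default rate and £1 million figure come from the article; the assumption that each defaulting card carries the average balance and that the whole balance is lost is a simplification made here, not something Callcredit states.

```python
# Figures quoted in the scenario
cards = 60_000
avg_balance_gbp = 3_500
default_rate_pct = 7

total_lent_gbp = cards * avg_balance_gbp            # 210,000,000
defaulting_cards = cards * default_rate_pct // 100  # 4,200 cards

# Simplifying assumption: each defaulting card holds the average balance,
# and the full balance is written off.
defaulted_balance_gbp = defaulting_cards * avg_balance_gbp  # 14,700,000

# "More than £1 million" is treated as a lower bound on the saving.
saving_gbp = 1_000_000
share_of_defaults_avoided = saving_gbp / defaulted_balance_gbp  # about 0.068
cards_equivalent = saving_gbp / avg_balance_gbp                 # about 286 cards

print(f"Total lent: £{total_lent_gbp:,}")
print(f"Balance at risk of default: £{defaulted_balance_gbp:,}")
print(f"Saving as share of defaults: {share_of_defaults_avoided:.1%}")
print(f"Equivalent average-balance cards: {cards_equivalent:.0f}")
```

Under that simplification, a £1 million saving corresponds to avoiding roughly 7 percent of the balance that would otherwise default, or fewer than 300 of the 4,200 problem accounts.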

"Now that's a material benefit for any of our clients, but more importantly, it's a material benefit for the consumers, because it's meaning that they're making more affordable decisions and the credit card company is being more of a responsible lender," says Davidson.

Callcredit also tested how machine learning could improve fraud prevention. Fraudsters frequently try to sign up to a loan in someone else's name, with the aim of racking up a debt that's left with the person whose name was on the application.

Azure ML helped Callcredit identify conspicuous patterns in the application process that indicate when an application is likely to have come from a fraudster.

The results were so impressive that the company adopted the scorecards in its consumer credit report service.

"We used to use a third-party fraud detection platform, which was very successful for us, but what we wanted to do was to build our own fraud detection model using Azure ML," says Davidson.

"After trials running them in parallel, we determined that our internal model was outperforming the third-party specialist service, so we swapped them out."
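A parallel run of this sort can be sketched as follows. The article does not say how Callcredit measured one system against the other, so the metric used here, frauds caught within a fixed manual-review budget, is an assumption, and the ten applications and their scores are made-up values for illustration.

```python
def frauds_caught(scores, is_fraud, review_budget):
    """Count confirmed frauds among the `review_budget` highest-scoring applications."""
    ranked = sorted(zip(scores, is_fraud), key=lambda pair: pair[0], reverse=True)
    return sum(flag for _, flag in ranked[:review_budget])


# Hypothetical parallel-run log: the same ten applications scored by both systems.
is_fraud         = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
incumbent_score  = [0.2, 0.9, 0.7, 0.1, 0.3, 0.8, 0.2, 0.6, 0.1, 0.4]
challenger_score = [0.1, 0.8, 0.3, 0.2, 0.9, 0.4, 0.1, 0.7, 0.2, 0.3]

review_budget = 3  # how many applications the fraud team can investigate
incumbent_hits = frauds_caught(incumbent_score, is_fraud, review_budget)
challenger_hits = frauds_caught(challenger_score, is_fraud, review_budget)

print(f"incumbent caught {incumbent_hits}, challenger caught {challenger_hits}")
```

Because both systems see identical applications and the same review budget, any difference in frauds caught reflects the scoring rather than the sample.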

What's next for machine learning?

The company is now testing the scalability of the new methodology in a variety of use cases and building a decision-making scorecard for clients that will be released in the first quarter of 2018. 

"All the credit risk models will be deployed live by quarter one next year," says Davidson. "We're then continuing on in different problem domains. We'll be looking at affordability models and we're also looking at insurance fraud models and identity verification models. They should be coming over the course of the next 12 months."

Davidson recommends that other companies considering a similar project remain open-minded about which model to use until the candidates have been thoroughly tested.

"Every data scientist has their favourite pet technique. You've got to take a big step back and say, 'I don't actually know what's the best technique for the particular problem I'm solving.' It's best to adopt that no-assumptions approach."