Machine learning explainability in finance: an application to default risk analysis
Bank of England 2019.10.16
We propose a framework for addressing the ‘black box’ problem present in some Machine Learning (ML) applications. We implement our approach by using the Quantitative Input Influence (QII) method of Datta et al. (2016) in a real-world example: an ML model to predict mortgage defaults. This method investigates the inputs and outputs of the model, but not its inner workings. It measures feature influences by intervening on inputs and estimating their Shapley values, which represent the features’ average marginal contributions over all possible feature combinations. The method identifies key drivers of mortgage defaults, such as the loan-to-value ratio and current interest rate, which are in line with the findings of the economics and finance literature. However, given the non-linearity of the ML model, explanations vary significantly for different groups of loans. We use clustering methods to arrive at groups of explanations for different areas of the input space. Finally, we conduct simulations on data that the model has not been trained or tested on. Our main contribution is to develop a systematic analytical framework that could be used for approaching explainability questions in real-world financial applications. We conclude, though, that notable model uncertainties remain, of which stakeholders ought to be aware.
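To make the idea of "average marginal contributions over all possible feature combinations" concrete, the following is a minimal sketch of an exact Shapley computation based on input interventions. It is not the authors' implementation or the full QII method of Datta et al. (2016). The function names (`shapley_influences`, `toy_model`), the three stand-in features, and the synthetic data are all hypothetical and chosen only for illustration. Exact enumeration is feasible here only because there are three features; with many features a sampling approximation would likely be needed.

```python
from itertools import combinations
from math import factorial

import numpy as np


def toy_model(X):
    # Hypothetical black-box score (assumption, not the paper's model):
    # an interaction between feature 0 and feature 1, plus a linear term.
    return X[:, 0] * X[:, 1] + 0.5 * X[:, 2]


def shapley_influences(model, x, background):
    """Exact Shapley values for one instance x via interventions on inputs.

    The value of a feature subset S is the average model output when the
    features in S are fixed to x's values and all other features keep the
    values observed in the background rows.
    """
    n = len(x)

    def value(subset):
        samples = background.copy()
        idx = list(subset)
        if idx:
            samples[:, idx] = x[idx]  # intervene: set features in S to x
        return model(samples).mean()

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for subsets of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = value(subset + (i,)) - value(subset)
                phi[i] += weight * gain
    return phi


# Synthetic background data and one instance to explain (both made up).
rng = np.random.default_rng(0)
background = rng.uniform(size=(200, 3))
x = np.array([0.9, 0.8, 0.1])

phi = shapley_influences(toy_model, x, background)
print(phi)
```

By construction, the influences sum to the difference between the model output for `x` and the average output over the background rows. Only the model's inputs and outputs are used, so the same sketch would apply to any callable model.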