The relationship between credit and defaults seems to be negative. How come giving more credit results in a lower chance of default? Rightfully suspicious, you go talk to other analysts in an attempt to understand this. It turns out the answer is very simple: to no one’s surprise, the lending company gives more credit to customers who have a lower chance of defaulting. So, it is not the case that high lines reduce default risk, but rather the other way around: lower risk increases the credit lines. That explains it, but you still haven’t solved the initial problem: how to model the relationship between credit risk and credit lines with this data. Surely you don’t want your system to think that higher lines imply a lower chance of default. Also, naively randomizing lines in an A/B test just to see what happens is pretty much off the table, due to the high cost of wrong credit decisions.
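To make the reversed relationship concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration, not the lending company's data: a latent `risk` score per customer, a rule that assigns higher lines to lower-risk customers, and a true effect where each extra 1,000 in credit line raises the default probability by 3 percentage points. Even with that positive true effect, the raw correlation between lines and defaults should come out negative; only after holding risk fixed does the sign flip back.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical latent default risk for each customer (assumed, not real data).
risk = rng.uniform(0.05, 0.5, size=n)

# Assumed lending rule: lower risk -> higher credit line (in thousands), plus noise.
credit_line = 5.0 - 8.0 * risk + rng.normal(0.0, 1.0, size=n)

# Assumed true causal effect: +0.03 default probability per extra 1,000 of line.
true_effect = 0.03
p_default = np.clip(risk + true_effect * credit_line, 0.0, 1.0)
default = rng.binomial(1, p_default)

# Naive view: raw correlation between credit line and default.
naive_corr = np.corrcoef(credit_line, default)[0, 1]

# Adjusted view: linear regression of default on credit line, holding risk fixed.
X = np.column_stack([np.ones(n), credit_line, risk])
coef, *_ = np.linalg.lstsq(X, default, rcond=None)
adjusted_effect = coef[1]

print(f"naive correlation (line vs. default): {naive_corr:.3f}")
print(f"effect of line after adjusting for risk: {adjusted_effect:.3f}")
```

In this toy setup the adjustment works only because the simulated risk score is fully observed. With real data, the factors that drove past credit decisions are rarely all available, which is exactly why the modeling problem is hard.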
What both of these problems have in common is that you need to know the impact of changing something that you can control (marketing budget and credit limit) on some business outcome you wish to influence (customer applications and default risk). Impact or effect estimation has been the pillar of modern science for centuries, but only recently have we made huge progress in systematizing the tools of this trade into the field that is coming to be known as causal inference. Additionally, advancements in machine learning and a general desire to automate and inform decision-making processes with data have brought causal inference into industry and public institutions. Still, the causal inference toolkit is not yet widely known by decision makers or data scientists.
Hoping to change that, I wrote Causal Inference for the Brave and True, an online book that covers the traditional tools and recent developments from causal inference, all with open source Python software, in a rigorous, yet lighthearted way. Now, I’m taking that one step further, reviewing all that content from an industry perspective, with updated examples and, hopefully, more intuitive explanations. My goal is for this book to be a starting point for whatever question you have about making decisions with data.