Definition:Propensity score matching (PSM)

📊 Propensity score matching (PSM) is a statistical technique used in insurance analytics to estimate the causal effect of a treatment, intervention, or exposure by matching treated and untreated observations that share similar predicted probabilities of receiving the treatment. In the insurance context, "treatment" might refer to a policyholder receiving a wellness incentive, a claims management intervention, a particular underwriting action, or enrollment in a telematics program. Because insurers rarely have the luxury of running fully randomized experiments on their book of business, PSM offers a practical way to approximate experimental conditions using observational data drawn from policy administration systems, claims databases, and third-party sources.

⚙️ The method works in two stages. First, a logistic regression or similar model estimates each observation's propensity score: the probability of receiving the treatment given a set of observed covariates such as age, claims history, risk class, geographic location, and coverage tier. Second, treated and untreated observations are paired based on the closeness of their propensity scores, creating a matched comparison group that, on average, resembles the treated group on the measured covariates. Analysts in life and health insurance might use PSM to assess whether a disease-management outreach program genuinely reduced hospitalization costs, while property and casualty teams might evaluate whether a fraud-detection flag actually improved loss ratios or merely correlated with other risk characteristics. The technique's credibility depends heavily on the richness and relevance of the covariates included. Matching balances only what was measured, so unmeasured confounders remain a threat to validity, a limitation insurers must acknowledge when presenting findings to regulators or reinsurers.
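The two stages described above can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not a production implementation: the data are simulated, the covariates (`age`, `prior`), the assumed `TRUE_EFFECT`, the caliper of 0.05, and the helper names `fit_propensity` and `match_pairs` are all hypothetical, and the logistic model is fitted with plain gradient descent rather than a statistics library.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_propensity(X, treated, lr=0.5, epochs=300):
    """Stage 1: logistic regression fitted by batch gradient descent.
    Returns one propensity score per row of X."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, t in zip(X, treated):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) - t
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) for x in X]

def match_pairs(scores, treated, caliper=0.05):
    """Stage 2: greedy 1:1 nearest-neighbour matching without replacement.
    Treated units with no control within the caliper are left unmatched."""
    controls = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i in (i for i, t in enumerate(treated) if t == 1):
        if not controls:
            break
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)
    return pairs

# Simulated book of business (all values are assumptions for illustration).
random.seed(0)
TRUE_EFFECT = -0.10  # assumed effect of a telematics program on claim frequency
X, treated, outcome = [], [], []
for _ in range(400):
    age = random.gauss(0, 1)    # standardized age
    prior = random.gauss(0, 1)  # standardized prior-claims score
    # Selection bias: policyholders with fewer prior claims opt in more often.
    t = 1 if random.random() < sigmoid(-0.8 * prior + 0.4 * age) else 0
    y = 0.5 + 0.15 * prior + TRUE_EFFECT * t + random.gauss(0, 0.05)
    X.append([age, prior])
    treated.append(t)
    outcome.append(y)

scores = fit_propensity(X, treated)
pairs = match_pairs(scores, treated)

# Naive comparison ignores selection; the matched estimate compares like with like.
treated_y = [y for y, t in zip(outcome, treated) if t == 1]
control_y = [y for y, t in zip(outcome, treated) if t == 0]
naive = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)
att = sum(outcome[i] - outcome[j] for i, j in pairs) / len(pairs)

print(f"matched pairs: {len(pairs)}")
print(f"naive difference: {naive:.3f}")
print(f"matched estimate (ATT): {att:.3f}")
```

In practice an analyst would more likely fit the propensity model with an established statistics package, check covariate balance between the matched groups before reading off any effect, and report how many treated observations were dropped for lack of a close match.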

💡 For an industry increasingly pressed by regulators and stakeholders to demonstrate that pricing, claims, and marketing decisions are fair and evidence-based, PSM provides a disciplined framework for separating correlation from causation. Under regimes such as Solvency II, and amid evolving regulatory expectations around algorithmic fairness, insurers need defensible methods to show that a given intervention, rather than some hidden selection bias, drove an observed outcome. PSM also supports insurtech companies seeking to prove the value of their products to carrier partners: a telematics vendor, for instance, can use PSM to test whether policyholders who adopted its app experienced lower claim frequencies after adjusting for the possibility that safer drivers were simply more likely to opt in. While not a substitute for a true randomized controlled trial, PSM is a commonly used quasi-experimental tool in insurance data science.

Related concepts: