Definition: Generalised linear model (GLM)
📈 A generalised linear model (GLM) is a statistical modelling framework that serves as the backbone of modern actuarial pricing and risk classification across the global insurance industry. GLMs extend classical linear regression by allowing the response variable — such as claim frequency, claim severity, or loss ratio — to follow distributions other than the normal distribution, including the Poisson, gamma, binomial, and Tweedie distributions, which better reflect the skewed, non-negative, and often zero-inflated nature of insurance data. First formalised by Nelder and Wedderburn in 1972, GLMs gained widespread adoption in insurance pricing from the 1990s onward and remain the industry's standard tool for decomposing risk into its constituent rating factors.
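In conventional notation (the symbols here are the standard textbook ones, not drawn from the text above), a GLM relates the expected response to the predictors through a link function g, with the response drawn from an exponential-family distribution:

```latex
g\!\left(\mathbb{E}[Y_i]\right) = \eta_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip},
\qquad Y_i \sim \text{exponential family}(\mu_i, \phi)
```

Choosing g as the logarithm and the distribution as Poisson or gamma recovers the frequency and severity models described below.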
⚙️ A GLM works by linking a function of the expected response variable to a linear combination of predictor variables — such as policyholder age, vehicle type, geographic zone, sum insured, or claims history — through a specified link function (logarithmic, logit, or identity, among others). Actuaries typically fit separate models for claim frequency and claim severity, then combine the outputs to derive a technical price for each risk segment. The multiplicative structure inherent in a log-linked GLM aligns naturally with how insurers think about relativities: each rating factor contributes a multiplying effect to the base rate, making the model outputs directly translatable into rating algorithms. Model calibration relies on maximum-likelihood estimation, typically via iteratively reweighted least squares (IRLS), and actuaries evaluate fit using deviance statistics, residual diagnostics, and lift charts. Regulatory environments in some jurisdictions — particularly across Solvency II markets and in the more data-mature segments of US personal lines — expect insurers to demonstrate robust model governance, including documentation, validation, and avoidance of unfairly discriminatory variables.
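The frequency-times-severity workflow and the multiplicative effect of a log link can be sketched in a few lines of Python. All coefficient names and values below are illustrative assumptions, not taken from any real rating model:

```python
import math

# Hypothetical fitted coefficients from a log-linked frequency GLM
# (claims per policy-year) and a log-linked severity GLM (cost per claim).
freq_coefs = {"intercept": -2.3, "age_18_25": 0.45, "zone_urban": 0.20}
sev_coefs = {"intercept": 7.1, "age_18_25": 0.10, "zone_urban": 0.05}

def predict(coefs, risk_factors):
    """Exponentiate the linear predictor: with a log link, each rating
    factor contributes a multiplicative relativity to the base level."""
    eta = coefs["intercept"] + sum(coefs.get(f, 0.0) for f in risk_factors)
    return math.exp(eta)

def technical_price(risk_factors):
    # Expected frequency x expected severity = expected pure premium.
    return predict(freq_coefs, risk_factors) * predict(sev_coefs, risk_factors)

base = technical_price([])  # base rate: no rating-factor loadings applied
young_urban = technical_price(["age_18_25", "zone_urban"])

# Because both links are logarithmic, the segment's combined relativity
# factors out as a pure product of the per-factor effects:
rel = young_urban / base  # equals exp(0.45 + 0.10 + 0.20 + 0.05)
```

This mirrors how GLM output is deployed in practice: the fitted coefficients are exponentiated once into relativity tables, and the rating engine only ever multiplies.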
🔬 The pervasiveness of GLMs in insurance is difficult to overstate: they underpin rate filings, portfolio analyses, and competitive benchmarking across motor, home, commercial property, and many other lines in virtually every major market. While more complex techniques — including machine learning algorithms such as gradient-boosted trees and neural networks — are increasingly used for predictive tasks, GLMs retain a central role because of their transparency, interpretability, and ease of regulatory explanation. Many insurers and insurtechs use a hybrid approach, employing advanced algorithms for feature discovery and then encoding the most predictive variables into a GLM framework that satisfies both actuarial judgment and regulatory scrutiny. As insurance data grows richer — incorporating telematics, IoT sensor feeds, and geospatial information — the GLM framework continues to evolve, accommodating higher-dimensional feature spaces while retaining the structural clarity that has made it indispensable to the industry for over three decades.
Related concepts: