Introduction and Motivation
Pricing in Non-Life Insurance
Pure Premium in a Heterogeneous Context
Probability Distributions in Actuarial Science
    Discrete Distributions
    Continuous Distributions
Logistic Regression in Actuarial Context
Evaluation Rules

Building the Pricing Structure

After choosing the appropriate models, which reflect the tariff factors that influence the random variables (the Cost and the Frequency of claims), the tariff is constructed from the parameter estimates.

Assuming independence between the number of claims and the associated costs, the estimate of the risk measure of each policy (the Pure Premium) is obtained as the product of the estimates of the claim frequency and of the expected cost of one individual claim:

$$PP = E[N]\,E[X]$$
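
The product form follows from conditioning on the number of claims. As a short sketch, write the aggregate cost of a policy as S = X_1 + ... + X_N (the symbols S and X_i are notation introduced here, not taken from the slides), and assume the individual costs X_i are identically distributed with mean E[X] and independent of N:

```latex
\begin{aligned}
E[S] &= E\big[\,E[S \mid N]\,\big] \\
     &= E\big[\,N \, E[X]\,\big] \\
     &= E[N]\,E[X] = PP
\end{aligned}
```

The second line is where the independence assumption is used: given N, each of the N claims has the same expected cost E[X].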

To build the Pricing Structure, we must start by defining the characteristics of the Standard Insured.

The Standard Insured defines the Base Rate (Base Premium) of the Tariff. The premiums of the other insureds are obtained from it through discounts or loadings on the Base Premium, depending on whether the insured's risk profile is considered lower or higher than that of the Standard Insured.

Pricing Structure
In general, the pricing structure is presented as the Pure Premium of the Standard Insured together with the coefficients that relate the risk of the Standard Insured to every other risk profile.

The risk factors that influence the cost of claims may not be the same as those that influence the claim frequency. However, all of them should be included in the Pricing Structure.

Example

Risk Factor Level        Premium / Coefficient
1 (Standard Insured)     253.41 m.u. (monetary units)
A(2)                     0.824
A(3)                     0.670
I(2)                     0.657
C(2)                     1.216
C(3)                     1.396
C(4)                     1.869
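
A minimal sketch of how such a table could be applied, assuming the coefficients act multiplicatively on the base premium (consistent with "discounts or increases to the Base Premium", but not stated explicitly in the slides). The function name `premium` and the chosen profile are hypothetical.

```python
# Base premium and coefficients copied from the example table.
BASE_PREMIUM = 253.41  # Standard Insured, in monetary units
COEFFICIENTS = {
    "A(2)": 0.824, "A(3)": 0.670,
    "I(2)": 0.657,
    "C(2)": 1.216, "C(3)": 1.396, "C(4)": 1.869,
}


def premium(levels):
    """Multiply the base premium by the coefficient of each non-standard level.

    `levels` lists the levels in which the insured differs from the
    Standard Insured; an empty list returns the base premium.
    ASSUMPTION: coefficients combine multiplicatively.
    """
    result = BASE_PREMIUM
    for level in levels:
        result *= COEFFICIENTS[level]
    return result


# Hypothetical profile: level 2 of factor A and level 3 of factor C.
example_premium = premium(["A(2)", "C(3)"])
print(round(example_premium, 2))
```

Coefficients below 1 act as discounts and coefficients above 1 as loadings relative to the Standard Insured.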
Example 7

Consider a pricing structure modelling exercise that produced the following estimates for Claim Frequency and Claim Severity:

Claim Frequency                Claim Severity
Risk Factor    Estimate        Risk Factor    Estimate
S. Insured     0.10230         S. Insured     1178.33
Zone C         0.12133         Brand 12       1438.70
Zone D         0.15263         Power 11       1125.45
Zone E         0.18106         Power 4        1478.52
Zone F         0.18309         Power 7        1332.15
Age 28-32      0.07602
Age 32-36      0.07046
Age 36-44      0.06856
Age 44-65      0.06582
Age 65-        0.05906
Brand 10       0.13023
Brand 12       0.08638
Brand 13       0.13030
Brand 3        0.11615
Brand 5        0.12229
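
One possible reading of the table, which the slides do not spell out, is that each estimate is the fitted value for an insured who differs from the Standard Insured (S. Insured) in that single level. Under that assumption, and assuming both models are multiplicative, a minimal sketch of the resulting pricing structure divides each estimate by the Standard Insured's estimate and multiplies the frequency and severity relativities. Only a few rows are copied for brevity, and the function name `relativity` is hypothetical.

```python
# Estimates copied from the Example 7 table (subset of rows for brevity).
FREQ_BASE = 0.10230   # claim frequency of the Standard Insured
SEV_BASE = 1178.33    # claim severity of the Standard Insured

FREQ = {"Zone C": 0.12133, "Zone F": 0.18309, "Age 65-": 0.05906, "Brand 12": 0.08638}
SEV = {"Brand 12": 1438.70, "Power 11": 1125.45}

# Pure Premium of the Standard Insured: PP = E[N] * E[X].
base_pure_premium = FREQ_BASE * SEV_BASE


def relativity(level):
    """Combined coefficient of one level relative to the Standard Insured.

    ASSUMPTION: each table entry is the fitted value for a profile that
    differs from the Standard Insured only in `level`, and both models are
    multiplicative. A level absent from a model gets relativity 1 there.
    """
    freq_rel = FREQ.get(level, FREQ_BASE) / FREQ_BASE
    sev_rel = SEV.get(level, SEV_BASE) / SEV_BASE
    return freq_rel * sev_rel


pricing_structure = {level: relativity(level) for level in sorted(set(FREQ) | set(SEV))}

print(f"Base pure premium: {base_pure_premium:.2f}")
for level, coefficient in pricing_structure.items():
    print(f"{level:<10} {coefficient:.3f}")
```

Brand 12 is the only level in this subset that appears in both models, so its coefficient combines a frequency relativity and a severity relativity. This illustrates why every factor from either model belongs in the final structure.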

$$E[Y] = E[Y \mid Y \le s]\,P[Y \le s] + E[Y \mid Y > s]\,P[Y > s]$$

Inclusion of Large Claims:

Incorporating the risk factors in the estimation of claim costs, we have:

$$E[Y \mid X] = \underbrace{E[Y \mid X, Y \le s]}_{A}\,P[Y \le s \mid X] + \underbrace{E[Y \mid X, Y > s]}_{B}\,\underbrace{P[Y > s \mid X]}_{C}$$

A - Expected value of “common” claims
B - Expected value of “large” claims
C - Probability of occurrence of a large claim
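
The decomposition above can be sketched as a small function that combines the three pieces once each has been estimated separately (for instance A from a severity model on claims at or below s and C from a logistic regression, which is one possible setup rather than a prescribed one). The function name and the input values are hypothetical.

```python
def expected_claim_cost(common_mean, large_mean, prob_large):
    """Combine A, B and C from the decomposition of E[Y | X].

    common_mean -- A, expected cost of a claim given it is at most s
    large_mean  -- B, expected cost of a claim given it exceeds s
    prob_large  -- C, probability that a claim exceeds s
    """
    if not 0.0 <= prob_large <= 1.0:
        raise ValueError("prob_large must be a probability")
    # P[Y <= s | X] is the complement of C.
    return common_mean * (1.0 - prob_large) + large_mean * prob_large


# Hypothetical inputs, chosen only to illustrate the arithmetic.
cost = expected_claim_cost(common_mean=1000.0, large_mean=40000.0, prob_large=0.01)
print(cost)
```

Even a small probability C can shift the expected cost noticeably when B is much larger than A.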

Logistic Regression is a statistical technique used when the dependent variable falls into one of two categories. It aims to estimate the probability that an observation belongs to each category, as a function of the explanatory variables.

It is especially useful for modelling binary data in the form of proportions.

In Logistic Regression, the dependent variable is coded as a dummy variable that takes the values:

0 - failure, or absence of the attribute under analysis
1 - success, or presence of the attribute under analysis

Logistic Regression in Actuarial Context

- Proportion of policies that report claims
- Portfolio Profitability Analysis
- Identify a “good” insured profile for marketing strategies
- Probability of renewal of a policy
- ...

$$P[Y = y] = \pi^{y}\,(1 - \pi)^{1 - y}, \quad y = 0, 1$$

The Binomial distribution belongs to the Exponential Family.

A linear model $E[Y \mid X] = \beta_0 + \beta_1 X_1 + \ldots + \beta_k X_k$ is not suitable because the linear estimate may be negative or larger than 1.

The natural idea is to model the odds, or rather the logarithm of the odds:

$$\operatorname{logit}(\pi) = \beta_0 + \beta_1 X_1 + \ldots + \beta_k X_k \iff \ln\left(\frac{\pi}{1 - \pi}\right) = \beta_0 + \beta_1 X_1 + \ldots + \beta_k X_k$$
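
The remark about the Exponential Family and the choice of the logit are connected. As a short worked step (standard algebra, not shown on the slides), the probability function of Y can be rewritten so that the log-odds appear as the coefficient of y, which is why the logit is the canonical link for binary data:

```latex
\begin{aligned}
P[Y = y] &= \pi^{y}(1-\pi)^{1-y} \\
         &= \exp\big\{ y \ln \pi + (1-y)\ln(1-\pi) \big\} \\
         &= \exp\Big\{ y \ln\frac{\pi}{1-\pi} + \ln(1-\pi) \Big\}, \qquad y = 0, 1
\end{aligned}
```
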
Gracinda R. Guerreiro Data Science for Actuaries 2018/2019 97/102
Logistic Regression

The probability of success (occurrence of the event),

$$\pi = P[Y = 1 \mid X]$$

is estimated from

$$\pi = P[Y = 1 \mid X] = \frac{\exp(\beta_0 + \beta_1 X_1 + \ldots + \beta_k X_k)}{1 + \exp(\beta_0 + \beta_1 X_1 + \ldots + \beta_k X_k)}$$

The odds are easily obtained from:

$$\frac{P[Y = 1 \mid X]}{P[Y = 0 \mid X]} = \exp(\beta_0 + \beta_1 X_1 + \ldots + \beta_k X_k)$$

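
As a worked step, comparing the odds for two profiles that differ by one unit in a single covariate X_j and agree in all others (a comparison introduced here for illustration) shows how one coefficient is read as an odds ratio:

```latex
\frac{\operatorname{odds}(X_j = x + 1)}{\operatorname{odds}(X_j = x)}
  = \frac{\exp\big(\beta_0 + \ldots + \beta_j (x + 1) + \ldots + \beta_k X_k\big)}
         {\exp\big(\beta_0 + \ldots + \beta_j x + \ldots + \beta_k X_k\big)}
  = \exp(\beta_j)
```

For a dummy variable, exp(β_j) compares the odds of the level it flags with the odds of the reference level.
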
Example 8

3 Plot the probabilities of reporting claims for all driver ages.

3 Fit a Logistic Regression for estimating the probability of reporting claims in each zone of the country.
    1 Estimate the probability of reporting claims in zones A and F.
    2 Calculate the odds ratio of the model. Comment.
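
A minimal sketch of the zone exercise, using made-up counts rather than the course portfolio. It relies on the fact that when zone is the only (categorical) covariate, the maximum-likelihood fitted probability for each zone equals the observed proportion, so the coefficients can be written down without an iterative fit. All names and numbers below are hypothetical.

```python
import math

# HYPOTHETICAL counts per zone: (policies with at least one claim, policies).
# These numbers are made up for illustration; they are not the course data.
COUNTS = {"A": (50, 1000), "F": (90, 1000)}


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


# With zone as the only (categorical) covariate, the fitted probability of
# each zone equals its observed proportion.
prob = {zone: claims / total for zone, (claims, total) in COUNTS.items()}

# Coefficients with zone A as the reference level.
beta_0 = logit(prob["A"])
beta_F = logit(prob["F"]) - beta_0

# Odds ratio of zone F versus zone A.
odds_ratio_F = math.exp(beta_F)

print(prob, round(odds_ratio_F, 3))
```

With real data and several covariates, the coefficients would instead come from a fitted GLM with a binomial family and logit link.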
Logistic Regression for Large Claims

Example 9

Consider the same portfolio studied in Example 4.

Suppose it was previously decided that every claim above 18,000 € is a large claim and must be modelled outside the GLM for “usual” claim costs.

1 Fit a regression model to estimate the probability of an insured reporting a large claim. Comment on the results obtained.

2 Estimate the probability that an insured with risk factors

    Age = 50; Age Vehicle = 0; Zone = A; Brand = Peugeot; Fuel = Gasoline; Power = 11

  reports a large claim.

3 Given the output, evaluate your options on how to include the modelling of large claims in your Pricing Structure.
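
Step 2 amounts to adding up the coefficients that apply to the profile and passing the sum through the inverse logit. The sketch below shows that mechanism only: the intercept and coefficients are invented, only three of the six factors are included, and the function name is hypothetical. None of it reproduces the output of the actual fit.

```python
import math

# HYPOTHETICAL coefficients of a fitted logistic model for large claims.
# Reference levels (coefficient 0) are assumed to be folded into the intercept.
INTERCEPT = -4.0
COEFS = {
    ("Zone", "F"): 0.30,
    ("Fuel", "Diesel"): 0.20,
    ("Power", 11): 0.15,
}


def large_claim_probability(profile):
    """Inverse logit of the linear predictor for one insured.

    `profile` maps factor names to levels; levels without an entry in
    COEFS are treated as reference levels and contribute 0.
    """
    eta = INTERCEPT + sum(COEFS.get(item, 0.0) for item in profile.items())
    return 1.0 / (1.0 + math.exp(-eta))


profile = {"Zone": "A", "Fuel": "Gasoline", "Power": 11}
p_large = large_claim_probability(profile)
print(round(p_large, 4))
```

A continuous covariate such as Age would enter the linear predictor as its coefficient times its value instead of through a lookup.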

All these models may be applied in a wide variety of applications.

Use your knowledge and have fun!

Appendix
For Further Reading
References
de Jong, P. and Heller, G. Z. (2008). Generalized Linear Models for Insurance Data. Cambridge University Press, Cambridge.