# Basic concepts and techniques of the pricing process

## Factsheet

**Publication date:**

31 October 2016

**Last updated:**

29 October 2018

**Author(s):**

Pietro Parodi

Basic concepts and techniques of the pricing process in general insurance.

**Contents**

- Summary
- Basic concepts and glossary
- The pricing cycle - pricing in the context of an insurance company, and external influences
- The traditional approach to pricing - Burning cost analysis
- The modern approach to commercial lines/reinsurance pricing - Stochastic loss modelling
- Personal lines pricing techniques: GLM and price optimisation
- Uncertainties around costing
- From the expected losses to the technical premium
- Key facts
- Further reading

### **Summary**

Pricing is the corporate process of putting a price tag on policies. It is best understood as the core part of the pricing control cycle which involves business planning, pricing itself and rate monitoring. It is sometimes useful to distinguish between "costing" (the calculation of the technical premium) and "pricing" (the actual commercial decision). The focus of this fact file is mainly on costing, although the role of market influences is mentioned.

Pricing is not an isolated function but has significant interaction with a number of other corporate functions, especially planning, underwriting, claims, reserving, management information, capital modelling, risk management, investment, finance and reinsurance.

Pricing can be carried out more or less scientifically depending on the amount and quality of data available and on how competitive and sophisticated the market is. The traditional approach to pricing is through burning cost analysis, in which the technical premium is based on the average past loss experience, suitably adjusted to reflect changed loss costs and exposures. In one form or another this is still the most widespread technique in a commercial lines/reinsurance context.

Burning cost has significant limitations - specifically, it is difficult to move beyond a point estimate of the expected losses - and as a result a more refined technique, stochastic loss modelling, has gained ground in the past decades. This is based on the creation of a separate frequency and severity model for the policy losses (based on past experience, benchmarks and possibly some forward-looking thinking). The two models are then combined via Monte Carlo simulation or other numerical techniques to produce a model of the aggregate loss distribution.

In a data-rich situation such as personal lines insurance it is possible to move beyond a single stochastic model for all losses and charge policyholders on the basis of their individual riskiness, using rating factor selection and a technique called "generalised linear modelling". Recent developments in machine learning are enabling insurers to make rating factor selection better and more efficient, and to make sense of large amounts of data in different forms ("Big Data").

### **Basic concepts and glossary**

In this section we introduce some concepts which will be useful in our treatment of burning cost analysis and stochastic loss modelling. For more information about these concepts, see the more detailed glossary (Parodi (2016b)) and Parodi (2014).

**Claims/losses inflation**

Claims inflation (or losses inflation) is that phenomenon by which the cost of a loss with given characteristics changes over time (Lloyd's, 2014). It differs in general from the retail price index or the consumer price index inflation as it depends on factors that may or may not be included in these indices.

If losses inflation is captured by an index *I*(*t*) (e.g. wage inflation plus superimposed inflation of 1%), the value of revalued losses can be calculated as follows: if *t* is the time at which the original loss occurred, the value Rev(*X*, *t*\*) of the loss *X* after revaluation can be calculated as:

$$\mathrm{Rev}(X, t^*) = X \times \frac{I(t^*)}{I(t)}$$

where *t*\* is the mid-point of the reference policy period (in the future) and *I*(*t*\*) is the *assumed* value of the inflation index at time *t*\*. If claims inflation can be assumed to be constant at *r* per annum, the expression above simplifies to (with *t*\* - *t* measured in years):

$$\mathrm{Rev}(X, t^*) = X \times (1 + r)^{t^* - t}$$

For claims-made policies, the revaluation is from the reporting date rather than from occurrence date.
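The revaluation rule can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed implementation: the function names, the 365.25-day year convention and the dates and amounts in the usage lines are all assumptions made for the example.

```python
from datetime import date


def revalue_with_index(loss: float, index_at_loss: float, index_at_ref: float) -> float:
    """Revalue a loss with an inflation index: Rev(X, t*) = X * I(t*) / I(t)."""
    return loss * index_at_ref / index_at_loss


def revalue_constant_rate(loss: float, loss_date: date, ref_date: date, r: float) -> float:
    """Revalue a loss at constant claims inflation r: X * (1 + r) ** (t* - t).

    The time difference is converted to years using a 365.25-day year,
    which is an assumption of this sketch.
    """
    years = (ref_date - loss_date).days / 365.25
    return loss * (1.0 + r) ** years


if __name__ == "__main__":
    # Hypothetical loss of 50,000 occurring on 1 July 2012, revalued at 3% p.a.
    # to a hypothetical mid-policy date of 1 October 2017.
    revalued = revalue_constant_rate(50_000.0, date(2012, 7, 1), date(2017, 10, 1), 0.03)
    print(round(revalued, 2))
```

For a claims-made policy, the same functions would simply be called with the reporting date in place of the occurrence date.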

**Exposure**

An exposure measure is a quantity which is roughly proportional to the risk of a policyholder (or a group of policyholders). More formally, it can be seen as an *a priori* estimate of the risk, before considering any additional information coming from the actual claims experience of the policyholder.

Examples of exposure measures are the number of vehicle years for motor, the wageroll or number of employees for employer's liability, the turnover for public liability, the total sum insured for property.

When the exposure is a monetary amount, it needs to be revalued (or rebased, on-levelled, trended) in a similar way to claims for it to remain relevant. Also, where necessary the exposure must be so transformed that it is aligned with the policy period for which the past claims experience is provided.

**Policy basis**

In order to correctly match claims and exposure for pricing purposes it is essential to understand which claims attach to a given policy period. How this is done depends on the policy basis:

- Insurance contracts are most commonly on an *occurrence basis*, i.e. the policy covers all losses occurred during the policy period, regardless of when they are reported (with some limitations).
- Some commercial lines products such as professional indemnity and D&O (and also some reinsurance products) use the *claims-made basis*, which covers all losses reported during the policy period, regardless of when they occurred (with some limitations).
- In a reinsurance context we also have the *risk attaching basis*, in which all losses arising from original policies that incept during the (reinsurance) policy period will be recovered from the reinsurer, regardless of when the loss occurs and of the time at which it is reported.
- More complicated policy bases have been designed to allow for the idiosyncrasies of specific products. E.g. the *integrated occurrence basis* is sometimes used for product liability to allow for the possibility of a large number of claims attached to the same product.

**Policy modifiers**

Some common policy modifiers that are used in this fact file are explained below.

- Excess/deductible. This is the amount that the policyholder needs to pay on a single loss before starting to receive compensation from the insurer. This is often called each-and-every-loss (EEL) deductible in a commercial lines context. If *X* is the loss and *D* is the deductible, the amount ceded to the insurer (assuming no other policy modifiers apply) is *X'* = max(0, *X* - *D*).
- Limit. This is the maximum amount that the insurer will pay for a given loss. If both an EEL deductible and a limit are present, the loss ceded to the insurer is *X'* = min(*L*, max(*X* - *D*, 0)).
- Annual aggregate deductible (AAD). In a commercial lines context, the AAD places a cap on the annual amount retained by the insured as a result of having an EEL deductible. After the AAD is reached, each loss is paid in full by the insurer. In a reinsurance context, this is the total annual amount that the policyholder needs to pay before receiving compensation from the reinsurer - the two uses should not be confused.
- Quota share. This is the percentage of all losses ceded to the insurer, in exchange for an equal share of a premium.
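The effect of these modifiers on a set of losses can be sketched as follows. The function names and the loss amounts in the usage lines are hypothetical, and the AAD function covers the commercial lines usage only and ignores any limit, which is a simplification made for the example.

```python
def ceded_per_loss(x: float, deductible: float = 0.0, limit: float = float("inf")) -> float:
    """Amount ceded to the insurer for one loss: X' = min(L, max(X - D, 0))."""
    return min(limit, max(x - deductible, 0.0))


def split_with_aad(losses, deductible: float, aad: float):
    """Annual retained/ceded split with an EEL deductible capped by an AAD.

    Commercial lines usage: the insured retains at most `aad` in the year;
    anything beyond that is paid by the insurer. No limit is applied here.
    """
    retained_uncapped = sum(min(x, deductible) for x in losses)
    retained = min(retained_uncapped, aad)
    ceded = sum(losses) - retained
    return retained, ceded


if __name__ == "__main__":
    # Hypothetical losses in one policy year
    year_losses = [150_000.0, 80_000.0, 300_000.0]
    print([ceded_per_loss(x, 100_000.0, 10_000_000.0) for x in year_losses])
    print(split_with_aad(year_losses, deductible=100_000.0, aad=250_000.0))
```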

**Return period**

The return period of an event is the reciprocal of the annual probability p of that event. E.g., a return period of 200 years means a probability of 0.5% in a given year.

**Technical premium**

The premium that the insurer should charge to cover costs and expenses and achieve the profit target it has set. The premium actually charged will need to take account of commercial considerations and will in general differ from the technical premium.

### **The pricing cycle - pricing in the context of an insurance company, and external influences**

Pricing is a process which is best understood in the corporate context of the pricing control cycle. It is not an isolated function but interacts with a significant number of other corporate functions, such as:

- The *planning function*, which is in charge of setting objectives that reflect the company's strategy, and produces guidelines and parameters for pricing decisions.
- The *underwriting function*, which normally decides the price for individual contracts based on actuarial costing and other information.
- The *claims function*, which provides the raw information for experience rating.
- The *reserving function*, which provides information on IBNR and IBNER.
- The *capital modelling function*, which provides information on capital loadings and possibly information on tail risk.
- The *management information function*, which receives from pricing actuaries up-to-date information on price rates and price rate changes and feeds back to them information on the portfolio.
- *Finance*, which is the source of information about expense loadings.
- The *risk management function*, which independently validates the pricing process and pricing models.
- The *investment function*, which provides the discounting for investment income.
- The *reinsurance function*, which provides information on the cost of reinsurance, which is a component of the price.
- Other functions such as *marketing, sales, IT, legal and compliance* are all relevant to pricing, albeit often only indirectly.

The figure below shows the *main activities* of a typical pricing control cycle. At a very high level, this is nothing but an instantiation of the classical project control cycle (planning - execution - monitoring).

*(Source: Parodi (2014), Pricing in general insurance)*

As the chart makes clear, a large component of the pricing cycle is the monitoring of everything that is relevant to pricing: claim costs, expenses, investments and so on. A crucial variable that needs to be monitored is the price rate change, because it is linked tightly to the performance of the portfolio. Pricing actuaries are in a privileged position to provide a view on price rate changes that takes account of changes to the risk in a quantitative fashion.

**Underwriting cycle**

While the technical premium can be calculated based on past experience and benchmarks but without a knowledge of the conditions of the market, a commercial decision on the premium actually charged requires knowledge of what the competition is doing and - at an aggregate level - awareness of the position of the market in the underwriting cycle. This is often visualised as the achieved loss ratio at market level as a function of time.

The underwriting cycle describes the "softening" (reducing) and "hardening" (increasing) of the premium that can be realistically charged to clients for the same risk. The cycle is the result of the complex interplay between supply of insurance capital, demand for insurance cover, unexpected external factors such as large catastrophes (e.g. the 9/11 terrorist attacks and Hurricane Katrina) and the fact that the true profitability of a portfolio only becomes obvious retrospectively and in the long term.

The underwriting cycle is obviously not the only influence on the actual premium. Other strategic considerations, both internal (e.g. cross-selling) and external (e.g. desire to increase market share), will play a role.

### **The traditional approach to pricing - Burning cost analysis**

Burning cost is an odd name for what is actually the simplest and most intuitive approach to costing: estimate the expected losses to a policy based on some type of average of the losses in past years, after allowing for claims inflation, exposure changes, incurred but not reported (IBNR) claims, and any other amendments that need to be made to make the past claims data relevant to today's situation.

In its most basic incarnation, burning cost is based on aggregate losses, i.e. without considering the information on individual losses but only the total amount per year. However, this approach easily falls apart in the presence of deductibles and limits, as the policy might have had different levels of deductible over the years and the effect of inflation is non-linear in the presence of a deductible (the so-called "gearing effect"). The technique explained here assumes that individual loss information is available.

**Methodology and calculation example**

In order to understand how the burning cost methodology works, it is useful to keep in mind the simple example of a public liability policy of annual duration that incepts on 1/4/2017, and for which the following are available:

- individual loss data from 2007 to 2016 (observation period) with values as estimated at 30 November 2016;
- historical exposure information (turnover) over the same period, and with an estimated value for 2017;
- cover data: an each-and-every-loss (EEL) deductible of £100,000, an annual aggregate deductible (AAD) of £2.4m, a limit (EEL and on aggregate) of £10m.

The claims and exposure information (before revaluation and other modifications) is summarised in the table below.

*Amounts in £m*

We will assume that claims inflation has been 3% p.a. over the observation period and will remain so during the reference period, and that all things (and specifically, all risk) being equal turnover has grown (and will grow) by 2% p.a. We also estimate that a loss of expected value £15m will occur with a probability of 1% in a given year (return period: 100 years).

Burning cost analysis can be articulated in the following steps:

**Inputs:** (a) historical individual losses, plus loss development triangles for client and/or portfolio if available; (b) historical exposure, and estimated exposure for the reference period; (c) cover data, including reference period, duration, deductibles and limits.

1. *Revalue individual losses* - In this step, we revalue each past loss to the value that we estimate it would have if it were to occur mid-policy-year (in our specific case, 1/10/2017).
2. *Apply policy modifiers to all losses* - For each loss, the amount retained and ceded is calculated after individual modifiers such as deductibles and limits are applied. Note that we are not interested here in the historical policy modifiers - just the ones for the policy being priced (in our case, EEL = £100,000, Limit = £10m).
3. *Aggregate losses by policy year* - The total amount of incurred, retained and ceded losses is calculated for each year. The retained losses can be split into retained below the deductible and retained above the limit. Note that the relevant policy year is not the historical policy year, but the one for the new/renewed policy.
4. *Make adjustments for IBNR and IBNER* - The total losses for each year are grossed up to incorporate the effect of incurred but not (yet) reported losses (IBNR) and incurred but not enough reserved (IBNER) losses. This can be done by using loss development triangles for the client or (more frequently) by using portfolio/market correction factors, based on a significant number of different clients, and calculating loss development factors using techniques such as chain ladder or variants (Cape Cod, Bornhuetter-Ferguson, etc.) to project the amount for each year to ultimate. These techniques normally take into account both IBNR and IBNER at the same time. The same correction factors can be used for the incurred, retained and ceded amounts.
5. *Revalue exposure and align it to the policy year* - If the exposure is a monetary amount (e.g. turnover, payroll) it must also be brought to the expected value mid-policy. Note that the inflation index used for exposure is not necessarily (or even typically) the same as that used to revalue claims. Also, in case historical exposure figures refer to periods different from the policy year, the two periods need to be aligned: e.g. if exposure is 1/1 to 31/12 and the policy year is 1/4 to 31/3, the exposure for policy year *n* can be approximated as 9/12 of the exposure for year *n* and 3/12 of the exposure for year *n*+1.
6. *Adjust for exposure changes* - Adjust the aggregate losses for each policy year (PY), bringing them up to the level of exposure expected for the reference year (RY) of our policy: adj_losses(PY) = exposure(RY)/exposure(PY) x losses(PY). (This step may be unnecessary if there are no aggregate policy features such as AAD and no adjustments for large losses.)
7. *Make adjustments for large losses* - If we believe that large losses have been under-represented, we can correct for that by adding to the total amount for each year an amount equal to the amount of a possible large loss divided by the return period estimated for that loss. Conversely, if we believe that large losses have been over-represented because losses with a long return period have occurred by a fluke during a much shorter observation period, we can remove these large losses and add them back after scaling them down to reflect their return period. Note that these corrections depend on an often subjective assessment of the return period of losses.
8. *Make other adjustments (risk profile changes, terms and conditions, etc.)* - If for some reason we believe that the past experience is not fully relevant to today's environment because of changes to the risk profile (e.g. presence of new risk control mechanisms, M&A activities) or to the terms and conditions (e.g. the introduction/removal of an important exclusion), we may want to make adjustments to the past years to reflect that. Needless to say, this may easily become very subjective and some bias (or utter wishful thinking) may be introduced in the process.
9. *Impose aggregate deductibles/limits as necessary* - We now have aggregate claims figures for each year that reflect current monetary amounts and current exposures, and make an allowance for IBNR, IBNER, large losses and other profile changes. We can therefore apply other policy modifiers of an aggregate nature, such as an annual aggregate deductible or an annual aggregate limit. Although the overall incurred losses do not change as a result of this, the balance between retained and ceded losses may shift. E.g. the AAD may put a cap on the retained losses and increase the ceded amount.
10. *Exclude specific years or give them lower weights in the calculation* - Some of the policy years may be excluded from the burning cost calculation because they are either too immature and therefore too uncertain (e.g. the latest year) or because they are not fully relevant (even after allowing for risk profile changes). This feature should be used sparingly to avoid excessive subjectivity and the intrusion of bias.
11. *Calculate the burning cost* - We are now in a position to calculate a burning cost for each year (total losses per unit of exposure) and an overall burning cost for next year (expected total losses per unit of exposure) as the average of the losses over a selected number of years. The average could be calculated either on an exposure-weighted or an unweighted basis, but the use of an exposure-weighted average is strongly recommended as years with a small exposure basis tend to be - as a general rule, at least - less trustworthy. The average should also take account of any excluded/differently weighted years.
12. *Loading for costs, investment income, and profit* - Claims-related expenses may (or may not) be already included in the incurred loss amounts. Other items that should be added are underwriting expenses

**Output:** technical premium
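The core of the calculation for the ceded losses can be sketched as below. This is a deliberately simplified illustration under stated assumptions, not the full method: revaluation uses whole years rather than exact mid-period dates, the large-loss adjustment covers only the under-represented case and is applied after the policy modifiers, and risk profile adjustments, the AAD, year exclusions and premium loadings are left out. The function name and all figures in the usage lines are hypothetical; only the default inflation rates, deductible, limit and large-loss assumptions come from the example in the text.

```python
def burning_cost(losses_by_year, exposure_by_year, ldf_by_year, ref_year, ref_exposure,
                 claims_inflation=0.03, exposure_growth=0.02,
                 eel=100_000.0, limit=10_000_000.0,
                 large_loss=0.0, return_period=float("inf")):
    """Exposure-weighted burning cost of the ceded losses (simplified sketch).

    Returns (burning cost per unit of exposure, expected ceded losses in money).
    """
    def ceded(x):
        return min(limit, max(x - eel, 0.0))

    total_losses = 0.0
    total_exposure = 0.0
    for year, losses in losses_by_year.items():
        n = ref_year - year  # whole years: an assumption of this sketch
        # Revalue individual losses for claims inflation
        revalued = [x * (1.0 + claims_inflation) ** n for x in losses]
        # Apply the EEL deductible and limit, then aggregate by year
        ceded_year = sum(ceded(x) for x in revalued)
        # Gross up for IBNR/IBNER with a loss development factor
        ultimate = ceded_year * ldf_by_year[year]
        # Revalue (on-level) the exposure
        onlevel_exposure = exposure_by_year[year] * (1.0 + exposure_growth) ** n
        total_losses += ultimate
        total_exposure += onlevel_exposure

    # Exposure-weighted average of the yearly burning costs per unit of exposure
    bc_pue = total_losses / total_exposure
    # Large-loss adjustment LL/RP, applied after the policy modifiers (assumption)
    bc_pue += ceded(large_loss) / return_period / ref_exposure
    return bc_pue, bc_pue * ref_exposure


if __name__ == "__main__":
    # Hypothetical data, for illustration only (turnover in GBP m)
    losses = {2014: [250_000.0, 40_000.0], 2015: [600_000.0], 2016: [120_000.0, 900_000.0]}
    exposure = {2014: 480.0, 2015: 500.0, 2016: 520.0}
    ldfs = {2014: 1.05, 2015: 1.15, 2016: 1.40}
    print(burning_cost(losses, exposure, ldfs, ref_year=2017, ref_exposure=540.0,
                       large_loss=15_000_000.0, return_period=100.0))
```

Summing the ultimate losses and the on-levelled exposures before dividing is equivalent to taking the exposure-weighted average of the yearly burning costs, which is why no explicit weights appear in the code.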

The table below summarises the calculations for our selected example.

Legend: Inc = Incurred, Ret = Retained (below the deductible)

**In formulae**

The mathematically inclined might find it useful to have the methodology above summarised by the following formulae. On an exposure-weighted basis, the burning cost per unit of exposure (p.u.e.) is:

$$BC = \frac{\sum_j \mathcal{E}_j \, BC_j}{\sum_j \mathcal{E}_j}$$

where the sum runs over the selected policy years and $BC_j$ is the burning cost p.u.e. for policy year *j*:

$$BC_j = \frac{\mathrm{LDF}_j \times \left(\sum_i X'_{ij}\right) \times \dfrac{\mathcal{E}^*}{\mathcal{E}_j} \times (\text{risk profile adjustments}) + LLA}{\mathcal{E}^*}$$

And where in turn:

- $X'_{ij}$ is the value of the *i*-th loss in policy year *j* after revaluation and after each-and-every-loss deductibles and limits are taken into account;
- $\mathrm{LDF}_j$ is the loss development factor for policy year *j*, which takes into account IBNR/IBNER development;
- $\mathcal{E}_j$ is the exposure for policy year *j* after revaluation (aka on-levelling) and alignment between exposure period and policy period;
- $\mathcal{E}^*$ is the exposure for the reference policy year;
- "risk profile adjustments" refers to adjustments for changes in the risk profile, terms and conditions, etc. as per Step 8 above. This will normally take the form of a risk profile index;
- *LLA* is a large-loss adjustment (negative or positive) applied uniformly to all policy years. In case large losses are under-represented, i.e. there is currently no loss in the dataset whose return period is significantly larger than the observation period, this adjustment will normally take the form *LLA = LL/RP*, where *LL* is the expected size of a large loss of a given return period and *RP* is the return period. In case large losses with estimated return periods all larger than the observation period are already present in the data set and we want to scale down their effect, the adjustment instead removes those losses and adds them back scaled down to reflect their return periods, as described in the large-loss adjustment step above. Note that the size of the adjustment depends on the reference exposure - hence this correction must be made after the exposure correction.

The burning cost can also be expressed as an absolute monetary amount, by multiplying the BC p.u.e. by the reference exposure:

$$BC_{\text{monetary}} = \mathcal{E}^* \times BC$$

The technical premium is then given by:

**Producing an aggregate loss distribution**

Despite the simplicity and lack of granularity of the burning cost methodology, underwriters and actuaries have consistently tried to stretch it to produce not only a point estimate of the expected losses but also an estimate of the year-on-year volatility and even of the distribution of outcomes.

Formally, this can be done easily. The volatility of the total losses can be estimated by the standard deviation of the burning cost (expressed in terms of the reference exposure) for the different policy years.

As for the distribution of outcomes (which allows one to express an opinion on various important percentiles of the incurred/retained/ceded distribution, helping to set the correct retention levels), a popular method is to assume that the distribution is lognormal and to calibrate it using the method of moments, i.e. by choosing the parameters of the lognormal distribution so that the expected losses E(S) and the standard deviation SD(S) are equal to those calculated based on the observation period for the reference exposure.

This technique can be used to model the incurred, retained or ceded distribution *before* the effect of aggregate deductibles/limits is taken into account.

Despite the popularity of this method, one would do well to remember that this is merely a formal solution: there is actually no guarantee that the aggregate loss model (incurred, retained or ceded) is a lognormal distribution, and even if that were the case to a good approximation, a calibration of sufficient accuracy is not possible based on 5-10 years of aggregate experience.

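To use this method we need the formulae that relate the mean and standard deviation of the lognormal distribution to its parameters $\mu$ and $\sigma$, and vice versa:

$$E(S) = \exp\left(\mu + \frac{\sigma^2}{2}\right), \qquad SD(S) = E(S)\sqrt{\exp(\sigma^2) - 1}$$

$$\sigma = \sqrt{\ln\left(1 + \left(\frac{SD(S)}{E(S)}\right)^2\right)}, \qquad \mu = \ln E(S) - \frac{\sigma^2}{2}$$

where *S* is the random variable representing total losses and *SD(S)* is the standard deviation of the total losses. In practice, the parameters of the lognormal distribution are estimated using the second pair of formulae, with the empirical estimates of *E(S)* and *SD(S)* derived from the data.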
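The method-of-moments calibration can be sketched with the Python standard library alone. The function names are illustrative and the mean and standard deviation in the usage lines are hypothetical figures, not results from the example above.

```python
import math
from statistics import NormalDist


def lognormal_from_moments(mean_s: float, sd_s: float):
    """Return (mu, sigma) of the lognormal with the given mean and standard deviation."""
    sigma2 = math.log(1.0 + (sd_s / mean_s) ** 2)
    mu = math.log(mean_s) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def lognormal_moments(mu: float, sigma: float):
    """Return (E(S), SD(S)) of a lognormal with parameters mu and sigma."""
    mean_s = math.exp(mu + sigma ** 2 / 2.0)
    sd_s = mean_s * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean_s, sd_s


def lognormal_percentile(mu: float, sigma: float, p: float) -> float:
    """Percentile of the lognormal, via the standard normal quantile."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))


if __name__ == "__main__":
    # Hypothetical burning cost results: expected losses 1.0m, standard deviation 0.5m
    mu, sigma = lognormal_from_moments(1_000_000.0, 500_000.0)
    for p in (0.5, 0.9, 0.99):
        print(p, round(lognormal_percentile(mu, sigma, p)))
```

The percentiles obtained this way inherit all the caveats above: they are only as good as the lognormal assumption and the handful of years behind the two moments.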

**Limitations of burning cost analysis**

The burning cost method has several severe limitations. We will only mention three:

- It is a blunt tool for estimating the expected losses, and an even blunter tool if used for capturing the volatility around the expected losses and the effect of policy deductibles and limits.
- It fails to disentangle frequency and severity trends, when such trends are present.
- The treatment of large losses is inevitably too rough. E.g., it is unclear what return period one should assign to a large loss which is added to/removed from the observation period.

### **The modern approach to commercial lines/reinsurance pricing - Stochastic loss modelling**

Stochastic loss modelling is a methodology that has been devised to overcome the limitations of burning cost analysis, which implicitly assumes that the future is going to be like a sample from the past, after adjustments for trends such as inflation and exposure changes.

In contrast to this, stochastic loss modelling attempts to build a simplified mathematical description of the loss process in terms of the number of claims, their severity and the timing by which they occur, are reported and paid.

The discipline that studies the theory of modelling is called machine learning. This has come to the fore recently as a means of making sense of large quantities of data ("Big Data") but at a more fundamental level it addresses the problem of producing models of the right level of complexity and form to maximise their predictive power. The main teaching to remember is probably that models should not have more parameters than it is possible to calibrate in a reliable way (Parodi, 2012).

Stochastic loss modelling is typically based on the collective risk model (Klugman et al., 2008), which models the total losses over a year as the sum of a number of individual losses. The number of individual losses is assumed to be random and distributed according to a given loss count distribution (more frequently albeit inaccurately referred to as the "frequency distribution") such as a Poisson distribution. The individual losses are assumed to be independent and identically sampled from the same loss severity distribution, such as a lognormal distribution. The number of losses and the individual severities are assumed to be independent.

*(Source: Parodi (2014), Pricing in General Insurance)*

The standard frequency/severity approach is based on attempting to reconstruct the true frequency and severity distribution described above, and can be summarised as in the chart above. The main steps of the process (data preparation, frequency modelling, severity modelling, aggregate loss modelling) are explained below.

**Inputs**

Much as in the case of burning cost analysis, the inputs to the process are the historical individual losses (plus portfolio information), historical and current exposure, policy cover data.

**STEP 1 - Data preparation**

In this step, basic data checks are performed and the individual losses are adjusted to take account of factors such as inflation, currency conversion, etc. so that the past data set is transformed into a data set of present relevance.

**STEP 2 - Frequency modelling**

Overall, the way in which frequency modelling works is not too dissimilar from burning cost analysis - except that it is applied to loss counts and not to total amounts.

- *IBNR (incurred but not reported) analysis.* First of all, calculate the number of claims that have been reported for each year and project the reported number of claims in each policy year to the estimated ultimate number of claims. This can be done by triangle development techniques such as chain ladder development, BF, Cape Cod… The development factors can be based on client information or, more frequently, on portfolio/market information, which tends to be more stable (see Parodi (2014), Chapter 13).
- Once projected to ultimate, the number of losses is adjusted for changes in exposure.
- A loss count model is then fitted to the loss counts for the observation period. Since there will never be more than 10 periods or so on which to base our model, there will never be any excuse for using a complicated frequency model. In most cases, one of these three models will be used: binomial model (for which the variance is lower than the mean - rarely useful for the collective risk model), Poisson model (for which the variance is equal to the mean) and negative binomial model (for which the variance is larger than the mean). The calibration of even only two parameters is problematic and therefore the use of portfolio information to estimate the variance-to-mean ratio is recommended.
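The choice between the three count models can be sketched via the variance-to-mean ratio and a method-of-moments calibration. This is one simple way of doing it, not necessarily the approach used in practice: the function name, the tolerance used to decide that the ratio is "equal" to 1, and the counts in the usage lines are all assumptions made for the example.

```python
from statistics import mean, variance


def select_count_model(counts, vmr=None, tol=0.05):
    """Pick a loss count model from the variance-to-mean ratio (method of moments).

    counts: ultimate, exposure-adjusted loss counts per policy year.
    vmr:    optional variance-to-mean ratio taken from portfolio data; if None,
            the (often unreliable) sample ratio is used instead.
    tol:    arbitrary tolerance around 1 within which a Poisson model is chosen.
    """
    m = mean(counts)
    ratio = variance(counts) / m if vmr is None else vmr
    if abs(ratio - 1.0) <= tol:
        return "poisson", {"lambda": m}
    if ratio > 1.0:
        # Negative binomial with mean r(1-p)/p and variance r(1-p)/p^2
        return "negative binomial", {"r": m / (ratio - 1.0), "p": 1.0 / ratio}
    # Binomial with mean n*q and variance n*q*(1-q); n is not forced to be an integer
    q = 1.0 - ratio
    return "binomial", {"n": m / q, "q": q}


if __name__ == "__main__":
    # Hypothetical ultimate loss counts for five policy years
    counts = [3, 5, 4, 6, 2]
    print(select_count_model(counts))            # sample ratio only
    print(select_count_model(counts, vmr=1.5))   # ratio borrowed from a portfolio
```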

**STEP 3 - Severity modelling**

Severity modelling consists of two steps:

- *IBNER (incurred but not enough reserved) analysis.* Available historical losses are often not settled amounts but reserve estimates. IBNER analysis attempts to find patterns of under- or over-reserving in historical losses and correct the value of individual claims to reflect this. However, it is often the case (apart from reinsurance) that historical information on the development of each individual claim is unavailable and therefore IBNER adjustments are ignored. This topic is rather technical and the interested reader is referred to Parodi (2014), Chapter 15 for a detailed treatment.
- *Selection and calibration of a severity model.* A severity model is then fitted to the inflation-adjusted past loss amounts. As usual, the crucial principle is to avoid complex distributions that cannot be supported by the data. In most cases, a simple lognormal model is a good starting point. However, when data becomes relatively abundant a lognormal distribution might prove inappropriate.

The crucial consideration when it comes to modelling the severity of losses is to get the tail right, as that is the part of the distribution that has the largest impact on the long-term average of ceded losses. At the same time, the tail of the distribution is the most difficult to capture correctly as it is the region of the distribution where the lack of sufficient data is felt most acutely.

The good thing about modelling the tail is that there are solid statistical results - extreme value theory - that constrain the type of distributions that can be used to model the tail and the only thing which is left is to calibrate it.

A good approach to the problem of severity modelling is therefore to use the empirical distribution below a certain threshold and model the tail using a generalised Pareto distribution (GPD), based on client or (above a certain threshold) relevant portfolio data.

The chart below shows an example of severity modelling in which an empirical loss distribution has been used for the attritional losses and a GPD has been used for losses in excess of €10,000.

*(Source: Parodi (2014), Pricing in General Insurance)*
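A spliced severity model of this kind can be sampled as in the sketch below. It assumes that the GPD scale and shape parameters have already been estimated elsewhere; the parameter values, the loss amounts and the function names are hypothetical, and only the 10,000 threshold echoes the chart described above.

```python
import math
import random


def gpd_quantile(p: float, threshold: float, scale: float, shape: float) -> float:
    """Inverse CDF of a GPD for losses in excess of `threshold` (0 <= p < 1)."""
    if abs(shape) < 1e-12:
        # Exponential limit of the GPD as the shape parameter tends to zero
        return threshold - scale * math.log(1.0 - p)
    return threshold + scale / shape * ((1.0 - p) ** (-shape) - 1.0)


def make_severity_sampler(losses, threshold, scale, shape, rng):
    """Sampler: empirical distribution up to the threshold, GPD above it.

    The probability of landing in the tail is taken to be the observed
    proportion of losses above the threshold.
    """
    body = [x for x in losses if x <= threshold]
    p_tail = 1.0 - len(body) / len(losses)

    def sample() -> float:
        if rng.random() < p_tail:
            return gpd_quantile(rng.random(), threshold, scale, shape)
        return rng.choice(body)  # resample an observed attritional loss

    return sample


if __name__ == "__main__":
    # Hypothetical revalued losses and hypothetical GPD parameters
    observed = [1_000.0, 2_500.0, 8_000.0, 12_000.0, 40_000.0]
    sampler = make_severity_sampler(observed, threshold=10_000.0, scale=5_000.0,
                                    shape=0.3, rng=random.Random(1))
    print([round(sampler()) for _ in range(5)])
```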

**STEP 4 - Aggregate loss modelling**

Once a frequency model and a severity model have been selected, we can combine them to produce an aggregate loss model both for the gross and ceded/retained losses.

The mathematical problem of combining these distributions cannot in general be solved exactly but needs to be approximated with numerical techniques. The most common techniques are Monte Carlo simulation, Panjer recursion and Fast Fourier Transform (FFT). Of these, FFT is the most efficient, and Monte Carlo simulation is the most flexible.
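As an illustration of the Monte Carlo route, the sketch below combines a Poisson loss count model with a lognormal severity model and applies an each-and-every-loss deductible and limit to each simulated loss. The choice of distributions follows the examples given in the text, but all parameter values and function names are hypothetical, and "retained" here lumps together amounts below the deductible and above the limit.

```python
import math
import random
from statistics import mean, stdev


def poisson_draw(lam: float, rng: random.Random) -> int:
    """Poisson variate via Knuth's method; adequate for moderate claim counts."""
    threshold, k, p = math.exp(-lam), 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_aggregate(lam, mu, sigma, deductible, limit, n_sims=10_000, seed=0):
    """Simulate annual gross, retained and ceded losses under the collective risk model."""
    rng = random.Random(seed)
    gross, retained, ceded = [], [], []
    for _ in range(n_sims):
        losses = [rng.lognormvariate(mu, sigma) for _ in range(poisson_draw(lam, rng))]
        g = sum(losses)
        c = sum(min(limit, max(x - deductible, 0.0)) for x in losses)
        gross.append(g)
        ceded.append(c)
        retained.append(g - c)
    return {"gross": gross, "retained": retained, "ceded": ceded}


def summarise(sample, percentiles=(0.5, 0.9, 0.99)):
    """Mean, standard deviation and empirical percentiles of a simulated sample."""
    ordered = sorted(sample)
    last = len(ordered) - 1
    return {
        "mean": mean(sample),
        "sd": stdev(sample),
        "percentiles": {p: ordered[min(last, int(p * len(ordered)))] for p in percentiles},
    }


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only
    result = simulate_aggregate(lam=5.0, mu=10.0, sigma=1.0,
                                deductible=10_000.0, limit=1_000_000.0)
    for name, sample in result.items():
        print(name, summarise(sample))
```

Because each simulated loss is available individually, further policy features (for example an annual aggregate deductible) could be added inside the simulation loop, which is the flexibility that makes Monte Carlo simulation attractive despite its slower convergence.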

**Output**

The typical output of a stochastic model is a table that provides the main statistics of the aggregate model for the gross, retained and ceded loss distributions: estimated mean, estimated standard deviation and estimated percentiles, as in the following table.

### **Personal lines pricing techniques: GLM and price optimisation**

**Rating factors selection and calibration**

In the case of personal lines (and, increasingly, in the most commoditised sectors of commercial lines insurance), it is possible to go beyond a single frequency/severity model for the annual losses and charge policyholders depending on their degree of riskiness.

The underlying idea is that there are a number of *risk factors* that make some policyholders inherently riskier than others - e.g. aggressive behaviour at the wheel, number of miles driven, etc. might be risk factors for motor insurance. Risk factors are often unmeasurable but there are objective and measurable factors such as type of car, age, etc. that can work collectively as proxies for the risk factors and can be used as *rating factors*.

Using rating factors is not a luxury - it is an absolute necessity in a competitive market where other players also use rating factors: if the selection of rating factors used by an insurer is sub-optimal (e.g. because it uses too few factors), that insurer will quickly find itself undercharging (and therefore attracting) the worst risks.

Over time, personal lines insurers have been striving to identify the rating factors that are most useful to predict the behaviour of policyholders and to build and calibrate models based on these rating factors. A number of more or less heuristic techniques (single-way analysis, multi-way analysis, minimum bias approach) were developed by the insurance industry to address this problem, until generalised linear modelling, a rigorous and successful technique, came along and became the de facto industry standard (Anderson et al., 2007b).

In recent years there has been a growing awareness that techniques borrowed from machine learning, such as regularisation, artificial neural networks and decision-tree methods, can further improve the accuracy and efficiency of the rating factor selection and calibration process.

Despite the fact that there is significant mathematical machinery around GLM, the basic concept is rather simple and can be seen as an elaboration of least squares regression, where a straight line is fitted through a number of data points.

A generalised linear model relates a dependent variable (e.g. the number of claims $Y$) to a number of explanatory rating factors $X_1, \dots, X_n$ through a relationship like this:

$$Y = g^{-1}\Big(\sum_j \beta_j \, \psi_j(X_1, \dots, X_n)\Big) + \varepsilon$$

Some explanations are necessary. The rating factors are first packaged into a number of functions $\psi_j(X_1, \dots, X_n)$, each of which can be a more or less complicated function of $X_1, \dots, X_n$.

These functions are combined via a linear (hence the "linear" in "generalised linear modelling") combination:

$$\sum_j \beta_j \, \psi_j(X_1, \dots, X_n)$$

... where $\beta_j$ are constants. Finally, the output of this linear combination is further transformed via (the inverse of) a link function $g$ to add an extra level of flexibility and influence the properties of the dependent variable: e.g. by choosing $g$ to be a logarithmic function, we can force $Y$ (the number of claims) to be positive. The term $\varepsilon$ represents noise (not necessarily additive), e.g. the deviation from the predicted number of claims because the actual number of claims follows a Poisson distribution.

As in the case of classical least squares regression, the parameters $\beta_j$ can be estimated by maximum likelihood.

There are many different methodologies to pick the best model of the form described above. One such technique (forward selection) starts with a very simple model (such as one containing a constant term only) and adds the most promising functions of the rating factors one by one until the improvement is statistically insignificant. Statistical significance can be assessed by out-of-sample validation, cross-validation or by approximate methods such as the Akaike Information Criterion (AIC).
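To make the mechanics concrete, here is a minimal sketch of a Poisson GLM with a logarithmic link, fitted by maximum likelihood with Newton-Raphson, followed by a single forward-selection step judged by AIC. The claim counts and the single binary rating factor (`young`) are made up for the example, and the function names are hypothetical; in practice one would rely on a statistical package rather than hand-written code.

```python
import math

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def fit_poisson_glm(X, y, max_iter=50, tol=1e-10):
    """Maximum likelihood fit of Y ~ Poisson(exp(X beta)); column 0 of X must be all ones."""
    p = len(X[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)
    for _ in range(max_iter):
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
        score = [sum(row[j] * (yi - mi) for row, yi, mi in zip(X, y, mu)) for j in range(p)]
        info = [[sum(row[j] * row[k] * mi for row, mi in zip(X, mu)) for k in range(p)]
                for j in range(p)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

def aic(X, y, beta):
    """Akaike Information Criterion: 2 x (number of parameters) - 2 x log-likelihood."""
    loglik = 0.0
    for row, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))
        loglik += yi * eta - math.exp(eta) - math.lgamma(yi + 1)
    return 2 * len(beta) - 2 * loglik

# Hypothetical data: one policy per entry, 12 older drivers followed by 8 young drivers
young = [0] * 12 + [1] * 8
claims = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 2, 0, 1, 1, 0, 2, 1]

X_null = [[1.0] for _ in claims]            # constant term only
X_full = [[1.0, float(z)] for z in young]   # constant term plus the rating factor
beta_null = fit_poisson_glm(X_null, claims)
beta_full = fit_poisson_glm(X_full, claims)
aic_null, aic_full = aic(X_null, claims, beta_null), aic(X_full, claims, beta_full)
keep_factor = aic_full < aic_null           # forward selection keeps the factor if AIC improves
print(beta_full, aic_null, aic_full, keep_factor)
```

With a log link the fitted frequencies are `exp(beta[0])` for older drivers and `exp(beta[0] + beta[1])` for young drivers, so the rating factor acts multiplicatively on the claim frequency.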

**Price optimisation**

Price optimisation helps select a price for a policy based not only on the production costs (i.e. the expected claims, which can be determined, albeit in an approximate way, using the techniques explained above) but also on the demand elasticity.

To see how price optimisation works in principle, consider the simple case where there is a single policy sold at the same price *P* to all customers. The total expected profits are given by *TEP(P)* = *π(P)* × *D(P)*, where *π(P)* is the profit per policy, which increases linearly with *P*, and *D(P)* (the demand curve) is the propensity of a customer to make a purchase, which is a monotonically decreasing function eventually going to zero. The objective of price optimisation is finding the premium level that maximises the total expected profits *TEP(P)*.
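The single-policy case can be sketched in a few lines. The cost per policy and the linear demand curve below are hypothetical choices made only so that the optimum is easy to verify by hand; a real demand curve would be estimated from data.

```python
COST_PER_POLICY = 400.0   # hypothetical expected claims plus expenses per policy
MAX_PRICE = 1000.0        # hypothetical price at which nobody buys any more

def profit_per_policy(price):
    """pi(P): increases linearly with the price."""
    return price - COST_PER_POLICY

def demand(price):
    """D(P): hypothetical linear demand curve, decreasing to zero at MAX_PRICE."""
    return max(0.0, 1.0 - price / MAX_PRICE)

def total_expected_profit(price):
    """TEP(P) = pi(P) x D(P)."""
    return profit_per_policy(price) * demand(price)

# Simple grid search over whole-currency-unit prices
candidate_prices = range(400, 1001)
best_price = max(candidate_prices, key=total_expected_profit)
print(best_price, total_expected_profit(best_price))
```

For this particular demand curve the optimum can also be found analytically as the midpoint between `COST_PER_POLICY` and `MAX_PRICE`; the grid search is shown because it works unchanged for any other demand curve.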

The most accurate method to determine the demand function is by a so-called "price test". This works by offering a randomly selected sample of policyholders a different price for the same policy, and analysing the conversion rate (percentage of quotes leading to purchases) at different levels of price.

Not all jurisdictions allow such price tests to be run, and hence price optimisation only thrives in countries with favourable legislation, such as the UK and Israel.

### **Uncertainties around costing**

Although most insurance professionals are familiar with pricing models, considerably fewer have developed a sense of the uncertainty around the results of a pricing model, and of the sources of such uncertainty. This may lead to over-reliance (or, sometimes, under-reliance) on the quantitative results of pricing models.

The main sources of uncertainty are:

- **Parameter uncertainty**: since models are calibrated based on a limited amount of data, the parameters of such models have a limited degree of accuracy, which in turn causes inaccuracies in the assessment of the various statistics extracted by the model (mean, standard deviation, and especially the higher percentiles of the loss distribution). Parameter uncertainty is the easiest to measure, as maximum likelihood estimation methods come with approximate estimates of parameter error as a function of data size;
- **Data/assumptions uncertainty**: the results of a model can only be as good as the quality of the inputs, e.g. loss data points and related assumptions such as claims inflation or risk profile changes. Sensitivity analysis is the most common way of measuring the effect of assumptions uncertainty. Data uncertainty can be estimated by simulation methods such as parametric bootstrap;
- **Model uncertainty**: all models are imperfect representations of reality and their results are consequently inaccurate. Measuring model uncertainty is fiendishly difficult (if not theoretically impossible) and the best one can achieve is to assess the difference in the results between a number of competing standard models;
- **Simulation uncertainty**: when Monte Carlo simulation is used, the results are always approximate as a result of the finite number of simulations. This source of uncertainty, however, is normally negligible when compared with the other sources of uncertainty. It can be measured by repeating the simulation several times and can be controlled by increasing the number of simulations. Other approximation schemes are likewise subject to errors, e.g. discretisation errors in FFT methods.
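Simulation uncertainty is the one source that can be checked directly with a few lines of code. The sketch below repeats a Monte Carlo estimate of the mean aggregate loss several times and compares the spread of the estimates for a small and a large number of simulations. The Poisson/lognormal model and its parameters are hypothetical.

```python
import math
import random
import statistics

def sample_poisson(rng, lam):
    """Knuth's multiplication method; adequate for small lam."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulated_mean(n_sims, rng):
    """One Monte Carlo estimate of the mean aggregate loss (hypothetical model)."""
    total = 0.0
    for _ in range(n_sims):
        for _ in range(sample_poisson(rng, 3.0)):
            total += rng.lognormvariate(9.0, 1.0)
    return total / n_sims

def spread_of_estimates(n_sims, n_repeats=20, seed=1):
    """Standard deviation of the estimate across independent repetitions."""
    rng = random.Random(seed)
    estimates = [simulated_mean(n_sims, rng) for _ in range(n_repeats)]
    return statistics.stdev(estimates)

spread_small = spread_of_estimates(200)     # few simulations per run
spread_large = spread_of_estimates(5_000)   # 25 times as many simulations per run
print(round(spread_small), round(spread_large))
```

Since the error of a Monte Carlo estimate of the mean shrinks roughly with the square root of the number of simulations, the second spread should be around a fifth of the first, subject to the noise of estimating a spread from only 20 repetitions.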

Note that process uncertainty (the volatility of actual losses from year to year) is not included in the list above since process uncertainty is inherent to the stochastic nature of the loss production process.

### **From the expected losses to the technical premium**

**Pricing methods**

The General Insurance Rating Issues Working Party (GRIP), a working party of the Institute and Faculty of Actuaries, identified in 2007 five different pricing methods (Anderson et al., 2007a). In order of increasing sophistication, these are:

- *tariff*, in which a rating bureau (e.g. the regulator, or an association of insurers) sets the rates or needs to approve them;
- *qualitative*, in which data is scant and therefore subjective judgment is combined with a rough statistical analysis;
- *cost plus*, in which there is sufficient data for a quantitative analysis and the price has the form: Technical premium = Expected losses + Loadings (expenses, profits…);
- *distribution*, where demand-side elements such as the customer's propensity to shop around are accounted for with the objective of achieving price optimisation. This is typically used in personal lines insurance;
- *industrial*, in which the focus is on achieving operational efficiency and economies of scale through the industrialisation of the pricing process across several classes of business. This is typically used by large personal lines insurers operating in commoditised markets such as motor or household insurance.

We are now going to take a more in-depth look at Cost Plus, as it is the most widespread pricing method and the one on which the distribution and industrial methods are based. The tariff and qualitative methods are by their nature less amenable to a general description.

**Cost plus**

Cost Plus subdivides the technical premium into five components:

Technical Premium = Expected Losses + Allowance for Uncertainty + Costs - Incomes + Profit

The *Expected Losses* component is normally the mean of the aggregate loss distribution. Since this can only be determined with some uncertainty, an *Allowance for Uncertainty* might sometimes be added, especially if the insurer is not in a position to fully diversify between clients. *Costs* includes claims expenses (if not already included in the expected losses), underwriting expenses, commissions, reinsurance premiums and general expenses allocated to the policy. *Incomes* is a discount that the insurer may want to apply to pass on to the customer the investment returns arising from the delay between premium intake and claims payout. *Profit* is simply an allowance for profit based on the company's guidelines or required return on capital.

Although the technical premium is expressed as a sum, in practice the calculation of the technical premium can take a number of different forms. An example is the following:

where:

- $\tau$ is the mean delay between premium receipt and claim payout;
- the investment return is that of "risk-free" bonds of term $\tau$ (or, more generally, the yield curve for risk-free government bonds at term $\tau$, where the yield for terms that do not correspond to any available bond is obtained by interpolation);
- the net reinsurance premium is the difference between the reinsurance premium and the expected losses ceded to the reinsurer;
- the other elements of the formula are self-explanatory.
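One possible way of turning the cost plus decomposition into a calculation is sketched below. The specific form used here is an assumption made for illustration and is not necessarily the formula intended by this fact file: losses and allowance for uncertainty are discounted at a risk-free rate over the mean payment delay, the net reinsurance premium and fixed expenses are added, and the result is grossed up for commission and profit expressed as percentages of the premium. All names and figures are hypothetical.

```python
def technical_premium(expected_losses, uncertainty_allowance, net_reinsurance_premium,
                      expenses, risk_free_rate, mean_delay_years,
                      commission_pct, profit_pct):
    """Illustrative cost plus calculation (assumed form, see the text above)."""
    # Discounting passes the investment income on to the customer
    discounted_losses = (expected_losses + uncertainty_allowance) / (
        (1.0 + risk_free_rate) ** mean_delay_years
    )
    # Commission and profit are taken as percentages of the final premium
    premium = (discounted_losses + net_reinsurance_premium + expenses) / (
        1.0 - commission_pct - profit_pct
    )
    components = {
        "expected_losses": expected_losses,
        "allowance_for_uncertainty": uncertainty_allowance,
        "costs": net_reinsurance_premium + expenses + commission_pct * premium,
        "incomes": expected_losses + uncertainty_allowance - discounted_losses,
        "profit": profit_pct * premium,
    }
    return premium, components

# Hypothetical inputs
premium, parts = technical_premium(
    expected_losses=100_000.0, uncertainty_allowance=10_000.0,
    net_reinsurance_premium=5_000.0, expenses=8_000.0,
    risk_free_rate=0.02, mean_delay_years=2.0,
    commission_pct=0.15, profit_pct=0.10,
)
# Expected Losses + Allowance for Uncertainty + Costs - Incomes + Profit
rebuilt = (parts["expected_losses"] + parts["allowance_for_uncertainty"]
           + parts["costs"] - parts["incomes"] + parts["profit"])
print(round(premium, 2), round(rebuilt, 2))
```

The `components` dictionary shows that, even though the calculation is multiplicative, the result can still be read as the five-component sum of the cost plus method.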

### **Key facts**

- Pricing is not an isolated process but has significant interactions with a number of other corporate functions, such as planning, underwriting, claims, reserving, capital modelling.
- The pricing cycle is an example of the classic project control cycle: it involves planning, pricing and (crucially) monitoring a number of variables (rates, costs, investments, sales, competition, portfolio composition…)
- We should distinguish between costing (the production of a technical premium) and pricing (the decision on the actual premium charged to the customer)
- The traditional approach to costing is burning cost analysis, according to which the expected losses are calculated as the exposure-weighted average of the past historical experience after adjustments for IBNR claims, claims inflation and changes in the risk profile
- The modern approach to pricing is through stochastic modelling, which consists of separately modelling the frequency and the severity of claims and combining them through Monte Carlo simulation to produce a distribution of total losses, both before and after the application of insurance cover.
- In the case of personal lines (and increasingly for data-rich commercial lines situations) it is possible to charge each policyholder a different amount depending on their riskiness, using a technique called generalised linear modelling
- Price optimisation techniques allow the insurer to select the right price based not only on the pure costs (claims and expenses) but also on the demand elasticity
- Costing is subject to a number of uncertainties (model, data, assumptions, parameter) not all of which can be quantified and that the underwriter needs to be aware of
- The most popular method for the calculation of the technical premium from the expected losses is "cost plus", which calculates the technical premium as a sum of different components: expected losses, allowance for uncertainty, costs, investment income (as a deduction) and profit

### **Further reading**

**Publications**

- Anderson, D. (chairman), Bolton, C., Callan, G., Cross, M., Howard, S., Mitchell, G., Murphy, K., Rakow, J., Stirling, P., Welsh, G. (2007a), General Insurance Premium Rating Issues Working Party (GRIP) report
- Anderson, D., Feldblum, S., Modlin, C., Schirmacher, D., Schirmacher, E., Thandi, N. (2007b), A Practitioner's Guide to Generalized Linear Models, A CAS Study Note
- Klugman, S. A., Panjer, H. H., Willmot, G. E. (2008), Loss models: From data to decisions, Third Edition, Wiley, Hoboken NJ
- Lloyd's (2014), Claims inflation
- Parodi, P. (2012), Computational intelligence with applications to general insurance: A review. I - The role of statistical learning, Annals of Actuarial Science, September 2012
- Parodi, P. (2014), Pricing in general insurance, CRC Press
- Parodi, P. (2016a), Pricing in General Insurance (II): Specialist pricing and advanced techniques, CII
- Parodi, P. (2016b), Glossary (http://pricingingeneralinsurance.com)

This document is believed to be accurate but is not intended as a basis of knowledge upon which advice can be given. Neither the author (personal or corporate), the CII group, local institute or Society, or any of the officers or employees of those organisations accept any responsibility for any loss occasioned to any person acting or refraining from action as a result of the data or opinions included in this material. Opinions expressed are those of the author or authors and not necessarily those of the CII group, local institutes, or Societies.