Coding overview

Interested? Anyone?

This documentation is in rudimentary form for release 0.1.1, which is meant to gauge how much interest (not the financial kind) this package generates.

Vignettes

The following vignettes are available.

• ggsolvencyii
• plotdetails (placeholder)
• showcase
• coding_overview (this vignette)

Less rudimentary versions might be available between releases at https://github.com/vanzanden/ggsolvencyii/tree/master/vignettes.

Have you seen the examples in vignettes ggsolvencyii and showcase yet?

It will be very helpful to have seen a few examples of what ggsolvencyii can do before going through this vignette.

data in human-readable and tidyverse format

A typical spreadsheet might show some ORSA (Own Risk and Solvency Assessment) results in the shape represented by the following data.frame:

id  time  ratio  SCR  BSCR  operational  life  market  l_expenses  l_CAT  m_equity  and so on
1   2017  230    100  80    25           33    50      ..          ..     ..        ..
2   2018  225    103  85    25           33    57      ..          ..     ..        ..
3   2019  227    107  90    23           37    60      ..          ..     ..        ..
..

One can discern several parts. The first columns hold the id of each SCR composition and its ‘meta’ attributes (time, ratio). The remaining columns describe the components of each SCR. The value of each item sits at the intersection of its column and row.

data in the format prescribed by ‘ggplot2’ (tidyverse format)

ggplot2, the foundation on which the plotting part of this package is built, expects data in a tidyverse format. Each row in the data describes only one data point, i.e. the value of one SCR item for one specific ‘id’.
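As a minimal, self-contained illustration of that reshaping idea, the sketch below converts a two-row excerpt of the spreadsheet-style table into one row per id and item, using base R only. The object names (wide, long, meta_cols, risk_cols) are made up for this sketch and are not part of ggsolvencyii.

```r
# Two rows and three risk columns taken from the spreadsheet-style table above.
wide <- data.frame(
  id = 1:2,
  time = c(2017, 2018),
  ratio = c(230, 225),
  SCR = c(100, 103),
  BSCR = c(80, 85),
  operational = c(25, 25)
)

# Columns that identify a composition versus columns that hold item values.
meta_cols <- c("id", "time", "ratio")
risk_cols <- setdiff(names(wide), meta_cols)

# Build one block of rows per risk column and stack the blocks.
long <- do.call(rbind, lapply(risk_cols, function(col) {
  data.frame(wide[meta_cols], description = col, value = wide[[col]])
}))

# The package expects description to be a factor.
long$description <- factor(long$description, levels = risk_cols)
long
```

Each of the two ids now occupies three rows, one per item, which is the shape the geoms expect.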

The following code transfers data (for example 2, a single SCR plot) from a spreadsheet in the same “human format” as above to tidyverse format (the numbers differ, though!):

data <- readxl::read_xlsx(path = "path/filename.xlsx", sheet = "ex2_data")
# move every risk column into description/value pairs; id, time and ratio stay as they are
data <- tidyr::gather(data,
                      key = description,
                      value = value,
                      -id, -time, -ratio)
sii_z_ex2_data <- data.frame(
  time = as.numeric(data$time),
  ratio = as.numeric(data$ratio),
  # it has to be a factor !! (R >= 4.0 no longer converts strings automatically)
  description = as.factor(data$description),
  value = as.numeric(data$value),
  id = data$id
)
head(sii_z_ex2_data, 7)
#>   time ratio      description value id
#> 1 2017   230              SCR    30  1
#> 2 2017   230             BSCR    35  1
#> 3 2017   230      operational     5  1
#> 4 2017   230 Adjustment-LACDT   -10  1
#> 5 2017   230         BSCR_div    -5  1
#> 6 2017   230           market    20  1
#> 7 2017   230             life    15  1

ggsolvencyii: data transformations

When the above data is passed to the package with a (very) basic line such as

ggplot() +
  geom_sii_risksurface(data = sii_z_ex2_data,
                       mapping = aes(x = time, y = ratio, id = id,
                                     value = value, description = description))

a lot happens under the hood. Broadly speaking, the following steps are taken for geom_sii_risksurface and geom_sii_riskoutline:

1. when geom_sii_riskoutline is used for comparison of id's, risk values are moved between data rows
2. the structure of the SCR composition is expanded with grouping information
3. the expanded structure is integrated with the data
4. actual grouping is performed by adding rows
5. for all elements to be plotted the corner coordinates of the circle segments are calculated
6. when applicable, rotation and/or "squarification" is applied by changing the corner coordinates
7. corner coordinates are transformed into a series of points for polygons

shuffling with risk values in the data

geom_sii_riskoutline plots (some of) the outlines of circle segments and as such can be used for a non-obtrusive plot, or for an overlay of the composition of one SCR over another (see its use in vignette showcase). To avoid having to work with two separate datasets, the optional aesthetic comparewithid is available in geom_sii_riskoutline. It is best explained with an example. Compare the data of sii_z_ex1_data with the expanded structures without and with use of the comparewithid aesthetic. It shows that the structure of id = 1 is no longer plotted at its own location (2016, 230) but three times in 2017: value 23 for SCR is now present three times in the data. This transformation is applied to all (sub)risks.

## the original data
sii_z_ex1_data[sii_z_ex1_data$description == "SCR", ]
#>    time ratio description    value id comparewithid
#> 1  2016   230         SCR 23.00000  1            NA
#> 2  2017   233         SCR 23.14993  2             1
#> 3  2018   238         SCR 19.99461  3             2
#> 4  2019   243         SCR 15.61773  4             3
#> 5  2017   231         SCR 19.60600  5             1
#> 6  2018   232         SCR 25.74336  6             5
#> 7  2019   232         SCR 21.91342  7             6
#> 8  2017   227         SCR 25.08169  8             1
#> 9  2018   225         SCR 22.43068  9             8
#> 10 2019   226         SCR 21.91607 10             9
#> without passing the aesthetic 'comparewithid': 10 lines of data
#>    description id    x   y    value
#> 35         SCR  1 2016 230 23.00000
#> 34         SCR  2 2017 233 23.14993
#> 33         SCR  3 2018 238 19.99461
#> 31         SCR  4 2019 243 15.61773
#> 39         SCR  5 2017 231 19.60600
#> 38         SCR  6 2018 232 25.74336
#> 32         SCR  7 2019 232 21.91342
#> 36         SCR  8 2017 227 25.08169
#> 37         SCR  9 2018 225 22.43068
#> 40         SCR 10 2019 226 21.91607
#> and with passing the aesthetic 'comparewithid': 9 lines of data
#>    description id    x   y    value
#> 28         SCR  2 2017 233 23.00000
#> 31         SCR  3 2018 238 23.14993
#> 32         SCR  4 2019 243 19.99461
#> 29         SCR  5 2017 231 23.00000
#> 33         SCR  6 2018 232 19.60600
#> 34         SCR  7 2019 232 25.74336
#> 35         SCR  8 2017 227 23.00000
#> 30         SCR  9 2018 225 25.08169
#> 36         SCR 10 2019 226 22.43068
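The reshuffling visible in the two outputs above can be reproduced in a few lines of base R. This is only a sketch of the observable effect on the SCR rows, not the package's internal code; the objects scr, targets and shuffled are names chosen for this illustration.

```r
# The SCR rows of sii_z_ex1_data as printed above.
scr <- data.frame(
  id = 1:10,
  time = c(2016, 2017, 2018, 2019, 2017, 2018, 2019, 2017, 2018, 2019),
  ratio = c(230, 233, 238, 243, 231, 232, 232, 227, 225, 226),
  value = c(23.00000, 23.14993, 19.99461, 15.61773, 19.60600,
            25.74336, 21.91342, 25.08169, 22.43068, 21.91607),
  comparewithid = c(NA, 1, 2, 3, 1, 5, 6, 1, 8, 9)
)

# Rows without a comparewithid have nothing to be compared with and drop out.
targets <- scr[!is.na(scr$comparewithid), ]

# Each remaining row keeps its own location but takes the value of the id it refers to.
shuffled <- data.frame(
  id = targets$id,
  x = targets$time,
  y = targets$ratio,
  value = scr$value[match(targets$comparewithid, scr$id)]
)
shuffled
```

The result has nine rows, and the value 23 of id = 1 appears at the locations of ids 2, 5 and 8, matching the second output above.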

structure: levels, levels, levels…

The foundation of the package is the structure: a representation of the build-up of the SCR from its risks and subrisks. This structure is a data.frame that is passed as a parameter to the geoms geom_sii_risksurface and geom_sii_riskoutline. The default data.frame is sii_structure_sf16_eng, where ‘sf16’ stands for the standard formula as of 2016 and ‘eng’ for English descriptions.

 head(sii_structure_sf16_eng, 15)
#> # A tibble: 15 x 3
#>    description      level childlevel
#>    <chr>            <chr> <chr>
#>  1 SCR              1     2
#>  2 BSCR             2     3
#>  3 operational      2     <NA>
#>  5 BSCR_div         3d    <NA>
#>  6 market           3     4.01
#>  7 life             3     4.02
#>  8 non-life         3     4.03
#>  9 health           3     4.04
#> 10 cp-default       3     <NA>
#> 11 intangibles      3     <NA>
#> 12 market_div       4.01d <NA>
#> 13 m_interestrate   4.01  <NA>
#> 14 m_equity         4.01  <NA>
#> 15 m_property       4.01  <NA>

A Dutch version, sii_structure_sf16_nld, is present in the package.

The hierarchy of the elements in description is determined by level and by the level of their components (childlevel). SCR has the mandatory level “1” (a character value). Rows whose level has the suffix ‘d’ indicate a diversification item.
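To make the level/childlevel relation concrete, the sketch below looks up the components of an item in a small excerpt of the structure shown above. The helper children_of() is hypothetical and not a function of the package; it also assumes that a ‘d’ (or ‘o’) row belongs to the level named by its level value without the suffix.

```r
# Excerpt of sii_structure_sf16_eng as printed above.
structure_df <- data.frame(
  description = c("SCR", "BSCR", "operational", "BSCR_div", "market",
                  "life", "market_div", "m_interestrate", "m_equity"),
  level = c("1", "2", "2", "3d", "3", "3", "4.01d", "4.01", "4.01"),
  childlevel = c("2", "3", NA, NA, "4.01", "4.02", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the descriptions that make up `parent`.
children_of <- function(structure_df, parent) {
  child <- structure_df$childlevel[structure_df$description == parent]
  if (length(child) == 0 || is.na(child)) {
    return(character(0))
  }
  # Assumption: strip a trailing 'd' or 'o' to find the level a row belongs to.
  base_level <- sub("[do]$", "", structure_df$level)
  structure_df$description[base_level == child]
}

children_of(structure_df, "BSCR")
children_of(structure_df, "operational")
```

In this excerpt BSCR is built from BSCR_div, market and life, while operational has no childlevel and therefore no components.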

For other localizations or for use with internal models another structure can be passed to the geom. See my interpretation of the internal model of the Dutch insurer Nationale Nederlanden in sii_z_ex6_structure. Changing the level numbering or the descriptions of items may require changes to other (parameter) files as well (e.g. levelmax, plotdetails, coloring sets).

expanding the structure: possible grouping

When reporting the SCR composition of a large insurance company many risks will be present. This can lead to a very cluttered plot in which all information is present but which is difficult to interpret. The package provides the means to restrict the number of items to ‘k’ (in general or for each level separately) by means of the parameter levelmax. This can be an integer, which is applied to all levels, or a data.frame. The default value is 99, so grouping only occurs for risks with around a hundred sub-risks or more.

Parameter levelmax = sii_levelmax_sf16_995 shows all higher levels (lower level numbers) but restricts the lower levels (higher numbers) to 4 individual risks and 1 grouping of the smallest risks in that level.

sii_levelmax_sf16_995
#> # A tibble: 8 x 2
#>   level levelmax
#>   <chr>    <dbl>
#> 1 1           99
#> 2 2           99
#> 3 3           99
#> 4 4.01         5
#> 5 4.02         5
#> 6 4.03         5
#> 7 4.04         5
#> 8 5            5

Combining the structure and the levelmax-information leads to an expanded structure of which the lines for levels 3 and 4.01 are shown here:

#> # A tibble: 15 x 4
#>    description     level childlevel levelmax
#>    <chr>           <chr> <chr>         <dbl>
#>  1 market          3     4.01             99
#>  2 life            3     4.02             99
#>  3 non-life        3     4.03             99
#>  4 health          3     4.04             99
#>  5 cp-default      3     <NA>             99
#>  6 intangibles     3     <NA>             99
#>  7 market_div      4.01d <NA>             99
#>  8 m_interestrate  4.01  <NA>              5
#>  9 m_equity        4.01  <NA>              5
#> 10 m_property      4.01  <NA>              5
#> 11 m_spread        4.01  <NA>              5
#> 12 m_currency      4.01  <NA>              5
#> 13 m_concentration 4.01  <NA>              5
#> 14 m_illiquidity   4.01  <NA>              5
#> 15 market_other    4.01o <NA>             99

The row with level 4.01o is the added row. The description is derived from the row where childlevel = 4.01 and the value of the parameter aggregatesuffix (default value is “other”).
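The expansion can be imitated on a small excerpt in base R. The sketch below is not the package's implementation: it assumes that an ‘o’ row is added for every parent whose childlevel has a levelmax below 99, and that the description is the parent's description and aggregatesuffix joined by an underscore, as the market_other row suggests.

```r
# Excerpt of the structure and of sii_levelmax_sf16_995.
structure_df <- data.frame(
  description = c("market", "m_interestrate", "m_equity", "m_property"),
  level = c("3", "4.01", "4.01", "4.01"),
  childlevel = c("4.01", NA, NA, NA),
  stringsAsFactors = FALSE
)
levelmax_df <- data.frame(
  level = c("3", "4.01"),
  levelmax = c(99, 5),
  stringsAsFactors = FALSE
)
aggregatesuffix <- "other"

# Attach levelmax to every row of the structure by its level.
structure_df$levelmax <- levelmax_df$levelmax[match(structure_df$level, levelmax_df$level)]

# Parents whose child level is restricted get a possible grouping row.
parents <- structure_df[!is.na(structure_df$childlevel), ]
child_max <- levelmax_df$levelmax[match(parents$childlevel, levelmax_df$level)]
grouped <- parents[!is.na(child_max) & child_max < 99, ]

expanded <- structure_df
if (nrow(grouped) > 0) {
  other_rows <- data.frame(
    description = paste(grouped$description, aggregatesuffix, sep = "_"),
    level = paste0(grouped$childlevel, "o"),
    childlevel = NA_character_,
    levelmax = 99,
    stringsAsFactors = FALSE
  )
  expanded <- rbind(expanded, other_rows)
}
expanded
```

The last row of expanded is market_other with level 4.01o, mirroring row 15 of the table above.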

integration with data and actual grouping

The data (in tidyverse format!) is combined with the expanded structure by means of a left join with the data on the left-hand side. Because the data is not expected to contain o-lines, these will not be present in the merged table. When a possible grouping line is present in the expanded structure, a check is conducted whether the data contains so many risks for that level that actual grouping is needed. (The dataset can contain fewer risks than the structure that is used; e.g. a pure life insurance company can use the standard sii_structure_sf16_eng without any problems.)
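A minimal sketch of the grouping check follows, assuming that the largest levelmax - 1 items are kept and the remaining smallest ones are summed into one grouping item (in line with "4 individual risks and 1 grouping" for a levelmax of 5). The function group_smallest() and the values are invented for illustration; the package may select and order items differently.

```r
# Hypothetical helper: keep the largest items, sum the rest into one 'other' item.
group_smallest <- function(values, levelmax, other_name = "other") {
  # No grouping is needed when the level fits within levelmax.
  if (length(values) <= levelmax) {
    return(values)
  }
  ordered <- sort(values, decreasing = TRUE)
  n_keep <- levelmax - 1
  kept <- head(ordered, n_keep)
  rest <- tail(ordered, length(ordered) - n_keep)
  c(kept, setNames(sum(rest), other_name))
}

# Made-up values for the seven market sub-risks of the standard formula.
market <- c(m_interestrate = 10, m_equity = 8, m_property = 6, m_spread = 5,
            m_currency = 2, m_concentration = 1, m_illiquidity = 0.5)

grouped <- group_smallest(market, levelmax = 5, other_name = "market_other")
grouped
```

With seven sub-risks and a levelmax of 5, four items remain and the three smallest are combined; the total of the level is unchanged.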

Now that it is known which lines in the combined structure/data data.frame should be plotted, it is time to convert the data into circle segments. The data row with the largest SCR value is drawn as a full circle with radius = 1, whatever the values of x and y. When combining several calls to geom_sii_risksurface and/or geom_sii_riskoutline, the parameter maxscrvalue overrides this extracted value. All plot elements are scaled so that their surface corresponds to the value of the item. Additional manual horizontal and vertical scaling is possible, depending on the range of x and y values of the axes, to retain the round shape.

For other levels the circle segments are defined by an inner and an outer radius and by the (compass) degrees of the first and last radial line (clockwise). The inner radius is the outer radius of the next higher level. The number of compass degrees is given by the value of each item as a fraction of the total of its (equal-leveled) ‘peers’. The value, i.e. the surface, then dictates the outer radius.
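Under the assumption that surface is strictly proportional to value, with the largest SCR filling a circle of radius 1 (surface pi), the outer radius follows from the surface of a ring segment, angle / 2 * (r_out^2 - r_in^2). The sketch below works this out for positive values only; outer_radius() is a name chosen for this illustration, not a package function.

```r
# Hypothetical helper: outer radius of a segment whose surface represents `value`.
#   value        value of the item
#   maxscrvalue  value that corresponds to the full circle with radius 1
#   inner_radius outer radius of the next higher level (0 for the SCR itself)
#   fraction     share of the full circle taken by the item (1 for the SCR itself)
outer_radius <- function(value, maxscrvalue, inner_radius, fraction) {
  area <- pi * value / maxscrvalue   # the full circle with radius 1 has surface pi
  angle <- 2 * pi * fraction         # angle of the segment in radians
  sqrt(inner_radius^2 + 2 * area / angle)
}

# The largest SCR is a full circle with radius 1.
outer_radius(value = 100, maxscrvalue = 100, inner_radius = 0, fraction = 1)

# An SCR with a quarter of that value has a quarter of the surface, so half the radius.
outer_radius(value = 25, maxscrvalue = 100, inner_radius = 0, fraction = 1)

# A sub-item taking half of the circle, stacked on an inner circle of radius 0.5.
outer_radius(value = 30, maxscrvalue = 100, inner_radius = 0.5, fraction = 0.5)
```

Negative values, such as diversification items, would need separate treatment that this sketch does not attempt.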

When applicable a rotation is performed: a rotation such that the first radial line of a specific (sub)risk points to 12 o’clock, and/or an added fixed rotation.

A final transformation to a squared form is possible. To keep surfaces correct the ‘radial’ lines are adjusted. This might lead to unpredictable results in combination with a description-based rotation or with a fixed rotation that is not a multiple of 45 degrees.

The (transformed/rotated) corner points are translated into polygon points (for geom_sii_risksurface) or line segments (for geom_sii_riskoutline).
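How a segment's corner values could become polygon points is sketched below. Compass degrees run clockwise from 12 o'clock, so x uses the sine and y the cosine of the angle. segment_polygon() and its arguments are hypothetical and only illustrate the geometry, including a fixed rotation added to all angles; squarification is left out.

```r
# Hypothetical helper: points of a ring segment, outer arc first, inner arc back.
segment_polygon <- function(r_in, r_out, deg_start, deg_end,
                            x0 = 0, y0 = 0, rotation = 0, n = 20) {
  # Compass degrees, clockwise from 12 o'clock, plus an optional fixed rotation.
  deg <- seq(deg_start, deg_end, length.out = n) + rotation
  rad <- deg * pi / 180
  outer <- data.frame(x = x0 + r_out * sin(rad),
                      y = y0 + r_out * cos(rad))
  # Walk back along the inner arc so the points form a closed outline.
  inner <- data.frame(x = x0 + r_in * sin(rev(rad)),
                      y = y0 + r_in * cos(rev(rad)))
  rbind(outer, inner)
}

# A quarter ring between 12 and 3 o'clock, with three points per arc.
quarter <- segment_polygon(r_in = 0.5, r_out = 1, deg_start = 0, deg_end = 90, n = 3)
quarter
```

A data.frame like this, with a grouping column added, has the shape that ggplot2's geom_polygon accepts.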

The final step is to define which of these polygons or line segments will actually be plotted. By default everything is plotted, but passing a data.frame to the parameter plotdetails can determine this per level or per description.

In the showcase two data.frames are used, differing only in column surface but equal for outline1 to outline13. One of them is shown here.

sii_z_ex1_plotdetails
#>    levelordescription surface outline1 outline2 outline3 outline4
#> 1                   1    TRUE       NA     TRUE       NA       NA
#> 2                   2    TRUE     TRUE       NA     TRUE       NA
#> 3                  2d    TRUE       NA       NA       NA       NA
#> 4                   3    TRUE     TRUE     TRUE     TRUE       NA
#> 5                  3d    TRUE       NA       NA       NA       NA
#> 6                4.01   FALSE       NA     TRUE       NA       NA
#> 7               4.01d   FALSE       NA       NA       NA       NA
#> 8               4.01o   FALSE       NA     TRUE       NA       NA
#> 9                4.02   FALSE       NA     TRUE       NA       NA
#> 10              4.02d   FALSE       NA       NA       NA       NA
#> 11              4.02o   FALSE       NA     TRUE       NA       NA
#> 12        operational      NA     TRUE     TRUE     TRUE       NA
#> 13         cp-default      NA     TRUE     TRUE     TRUE       NA
#>    outline11 outline13
#> 1       TRUE      TRUE
#> 2         NA        NA
#> 3         NA        NA
#> 4         NA        NA
#> 5         NA        NA
#> 6       TRUE      TRUE
#> 7         NA        NA
#> 8       TRUE      TRUE
#> 9       TRUE      TRUE
#> 10        NA        NA
#> 11      TRUE      TRUE
#> 12        NA        NA
#> 13        NA        NA

surface is used by geom_sii_risksurface, the other columns by geom_sii_riskoutline. The table is best read as follows: for each risk the line of the corresponding level is used, possibly overruled by the line with the matching description when that line holds an explicit TRUE or FALSE.
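The reading rule can be written down as a small lookup. The sketch below uses three rows of the table above; resolve_detail() is a hypothetical helper, and what the package does with a resulting NA is not covered here.

```r
# Three rows taken from sii_z_ex1_plotdetails as printed above.
plotdetails <- data.frame(
  levelordescription = c("2", "3", "operational"),
  surface = c(TRUE, TRUE, NA),
  outline2 = c(NA, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: level row first, overruled by an explicit description row.
resolve_detail <- function(plotdetails, level, description, column) {
  by_level <- plotdetails[[column]][plotdetails$levelordescription == level]
  by_desc <- plotdetails[[column]][plotdetails$levelordescription == description]
  # An explicit TRUE or FALSE on the description row wins.
  if (length(by_desc) == 1 && !is.na(by_desc)) {
    return(by_desc)
  }
  # Otherwise fall back to the row of the level, if there is one.
  if (length(by_level) == 1) {
    return(by_level)
  }
  NA
}

# operational (level 2): the level row gives NA for outline2, the description row says TRUE.
resolve_detail(plotdetails, "2", "operational", "outline2")

# operational has NA for surface, so the value of level 2 is used.
resolve_detail(plotdetails, "2", "operational", "surface")

# BSCR (level 2) has no description row, so outline2 stays NA.
resolve_detail(plotdetails, "2", "BSCR", "outline2")
```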