Plotting healthy life expectancy and life expectancy by deprivation for English local authorities

This worked example attempts to document a common workflow a user might follow when using the fingertipsR package.

fingertipsR lets users import data from the Fingertips website. Fingertips is a major repository of public health indicators in England. The site is organised into profiles, each profile is divided into domains, and each domain contains indicators that are presented for one or more geographies.

This example demonstrates how you can plot healthy life expectancy and life expectancy by geographical region for a given year of data that Fingertips contains. So, where to start?

Where to start

There is one function in the fingertipsR package that extracts data from the Fingertips API: fingertips_data(). The inputs relevant to this example are IndicatorID, AreaCode, DomainID, ProfileID, AreaTypeID and ParentAreaTypeID.

At least one of IndicatorID, DomainID or ProfileID must be supplied. These fields relate to each other as described in the introduction. AreaTypeID is also required, and determines the geography for which data is extracted. In this case we want County and Unitary Authority level. AreaCode is only needed if you are extracting data for a particular area or group of areas. ParentAreaTypeID takes the code of an area type that the AreaTypeID maps to at a higher level of geography. For example, County and Unitary Authorities map to a higher level of geography called Government Office Regions. These mappings can be identified using the area_types() function. If ParentAreaTypeID is omitted, one is chosen automatically.

Therefore, the inputs to the fingertips_data() function that we need to find out are the ID codes for IndicatorID, AreaTypeID and ParentAreaTypeID.

We need to begin by loading the fingertipsR package:

library(fingertipsR)

IndicatorID

There are two indicators we are interested in for this exercise. Without consulting the Fingertips website, we know approximately what they are called: healthy life expectancy and life expectancy.

We can use the indicators_unique() function to return a list of all the indicators within Fingertips. We can then filter the IndicatorName field for the term "life expectancy". Note that IndicatorName is converted to lower case in the following code chunk so that matches are not missed because of upper case letters.

inds <- indicators_unique()
life_expectancy <- inds[grepl("life expectancy", tolower(inds$IndicatorName)),]

| IndicatorID | IndicatorName |
|:-----------:|:--------------|
| 90362 | Healthy life expectancy at birth |
| 90366 | Life expectancy at birth |
| 90825 | Inequality in healthy life expectancy at birth ENGLAND |
| 91102 | Life expectancy at 65 |
| 92031 | Inequality in healthy life expectancy at birth LA |
| 92901 | Inequality in life expectancy at birth |
| 93190 | Inequality in life expectancy at 65 |
| 93505 | Healthy life expectancy at 65 |
| 93523 | Disability-free life expectancy at 65 |
| 93562 | Disability-free life expectancy at birth |
| 650 | Life expectancy - MSOA based |
| 93249 | Disability free life expectancy, (Upper age band 85+) |
| 93283 | Life expectancy at birth, (upper age band 90+) |
| 93285 | Life expectancy at birth, (upper age band 85+) |
| 93298 | Healthy life expectancy, (upper age band 85+) |
| 92641 | Life expectancy at 75 (SPOT: NHSOD 1b) |
| 90365 | Gap in life expectancy at birth between each local authority and England as a whole |

The two indicators we are interested in from this table are 90362 (Healthy life expectancy at birth) and 90366 (Life expectancy at birth).
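As a minimal sketch of that selection, the chunk below repeats the filter on a small stand-in table rather than on the live output of indicators_unique(). The object names inds_demo, matches, wanted and indicator_ids are illustrative only, and the last row is a made-up indicator added so that the filter has something to exclude. It also uses grepl(..., ignore.case = TRUE), which is a base R alternative to wrapping the column in tolower().

```r
# Stand-in for the output of indicators_unique(); the first four rows are
# copied from the table above, the last row is hypothetical.
inds_demo <- data.frame(
  IndicatorID = c(90362, 90366, 91102, 92901, 0),
  IndicatorName = c("Healthy life expectancy at birth",
                    "Life expectancy at birth",
                    "Life expectancy at 65",
                    "Inequality in life expectancy at birth",
                    "Hypothetical unrelated indicator"),
  stringsAsFactors = FALSE
)

# Case-insensitive search for the term in the IndicatorName field
matches <- inds_demo[grepl("life expectancy", inds_demo$IndicatorName,
                           ignore.case = TRUE), ]

# Keep only the two indicators used in this example, matched by exact name
wanted <- c("Healthy life expectancy at birth", "Life expectancy at birth")
indicator_ids <- matches$IndicatorID[matches$IndicatorName %in% wanted]
indicator_ids
```

Matching on the exact name at the second step avoids picking up related indicators such as the inequality measures, which also contain the search term.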

AreaTypeID

We can work out which AreaTypeID codes we need using the function area_types(). We’ve decided that we want to produce the graph at County and Unitary Authority level. From the section Where to start, we need codes for AreaTypeID and ParentAreaTypeID.

areaTypes <- area_types()

The table shows that the AreaTypeID for County and Unitary Authority level is 202. The third column, ParentAreaTypeID, shows the IDs of the area types that these map to. In the case of County and Unitary Authorities, these are:

| AreaTypeID | AreaTypeName | ParentAreaTypeID | ParentAreaTypeName |
|:----------:|:------------:|:----------------:|:-------------------|
| 202 | UTLA (post 4/19) | 6 | Government Office Region |
| 202 | UTLA (post 4/19) | 104 | PHEC 2015 new plus PHEC 2013 unchanged |
| 202 | UTLA (post 4/19) | 10105 | Depriv. decile (IMD2015, 4/19 boundaries) |
| 202 | UTLA (post 4/19) | 10113 | Depriv. deciles (IMD2019) |
| 202 | UTLA (post 4/19) | 126 | Combined authorities |

ParentAreaTypeID is 6 by default for the fingertips_data() function for AreaTypeID of 202 (this value changes if different AreaTypeIDs are entered), so we can stick with that in this example. Use the area_types() function to understand more about how areas map to each other.
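The lookup described above can be sketched in base R. To keep the chunk runnable without calling the API, it uses a stand-in data frame, area_types_demo, whose rows are copied from the table in this section; the column names are assumed to match those shown there, and the object names are illustrative.

```r
# Stand-in for the rows of area_types() where AreaTypeID is 202
area_types_demo <- data.frame(
  AreaTypeID = rep(202, 5),
  AreaTypeName = rep("UTLA (post 4/19)", 5),
  ParentAreaTypeID = c(6, 104, 10105, 10113, 126),
  ParentAreaTypeName = c("Government Office Region",
                         "PHEC 2015 new plus PHEC 2013 unchanged",
                         "Depriv. decile (IMD2015, 4/19 boundaries)",
                         "Depriv. deciles (IMD2019)",
                         "Combined authorities"),
  stringsAsFactors = FALSE
)

# All parent area types available for County and Unitary Authority level
parents <- area_types_demo[area_types_demo$AreaTypeID == 202,
                           c("ParentAreaTypeID", "ParentAreaTypeName")]

# The code for Government Office Region, the parent used in this example
region_id <- parents$ParentAreaTypeID[
  parents$ParentAreaTypeName == "Government Office Region"
]
region_id
```

The same subsetting should work on the data frame returned by area_types(), provided its columns are named as in the table above.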

Extracting the data

Finally, we can use the fingertips_data() function with the inputs we have determined previously.

indicators <- c(90362, 90366)
data <- fingertips_data(IndicatorID = indicators,
                        AreaTypeID = 202)
## 
## 
## |  &nbsp;  | IndicatorID |      IndicatorName       | ParentCode |
## |:--------:|:-----------:|:------------------------:|:----------:|
## | **7703** |    90366    | Life expectancy at birth | E12000005  |
## | **7704** |    90366    | Life expectancy at birth | E12000006  |
## | **7705** |    90366    | Life expectancy at birth | E12000008  |
## | **7706** |    90366    | Life expectancy at birth | E12000005  |
## | **7707** |    90366    | Life expectancy at birth | E12000008  |
## | **7708** |    90366    | Life expectancy at birth | E12000005  |
## 
## Table: Table continues below
## 
##  
## 
## |  &nbsp;  |       ParentName       | AreaCode  |    AreaName    |  AreaType   |
## |:--------:|:----------------------:|:---------:|:--------------:|:-----------:|
## | **7703** |  West Midlands region  | E10000028 | Staffordshire  | County & UA |
## | **7704** | East of England region | E10000029 |    Suffolk     | County & UA |
## | **7705** |   South East region    | E10000030 |     Surrey     | County & UA |
## | **7706** |  West Midlands region  | E10000031 |  Warwickshire  | County & UA |
## | **7707** |   South East region    | E10000032 |  West Sussex   | County & UA |
## | **7708** |  West Midlands region  | E10000034 | Worcestershire | County & UA |
## 
## Table: Table continues below
## 
##  
## 
## |  &nbsp;  |  Sex   |   Age    | CategoryType | Category | Timeperiod | Value |
## |:--------:|:------:|:--------:|:------------:|:--------:|:----------:|:-----:|
## | **7703** | Female | All ages |      NA      |    NA    | 2016 - 18  | 83.1  |
## | **7704** | Female | All ages |      NA      |    NA    | 2016 - 18  | 84.17 |
## | **7705** | Female | All ages |      NA      |    NA    | 2016 - 18  | 85.09 |
## | **7706** | Female | All ages |      NA      |    NA    | 2016 - 18  | 83.66 |
## | **7707** | Female | All ages |      NA      |    NA    | 2016 - 18  | 84.17 |
## | **7708** | Female | All ages |      NA      |    NA    | 2016 - 18  | 83.92 |
## 
## Table: Table continues below
## 
##  
## 
## |  &nbsp;  | LowerCI95.0limit | UpperCI95.0limit | LowerCI99.8limit |
## |:--------:|:----------------:|:----------------:|:----------------:|
## | **7703** |      82.88       |      83.32       |        NA        |
## | **7704** |      83.95       |      84.39       |        NA        |
## | **7705** |      84.92       |      85.27       |        NA        |
## | **7706** |      83.39       |      83.92       |        NA        |
## | **7707** |      83.97       |      84.38       |        NA        |
## | **7708** |      83.67       |      84.17       |        NA        |
## 
## Table: Table continues below
## 
##  
## 
## |  &nbsp;  | UpperCI99.8limit | Count | Denominator | Valuenote |
## |:--------:|:----------------:|:-----:|:-----------:|:---------:|
## | **7703** |        NA        |  NA   |     NA      |    NA     |
## | **7704** |        NA        |  NA   |     NA      |    NA     |
## | **7705** |        NA        |  NA   |     NA      |    NA     |
## | **7706** |        NA        |  NA   |     NA      |    NA     |
## | **7707** |        NA        |  NA   |     NA      |    NA     |
## | **7708** |        NA        |  NA   |     NA      |    NA     |
## 
## Table: Table continues below
## 
##  
## 
## |  &nbsp;  |     RecentTrend      | ComparedtoEnglandvalueorpercentiles |
## |:--------:|:--------------------:|:-----------------------------------:|
## | **7703** | Cannot be calculated |               Similar               |
## | **7704** | Cannot be calculated |               Better                |
## | **7705** | Cannot be calculated |               Better                |
## | **7706** | Cannot be calculated |               Better                |
## | **7707** | Cannot be calculated |               Better                |
## | **7708** | Cannot be calculated |               Better                |
## 
## Table: Table continues below
## 
##  
## 
## |  &nbsp;  | ComparedtoRegionvalueorpercentiles | TimeperiodSortable | Newdata |
## |:--------:|:----------------------------------:|:------------------:|:-------:|
## | **7703** |               Better               |      20160000      |   NA    |
## | **7704** |               Better               |      20160000      |   NA    |
## | **7705** |               Better               |      20160000      |   NA    |
## | **7706** |               Better               |      20160000      |   NA    |
## | **7707** |              Similar               |      20160000      |   NA    |
## | **7708** |               Better               |      20160000      |   NA    |
## 
## Table: Table continues below
## 
##  
## 
## |  &nbsp;  | Comparedtogoal |
## |:--------:|:--------------:|
## | **7703** |       NA       |
## | **7704** |       NA       |
## | **7705** |       NA       |
## | **7706** |       NA       |
## | **7707** |       NA       |
## | **7708** |       NA       |

The data frame returned by fingertips_data() contains 26 variables. For this exercise, we are only interested in a few of them (IndicatorID, AreaCode, ParentName, Sex, Timeperiod and Value) and in the time period 2012 - 14.

The data frame also contains data for the parent area, and for England, so we want to filter it to remove these too.

cols <- c("IndicatorID", "AreaCode", "ParentName", "Sex", "Timeperiod", "Value")

area_type_name <- table(data$AreaType) # tally each group in the AreaType field

area_type_name <- area_type_name[area_type_name == max(area_type_name)] # pick the group with the highest frequency
area_type_name <- names(area_type_name) # retrieve the name

data <- data[data$AreaType == area_type_name & 
               data$Timeperiod == "2012 - 14", cols]
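
The filtering logic can be checked on a small stand-in data frame. In this sketch the three local authority values are taken from the sample output shown earlier (females, 2016 - 18); the England and region rows, including their AreaType labels, are assumptions added for illustration and carry no values. It also shows names(which.max(table(...))) as a compact way to pick the most frequent AreaType.

```r
# Stand-in for part of the fingertips_data() output; demo is an illustrative name
demo <- data.frame(
  AreaName = c("England", "West Midlands region",
               "Staffordshire", "Warwickshire", "Worcestershire"),
  AreaType = c("England", "Region",          # assumed labels for parent rows
               "County & UA", "County & UA", "County & UA"),
  Timeperiod = rep("2016 - 18", 5),
  Value = c(NA, NA, 83.1, 83.66, 83.92),     # parent values left missing
  stringsAsFactors = FALSE
)

# The most frequent AreaType is the one the data was requested for
most_common <- names(which.max(table(demo$AreaType)))

# Keep only rows for that area type and the chosen time period
local_only <- demo[demo$AreaType == most_common &
                     demo$Timeperiod == "2016 - 18",
                   c("AreaName", "Value")]
local_only
```

This relies on the requested geography having more rows than its parent areas, which is the same assumption the chunk above makes.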

Plotting outputs

Using ggplot2 it is possible to plot the outputs.

library(ggplot2)
ggplot(data, aes(x = reorder(ParentName, Value, median), y = Value, col = factor(IndicatorID))) + 
        geom_boxplot(data = data[data$IndicatorID == 90366, ]) +
        geom_boxplot(data = data[data$IndicatorID == 90362, ]) +
        facet_wrap(~ Sex) +
        scale_colour_manual(name = "Indicator",
                            breaks = c("90366", "90362"),
                            labels = c("Life expectancy", "Healthy life expectancy"),
                            values = c("#128c4a", "#88c857")) +
        labs(x = "Region",
             y = "Age",
             title = "Life expectancy and healthy life expectancy at birth \nfor Upper Tier Local Authorities within England regions (2012 - 2014)") +
        theme_bw() +
        theme(axis.text.x = element_text(angle = 45,
                                         hjust = 1))
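
The call reorder(ParentName, Value, median) in the plot orders the regions on the x axis by their median value. A minimal sketch of that behaviour, using the six female 2016 - 18 values from the sample output shown earlier (the vector names are illustrative):

```r
# Region and value for the six areas in the sample output
parent_name <- c("West Midlands region", "East of England region",
                 "South East region", "West Midlands region",
                 "South East region", "West Midlands region")
value <- c(83.1, 84.17, 85.09, 83.66, 84.17, 83.92)

# reorder() returns a factor whose levels are sorted by the summary
# statistic (here the median) of value within each region
ordered_regions <- reorder(parent_name, value, median)
levels(ordered_regions)
```

The levels run from the region with the lowest median to the one with the highest, which is the left-to-right order ggplot2 uses for a discrete axis.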

Other useful functions

The plot above makes use of the fields that are within the dataset by default when using the fingertips_data() function. There is also a deprivation_decile() function, which provides an indicator of deprivation for each geographical area (see ?deprivation_decile()).

Not all indicators are available for every geography. To understand how indicators are mapped to different geographies, there is a function indicator_areatypes().

To understand more about what comprises each indicator, there is the indicator_metadata() function, which provides the information on the definitions page of the Fingertips website.

Finally, the nearest_neighbours() function provides groups of statistically similar areas for some of the geographies that are available. The geographies these are available for, and their sources, are documented within the function documentation (?nearest_neighbours()).