The `discAUC`

package was written to be a generic method
to calculate various forms of area under the curve (AUC) for discounting
data. The original formulation of AUC for discounting data was Myerson et
al. (2001) and the formulations for AUClog and AUCord are described
in Borges et al. (2016).
The point of this package is not to provide *a better way* to
calculate AUC, rather it is to standardize the calculations. The goals
of this package are to:

Have a single function (

`AUC`

) to calculate AUC.Have a function to easily convert probability discounting for

`AUC`

.- As is the standard practice, AUC for probability discounting requires converting the likelihood of receiving the outcome to the odds against receiving the outcome.

Have a function to easily prepare discounting data to calculate AUCord.

Have a function to easily prepare discounting data to calculate AUClog.

Have a function that will impute missing indifference points when delay/social distance/odds against receiving the outcome/etc. are equal to zero.

```
#Load discounting AUC package
library(discAUC)
#Load dplyr, which aids in data manipulation
library(dplyr)
```

The one major assumption of this package is that discounting data are
**Tidy**. The principles of tidy data are outlined by Wickham (2014). To
overly simplify, the discounting data should be organized in a
long-format in which there is only one indifference point per row of
data. All of the pertinent information for that indifference point
(e.g., subject/participant identifier, condition, delay, etc.) should be
organized in different columns of the data. Two example data sets are
included with the package that demonstrate the expected data format and
can be used to test the package.

```
#Example Tidy Delay Discounting Data
examp_DD#> # A tibble: 360 × 4
#> subject delay_months outcome prop_indiff
#> <dbl> <dbl> <chr> <dbl>
#> 1 103 0.0333 alcohol 0.878
#> 2 103 0.25 alcohol 0.749
#> 3 103 0.5 alcohol 0.747
#> 4 103 1 alcohol 0.741
#> 5 103 6 alcohol 0.249
#> 6 103 60 alcohol 0.241
#> 7 103 0.0333 entertainment 0.987
#> 8 103 0.25 entertainment 0.933
#> 9 103 0.5 entertainment 0.929
#> 10 103 1 entertainment 0.935
#> # … with 350 more rows
```

The indifference points are the last column in this example data set (“prop_indiff”). For each indifference point the subject number, delay to receiving the indifference point, and the type of outcome being discounted are all labeled in different columns. The tidy format allows for easier data manipulation and organization. For example, if you only wanted to calculate AUC for a specific subject or for a specific set of conditions then you filter the data based on the variables of interest.

```
#Filter example DD data by subject (relies on dplyr library)
%>%
examp_DD filter(subject == 103)
#> # A tibble: 24 × 4
#> subject delay_months outcome prop_indiff
#> <dbl> <dbl> <chr> <dbl>
#> 1 103 0.0333 alcohol 0.878
#> 2 103 0.25 alcohol 0.749
#> 3 103 0.5 alcohol 0.747
#> 4 103 1 alcohol 0.741
#> 5 103 6 alcohol 0.249
#> 6 103 60 alcohol 0.241
#> 7 103 0.0333 entertainment 0.987
#> 8 103 0.25 entertainment 0.933
#> 9 103 0.5 entertainment 0.929
#> 10 103 1 entertainment 0.935
#> # … with 14 more rows
#Filter example DD data by outcome type
%>%
examp_DD filter(outcome=="alcohol")
#> # A tibble: 90 × 4
#> subject delay_months outcome prop_indiff
#> <dbl> <dbl> <chr> <dbl>
#> 1 103 0.0333 alcohol 0.878
#> 2 103 0.25 alcohol 0.749
#> 3 103 0.5 alcohol 0.747
#> 4 103 1 alcohol 0.741
#> 5 103 6 alcohol 0.249
#> 6 103 60 alcohol 0.241
#> 7 108 0.0333 alcohol 0.298
#> 8 108 0.25 alcohol 0.298
#> 9 108 0.5 alcohol 0.298
#> 10 108 1 alcohol 0.298
#> # … with 80 more rows
```

For the initial demonstrations, we will use the median indifference points for money to show the calculations for AUC. We will obtain the median indifference points for both delay discounting (DD) and probability discounting (PD).

```
#Subject -987.987 are precalculated median indifference points for each outcome.
= examp_DD %>%
DD_med_indiff filter(subject == -987.987,
== "$100 Gain")
outcome
= examp_PD %>%
PD_med_indiff filter(subject == -987.987,
== "$100 Gain")
outcome
#Note that the median indifference point subject number (-987.987) is truncated in the output.
DD_med_indiff#> # A tibble: 6 × 4
#> subject delay_months outcome prop_indiff
#> <dbl> <dbl> <chr> <dbl>
#> 1 -988. 0.0333 $100 Gain 0.957
#> 2 -988. 0.25 $100 Gain 0.862
#> 3 -988. 0.5 $100 Gain 0.771
#> 4 -988. 1 $100 Gain 0.750
#> 5 -988. 6 $100 Gain 0.456
#> 6 -988. 60 $100 Gain 0.200
PD_med_indiff#> # A tibble: 6 × 4
#> subject prob outcome prop_indiff
#> <dbl> <dbl> <chr> <dbl>
#> 1 -988. 0.95 $100 Gain 0.957
#> 2 -988. 0.9 $100 Gain 0.862
#> 3 -988. 0.7 $100 Gain 0.771
#> 4 -988. 0.5 $100 Gain 0.750
#> 5 -988. 0.3 $100 Gain 0.456
#> 6 -988. 0.05 $100 Gain 0.200
```

If the data are in a tidy format, AUC can be calculated simply by
supplying the data to the `AUC`

function and entering the
necessary parameters. Several parts of the function are looking for the
variable names in the data that indicate the relevant data. The variable
names should be entered within quotes (`"prop_indiff"`

instead of `prop_indiff`

). The `x_axis`

is the
delay/social distance/likelihood of receiving the outcome. The
`amount`

must be set manually because it is possible that an
indifference point with the maximum value (amount of the larger outcome)
does not exist in the data. The `grouping`

factor is designed
to aid in calculating AUC across a whole data set at once (see below).
At this time a `grouping`

factor must be included, even if
there is only *one* set of indifference points being used to
calculate AUC.

```
AUC(dat = DD_med_indiff,
indiff = "prop_indiff",
x_axis = "delay_months",
amount = 1,
grouping = "subject")
#> # A tibble: 1 × 2
#> subject AUC
#> <dbl> <dbl>
#> 1 -988. 0.359
```

To calculate probability discounting, set the flag of
`prob_disc = TRUE.`

The function will convert all of the
likelihoods (`prob`

) to odds against receiving the outcome.
AUC will then be calculated based on the odds against receiving the
outcome. Note, that the output will not clearly indicate that you
calculated AUC for probability discounting data because the function
only returns AUC values.

```
AUC(dat = PD_med_indiff,
indiff = "prop_indiff",
x_axis = "prob",
amount = 1,
grouping = "subject",
prob_disc = TRUE)
#> # A tibble: 1 × 2
#> subject AUC
#> <dbl> <dbl>
#> 1 -988. 0.372
```

Borges et al. (2016)
outlined two alternative methods for calculating AUC. AUClog simply
transforms the `x_axis`

values and calculates AUC based on
the transformed values. AUCord transforms the `x_axis`

values
into their ordinal position and calculates AUC based on the ordinal
position.

We will describe calculating AUCord first because it can be done
simply by changing the type of AUC in the function. Specifically,
`type = "ordinal"`

```
#Ordinal AUC for DD data
AUC(dat = DD_med_indiff,
indiff = "prop_indiff",
x_axis = "delay_months",
amount = 1,
grouping = "subject",
type = "ordinal")
#> # A tibble: 1 × 2
#> subject AUC
#> <dbl> <dbl>
#> 1 -988. 0.359
#Ordinal AUC for PD data
AUC(dat = PD_med_indiff,
indiff = "prop_indiff",
x_axis = "prob",
amount = 1,
groupings = "subject",
prob_disc = TRUE,
type = "ordinal")
#> # A tibble: 1 × 2
#> subject AUC
#> <dbl> <dbl>
#> 1 -988. 0.372
```

*Note on the order of operations*: When calculating AUCord for
probability discounting data, the AUC function will obtain the odds
against and then calculate the ordinal values based on the odds
against.

Calculating AUClog with the AUC function is also relatively simple.
However, in addition to setting the `type = "log"`

you can
also specify the base of the logarithm you wish to use. By default, the
function uses a base of 2. The AUC function uses an adjustment procedure
(based on the average distance between log transformed values) to
account for when the x_axis value equals 0. See below for a full
description of the correction types.

```
#Ordinal AUC for DD data
AUC(dat = DD_med_indiff,
indiff = "prop_indiff",
x_axis = "delay_months",
amount = 1,
grouping = "subject",
type = "log")
#> # A tibble: 1 × 2
#> subject AUC
#> <dbl> <dbl>
#> 1 -988. 0.693
#Ordinal AUC for PD data with log base 10
AUC(dat = PD_med_indiff,
indiff = "prop_indiff",
x_axis = "prob",
amount = 1,
groupings = "subject",
prob_disc = TRUE,
type = "log",
log_base = 10)
#> # A tibble: 1 × 2
#> subject AUC
#> <dbl> <dbl>
#> 1 -988. 0.676
```

A single data set might have several sets of indifference points. The
example data sets include indifference one set of indifference points
per outcome for each participant. Instead of calculating AUC one-by-one
for each specific set, AUC can be calculated *en masse* for all
of the sets of indifference points. To use the *en masse* AUC
calculations, you need to set the variables that can be used to identify
each specific set of indifference points in the `groupings`

factor. The column names should be entered within quotes.

```
#For demonstration, filter for median indifference points for all outcomes.
= examp_DD %>%
DD_med_outcomes filter(subject == -987.987)
#Simple AUC
AUC(dat = DD_med_outcomes,
indiff = "prop_indiff",
x_axis = "delay_months",
amount = 1,
grouping = "outcome")
#> # A tibble: 4 × 2
#> outcome AUC
#> <chr> <dbl>
#> 1 $100 Gain 0.359
#> 2 alcohol 0.0953
#> 3 entertainment 0.405
#> 4 food 0.158
```

For this example, we simplified the example data to the median
indifference points for each outcome. We added
`grouping = "outcome"`

to indicate that the function should
return an AUC value for each outcome in the median indifference point
data set. The `AUC`

returns a different AUC value for each
level in the “grouping” factor.

The grouping function can handle more than one variable at a time by
using the combine function `c()`

on the left hand side of the
`grouping =`

in the `AUC`

function. The columns
that should be used to identify the independent the sets of indifference
points should be entered as
`c("variable1", "variable2", "variable3", etc.)`

. In the
following example, AUC is calculated for each outcome by subject.

```
#AUC by outcome and subject
AUC(dat = examp_DD,
indiff = "prop_indiff",
x_axis = "delay_months",
amount = 1,
grouping = c("outcome", "subject"))
#> # A tibble: 60 × 3
#> # Groups: outcome [4]
#> outcome subject AUC
#> <chr> <dbl> <dbl>
#> 1 $100 Gain -988. 0.359
#> 2 $100 Gain -2 0.000278
#> 3 $100 Gain -1 1
#> 4 $100 Gain 103 0.789
#> 5 $100 Gain 108 0.768
#> 6 $100 Gain 119 0.669
#> 7 $100 Gain 122 0.462
#> 8 $100 Gain 142 0.0642
#> 9 $100 Gain 169 0.0673
#> 10 $100 Gain 175 0.465
#> # … with 50 more rows
```

The order of the `grouping`

variable does not change the
resulting AUC data, only the order in which the columns are
returned.

```
#AUC by outcome and subject
AUC(dat = examp_DD,
indiff = "prop_indiff",
x_axis = "delay_months",
amount = 1,
grouping = c("outcome", "subject"))
#> # A tibble: 60 × 3
#> # Groups: outcome [4]
#> outcome subject AUC
#> <chr> <dbl> <dbl>
#> 1 $100 Gain -988. 0.359
#> 2 $100 Gain -2 0.000278
#> 3 $100 Gain -1 1
#> 4 $100 Gain 103 0.789
#> 5 $100 Gain 108 0.768
#> 6 $100 Gain 119 0.669
#> 7 $100 Gain 122 0.462
#> 8 $100 Gain 142 0.0642
#> 9 $100 Gain 169 0.0673
#> 10 $100 Gain 175 0.465
#> # … with 50 more rows
#AUC by subject and outcome
AUC(dat = examp_DD,
indiff = "prop_indiff",
x_axis = "delay_months",
amount = 1,
grouping = c("subject", "outcome"))
#> # A tibble: 60 × 3
#> # Groups: subject [15]
#> subject outcome AUC
#> <dbl> <chr> <dbl>
#> 1 -988. $100 Gain 0.359
#> 2 -988. alcohol 0.0953
#> 3 -988. entertainment 0.405
#> 4 -988. food 0.158
#> 5 -2 $100 Gain 0.000278
#> 6 -2 alcohol 0.000278
#> 7 -2 entertainment 0.000278
#> 8 -2 food 0.000278
#> 9 -1 $100 Gain 1
#> 10 -1 alcohol 1
#> # … with 50 more rows
```

If you are calculating probability discounting, AUClog, or AUCord,
technically the `AUC()`

function does not transform you
`x_axis`

values. The AUC only uses the default AUC equation
as described by Myerson et
al. (2001). That is AUC is the sum of all the trapezoid area between
successive indifference points. A single trapezoid has an area of,

\[ (x_2 -x_1)*\frac{y_1+y_2}{2} \]

where the *x* values are the successive delays, social
distances, odds against, etc. and the *y* values are the
successive indifference points. The *x* values are all
standardized by the maximum delay, social distance, odds against, etc.
The *y* values are all standardized by the amount of the larger
outcome.

Within the `discAUC`

package are separate functions that
calculate the transformations on the indifference points supplied to the
`AUC()`

function. The `AUC()`

function calls those
specific transformation functions and then uses the transformed data in
the above formula. In other words, there are no special versions of the
above formula like,

\[ [log(x_2) -log(x_1)]*\frac{y_1+y_2}{2} \]

rather, each transformation is conducted and then fed back into the original formulation of AUC.

Each of the specific transformation functions can be called on their own. Using the functions prior to the AUC function is useful if you would like fine control of all of the pieces of the AUC calculation or you would like to inspect the transformed data prior to the AUC calculation.

Myerson et
al. (2001) call for including an indifference point when the x_axis
value = 0. If an experimental assessment of the indifference point at X
= 0 was not obtained then the value should be added. If the indifference
points need to be added, the value of each added indifference should be
equal to the larger outcome (i.e., *A* from a discounting
model*)*.

The `AUC_zeros`

function will add zeros to a tidy set of
indifference points. The function will only add the indifference points
at X = 0 if they did not already exist in the data. Additionally, a new
column will be added to the data file that will indicate which
indifference points were originally included and which indifference
points were added by the function.

```
#Examp_DD data did not include indifference points when delay = 0
AUC_zeros(dat = examp_DD,
indiff = "prop_indiff",
x_axis = "delay_months",
amount = 1,
groupings = c("subject","outcome"))
#> # A tibble: 420 × 5
#> subject outcome delay_months prop_indiff orig
#> <dbl> <chr> <dbl> <dbl> <lgl>
#> 1 103 alcohol 0 1 FALSE
#> 2 103 alcohol 0.0333 0.878 TRUE
#> 3 103 alcohol 0.25 0.749 TRUE
#> 4 103 alcohol 0.5 0.747 TRUE
#> 5 103 alcohol 1 0.741 TRUE
#> 6 103 alcohol 6 0.249 TRUE
#> 7 103 alcohol 60 0.241 TRUE
#> 8 103 entertainment 0 1 FALSE
#> 9 103 entertainment 0.0333 0.987 TRUE
#> 10 103 entertainment 0.25 0.933 TRUE
#> # … with 410 more rows
```

With probability discounting, the odds against values will be in the
opposite order of the likelihoods of receiving the outcome. For that
reason, if the data are probability discounting data you must indicate
that `prob_disc = TRUE`

in the function call. This will
impute an indifference point with a likelihood of occurring of 100%.
With a likelihood of 100% the odds against receiving that outcome will
be converted to 0.

```
#Examp_PD data did not include indifference points when prob = 1
AUC_zeros(dat = examp_PD,
indiff = "prop_indiff",
x_axis = "prob",
amount = 1,
groupings = c("subject","outcome"),
prob_disc = TRUE
)#> # A tibble: 420 × 5
#> subject outcome prob prop_indiff orig
#> <dbl> <chr> <dbl> <dbl> <lgl>
#> 1 103 alcohol 1 1 FALSE
#> 2 103 alcohol 0.95 0.878 TRUE
#> 3 103 alcohol 0.9 0.749 TRUE
#> 4 103 alcohol 0.7 0.747 TRUE
#> 5 103 alcohol 0.5 0.741 TRUE
#> 6 103 alcohol 0.3 0.249 TRUE
#> 7 103 alcohol 0.05 0.241 TRUE
#> 8 103 entertainment 1 1 FALSE
#> 9 103 entertainment 0.95 0.987 TRUE
#> 10 103 entertainment 0.9 0.933 TRUE
#> # … with 410 more rows
```

With the `prep_odds_against()`

function, you can indicate
the `x_axis`

variable and the function will calculate the
odds against receiving the outcome. Odds against is the typical form for
displaying probability discounting because the value of the outcome will
decrease as the odds against receiving the outcome increase. The formula
for calculating the odds against is

\[ \frac{1-p}{p} \]

where *p* is the probability of receiving the outcome.

```
#Make sure to indicate groupings, if necessary
prep_odds_against(dat = examp_PD,
x_axis = "prob",
groupings = c("subject","outcome"))
#> # A tibble: 360 × 5
#> # Groups: subject, outcome [60]
#> subject prob outcome prop_indiff prob_against
#> <dbl> <dbl> <chr> <dbl> <dbl>
#> 1 -988. 0.95 $100 Gain 0.957 0.0526
#> 2 -988. 0.9 $100 Gain 0.862 0.111
#> 3 -988. 0.7 $100 Gain 0.771 0.429
#> 4 -988. 0.5 $100 Gain 0.750 1
#> 5 -988. 0.3 $100 Gain 0.456 2.33
#> 6 -988. 0.05 $100 Gain 0.200 19
#> 7 -988. 0.95 alcohol 0.569 0.0526
#> 8 -988. 0.9 alcohol 0.273 0.111
#> 9 -988. 0.7 alcohol 0.329 0.429
#> 10 -988. 0.5 alcohol 0.189 1
#> # … with 350 more rows
```

The odds against will be added as a new variable in the data set. The
name of the variable will be the name of the original probability
variable (“prob” in this case) with “_against” added afterwards. For the
`examp_PD`

data set, the odds against will be included as
`prob_against`

. Indicating the indifference points in this
function is unnecessary, because it only transforms the probability of
receiving the outcomes.

The `prep_ordinal()`

function behaves very similarly to
`prep_odds_against()`

. `prep_ordinal()`

transforms
the `x_axis`

values into the ordinal position. Note that if
there are indifference points for when the x_axis = 0, the ordinal value
will be calculated as 0.

```
#Groupings must be specified, if necessary
prep_ordinal(dat = examp_DD,
x_axis = "delay_months",
groupings = c("subject","outcome"))
#> # A tibble: 360 × 5
#> # Groups: subject, outcome [60]
#> subject delay_months outcome prop_indiff delay_months_ord
#> <dbl> <dbl> <chr> <dbl> <int>
#> 1 -988. 0.0333 $100 Gain 0.957 1
#> 2 -988. 0.25 $100 Gain 0.862 2
#> 3 -988. 0.5 $100 Gain 0.771 3
#> 4 -988. 1 $100 Gain 0.750 4
#> 5 -988. 6 $100 Gain 0.456 5
#> 6 -988. 60 $100 Gain 0.200 6
#> 7 -988. 0.0333 alcohol 0.569 1
#> 8 -988. 0.25 alcohol 0.273 2
#> 9 -988. 0.5 alcohol 0.329 3
#> 10 -988. 1 alcohol 0.189 4
#> # … with 350 more rows
```

The ordinals will be added as a new variable in the data set. The
name of the variable will be the name of the original probability
variable (“delay_months” in this case) with “_ord” added afterwards. For
the `examp_DD`

data set, the odds against will be included as
`delay_months_ord`

.

If ordinal values are desired for probability discounting data, then
the `prob_disc = TRUE`

flag must be set. For probability
discounting, the ordinal values will be calculated in reverse order to
match taking ordinal values of the odds against receiving the outcome.
In other words, for delay discounting ordinal position of a delay
**increases** as delay **increases** but for
probability typically we want ordinal value **increasing**
as the probability **decreases**.

```
prep_ordinal(dat = examp_PD,
x_axis = "prob",
groupings = c("subject","outcome"),
prob_disc = TRUE)
#> # A tibble: 360 × 5
#> # Groups: subject, outcome [60]
#> subject prob outcome prop_indiff prob_ord
#> <dbl> <dbl> <chr> <dbl> <int>
#> 1 -988. 0.95 $100 Gain 0.957 1
#> 2 -988. 0.9 $100 Gain 0.862 2
#> 3 -988. 0.7 $100 Gain 0.771 3
#> 4 -988. 0.5 $100 Gain 0.750 4
#> 5 -988. 0.3 $100 Gain 0.456 5
#> 6 -988. 0.05 $100 Gain 0.200 6
#> 7 -988. 0.95 alcohol 0.569 1
#> 8 -988. 0.9 alcohol 0.273 2
#> 9 -988. 0.7 alcohol 0.329 3
#> 10 -988. 0.5 alcohol 0.189 4
#> # … with 350 more rows
```

`prep_ordinal_all()`

is currently included to account for
situations in which participants experience different x_axis values. For
example, indifference points might be obtained for participant 1 at 1
week, 1 month, and 6 months and for participant 2 at 1 week, 3 months,
and 1 year. In such a case, it would not be appropriate to label the
ordinals as 1, 2, and 3 for the respective subjects. Young (2016) has
proposed using a Halton sequence to measure discounting, but to my
knowledge this technique has not be used. In a Halton sequence, a vast
number of x_axis values would be used to provide a more accurate picture
of the global discounting curve.

To account for this sort of situation, the
`prep_ordinal_all()`

function calculates the ordinal value
for each x_axis value based on the respective ordinal position in the
list of all x_axis values not just the ordinal position of values for a
specific subject.

As there is currently limited utility for this technique, an example data set is not included in the package and must be created

```
#Create data based on values included in above example
=
examp_ord_all tibble(
sub = c(1, 1, 1, 2, 2, 2),
delay_weeks = c(1, 4, 26, 1, 13, 52)
)
#Groupings are not necessary
prep_ordinal_all(dat = examp_ord_all,
x_axis = "delay_weeks")
#> # A tibble: 6 × 3
#> sub delay_weeks delay_weeks_ord
#> <dbl> <dbl> <int>
#> 1 1 1 1
#> 2 1 4 2
#> 3 1 26 4
#> 4 2 1 1
#> 5 2 13 3
#> 6 2 52 5
```

As with `prep_ordinal()`

, if the data are from probability
discounting then `prep_ordinal_all()`

will reverse order the
ordinal values.

When log transforming any values, a common concern is how to account
for `log(0)`

which is undefined.

A common recommendation is to add a constant value (e.g., 1) to all
values in the data set and transform the “corrected” values. One
limitation of just using correction factor can change the relative
distances between the log transformed `x_axis`

values (those
relative distances being the whole point of the log transformation in
the first place).

```
prep_log_AUC(dat = examp_DD,
x_axis = "delay_months",
type = "corr")
#> # A tibble: 360 × 5
#> subject delay_months outcome prop_indiff log_delay_months
#> <dbl> <dbl> <chr> <dbl> <dbl>
#> 1 103 0.0333 alcohol 0.878 0.0473
#> 2 103 0.25 alcohol 0.749 0.322
#> 3 103 0.5 alcohol 0.747 0.585
#> 4 103 1 alcohol 0.741 1
#> 5 103 6 alcohol 0.249 2.81
#> 6 103 60 alcohol 0.241 5.93
#> 7 103 0.0333 entertainment 0.987 0.0473
#> 8 103 0.25 entertainment 0.933 0.322
#> 9 103 0.5 entertainment 0.929 0.585
#> 10 103 1 entertainment 0.935 1
#> # … with 350 more rows
```

The correction factor can be specified by setting
`correction`

to the desired value.

```
prep_log_AUC(dat = examp_DD,
x_axis = "delay_months",
type = "corr",
correction = .25)
#> # A tibble: 360 × 5
#> subject delay_months outcome prop_indiff log_delay_months
#> <dbl> <dbl> <chr> <dbl> <dbl>
#> 1 103 0.0333 alcohol 0.878 -1.82
#> 2 103 0.25 alcohol 0.749 -1
#> 3 103 0.5 alcohol 0.747 -0.415
#> 4 103 1 alcohol 0.741 0.322
#> 5 103 6 alcohol 0.249 2.64
#> 6 103 60 alcohol 0.241 5.91
#> 7 103 0.0333 entertainment 0.987 -1.82
#> 8 103 0.25 entertainment 0.933 -1
#> 9 103 0.5 entertainment 0.929 -0.415
#> 10 103 1 entertainment 0.935 0.322
#> # … with 350 more rows
```

An additional problem that will crop up when calculating log transformed x_axis values is that you will have negative log(X) values, which is problematic for the standard method for calculating AUC. The following method for correcting log transformed x_axis values accounts for the negative log(X) values.

Included in this package is a method for correcting log(0) values that is referred to as “adjust.” The adjust method calculates the mean distance between each successive log transformed x_axis value, forces log(0) = 0, and then add the mean log distance to the log transformed indifference points. Essentially, the adjust procedure uses a correction factor that will shift the indifference points upwards without changing the underlying distribution of the data. Specifically,

\[ \kappa = \frac{\sum_{n=1}^{N-1} (\log(X_{n+1}) - \log(X_{n}))}{N} \]

and

\[ X_{logtrans} = \log(X_{old}) + \kappa \]

The following code block demonstrates the calculations for adjustment correction.

```
#Initial vector with delays (in weeks)
= c(0, 0.25, 1, 4, 26)
delays
#Log transform delays
= log(delays, base = 2)
log_delays
#Display values
log_delays#> [1] -Inf -2.00000 0.00000 2.00000 4.70044
#Eliminate log(0) = -Inf
= log_delays[-1]
non_zero_log
#Calculate the difference between succesive non-zero indifference points
= diff(non_zero_log)
log_diff
#Print log_diff
log_diff#> [1] 2.00000 2.00000 2.70044
#Adjustment factor
= mean(log_diff)
adjustment
#Adjust log delays
= log_delays + adjustment
new_log_delays
#Set log(0) = 0
1] = 0
new_log_delays[
#Print adjusted log delays
new_log_delays#> [1] 0.0000000 0.2334799 2.2334799 4.2334799 6.9339196
```

Additionally, the adjust method includes an automated correction for x_axis values that are between 0 and 1. The AUC formula assumes that the lowest x_axis value is 0. However, if an x_axis value is between 0 and 1 (for example 1 week will be expressed as 0.25 if the delays are expressed as months) then the log transformed x_axis value will be negative. The negative x_axis values will introduce errors to the AUC calculations. One solution is to express all indifference points in the lowest possible unit (weeks instead of months, days instead of weeks, etc.). The solution this package uses is to increase all of the log transformed x_axis values by the absolute value of the minimum log transformed x_axis value. Specifically,

\[ X_{logtrans} = \log(X_{old}) + \log(\min(X)) \]

is the formula for correcting for negative log transformed x_axis values.

In total, the final logs transformed X values can be found with the formula,

\[ X_{logtrans} = \log(X_{old}) + \kappa + \log(\min(X)) \]

This adjustment procedure is the default method the AUC function uses is an “adjustment” to all of the log transformed values.

The following code block gives an example of the
`"adjust"`

method for log AUC.

```
#Default adjust method
prep_log_AUC(dat = examp_DD,
x_axis = "delay_months",
type = "adjust")
#> # A tibble: 360 × 5
#> subject delay_months outcome prop_indiff log_delay_months
#> <dbl> <dbl> <chr> <dbl> <dbl>
#> 1 103 0.0333 alcohol 0.878 0
#> 2 103 0.25 alcohol 0.749 2.91
#> 3 103 0.5 alcohol 0.747 3.91
#> 4 103 1 alcohol 0.741 4.91
#> 5 103 6 alcohol 0.249 7.49
#> 6 103 60 alcohol 0.241 10.8
#> 7 103 0.0333 entertainment 0.987 0
#> 8 103 0.25 entertainment 0.933 2.91
#> 9 103 0.5 entertainment 0.929 3.91
#> 10 103 1 entertainment 0.935 4.91
#> # … with 350 more rows
```

If the adjust method is still desired with out the decimal
correction, set `dec_offset = FALSE`

.

```
#Log adjust method with no decimal correction
prep_log_AUC(dat = examp_DD,
x_axis = "delay_months",
type = "adjust",
dec_offset = FALSE)
#> # A tibble: 360 × 5
#> subject delay_months outcome prop_indiff log_delay_months
#> <dbl> <dbl> <chr> <dbl> <dbl>
#> 1 103 0.0333 alcohol 0.878 -4.91
#> 2 103 0.25 alcohol 0.749 -2
#> 3 103 0.5 alcohol 0.747 -1
#> 4 103 1 alcohol 0.741 0
#> 5 103 6 alcohol 0.249 2.58
#> 6 103 60 alcohol 0.241 5.91
#> 7 103 0.0333 entertainment 0.987 -4.91
#> 8 103 0.25 entertainment 0.933 -2
#> 9 103 0.5 entertainment 0.929 -1
#> 10 103 1 entertainment 0.935 0
#> # … with 350 more rows
```

The final method for log transformation is not strictly a log transformation. Gilroy et al. (2021) proposed using the IHS transformation to aid in conduct demand analyses. Specifically, IHS was proposed to handle consumption values obtained from hypothetical purchase tasks and the associated problems of log transforming zero consumption values. I highly recommend reading their description of the IHS transformation. The first main benefit of the IHS transformation is that \(\mathrm{IHS}(0) = 0\). The second benefit of IHS is that it produces values that are generally similar to log transformed values (i.e., a high degree of correlation).

Specifically,

\[ \mathrm{IHS}(X) = \sinh^{-1}{ X} = \mathrm{arsinh } (X) \]

and the specific calculations for a single value are

\[ \mathrm{IHS}(X) = \ln{(X + \sqrt{X^2 + 1})} \]

```
#IHS transformation
prep_log_AUC(dat = examp_DD,
x_axis = "delay_months",
type = "IHS")
#> # A tibble: 360 × 5
#> subject delay_months outcome prop_indiff log_delay_months
#> <dbl> <dbl> <chr> <dbl> <dbl>
#> 1 103 0.0333 alcohol 0.878 0.0333
#> 2 103 0.25 alcohol 0.749 0.247
#> 3 103 0.5 alcohol 0.747 0.481
#> 4 103 1 alcohol 0.741 0.881
#> 5 103 6 alcohol 0.249 2.49
#> 6 103 60 alcohol 0.241 4.79
#> 7 103 0.0333 entertainment 0.987 0.0333
#> 8 103 0.25 entertainment 0.933 0.247
#> 9 103 0.5 entertainment 0.929 0.481
#> 10 103 1 entertainment 0.935 0.881
#> # … with 350 more rows
```

The functions described above can be used in a successive fashion. For example, when obtaining AUClog for probability discounting: first, probabilities of receiving an outcome should be converted to odds against and then second, log transformations should be performed on odds against values.

The main factor to account for when feeding the results from one
function to the next function is that the `x_axis`

value of
interest changes after transformations have been transformed. Following
the above example, we transform probability values into odds against
values. For the log transformation, we want to treat the odds against
values as the x_axis of interest.

```
#Calculate odds against
= prep_odds_against(dat = examp_PD,
examp_PD_odds x_axis = "prob",
groupings = c("subject","outcome"))
#Print odds against
examp_PD_odds#> # A tibble: 360 × 5
#> # Groups: subject, outcome [60]
#> subject prob outcome prop_indiff prob_against
#> <dbl> <dbl> <chr> <dbl> <dbl>
#> 1 -988. 0.95 $100 Gain 0.957 0.0526
#> 2 -988. 0.9 $100 Gain 0.862 0.111
#> 3 -988. 0.7 $100 Gain 0.771 0.429
#> 4 -988. 0.5 $100 Gain 0.750 1
#> 5 -988. 0.3 $100 Gain 0.456 2.33
#> 6 -988. 0.05 $100 Gain 0.200 19
#> 7 -988. 0.95 alcohol 0.569 0.0526
#> 8 -988. 0.9 alcohol 0.273 0.111
#> 9 -988. 0.7 alcohol 0.329 0.429
#> 10 -988. 0.5 alcohol 0.189 1
#> # … with 350 more rows
#Add zeros, but already converted to odds against so prob_disc = FALSE
#Note the x_axis value was changed to "prob_against" which was the newly added column
= AUC_zeros(dat = examp_PD_odds,
examp_PD_odds x_axis = "prob_against",
indiff = "prop_indiff",
amount = 1,
groupings = c("subject","outcome"),
prob_disc = FALSE)
#Print odds agianst with zeros
examp_PD_odds#> # A tibble: 420 × 6
#> subject outcome prob_against prop_indiff orig prob
#> <dbl> <chr> <dbl> <dbl> <lgl> <dbl>
#> 1 -988. $100 Gain 0 1 FALSE NA
#> 2 -988. $100 Gain 0.0526 0.957 TRUE 0.95
#> 3 -988. $100 Gain 0.111 0.862 TRUE 0.9
#> 4 -988. $100 Gain 0.429 0.771 TRUE 0.7
#> 5 -988. $100 Gain 1 0.750 TRUE 0.5
#> 6 -988. $100 Gain 2.33 0.456 TRUE 0.3
#> 7 -988. $100 Gain 19 0.200 TRUE 0.05
#> 8 -988. alcohol 0 1 FALSE NA
#> 9 -988. alcohol 0.0526 0.569 TRUE 0.95
#> 10 -988. alcohol 0.111 0.273 TRUE 0.9
#> # … with 410 more rows
#Odds against to log_odds against.
#Note the x_axis value was changed to "prob_against" which was the newly added column
= prep_log_AUC(dat = examp_PD_odds,
examp_PD_log_odds x_axis = "prob_against",
type = "adjust")
examp_PD_log_odds#> # A tibble: 420 × 7
#> subject outcome prob_against prop_indiff orig prob log_prob_against
#> <dbl> <chr> <dbl> <dbl> <lgl> <dbl> <dbl>
#> 1 -988. $100 Gain 0 1 FALSE NA 0
#> 2 -988. $100 Gain 0.0526 0.957 TRUE 0.95 1.70
#> 3 -988. $100 Gain 0.111 0.862 TRUE 0.9 2.78
#> 4 -988. $100 Gain 0.429 0.771 TRUE 0.7 4.72
#> 5 -988. $100 Gain 1 0.750 TRUE 0.5 5.95
#> 6 -988. $100 Gain 2.33 0.456 TRUE 0.3 7.17
#> 7 -988. $100 Gain 19 0.200 TRUE 0.05 10.2
#> 8 -988. alcohol 0 1 FALSE NA 0
#> 9 -988. alcohol 0.0526 0.569 TRUE 0.95 1.70
#> 10 -988. alcohol 0.111 0.273 TRUE 0.9 2.78
#> # … with 410 more rows
```

Finally, when using the completely transformed data, simply supply it to the AUC function.

```
AUC(dat = examp_PD_log_odds,
indiff = "prop_indiff",
x_axis = "log_prob_against",
amount = 1,
groupings = c("subject","outcome"))
#> # A tibble: 60 × 3
#> # Groups: subject [15]
#> subject outcome AUC
#> <dbl> <chr> <dbl>
#> 1 -988. $100 Gain 0.676
#> 2 -988. alcohol 0.309
#> 3 -988. entertainment 0.735
#> 4 -988. food 0.379
#> 5 -2 $100 Gain 0.0833
#> 6 -2 alcohol 0.0833
#> 7 -2 entertainment 0.0833
#> 8 -2 food 0.0833
#> 9 -1 $100 Gain 1
#> 10 -1 alcohol 1
#> # … with 50 more rows
```