BMD modeling using ToxicR — bmd

Calculate the benchmark dose (BMD) for continuous dose-response data with optional model averaging. This function is intended to model the dose-response of mutation frequency using the ToxicR software.

Usage

bmd_toxicr(
  mf_data,
  data_type = "individual",
  dose_col = "dose",
  response_col = c("mf_min", "mf_max"),
  sd_col = NULL,
  n_col = NULL,
  bmr_type = "rel",
  bmr = 0.5,
  model = "exp-aerts",
  alpha = 0.05,
  model_averaging = TRUE,
  plot_results = FALSE,
  ma_summary = FALSE,
  output_path = NULL,
  ...
)

Arguments

mf_data: A data frame containing the dose-response data. Data may be individual for each sample or averaged over dose groups. Required columns for individual data are the column containing the dose dose_col and the column(s) containing the mutation frequency data response_col(s). Summary data must include the dose_col, the response_col(s) containing the mean response for each dose group, the sd_col containing the standard deviation of the response data, and the n_col containing the sample size for each dose group.
data_type: A string specifying the type of response data. Data may be response per individual or summarised across dose groups. Options are ("individual", "summary"). Default is "individual".
dose_col: The column in mf_data containing the dose data. Values must be numeric. Default is "dose".
response_col: The column(s) in mf_data containing the mutation frequency data. For summarised data types, this should be the mean response for each dose group. Multiple response_cols can be provided.
sd_col: The column in mf_data containing the standard deviation of the summarised response data. This is only required for data_type = "summary". If multiple response columns are provided, multiple sd_cols should be provided in the same order. Default is NULL.
n_col: The column in mf_data containing the sample size of each dose group. This is only required for data_type = "summary". If multiple response columns are provided, multiple n_cols should be provided in the same order. Default is NULL.
bmr_type: The type of benchmark response. Options are: "rel", "sd", "hybrid", "abs". Default is "rel". See details for more information.
bmr: A numeric value specifying the benchmark response. The bmr is defined in relation to the calculation requested in bmr_type. Default is 0.5.
model: The model type to use. Options are "all" or a vector of model types. Default is "exp-aerts", the Exponential model. See details for available models. Note that model averaging will use a pre-defined model set. See details for more information.
alpha: The specified nominal coverage rate for computation of the lower and upper confidence intervals for the benchmark dose (BMDL, BMDU). The confidence level is calculated as \(100\times(1-2\alpha)\% \). The default is 0.05 (90% CI).
model_averaging: A logical value indicating whether to use model averaging. Default is TRUE (recommended).
plot_results: A logical value indicating whether to plot the BMD models and/or the Cleveland plots. Default is FALSE. If TRUE, the function will save plots to the output_path or return them as a list alongside the summary of the results.
ma_summary: A logical value indicating whether to return the summary of the model averaging results. Default is FALSE.
output_path: The file path indicating where to save the plots. If NULL, the plots will automatically be returned as a list alongside the bmd results.
...: Additional arguments to be passed to the model fitting function. For more information, see single_continuous_fit or ma_continuous_fit if model averaging.

Value

If model_averaging = FALSE, the function returns a data frame with the BMD values and the \(100\times(1-2\alpha)\% \) confidence intervals (BMDL, BMDU)for each response column and each model listed. The AIC value is calculated for each model to compare fits. The AIC is calculated as maximum likelihood + 2 * degrees of freedom.

If model_averaging = TRUE, the function returns a data frame with the BMD values and the \(100\times(1-2\alpha)\% \) confidence intervals (BMDL, BMDU) for each response column calculated using model averaging. If ma_summary = TRUE, the function will return the posterior probabilities used in the model averaging.

If plot_results = TRUE, the function will plot the fited models or the model averaged model to the data. When mode averaging, the function will also make a Cleveland plot, saved to the output_path. Here, the BMDs are plotted for each model in the set alongside the model averaged BMD. The BMD is represented by a red dot. The size of the dot is scaled on the model probability with the Model Average having a value of 100%. The BMDL and BMDU are expressed as interval bars. Plots may be automatically exported to an output_path. Alternatively, if output_path = NULL, the function will return a list that includes summary (the data frame contianing the BMD results), and all generated plots.

Details

Available model types for single model fitting are:

"exp-aerts": \(f(x) = a(1 + (c-1)(1-\exp(-bx^{d}))) \)
"invexp-aerts": \(f(x) = a(1 + (c-1)(\exp(-bx^{-d})))\)
"gamma-aerts": \(f(x) = a(1 + (c-1)(Gamma(bx^d;xi)))\)
"invgamma-aerts": \(f(x) = a(1 + (c-1)(1-Gamma(bx^{-d};xi)))\)
"hill-aerts": \(f(x) = a(1 + (c-1)(1-\frac{b^d}{b^d + x^d}))\)
"lomax-aerts": \(f(x) = a\left\{1 + (c-1)(1-\left(\frac{b}{b+x^d} \right)^\xi) \right\}\)
"invlomax-aerts": \(f(x) = a\left\{1 + (c-1)(\left(\frac{b}{b+x^{-d}} \right))^\xi \right\}\)
"lognormal-aerts": \(f(x) = a\left\{1 + (c-1)\left(\Phi( \ln(b) + d\times \ln(x))\right) \right\}\)
"logskew-aerts": \(f(x) = a\left\{1 + (c-1)\left(\Phi_{SN}( \ln(b) + d\times \ln(x); \xi )\right) \right\}\)
"invlogskew-aerts": \(f(x) = a\left\{1 + (c-1)\left(1 - \Phi_{SN}( \ln(b) - d\times \ln(x); \xi )\right) \right\}\)
"logistic-aerts": \(f(x) = \frac{c}{1 + \exp(-a - b\times x^d)} \)
"probit-aerts": \(f(x) = c\left(\Phi(a + b\times x^d)\right) \)
"LMS": \(f(x) = a(1 + (c-1)(1 - \exp(-bx - dx^2)))\)
"gamma-efsa": \(f(x) = a(1 + (c-1)(Gamma(bx; d)))\)

Here: \(\Phi(\cdot)\) is the standard normal distribution and \(\Phi_{SN}(\cdot;\cdot)\) is the skew-normal distribution. See single_continuous_fit for more details.

Model averaging is done over the the model set described in The European Food Safety Authority's (2022) Guidance on the use of the benchmark dose approach in risk assessment. These models are (normal then lognormal for each model): exp-aerts, invexp-aerts, hill-aerts, lognormal-aerts, gamma-efsa, LMS, probit-aerts, and logistic-aerts. See ma_continuous_fit for more details.

BMR types for continuous models:

Relative deviation (default; bmr_type = "rel"). This defines the BMD as the dose that changes the control mean/median a certain percentage from the background dose. It is the dose that solves \(\mid f(dose) - f(0) \mid = (1 \pm BMR) f(0)\)
Standard deviation (bmr_type = "sd"). This defines the BMD as the dose associated with the mean/median changing a specified number of standard deviations from the mean at the control dose. It is the dose that solves \(\mid f(dose)-f(0) \mid = BMR \times \sigma\)
Absolute deviation (bmr_type="abs"). This defines the BMD as an absolute change from the control dose of zero by a specified amount. That is the BMD is the dose that solves the equation \(\mid f(dose) - f(0) \mid = BMR\).
Hybrid deviation (bmr_type = "hybrid"). This defines the BMD that changes the probability of an adverse event by a stated amount relative to no exposure (i.e 0). That is, it is the dose that solves \(\frac{Pr(X > x| dose) - Pr(X >x|0)}{Pr(X < x|0)} = BMR\). For this definition, \(Pr(X < x|0) = 1 - Pr(X > X|0) = \pi_0\), where \(0 \leq \pi_0 < 1\) is defined by the user as "point_p," and it defaults to 0.01. Note: this discussion assumed increasing data. The fitter determines the direction of the data and inverts the probability statements for decreasing data.

Examples

if (requireNamespace("ToxicR", quietly = TRUE)) {
  # Calculate the BMD for a 50% increase in mutation frequency from control
  # Individual data with Model averaging.
  example_file <- system.file("extdata", "Example_files",
                              "example_mutation_data_filtered.rds",
                              package = "MutSeqR")
  example_data <- readRDS(example_file)
  mf <- calculate_mf(example_data, retain_metadata_cols = "dose")
  bmd <- bmd_toxicr(mf_data = mf,
                    dose_col = "dose",
                    response_col = c("mf_min", "mf_max"))
  # Plot the results using plot_ci()
  plot <- plot_ci(bmd, order = "asc", log_scale = FALSE)

  # Summary data with Model averaging.
  mf_sum <- mf %>%
    dplyr::group_by(dose) %>%
    dplyr::summarise(mean_mf_min = mean(mf_min),
                     sd_min = sd(mf_min),
                     n_min = dplyr::n(),
                     mean_mf_max = mean(mf_max),
                     sd_max = sd(mf_max),
                     n_max = dplyr::n())
  bmd <- bmd_toxicr(mf_data = mf_sum,
                    data_type = "summary",
                    dose_col = "dose",
                    response_col = c("mean_mf_min", "mean_mf_max"),
                    sd_col = c("sd_min", "sd_max"),
                    n_col = c("n_min", "n_max"))
}
#> Performing internal depth correction to prevent double-counting...
#> Internal depth correction complete.
#> Joining with `by = join_by(sample)`
#> Joining with `by = join_by(sample)`