Bayesian DEB Modelling of Eisenia fetida Growth and DEBtox Analysis

Branimir K. Hackenberger, Tamara Djerdj, Domagoj K. Hackenberger

2026-06-17

Illustrative settings. To keep build time within CRAN’s vignette limits the chunks below use 2 chains × (300 + 300) iterations on a 5-individual subset of eisenia_growth. The publication-grade analysis with 4 chains × (1000 + 1000) iterations on all 21 individuals is reproduced by the scripts in the replication archive (BayesianDEB_replication.zip).

1 Introduction

Dynamic Energy Budget (DEB) theory provides a mechanistic, thermodynamically consistent framework for describing how organisms acquire and utilise energy for maintenance, growth, development, and reproduction (Kooijman 2010). Classical DEB calibration relies on deterministic optimisation, yielding a single best-fit parameter vector without formal uncertainty quantification.

BayesianDEB embeds the DEB ordinary differential equation (ODE) system within a Bayesian state-space model, using Stan’s Hamiltonian Monte Carlo (HMC) sampler (Carpenter et al. 2017) to explore the full joint posterior distribution of all parameters. This approach naturally provides:

This vignette demonstrates the complete workflow on two standard ecotoxicological test organisms:

  1. Eisenia fetida (earthworm) — individual growth model, then a hierarchical multi-individual analysis.
  2. DEBtox analysis — toxicokinetic-toxicodynamic (TKTD) model on growth data under toxicant exposure, with EC50/NEC estimation.

2 Prerequisites

library(BayesianDEB)
library(ggplot2)
library(posterior)  # for summarise_draws()
#> This is posterior version 1.6.1
#> 
#> Attaching package: 'posterior'
#> The following objects are masked from 'package:stats':
#> 
#>     mad, sd, var
#> The following objects are masked from 'package:base':
#> 
#>     %in%, match

BayesianDEB requires cmdstanr and a working CmdStan installation for model fitting. Data preparation, prior specification, and utility functions work without Stan.

# Internal helper; emits an informative error when CmdStan is missing.
BayesianDEB:::check_cmdstanr()

3 Part 1: Eisenia fetida Growth

3.1 Data description

The eisenia_growth dataset contains simulated weekly length measurements for 21 Eisenia fetida individuals over 84 days (12 weeks). The simulation used the standard 2-state DEB model (reserve \(E\), structure \(V\)) with parameters representative of E. fetida from the AmP collection: \(\{p_{Am}\} = 5.0\) J d\(^{-1}\) cm\(^{-2}\), \([p_M] = 0.5\) J d\(^{-1}\) cm\(^{-3}\), \(\kappa = 0.75\), \(v = 0.2\) cm d\(^{-1}\), \([E_G] = 400\) J cm\(^{-3}\). Individual variation in \(\{p_{Am}\}\) (CV \(\approx\) 10%) and Gaussian observation error (\(\sigma_L = 0.015\) cm) were added.

data(eisenia_growth)

# Structure: 273 obs, 3 variables (id, time, length)
str(eisenia_growth)
#> 'data.frame':    273 obs. of  3 variables:
#>  $ id    : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ time  : num  0 7 14 21 28 35 42 49 56 63 ...
#>  $ length: num  0.103 0.109 0.11 0.107 0.137 ...

length(unique(eisenia_growth$id))   # 21 individuals
#> [1] 21
length(unique(eisenia_growth$time)) # 13 time points (days 0–84)
#> [1] 13
ggplot(eisenia_growth, aes(time, length, group = id)) +
  geom_line(alpha = 0.3, colour = "steelblue") +
  geom_point(size = 0.8, alpha = 0.4) +
  theme_bw(base_size = 12) +
  labs(x = "Time (days)", y = expression(paste("Structural length ", L, " (cm)")),
       title = "Eisenia fetida: 21 individuals, 12 weeks")
Growth trajectories of 21 *E. fetida* individuals.  Structural length $L = V^{1/3}$ measured weekly over 12 weeks.

Growth trajectories of 21 E. fetida individuals. Structural length \(L = V^{1/3}\) measured weekly over 12 weeks.

Key features visible in the data:

3.2 Individual model: single organism

We start with a single individual to validate the approach.

3.2.1 Step 1: Prepare data

df1 <- eisenia_growth[eisenia_growth$id == 5, ]
dat1 <- bdeb_data(growth = df1, f_food = 1.0)
dat1
#> 
#> ── BDEB Data ──
#> 
#> ℹ Individuals: 1
#> ℹ Endpoints: growth
#> ℹ Functional response (f): 1
#> → Growth: 13 observations, t = [0, 84]

The f_food = 1.0 argument specifies ad libitum feeding (\(f = 1\), the ratio of actual to maximum ingestion rate; Kooijman 2010, Eq. 2.3).

3.2.2 Step 2: Specify model and priors

The individual model tracks two state variables: reserve energy \(E\) (J) and structural volume \(V\) (cm\(^3\)), governed by:

\[ \frac{dE}{dt} = f\{p_{Am}\}L^2 - \frac{EvL}{E + [E_G]V}, \qquad \frac{dV}{dt} = \frac{\kappa \dot{p}_C - [p_M]V}{[E_G]} \]

where \(L = V^{1/3}\) is structural length. Observed lengths are assumed to follow \(L_\text{obs} \sim \mathcal{N}(\hat{L}, \sigma_L)\).

We set biologically informed priors based on published AmP values for earthworms:

Parameter Prior Rationale
\(\{p_{Am}\}\) LogNormal(1.5, 0.5) Median \(e^{1.5} \approx 4.5\); AmP range 3–8
\([p_M]\) LogNormal(−1.0, 0.5) Median \(e^{-1} \approx 0.37\); typical 0.15–0.8
\(\kappa\) Beta(3, 2) Mode 0.67; earthworms allocate ~60–80% to soma
\(v\) LogNormal(−1.5, 0.5) Median \(e^{-1.5} \approx 0.22\) cm/d
\([E_G]\) LogNormal(6.0, 0.5) Median \(e^6 \approx 403\); typical 200–800
\(\sigma_L\) HalfNormal(0.05) Measurement precision ~0.01–0.03 cm
mod1 <- bdeb_model(dat1, type = "individual",
  priors = list(
    p_Am    = prior_lognormal(mu = 1.5, sigma = 0.5),
    p_M     = prior_lognormal(mu = -1.0, sigma = 0.5),
    kappa   = prior_beta(a = 3, b = 2),
    v       = prior_lognormal(mu = -1.5, sigma = 0.5),
    E_G     = prior_lognormal(mu = 6.0, sigma = 0.5),
    sigma_L = prior_halfnormal(sigma = 0.05)
  ))
mod1
#> 
#> ── BDEB Model Specification ──
#> 
#> ℹ Type: individual
#> ℹ Stan model: bdeb_individual_growth
#> ℹ Individuals: 1
#> ℹ Endpoints: growth
#> 
#> ── Priors
#> →   p_Am: LogNormal(1.5, 0.5)
#> →   p_M: LogNormal(-1.0, 0.5)
#> →   kappa: Beta(3.0, 2.0)
#> →   v: LogNormal(-1.5, 0.5)
#> →   E_G: LogNormal(6.0, 0.5)
#> →   sigma_L: HalfNormal(0.05)
#> →   E0: LogNormal(0.0, 1.0)
#> →   L0: LogNormal(-2.0, 1.0)

Unspecified priors (E0, L0) are filled automatically from prior_default("individual").

3.2.3 Step 3: Fit via MCMC

The model is compiled to C++ and sampled using the No-U-Turn Sampler (NUTS; Hoffman and Gelman 2014) with the stiff BDF ODE solver.

fit1 <- bdeb_fit(mod1,
  chains        = 2,
  iter_warmup   = 300,
  iter_sampling = 300,
  refresh       = 100,
  seed          = 42
)
#> ℹ Compiling Stan model: 'bdeb_individual_growth'
#> ℹ Running MCMC (2 chains, 300 iterations each)
#> Running MCMC with 2 parallel chains...
#> 
#> Chain 1 Iteration:   1 / 600 [  0%]  (Warmup) 
#> Chain 2 Iteration:   1 / 600 [  0%]  (Warmup)
#> Chain 2 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 2 Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in '/tmp/Rtmp1KA7pd/model-c47577a84097.stan', line 111, column 4 to column 35)
#> Chain 2 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 2 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 2
#> Chain 2 Iteration: 100 / 600 [ 16%]  (Warmup) 
#> Chain 1 Iteration: 100 / 600 [ 16%]  (Warmup) 
#> Chain 2 Iteration: 200 / 600 [ 33%]  (Warmup) 
#> Chain 1 Iteration: 200 / 600 [ 33%]  (Warmup) 
#> Chain 2 Iteration: 300 / 600 [ 50%]  (Warmup) 
#> Chain 2 Iteration: 301 / 600 [ 50%]  (Sampling) 
#> Chain 1 Iteration: 300 / 600 [ 50%]  (Warmup) 
#> Chain 1 Iteration: 301 / 600 [ 50%]  (Sampling) 
#> Chain 2 Iteration: 400 / 600 [ 66%]  (Sampling) 
#> Chain 1 Iteration: 400 / 600 [ 66%]  (Sampling) 
#> Chain 2 Iteration: 500 / 600 [ 83%]  (Sampling) 
#> Chain 2 Iteration: 600 / 600 [100%]  (Sampling) 
#> Chain 2 finished in 3.2 seconds.
#> Chain 1 Iteration: 500 / 600 [ 83%]  (Sampling) 
#> Chain 1 Iteration: 600 / 600 [100%]  (Sampling) 
#> Chain 1 finished in 3.8 seconds.
#> 
#> Both chains finished successfully.
#> Mean chain execution time: 3.5 seconds.
#> Total execution time: 3.8 seconds.
fit1
#> 
#> ── BDEB Fit ──
#> 
#> ℹ Model type: individual
#> ℹ Algorithm: sampling (NUTS)
#> ℹ Chains: 2, Warmup: 300, Sampling: 300
#> ✔ No divergent transitions

3.2.4 Step 4: Convergence diagnostics

Diagnostics follow the recommendations of Vehtari et al. (2021):

diag1 <- bdeb_diagnose(fit1)

What to check:

  • Divergences = 0 — if present, increase adapt_delta (e.g., 0.95).
  • \(\hat{R} < 1.01\) for all parameters — chains have mixed.
  • Bulk ESS > 400 — sufficient effective samples for posterior means.
  • Tail ESS > 400 — sufficient for credible interval endpoints.
  • E-BFMI > 0.3 — momentum resampling is efficient.
plot(fit1, type = "trace",
     pars = c("p_Am", "p_M", "kappa", "sigma_L"))
MCMC trace plots for core DEB parameters.  Well-mixed chains should appear as overlapping 'hairy caterpillars'.

MCMC trace plots for core DEB parameters. Well-mixed chains should appear as overlapping ‘hairy caterpillars’.

# `bayesplot::mcmc_pairs` requires gridExtra (a Suggests of bayesplot).
plot(fit1, type = "pairs",
     pars = c("p_Am", "p_M", "kappa", "E_G"))
Bivariate posterior scatter.  Strong correlation between $\{p_{Am}\}$ and $[p_M]$ is expected: both control ultimate size $L_\infty = \kappa \{p_{Am}\} / [p_M]$.

Bivariate posterior scatter. Strong correlation between \(\{p_{Am}\}\) and \([p_M]\) is expected: both control ultimate size \(L_\infty = \kappa \{p_{Am}\} / [p_M]\).

3.2.5 Step 5: Posterior summary

summary(fit1,
  pars = c("p_Am", "p_M", "kappa", "v", "E_G", "sigma_L"),
  prob = 0.95)
#> # A tibble: 6 × 9
#>   variable     mean        sd   median    `2.5%` `97.5%`  rhat ess_bulk ess_tail
#>   <chr>       <dbl>     <dbl>    <dbl>     <dbl>   <dbl> <dbl>    <dbl>    <dbl>
#> 1 p_Am       5.44     2.90      4.69     1.70    1.25e+1 0.998     568.     386.
#> 2 p_M        0.408    0.217     0.355    0.141   9.93e-1 0.999    1014.     353.
#> 3 kappa      0.577    0.190     0.570    0.226   8.97e-1 1.00      554.     409.
#> 4 v          0.212    0.0927    0.193    0.0858  4.82e-1 1.00      454.     484.
#> 5 E_G      461.     172.      434.     205.      8.98e+2 1.00      402.     272.
#> 6 sigma_L    0.0130   0.00315   0.0123   0.00828 2.04e-2 1.00      439.     301.

3.2.6 Step 6: Posterior predictive check

Posterior predictive checks (PPCs) compare data replicated from the fitted model (\(L^\text{rep}\)) with the observed data (\(L^\text{obs}\)). If the model fits well, the observed points should fall within the envelope of replicated trajectories (Gelman et al. 2013, Ch. 6).

ppc1 <- bdeb_ppc(fit1, type = "growth")
plot(ppc1, n_draws = 200)
Posterior predictive check: grey lines are replicated growth trajectories, red points are observed data.

Posterior predictive check: grey lines are replicated growth trajectories, red points are observed data.

plot(fit1, type = "trajectory", n_draws = 200)
Posterior predicted trajectories (blue) with observed data (black points).  The spread reflects parameter uncertainty.

Posterior predicted trajectories (blue) with observed data (black points). The spread reflects parameter uncertainty.

3.2.7 Step 7: Derived quantities

BayesianDEB computes biologically meaningful quantities directly from the posterior, automatically propagating uncertainty:

Quantity Formula Interpretation
\(L_m\) \(\kappa \{p_{Am}\} / [p_M]\) Maximum structural length (\(f = 1\))
\(L_\infty\) \(f \cdot L_m\) Ultimate structural length at food \(f\)
\(k_M\) \([p_M] / [E_G]\) Somatic maintenance rate constant
\(g\) \([E_G] \, v / (\kappa \{p_{Am}\})\) Energy investment ratio
\(\dot{r}_B\) \(k_M \, g \,/\, 3(f + g)\) von Bertalanffy growth rate (Eq. 3.23)

Note: all lengths are structural (\(L = V^{1/3}\)), not physical. Physical length \(L_w = L / \delta_M\) where \(\delta_M\) is the species-specific shape coefficient (not estimated by this package).

der1 <- bdeb_derived(fit1,
  quantities = c("L_m", "L_inf", "k_M", "g", "growth_rate"), f = 1.0)

summarise_draws(der1,
  "mean", "sd",
  "q2.5"  = ~quantile(.x, 0.025),
  "q97.5" = ~quantile(.x, 0.975))
#> # A tibble: 5 × 5
#>   variable         mean        sd    `2.5%`    `97.5%`
#>   <chr>           <dbl>     <dbl>     <dbl>      <dbl>
#> 1 L_m          9.53      8.09     1.51       30.0     
#> 2 L_inf        9.53      8.09     1.51       30.0     
#> 3 k_M          0.00101   0.000694 0.000248    0.00286 
#> 4 g           47.0      46.9      6.20      183.      
#> 5 growth_rate  0.000323  0.000220 0.0000812   0.000927

3.2.8 Step 8: Scenario analysis — reduced food

How does ultimate size change if food availability drops to 70%? Since \(L_\infty \propto f\), we can compute this directly:

d_f10 <- bdeb_derived(fit1, quantities = "L_inf", f = 1.0)
d_f07 <- bdeb_derived(fit1, quantities = "L_inf", f = 0.7)

df_compare <- data.frame(
  L_inf = c(d_f10$L_inf, d_f07$L_inf),
  food  = rep(c("f = 1.0", "f = 0.7"), each = nrow(d_f10))
)

ggplot(df_compare, aes(x = L_inf, fill = food)) +
  geom_density(alpha = 0.4) +
  theme_bw(base_size = 12) +
  labs(x = expression(L[infinity] ~ "(cm)"),
       y = "Posterior density",
       fill = "Food level")
Posterior distributions of $L_\infty$ at $f = 1.0$ (blue) and $f = 0.7$ (orange).

Posterior distributions of \(L_\infty\) at \(f = 1.0\) (blue) and \(f = 0.7\) (orange).

3.3 Hierarchical model: 21 individuals

When multiple individuals are available, a hierarchical model is preferred. It estimates population-level distributions for parameters that vary across individuals, while sharing parameters that are species-level constants.

3.3.1 Model structure

The hierarchical model places a lognormal random effect on the assimilation rate:

\[ \log\{p_{Am}\}_j = \mu_{\log p_{Am}} + \sigma_{\log p_{Am}} \cdot z_j, \qquad z_j \sim \mathcal{N}(0, 1) \]

This non-centred parameterisation (Betancourt and Girolami 2015) avoids the pathological funnel geometry that arises when \(\sigma\) is small. The parameters \([p_M]\), \(\kappa\), \(v\), \([E_G]\) are shared across all individuals.

3.3.2 Prepare and specify

# Illustrative subset of 5 individuals; replication archive uses all 21.
dat_all <- bdeb_data(
  growth = eisenia_growth[eisenia_growth$id %in% 1:5, ],
  f_food = 1.0
)
dat_all
#> 
#> ── BDEB Data ──
#> 
#> ℹ Individuals: 5
#> ℹ Endpoints: growth
#> ℹ Functional response (f): 1
#> → Growth: 65 observations, t = [0, 84]
mod_h <- bdeb_model(dat_all, type = "hierarchical",
  priors = list(
    mu_log_p_Am    = prior_normal(mu = 1.5, sigma = 0.5),
    sigma_log_p_Am = prior_exponential(rate = 2),
    p_M            = prior_lognormal(mu = -1.0, sigma = 0.5),
    kappa          = prior_beta(a = 3, b = 2),
    v              = prior_lognormal(mu = -1.5, sigma = 0.5),
    E_G            = prior_lognormal(mu = 6.0, sigma = 0.5),
    sigma_L        = prior_halfnormal(sigma = 0.05)
  ))

The prior on \(\sigma_{\log p_{Am}}\) uses an Exponential(2) distribution, following the guidance of Gelman (2006) for hierarchical variance parameters. This places most prior mass near zero while allowing substantial variation if the data support it.

3.3.3 Fit

With multiple individuals, each requiring an independent ODE solve per iteration, the hierarchical model benefits from within-chain parallelism via Stan’s reduce_sum and threads_per_chain. We omit threading here because the illustrative subset has only five individuals; the replication archive turns it on for the full 21.

fit_h <- bdeb_fit(mod_h,
  chains        = 2,
  iter_warmup   = 300,
  iter_sampling = 300,
  refresh       = 100,
  seed          = 123
)
#> ℹ Compiling Stan model: 'bdeb_hierarchical_growth'
#> ℹ Running MCMC (2 chains, 300 iterations each)
#> Running MCMC with 2 parallel chains...
#> 
#> Chain 1 Iteration:   1 / 600 [  0%]  (Warmup)
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in '/tmp/Rtmp1KA7pd/model-c47538a8e01f.stan', line 61, column 12 to column 60) (in '/tmp/Rtmp1KA7pd/model-c47538a8e01f.stan', line 149, column 2 to line 153, column 43)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: normal_lpdf: Scale parameter is 0, but must be positive! (in '/tmp/Rtmp1KA7pd/model-c47538a8e01f.stan', line 61, column 12 to column 60) (in '/tmp/Rtmp1KA7pd/model-c47538a8e01f.stan', line 149, column 2 to line 153, column 43)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: ode_bdf_tol: ode parameters and data is inf, but must be finite! (in '/tmp/Rtmp1KA7pd/model-c47538a8e01f.stan', line 51, column 6 to line 55, column 8) (in '/tmp/Rtmp1KA7pd/model-c47538a8e01f.stan', line 149, column 2 to line 153, column 43)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 2 Iteration:   1 / 600 [  0%]  (Warmup)
#> Chain 2 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 2 Exception: Exception: ode_bdf_tol: ode parameters and data is inf, but must be finite! (in '/tmp/Rtmp1KA7pd/model-c47538a8e01f.stan', line 51, column 6 to line 55, column 8) (in '/tmp/Rtmp1KA7pd/model-c47538a8e01f.stan', line 149, column 2 to line 153, column 43)
#> Chain 2 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 2 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 2
#> Chain 2 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 2 Exception: Exception: CVode(cvodes_mem, t_final, nv_state_, &t_init, CV_NORMAL) failed with error flag -4:
#> Chain 2 Convergence test failures occurred too many times during one internal time step or minimum step size was reached. (in '/tmp/Rtmp1KA7pd/model-c47538a8e01f.stan', line 51, column 6 to line 55, column 8) (in '/tmp/Rtmp1KA7pd/model-c47538a8e01f.stan', line 149, column 2 to line 153, column 43)
#> Chain 2 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 2 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 2
#> Chain 2 Iteration: 100 / 600 [ 16%]  (Warmup) 
#> Chain 1 Iteration: 100 / 600 [ 16%]  (Warmup)
#> Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
#> Chain 1 Exception: Exception: CVode(cvodes_mem, t_final, nv_state_, &t_init, CV_NORMAL) failed with error flag -1:
#> Chain 1 The solver took mxstep internal steps but could not reach tout. (in '/tmp/Rtmp1KA7pd/model-c47538a8e01f.stan', line 51, column 6 to line 55, column 8) (in '/tmp/Rtmp1KA7pd/model-c47538a8e01f.stan', line 149, column 2 to line 153, column 43)
#> Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
#> Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.
#> Chain 1
#> Chain 2 Iteration: 200 / 600 [ 33%]  (Warmup) 
#> Chain 1 Iteration: 200 / 600 [ 33%]  (Warmup) 
#> Chain 2 Iteration: 300 / 600 [ 50%]  (Warmup) 
#> Chain 2 Iteration: 301 / 600 [ 50%]  (Sampling) 
#> Chain 2 Iteration: 400 / 600 [ 66%]  (Sampling) 
#> Chain 1 Iteration: 300 / 600 [ 50%]  (Warmup) 
#> Chain 1 Iteration: 301 / 600 [ 50%]  (Sampling) 
#> Chain 2 Iteration: 500 / 600 [ 83%]  (Sampling) 
#> Chain 1 Iteration: 400 / 600 [ 66%]  (Sampling) 
#> Chain 2 Iteration: 600 / 600 [100%]  (Sampling) 
#> Chain 2 finished in 33.2 seconds.
#> Chain 1 Iteration: 500 / 600 [ 83%]  (Sampling) 
#> Chain 1 Iteration: 600 / 600 [100%]  (Sampling) 
#> Chain 1 finished in 40.3 seconds.
#> 
#> Both chains finished successfully.
#> Mean chain execution time: 36.7 seconds.
#> Total execution time: 40.4 seconds.

3.3.4 Diagnostics

bdeb_diagnose(fit_h)
#> 
#> ── BDEB Diagnostics (hierarchical) ──
#> 
#> ✔ No divergent transitions.
#> ✔ Treedepth OK.
#> ✔ E-BFMI OK (all > 0.3).
#> 
#> ── Parameter Summary
#> ✖ R-hat > 1.01 for: z_log_p_Am[5], kappa
#> ! Low bulk ESS (<400) for: sigma_log_p_Am, E_G
#>        variable     mean       sd   median       5%      95% rhat ess_bulk
#>     mu_log_p_Am   1.6210   0.4399   1.6421   0.8770   2.2993 1.00      649
#>  sigma_log_p_Am   0.4584   0.4200   0.3358   0.0257   1.2483 1.00      304
#>             p_M   0.4656   0.2709   0.4013   0.1675   0.9645 1.01      583
#>           kappa   0.6108   0.1720   0.6100   0.3265   0.8889 1.01      454
#>               v   0.2337   0.1040   0.2122   0.1162   0.4109 1.00      462
#>             E_G 384.3195 150.8336 354.8295 190.7204 664.7267 1.00      350
#>              E0   1.6237   3.6562   0.9810   0.1723   4.7548 1.01      461
#>         sigma_L   0.0168   0.0016   0.0167   0.0144   0.0196 1.01      774
#>  ess_tail
#>       435
#>       285
#>       406
#>       420
#>       417
#>       359
#>       423
#>       480
#> ℹ 15 latent-state rows hidden; use `print(x, full = TRUE)` or `summary(x)$table` to see all.
plot(fit_h, type = "trace",
     pars = c("mu_log_p_Am", "sigma_log_p_Am"))
Trace plots for population-level hyperparameters $\mu_{\log p_{Am}}$ and $\sigma_{\log p_{Am}}$.

Trace plots for population-level hyperparameters \(\mu_{\log p_{Am}}\) and \(\sigma_{\log p_{Am}}\).

plot(fit_h, type = "posterior",
     pars = c("mu_log_p_Am", "sigma_log_p_Am", "p_M", "kappa"))
Marginal posterior densities for shared parameters.

Marginal posterior densities for shared parameters.

3.3.5 Population-level results

summary(fit_h,
  pars = c("mu_log_p_Am", "sigma_log_p_Am",
           "p_M", "kappa", "v", "E_G", "sigma_L"),
  prob = 0.95)
#> # A tibble: 7 × 9
#>   variable          mean      sd  median  `2.5%` `97.5%`  rhat ess_bulk ess_tail
#>   <chr>            <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <dbl>    <dbl>    <dbl>
#> 1 mu_log_p_Am    1.62e+0 4.40e-1 1.64e+0 7.67e-1 2.46e+0 1.00      649.     435.
#> 2 sigma_log_p_Am 4.58e-1 4.20e-1 3.36e-1 1.33e-2 1.56e+0 1.00      304.     285.
#> 3 p_M            4.66e-1 2.71e-1 4.01e-1 1.42e-1 1.17e+0 1.01      583.     406.
#> 4 kappa          6.11e-1 1.72e-1 6.10e-1 2.85e-1 9.21e-1 1.01      454.     420.
#> 5 v              2.34e-1 1.04e-1 2.12e-1 1.01e-1 4.89e-1 1.000     462.     417.
#> 6 E_G            3.84e+2 1.51e+2 3.55e+2 1.65e+2 7.49e+2 1.00      350.     359.
#> 7 sigma_L        1.68e-2 1.60e-3 1.67e-2 1.41e-2 2.00e-2 1.01      774.     480.

3.3.6 Shrinkage of individual estimates

A key feature of hierarchical models is shrinkage: individuals with sparse or noisy data are pulled toward the population mean. This is partial pooling — a principled compromise between complete pooling (ignoring individual variation) and no pooling (fitting each individual independently).

n_ind <- dat_all$n_ind
ind_summary <- summary(fit_h,
  pars = paste0("p_Am_ind[", seq_len(n_ind), "]"),
  prob = 0.90)

pop_summary <- summary(fit_h, pars = "mu_log_p_Am")
pop_mean_pAm <- exp(as.data.frame(pop_summary)$mean)

ind_df <- as.data.frame(ind_summary)
ind_df$individual <- seq_len(n_ind)

ggplot(ind_df, aes(x = individual, y = mean)) +
  geom_pointrange(aes(ymin = `5%`, ymax = `95%`),
                  colour = "steelblue", size = 0.4) +
  geom_hline(yintercept = pop_mean_pAm, linetype = "dashed",
             colour = "red", linewidth = 0.8) +
  theme_bw(base_size = 12) +
  labs(x = "Individual", y = expression({p[Am]} ~ "(J/d/cm"^2*")"),
       title = "Individual assimilation rates with 90% CI")
Individual-level $\{p_{Am}\}$ estimates (points: posterior means; bars: 90% CI) compared to the population mean (dashed red line).  Shrinkage toward the mean is visible for individuals with noisier data.

Individual-level \(\{p_{Am}\}\) estimates (points: posterior means; bars: 90% CI) compared to the population mean (dashed red line). Shrinkage toward the mean is visible for individuals with noisier data.

3.3.7 Prediction for a new individual

The generated quantities block draws p_Am_new from the population distribution — useful for predicting the performance of an unobserved individual from the same population:

summary(fit_h, pars = "p_Am_new", prob = 0.95)
#> # A tibble: 1 × 9
#>   variable  mean    sd median `2.5%` `97.5%`  rhat ess_bulk ess_tail
#>   <chr>    <dbl> <dbl>  <dbl>  <dbl>   <dbl> <dbl>    <dbl>    <dbl>
#> 1 p_Am_new  7.42  16.6   4.90  0.953    25.8  1.00     580.     523.

3.3.8 Individual vs hierarchical: comparison

s_ind  <- summary(fit1, pars = c("p_Am", "p_M", "kappa"), prob = 0.90)
s_hier <- summary(fit_h, pars = c("p_M", "kappa"), prob = 0.90)

cat("=== Individual model (id = 5, n = 1) ===\n")
#> === Individual model (id = 5, n = 1) ===
print(as.data.frame(s_ind), digits = 3, row.names = FALSE)
#>  variable  mean    sd median    5%    95%  rhat ess_bulk ess_tail
#>      p_Am 5.435 2.898  4.692 1.882 11.273 0.998      568      386
#>       p_M 0.408 0.217  0.355 0.171  0.824 0.999     1014      353
#>     kappa 0.577 0.190  0.570 0.265  0.873 1.001      554      409

cat("\n=== Hierarchical model (n = 21) ===\n")
#> 
#> === Hierarchical model (n = 21) ===
print(as.data.frame(s_hier), digits = 3, row.names = FALSE)
#>  variable  mean    sd median    5%   95% rhat ess_bulk ess_tail
#>       p_M 0.466 0.271  0.401 0.167 0.964 1.01      583      406
#>     kappa 0.611 0.172  0.610 0.326 0.889 1.01      454      420

The hierarchical model yields narrower credible intervals for shared parameters (\([p_M]\), \(\kappa\)) because it pools information across 21 individuals.

4 Part 2: DEBtox Analysis

4.1 Background

The DEBtox framework (Jager et al. 2006; Jager and Zimmer 2012) extends DEB with toxicokinetic-toxicodynamic (TKTD) components. A scaled internal damage variable \(D_w\) tracks the toxicant’s effect:

\[ \frac{dD_w}{dt} = k_d\bigl(\max(C_w - z_w, 0) - D_w\bigr) \]

where \(k_d\) is the damage recovery rate, \(C_w\) is the external concentration, and \(z_w\) is the no-effect concentration (NEC). The damage causes a stress factor \(s = b_w \cdot D_w\) that reduces assimilation:

\[ \dot{p}_A = f\{p_{Am}\}L^2 \cdot \max(1 - s, \; 0) \]

At steady state (\(D_w = C_w - z_w\) for \(C_w > z_w\)), the EC50 for 50% assimilation reduction is:

\[ \text{EC}_{50} = z_w + \frac{0.5}{b_w} \]

4.2 Data exploration

The debtox_growth dataset simulates growth under 4 toxicant concentrations (0, 20, 80, 200 arbitrary units), 10 individuals per group, measured weekly over 6 weeks. True parameters: NEC = 15, \(b_w = 0.003\).

data(debtox_growth)

ggplot(debtox_growth,
       aes(time, length, colour = factor(concentration), group = id)) +
  geom_line(alpha = 0.3) +
  geom_point(size = 0.8, alpha = 0.4) +
  facet_wrap(~concentration, labeller = label_both) +
  theme_bw(base_size = 11) +
  scale_colour_brewer(palette = "RdYlBu", direction = -1) +
  labs(x = "Time (days)", y = "Structural length (cm)",
       colour = "Concentration") +
  theme(legend.position = "none")
Growth trajectories under 4 toxicant concentrations.  Higher concentrations suppress growth through reduced assimilation.

Growth trajectories under 4 toxicant concentrations. Higher concentrations suppress growth through reduced assimilation.

4.3 Data preparation

conc_levels <- unique(debtox_growth$concentration)
conc_map <- setNames(conc_levels, as.character(conc_levels))

dat_tox <- bdeb_data(
  growth        = debtox_growth,
  concentration = conc_map,
  f_food        = 1.0
)
dat_tox
#> 
#> ── BDEB Data ──
#> 
#> ℹ Individuals: 40
#> ℹ Endpoints: growth
#> ℹ Functional response (f): 1
#> → Growth: 280 observations, t = [0, 42]
#> → Concentration groups: 4

4.4 Model specification

mod_tox <- bdeb_tox(dat_tox, stress = "assimilation",
  priors = list(
    p_Am    = prior_lognormal(mu = 1.5, sigma = 0.5),
    p_M     = prior_lognormal(mu = -1.0, sigma = 0.5),
    kappa   = prior_beta(a = 3, b = 2),
    v       = prior_lognormal(mu = -1.5, sigma = 0.5),
    E_G     = prior_lognormal(mu = 6.0, sigma = 0.5),
    sigma_L = prior_halfnormal(sigma = 0.05),
    k_d     = prior_lognormal(mu = -1.0, sigma = 1.0),
    z_w     = prior_lognormal(mu = 2.5, sigma = 1.0),
    b_w     = prior_lognormal(mu = -5.0, sigma = 2.0)
  ))
#> Warning: ! 4 concentration group(s) contain multiple individuals.
#> ℹ DEBtox fits one ODE per concentration group, not per individual.
#> ℹ Aggregating to group means per time point. For individual-level TKTD, a
#>   hierarchical DEBtox extension is needed (not yet implemented).
mod_tox
#> 
#> ── BDEB Model Specification ──
#> 
#> ℹ Type: debtox
#> ℹ Stan model: bdeb_debtox
#> ℹ Individuals: 40
#> ℹ Endpoints: growth
#> 
#> ── Priors
#> →   p_Am: LogNormal(1.5, 0.5)
#> →   p_M: LogNormal(-1.0, 0.5)
#> →   kappa: Beta(3.0, 2.0)
#> →   v: LogNormal(-1.5, 0.5)
#> →   E_G: LogNormal(6.0, 0.5)
#> →   sigma_L: HalfNormal(0.05)
#> →   k_d: LogNormal(-1.0, 1.0)
#> →   z_w: LogNormal(2.5, 1.0)
#> →   b_w: LogNormal(-5.0, 2.0)
#> →   E0: LogNormal(0.0, 1.0)
#> →   L0: LogNormal(-2.0, 1.0)
#> →   k_R: LogNormal(-1.0, 1.0)
#> →   phi_R: LogNormal(0.0, 1.0)

Prior rationale for toxicological parameters:

Parameter Prior Median 95% prior range Rationale
\(k_d\) LogNormal(−1, 1) 0.37 d\(^{-1}\) 0.05–2.7 Damage recovery: hours to days
\(z_w\) LogNormal(2.5, 1) 12.2 1.6–89 NEC within tested range 0–200
\(b_w\) LogNormal(−5, 2) 0.0067 0.00009–0.5 Weakly informative on effect intensity

4.5 Fit

For this demo only, we fit the DEBtox model with variational inference (ADVI), which yields an approximation of the posterior in seconds rather than minutes. The replication archive uses full HMC (NUTS) with chains = 4, iter = 1000 + 1000 and is the publication-grade method.

fit_tox <- bdeb_fit(mod_tox, algorithm = "variational",
                    seed = 77, refresh = 0)
#> ℹ Compiling Stan model: 'bdeb_debtox'
#> ℹ Running variational inference (ADVI, mean-field; approximation, NOT exact MCMC).
#> ------------------------------------------------------------ 
#> EXPERIMENTAL ALGORITHM: 
#>   This procedure has not been thoroughly tested and may be unstable 
#>   or buggy. The interface is subject to change. 
#> ------------------------------------------------------------ 
#> Gradient evaluation took 0.004254 seconds 
#> 1000 transitions using 10 leapfrog steps per transition would take 42.54 seconds. 
#> Adjust your expectations accordingly! 
#> Begin eta adaptation. 
#> Iteration:   1 / 250 [  0%]  (Adaptation) 
#> Iteration:  50 / 250 [ 20%]  (Adaptation) 
#> Iteration: 100 / 250 [ 40%]  (Adaptation) 
#> Iteration: 150 / 250 [ 60%]  (Adaptation) 
#> Iteration: 200 / 250 [ 80%]  (Adaptation) 
#> Success! Found best value [eta = 1] earlier than expected. 
#> Begin stochastic gradient ascent. 
#>   iter             ELBO   delta_ELBO_mean   delta_ELBO_med   notes  
#>    100           27.574             1.000            1.000 
#>    200           74.717             0.815            1.000 
#>    300           84.104             0.581            0.631 
#>    400           85.188             0.439            0.631 
#>    500           77.152             0.372            0.112 
#>    600           86.143             0.327            0.112 
#>    700           86.560             0.281            0.104 
#>    800           85.659             0.247            0.104 
#>    900           86.444             0.221            0.104 
#>   1000           86.720             0.199            0.104 
#>   1100           86.146             0.100            0.013 
#>   1200           75.494             0.051            0.013 
#>   1300           84.929             0.051            0.013 
#>   1400           85.722             0.050            0.011 
#>   1500           86.846             0.041            0.011 
#>   1600           86.402             0.031            0.009   MEDIAN ELBO CONVERGED 
#> Drawing a sample of size 1000 from the approximate posterior...  
#> COMPLETED. 
#> Finished in  6.6 seconds.

4.6 Diagnostics

For the variational fit used in this vignette, MCMC-specific diagnostics (\(\hat{R}\), divergent transitions, treedepth, ESS) are not defined. In the replication archive, where the same model is fitted with full HMC, bdeb_diagnose(fit_tox) and the trace plot below are the appropriate checks.

# Re-fit with algorithm = "sampling" for these diagnostics; see
# the replication archive for the publication-grade analysis.
plot(fit_tox, type = "trace", pars = c("k_d", "z_w", "b_w"))
plot(fit_tox, type = "posterior", pars = c("k_d", "z_w", "b_w"))
Marginal posterior densities for toxicological parameters.

Marginal posterior densities for toxicological parameters.

# `bayesplot::mcmc_pairs` requires gridExtra (a Suggests of bayesplot).
plot(fit_tox, type = "pairs", pars = c("z_w", "b_w", "k_d"))
#> Warning: Only one chain in 'x'. This plot is more useful with multiple chains.
Posterior pairs for toxicological parameters.  A correlation between $z_w$ and $b_w$ is expected since both determine the shape of the dose-response curve.

Posterior pairs for toxicological parameters. A correlation between \(z_w\) and \(b_w\) is expected since both determine the shape of the dose-response curve.

4.7 Parameter estimates

summary(fit_tox,
  pars = c("p_Am", "p_M", "kappa", "v", "E_G",
           "k_d", "z_w", "b_w", "sigma_L"),
  prob = 0.95)
#> # A tibble: 9 × 9
#>   variable      mean       sd    median   `2.5%` `97.5%`  rhat ess_bulk ess_tail
#>   <chr>        <dbl>    <dbl>     <dbl>    <dbl>   <dbl> <dbl>    <dbl>    <dbl>
#> 1 p_Am       4.75     1.39      4.54     2.62e+0 8.43e+0 1.00      957.     787.
#> 2 p_M        0.403    0.216     0.361    1.33e-1 9.25e-1 1.00      978.     968.
#> 3 kappa      0.707    0.0961    0.716    4.99e-1 8.65e-1 0.999     952.     885.
#> 4 v          0.211    0.0287    0.209    1.60e-1 2.72e-1 1.00      947.    1024.
#> 5 E_G      442.      50.7     437.       3.54e+2 5.51e+2 1.00      956.     894.
#> 6 k_d        0.499    0.611     0.294    3.70e-2 2.32e+0 1.00      967.     944.
#> 7 z_w       23.9     36.3      12.3      1.48e+0 1.12e+2 1.00     1017.     943.
#> 8 b_w        0.00217  0.00285   0.00122  1.58e-4 1.07e-2 1.00      790.     990.
#> 9 sigma_L    0.00763  0.00120   0.00750  5.60e-3 1.01e-2 1.00     1136.     979.

4.8 EC50 and NEC

The EC50 is computed analytically in the Stan generated quantities block, giving the full posterior distribution without post-hoc root-finding.

ec <- bdeb_ec50(fit_tox, prob = 0.95)
#> 
#> ── DEBtox Effect Concentrations
#>  parameter  mean median    sd lower upper
#>       EC50 730.1  435.3 932.5 66.91  3191
#>        NEC  23.9   12.3  36.3  1.48   112
print(ec$summary, digits = 3)
#>       parameter  mean median    sd lower upper
#> 2.5%       EC50 730.1  435.3 932.5 66.91  3191
#> 2.5%1       NEC  23.9   12.3  36.3  1.48   112

hist(ec$draws, breaks = 50, col = "steelblue", border = "white",
     main = expression("Posterior distribution of EC"[50]),
     xlab = "Concentration", freq = FALSE)
abline(v = ec$summary$median[1], col = "red", lwd = 2, lty = 2)
legend("topright", "Posterior median",
       col = "red", lty = 2, lwd = 2, bty = "n")
Posterior distribution of EC$_{50}$ (blue histogram) with the posterior median (red dashed line).  The full distribution — not just a point estimate — is available for regulatory risk assessment.

Posterior distribution of EC\(_{50}\) (blue histogram) with the posterior median (red dashed line). The full distribution — not just a point estimate — is available for regulatory risk assessment.

Interpretation for risk assessment:

4.9 Dose-response curve

plot_dose_response(fit_tox, n_draws = 30, n_conc = 25)
Dose-response curve with posterior uncertainty bands (blue lines: individual posterior draws).  The dashed horizontal line marks 50% effect; vertical dashed lines mark the NEC (green) and EC$_{50}$ (red).  The vignette uses lite settings; replication archive uses 200 draws.

Dose-response curve with posterior uncertainty bands (blue lines: individual posterior draws). The dashed horizontal line marks 50% effect; vertical dashed lines mark the NEC (green) and EC\(_{50}\) (red). The vignette uses lite settings; replication archive uses 200 draws.

4.10 Prior sensitivity analysis

A Bayesian analysis should always report the sensitivity of key conclusions to prior choices. We refit with a tighter prior on \(z_w\):

mod_tox2 <- bdeb_tox(dat_tox, stress = "assimilation",
  priors = list(
    z_w = prior_lognormal(mu = 3.0, sigma = 0.3),  # tighter
    b_w = prior_lognormal(mu = -5.0, sigma = 2.0)
  ))
#> Warning: ! 4 concentration group(s) contain multiple individuals.
#> ℹ DEBtox fits one ODE per concentration group, not per individual.
#> ℹ Aggregating to group means per time point. For individual-level TKTD, a
#>   hierarchical DEBtox extension is needed (not yet implemented).
fit_tox2 <- bdeb_fit(mod_tox2, algorithm = "variational",
                     seed = 78, refresh = 0)
#> ℹ Compiling Stan model: 'bdeb_debtox'
#> ℹ Running variational inference (ADVI, mean-field; approximation, NOT exact MCMC).
#> ------------------------------------------------------------ 
#> EXPERIMENTAL ALGORITHM: 
#>   This procedure has not been thoroughly tested and may be unstable 
#>   or buggy. The interface is subject to change. 
#> ------------------------------------------------------------ 
#> Gradient evaluation took 0.006923 seconds 
#> 1000 transitions using 10 leapfrog steps per transition would take 69.23 seconds. 
#> Adjust your expectations accordingly! 
#> Begin eta adaptation. 
#> Iteration:   1 / 250 [  0%]  (Adaptation) 
#> Iteration:  50 / 250 [ 20%]  (Adaptation) 
#> Iteration: 100 / 250 [ 40%]  (Adaptation) 
#> Iteration: 150 / 250 [ 60%]  (Adaptation) 
#> Iteration: 200 / 250 [ 80%]  (Adaptation) 
#> Iteration: 250 / 250 [100%]  (Adaptation) 
#> Success! Found best value [eta = 0.1]. 
#> Begin stochastic gradient ascent. 
#>   iter             ELBO   delta_ELBO_mean   delta_ELBO_med   notes  
#>    100         -199.633             1.000            1.000 
#>    200         -115.354             0.865            1.000 
#>    300          -73.566             0.766            0.731 
#>    400          -41.641             0.766            0.767 
#>    500          -25.278             0.743            0.731 
#>    600          -20.533             0.657            0.731 
#>    700            0.197            15.607            0.731 
#>    800           -0.610            13.821            0.767 
#>    900           17.678            12.400            0.767 
#>   1000           22.149            11.181            0.767 
#>   1100           27.615            11.100            0.731   MAY BE DIVERGING... INSPECT ELBO 
#>   1200           31.258            11.039            0.647   MAY BE DIVERGING... INSPECT ELBO 
#>   1300           30.855            10.983            0.647   MAY BE DIVERGING... INSPECT ELBO 
#>   1400           39.085            10.928            0.231   MAY BE DIVERGING... INSPECT ELBO 
#>   1500           42.999            10.872            0.211   MAY BE DIVERGING... INSPECT ELBO 
#>   1600           46.439            10.857            0.202   MAY BE DIVERGING... INSPECT ELBO 
#>   1700           49.307             0.332            0.198 
#>   1800           51.769             0.205            0.117 
#>   1900           54.273             0.106            0.091 
#>   2000           57.009             0.090            0.074 
#>   2100           61.031             0.077            0.066 
#>   2200           62.295             0.067            0.058 
#>   2300           65.818             0.072            0.058 
#>   2400           65.690             0.051            0.054 
#>   2500           66.756             0.043            0.048 
#>   2600           70.503             0.041            0.048 
#>   2700           70.084             0.036            0.048 
#>   2800           71.569             0.033            0.046 
#>   2900           71.556             0.029            0.021 
#>   3000           73.193             0.026            0.021 
#>   3100           74.825             0.022            0.021 
#>   3200           74.954             0.020            0.021 
#>   3300           75.889             0.016            0.016 
#>   3400           76.168             0.016            0.016 
#>   3500           76.837             0.015            0.012 
#>   3600           77.304             0.010            0.009   MEDIAN ELBO CONVERGED 
#> Drawing a sample of size 1000 from the approximate posterior...  
#> COMPLETED. 
#> Finished in  27.7 seconds.

cat("=== Original: z_w ~ LogNormal(2.5, 1.0) ===\n")
#> === Original: z_w ~ LogNormal(2.5, 1.0) ===
summary(fit_tox,  pars = c("z_w", "b_w"), prob = 0.95)
#> # A tibble: 2 × 9
#>   variable     mean       sd   median   `2.5%`  `97.5%`  rhat ess_bulk ess_tail
#>   <chr>       <dbl>    <dbl>    <dbl>    <dbl>    <dbl> <dbl>    <dbl>    <dbl>
#> 1 z_w      23.9     36.3     12.3     1.48     112.      1.00    1017.     943.
#> 2 b_w       0.00217  0.00285  0.00122 0.000158   0.0107  1.00     790.     990.

cat("\n=== Tighter:  z_w ~ LogNormal(3.0, 0.3) ===\n")
#> 
#> === Tighter:  z_w ~ LogNormal(3.0, 0.3) ===
summary(fit_tox2, pars = c("z_w", "b_w"), prob = 0.95)
#> # A tibble: 2 × 9
#>   variable   mean    sd  median    `2.5%` `97.5%`  rhat ess_bulk ess_tail
#>   <chr>     <dbl> <dbl>   <dbl>     <dbl>   <dbl> <dbl>    <dbl>    <dbl>
#> 1 z_w      21.0    6.18 20.3    11.5        35.7  0.999    1008.     911.
#> 2 b_w       0.250  1.55  0.0298  0.000567    1.47 1.000     964.    1066.

If the posteriors agree despite different priors, the data are informative and the inference is robust. If they diverge, the parameter is prior-dominated and should be reported as weakly identified.

5 Part 3: Practical Tools

5.1 Structural vs physical length

BayesianDEB works with structural length \(L = V^{1/3}\) (cube root of structural volume), not the physical body length \(L_w\) that you measure with a ruler. The two are related by the species-specific shape coefficient \(\delta_M\):

\[L = \delta_M \times L_w\]

Species \(\delta_M\) Source
Eisenia fetida 0.24 AmP
Folsomia candida 0.19 AmP
Daphnia magna 0.37 AmP

Before fitting, convert your measured lengths:

# Example: measured body lengths in mm for E. fetida
L_physical_mm <- c(12, 18, 25, 30)
delta_M <- 0.24

# Convert to structural length in cm
L_structural_cm <- delta_M * L_physical_mm / 10
L_structural_cm
#> [1] 0.288 0.432 0.600 0.720
# [1] 0.288 0.432 0.600 0.720

If you pass physical lengths directly, the estimated DEB parameters will absorb the shape coefficient and will not be comparable with AmP values. BayesianDEB warns if maximum length exceeds 10 cm, which is unusually large for structural length.

5.2 Prior predictive check

Before fitting, it is good practice to verify that priors produce biologically plausible predictions (Gabry et al., 2019). Sample directly from the prior distributions and compute derived quantities:

set.seed(42)
n_sim <- 4000

# Sample from priors
p_Am_sim  <- rlnorm(n_sim, 1.5, 0.5)
p_M_sim   <- rlnorm(n_sim, -1.0, 0.5)
kappa_sim <- rbeta(n_sim, 3, 2)
v_sim     <- rlnorm(n_sim, -1.5, 0.5)
E_G_sim   <- rlnorm(n_sim, 6.0, 0.5)

# Prior predictive for L_inf
L_inf_prior <- kappa_sim * p_Am_sim / p_M_sim

hist(L_inf_prior, breaks = 50, col = "steelblue", border = "white",
     main = "Prior predictive: ultimate structural length",
     xlab = expression(L[infinity] ~ "(cm)"), xlim = c(0, 50))

# Should cover plausible range for earthworms (~2-20 cm structural)

If the prior predictive distribution covers unreasonable values (e.g., \(L_\infty > 100\) cm for an earthworm), tighten the priors.

5.3 Observation model selection

By default, growth observations use a Gaussian likelihood and reproduction uses negative binomial. You can switch the observation model via the observation argument — the Stan likelihood is controlled by integer flags, so no recompilation is needed:

# Robust to outliers: Student-t with 5 df
mod_robust <- bdeb_model(dat1, type = "individual",
  observation = list(growth = obs_student_t(nu = 5)))

# Multiplicative error (constant CV)
mod_logn <- bdeb_model(dat1, type = "individual",
  observation = list(growth = obs_lognormal()))
# For reproduction with Poisson likelihood (when overdispersion is
# negligible) -- requires growth + reproduction data:
# mod_pois <- bdeb_model(dat_gr, type = "growth_repro",
#   observation = list(growth      = obs_normal(),
#                      reproduction = obs_poisson()))

Available observation families:

Endpoint Family Function When to use
Growth Gaussian obs_normal() Default; additive error
Growth Log-normal obs_lognormal() Multiplicative error (constant CV)
Growth Student-t obs_student_t(nu) Outlier-robust
Reproduction Neg. binomial obs_negbinom() Default; overdispersed counts
Reproduction Poisson obs_poisson() Equidispersed counts

5.4 Temperature correction

DEB rate parameters scale with temperature via the Arrhenius relationship (Kooijman 2010, Eq. 1.2):

\[ c_T = \exp\!\left(\frac{T_A}{T_\text{ref}} - \frac{T_A}{T}\right) \]

# Experiment at 22 C, reference 20 C, typical T_A for ectotherms
cT <- arrhenius(temp = 273.15 + 22, T_ref = 273.15 + 20, T_A = 8000)
cat("Temperature correction factor:", round(cT, 3), "\n")
#> Temperature correction factor: 1.203
# Rate at reference temperature: p_Am_ref = p_Am_obs / cT

5.5 Energy flux calculator

Inspect the energetics at a specific state:

fl <- deb_fluxes(E = 10, V = 0.5, f = 1.0,
                 p_Am = 5, p_M = 0.5, kappa = 0.75,
                 v = 0.2, E_G = 400)

cat(sprintf("Assimilation  (p_A): %.3f J/d\n", fl$p_A))
#> Assimilation  (p_A): 3.150 J/d
cat(sprintf("Mobilisation  (p_C): %.3f J/d\n", fl$p_C))
#> Mobilisation  (p_C): 0.008 J/d
cat(sprintf("Maintenance   (p_M): %.3f J/d\n", fl$p_M))
#> Maintenance   (p_M): 0.250 J/d
cat(sprintf("Growth        (p_G): %.3f J/d\n", fl$p_G))
#> Growth        (p_G): 0.000 J/d
cat(sprintf("Struct. length (L) : %.3f cm\n",  fl$L))
#> Struct. length (L) : 0.794 cm
cat(sprintf("Scaled reserve (e) : %.3f\n",     fl$e))
#> Scaled reserve (e) : 0.800

5.6 Converting cumulative reproduction data

Many protocols report cumulative offspring. The repro_to_intervals() function converts these to the interval format required by BayesianDEB:

cumul <- data.frame(
  id = rep(1, 5),
  time = c(0, 7, 14, 21, 28),
  cumulative = c(0, 10, 30, 60, 100)
)
repro_to_intervals(cumul)
#>     id t_start t_end count
#> 1.1  1       0     7    10
#> 1.2  1       7    14    20
#> 1.3  1      14    21    30
#> 1.4  1      21    28    40
#   id t_start t_end count
# 1  1       0     7    10
# 2  1       7    14    20
# 3  1      14    21    30
# 4  1      21    28    40

6 Validation Against Published DEB Parameters

The eisenia_growth dataset was simulated using DEB parameters from the Add-my-Pet (AmP) collection entry for Eisenia fetida (Marques et al., 2018). The table below compares the simulation truth with published AmP estimates and the expected posterior recovery from BayesianDEB:

Parameter Symbol True (simulation) AmP estimate Units
Assimilation rate \(\{p_{Am}\}\) 5.0 3.9–6.2 J d\(^{-1}\) cm\(^{-2}\)
Maintenance rate \([p_M]\) 0.5 0.3–0.8 J d\(^{-1}\) cm\(^{-3}\)
Allocation fraction \(\kappa\) 0.75 0.6–0.85
Energy conductance \(v\) 0.2 0.1–0.3 cm d\(^{-1}\)
Cost of structure \([E_G]\) 400 200–600 J cm\(^{-3}\)
Max. structural length \(L_m\) 7.5 5–12 cm

The simulation truth falls within the published AmP ranges for all parameters. When BayesianDEB is fitted to these data (see Section @ref(individual)), the posterior medians should recover values close to the simulation truth, providing a closed-loop validation: known parameters → simulated data → Bayesian recovery → comparison with truth.

This is not a substitute for fitting real experimental data, but it demonstrates that:

  1. The DEB ODE implementation is correct (simulated data match the expected growth pattern).
  2. The Bayesian inference machinery recovers known parameters.
  3. The prior specification is compatible with realistic parameter ranges from the AmP database.

For real-data applications, we recommend comparing estimated parameters with the AmP entry for the species of interest as a sanity check.

7 Summary

This vignette demonstrated three analysis types:

Analysis Model type Individuals Key output
Single growth "individual" 1 DEB posteriors, PPC, \(L_\infty\)
Population growth "hierarchical" 21 \(\mu/\sigma\) of \(\{p_{Am}\}\), shrinkage, prediction for new individual
Toxicant effect "debtox" 40 (4 groups) EC50, NEC with full uncertainty

What the Bayesian framework provides:

  1. Full uncertainty quantification — every derived quantity (\(L_\infty\), EC50, NEC) comes as a posterior distribution, not a point estimate.
  2. Prior incorporation — biological knowledge from the AmP database or regulatory guidelines constrains the estimation.
  3. Hierarchical structure — partial pooling borrows strength across individuals, improving estimates for data-poor organisms.
  4. Principled model checking — posterior predictive checks and convergence diagnostics provide clear evidence of model adequacy.
  5. Within-chain parallelismreduce_sum accelerates hierarchical and DEBtox models on multi-core machines.

References

Betancourt, M. and Girolami, M. (2015). Hamiltonian Monte Carlo for hierarchical models. In: Upadhyay, S.K. et al. (eds) Current Trends in Bayesian Methodology with Applications. CRC Press, pp. 79–101.

Carpenter, B., Gelman, A., Hoffman, M.D. et al. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1), 1–32. doi: 10.18637/jss.v076.i01

ECHA (2017). Guidance on Information Requirements and Chemical Safety Assessment, Chapter R.10: Characterisation of dose [concentration]- response for environment. European Chemicals Agency.

Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models. Bayesian Analysis, 1(3), 515–534. doi: 10.1214/06-BA117A

Gelman, A. and Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.

Gelman, A., Carlin, J.B., Stern, H.S. et al. (2013). Bayesian Data Analysis. 3rd edition. Chapman & Hall/CRC.

Hoffman, M.D. and Gelman, A. (2014). The No-U-Turn Sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(47), 1593–1623.

Jager, T., Heugens, E.H.W. and Kooijman, S.A.L.M. (2006). Making sense of ecotoxicological test results: towards application of process-based models. Ecotoxicology, 15(3), 305–314. doi: 10.1007/s10646-006-0060-x

Jager, T. and Zimmer, E.I. (2012). Simplified Dynamic Energy Budget model for analysing ecotoxicity data. Ecological Modelling, 225, 74–81. doi: 10.1016/j.ecolmodel.2011.11.012

Kooijman, S.A.L.M. (2010). Dynamic Energy Budget Theory for Metabolic Organisation. 3rd edition. Cambridge University Press. doi: 10.1017/CBO9780511805400

Marques, G.M., Augustine, S., Lika, K. et al. (2018). The AmP project: comparing species on the basis of dynamic energy budget parameters. PLOS Computational Biology, 14(5), e1006100. doi: 10.1371/journal.pcbi.1006100

Vehtari, A., Gelman, A., Simpson, D. et al. (2021). Rank-normalization, folding, and localization: an improved \(\hat{R}\) for assessing convergence of MCMC. Bayesian Analysis, 16(2), 667–718. doi: 10.1214/20-BA1221

Session info

sessionInfo()
#> R version 4.5.2 (2025-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.5 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0  LAPACK version 3.10.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=hr_HR.UTF-8        LC_COLLATE=C              
#>  [5] LC_MONETARY=hr_HR.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=hr_HR.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=hr_HR.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Europe/Budapest
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] posterior_1.6.1   ggplot2_4.0.1     BayesianDEB_0.2.1
#> 
#> loaded via a namespace (and not attached):
#>  [1] tensorA_0.36.2.1     utf8_1.2.6           sass_0.4.10         
#>  [4] generics_0.1.4       stringi_1.8.7        digest_0.6.39       
#>  [7] magrittr_2.0.4       evaluate_1.0.5       grid_4.5.2          
#> [10] RColorBrewer_1.1-3   fastmap_1.2.0        plyr_1.8.9          
#> [13] jsonlite_2.0.0       processx_3.8.6       backports_1.5.0     
#> [16] deSolve_1.42         gridExtra_2.3        ps_1.9.1            
#> [19] scales_1.4.0         jquerylib_0.1.4      abind_1.4-8         
#> [22] cli_3.6.5            rlang_1.1.7          cmdstanr_0.8.0      
#> [25] withr_3.0.2          cachem_1.1.0         yaml_2.3.12         
#> [28] tools_4.5.2          parallel_4.5.2       reshape2_1.4.5      
#> [31] checkmate_2.3.3      dplyr_1.2.0          vctrs_0.7.1         
#> [34] R6_2.6.1             matrixStats_1.5.0    lifecycle_1.0.5     
#> [37] stringr_1.6.0        pkgconfig_2.0.3      pillar_1.11.1       
#> [40] bslib_0.9.0          gtable_0.3.6         glue_1.8.0          
#> [43] data.table_1.17.8    Rcpp_1.1.0           xfun_0.54           
#> [46] tibble_3.3.0         tidyselect_1.2.1     knitr_1.50          
#> [49] dichromat_2.0-0.1    farver_2.1.2         bayesplot_1.14.0    
#> [52] htmltools_0.5.9      rmarkdown_2.30       labeling_0.4.3      
#> [55] compiler_4.5.2       S7_0.2.1             distributional_0.5.0