Package {leafwax}


Title: Bayesian Inversion of Leaf Wax Hydrogen Isotopes to Precipitation
Version: 0.2.7
Description: Bayesian inversion of leaf wax hydrogen isotopes to reconstruct precipitation isotopes using hierarchical spatial models. Provides fourteen Bayesian models that vary in their use of spatial Gaussian processes and ancillary covariates (precipitation amount, plant functional type, C4 fraction). Models are pre-computed using 'Stan' and stored as posterior distributions, so prediction does not require 'Stan' to be installed. A 100-draw fixture ships with the package; full 1000-draw posteriors are downloaded from a versioned 'Zenodo' deposit on first use; see Bradley (2026) <doi:10.5281/zenodo.20085465>.
License: MIT + file LICENSE
URL: https://github.com/bradleylab/leafwax, https://bradleylab.github.io/leafwax/
BugReports: https://github.com/bradleylab/leafwax/issues
Encoding: UTF-8
Language: en-US
RoxygenNote: 7.3.3
Imports: jsonlite, stats, utils
Suggests: rappdirs, knitr, rmarkdown, testthat (≥ 3.0.0)
VignetteBuilder: knitr
Depends: R (≥ 3.5.0)
LazyData: true
LazyDataCompression: xz
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2026-06-01 17:49:57 UTC; abradley
Author: Alex Bradley ORCID iD [aut, cre]
Maintainer: Alex Bradley <abradley@wustl.edu>
Repository: CRAN
Date/Publication: 2026-06-15 13:40:02 UTC

leafwax: Bayesian Calibration of Leaf Wax Hydrogen Isotope Reconstructions

Description

The leafwax package provides tools for probabilistic inversion of leaf wax hydrogen isotope measurements (delta-2-H) to reconstruct precipitation isotope values. It implements hierarchical Bayesian models that account for multiple sources of uncertainty including measurement error, biological fractionation, and spatial correlation in isotope patterns.

Main Functions

invert_d2H

Bayesian inversion of leaf wax delta2H to precipitation delta2H

available_models

List all available calibration models

load_posteriors

Load posterior distributions for a specific model

get_model_parameters

Get model capabilities and required parameters

validate_model_inputs

Validate inputs for a specific model

Available Models

The package includes 14 calibration models with different capabilities. The v10 fits include precipitation amount (baseline_env* and full* variants), C4 abundance, and PFT cover; none of the v10 variants carry a fitted elevation coefficient despite the historical "elevation_*" naming. Runtime capability flags in load_posteriors() are derived from each model's posterior columns at load time.

Models with "_sp" suffix use spatial Gaussian processes with 125 knots on a Fibonacci sphere lattice for improved uncertainty quantification.

Model Selection

Pass model = "auto" to predict_d2h_precip() to let select_best_model_from_flags() choose a model based on which covariates the caller has supplied; otherwise pick a model name from available_models() explicitly.

Key Features

Author(s)

Maintainer: Alex Bradley abradley@wustl.edu (ORCID)

References

Bowen, G. J., Cai, Z., Fiorella, R. P., & Putman, A. L. (2019). Isotopes in the water cycle: Regional-to global-scale patterns and applications. Annual Review of Earth and Planetary Sciences, 47, 453-479. doi:10.1146/annurev-earth-053018-060220

Sachse, D., Billault, I., Bowen, G. J., Chikaraishi, Y., Dawson, T. E., Feakins, S. J., ... & Kahmen, A. (2012). Molecular paleohydrology: Interpreting the hydrogen-isotopic composition of lipid biomarkers from photosynthesizing organisms. Annual Review of Earth and Planetary Sciences, 40, 221-249. doi:10.1146/annurev-earth-042711-105535

Bradley, A. (2026). leafwax v10 model posteriors. Zenodo DOI doi:10.5281/zenodo.20085465.

See Also

Useful links:

Examples

local({
  old <- options(leafwax.suppress_preview_warning = TRUE)
  on.exit(options(old))

  # List available models
  models <- available_models()
  n_models <- length(models)

  # Simple single-location inversion
  result <- invert_d2H(
    d2H_wax = -150,
    d2H_wax_sd = 3,
    longitude = -120,
    latitude = 40,
    model_name = "baseline",
    verbose = FALSE
  )
})


Assess a paleoclimate claim against the leaf-wax taxonomy

Description

Walks the four-level taxonomy from manuscript Section 4.5.6 and reports the highest level a claim survives at. The taxonomy is:

Usage

assess_claim(
  record,
  claim,
  reconstruction = NULL,
  longitude = NULL,
  latitude = NULL,
  model_name = "baseline_sp",
  ...
)

Arguments

record

Data frame (or list) with at least d2h_wax and age columns of equal length. d2h_wax_err is optional; defaults to claim$sigma_analytical per row.

claim

Named list specifying the claim. Required fields: level (integer 1-4, the level the user is asserting), interval_baseline (length-2 numeric c(min, max) age window), interval_test (length-2 numeric age window). Optional fields, used by higher levels: sigma_analytical (default 3), rho_t (default 0; from estimate_temporal_autocorrelation()), beta_eff (numeric scalar; required at Level 3+), confidence (default 0.95), magnitude_precip (numeric, the precip-space magnitude the user asserts; required at Level 3+), sediment_source_ruled_out, depositional_artifact_ruled_out (each a list with value (TRUE) and a non-empty evidence string; BOTH required at Level 2+ regardless of which Level 2 path is used), corroborating_proxies (list, used at Level 2 path (a); the test is non-empty + named), level2_vegetation_path (list, used at Level 2 path (b); must contain vegetation_scenario = list(from, to) with named numeric vectors over ⁠{tree, shrub, grass, C4}⁠ and an optional evidence string. The claim must also supply oipc_ref (numeric scalar, calibration-period d2H_precip at the site, per mil) at the top level. An optional level2_vegetation_path$model_name selects the calibration model used for the envelope; default "full_interact_sp".), vegetation_stationary, seasonal_source_stationary, evapotranspirative_stationary (each a list with value (TRUE) and a non-empty evidence string; required at Level 4).

reconstruction

Optional output of invert_d2H(..., return_full = TRUE) on the record. When NULL and the claim's level is 3 or 4, the function runs the inversion itself.

longitude, latitude

Site coordinates, used only when reconstruction is NULL.

model_name

Model to use when running the inversion (default "baseline_sp").

...

Additional args forwarded to invert_d2H() (e.g., elevation, c4_fraction, pft_*, n_posterior_draws).

Details

Use this when a colleague claims that a downcore record shows a specific d2H_precip shift and you want a structured check of which levels of the taxonomy that claim actually clears.

Value

A list with elements:


Get available models

Description

Returns a list of all available calibration models for leaf wax hydrogen isotope inversion. Models vary in their complexity and data requirements.

Usage

available_models()

Value

Character vector of available model names. Models include:

Examples

# List all available models
models <- available_models()
print(models)

# Get details for a specific model
model_info <- get_model_parameters("baseline_sp")
print(model_info$description)

Batch predict precipitation d2H for multiple sites

Description

Processes multiple sites with progress indicators and optional parallelization. Handles large datasets efficiently by processing in chunks.

Usage

batch_predict(
  data,
  model = "auto",
  chunk_size = 100,
  parallel = FALSE,
  n_cores = NULL,
  progress = TRUE,
  return_diagnostics = FALSE,
  ...
)

Arguments

data

Data frame containing all measurements

model

Model name or "auto" for automatic selection

chunk_size

Number of sites to process at once (default 100)

parallel

Logical whether to use parallel processing

n_cores

Number of cores for parallel processing (NULL for auto)

progress

Logical whether to show progress bar

return_diagnostics

Logical whether to return diagnostic information

...

Additional arguments passed to predict_d2h_precip

Value

Data frame with predictions for all sites

Examples


local({
  old <- options(leafwax.suppress_preview_warning = TRUE)
  on.exit(options(old))

  data(example_data)
  large_data <- example_data[rep(seq_len(nrow(example_data)), length.out = 12), ]
  row.names(large_data) <- NULL

  # Process in chunks
  results <- batch_predict(
    large_data,
    chunk_size = 6,
    progress = FALSE,
    verbose = FALSE
  )

  # Process with a specific model
  results <- batch_predict(
    large_data,
    model = "baseline_sp",
    chunk_size = 6,
    progress = FALSE,
    verbose = FALSE
  )
})


Check if model data exists in cache

Description

Checks whether the heavy posterior file for a model is present in the user cache populated by download_model_data().

Usage

check_data_cache(
  model_name,
  data_type = c("minimal", "standard", "full"),
  verbose = FALSE
)

Arguments

model_name

Character string specifying the model name.

data_type

Retained for API compatibility. The v0.2 download layout ships a single posterior file per model (⁠posteriors/<model>_posterior.rds⁠) so all values check the same path; the argument is accepted but otherwise ignored.

verbose

Logical, whether to print status messages.

Value

Logical indicating whether the cached posterior file exists.

Examples


local({
  old <- options(leafwax.cache_dir = file.path(tempdir(), "leafwax_cache"))
  on.exit(options(old))

  exists <- check_data_cache("baseline_sp", verbose = FALSE)
})


Clear download cache

Description

Removes downloaded model data from the local cache.

Usage

clear_download_cache(
  model_name = NULL,
  type = c("all", "posteriors"),
  confirm = TRUE
)

Arguments

model_name

Model name to clear (NULL for all)

type

Type of data to clear: "all" or "posteriors"

confirm

Whether to ask for confirmation

Value

Invisible NULL

Examples


local({
  old <- options(leafwax.cache_dir = file.path(tempdir(), "leafwax_cache"))
  on.exit({
    unlink(getOption("leafwax.cache_dir"), recursive = TRUE, force = TRUE)
    options(old)
  })

  post_dir <- file.path(getOption("leafwax.cache_dir"), "posteriors")
  dir.create(post_dir, recursive = TRUE, showWarnings = FALSE)
  file.create(file.path(post_dir, "baseline_sp_posterior.rds"))

  # Clear cache for a specific model without prompting
  suppressMessages(clear_download_cache("baseline_sp", confirm = FALSE))
})


Compare predictions across multiple models

Description

Runs predictions using multiple models and compares results.

Usage

compare_models(
  data,
  models = NULL,
  summary_fun = mean,
  return_all = FALSE,
  progress = TRUE,
  ...
)

Arguments

data

Data frame with measurements

models

Character vector of model names to compare

summary_fun

Function to summarize across models (default is mean)

return_all

Logical whether to return all model results

progress

Logical whether to show progress

...

Additional arguments passed to predict_d2h_precip

Value

Data frame with ensemble predictions or list of all results

Examples


local({
  old <- options(leafwax.suppress_preview_warning = TRUE)
  on.exit(options(old))

  data(example_data)

  # Compare multiple models
  comparison <- compare_models(
    example_data,
    models = c("baseline", "baseline_env", "baseline_sp"),
    progress = FALSE
  )

  # Get all individual model results
  all_results <- compare_models(
    example_data,
    models = c("baseline", "baseline_env"),
    return_all = TRUE,
    progress = FALSE
  )
})


Vegetation-only envelope for a paleo wax-isotope record

Description

Computes the posterior-propagated wax-shift envelope expected under a user-supplied PFT-change scenario, holding d2H_precip constant. This is the magnitude path of the Level 2 claim taxonomy described in the accompanying manuscript Section 4.5.3: the calibration's PFT main-effect and PFT-by-\delta^2 H_p interaction coefficients are combined across all posterior draws to bound how much wax change vegetation reorganization alone can produce at the site, with no contribution from precipitation-isotope change.

Usage

compute_vegetation_envelope(
  oipc_ref,
  from,
  to,
  model_name = "full_interact_sp",
  n_draws = NULL,
  verbose = TRUE
)

Arguments

oipc_ref

Numeric scalar, the calibration-period d2H_precip at the site (per mil). Typically extracted from the OIPC raster (Bowen and Wilkinson 2002) at the site coordinates; the package does not bundle the raster. Held constant across the envelope by construction.

from

Named numeric vector of fractional PFT cover at the baseline interval. Names must be exactly tree, shrub, grass, C4 (case-sensitive). Values in [0, 1] with tree + shrub + grass <= 1; C4 is an independent fraction.

to

Named numeric vector of fractional PFT cover at the test interval. Same name and value constraints as from.

model_name

Character, name of the calibration model to use. Must contain all eight PFT coefficients (PFT main effects and PFT-by-\delta^2 H_p interactions). Default "full_interact_sp" is the only shipped model that satisfies this requirement.

n_draws

Integer, number of posterior draws to use. NULL (default) uses all available draws. Subsampling is deterministic (stratified, via load_posteriors()).

verbose

Logical, whether to emit progress messages.

Details

For each posterior draw the per-draw envelope is ⁠envelope = sum_k beta_k * delta_pft_k + sum_k beta_d2Hp_x_k * oipc_ref * delta_pft_k⁠ where k indexes the four PFT classes (tree, shrub, grass, C4) and delta_pft_k = to[k] - from[k]. The \beta_{\delta^2 H_p} \times PFT interaction terms are interactions with the precipitation-isotope calibration slope. The returned envelope_p975_abs = quantile(abs(envelope_draws), 0.975) is the absolute upper bound used by assess_claim() path (b) to test whether an observed ⁠|delta_wax|⁠ exceeds what the supplied PFT scenario alone can produce.

Passing path (b) rejects the vegetation-only null for the supplied scenario at the site. It does not identify the hydroclimate mechanism, quantify the precipitation-isotope change, or address sediment-source change, depositional artifact, compound-source mixing, age-model errors, evapotranspirative regime change, or seasonality shifts (which assess_claim() gates with separate fields). The calibration coefficients are derived from spatial variation across sites; applying them to within-record temporal vegetation change assumes the same response holds through time at one location.

Value

List with

Manuscript reference

Section 4.5.3 of the accompanying manuscript defines the vegetation-only envelope and the constant-precipitation framing. Section 4.5.6 places it in the Level 2 claim taxonomy.

Examples


local({
  old <- options(leafwax.suppress_preview_warning = TRUE)
  on.exit(options(old))

  # Hypothetical: a 30 percentage-point woody-to-grass transition at a
  # site where the OIPC raster lookup gave d2H_precip approximately -60 per mil.
  env <- compute_vegetation_envelope(
    oipc_ref = -60,
    from = c(tree = 0.4, shrub = 0.3, grass = 0.2, C4 = 0.05),
    to   = c(tree = 0.1, shrub = 0.2, grass = 0.5, C4 = 0.20),
    model_name = "full_interact_sp",
    n_draws = 100,
    verbose = FALSE
  )
  vegetation_bound <- env$envelope_p975_abs
})


Within-record d2H_precip change detection

Description

Given a downcore invert_d2H() reconstruction posterior with full draws (return_full = TRUE), report (a) the posterior probability that the difference in mean d2H_precip between two stratigraphic intervals exceeds user-supplied magnitudes, and (b) the within- record 95\ 4.5.3:

Usage

detect_change(
  reconstruction,
  age,
  baseline_interval,
  test_intervals = NULL,
  sigma_residual,
  sigma_analytical = 3,
  rho_t = NULL,
  beta_eff,
  confidence = 0.95,
  magnitudes = NULL
)

Arguments

reconstruction

Output of invert_d2H(..., return_full = TRUE) on a downcore series. Must contain a posterior_draws matrix of shape ⁠n_iter x n_samples⁠.

age

Numeric vector, length n_samples, of sample ages matching the reconstruction columns.

baseline_interval

Length-2 numeric c(min, max) defining the baseline window in age units.

test_intervals

Either a length-2 numeric vector for a single test window, or a named list of length-2 numerics for multiple windows. NULL skips the per-interval probability table and returns only the threshold.

sigma_residual

Numeric, required, the model's posterior residual SD on the leaf-wax per-mil scale (sigma, approximately 16 per mil for the spatial models; see Section 4.5.3).

sigma_analytical

Numeric, the analytical uncertainty on d2H_wax measurements in per mil (default 3).

rho_t

Numeric, lag-1 temporal autocorrelation. Use estimate_temporal_autocorrelation() to compute. Defaults to 0 (independent samples) with a message.

beta_eff

Numeric, the local effective slope. Use local_effective_slope() for a point estimate (e.g., its median).

confidence

Numeric in (0, 1), the confidence level for the detection threshold. Default 0.95.

magnitudes

Optional numeric vector of magnitudes (per mil) to evaluate posterior ⁠Pr(|delta| > magnitude)⁠ against.

Details

\mathrm{threshold}_{precip} = \frac{z_{\alpha/2}\, \sqrt{2 \sigma_{residual}^2 (1 - \rho_t) + 2 \sigma_{analytical}^2}} {\beta_{\mathrm{eff}}}

The threshold is the smallest difference in d2H_precip between two independent samples that can be distinguished from within-record noise at the chosen confidence level. The lag-1 autocorrelation rho_t enters only the residual term, because analytical measurement error is independent between samples by construction. The spatial GP intercept contributes a constant to every sample in the record and cancels in the contrast; the same sigma_residual from the spatial calibration applies (manuscript Section 4.5.3).

Value

A list with elements:


Detect model capabilities from static v10 metadata

Description

Detect model capabilities from static v10 metadata

Usage

detect_model_capabilities(model_name)

Arguments

model_name

Name of the model

Value

List of capability flags


Download model data from GitHub releases

Description

Downloads model posterior draws and lookup tables from GitHub releases with progress tracking and integrity verification.

Usage

download_model_data(
  model_name,
  version = "latest",
  data_type = c("posteriors"),
  cache_dir = NULL,
  overwrite = FALSE,
  verify = TRUE,
  verbose = TRUE
)

Arguments

model_name

Character string specifying the model name

version

Version tag to download (default "latest")

data_type

Type of data to download (only "posteriors" is currently supported)

cache_dir

Directory to save files (default uses get_cache_dir())

overwrite

Logical whether to overwrite existing files

verify

Logical whether to verify file integrity with checksums

verbose

Logical whether to show progress messages

Value

Logical indicating success

Examples


cache_dir <- file.path(tempdir(), "leafwax_download_example")
ok <- download_model_data(
  "baseline",
  version = "v1.0.1",
  cache_dir = cache_dir,
  verify = FALSE,
  verbose = FALSE
)



Download file with progress bar

Description

Downloads a file with a text progress bar showing download progress.

Usage

download_with_progress(url, destfile, verbose = TRUE)

Arguments

url

URL to download from

destfile

Destination file path

verbose

Whether to show progress bar

Value

Logical indicating success


Estimate lag-1 temporal autocorrelation

Description

Estimate the lag-1 autocorrelation rho_t of a leaf-wax record's residuals after a flat-mean detrend, ordering by age. This is the quantity that enters the within-record detection threshold from manuscript Section 4.5.3 (⁠Var(X1 - X2) = 2 sigma^2 (1 - rho_t)⁠).

Usage

estimate_temporal_autocorrelation(
  d2h_wax,
  age,
  method = c("ar1", "lomb_scargle")
)

Arguments

d2h_wax

Numeric vector of leaf-wax delta-2-H measurements (per mil).

age

Numeric vector of sample ages, same length as d2h_wax.

method

One of "ar1" or "lomb_scargle".

Details

Two methods are supported:

Value

Numeric scalar in ⁠[-1, 1]⁠, or NA_real_ when the residuals are constant (e.g., n < 3 finite samples).

Examples


set.seed(1)
n <- 200
rho <- 0.7
e  <- numeric(n); e[1] <- rnorm(1, 0, 5)
for (k in 2:n) e[k] <- rho * e[k-1] + rnorm(1, 0, 5 * sqrt(1 - rho^2))
ag <- seq(0, 10000, length.out = n)
estimate_temporal_autocorrelation(-150 + e, ag)


Example leaf wax hydrogen isotope data

Description

A 10-row data frame of synthetic leaf wax hydrogen isotope measurements bundled for demonstration and testing.

Usage

example_data

Format

A data frame with 10 rows and 10 variables:

site_id

Character, site identifier

longitude

Numeric, longitude in decimal degrees (-180 to 180)

latitude

Numeric, latitude in decimal degrees (-90 to 90)

elevation

Numeric, elevation in meters above sea level

d2h_wax

Numeric, leaf wax hydrogen isotope value in per mil VSMOW

d2h_wax_sd

Numeric, analytical uncertainty in per mil

c4_fraction

Numeric, C4 vegetation fraction (0-1)

pft_tree

Numeric, tree plant functional type fraction (0-1)

pft_shrub

Numeric, shrub plant functional type fraction (0-1)

pft_grass

Numeric, grass plant functional type fraction (0-1)

Source

Synthetic values designed to span the calibration range.

Examples

data(example_data)
head(example_data)

Generate Fibonacci sphere points

Description

Generates evenly distributed points on a sphere using the Fibonacci spiral method.

Usage

generate_fibonacci_sphere(n_points = N_SPATIAL_KNOTS)

Arguments

n_points

Number of points to generate

Value

Matrix with columns "lon" and "lat" in degrees


Generate human-readable model description

Description

Generate human-readable model description

Usage

generate_model_description(model_name, capabilities)

Arguments

model_name

Name of the model

capabilities

Model capabilities list

Value

Character description


Get all model metadata

Description

Returns a list of all 14 models with their descriptions and properties. The models are based on the spatial leafwax hierarchical Bayesian models.

Usage

get_all_model_metadata()

Value

Named list of available models with metadata


Get leafwax data cache directory

Description

Returns the path to the local cache directory for leafwax model data. Uses rappdirs for platform-specific paths or a user-specified directory.

Usage

get_cache_dir(create = TRUE)

Arguments

create

Logical, whether to create the directory if it doesn't exist

Value

Character string with the cache directory path

Examples


local({
  old <- options(leafwax.cache_dir = file.path(tempdir(), "leafwax_cache"))
  on.exit(options(old))

  cache_dir <- get_cache_dir(create = FALSE)
  dir.exists(cache_dir)
})


Get cache files for a model

Description

Internal helper that returns the cached file paths for a model. The v0.2 download layout ships a single posterior per model at ⁠posteriors/<model>_posterior.rds⁠, so the returned vector has at most one element. The data_type argument is accepted for API compatibility but does not affect the result.

Usage

get_cache_files(model_name, data_type, cache_dir)

Arguments

model_name

Model name.

data_type

Retained for API compatibility (ignored).

cache_dir

Cache directory path.

Value

Character vector of cached file paths that exist on disk.


Get cache size information

Description

Reports the disk space used by cached model data.

Usage

get_cache_info(by_model = FALSE, by_type = FALSE)

Arguments

by_model

Logical, whether to break down by model

by_type

Logical, whether to break down by data type

Value

Data frame with cache size information

Examples


local({
  old <- options(leafwax.cache_dir = file.path(tempdir(), "leafwax_cache"))
  on.exit(options(old))

  # Get total cache size
  cache_info <- get_cache_info()

  # Get size by model and type
  cache_info <- get_cache_info(by_model = TRUE, by_type = TRUE)
})


Get data manifest

Description

Loads or downloads the data manifest with file checksums. Returns NULL (with a warning()) when the manifest is unreachable and there is no cached copy on disk; callers must treat that as "checksum verification skipped" rather than "no checksums found".

Usage

get_data_manifest()

Value

Parsed manifest list, or NULL if no manifest is available locally and the download failed.


Get path to data file

Description

Returns the full path to a data file based on the data source.

Usage

get_data_path(filename, data_source = "auto")

Arguments

filename

Name of the file

data_source

Source of data: "package", "cache", or "download"

Value

Character string with the file path


Get data download URLs

Description

Constructs download URLs for model data from GitHub releases.

Usage

get_data_url(model_name, version = "latest", data_type = c("posteriors"))

Arguments

model_name

Character string specifying the model name

version

Version tag (e.g., "v1.0.0" or "latest")

data_type

Type of data (only "posteriors" is currently supported)

Value

List of download URLs and filenames

Examples

# Get URLs for latest version
urls <- get_data_url("baseline_sp", "latest")

# Get URLs for specific version
urls <- get_data_url("baseline_sp", "v1.0.1")

Get model info

Description

Get model info

Usage

get_model_info(model_name)

Arguments

model_name

Name of the model

Value

List with model metadata


Get model parameters

Description

Returns which parameters each model expects and provides metadata about model capabilities for validation.

Usage

get_model_parameters(model_name)

Arguments

model_name

Name of the model

Value

List with model parameters and capabilities


Get URL configuration

Description

Loads the URL configuration from the package data.

Usage

get_url_config()

Value

List with URL configuration


Ensemble predictions across multiple models

Description

Runs invert_d2H against each model in models and combines the per-draw reconstructions per site, preserving the per-site dimension. Useful for downstream uncertainty estimates that should span structural model uncertainty rather than condition on one calibration variant.

Usage

invert_d2H_ensemble(
  ...,
  models = c("full_sp", "full_interact_sp", "elevation_c4_interact_sp"),
  ensemble_method = c("equal", "all")
)

Arguments

...

Arguments passed to invert_d2H (e.g., d2H_wax, d2H_wax_sd, longitude, latitude, optional covariates).

models

Character vector of v10 model names to include in the ensemble. Defaults to three structurally distinct variants: full_sp (all covariates + spatial GP), full_interact_sp (full + elevation x C4 interaction + spatial GP), and elevation_c4_interact_sp (elevation x C4 interaction with spatial GP, no PFT).

ensemble_method

"equal" (default) pools per-draw reconstructions per site across models with equal weighting and returns a per-site posterior. "all" returns the per-model results without pooling.

Value

If ensemble_method = "equal", a list with: posterior_draws (an ⁠n_draws x n_sites⁠ matrix of pooled per-site, per-draw reconstructions), ensemble_summary (a data frame with one row per site: mean, median, sd, ci_90_lower/ci_90_upper, ci_95_lower/ci_95_upper, n_models_used), model_results (the per-model output from invert_d2H() with return_full = TRUE), models_used (the models actually pooled), and ensemble_method. If ensemble_method = "all", only model_results and ensemble_method are returned.


Invert leaf wax d2H to precipitation d2H

Description

Uses Bayesian posterior draws to invert leaf wax hydrogen isotope values to precipitation isotope values, accounting for all fitted model components including vegetation effects and spatial correlations where applicable.

Usage

invert_d2h(
  d2h_wax,
  d2h_wax_err = NULL,
  longitude,
  latitude,
  elevation = NULL,
  c4_percent = NULL,
  pft_tree = NULL,
  pft_shrub = NULL,
  pft_grass = NULL,
  model_name = "baseline",
  n_draws = NULL,
  return_full = FALSE,
  credible_level = 0.9,
  verbose = TRUE,
  record_id = NULL,
  slope = NULL
)

invert_d2H(
  d2H_wax,
  d2H_wax_sd = NULL,
  longitude,
  latitude,
  elevation = NULL,
  elevation_sd = 100,
  c4_fraction = NULL,
  c4_fraction_sd = 10,
  pft_tree = NULL,
  pft_shrub = NULL,
  pft_grass = NULL,
  model_name = "baseline",
  n_posterior_draws = NULL,
  return_full = FALSE,
  credible_level = 0.9,
  verbose = TRUE,
  record_id = NULL,
  slope = NULL
)

Arguments

d2h_wax

Numeric vector of leaf wax d2H values (per mil)

d2h_wax_err

Numeric vector of measurement uncertainties (per mil)

longitude

Numeric vector of longitudes (decimal degrees)

latitude

Numeric vector of latitudes (decimal degrees)

elevation

Numeric vector of elevations (meters)

c4_percent

Numeric vector of C4 vegetation percentage (0-100)

pft_tree

Numeric vector of tree PFT fraction (0-1)

pft_shrub

Numeric vector of shrub PFT fraction (0-1)

pft_grass

Numeric vector of grass PFT fraction (0-1)

model_name

Character string specifying which model to use

n_draws

Integer number of posterior draws to use (NULL for all)

return_full

Logical whether to return full posterior draws or just summary

credible_level

Numeric credible interval level (default 0.9)

verbose

Logical whether to print progress messages

record_id

Character or numeric, optional record identifier. When supplied and constant across all input rows, all rows are treated as belonging to the same downcore series: the spatial Gaussian process is evaluated once per posterior draw at the shared site, so spatial draws are reused across the series rather than redrawn per row. The current implementation already shares spatial draws between identical (longitude, latitude) pairs; the record_id argument adds explicit validation that the caller intends within-record inference.

slope

Optional numeric override for the d2H_wax-d2H_precip slope. NULL (default) uses the model's site-specific slope, i.e., the local \beta_{\delta^2 H_p} calibration slope including the spatial slope GP perturbation at the site. A single numeric replaces the slope with a fixed point estimate (broadcast across all posterior draws). A vector of length n_draws is used per draw. Use local_effective_slope() to build a defensible per-draw override from the calibration's site-specific posterior. When supplied, the override applies uniformly to every input row.

d2H_wax

Numeric vector of leaf wax d2H values (per mil)

d2H_wax_sd

Numeric vector of measurement uncertainties (per mil)

elevation_sd

Elevation uncertainty (not used, kept for compatibility)

c4_fraction

Numeric vector of C4 vegetation cover as a fraction in ⁠[0, 1]⁠. The wrapper converts to the percent (0-100) scale used internally before standardisation.

c4_fraction_sd

C4 fraction uncertainty (not used, kept for compatibility)

n_posterior_draws

Integer number of posterior draws to use

Value

If return_full is FALSE, a data frame with columns:

d2h_precip_mean

Mean predicted precipitation d2H

d2h_precip_median

Median predicted precipitation d2H

d2h_precip_sd

Standard deviation of the posterior predictive interval

d2h_precip_lower

Lower bound of the credible interval

d2h_precip_upper

Upper bound of the credible interval

prediction_interval_width

Width of the credible interval (upper - lower).

The interval is the posterior predictive specified in manuscript supplement Section S4.1, Eq. 7: the wax-error draw combines analytical uncertainty with the model's posterior residual SD sigma. For within-record change detection, the spatial GP intercept's contribution cancels in any contrast computed from the returned posterior_draws (manuscript Section 4.5.3); the same sigma applies in both regimes.

If return_full is TRUE, a list with:

summary

The summary data frame described above

posterior_draws

Matrix of all posterior draws (n_draws x n_locations)

model_info

Information about the model used.

Examples


local({
  old <- options(leafwax.suppress_preview_warning = TRUE)
  on.exit(options(old))

  # Simple inversion with base model
  results <- invert_d2h(
    d2h_wax = c(-150, -140, -130),
    d2h_wax_err = c(3, 3, 3),
    longitude = c(-120, -110, -100),
    latitude = c(40, 35, 30),
    elevation = c(1000, 1500, 500),
    model_name = "baseline",
    verbose = FALSE
  )

  # Inversion with spatial model
  results <- invert_d2h(
    d2h_wax = c(-150, -140, -130),
    d2h_wax_err = c(3, 3, 3),
    longitude = c(-120, -110, -100),
    latitude = c(40, 35, 30),
    elevation = c(1000, 1500, 500),
    model_name = "baseline_sp",
    return_full = TRUE,
    verbose = FALSE
  )
})


Get leafwax configuration

Description

Returns current configuration options for the leafwax package.

Usage

leafwax_config(option = NULL)

Arguments

option

Specific option to retrieve (NULL for all)

Value

List of options or single option value

Examples

# Get all configuration options
cfg <- leafwax_config()

# Get specific option
auto_download <- leafwax_config("auto_download")

Set leafwax configuration

Description

Sets configuration options for the leafwax package.

Usage

leafwax_set_config(..., persist = TRUE)

Arguments

...

Named arguments for options to set

persist

Logical, whether to show code to make changes permanent

Value

Invisible NULL

Examples


# Enable auto-download
old <- options()
leafwax_set_config(auto_download = TRUE, persist = FALSE)
options(old)

# Set multiple options
old <- options()
leafwax_set_config(
  auto_download = TRUE,
  cache_dir = "~/my_leafwax_cache",
  verbose = FALSE,
  persist = FALSE
)
options(old)


List available models in cache

Description

Lists all models that have been downloaded to the local cache.

Usage

list_cached_models(data_type = NULL, verbose = TRUE)

Arguments

data_type

Filter by data type (NULL for any)

verbose

Logical, whether to print detailed information

Value

Character vector of available model names

Examples


local({
  old <- options(leafwax.cache_dir = file.path(tempdir(), "leafwax_cache"))
  on.exit(options(old))

  # List all cached models
  models <- list_cached_models(verbose = FALSE)

  # List models with full data
  models_full <- list_cached_models(data_type = "full", verbose = FALSE)
})


List model names

Description

Returns the names of every model the package can resolve. Prefers the heavy posteriors directory when present (development install), otherwise falls back to the lightweight posteriors directory that ships with every install. The user cache is intentionally not enumerated here so the answer is stable regardless of what has been downloaded.

Usage

list_model_names()

Value

Character vector of model names. Empty if neither directory contains posterior files.


List available models with details

Description

Returns information about the v10 models available in the leafwax package, including which covariates each model uses.

Usage

list_models(check_data = TRUE, verbose = TRUE)

Arguments

check_data

Logical, whether to check if model data is available

verbose

Logical, whether to print formatted output

Value

Data frame with model information

Examples


# List all models
models <- list_models(verbose = FALSE)
models_head <- head(models)


Load posterior draws for a model

Description

Loads posterior draws for one of the 14 leafwax v10 models. The function searches three tiers in order:

Usage

load_posteriors(model_name, n_draws = NULL, verbose = TRUE)

Arguments

model_name

Character string specifying the model name.

n_draws

Integer number of posterior draws to use, or NULL for all available. Requesting more draws than are present silently returns whatever is available (e.g. all 100 from the preview tier).

verbose

Logical indicating whether to print loading info.

Details

  1. Heavy posteriors at ⁠inst/extdata/posteriors/⁠ (1000 draws, development install only; excluded from the CRAN tarball).

  2. Cache populated by download_model_data() under get_cache_dir().

  3. Preview posteriors at ⁠inst/extdata/posteriors_light/⁠. These are a 100-draw stratified subsample shipped with every install so examples and tests run offline. They are intended as a fixture for code-path verification, not for inference: tail probabilities and 95% credible intervals are noisy at this sample size. The package issues a warning whenever the preview tier is in use; downstream functions (invert_d2H(), assess_claim(), detect_change()) repeat the warning so it is visible at the call that actually matters.

For inference, run download_model_data() once to populate the cache and then call load_posteriors() again – the cache tier wins over the preview tier and no further downloads are needed.

Value

A leafwax_posterior object: a list with draws, metadata (including metadata$tier, one of "heavy", "cache", "light"), optional spatial, and accessor closures.

Examples

local({
  old <- options(leafwax.suppress_preview_warning = TRUE)
  on.exit(options(old))

  # Load a model (preview tier on a fresh install)
  model <- load_posteriors("baseline", verbose = FALSE)

  # Spatial model with limited draws
  model_fast <- load_posteriors("baseline_sp", n_draws = 50, verbose = FALSE)
})

Local effective slope at a paleo-reconstruction site

Description

Returns a per-draw vector of the local \beta_{\delta^2 H_p} calibration slope at a single site, combining the global precipitation-isotope slope with the spatial slope GP prediction at that site. The returned vector is the raw posterior at the site; every draw the calibration produced is preserved without modification.

Usage

local_effective_slope(
  longitude,
  latitude,
  model_name,
  override = NULL,
  n_draws = NULL,
  verbose = FALSE
)

Arguments

longitude

Numeric, single longitude in decimal degrees.

latitude

Numeric, single latitude in decimal degrees.

model_name

Character, v10 model name (see available_models()). Must be a spatial model (⁠*_sp⁠) for the site-specific slope to differ from the global mean; non-spatial models return the global posterior unchanged.

override

Optional numeric. NULL (default) uses the model slope. A single value broadcasts across all draws. A vector of length n_draws is used per draw.

n_draws

Integer, optional number of posterior draws to use (NULL uses all). Forwarded to load_posteriors().

verbose

Logical, whether to print progress messages.

Details

Two modes:

Pass the returned vector to invert_d2H(..., slope = ...) to propagate it through the inversion.

Mechanistic reference values (e.g. the simple two-pool stationarity bound alpha = 1 + epsilon_app/1000 ~ 0.88 under ⁠epsilon_app ~= -120 permil⁠; Sessions 2005) are documented for interpretation but are never applied to the returned draws. The frequency of draws above any chosen reference is computable directly from the returned vector (mean(slope > 0.88)) and carries scientific information about how often the calibration implicates non-stationarity at the site.

Value

Numeric vector of length n_draws, the per-draw effective slope at the site (after override, if any).

Examples


local({
  old <- options(leafwax.suppress_preview_warning = TRUE)
  on.exit(options(old))

  # St. Louis with the baseline_sp model
  s <- local_effective_slope(
    longitude = -90, latitude = 38,
    model_name = "baseline_sp",
    n_draws = 200
  )
  slope_summary <- summary(s)

  # How often does the calibration imply a slope above the simple-model
  # stationarity bound at this site?
  fraction_above_bound <- mean(s > 0.88)

  # Override with a defended local slope
  s_fixed <- local_effective_slope(
    longitude = -90, latitude = 38,
    model_name = "baseline_sp",
    override = 0.55
  )

  # Pass through to the inversion. The slope vector and the
  # inversion's posterior must use the same n_draws: pair
  # local_effective_slope(..., n_draws = N) with
  # invert_d2H(..., n_posterior_draws = N, slope = s), or pass a
  # single point estimate (e.g., median(s)).
  result <- invert_d2H(d2H_wax = -180, d2H_wax_sd = 3,
                       longitude = -90, latitude = 38,
                       model_name = "baseline_sp",
                       n_posterior_draws = 200,
                       slope = s,
                       verbose = FALSE)
})


Predict precipitation d2H from leaf wax d2H

Description

Main user-facing function for inverting leaf wax hydrogen isotopes to precipitation isotopes. Automatically selects appropriate model based on available data and returns results in a tidy format.

Usage

predict_d2h_precip(
  data = NULL,
  d2h_wax = NULL,
  longitude = NULL,
  latitude = NULL,
  d2h_wax_err = NULL,
  elevation = NULL,
  c4_fraction = NULL,
  pft_tree = NULL,
  pft_shrub = NULL,
  pft_grass = NULL,
  model = "auto",
  n_draws = NULL,
  credible_level = 0.9,
  return_draws = FALSE,
  progress = TRUE,
  verbose = TRUE
)

Arguments

data

Data frame containing measurements, or NULL to use individual vectors

d2h_wax

Numeric vector of leaf wax d2H values (per mil)

longitude

Numeric vector of longitudes (decimal degrees)

latitude

Numeric vector of latitudes (decimal degrees)

d2h_wax_err

Numeric vector of measurement uncertainties (optional)

elevation

Numeric vector of elevations in meters (optional)

c4_fraction

Numeric vector of C4 vegetation fraction 0-1 (optional)

pft_tree

Numeric vector of tree PFT fraction (optional)

pft_shrub

Numeric vector of shrub PFT fraction (optional)

pft_grass

Numeric vector of grass PFT fraction (optional)

model

Character string specifying model, or "auto" for automatic selection

n_draws

Integer number of posterior draws (NULL for all)

credible_level

Numeric credible interval level (default 0.9)

return_draws

Logical whether to return full posterior draws

progress

Logical whether to show progress bar for batch processing

verbose

Logical whether to print status messages

Value

A data frame with predictions (or list if return_draws = TRUE):

d2h_precip_mean

Mean predicted precipitation d2H

d2h_precip_median

Median predicted precipitation d2H

d2h_precip_sd

Standard deviation of the posterior predictive interval

d2h_precip_lower

Lower bound of the credible interval

d2h_precip_upper

Upper bound of the credible interval

prediction_interval_width

Width of the credible interval

model_used

Name of model used for prediction

The interval is the posterior predictive specified in manuscript supplement Section S4.1, Eq. 7 (analytical uncertainty plus the model's posterior residual SD).

Examples


local({
  old <- options(leafwax.suppress_preview_warning = TRUE)
  on.exit(options(old))

  # Using data frame input
  data(example_data)
  results <- predict_d2h_precip(example_data, verbose = FALSE)

  # Using individual vectors
  results <- predict_d2h_precip(
    d2h_wax = c(-150, -140, -130),
    longitude = c(-120, -110, -100),
    latitude = c(40, 35, 30),
    elevation = c(1000, 1500, 500),
    verbose = FALSE
  )

  # Specify model explicitly
  results <- predict_d2h_precip(
    example_data,
    model = "baseline_env_sp",
    verbose = FALSE
  )

  # Get full posterior draws
  results <- predict_d2h_precip(
    example_data,
    return_draws = TRUE,
    verbose = FALSE
  )
})


Predict an mPP Gaussian-process random effect at a new location

Description

Single-GP version. Used internally by predict_spatial_dual_gp() for each of the two (intercept, slope) fields. Matches the Matern 3/2 kernel and standardized-coordinate convention from the v10 Stan model.

Usage

predict_one_gp_mpp(
  coords_new,
  knot_coords,
  z_knots,
  sigma_draws,
  ls_km_draws,
  scaling,
  jitter = 1e-04
)

Arguments

coords_new

matrix(n_obs, 2) of (lon, lat) in DEGREES.

knot_coords

matrix(n_knots, 2) of (lon, lat) in DEGREES.

z_knots

matrix(n_draws, n_knots) of standardized knot effects (e.g. ⁠z_intercept_spatial[1..125]⁠ from the posterior).

sigma_draws

numeric(n_draws), the GP marginal SD.

ls_km_draws

numeric(n_draws), the GP length scale in km (e.g. ls_intercept_km).

scaling

list with lon_mean, lon_sd, lat_mean, lat_sd.

jitter

ridge added to K_knots for numerical stability.

Value

matrix(n_draws, n_obs) of predicted GP values at the new sites.


Predict both spatial intercept and spatial slope at new locations

Description

v10 carries two independent GPs. Both share knot coordinates and a single length scale parameter (ls_intercept_km == ls_slope_km in v10's posterior, two names for the same draw), but have distinct sigma_intercept_spatial and sigma_slope_spatial, and distinct ⁠z_intercept_spatial[*]⁠ and ⁠z_slope_spatial[*]⁠ knot effects.

Usage

predict_spatial_dual_gp(coords_new, knot_coords, draws, scaling)

Arguments

coords_new

matrix(n_obs, 2) of (lon, lat) in DEGREES.

knot_coords

matrix(n_knots, 2) of (lon, lat) in DEGREES.

draws

data.frame of posterior draws (subset of leafwax_posterior$draws). Must contain columns ⁠z_intercept_spatial[1..n_knots]⁠, ⁠z_slope_spatial[1..n_knots]⁠, sigma_intercept_spatial, sigma_slope_spatial, and one of ls_intercept_km / ls_slope_km.

scaling

list with lon_mean, lon_sd, lat_mean, lat_sd.

Value

list with two matrices, each n_draws x n_obs: intercept (additive contribution to beta_0 in standardized d2H_wax space) and slope (additive contribution to the local \beta_{\delta^2 H_p} slope).


Print method for leafwax_posterior

Description

Print method for leafwax_posterior

Usage

## S3 method for class 'leafwax_posterior'
print(x, ...)

Arguments

x

A leafwax_posterior object

...

Additional arguments

Value

The input leafwax_posterior object x, invisibly. Called for its side effect of printing a one-line model summary to the console.


Process chunks in parallel

Description

Process chunks in parallel

Usage

process_parallel(data, chunks, model, n_cores, progress, ...)

Arguments

data

Full dataset

chunks

List of index vectors for chunks

model

Model name

n_cores

Number of cores

progress

Show progress

...

Additional arguments

Value

List of results for each chunk


Process chunks sequentially with progress bar

Description

Process chunks sequentially with progress bar

Usage

process_sequential(data, chunks, model, progress, ...)

Arguments

data

Full dataset

chunks

List of index vectors for chunks

model

Model name

progress

Show progress bar

...

Additional arguments

Value

List of results for each chunk


Select best model based on available data

Description

Automatically selects the most appropriate v10 model name from the 14 shipped variants given which covariates the user has available. Spatial-aware models are preferred when prefer_spatial = TRUE.

Usage

select_best_model_from_flags(
  has_elevation = FALSE,
  has_c4 = FALSE,
  has_pft = FALSE,
  prefer_spatial = TRUE,
  verbose = FALSE
)

Arguments

has_elevation

Logical, whether elevation data is available. Accepted for compatibility; shipped v10 posteriors do not contain fitted elevation coefficients, so elevation alone does not change the selected model.

has_c4

Logical, whether C4 vegetation data is available

has_pft

Logical, whether PFT data is available

prefer_spatial

Logical, whether to prefer spatial models

verbose

Logical, whether to print selection reasoning

Value

Character string with selected v10 model name


Validate input data for inversion

Description

Checks that input data meets requirements for the specified model and returns cleaned, validated data.

Usage

validate_inputs(
  d2h_wax,
  longitude,
  latitude,
  d2h_wax_err = NULL,
  elevation = NULL,
  c4_fraction = NULL,
  pft_tree = NULL,
  pft_shrub = NULL,
  pft_grass = NULL,
  model_name = "baseline"
)

Arguments

d2h_wax

Leaf wax d2H values

longitude

Longitude values

latitude

Latitude values

d2h_wax_err

Measurement uncertainties

elevation

Elevation values

c4_fraction

C4 vegetation fraction

pft_tree

Tree PFT fraction

pft_shrub

Shrub PFT fraction

pft_grass

Grass PFT fraction

model_name

Name of model to use

Value

List of validated inputs


Validate inputs for a specific model

Description

Checks that all required predictors are provided and warns about unused ones.

Usage

validate_model_inputs(
  model_name,
  d2h_wax,
  longitude,
  latitude,
  elevation = NULL,
  c4_percent = NULL,
  pft_tree = NULL,
  pft_shrub = NULL,
  pft_grass = NULL,
  verbose = TRUE
)

Arguments

model_name

Name of the model

d2h_wax

Leaf wax d2H values

longitude

Longitude values

latitude

Latitude values

elevation

Elevation values (optional)

c4_percent

C4 percentage values (optional)

pft_tree

Tree PFT fraction (optional)

pft_shrub

Shrub PFT fraction (optional)

pft_grass

Grass PFT fraction (optional)

verbose

Whether to print validation messages

Value

List with validation results