TemporalForest

TemporalForest is an R package for reproducible feature selection in high-dimensional longitudinal data.

It combines three stages: time-aware network reduction (a consensus topological overlap matrix, TOM, built with WGCNA), mixed-effects model trees that respect within-subject correlation, and stability selection. The result is a small, interpretable, and stable set of predictors.
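
The stability selection idea can be illustrated in a few lines of base R. The sketch below is a generic illustration, not the package's implementation: it ranks features by absolute correlation with the outcome on bootstrap resamples (TemporalForest uses mixed-effects model trees instead), and `n_boot`, `top_k`, and the 0.6 cutoff are arbitrary values chosen for this example.

```r
# Generic bootstrap selection-frequency sketch (base R only).
# Not TemporalForest internals: the scoring rule and all settings are illustrative.
set.seed(1)
n <- 100; p <- 10
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("V", 1:p)))
y <- 3 * X[, "V1"] + 2 * X[, "V2"] + rnorm(n)

n_boot <- 50   # number of bootstrap resamples (illustrative)
top_k  <- 2    # features kept per resample (illustrative)

# For each resample, keep the top_k features by absolute correlation with y
picks <- replicate(n_boot, {
  idx   <- sample(n, replace = TRUE)
  score <- abs(cor(X[idx, ], y[idx]))[, 1]
  names(sort(score, decreasing = TRUE))[seq_len(top_k)]
})

# Selection frequency: share of resamples in which each feature was kept
freq   <- sort(table(picks) / n_boot, decreasing = TRUE)
stable <- names(freq)[as.vector(freq) >= 0.6]
print(freq)
print(stable)
```

Features that are kept in most resamples are treated as stable; features that appear only occasionally are more likely to be noise.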

Why TemporalForest?

Installation

You can install the released version of TemporalForest from CRAN with:

install.packages("TemporalForest")

And the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("SisiShao/TemporalForest")

30-second Quick Start

A small simulated example that skips network construction by supplying a precomputed dissimilarity matrix:

library(TemporalForest)

set.seed(11)
n_subjects <- 60; n_timepoints <- 2; p <- 20

# Build X: list of length T, each an n × p matrix with identical column names
X <- replicate(n_timepoints, matrix(rnorm(n_subjects * p), n_subjects, p), simplify = FALSE)
colnames(X[[1]]) <- colnames(X[[2]]) <- paste0("V", 1:p)

# Long view + metadata
X_long <- do.call(rbind, X)
id   <- rep(seq_len(n_subjects), each = n_timepoints)
time <- rep(seq_len(n_timepoints), times = n_subjects)

# Outcome with three strong signals
u_subj <- rnorm(n_subjects, 0, 0.7)
eps    <- rnorm(length(id), 0, 0.08)
Y <- 4*X_long[, "V1"] + 3.5*X_long[, "V2"] + 3.2*X_long[, "V3"] +
     rep(u_subj, each = n_timepoints) + eps

# Simple dissimilarity to bypass Stage 1 (fast demo)
A <- 1 - abs(stats::cor(X_long)); diag(A) <- 0
dimnames(A) <- list(colnames(X[[1]]), colnames(X[[1]]))

fit <- temporal_forest(
  X = X, Y = Y, id = id, time = time,
  dissimilarity_matrix = A,     # skip WGCNA/TOM
  n_features_to_select = 3,     # expect V1, V2, V3
  n_boot_screen = 6, n_boot_select = 18,
  keep_fraction_screen = 1,
  min_module_size = 2,
  alpha_screen = 0.5, alpha_select = 0.6
)

print(fit$top_features)
#> [1] "V1" "V3" "V2"
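
When adapting this example to real data, the row order of `Y`, `id`, and `time` has to match the way the predictor matrices are stacked. Note that `do.call(rbind, X)` stacks rows one time point at a time (all subjects at time 1, then all subjects at time 2), whereas `rep(..., each = n_timepoints)` describes a subject-by-subject layout. The sketch below uses a tiny labelled example to show the stacked order and one way to build matching vectors. The names `id_stacked` and `time_stacked` are hypothetical, and which layout `temporal_forest()` assumes internally is not stated in this README, so check the function documentation before relying on either.

```r
# Sketch: how rbind() orders rows, and id/time vectors that follow that order.
n_subjects <- 3; n_timepoints <- 2; p <- 2

# One matrix per time point; row names record subject and time for inspection
X <- lapply(seq_len(n_timepoints), function(tp) {
  matrix(rnorm(n_subjects * p), n_subjects, p,
         dimnames = list(paste0("s", seq_len(n_subjects), "_t", tp),
                         paste0("V", seq_len(p))))
})

X_long <- do.call(rbind, X)
rownames(X_long)
#> [1] "s1_t1" "s2_t1" "s3_t1" "s1_t2" "s2_t2" "s3_t2"

# Vectors that follow the stacked (time-major) row order
id_stacked   <- rep(seq_len(n_subjects), times = n_timepoints)
time_stacked <- rep(seq_len(n_timepoints), each = n_subjects)

# Fails loudly if the labels and the stacked rows disagree
stopifnot(identical(rownames(X_long),
                    paste0("s", id_stacked, "_t", time_stacked)))
```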

For a more detailed example and a full pipeline run, please see the package vignette.
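
As a rough illustration of what a dissimilarity matrix is used for, the sketch below groups correlated predictors into modules with base R's `hclust()` and `cutree()`. This is a stand-in chosen for brevity. It is an assumption that it resembles the package's module detection, which relies on WGCNA, and the simulated data and the choice `k = 2` are made up for the example.

```r
# Sketch: grouping predictors into modules from a correlation-based dissimilarity.
# Base R stand-in for illustration only; not the WGCNA/TOM procedure.
set.seed(2)
n <- 80

# Two latent factors drive two blocks of correlated predictors (illustrative data)
f1 <- rnorm(n); f2 <- rnorm(n)
X <- cbind(sapply(1:4, function(j) f1 + rnorm(n, sd = 0.3)),
           sapply(1:4, function(j) f2 + rnorm(n, sd = 0.3)))
colnames(X) <- paste0("V", 1:8)

# Same construction as in the quick start: 1 - |correlation|, zero diagonal
A <- 1 - abs(cor(X)); diag(A) <- 0

# Average-linkage hierarchical clustering, cut into two modules
tree    <- hclust(as.dist(A), method = "average")
modules <- cutree(tree, k = 2)

# List the predictors in each module
print(split(names(modules), modules))
```

Predictors driven by the same latent factor end up in the same module, which is the kind of structure a network-reduction step can exploit before feature screening.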

Documentation

A long-form guide and reproducible examples can be found in the vignette:

vignette("TemporalForest-Introduction", package = "TemporalForest")

Contributing

Issues and pull requests are welcome! Please report bugs or request features at the official GitHub repository (SisiShao/TemporalForest).

Citation

If you use TemporalForest in your work, please cite the manuscript:

Shao, S., Moore, J.H., Ramirez, C.M. (2025). Network-Guided Temporal Forests for Feature Selection in High-Dimensional Longitudinal Data. Manuscript submitted for publication.

You can also get the citation from within R:

citation("TemporalForest")