| Type: | Package |
| Title: | Change Point Detection for Non-Stationary and Cross-Correlated Time Series |
| Version: | 0.1.1 |
| Maintainer: | Yuhan Tian <yuhan.tian@fau.de> |
| Description: | Implements methods for multiple change point detection in multivariate time series with non-stationary dynamics and cross-correlations. The methodology is based on a model in which each component has a fluctuating mean represented by a random walk with occasional abrupt shifts, combined with a stationary vector autoregressive structure to capture temporal and cross-sectional dependence. The framework is broadly applicable to correlated multivariate sequences in which large, sudden shifts occur in all or subsets of components and are the primary targets of interest, whereas small, smooth fluctuations are not. Although random walks are used as a modeling device, they provide a flexible approximation for a wide class of slowly varying or locally smooth dynamics, enabling robust performance beyond the strict random walk setting. |
| License: | GPL-2 |
| Encoding: | UTF-8 |
| Imports: | Rcpp, blockmatrix, corpcor, doParallel, ggplot2, glmnet, MASS, Matrix, nnls, pracma, SimDesign |
| LinkingTo: | Rcpp, RcppArmadillo |
| RoxygenNote: | 7.3.3 |
| NeedsCompilation: | yes |
| Packaged: | 2025-12-19 16:20:57 UTC; AAA |
| Author: | Yuhan Tian [aut, cre], Abolfazl Safikhani [aut] |
| Repository: | CRAN |
| Date/Publication: | 2026-01-06 11:00:20 UTC |
FluxPoint: Change Point Detection for Non-Stationary and Cross-Correlated Time Series
Description
Implements methods for multiple change point detection in multivariate time series with non-stationary dynamics and cross-correlations. The methodology is based on a model in which each component has a fluctuating mean represented by a random walk with occasional abrupt shifts, combined with a stationary vector autoregressive structure to capture temporal and cross-sectional dependence. The framework is broadly applicable to correlated multivariate sequences in which large, sudden shifts occur in all or subsets of components and are the primary targets of interest, whereas small, smooth fluctuations are not. Although random walks are used as a modeling device, they provide a flexible approximation for a wide class of slowly varying or locally smooth dynamics, enabling robust performance beyond the strict random walk setting.
Author(s)
Maintainer: Yuhan Tian yuhan.tian@fau.de
Authors:
Abolfazl Safikhani asafikha@gmu.edu
FluxPoint change point detection algorithm
Description
Implements the full FluxPoint algorithm for detecting multiple change points in multivariate time series with non-stationary dynamics and cross-correlations. The procedure iteratively estimates model parameters and change point locations, alternating between parameter estimation and detection steps until convergence.
Usage
FluxPoint(
data,
w,
tc,
max_iter1,
max_iter2,
ignoreCross = FALSE,
noeta = FALSE,
nophi = FALSE,
needReproduce = FALSE
)
Arguments
data |
Numeric matrix of dimension n \times p containing the observed multivariate time series (rows are time points, columns are variables). |
w |
Integer specifying the window size used by the detector. |
tc |
Numeric tuning constant used in the detection threshold. |
max_iter1 |
Integer specifying the maximum number of iterations for the first-stage loop, which alternates between diagonal robust parameter estimation and change point detection. |
max_iter2 |
Integer specifying the maximum number of iterations for the second-stage refinement loop, which incorporates non-diagonal vector autoregressive updates. |
ignoreCross |
Logical; if TRUE, cross-correlations between components are ignored. |
noeta |
Logical; if TRUE, the random walk component (governed by \Sigma_{\boldsymbol{\eta}}) is treated as absent. |
nophi |
Logical; if TRUE, the autoregressive component (governed by \Phi) is treated as absent. |
needReproduce |
Logical; if TRUE, steps involving randomness (such as cross-validated model fits) are made reproducible. |
Details
The algorithm proceeds through the following stages:
- Stage I (diagonal estimation): Robust parameter estimation is performed to obtain diagonal estimates of \Sigma_{\boldsymbol{\eta}}, \Sigma_{\boldsymbol{\nu}}, and \Phi. These estimates are used to construct the windowed covariance matrix \Sigma_{\mathrm{ALL}}^{*} and its inverse. Change point detection is then carried out using the resulting detector statistic. The estimation and detection steps are iterated until the detected change points stabilize or max_iter1 is reached.
- Stage II (refinement with cross-correlation): If enabled, the fluctuating mean is estimated segmentwise and removed from the data. A sparse vector autoregressive model is then fitted to the residuals to obtain non-diagonal estimates of \Phi and \Sigma_{\boldsymbol{\nu}}. The covariance matrix \Sigma_{\mathrm{ALL}}^{*} is recomputed and change point detection is rerun. This refinement loop is repeated until convergence or until max_iter2 is reached.
Value
A list containing:
- cps: Sorted indices of the detected change points.
- Sig_eta_hat: Final estimate of \Sigma_{\boldsymbol{\eta}}.
- Sig_nu_hat: Final estimate of \Sigma_{\boldsymbol{\nu}}, which may be non-diagonal if the second-stage refinement is performed.
- Phi_hat: Final estimate of \Phi, which may be non-diagonal if the second-stage refinement is performed.
- muhats: Estimated fluctuating mean sequence.
- detectorStats: Detector statistic evaluated over time.
- cps_at: A list mapping each detected change point to the indices of components selected as contributing to that change.
References
Tian, Y. and Safikhani, A. (2025). Multiple change point detection in time series with non-stationary dynamics. Manuscript under review.
Examples
## Minimal runnable example (fast)
set.seed(123)
p <- 1
mu0 <- rep(0, p)
deltas <- list(c(3), c(-3))
Sig_eta <- diag(0.01, p)
Sig_nu <- random_Signu(p, 0)
Phi <- random_Phi(p, 0)
Sig_e1 <- get_Sig_e1_approx(Sig_nu, Phi)
# Simulate data with two evenly spaced change points
Y <- generate_data(mu0, deltas, Sig_eta, Sig_nu, Phi, Sig_e1,
errortype = "n", number_cps = 2, lengthofeachpart = 100)
# Run the algorithm
out <- FluxPoint(Y, w = 20, tc = 1, max_iter1 = 5, max_iter2 = 5)
out$cps
# Visualization
p1 <- plot_FluxPoint(Y, out$muhats, out$cps, titlename = "", xaxis = "Time")
print(p1)
## More realistic example (may take longer)
set.seed(123)
p <- 3
mu0 <- rep(0, p)
deltas <- list(c(3, 0, -3), c(0, -2, 4))
Sig_eta <- diag(0.01, p)
Sig_nu <- random_Signu(p, 0)
Phi <- random_Phi(p, p)
Sig_e1 <- get_Sig_e1_approx(Sig_nu, Phi)
Y <- generate_data(mu0, deltas, Sig_eta, Sig_nu, Phi, Sig_e1,
errortype = "n", number_cps = 2, lengthofeachpart = 100)
out <- FluxPoint(Y, w = 20, tc = 1, max_iter1 = 5, max_iter2 = 5)
out$cps
# Visualization
p1 <- plot_FluxPoint(Y, out$muhats, out$cps, titlename = "", xaxis = "Time")
print(p1)
Core change point detection algorithm (given known parameters)
Description
Implements the core step of the proposed change point
detection (CPD) algorithm to estimate the locations of change points,
given the inverse windowed covariance \Sigma_{\mathrm{ALL}}^{*-1}.
The method computes detector statistics over a moving window using a
projection-based quadratic form and identifies candidate change points
via peak detection.
Usage
FluxPoint_raw(data, Sig_all_inv, w, D, needReproduce = FALSE)
Arguments
data |
Numeric matrix of dimension n \times p containing the observed multivariate time series. |
Sig_all_inv |
Numeric matrix of dimension wp \times wp; the inverse windowed covariance matrix \Sigma_{\mathrm{ALL}}^{*-1}, e.g., as returned by get_Sigs. |
w |
Integer; window size used in the moving-window detection step. |
D |
Numeric; detection threshold used in the peak-finding step. |
needReproduce |
Logical; if TRUE, steps involving randomness (such as cross-validated model fits) are made reproducible. |
Details
For each center index k, a window of width w is formed and
contrast vectors are constructed to compare the first and second halves of
the window. Before computing the detector statistic, a component-selection
step is performed using an \ell_1-penalized regression (lasso, via
glmnet) with weights \Sigma_{\mathrm{ALL}}^{*-1} to identify
variables that exhibit a shift. The resulting active set determines the
projection used in the statistic. Sparse projection matrices indexed by the
active-set pattern are cached and reused for computational efficiency. The
detector statistic is defined as a weighted quadratic form measuring
deviation from the baseline (no-change) projection, and locations at which
the statistic exceeds the threshold D are declared as estimated
change points.
Value
A list with:
'shiftIndices' — Binary matrix (n \times p) indicating selected components at each index.
'detectorStats' — Numeric vector of detector values over time.
'Projection_list' — Cache of projection matrices by active-set pattern.
'cps' — Indices of detected change points.
Examples
## Minimal runnable example (fast)
set.seed(123)
p <- 1
mu0 <- rep(0, p)
deltas <- list(c(3), c(4))
Sig_eta <- diag(0.01, p)
Sig_nu <- random_Signu(p, 0)
Phi <- random_Phi(p, 0)
Sig_e1 <- get_Sig_e1_approx(Sig_nu, Phi)
# Simulate data with two evenly spaced change points
Y <- generate_data(mu0, deltas, Sig_eta, Sig_nu, Phi, Sig_e1,
errortype = "n", number_cps = 2,
lengthofeachpart = 100)
# Windowed covariance and its inverse
w <- 20
Sigs <- get_Sigs(w, p, Sig_eta, Sig_nu, Phi, Sig_e1)
Sig_all_inv <- Sigs$Sig_all_inv
# Run detector with a common threshold choice
n <- nrow(Y)
D <- min(4, log(exp(2) + p)) * log(n - w)
res <- FluxPoint_raw(Y, Sig_all_inv, w, D)
res$cps
## More realistic example (may take longer)
set.seed(123)
p <- 3
mu0 <- rep(0, p)
deltas <- list(c(3, 0, -3), c(0, -2, 4))
Sig_eta <- diag(0.01, p)
Sig_nu <- random_Signu(p, 0)
Phi <- random_Phi(p, p)
Sig_e1 <- get_Sig_e1_approx(Sig_nu, Phi)
Y <- generate_data(mu0, deltas, Sig_eta, Sig_nu, Phi, Sig_e1,
errortype = "n", number_cps = 2,
lengthofeachpart = 100)
w <- 20
Sigs <- get_Sigs(w, p, Sig_eta, Sig_nu, Phi, Sig_e1)
Sig_all_inv <- Sigs$Sig_all_inv
n <- nrow(Y)
D <- min(4, log(exp(2) + p)) * log(n - w)
res <- FluxPoint_raw(Y, Sig_all_inv, w, D)
res$cps
Add mean shifts to multivariate time series data
Description
Adds constant mean shifts to a multivariate time series by applying a fixed jump vector at evenly spaced change points. After each change point, all subsequent observations are shifted by the specified vector.
Usage
add_jumps(data, delta, num)
Arguments
data |
Numeric matrix of dimension n \times p (rows are time points, columns are variables). |
delta |
Numeric vector of length p specifying the mean shift applied at each change point. |
num |
Integer; number of change points. The data are divided evenly into num + 1 segments of equal length. |
Details
The total length of the time series is denoted by n. Change points are
placed at evenly spaced locations given by
k \lfloor n / (num + 1) \rfloor, for k = 1, \ldots, num. After each
change point, a constant shift vector delta is added to all subsequent
observations. This construction produces synthetic data with known and
controlled mean shifts, making the function useful for simulation studies and
benchmarking change point detection methods.
Value
A numeric matrix of the same dimension as 'data', containing the adjusted series with added mean shifts.
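A minimal illustration (added sketch, assuming the evenly spaced placement described in the Details):
set.seed(1)
## 300 x 3 noise matrix; two change points at t = 100 and t = 200
Y <- matrix(rnorm(300 * 3), nrow = 300, ncol = 3)
Y_shifted <- add_jumps(Y, delta = c(2, 0, -2), num = 2)
## After the second change point the cumulative shift is 2 * delta
round(colMeans(Y_shifted[201:300, ]) - colMeans(Y[201:300, ]), 2)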
Apply Thresholding to VAR Coefficients
Description
Applies a thresholding rule to a coefficient matrix by setting entries below a certain threshold to zero. Two types of thresholding are available: "soft" and "hard".
Usage
applyThreshold(a_mat, nr, nc, p, type = "soft")
Arguments
a_mat |
Numeric matrix. The coefficient matrix to be thresholded. |
nr |
Integer. The number of rows in the original data. |
nc |
Integer. The number of variables (columns) in the original data. |
p |
Integer. The order of the VAR model. |
type |
Character. The type of threshold to apply; either "soft" or "hard" (default "soft"). |
Value
The thresholded coefficient matrix.
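A hedged sketch of a direct call (this helper appears to be used internally when fitting the VAR; the interpretation of nr and nc follows the argument descriptions above):
set.seed(1)
## A lag-1 coefficient matrix with small entries
A_hat <- matrix(rnorm(9, sd = 0.2), 3, 3)
## Threshold using the dimensions of a hypothetical 200 x 3 data set
A_hard <- applyThreshold(A_hat, nr = 200, nc = 3, p = 1, type = "hard")
A_hard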
Compute VAR Model Residuals
Description
Computes the residuals from a VAR model by subtracting the fitted values (obtained from the estimated coefficient matrices) from the original time series data.
Usage
computeResiduals(data, A)
Arguments
data |
A numeric matrix of the original time series (observations in rows). |
A |
List. A list of VAR coefficient matrices (one for each lag). |
Value
A numeric matrix of residuals.
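An added sketch, assuming the coefficient list returned by fitVAR() in component A can be passed directly; the result may differ slightly from fit$residuals because fitVAR() centers the series:
set.seed(1)
Y <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)
fit <- fitVAR(Y, p = 1)
res <- computeResiduals(Y, fit$A)
dim(res)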
Cross-Validated VAR Estimation using Elastic Net
Description
This internal function performs cross validation for VAR estimation using the elastic net penalty. It prepares the data, calls the elastic net CV routine, reshapes the estimated coefficients, applies optional thresholding, computes residuals, and estimates the error covariance.
Usage
cvVAR(data, p, opt = NULL, needReproduce = FALSE)
Arguments
data |
A numeric matrix with time series data (observations in rows, variables in columns). |
p |
Integer. The order of the VAR model. |
opt |
List. A list of options (see |
Value
A list with components:
mu |
Vector of means of the original series. |
A |
List of VAR coefficient matrices (one for each lag). |
fit |
The complete elastic net CV fit (if requested). |
lambda |
The optimal lambda value chosen by CV. |
mse |
The minimum mean squared error from CV. |
mse_sd |
Standard deviation of the MSE. |
time |
Elapsed time for the ENET estimation. |
series |
The transformed series (after centering/scaling). |
residuals |
Residuals from the VAR model. |
sigma |
Estimated covariance matrix of the residuals. |
Cross Validation for Elastic Net VAR Estimation
Description
This internal function performs cross validation using elastic net (ENET)
estimation via the glmnet package. It supports parallel processing if requested.
Usage
cvVAR_ENET(X, y, nvar, opt, needReproduce = FALSE)
Arguments
X |
A numeric matrix of predictors. |
y |
Numeric vector of responses. |
nvar |
Integer. The number of variables in the original VAR (number of columns in data). |
opt |
List. A list of options including:
|
Value
An object of class cv.glmnet as returned by glmnet::cv.glmnet.
Construct Lagged Design Matrix for VAR
Description
Duplicates the original data matrix to create a lagged predictor matrix for VAR estimation.
Usage
duplicateMatrix(data, p)
Arguments
data |
A numeric matrix with time series data (observations in rows). |
p |
Integer. The order of the VAR model (number of lags). |
Value
A numeric matrix with duplicated columns corresponding to lagged observations.
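A small sketch (added for illustration; this helper may be internal, in which case it is only reachable via FluxPoint:::duplicateMatrix):
Y <- matrix(rnorm(10 * 2), nrow = 10, ncol = 2)
X_lagged <- duplicateMatrix(Y, p = 2)
dim(X_lagged)  ## columns correspond to the p lags of the original variables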
Estimate Covariance Matrix from Residuals
Description
Estimates the covariance (or variance) matrix of the residuals using shrinkage estimation.
This function utilizes corpcor::cov.shrink for covariance estimation.
Usage
estimateCovariance(res, ...)
Arguments
res |
A numeric matrix of residuals from the VAR model. |
... |
Additional arguments passed to |
Value
A numeric covariance matrix.
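A brief sketch (added for illustration), applying the shrinkage estimator to a residual-like matrix:
set.seed(1)
res <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)
Sigma_hat <- estimateCovariance(res)
round(Sigma_hat, 2)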
Estimate non-diagonal VAR(1) parameters after mean removal
Description
Estimates the non-diagonal autoregressive coefficient matrix
\Phi and innovation covariance matrix \Sigma_{\boldsymbol{\nu}}
for the residual process obtained after removing the estimated
fluctuating mean from the data. The estimation applies the
Lasso to encourage sparsity in the cross-variable dependence structure.
Usage
estimatePhinu_nondiag(
epsilons,
Sig_nu_diag,
Phi_diag,
replace_diag = FALSE,
needReproduce = FALSE
)
Arguments
epsilons |
Numeric matrix of dimension n \times p containing the residual process obtained after removing the estimated fluctuating mean from the data. |
Sig_nu_diag |
Numeric p \times p diagonal matrix; componentwise (Phase I) estimate of \Sigma_{\boldsymbol{\nu}}. |
Phi_diag |
Numeric p \times p diagonal matrix; componentwise (Phase I) estimate of \Phi. |
replace_diag |
Logical; if TRUE, the diagonal entries of the estimated \Phi and \Sigma_{\boldsymbol{\nu}} are replaced by their componentwise Phase I estimates (see Details). |
needReproduce |
Logical; if TRUE, steps involving randomness (such as cross-validated model fits) are made reproducible. |
Details
The function applies a Lasso-penalized VAR(1) fit to the residual
process \boldsymbol{\epsilon}_t to estimate cross-dependencies
among variables.
The fitting is performed using the function
fitVAR(), which is adapted from the sparsevar package.
When replace_diag = TRUE, the diagonal entries of
\Phi and \Sigma_{\boldsymbol{\nu}} are replaced by
their componentwise estimates obtained in Phase I for improved
numerical stability.
Value
A list containing:
'Phi_hat' — Estimated non-diagonal autoregressive matrix \Phi.
'Sig_nu_hat' — Estimated non-diagonal innovation covariance matrix \Sigma_{\boldsymbol{\nu}}.
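An added sketch of the Phase II call; here the true diagonal parameters stand in for the Phase I estimates, and the residual process is formed by subtracting a segmentwise mean estimate at the known change points:
set.seed(123)
p <- 3
mu0 <- rep(0, p)
deltas <- list(c(3, 0, -3), c(-2, 4, 0))
Sig_eta <- diag(0.01, p)
Sig_nu <- random_Signu(p, 2)
Phi <- random_Phi(p, 2)
Sig_e1 <- get_Sig_e1_approx(Sig_nu, Phi)
Y <- generate_data(mu0, deltas, Sig_eta, Sig_nu, Phi, Sig_e1,
                   errortype = "n", number_cps = 2, lengthofeachpart = 100)
## Remove the fluctuating mean segmentwise at the known change points
muhat <- estimate_musseg(Y, c(100, 200), Sig_eta, Sig_nu, Phi, Sig_e1)
eps <- Y - muhat
## Non-diagonal estimation, keeping the (here true) diagonal values
fit2 <- estimatePhinu_nondiag(eps, diag(diag(Sig_nu)), diag(diag(Phi)),
                              replace_diag = TRUE)
round(fit2$Phi_hat, 2)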
Robust parameter estimation (RPE) for univariate time series
Description
Implements the robust parameter estimation (RPE) procedure to estimate
the parameters \sigma_{\eta}^2, \sigma_{\nu}^2, and \phi
in the univariate version of the proposed model.
The method is based on minimizing the objective function defined in
objective_func, using variance estimates computed from
lagged differences of the data.
Usage
estimate_RWVAR_cp(
data,
L = 15,
phiLower = -0.8,
phiUpper = 0.8,
sigetaLower = 0,
sigetaUpper = Inf,
signuLower = 1e-06,
signuUpper = Inf,
num_inis = 20,
CPs = NULL
)
Arguments
data |
Numeric vector containing the univariate time series observations y_1, \ldots, y_n. |
L |
Integer; number of lag differences used in the estimation (default = 15). |
phiLower, phiUpper |
Numeric; lower and upper bounds for the autoregressive coefficient \phi. |
sigetaLower, sigetaUpper |
Numeric; lower and upper bounds for \sigma_{\eta}^2. |
signuLower, signuUpper |
Numeric; lower and upper bounds for \sigma_{\nu}^2. |
num_inis |
Integer; number of initial values of \phi used to generate starting values for the optimization (default = 20). |
CPs |
Optional numeric vector of change point locations (indices). If provided, differenced data crossing these points are removed to improve the robustness of the variance estimation in the presence of structural breaks. |
Details
For each lag l = 1, \ldots, L, the function computes the variance of
the l-lagged differences
z^{(l)}_t = y_{t+l} - y_t using the median absolute deviation (MAD).
If change points (CPs) are specified, all differences that overlap
a change point are excluded from the computation. The resulting empirical
variances \{v^{(l)}\}_{l=1}^L are then used to construct the following
optimization problem:
\sum_{l=1}^L \left(l\sigma^2_{\eta} +
2\frac{1-\phi^l}{1-\phi^2}\sigma^2_{\nu} - v^{(l)}\right)^2,
which is solved via bounded optimization using optim() with the
L-BFGS-B algorithm. Initial parameter values are obtained using
non-negative least squares (NNLS) regression over a grid of \phi
values.
Value
A list with elements:
'sigeta_est' — Estimated \sigma_{\eta}^2.
'signu_est' — Estimated \sigma_{\nu}^2.
'phi_est' — Estimated autoregressive coefficient \phi.
'inis' — Initial parameter values used in optimization.
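An added univariate sketch; the series below follows the random walk plus AR(1) structure, with one abrupt shift whose location is passed through CPs:
set.seed(123)
n <- 400
mu <- cumsum(rnorm(n, sd = 0.1))                                     ## random walk mean
eps <- as.numeric(arima.sim(list(ar = 0.5), n = n, sd = sqrt(0.5)))  ## AR(1) noise
y <- mu + eps
y[201:n] <- y[201:n] + 3                                             ## abrupt shift at t = 200
fit <- estimate_RWVAR_cp(y, L = 15, CPs = 200)
c(fit$sigeta_est, fit$signu_est, fit$phi_est)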
Robust parameter estimation (RPE) for multivariate time series
Description
Applies the robust parameter estimation (RPE) procedure componentwise
to a multivariate time series in order to estimate the diagonal elements
of \Sigma_{\boldsymbol{\eta}}, \Sigma_{\boldsymbol{\nu}}, and
\Phi.
Usage
estimate_RWVAR_cp_heter(
data,
L = 15,
phiLower = -0.8,
phiUpper = 0.8,
sigetaLower = 0,
sigetaUpper = Inf,
signuLower = 1e-06,
signuUpper = Inf,
num_inis = 20,
CPs = NULL
)
Arguments
data |
Numeric matrix of dimension n \times p; each column is treated as a univariate series in the componentwise estimation. |
L |
Integer; number of lag differences used in each univariate RPE estimation (default = 15). |
phiLower, phiUpper |
Numeric; lower and upper bounds for the autoregressive coefficient \phi. |
sigetaLower, sigetaUpper |
Numeric; lower and upper bounds for \sigma_{\eta}^2. |
signuLower, signuUpper |
Numeric; lower and upper bounds for \sigma_{\nu}^2. |
num_inis |
Integer; number of initial values of \phi used to generate starting values for each univariate optimization (default = 20). |
CPs |
Optional numeric vector of change point locations (indices). If provided, differenced data overlapping these points are removed for more robust estimation. |
Details
This function performs the RPE procedure for each variable (column)
in 'data' independently, using estimate_RWVAR_cp as the
univariate estimator. The resulting estimates are combined into
diagonal matrices:
- \Sigma_{\boldsymbol{\nu}} — estimated innovation covariance of the VAR(1) component.
- \Sigma_{\boldsymbol{\eta}} — estimated innovation covariance of the random walk component.
- \Phi — estimated autoregressive coefficient matrix.
Value
A list containing:
'Sig_nu' — Diagonal matrix of estimated \sigma_{\nu,i}^2.
'Sig_eta' — Diagonal matrix of estimated \sigma_{\eta,i}^2.
'Phi' — Diagonal matrix of estimated autoregressive coefficients \phi_i.
Examples
set.seed(123)
p <- 3
# True (diagonal) parameters for simulation
mu0 <- rep(0, p)
Sig_eta <- diag(0.01, p)
Sig_nu <- random_Signu(p, 0) # diagonal here since num_nonzero = 0
Phi <- random_Phi(p, 0) # diagonal here since num_nonzero = 0
Sig_e1 <- get_Sig_e1_approx(Sig_nu, Phi)
# Two evenly spaced change points
deltas <- list(c(3, 0, -3), c(-2, 4, 0))
Y <- generate_data(mu0, deltas, Sig_eta, Sig_nu, Phi, Sig_e1,
errortype = "n", number_cps = 2, lengthofeachpart = 100)
# Provide CP locations to remove affected differences in RPE
CPs <- c(100, 200)
# Componentwise robust parameter estimation
fit <- estimate_RWVAR_cp_heter(Y, L = 15, CPs = CPs)
# Estimated diagonal matrices:
fit$Sig_eta
fit$Sig_nu
fit$Phi
Estimate the fluctuating mean sequence via maximum likelihood
Description
Implements the maximum likelihood estimation (MLE) procedure for the
fluctuating mean sequence \{\boldsymbol{\mu}_t\}_{t=1}^n in the
proposed model, given the parameters (or their estimates)
\Sigma_{\boldsymbol{\eta}}, \Sigma_{\boldsymbol{\nu}},
\Phi, and \Gamma_{\boldsymbol{\epsilon}}(0).
Usage
estimate_mus(data, Sig_eta, Sig_nu, Phi, Sig_e1)
Arguments
data |
Numeric matrix of dimension n \times p containing the observed multivariate time series. |
Sig_eta |
Numeric p \times p covariance matrix \Sigma_{\boldsymbol{\eta}} of the random walk innovations. |
Sig_nu |
Numeric p \times p covariance matrix \Sigma_{\boldsymbol{\nu}} of the VAR(1) innovations. |
Phi |
Numeric p \times p autoregressive coefficient matrix \Phi. |
Sig_e1 |
Numeric p \times p covariance matrix \Gamma_{\boldsymbol{\epsilon}}(0) of the stationary VAR(1) component, e.g., as returned by get_Sig_e1_approx. |
Details
The algorithm performs forward and backward recursions to compute
the MLE of \boldsymbol{\mu}_t under the proposed model with Gaussian noise:
\mathbf{y}_t = \boldsymbol{\mu}_t + \boldsymbol{\epsilon}_t, \quad
\boldsymbol{\mu}_t = \boldsymbol{\mu}_{t-1} + \boldsymbol{\eta}_t, \quad
\boldsymbol{\epsilon}_t = \Phi \boldsymbol{\epsilon}_{t-1} + \boldsymbol{\nu}_t.
This estimation provides the smoothed mean trajectory that captures gradual fluctuations between change points, conditioned on the given model parameters.
Value
A numeric matrix of dimension n \times p, containing the estimated
fluctuating mean vectors \hat{\boldsymbol{\mu}}_t.
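An added sketch, applying the MLE to the first (shift-free) segment of a simulated series:
set.seed(123)
p <- 2
mu0 <- rep(0, p)
deltas <- list(c(3, -3))
Sig_eta <- diag(0.01, p)
Sig_nu <- random_Signu(p, 0)
Phi <- random_Phi(p, 0)
Sig_e1 <- get_Sig_e1_approx(Sig_nu, Phi)
Y <- generate_data(mu0, deltas, Sig_eta, Sig_nu, Phi, Sig_e1,
                   errortype = "n", number_cps = 1, lengthofeachpart = 100)
## MLE of the fluctuating mean on the first segment only
mu_hat <- estimate_mus(Y[1:100, ], Sig_eta, Sig_nu, Phi, Sig_e1)
dim(mu_hat)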
Estimate fluctuating mean segmentwise given detected change points
Description
Estimates the fluctuating mean sequence \{\boldsymbol{\mu}_t\}_{t=1}^n
segmentwise by applying the maximum likelihood estimation (MLE) procedure
within each segment defined by detected change points.
Usage
estimate_musseg(data, cps, Sig_eta, Sig_nu, Phi, Sig_e1)
Arguments
data |
Numeric matrix of dimension n \times p containing the observed multivariate time series. |
cps |
Numeric vector of detected change point locations (sorted indices). |
Sig_eta |
Numeric p \times p covariance matrix \Sigma_{\boldsymbol{\eta}} of the random walk innovations. |
Sig_nu |
Numeric p \times p covariance matrix \Sigma_{\boldsymbol{\nu}} of the VAR(1) innovations. |
Phi |
Numeric p \times p autoregressive coefficient matrix \Phi. |
Sig_e1 |
Numeric p \times p covariance matrix \Gamma_{\boldsymbol{\epsilon}}(0) of the stationary VAR(1) component, e.g., as returned by get_Sig_e1_approx. |
Details
The time series is partitioned into contiguous segments defined by the
specified change points. Within each segment,
estimate_mus is applied to obtain the maximum likelihood
estimate of the fluctuating mean sequence for that interval. The resulting
segment-wise estimates are then concatenated to form a complete piecewise
estimate of \boldsymbol{\mu}_t over the entire time series.
Value
A numeric matrix of dimension n \times p, containing the estimated
fluctuating mean sequence across all segments.
Examples
set.seed(123)
p <- 3
mu0 <- rep(0, p)
deltas <- list(c(3, 0, -3), c(-2, 4, 0))
Sig_eta <- diag(0.01, p)
Sig_nu <- random_Signu(p, 0)
Phi <- random_Phi(p, p)
Sig_e1 <- get_Sig_e1_approx(Sig_nu, Phi)
# Generate data and estimate mean segmentwise after known CPs
Y <- generate_data(mu0, deltas, Sig_eta, Sig_nu, Phi, Sig_e1,
errortype = "n", number_cps = 2, lengthofeachpart = 100)
cps <- c(100, 200)
mu_seg <- estimate_musseg(Y, cps, Sig_eta, Sig_nu, Phi, Sig_e1)
dim(mu_seg)
Fit VAR Model with Elastic Net via Cross Validation
Description
Estimates a (possibly high-dimensional) VAR model using penalized least squares with an elastic net penalty and cross validation. This function is adapted from the sparsevar package (<https://github.com/svazzole/sparsevar/tree/master>), which is distributed under the GNU General Public License v2. The code has been modified to specifically implement the elastic net penalty (penalty = "ENET") and cross validation (method = "cv").
Usage
fitVAR(data, p = 1, needReproduce = FALSE, ...)
Arguments
data |
A numeric matrix or data frame with time series data (observations in rows, variables in columns). |
p |
Integer. The order of the VAR model. |
... |
Additional options for estimation. Global options include:
|
Value
A list with the following components:
mu |
A vector of means for each variable. |
A |
A list (of length p) of estimated VAR coefficient matrices, one for each lag. |
fit |
(Optional) The complete results of the penalized least squares estimation. |
lambda |
The chosen lambda value (by cross validation). |
mse |
The minimum mean squared error from cross validation. |
mse_sd |
The standard deviation of the mean squared error. |
time |
Elapsed time for the estimation. |
series |
The (possibly transformed) input time series. |
residuals |
The residuals of the VAR model. |
sigma |
The estimated variance/covariance matrix of the residuals. |
References
The original source code is adapted from the sparsevar package, which is distributed under the GNU General Public License v2.
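An added sketch on a directly simulated VAR(1); only default options are used:
set.seed(123)
p <- 3
n <- 300
Phi <- random_Phi(p, 2)
## Simulate a simple VAR(1) with i.i.d. N(0, 1) innovations
Y <- matrix(0, n, p)
for (t in 2:n) Y[t, ] <- as.numeric(Phi %*% Y[t - 1, ]) + rnorm(p)
fit <- fitVAR(Y, p = 1)
round(fit$A[[1]], 2)   ## estimated lag-1 coefficient matrix
round(fit$sigma, 2)    ## estimated innovation covariance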
Generate multivariate time series from the proposed model
Description
Simulates a multivariate time series following the proposed model structure, where the mean component evolves as a random walk with abrupt shifts, overlaid by a stationary VAR(1) process to account for temporal and cross-sectional correlations.
Specifically, at each time point t = 1, \ldots, n, the data are
generated as
\mathbf{y}_t = \boldsymbol{\mu}_t + \boldsymbol{\epsilon}_t,
where, for t = 2, \ldots, n,
\boldsymbol{\mu}_t = \boldsymbol{\mu}_{t-1} + \boldsymbol{\eta}_t + \boldsymbol{\delta}_t,
and
\boldsymbol{\epsilon}_t = \Phi \boldsymbol{\epsilon}_{t-1} + \boldsymbol{\nu}_t.
Here, \boldsymbol{\eta}_t denotes the random walk innovation with
covariance \Sigma_{\boldsymbol{\eta}}, and
\boldsymbol{\nu}_t is the VAR(1) innovation with covariance
\Sigma_{\boldsymbol{\nu}}. The vector
\boldsymbol{\delta}_t is nonzero only at change points.
Usage
generate_data(
mu0,
deltas,
Sig_eta,
Sig_nu,
Phi,
Sig_e1,
errortype,
df = 10,
number_cps,
lengthofeachpart
)
Arguments
mu0 |
Numeric vector of length p specifying the initial mean vector. |
deltas |
A list of numeric vectors (one per change point), each representing the jump magnitude \boldsymbol{\delta} applied at that change point. |
Sig_eta |
Numeric p \times p covariance matrix \Sigma_{\boldsymbol{\eta}} of the random walk innovations \boldsymbol{\eta}_t. |
Sig_nu |
Numeric p \times p covariance matrix \Sigma_{\boldsymbol{\nu}} of the VAR(1) innovations \boldsymbol{\nu}_t. |
Phi |
Numeric p \times p autoregressive coefficient matrix \Phi. |
Sig_e1 |
Numeric p \times p covariance matrix \Gamma_{\boldsymbol{\epsilon}}(0) used to initialize the VAR(1) component, e.g., as returned by get_Sig_e1_approx. |
errortype |
Character; either "n" (Gaussian) or "t" (Student's t) specifying the distribution of the innovations. |
df |
Degrees of freedom for the t-distribution (used only when 'errortype = "t"'). Default is 10. |
number_cps |
Integer; number of change points. |
lengthofeachpart |
Integer; number of observations between consecutive change points (the common segment length). |
Details
The total length of the time series is given by
n = (number\_cps + 1) \times lengthofeachpart, so that the specified
change points partition the data into equally sized segments. When
\Sigma_{\boldsymbol{\eta}} = 0, the model reduces to a piecewise
constant mean process with no random walk component. When \Phi = 0,
the process reduces to a random walk model without vector autoregressive
dependence. If both \Sigma_{\boldsymbol{\eta}} = 0 and \Phi = 0,
the model simplifies to the classical piecewise constant setting commonly
used in multiple change point analysis. The two innovation components are
generated independently.
The innovations \boldsymbol{\eta}_t and \boldsymbol{\nu}_t are
drawn either from a multivariate normal distribution (when
errortype = "n") using mvrnorm, or from a
multivariate Student's t distribution (when errortype = "t") using
rmvt.
Value
A numeric matrix of dimension n \times p, with
n = (number\_cps+1)\,lengthofeachpart, containing the simulated
observations \{\mathbf{y}_t\}_{t=1}^n.
Examples
set.seed(123)
p <- 3
mu0 <- rep(0, p)
deltas <- list(c(3, 0, -3), c(-2, 4, 0))
Sig_eta <- diag(0.01, p)
Sig_nu <- random_Signu(p, 0)
Phi <- random_Phi(p, p)
Sig_e1 <- get_Sig_e1_approx(Sig_nu, Phi)
Y <- generate_data(mu0, deltas, Sig_eta, Sig_nu, Phi, Sig_e1,
errortype = "n", number_cps = 2, lengthofeachpart = 100)
dim(Y)
Approximate the long-run covariance matrix \Gamma_{\boldsymbol{\epsilon}}(0)
Description
Computes an approximate long-run covariance matrix
\Gamma_{\boldsymbol{\epsilon}}(0) for the stationary VAR(1)
process
\boldsymbol{\epsilon}_t = \Phi \boldsymbol{\epsilon}_{t-1} + \boldsymbol{\nu}_t,
where \boldsymbol{\nu}_t has innovation covariance
\Sigma_{\boldsymbol{\nu}}. The approximation is obtained via a
truncated series expansion up to order 'm'.
Usage
get_Sig_e1_approx(Sig_nu, Phi, m = 6)
Arguments
Sig_nu |
Numeric p \times p innovation covariance matrix \Sigma_{\boldsymbol{\nu}}. |
Phi |
Numeric p \times p autoregressive coefficient matrix \Phi of the stable VAR(1) process. |
m |
Integer (default = 6). Number of lag terms used in the truncated series expansion. Larger values yield higher accuracy but increase computation time. |
Details
For a stable VAR(1) process, the theoretical long-run covariance satisfies
\mathrm{vec}(\Gamma_{\boldsymbol{\epsilon}}(0)) =
(I_{p^2} - \Phi \otimes \Phi)^{-1} \mathrm{vec}(\Sigma_{\boldsymbol{\nu}}).
To avoid matrix inversion, this function approximates the inverse by the truncated Neumann series:
(I_{p^2} - \Phi \otimes \Phi)^{-1} \approx
I_{p^2} + \sum_{i=1}^{m} \Phi^{\otimes i},
where \Phi^{\otimes i} denotes the Kronecker product of
\Phi^i with itself. The truncation level 'm' controls the
approximation accuracy.
Value
A numeric p \times p matrix giving the approximate
\Gamma_{\boldsymbol{\epsilon}}(0).
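A short sketch (added for illustration) comparing the truncated approximation with the exact vectorized solution displayed in the Details:
set.seed(1)
p <- 3
Sig_nu <- random_Signu(p, 2)
Phi <- random_Phi(p, 2)
approx_m6 <- get_Sig_e1_approx(Sig_nu, Phi, m = 6)
## Exact long-run covariance via vec(Gamma) = (I - Phi (x) Phi)^{-1} vec(Sig_nu)
exact <- matrix(solve(diag(p^2) - kronecker(Phi, Phi), as.vector(Sig_nu)), p, p)
max(abs(approx_m6 - exact))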
Compute the covariance matrix \Sigma_{\mathrm{ALL}}^* for observations within a moving window
Description
Calculates the covariance matrix \Sigma_{\mathrm{ALL}}^* for all
observations within a moving window of length w, given the random walk
innovation covariance \Sigma_{\boldsymbol{\eta}}, the VAR(1) innovation
covariance \Sigma_{\boldsymbol{\nu}}, the autoregressive coefficient
matrix \Phi, and the initial-state covariance matrix
\Gamma_{\boldsymbol{\epsilon}}(0) (denoted here by 'Sig_e1').
The resulting matrix accounts for both the random walk component and
the temporal dependence introduced by the VAR(1) structure.
Usage
get_Sigs(w, p, Sig_eta, Sig_nu, Phi, Sig_e1)
Arguments
w |
Integer; window size. |
p |
Integer; data dimension. |
Sig_eta |
Numeric p \times p covariance matrix \Sigma_{\boldsymbol{\eta}} of the random walk innovations. |
Sig_nu |
Numeric p \times p covariance matrix \Sigma_{\boldsymbol{\nu}} of the VAR(1) innovations. |
Phi |
Numeric p \times p autoregressive coefficient matrix \Phi. |
Sig_e1 |
Numeric p \times p initial-state covariance matrix \Gamma_{\boldsymbol{\epsilon}}(0). |
Details
The function decomposes the overall covariance matrix
\Sigma_{\mathrm{ALL}}^* into two additive components corresponding to
the random walk contribution \Sigma_{\mathrm{RW}} and the
autoregressive contribution \Sigma_{\mathrm{AR}}, so that
\Sigma_{\mathrm{ALL}}^* = \Sigma_{\mathrm{RW}} + \Sigma_{\mathrm{AR}}.
When p = 1, the construction reduces to the scalar random walk and
AR(1) case, for which closed-form covariance expressions are available.
For higher-dimensional settings, block-matrix structures are constructed
using functions from the blockmatrix package to capture both
cross-sectional and temporal dependence. The returned inverse matrix
(\Sigma_{\mathrm{ALL}}^*)^{-1} is used in the main change point
detection algorithm to adjust for the effects of random walk drift and
vector autoregressive correlation.
Value
A list with the following components:
'Sig_AR' — Covariance contribution from the VAR(1) component (\Sigma_{\mathrm{AR}}).
'Sig_RW' — Covariance contribution from the random walk component (\Sigma_{\mathrm{RW}}).
'Sig_all' — Combined covariance matrix (\Sigma_{\mathrm{ALL}}^* = \Sigma_{\mathrm{AR}} + \Sigma_{\mathrm{RW}}).
'Sig_all_inv' — Inverse of the combined covariance matrix (\Sigma_{\mathrm{ALL}}^*)^{-1}.
Examples
set.seed(1)
p <- 3
w <- 20
Sig_eta <- diag(0.01, p)
Sig_nu <- random_Signu(p, 0)
Phi <- random_Phi(p, p)
Sig_e1 <- get_Sig_e1_approx(Sig_nu, Phi)
res <- get_Sigs(w, p, Sig_eta, Sig_nu, Phi, Sig_e1)
Evaluate change point detection accuracy metrics
Description
Computes standard evaluation metrics — bias, precision, recall, and F1-score — for change point detection results by comparing estimated change points against true ones within a specified tolerance (acceptance radius).
Usage
get_metrics(n, num_cps, est_cps, accept_radius)
Arguments
n |
Integer; total series length. |
num_cps |
Integer; true number of change points. |
est_cps |
Numeric vector of estimated change point locations. |
accept_radius |
Numeric; tolerance radius within which an estimated change point is considered correctly detected (a true positive). |
Details
True change points are assumed to occur at evenly spaced positions. An
estimated change point is counted as a true positive if it lies within
accept_radius of any true change point location. Estimated points
that do not match any true change point are classified as false positives,
while true change points that are not detected are counted as false
negatives. Bias is defined as the absolute difference between the true and
estimated numbers of change points.
The metrics are defined as:
\text{Precision} = \frac{TP}{TP + FP}, \quad
\text{Recall} = \frac{TP}{TP + FN}, \quad
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}.
Value
A list containing:
'bias' — Absolute difference between true and estimated number of change points.
'precision' — Proportion of correctly detected change points among all detections.
'recall' — Proportion of true change points correctly detected.
'F1' — Harmonic mean of precision and recall.
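A small numeric illustration (added sketch): with n = 300 and two true change points (placed at t = 100 and t = 200 by the even-spacing convention above), the estimates 98 and 204 are matched within the radius while 150 is a false positive:
m <- get_metrics(n = 300, num_cps = 2, est_cps = c(98, 150, 204), accept_radius = 10)
m$bias
m$precision
m$recall
m$F1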
Matrix inverse via Armadillo (Rcpp)
Description
Matrix inverse via Armadillo (Rcpp)
Usage
inver(A)
Arguments
A |
Numeric matrix. |
Value
The inverse of A.
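A minimal check against base R (added sketch):
set.seed(1)
A <- crossprod(matrix(rnorm(16), 4, 4)) + diag(4)  ## well-conditioned symmetric matrix
max(abs(inver(A) - solve(A)))                      ## should be numerically negligible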
Objective function for robust parameter estimation (RPE)
Description
Computes the objective value minimized in the robust parameter estimation (RPE) procedure.
Usage
objective_func(parameters, Var_ests)
Arguments
parameters |
Numeric vector of length three, containing the parameters \sigma_{\eta}^2, \sigma_{\nu}^2, and \phi. |
Var_ests |
Numeric vector of empirical variance estimates \{v^{(l)}\}_{l=1}^L computed from the lagged differences of the data. |
Details
For each lag l = 1, \ldots, L, the theoretical variance of the
l-lagged difference
z^{(l)}_t = y_{t+l} - y_t is calculated by
V_l = l\sigma_{\eta}^2 +
2\,\sigma_{\nu}^2 \frac{1 - \phi^l}{1 - \phi^2}.
The function returns the sum of squared deviations between
V_l and the empirical variance estimates v^{(l)}:
\sum_{l=1}^L \left(V_l - v^{(l)}\right)^2.
Value
A numeric scalar representing the objective value.
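An illustrative evaluation of the objective (added sketch; the parameter ordering c(sigma_eta^2, sigma_nu^2, phi) is an assumption and should be checked against estimate_RWVAR_cp):
set.seed(1)
y <- cumsum(rnorm(300, sd = 0.1)) + as.numeric(arima.sim(list(ar = 0.5), 300))
L <- 15
## MAD-based empirical variances of the l-lagged differences, l = 1, ..., L
Var_ests <- sapply(1:L, function(l) mad(diff(y, lag = l))^2)
## Assumed parameter order: (sigma_eta^2, sigma_nu^2, phi)
objective_func(c(0.01, 1, 0.5), Var_ests)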
Plot multivariate time series with detected change points and estimated means
Description
Visualizes multivariate time series data together with the estimated fluctuating mean sequence and detected change points obtained from the proposed change point detection (CPD) algorithm. Each variable is plotted in a separate panel (facet), with vertical dashed lines marking detected change points and solid curves showing the estimated means when provided.
Usage
plot_FluxPoint(data, muhats, cps, titlename = "", xaxis = "")
Arguments
data |
Numeric matrix of dimension n \times p containing the observed multivariate time series. |
muhats |
Optional numeric matrix of the same dimension as 'data', giving the estimated fluctuating mean sequence \{\hat{\boldsymbol{\mu}}_t\}_{t=1}^n. |
cps |
Numeric vector of detected change point locations. |
titlename |
Character string for the plot title. |
xaxis |
Character string for the x-axis label (e.g., "Time"). |
Details
When p = 1, the function produces a single plot displaying the
observed time series, the estimated mean curve, and vertical dashed lines
indicating the detected change points. When p > 1, each variable is
shown in a separate facet with independently scaled y-axes for improved
readability. If muhats is provided, the estimated mean is overlaid
using geom_line(); otherwise, only the observed data and detected
change points are displayed.
Value
A ggplot2 object displaying the time series, estimated means (if provided), and detected change points.
Randomly generate an autoregressive coefficient matrix \Phi
Description
Generates a p \times p autoregressive coefficient matrix
\Phi for the VAR(1) component in the proposed model. The diagonal
entries are randomly chosen from {0.5, -0.5}, and a specified number of
off-diagonal elements are randomly assigned nonzero values to introduce
cross-dependence among variables.
Usage
random_Phi(p, num_nonzero)
Arguments
p |
Integer. Dimension of the square matrix (p \times p). |
num_nonzero |
Integer. Target number of nonzero off-diagonal entries in \Phi. |
Details
The diagonal elements are sampled independently from the set
\{0.5, -0.5\}. Nonzero off-diagonal entries are then placed at random
positions until the total number of nonzero off-diagonal elements reaches
at least num_nonzero. Each nonzero off-diagonal element has magnitude
0.1 or 0.2 with equal probability and a randomly assigned sign. The resulting
matrix \Phi governs the temporal dependence of the stationary VAR(1)
process
\boldsymbol{\epsilon}_t = \Phi \boldsymbol{\epsilon}_{t-1} +
\boldsymbol{\nu}_t.
Value
A numeric p \times p matrix representing the autoregressive
coefficient matrix \Phi with random diagonal entries in
{0.5, -0.5} and approximately 'num_nonzero' nonzero off-diagonal
elements.
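A quick illustration (added sketch):
set.seed(1)
Phi <- random_Phi(p = 5, num_nonzero = 4)
diag(Phi)                               ## entries drawn from {0.5, -0.5}
sum(Phi[row(Phi) != col(Phi)] != 0)     ## roughly num_nonzero nonzero off-diagonals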
Randomly generate an innovation covariance matrix \Sigma_{\boldsymbol{\nu}}
Description
Generates a symmetric p \times p innovation covariance matrix
\Sigma_{\boldsymbol{\nu}} for the VAR(1) component in the proposed
model. The diagonal elements are fixed at 0.5, and a specified number of
off-diagonal elements are randomly assigned nonzero values to introduce
cross-correlation between variables.
Usage
random_Signu(p, num_nonzero)
Arguments
p |
Integer. Dimension of the covariance matrix (p \times p). |
num_nonzero |
Integer. Target number of nonzero off-diagonal entries
(counted individually; both upper and lower triangles are included).
Since nonzero values are inserted in symmetric pairs, an even value is
recommended. The maximum meaningful value is p^2 - p. |
Details
Each nonzero off-diagonal entry is placed in symmetric pairs
(i,j) and (j,i) to ensure symmetry of the matrix. The magnitudes
of the nonzero entries are randomly drawn from the set
\{0.1, 0.2\} with randomly assigned signs. The diagonal entries are
fixed at 0.5 to maintain a moderate level of innovation variance.
In the full model, \Sigma_{\boldsymbol{\nu}} governs the variability
of the VAR(1) innovation term \boldsymbol{\nu}_t in
\boldsymbol{\epsilon}_t = \Phi \boldsymbol{\epsilon}_{t-1} + \boldsymbol{\nu}_t.
Value
A numeric symmetric matrix of dimension p \times p representing
\Sigma_{\boldsymbol{\nu}} with diagonal 0.5 and approximately
'num_nonzero' nonzero off-diagonal entries.
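A quick illustration (added sketch):
set.seed(1)
Sig_nu <- random_Signu(p = 5, num_nonzero = 4)
isSymmetric(Sig_nu)
diag(Sig_nu)                                  ## fixed at 0.5
sum(Sig_nu[row(Sig_nu) != col(Sig_nu)] != 0)  ## roughly num_nonzero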
Split Coefficient Matrix into VAR Lags
Description
Splits a matrix of estimated coefficients into a list of matrices, each corresponding to one lag of the VAR model.
Usage
splitMatrix(M, p)
Arguments
M |
A numeric matrix of coefficients. |
p |
Integer. The order of the VAR model (number of lags). |
Value
A list of p matrices, each of dimension (number of variables) x (number of variables).
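A small sketch (added for illustration; the assumed input layout, with the p lag blocks placed side by side column-wise, is a guess and may differ from the internal convention):
M <- matrix(1:18, nrow = 3, ncol = 6)  ## assumed: 3 variables, 2 lag blocks side by side
A_list <- splitMatrix(M, p = 2)
length(A_list)                         ## documented: a list of p matrices
dim(A_list[[1]])                       ## documented: (number of variables) x (number of variables)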
Matrix square root via Armadillo (Rcpp)
Description
Matrix square root via Armadillo (Rcpp)
Usage
sqrtmat(A)
Arguments
A |
Numeric matrix. |
Value
The square root of A as a numeric matrix. If the result is complex due to rounding, the real part is returned.
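A minimal check (added sketch) that the returned square root reproduces the input:
set.seed(1)
A <- crossprod(matrix(rnorm(9), 3, 3)) + diag(3)  ## symmetric positive definite
S <- sqrtmat(A)
max(abs(S %*% S - A))                             ## should be close to zero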
Transform Data for VAR Estimation
Description
Transforms the input time series data into the design matrices required for VAR estimation. This includes centering, optional scaling, and constructing the lagged predictor matrix.
Usage
transformData(data, p, opt)
Arguments
data |
A numeric matrix or data frame with time series data (observations in rows, variables in columns). |
p |
Integer. The order of the VAR model (number of lags). |
opt |
List. Options for data transformation. Supported options include:
|
Value
A list with the following components:
X |
The design matrix (via the Kronecker product) for lagged predictors. |
y |
A vectorized response corresponding to the lagged data. |
series |
The (centered and possibly scaled) original time series matrix. |
mu |
A row vector of the column means used for centering. |