Type: Package
Title: Extreme Value Modeling for r-Largest Order Statistics
Version: 0.1.0
Description: Tools for extreme value modeling based on the r-largest order statistics framework. The package provides functions for parameter estimation via maximum likelihood, return level estimation with standard errors, profile likelihood-based confidence intervals, random sample generation, and entropy difference tests for selecting the number of order statistics r. Several r-largest order statistics models are implemented, including the four-parameter kappa (rK4D), generalized logistic (rGLO), generalized Gumbel (rGGD), logistic (rLD), and Gumbel (rGD) distributions. The rK4D methodology is described in Shin et al. (2022) <doi:10.1016/j.wace.2022.100533>, the rGLO model in Shin and Park (2024) <doi:10.1007/s00477-023-02642-7>, and the rGGD model in Shin and Park (2025) <doi:10.1038/s41598-024-83273-y>. The underlying distributions are related to the kappa distribution of Hosking (1994) <doi:10.1017/CBO9780511529443>, the generalized logistic distribution discussed by Ahmad et al. (1988) <doi:10.1016/0022-1694(88)90015-7>, and the generalized Gumbel distribution of Jeong et al. (2014) <doi:10.1007/s00477-014-0865-8>. Penalized likelihood approaches for extreme value estimation follow Martins and Stedinger (2000) <doi:10.1029/1999WR900330> and Coles and Dixon (1999) <doi:10.1023/A:1009905222644>. Selection of r is supported using methods discussed in Bader et al. (2017) <doi:10.1007/s11222-016-9697-3>. The package is intended for hydrological, climatological, and environmental extreme value analysis.
License: GPL-3
Encoding: UTF-8
RoxygenNote: 7.3.3
Depends: R (>= 4.1.0)
Imports: eva, graphics, lmomco, numDeriv, Rsolnp, stats
Suggests: testthat (>= 3.0.0)
Config/testthat/edition: 3
LazyData: true
URL: https://github.com/yire-shin/evmr
BugReports: https://github.com/yire-shin/evmr/issues
NeedsCompilation: no
Packaged: 2026-03-25 01:20:07 UTC; user
Author: Yire Shin [aut, cre], Jeong-Soo Park [aut, ths]
Maintainer: Yire Shin <shinyire87@gmail.com>
Repository: CRAN
Date/Publication: 2026-03-29 16:40:08 UTC

Bangkok Rainfall Data

Description

Annual top five daily rainfall events recorded in Bangkok, Thailand, from 1961 to 2018. The dataset contains the five largest daily rainfall amounts observed each year.

Usage

bangkok

Format

A data frame with 58 rows and 5 columns:

X1

Largest daily rainfall in the year (mm)

X2

Second largest daily rainfall (mm)

X3

Third largest daily rainfall (mm)

X4

Fourth largest daily rainfall (mm)

X5

Fifth largest daily rainfall (mm)

Details

The data are commonly used for extreme value analysis based on r-largest order statistics.

Each row corresponds to one year from 1961 to 2018 and contains the five largest daily rainfall observations recorded in that year.

Source

Rain gauge station records from Bangkok, Thailand.

References

Shin, Y. and Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

Examples

data(bangkok)
head(bangkok)


Bevern Stream Flow Data

Description

Annual r-largest stream flow observations from the Bevern River in the UK. The dataset contains the three largest daily stream flow values recorded in each year.

Usage

bevern

Format

A data frame with 52 rows and 4 columns:

Year

Year of observation

r1

Largest daily stream flow in the year

r2

Second largest daily stream flow

r3

Third largest daily stream flow

Details

This dataset is commonly used for extreme value analysis based on r-largest order statistics.

The data represent annual r-largest daily stream flow observations from the Bevern River. Each row corresponds to one year and contains the three largest daily stream flow measurements recorded in that year.

Source

Daily stream flow observations from United Kingdom hydrological records.

References

Shin, Y. and Park, J.-S. (2024). Generalized logistic model for r-largest order statistics, with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

Examples

data(bevern)
head(bevern)


Fit and Compare r-Largest Order Statistics Models

Description

Fit multiple extreme value models for r-largest order statistics and return a combined summary table including parameter estimates, standard errors, and return levels.

Usage

evmr(data, models = c("rk4d", "rglo", "rggd", "rgd", "rld"), num_inits = 100)

Arguments

data

A vector, matrix, or data frame containing r-largest order statistics.

models

Character vector specifying models to fit.

num_inits

Number of random initial values used in optimization.

Value

A data frame summarizing fitted models.

Examples


x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
evmr(x$rmat)

data(bangkok)
evmr(bangkok)


Oykel River Stream Flow Data

Description

Annual r-largest daily stream flow observations from the Oykel River in the United Kingdom. The dataset contains the three largest daily stream flow values recorded in each year.

Usage

oykel

Format

A data frame with 42 rows and 4 columns:

Year

Year of observation

r1

Largest daily stream flow in the year

r2

Second largest daily stream flow

r3

Third largest daily stream flow

Details

The data are used for extreme value analysis based on r-largest order statistics models.

Each row represents one year and contains the three largest daily stream flow observations recorded in that year. Missing observations are represented by NA.

Source

Daily stream flow observations from United Kingdom hydrological records.

References

Shin, Y. and Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

Examples

data(oykel)
head(oykel)


Quantile Function of the Gumbel Distribution

Description

Computes the quantiles of the Gumbel distribution with location parameter loc and scale parameter scale.

Usage

qgd(p, loc = 0, scale = 1)

Arguments

p

A numeric vector of probabilities in (0,1).

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

Details

The quantile function of the Gumbel distribution is

Q(p) = \mu - \sigma \log(-\log(p)),

where \mu is the location parameter and \sigma > 0 is the scale parameter.
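The formula above can be evaluated directly in base R. The following is a minimal sketch only; the helper name gumbel_quantile is hypothetical and is not the source of qgd.

```r
# Hypothetical helper illustrating Q(p) = mu - sigma * log(-log(p)).
# This is a sketch of the formula, not the package's implementation of qgd().
gumbel_quantile <- function(p, loc = 0, scale = 1) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  loc - scale * log(-log(p))
}

# Lower tail, median, and upper tail of the standard Gumbel distribution
q <- gumbel_quantile(c(0.1, 0.5, 0.9))
print(q)

# Plugging the quantiles back into the Gumbel CDF recovers the probabilities
print(exp(-exp(-q)))
```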

Value

A numeric vector of quantiles corresponding to p.

Examples

qgd(0.5, loc = 0, scale = 1)
qgd(c(0.1, 0.5, 0.9), loc = 0, scale = 1)

Quantile Function of the Generalized Gumbel Distribution

Description

Computes the quantiles of the generalized Gumbel distribution with location parameter loc, scale parameter scale, and shape parameter shape.

Usage

qggd(p, loc = 0, scale = 1, shape = 0)

Arguments

p

A numeric vector of probabilities in (0,1).

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

shape

A numeric value specifying the shape parameter.

Details

The quantile function is computed as

Q(p) = \mu - \sigma \log \left( \frac{1 - p^h}{h} \right), \quad h \neq 0,

with the limiting case

Q(p) = \mu - \sigma \log(-\log p), \quad h = 0,

where \mu is the location parameter, \sigma > 0 is the scale parameter, and h is the shape parameter.
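As an illustration of the two cases above, a base-R sketch might switch to the limiting form when the shape is numerically zero. The helper name ggd_quantile and the tolerance 1e-8 are assumptions made for this example and do not describe how qggd is implemented.

```r
# Hypothetical helper; a sketch of the formulas above, not the source of qggd().
ggd_quantile <- function(p, loc = 0, scale = 1, shape = 0) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  if (abs(shape) < 1e-8) {
    # Limiting case h = 0: (1 - p^h) / h tends to -log(p)
    return(loc - scale * log(-log(p)))
  }
  loc - scale * log((1 - p^shape) / shape)
}

p <- c(0.1, 0.5, 0.9)
print(ggd_quantile(p, shape = 0.1))

# For a very small shape the general formula approaches the Gumbel limit
print(max(abs(ggd_quantile(p, shape = 1e-6) - ggd_quantile(p, shape = 0))))
```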

Value

A numeric vector of quantiles corresponding to p.

References

Jeong, B.-Y., Murshed, M. S., Seo, Y. A., and Park, J.-S. (2014). A three-parameter kappa distribution with hydrologic application: a generalized Gumbel distribution. Stochastic Environmental Research and Risk Assessment, 28(8), 2063–2074.

Examples

qggd(0.5, loc = 0, scale = 1, shape = 0.1)
qggd(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape = 0.1)

Quantile Function of the Generalized Logistic Distribution

Description

Computes the quantiles of the generalized logistic distribution with location parameter loc, scale parameter scale, and shape parameter shape.

Usage

qglo(p, loc = 0, scale = 1, shape = 0)

Arguments

p

A numeric vector of probabilities in (0,1).

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

shape

A numeric value specifying the shape parameter.

Details

The quantile function is computed as

Q(p) = \mu + \frac{\sigma}{\xi}\left[1 - \left(\frac{1-p}{p}\right)^{\xi}\right], \quad \xi \neq 0,

with the limiting case

Q(p) = \mu - \sigma \log\left(\frac{1-p}{p}\right), \quad \xi = 0,

where \mu is the location parameter, \sigma > 0 is the scale parameter, and \xi is the shape parameter.
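A minimal base-R sketch of these formulas follows. The helper name glo_quantile and the zero tolerance are hypothetical choices for illustration; the sketch is not taken from qglo.

```r
# Hypothetical helper; a sketch of the formulas above, not the source of qglo().
glo_quantile <- function(p, loc = 0, scale = 1, shape = 0) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  odds <- (1 - p) / p
  if (abs(shape) < 1e-8) {
    # Limiting case xi = 0: the ordinary logistic quantile
    return(loc - scale * log(odds))
  }
  loc + scale / shape * (1 - odds^shape)
}

p <- c(0.1, 0.5, 0.9)
print(glo_quantile(p, loc = 0, scale = 1, shape = 0.1))

# At p = 0.5 the odds term equals 1, so the median is the location parameter
print(glo_quantile(0.5, loc = 3, scale = 2, shape = 0.1))
```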

Value

A numeric vector of quantiles corresponding to p.

References

Ahmad, M. I., Sinclair, C. D., and Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7

Examples

qglo(0.5, loc = 0, scale = 1, shape = 0.1)
qglo(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape = 0.1)

Quantile Function of the Four-Parameter Kappa Distribution

Description

Computes the quantiles of the four-parameter kappa distribution with location parameter loc, scale parameter scale, and shape parameters shape1 and shape2.

Usage

qk4d(p, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)

Arguments

p

A numeric vector of probabilities in (0,1).

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

shape1

A numeric value specifying the first shape parameter.

shape2

A numeric value specifying the second shape parameter.

Details

The quantile function of the four-parameter kappa distribution is

Q(p) = \mu + \frac{\sigma}{\xi}\left[1 - \left(\frac{1-p^h}{h}\right)^\xi \right],

where \mu is the location parameter, \sigma > 0 is the scale parameter, and \xi and h are shape parameters.

For numerical stability, the limiting cases \xi = 0 and/or h = 0 are handled separately.
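One way to handle the limiting cases is to evaluate the inner and outer terms separately and substitute their logarithmic limits near zero. The sketch below assumes that shape1 corresponds to \xi and shape2 to h, which the text above suggests but does not state explicitly; the helper name k4d_quantile and the tolerance are hypothetical, and the code is not taken from qk4d.

```r
# Hypothetical helper; assumes shape1 = xi and shape2 = h.
k4d_quantile <- function(p, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  eps <- 1e-8
  # Inner term (1 - p^h) / h, with limit -log(p) as h -> 0
  inner <- if (abs(shape2) < eps) -log(p) else (1 - p^shape2) / shape2
  # Outer term (1 - inner^xi) / xi, with limit -log(inner) as xi -> 0
  outer <- if (abs(shape1) < eps) -log(inner) else (1 - inner^shape1) / shape1
  loc + scale * outer
}

p <- c(0.1, 0.5, 0.9)
print(k4d_quantile(p, shape1 = 0.1, shape2 = 0.1))

# With both shapes equal to zero the formula reduces to the Gumbel quantile
print(k4d_quantile(p, shape1 = 0, shape2 = 0))
```

Under the same assumption, setting shape2 = -1 turns the inner term into (1 - p) / p, which reproduces the generalized logistic quantile.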

Value

A numeric vector of quantiles corresponding to p.

References

Shin, Y., and Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.

Examples

qk4d(0.5, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
qk4d(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)

Quantile Function of the Logistic Distribution

Description

Computes the quantiles of the logistic distribution with location parameter loc and scale parameter scale.

Usage

qld(p, loc = 0, scale = 1)

Arguments

p

A numeric vector of probabilities in (0,1).

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

Details

The quantile function of the logistic distribution is

Q(p) = \mu + \sigma \log\left(\frac{p}{1-p}\right),

where \mu is the location parameter and \sigma > 0 is the scale parameter.
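The formula above is the same quantile that stats::qlogis computes, which gives a simple cross-check. The helper name ld_quantile below is hypothetical and is used only for this illustration.

```r
# Hypothetical helper illustrating Q(p) = mu + sigma * log(p / (1 - p)).
ld_quantile <- function(p, loc = 0, scale = 1) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  loc + scale * log(p / (1 - p))
}

p <- c(0.1, 0.5, 0.9)
print(ld_quantile(p, loc = 1, scale = 2))

# Cross-check against the logistic quantile function in the stats package
print(all.equal(ld_quantile(p, loc = 1, scale = 2),
                stats::qlogis(p, location = 1, scale = 2)))
```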

Value

A numeric vector of quantiles corresponding to p.

Examples

qld(0.5, loc = 0, scale = 1)
qld(c(0.1, 0.5, 0.9), loc = 0, scale = 1)

Fit the Gumbel Distribution to r-Largest Order Statistics

Description

Fits the Gumbel distribution to r-largest order statistics using maximum likelihood estimation. Stationary and non-stationary models are supported through generalized linear modelling of the location and scale parameters.

Usage

rgd.fit(
  xdat,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  mulink = identity,
  siglink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  show = TRUE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

xdat

A numeric vector, matrix, or data frame of observations. Each row should contain decreasing order statistics for a given year or block. The first column therefore contains block maxima. Only the first r columns are used in the fitted model. If r is NULL, all available columns are used. If some rows contain fewer order statistics than others, missing values should be appended at the end of the corresponding rows.

r

The number of largest order statistics to use in the fitted model. If NULL, all columns of xdat are used.

ydat

A matrix or data frame of covariates for non-stationary modelling of the parameters, or NULL for a stationary model. The number of rows must match the number of rows of xdat.

mul, sigl

Integer vectors indicating which columns of ydat are used as covariates for the location and scale parameters, respectively. Use NULL for stationary parameters.

mulink, siglink

Inverse link functions for the location and scale parameters, respectively.

num_inits

The number of initial parameter sets used in the optimization.

muinit, siginit

Numeric vectors giving initial values for the location and scale parameters. If NULL, default initial values based on L-moments are used.

show

Logical. If TRUE, details of the fitted model are printed.

method

Optimization method passed to optim for stationary fits.

maxit

Maximum number of iterations for optim.

...

Additional control arguments passed to the optimizer.
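The layout expected for xdat (decreasing order statistics in each row, with missing values appended at the end of short rows) can be sketched in base R as follows. The helper r_largest_matrix and the toy data are hypothetical and serve only to illustrate the layout; the sketch takes the largest values per block as they are and does not address how events within a block should be selected or declustered.

```r
# Hypothetical helper: take the r largest values in each block, sort them in
# decreasing order, and pad blocks with fewer than r values with NA at the end.
r_largest_matrix <- function(values, block, r) {
  rows <- lapply(split(values, block), function(v) {
    top <- head(sort(v, decreasing = TRUE), r)   # sort() drops NA by default
    c(top, rep(NA_real_, r - length(top)))
  })
  out <- do.call(rbind, rows)
  colnames(out) <- paste0("r", seq_len(r))
  out
}

# Toy data: three "years" of values; the last year has only two records
values <- c(5, 9, 7, 3, 8, 2, 6, 4, 10, 1)
year <- c(rep(2001, 4), rep(2002, 4), rep(2003, 2))

xdat <- r_largest_matrix(values, year, r = 3)
print(xdat)
```

The first column of the result holds the block maxima, and the third row ends with NA because that block has only two observations.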

Value

A list with components including:

trans

Logical; TRUE if a non-stationary model is fitted.

model

A list containing mul and sigl.

link

A character string describing the inverse link functions.

conv

The convergence code returned by the optimizer. A value of 0 indicates successful convergence for optim.

nllh

The negative log-likelihood evaluated at the fitted parameters.

data

The data used in the fit.

mle

The maximum likelihood estimates.

cov

The estimated covariance matrix.

se

The estimated standard errors.

vals

A matrix containing fitted values of the location and scale parameters at each observation.

r

The number of order statistics used in the fitted model.

References

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

See Also

optim

Examples

x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rgd.fit(x$rmat)

Profile Likelihood for Return Levels under the rGD Model

Description

Computes and plots the profile log-likelihood for a return level under a stationary r-largest Gumbel distribution model fitted by rgd.fit().

Usage

rgd.prof(z, m, xlow, xup, conf = 0.95, nint = 100)

Arguments

z

An object returned by rgd.fit. The fitted model must be stationary.

m

A return period greater than 1. The profile likelihood is computed for the corresponding return level exceeded with probability 1/m.

xlow, xup

The lower and upper bounds of the return level grid over which the profile likelihood is evaluated.

conf

A numeric vector of confidence levels for profile likelihood confidence intervals.

nint

The number of grid points used to evaluate the profile likelihood.

Details

The function evaluates the profile log-likelihood over a grid of return level values and plots the resulting curve. Horizontal and vertical lines are added to indicate profile likelihood confidence intervals for the confidence levels specified in conf.
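The grid evaluation described above can be sketched in base R under some stated assumptions. The sketch uses the r-largest Gumbel likelihood as given in Coles (2001), reparameterizes the location through the return level, and maximizes over the scale at each grid point with optimize. The simulated data, the function names nll and prof_nll, the grid limits, and the search interval for the scale are all hypothetical; this is not the code of rgd.prof.

```r
set.seed(1)

# Simulate r-largest Gumbel data (cumulative products of uniforms mapped
# through the Gumbel quantile function); values decrease along each row
n <- 60; r <- 2; mu <- 10; sigma <- 2
u <- matrix(runif(n * r), n, r)
w <- u
for (j in seq_len(r)[-1]) w[, j] <- w[, j - 1] * u[, j]
x <- mu - sigma * log(-log(w))

# Negative log-likelihood of the r-largest Gumbel model
nll <- function(mu, sigma, x) {
  if (sigma <= 0) return(Inf)
  z <- (x - mu) / sigma
  sum(exp(-z[, ncol(x)])) + sum(log(sigma) + z)
}

# For a return level zm with period m: mu = zm + sigma * log(-log(1 - 1/m))
m <- 100
ym <- log(-log(1 - 1 / m))
prof_nll <- function(zm) {
  optimize(function(s) nll(zm + s * ym, s, x), interval = c(0.1, 20))$objective
}

grid <- seq(15, 30, length.out = 100)
pl <- -vapply(grid, prof_nll, numeric(1))        # profile log-likelihood

# Grid points within half a chi-squared(1) quantile of the maximum
cutoff <- max(pl) - 0.5 * qchisq(0.95, df = 1)
ci <- range(grid[pl >= cutoff])
print(ci)
```

Plotting pl against grid, with a horizontal line at cutoff and vertical lines at ci, gives the kind of display described above. The interval is only as fine as the grid, so a larger number of grid points gives sharper limits.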

Value

A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.

See Also

rgd.fit, rgd.rl

Examples

x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rgd.fit(x$rmat)
rgd.prof(fit, m = 100, xlow = 12, xup = 25)

Return Levels for the Gumbel Distribution

Description

Computes return levels and their standard errors for a stationary Gumbel model fitted by rgd.fit.

Usage

rgd.rl(z, year = c(20, 50, 100, 200), show = TRUE)

Arguments

z

An object returned by rgd.fit. The fitted model should represent a stationary model.

year

A numeric vector of return periods for which return levels are to be computed.

show

Logical. If TRUE, the estimated return levels and their standard errors are printed.

Details

For a return period T, the return level is defined as the quantile exceeded with probability 1/T. Under the Gumbel distribution, the return level is

x_T = \mu - \sigma \log\{-\log(1 - 1/T)\}.

Standard errors are obtained using the delta method.
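For the Gumbel return level the delta method takes a simple form, because x_T is linear in (\mu, \sigma) with gradient (1, y_T), where y_T = -\log\{-\log(1 - 1/T)\}. The sketch below uses hypothetical estimates and a hypothetical covariance matrix; the function name gumbel_rl is also an assumption and the code is not taken from rgd.rl.

```r
# Hypothetical fitted values: estimates (mu, sigma) and their covariance matrix
mle <- c(10, 2)
covmat <- matrix(c(0.09, 0.02,
                   0.02, 0.04), nrow = 2, byrow = TRUE)

gumbel_rl <- function(mle, covmat, year = c(20, 50, 100, 200)) {
  yT <- -log(-log(1 - 1 / year))       # Gumbel reduced variate
  rl <- mle[1] + mle[2] * yT           # x_T = mu + sigma * y_T
  grad <- cbind(1, yT)                 # gradient of x_T w.r.t. (mu, sigma)
  # Delta method: var(x_T) = grad %*% covmat %*% t(grad), one row per period
  rlse <- sqrt(rowSums((grad %*% covmat) * grad))
  data.frame(year = year, rl = rl, rlse = rlse)
}

out <- gumbel_rl(mle, covmat)
print(out)
```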

Value

The input object z with two additional components:

rl

A numeric vector of estimated return levels.

rlse

A numeric vector of standard errors of the estimated return levels.

See Also

rgd.fit, rgd.prof

Examples

x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rgd.fit(x$rmat)
out <- rgd.rl(fit, year = c(20, 50, 100, 200))

Summary of Fitted rGD Models over Different Values of r

Description

Summarizes fitted Gumbel distribution models for r-largest order statistics over r = 1, \dots, R. For each value of r, the function fits the model using rgd.fit and computes return levels using rgd.rl.

Usage

rgd.summary(
  data,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  mulink = identity,
  siglink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  show = FALSE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

data

A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period.

r

Optional integer giving the maximum number of order statistics to summarize. If NULL, all available columns are used.

ydat

A matrix or data frame of covariates for generalized linear modelling of the parameters, or NULL for stationary fitting.

mul, sigl

Integer vectors indicating which columns of ydat are used as covariates for the location and scale parameters, respectively.

mulink, siglink

Inverse link functions for the location and scale parameters, respectively.

num_inits

Number of initial parameter sets used in optimization.

muinit, siginit

Optional initial values for the location and scale parameters.

show

Logical. If TRUE, print details from model fitting.

method

Optimization method passed to optim.

maxit

Maximum number of iterations for optimization.

...

Additional arguments passed to rgd.fit.

Value

A data frame summarizing, for each value of r, the fitted rGD model and its estimated return levels.

Examples

x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
rgd.summary(x$rmat)

Random Generation from the Gumbel Distribution for r-Largest Order Statistics

Description

Generates random samples from the Gumbel distribution for r-largest order statistics.

Usage

rgdr(n, r, loc = 0, scale = 1)

Arguments

n

A positive integer specifying the number of observations.

r

A positive integer specifying the number of order statistics for each observation.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

Details

The function first generates independent uniform random variables and then constructs decreasing variables through cumulative products. These are transformed using the Gumbel quantile function qgd.
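The construction described above can be sketched in base R as follows. The function name sim_rgd is hypothetical, and the sketch only mirrors the description (uniforms, cumulative products, Gumbel quantile transform); it is not the source of rgdr.

```r
set.seed(42)

# Hypothetical simulator following the description above
sim_rgd <- function(n, r, loc = 0, scale = 1) {
  umat <- matrix(runif(n * r), nrow = n, ncol = r)
  # Row-wise cumulative products give decreasing values in (0, 1)
  wmat <- umat
  for (j in seq_len(r)[-1]) wmat[, j] <- wmat[, j - 1] * umat[, j]
  # The Gumbel quantile function is increasing, so the ordering is preserved
  rmat <- loc - scale * log(-log(wmat))
  list(umat = umat, wmat = wmat, rmat = rmat)
}

x <- sim_rgd(n = 10, r = 3, loc = 0, scale = 1)
print(x$rmat)

# Each row is strictly decreasing, with the block maximum in the first column
print(all(apply(x$rmat, 1, function(v) all(diff(v) < 0))))
```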

Value

A list with components:

umat

An n x r matrix of independent uniform random numbers.

wmat

An n x r matrix of transformed uniform variables used to construct decreasing order statistics.

rmat

An n x r matrix of simulated r-largest order statistics from the Gumbel distribution.

Examples

x <- rgdr(n=10, r=3, loc = 0, scale = 1)
x$rmat

Fit the Generalized Gumbel Distribution to r-Largest Order Statistics

Description

Fits the generalized Gumbel distribution to r-largest order statistics using maximum likelihood estimation. Stationary and non-stationary models are supported through generalized linear modelling of the location, scale, and shape parameters.

Usage

rggd.fit(
  xdat,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  hl = NULL,
  mulink = identity,
  siglink = identity,
  hlink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  hinit = NULL,
  show = TRUE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

xdat

A numeric vector, matrix, or data frame of observations. Each row should contain decreasing order statistics for a given year or block. The first column therefore contains block maxima. Only the first r columns are used in the fitted model. If r is NULL, all available columns are used. If some rows contain fewer order statistics than others, missing values should be appended at the end of the corresponding rows.

r

The number of largest order statistics to use in the fitted model. If NULL, all columns of xdat are used.

ydat

A matrix or data frame of covariates for non-stationary modelling of the parameters, or NULL for a stationary model. The number of rows must match the number of rows of xdat.

mul, sigl, hl

Integer vectors indicating which columns of ydat are used as covariates for the location, scale, and shape parameters, respectively. Use NULL for stationary parameters.

mulink, siglink, hlink

Inverse link functions for the location, scale, and shape parameters, respectively.

num_inits

The number of initial parameter sets used in the optimization.

muinit, siginit, hinit

Numeric vectors giving initial values for the location, scale, and shape parameters. If NULL, default initial values based on L-moments are used.

show

Logical. If TRUE, details of the fitted model are printed.

method

Optimization method passed to optim for stationary fits.

maxit

Maximum number of iterations for optim.

...

Additional control arguments passed to the optimizer.

Value

A list with components including:

trans

Logical; TRUE if a non-stationary model is fitted.

model

A list containing mul, sigl, and hl.

link

A character vector describing the inverse link functions.

conv

The convergence code returned by the optimizer.

nllh

The negative log-likelihood evaluated at the fitted parameters.

data

The data used in the fit.

mle

The maximum likelihood estimates.

cov

The estimated covariance matrix when available.

se

The estimated standard errors when available.

vals

A matrix containing fitted values of the location, scale, and shape parameters at each observation.

r

The number of order statistics used in the fitted model.

References

Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

See Also

optim

Examples

x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat)

Profile Likelihood for Return Levels under the rGGD Model

Description

Computes and plots the profile log-likelihood for a return level under a stationary r-largest generalized Gumbel distribution (rGGD) model fitted by rggd.fit.

Usage

rggd.prof(z, m, xlow, xup, conf = 0.95, nint = 100)

Arguments

z

An object returned by rggd.fit. The fitted model must represent a stationary model.

m

A return period greater than 1. The profile likelihood is computed for the corresponding return level exceeded with probability 1/m.

xlow, xup

Lower and upper bounds of the return level grid over which the profile likelihood is evaluated.

conf

A numeric vector of confidence levels for profile likelihood confidence intervals.

nint

The number of grid points used to evaluate the profile likelihood.

Details

The function evaluates the profile log-likelihood over a grid of return level values and plots the resulting curve. Horizontal and vertical lines are added to indicate profile likelihood confidence intervals for the confidence levels specified in conf.

Value

A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.

References

Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

See Also

rggd.fit, rggd.rl

Examples

x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat)
rggd.prof(fit, m = 100, xlow = 12, xup = 30)

Return Levels for the Generalized Gumbel Distribution

Description

Computes return levels and their standard errors for a stationary generalized Gumbel model fitted by rggd.fit.

Usage

rggd.rl(z, year = c(20, 50, 100, 200), show = TRUE)

Arguments

z

An object returned by rggd.fit. The fitted model should represent a stationary model.

year

A numeric vector of return periods for which return levels are to be computed.

show

Logical. If TRUE, the estimated return levels and their standard errors are printed.

Details

For a return period T, the return level is defined as the quantile exceeded with probability 1/T. Under the generalized Gumbel distribution, the return level is

x_T = \mu - \sigma \log\left(\frac{1-(1-1/T)^h}{h}\right), \quad h \neq 0.

Standard errors are obtained using the delta method.
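As a sketch of the delta method for this return level, the partial derivatives of x_T with respect to (\mu, \sigma, h) can be written out analytically for h \neq 0. The estimates and covariance matrix below are hypothetical, as is the function name ggd_rl; the code illustrates the calculation and is not taken from rggd.rl.

```r
# Hypothetical helper: return levels and delta-method standard errors, h != 0
ggd_rl <- function(mle, covmat, year = c(20, 50, 100, 200)) {
  mu <- mle[1]; sigma <- mle[2]; h <- mle[3]
  stopifnot(sigma > 0, h != 0)
  p <- 1 - 1 / year
  a <- (1 - p^h) / h
  rl <- mu - sigma * log(a)
  # Partial derivatives of x_T with respect to mu, sigma, and h
  d_mu <- rep(1, length(year))
  d_sigma <- -log(a)
  d_h <- sigma * (p^h * log(p) / (1 - p^h) + 1 / h)
  grad <- cbind(d_mu, d_sigma, d_h)
  rlse <- sqrt(rowSums((grad %*% covmat) * grad))
  data.frame(year = year, rl = rl, rlse = rlse)
}

# Hypothetical estimates (mu, sigma, h) and a diagonal covariance matrix
mle <- c(10, 2, 0.1)
covmat <- diag(c(0.09, 0.04, 0.01))
out <- ggd_rl(mle, covmat)
print(out)
```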

Value

The input object z with two additional components:

rl

A numeric vector of estimated return levels.

rlse

A numeric vector of standard errors of the estimated return levels.

See Also

rggd.fit, rggd.prof

Examples

x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat)
out <- rggd.rl(fit, year = c(20, 50, 100, 200))

Summary of Fitted rGGD Models over Different Values of r

Description

Summarizes fitted generalized Gumbel distribution models for r-largest order statistics over r = 1, \dots, R. For each value of r, the function fits the model using rggd.fit and computes return levels using rggd.rl.

Usage

rggd.summary(
  data,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  hl = NULL,
  mulink = identity,
  siglink = identity,
  hlink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  hinit = NULL,
  show = FALSE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

data

A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period.

r

Optional integer giving the maximum number of order statistics to summarize. If NULL, all available columns are used.

ydat

A matrix or data frame of covariates for generalized linear modelling of the parameters, or NULL for stationary fitting.

mul, sigl, hl

Integer vectors indicating which columns of ydat are used for the location, scale, and shape parameters, respectively.

mulink, siglink, hlink

Inverse link functions for the location, scale, and shape parameters, respectively.

num_inits

Number of initial parameter sets used in optimization.

muinit, siginit, hinit

Optional initial values for the location, scale, and shape parameters.

show

Logical. If TRUE, print details from model fitting.

method

Optimization method passed to optim.

maxit

Maximum number of iterations for optimization.

...

Additional arguments passed to rggd.fit.

Value

A data frame summarizing, for each value of r, the fitted rGGD model and its estimated return levels.

Examples

x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
rggd.summary(x$rmat)

Entropy Difference Test for rGGD Models

Description

Performs the entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized Gumbel distribution (rGGD) model.

Usage

rggdEd(data)

Arguments

data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest.

Details

The test compares the entropy of models fitted with r and r-1 order statistics and evaluates whether the additional order statistic provides significant information.

This function fits the rGGD model using rggd.fit and then computes the entropy difference test statistic by comparing the fitted likelihood contributions from models with r and r-1 order statistics.

Value

A list containing the results of the entropy difference test.

References

Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Examples

x <- rggdr(n=50, r=3, loc = 10, scale = 2, shape = 0.1)
rggdEd(x$rmat)

Sequential Entropy Difference Test for rGGD Models

Description

Performs the sequential entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized Gumbel distribution (rGGD) model.

Usage

rggdEdtest(data)

Arguments

data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one observation (or block), and columns must be ordered from largest to smallest.

Details

The procedure computes ED tests sequentially for r = 2, \dots, R and applies the ForwardStop and StrongStop stopping rules to control the false discovery rate.

The function sequentially applies the entropy difference test (rggdEd) for increasing values of r. The columns of data must represent decreasing order statistics within each row, with the first column containing the block maximum. The resulting p-values are adjusted using the ForwardStop and StrongStop procedures to help determine an appropriate value of r.
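The two stopping rules can be illustrated on a vector of sequential p-values. The sketch below uses the usual definitions of the ForwardStop and StrongStop transformations; the function name seq_stop and the p-values are hypothetical, and whether rggdEdtest computes the adjusted values in exactly this way is an assumption.

```r
# Hypothetical helper: ForwardStop and StrongStop transformations of
# p-values p_1, ..., p_m from an ordered sequence of tests
seq_stop <- function(p) {
  m <- length(p)
  k <- seq_len(m)
  # ForwardStop at step k: -(1 / k) * sum_{i <= k} log(1 - p_i)
  forward <- cumsum(-log(1 - p)) / k
  # StrongStop at step k: exp(sum_{j >= k} log(p_j) / j) * m / k
  strong <- rev(exp(cumsum(rev(log(p) / k)))) * m / k
  data.frame(step = k, p = p, ForwardStop = forward, StrongStop = strong)
}

# Hypothetical p-values from four sequential tests
p <- c(0.60, 0.45, 0.03, 0.01)
res <- seq_stop(p)
print(res)
```

Each rule compares its transformed values with the chosen significance level and selects the largest step at which the value is at or below that level.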

Value

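A data frame containing the sequential test results for each value of r, including the p-values adjusted by the ForwardStop and StrongStop procedures.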

References

Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

See Also

rggdEd, rggd.fit

Examples



x <- rggdr(n=50, r=3, loc = 10, scale = 2, shape = 0.1)
rggdEdtest(x$rmat)

data(bangkok)
rggdEdtest(bangkok)


Negative Log-Likelihood for the rGGD Model

Description

Computes the negative log-likelihood for the r-largest generalized Gumbel distribution (rGGD) model.

Usage

rggdLh(data, par)

Arguments

data

A numeric vector, matrix, or data frame of observations. If a vector is supplied, it is treated as a one-column matrix. If a matrix or data frame is supplied, each row is treated as one observation and columns represent decreasing order statistics.

par

A numeric vector of length 3 giving the location, scale, and shape parameters, respectively.

Details

This function is intended for internal likelihood evaluation in optimization. Invalid parameter combinations return Inf rather than stopping with an error, which makes the function more robust when used inside optimizers such as optim.
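The pattern of returning Inf for invalid parameters can be illustrated with a simpler stand-in. The sketch below uses the r-largest Gumbel likelihood (Coles, 2001) rather than the rGGD likelihood, and the function name rgd_nll and the simulated data are hypothetical. It shows only how an Inf guard lets optim with the Nelder-Mead method step past invalid scale values; it does not handle missing values.

```r
set.seed(123)

# Simulated r-largest Gumbel data with location 10 and scale 2
n <- 200; r <- 3
u <- matrix(runif(n * r), n, r)
w <- u
for (j in seq_len(r)[-1]) w[, j] <- w[, j - 1] * u[, j]
x <- 10 - 2 * log(-log(w))

# Stand-in negative log-likelihood that returns Inf instead of failing
rgd_nll <- function(par, data) {
  data <- as.matrix(data)               # a vector becomes a one-column matrix
  mu <- par[1]; sigma <- par[2]
  if (!is.finite(mu) || !is.finite(sigma) || sigma <= 0) return(Inf)
  z <- (data - mu) / sigma
  nll <- sum(exp(-z[, ncol(data)])) + sum(log(sigma) + z)
  if (is.finite(nll)) nll else Inf
}

# Invalid scale: the function returns Inf rather than raising an error
print(rgd_nll(c(10, -1), x))

start <- c(mean(x[, 1]), sd(x[, 1]))
fit <- optim(start, rgd_nll, data = x, method = "Nelder-Mead")
print(fit$par)
```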

Value

A single numeric value giving the negative log-likelihood. If the parameter combination is invalid, the function returns Inf.

References

Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

Examples

x <- rggdr(n=50, r=2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat, num_inits = 5)
rggdLh(data=fit$data,par=fit$mle)

Random Generation from the Generalized Gumbel Distribution for r-Largest Order Statistics

Description

Generates random samples from the generalized Gumbel distribution for r-largest order statistics.

Usage

rggdr(n, r, loc = 0, scale = 1, shape = 0.1)

Arguments

n

A positive integer specifying the number of observations.

r

A positive integer specifying the number of order statistics for each observation.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

shape

A numeric value specifying the shape parameter.

Details

The function first generates independent uniform random variables and then constructs decreasing variables through recursive transformations depending on the shape parameter. These are transformed using the generalized Gumbel quantile function qggd.

For valid generation, the shape parameter must satisfy 1 - (j-1)h > 0 for j = 2, \dots, r, which implies h < 1/(r-1) when r > 1.
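The constraint above is easy to check before simulating. The helper name ggd_shape_ok below is hypothetical and only restates the condition; it is not part of the package.

```r
# Hypothetical check of 1 - (j - 1) * h > 0 for j = 2, ..., r
ggd_shape_ok <- function(shape, r) {
  if (r <= 1) return(TRUE)              # no constraint when r = 1
  j <- 2:r
  all(1 - (j - 1) * shape > 0)
}

print(ggd_shape_ok(0.1, r = 3))   # TRUE:  0.1 is below 1 / (3 - 1) = 0.5
print(ggd_shape_ok(0.6, r = 3))   # FALSE: 0.6 is not below 0.5
```

Negative shape values always satisfy the condition, so the bound only restricts positive shapes.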

Value

A list with components:

umat

An n x r matrix of independent uniform random numbers.

wmat

An n x r matrix of transformed uniform variables used to construct decreasing order statistics.

rmat

An n x r matrix of simulated r-largest order statistics from the generalized Gumbel distribution.

Examples

x <- rggdr(n = 10, r = 3, loc = 10, scale = 2, shape = 0.1)
x$rmat

Fit the Generalized Logistic Distribution to r-Largest Order Statistics

Description

Fits the generalized logistic distribution to r-largest order statistics using maximum likelihood estimation. Stationary and non-stationary models are supported through generalized linear modelling of the location, scale, and shape parameters.

Usage

rglo.fit(
  xdat,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  shl = NULL,
  mulink = identity,
  siglink = identity,
  shlink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  shinit = NULL,
  show = TRUE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

xdat

A numeric vector, matrix, or data frame of observations. Each row should contain decreasing order statistics for a given year or block. The first column therefore contains block maxima. Only the first r columns are used in the fitted model. If r is NULL, all available columns are used. If some rows contain fewer order statistics than others, missing values should be appended at the end of the corresponding rows.

r

The number of largest order statistics to use in the fitted model. If NULL, all columns of xdat are used.

ydat

A matrix or data frame of covariates for non-stationary modelling of the parameters, or NULL for a stationary model. The number of rows must match the number of rows of xdat.

mul, sigl, shl

Integer vectors indicating which columns of ydat are used as covariates for the location, scale, and shape parameters, respectively. Use NULL for stationary parameters.

mulink, siglink, shlink

Inverse link functions for the location, scale, and shape parameters, respectively.

num_inits

The number of initial parameter sets used in the optimization.

muinit, siginit, shinit

Numeric vectors giving initial values for the location, scale, and shape parameters. If NULL, default initial values based on L-moments are used.

show

Logical. If TRUE, details of the fitted model are printed.

method

Optimization method passed to optim for stationary fits.

maxit

Maximum number of iterations for optim.

...

Additional control arguments passed to the optimizer.

Value

A list with components including:

trans

Logical; TRUE if a non-stationary model is fitted.

model

A list containing mul, sigl, and shl.

link

A character vector describing the inverse link functions.

conv

The convergence code returned by the optimizer.

nllh

The negative log-likelihood evaluated at the fitted parameters.

data

The data used in the fit.

mle

The maximum likelihood estimates.

cov

The estimated covariance matrix when available.

se

The estimated standard errors when available.

vals

A matrix containing fitted values of the location, scale, and shape parameters at each observation.

r

The number of order statistics used in the fitted model.

References

Ahmad, M. I., Sinclair, C. D., and Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

See Also

optim

Examples

x <- rglor(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat, num_inits = 5)
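As an illustration of the layout described for xdat, the following base R sketch builds a matrix of the r largest values per block from raw observations and pads short blocks with NA at the end of the row. The helper name top_r_matrix and the toy data are assumptions for illustration and are not part of the package.

```r
# Hypothetical helper: one row per block, columns in decreasing order,
# NA appended when a block has fewer than r observations.
top_r_matrix <- function(values, block, r) {
  rows <- lapply(split(values, block), function(v) {
    v <- sort(v, decreasing = TRUE)
    length(v) <- r          # truncates to r values or pads with NA
    v
  })
  do.call(rbind, rows)
}

values <- c(3.1, 7.4, 5.0, 2.2, 9.8, 6.3, 4.4)
block  <- c(2001, 2001, 2001, 2002, 2002, 2002, 2003)
xdat <- top_r_matrix(values, block, r = 2)
xdat
```

The first column then holds the block maxima, and the last block, which has a single observation, carries an NA in its second column.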

Profile Likelihood for Return Levels under the rGLO Model

Description

Computes and plots the profile log-likelihood for a return level under a stationary r-largest generalized logistic distribution (rGLO) model fitted by rglo.fit.

Usage

rglo.prof(z, m, xlow, xup, conf = 0.95, nint = 100)

Arguments

z

An object returned by rglo.fit. The fitted model must represent a stationary model.

m

A return period greater than 1. The profile likelihood is computed for the corresponding return level exceeded with probability 1/m.

xlow, xup

Lower and upper bounds of the return level grid over which the profile likelihood is evaluated.

conf

A numeric vector of confidence levels for profile likelihood confidence intervals.

nint

The number of grid points used to evaluate the profile likelihood.

Details

The function evaluates the profile log-likelihood over a grid of return level values and plots the resulting curve. Horizontal and vertical lines are added to indicate profile likelihood confidence intervals for the confidence levels specified in conf.
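A common convention (see Coles, 2001) is to cut the profile log-likelihood at its maximum minus half a chi-squared quantile with one degree of freedom. The sketch below applies that rule to a toy quadratic curve. The names prof_ci, grid, and pll are hypothetical, and rglo.prof may locate the interval end points differently.

```r
# Hypothetical helper: read a confidence interval off a profile curve.
prof_ci <- function(grid, pll, conf = 0.95) {
  cutoff <- max(pll) - 0.5 * qchisq(conf, df = 1)
  inside <- grid[pll >= cutoff]
  c(lower = min(inside), upper = max(inside))
}

# Toy profile: quadratic log-likelihood centred at 18 with spread 2
grid <- seq(12, 25, length.out = 1301)
pll <- -0.5 * ((grid - 18) / 2)^2
ci <- prof_ci(grid, pll, conf = 0.95)
ci
```

For this quadratic curve the interval agrees with the normal-theory interval 18 plus or minus 2 times the 0.975 normal quantile, up to the grid spacing.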

Value

A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.

References

Ahmad, M. I., Sinclair, C. D., and Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

See Also

rglo.fit, rglo.rl

Examples


x <- rglor(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat)
rglo.prof(fit, m = 100, xlow = 12, xup = 25)


Return Levels for the Generalized Logistic Distribution

Description

Computes return levels and their standard errors for a stationary generalized logistic model fitted by rglo.fit.

Usage

rglo.rl(z, year = c(20, 50, 100, 200), show = TRUE)

Arguments

z

An object returned by rglo.fit. The fitted model should represent a stationary model.

year

A numeric vector of return periods for which return levels are to be computed.

show

Logical. If TRUE, the estimated return levels and their standard errors are printed.

Details

For a return period T, the return level is defined as the quantile exceeded with probability 1/T. Under the generalized logistic distribution, the return level is

x_T = \mu + \frac{\sigma}{\xi} \left[1 - \left(\frac{1 - 1/T}{1/T}\right)^{-\xi}\right],

which is equivalently written in the implementation as

x_T = \mu + \frac{\sigma}{\xi} - \frac{\sigma}{\xi} \left(\frac{1/T}{1 - 1/T}\right)^{\xi}.

Standard errors are obtained using the delta method.
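The formula and the delta method can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the helper names glo_rl and glo_rl_se are hypothetical, the gradient is taken by central finite differences, and the parameter values and covariance matrix are made up.

```r
# Return level under the GLO parameterization given above.
glo_rl <- function(par, period) {
  mu <- par[1]; sigma <- par[2]; xi <- par[3]
  p <- 1 / period                      # exceedance probability
  mu + sigma / xi - sigma / xi * (p / (1 - p))^xi
}

# Delta method: sqrt(g' V g) with a finite-difference gradient g.
glo_rl_se <- function(par, cov, period, eps = 1e-6) {
  grad <- sapply(seq_along(par), function(k) {
    step <- replace(numeric(length(par)), k, eps)
    (glo_rl(par + step, period) - glo_rl(par - step, period)) / (2 * eps)
  })
  sqrt(drop(t(grad) %*% cov %*% grad))
}

par <- c(10, 2, 0.1)                 # illustrative estimates
cov <- diag(c(0.09, 0.04, 0.01))     # illustrative covariance matrix
glo_rl(par, period = 100)
glo_rl_se(par, cov, period = 100)
```

At period = 2 the odds term equals 1, so the return level reduces to the location parameter, which gives a quick sanity check of the formula.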

Value

The input object z with two additional components:

rl

A numeric vector of estimated return levels.

rlse

A numeric vector of standard errors of the estimated return levels.

References

Ahmad, M. I., Sinclair, C. D., and Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

See Also

rglo.fit, rglo.prof

Examples

x <- rglor(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat)
out <- rglo.rl(fit, year = c(20, 50, 100, 200))

Summary of Fitted rGLO Models over Different Values of r

Description

Summarizes fitted generalized logistic distribution models for r-largest order statistics over r = 1, \dots, R. For each value of r, the function fits the model using rglo.fit and computes return levels using rglo.rl.

Usage

rglo.summary(
  data,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  shl = NULL,
  mulink = identity,
  siglink = identity,
  shlink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  shinit = NULL,
  show = FALSE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

data

A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period.

r

Optional integer giving the maximum number of order statistics to summarize. If NULL, all available columns are used.

ydat

A matrix or data frame of covariates for generalized linear modelling of the parameters, or NULL for stationary fitting.

mul, sigl, shl

Integer vectors indicating which columns of ydat are used for the location, scale, and shape parameters, respectively.

mulink, siglink, shlink

Inverse link functions for the location, scale, and shape parameters, respectively.

num_inits

Number of initial parameter sets used in optimization.

muinit, siginit, shinit

Optional initial values for the location, scale, and shape parameters.

show

Logical. If TRUE, print details from model fitting.

method

Optimization method passed to optim.

maxit

Maximum number of iterations for optimization.

...

Additional arguments passed to rglo.fit.

Value

A data frame containing:

References

Ahmad, M. I., Sinclair, C. D., and Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

Examples

x <- rglor(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
rglo.summary(x$rmat, num_inits = 5)


Entropy Difference Test for rGLO Models

Description

Performs the entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized logistic distribution (rGLO) model.

Usage

rgloEd(data, par = NULL)

Arguments

data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest.

par

An optional numeric vector of length 3 giving the location, scale, and shape parameters. If NULL, the parameters are estimated using rglo.fit.

Details

The test compares the entropy of models fitted with r and r-1 order statistics and evaluates whether the additional order statistic provides significant information. If par is not supplied, the model parameters are first estimated using rglo.fit.

Value

A list containing:

References

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

See Also

rglo.fit, rgloLh

Examples


x <- rglor(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
rgloEd(x$rmat)


Sequential Entropy Difference Test for rGLO Models

Description

Performs the sequential entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized logistic distribution (rGLO) model.

Usage

rgloEdtest(data, par = NULL)

Arguments

data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one observation (or block), and columns must be ordered from largest to smallest.

par

An optional numeric vector of length 3 giving the location, scale, and shape parameters. If NULL, parameters are estimated separately at each value of r using rgloEd.

Details

The function applies the entropy difference test (rgloEd) sequentially for r = 2, \dots, R. The resulting p-values are then adjusted with the ForwardStop and StrongStop stopping rules, which are designed to control the false discovery rate and the familywise error rate, respectively, to help determine an appropriate value of r.
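The two stopping rules can be sketched in base R from their usual definitions for an ordered sequence of p-values, as discussed in Bader et al. (2017). The helper names seq_stop and last_rejection and the p-values are hypothetical; this is not the code used inside rgloEdtest.

```r
# Hypothetical helper: ForwardStop and StrongStop adjusted p-values
# for an ordered sequence p[1], ..., p[m].
seq_stop <- function(p) {
  m <- length(p)
  k <- seq_len(m)
  # ForwardStop: running mean of -log(1 - p)
  forward <- cumsum(-log(1 - p)) / k
  # StrongStop: exp of the tail sums of log(p[j]) / j, scaled by m / k
  strong <- rev(exp(cumsum(rev(log(p) / k)))) * m / k
  data.frame(k = k, p = p, ForwardStop = forward, StrongStop = strong)
}

# Largest index whose adjusted value is at most alpha (0 if none).
last_rejection <- function(adjusted, alpha = 0.05) {
  hits <- which(adjusted <= alpha)
  if (length(hits) == 0) 0L else max(hits)
}

p <- c(0.001, 0.004, 0.30, 0.62)
out <- seq_stop(p)
out
last_rejection(out$ForwardStop)
```

How the stopping index translates into a choice of r depends on the ordering of the hypotheses, which follows the cited paper.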

Value

A data frame containing:

References

Ahmad, M. I., Sinclair, C. D., and Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

See Also

rgloEd, rglo.fit

Examples


x <- rglor(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
rgloEdtest(x$rmat)

data(bangkok)
rgloEdtest(bangkok)


Log-Likelihood Contributions for the rGLO Model

Description

Computes the observation-wise log-likelihood contributions for the r-largest generalized logistic distribution (rGLO) model.

Usage

rgloLh(data, par)

Arguments

data

A numeric vector, matrix, or data frame of observations. If a vector is supplied, it is treated as a one-column matrix. If a matrix or data frame is supplied, each row is treated as one observation and columns represent decreasing order statistics.

par

A numeric vector of length 3 giving the location, scale, and shape parameters, respectively.

Details

This function is mainly intended for internal likelihood evaluation. Invalid parameter combinations return Inf rather than stopping with an error, which is more robust when the function is called inside iterative procedures.

Value

A numeric vector of log-likelihood contributions, one for each row of data. If the parameter combination is invalid, the function returns Inf.

References

Ahmad, M. I., Sinclair, C. D., and Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

Examples

x <- rglor(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat, num_inits = 5)
rgloLh(data = fit$data, par = fit$mle)

Random Generation from the Generalized Logistic Distribution for r-Largest Order Statistics

Description

Generates random samples from the generalized logistic distribution for r-largest order statistics.

Usage

rglor(n, r, loc = 0, scale = 1, shape = 0.1)

Arguments

n

A positive integer specifying the number of observations.

r

A positive integer specifying the number of order statistics for each observation.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

shape

A numeric value specifying the shape parameter.

Details

The function first generates independent uniform random variables and then constructs decreasing variables through recursive transformations. These are transformed using the generalized logistic quantile function qglo.

Value

A list with components:

umat

An n x r matrix of independent uniform random numbers.

wmat

An n x r matrix of transformed uniform variables used to construct decreasing order statistics.

rmat

An n x r matrix of simulated r-largest order statistics from the generalized logistic distribution.

References

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

Examples

x <- rglor(10, 3, loc = 0, scale = 1, shape = 0.1)
x$rmat

Fit the Four-Parameter Kappa Distribution to r-Largest Order Statistics

Description

Fits the four-parameter kappa distribution to r-largest order statistics using maximum likelihood estimation. Stationary and non-stationary models are supported through generalized linear modelling of the location, scale, and two shape parameters.

Usage

rk4d.fit(
  xdat,
  r = NULL,
  penk = NULL,
  penh = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  shl = NULL,
  hl = NULL,
  mulink = identity,
  siglink = identity,
  shlink = identity,
  hlink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  shinit = NULL,
  hinit = NULL,
  show = TRUE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

xdat

A numeric vector, matrix, or data frame of observations. Each row should contain decreasing order statistics for a given year or block. The first column therefore contains block maxima. Only the first r columns are used in the fitted model. If r is NULL, all available columns are used.

r

The number of largest order statistics to use in the fitted model. If NULL, all columns of xdat are used.

penk

Optional penalty for the first shape parameter. Supported values include "CD" and "MS".

penh

Optional penalty for the second shape parameter. Supported values include "MS" and "MSa".

ydat

A matrix or data frame of covariates for non-stationary modelling of the parameters, or NULL for a stationary model. The number of rows must match the number of rows of xdat.

mul, sigl, shl, hl

Integer vectors indicating which columns of ydat are used as covariates for the location, scale, first shape, and second shape parameters, respectively.

mulink, siglink, shlink, hlink

Inverse link functions for the location, scale, first shape, and second shape parameters, respectively.

num_inits

The number of initial parameter sets used in the optimization.

muinit, siginit, shinit, hinit

Numeric vectors giving initial values for the location, scale, first shape, and second shape parameters. If NULL, default initial values based on L-moments are used.

show

Logical. If TRUE, details of the fitted model are printed.

method

Optimization method passed to optim for stationary fits.

maxit

Maximum number of iterations for optim.

...

Additional control arguments passed to the optimizer.

Value

A list with components including:

trans

Logical; TRUE if a non-stationary model is fitted.

model

A list containing mul, sigl, shl, and hl.

link

A character vector describing the inverse link functions.

conv

The convergence code returned by the optimizer.

nllh

The negative log-likelihood evaluated at the fitted parameters.

data

The data used in the fit.

mle

The maximum likelihood estimates.

cov

The estimated covariance matrix when available.

se

The estimated standard errors when available.

vals

A matrix containing fitted values of the location, scale, first shape, and second shape parameters at each observation.

r

The number of order statistics used in the fitted model.

References

Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.

Martins, E. S., & Stedinger, J. R. (2000). Generalized maximum-likelihood generalized extreme-value quantile estimators for hydrologic data. Water Resources Research, 36(3), 737–744. doi:10.1029/1999WR900330

Coles, S., & Dixon, M. (1999). Likelihood-based inference for extreme value models. Extremes, 2(1), 5–23. doi:10.1023/A:1009905222644

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

See Also

optim

Examples

x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat, num_inits = 5)
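The role of the inverse link functions can be illustrated without fitting a model. In this base R sketch, a linear predictor built from one covariate is mapped to a scale value per observation, and using exp as the inverse link keeps the scale positive. The covariate, the coefficients, and the names ydat_demo and beta_sig are made up for illustration, and the design-matrix construction follows a common convention in extreme value software rather than a confirmed detail of rk4d.fit.

```r
# Illustrative covariate: a centred, scaled time trend for 50 blocks
ydat_demo <- matrix((1:50 - 25.5) / 50, ncol = 1)

# Design matrix: intercept plus the covariate column(s) selected by sigl
sigl <- 1
design <- cbind(1, ydat_demo[, sigl, drop = FALSE])

beta_sig <- c(log(2), 0.8)            # hypothetical coefficients
siglink <- exp                        # inverse link for the scale
sigma <- siglink(drop(design %*% beta_sig))

range(sigma)                          # strictly positive by construction
```

With the default identity link the same linear predictor could become negative for some coefficient values, which is one reason to choose a different inverse link for the scale.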

Profile Likelihood for Return Levels under the rK4D Model

Description

Computes and plots the profile log-likelihood for a return level under a stationary r-largest four-parameter kappa distribution (rK4D) model fitted by rk4d.fit.

Usage

rk4d.prof(z, m, xlow, xup, conf = 0.95, nint = 100)

Arguments

z

An object returned by rk4d.fit. The fitted model must represent a stationary model.

m

A return period greater than 1. The profile likelihood is computed for the corresponding return level exceeded with probability 1/m.

xlow, xup

Lower and upper bounds of the return level grid over which the profile likelihood is evaluated.

conf

A numeric vector of confidence levels for profile likelihood confidence intervals.

nint

The number of grid points used to evaluate the profile likelihood.

Details

The function evaluates the profile log-likelihood over a grid of return level values and plots the resulting curve. Horizontal and vertical lines are added to indicate profile likelihood confidence intervals for the confidence levels specified in conf.

Value

A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.

References

Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

See Also

rk4d.fit, rk4d.rl

Examples


x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat, num_inits = 100)
rk4d.prof(fit, m = 100, xlow = 12, xup = 25)


Return Levels for the Four-Parameter Kappa Distribution

Description

Computes return levels and their standard errors for a stationary four-parameter kappa model fitted by rk4d.fit.

Usage

rk4d.rl(z, year = c(20, 50, 100, 200), show = TRUE)

Arguments

z

An object returned by rk4d.fit. The fitted model should represent a stationary model.

year

A numeric vector of return periods for which return levels are to be computed.

show

Logical. If TRUE, the estimated return levels and their standard errors are printed.

Details

For a return period T, the return level is defined as the quantile exceeded with probability 1/T. Under the four-parameter kappa distribution, the return level is

x_T = \mu + \frac{\sigma}{\xi} - \frac{\sigma}{\xi} \left(\frac{1-(1-1/T)^h}{h}\right)^\xi,

and standard errors are obtained using the delta method.
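The return level formula above can be evaluated directly. In this minimal sketch the helper name k4d_rl and the parameter values are hypothetical; the second part checks numerically that setting h = -1 recovers the generalized logistic return level, since (1 - F^h)/h then equals (1 - F)/F.

```r
# Return level under the four-parameter kappa parameterization above.
k4d_rl <- function(par, period) {
  mu <- par[1]; sigma <- par[2]; xi <- par[3]; h <- par[4]
  prob <- 1 - 1 / period                       # non-exceedance probability
  mu + sigma / xi - sigma / xi * ((1 - prob^h) / h)^xi
}

periods <- c(20, 50, 100, 200)
k4d_rl(c(10, 2, 0.1, 0.1), periods)

# Special case h = -1: compare with the generalized logistic form.
glo_values <- 10 + 2 / 0.1 * (1 - ((1 / periods) / (1 - 1 / periods))^0.1)
all.equal(k4d_rl(c(10, 2, 0.1, -1), periods), glo_values)
```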

Value

The input object z with two additional components:

References

Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

See Also

rk4d.fit, rk4d.prof

Examples

x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat, num_inits = 5)
out <- rk4d.rl(fit, year = c(20, 50, 100, 200))

Summary of Fitted rK4D Models over Different Values of r

Description

Summarizes fitted four-parameter kappa distribution models for r-largest order statistics over r = 1, \dots, R. For each value of r, the function fits the model using rk4d.fit and computes return levels using rk4d.rl.

Usage

rk4d.summary(
  data,
  r = NULL,
  penk = NULL,
  penh = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  shl = NULL,
  hl = NULL,
  mulink = identity,
  siglink = identity,
  shlink = identity,
  hlink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  shinit = NULL,
  hinit = NULL,
  show = FALSE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

data

A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period.

r

Optional integer giving the maximum number of order statistics to summarize. If NULL, all available columns are used.

penk

Optional penalty for the first shape parameter (xi) in maximum penalized likelihood estimation. See rk4d.fit for supported values.

penh

Optional penalty for the second shape parameter (h) in maximum penalized likelihood estimation. See rk4d.fit for supported values.

ydat

A matrix or data frame of covariates for generalized linear modelling of the parameters, or NULL for stationary fitting.

mul, sigl, shl, hl

Integer vectors indicating which columns of ydat are used for the location, scale, first shape, and second shape parameters, respectively.

mulink, siglink, shlink, hlink

Inverse link functions for the location, scale, first shape, and second shape parameters, respectively.

num_inits

Number of initial parameter sets used in optimization.

muinit, siginit, shinit, hinit

Optional initial values for the location, scale, first shape, and second shape parameters.

show

Logical. If TRUE, print details from model fitting.

method

Optimization method passed to optim.

maxit

Maximum number of iterations for optimization.

...

Additional arguments passed to rk4d.fit.

Value

A data frame containing:

References

Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

Examples

x <- rk4dr(n = 50, r = 3, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
rk4d.summary(x$rmat, num_inits = 5)
# Refit with penalties on both shape parameters
rk4d.summary(x$rmat, penk = "CD", penh = "MS", num_inits = 5)
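For orientation, the general form of the two penalties named in the references can be written down from the cited papers. This sketch is an assumption about their usual form, not the package's code: the helper names cd_penalty, ms_prior, and penalized_nll are hypothetical, and because sign conventions for the shape parameter differ between sources, each helper states the convention it uses.

```r
# Coles and Dixon (1999): penalty on the GEV shape in the convention
# where positive values mean a heavy upper tail (alpha = lambda = 1).
cd_penalty <- function(shape, alpha = 1, lambda = 1) {
  ifelse(shape <= 0, 1,
         ifelse(shape >= 1, 0,
                exp(-lambda * (1 / (1 - shape) - 1)^alpha)))
}

# Martins and Stedinger (2000): Beta(6, 9) prior density on [-0.5, 0.5]
# for the shape in the Hosking convention (kappa).
ms_prior <- function(kappa, p = 6, q = 9) {
  dbeta(kappa + 0.5, p, q)
}

# A penalized negative log-likelihood subtracts the log of the penalty.
penalized_nll <- function(nll, penalty) nll - log(penalty)

cd_penalty(c(-0.2, 0.5, 1.2))
ms_prior(c(-0.3, -0.1, 0.2))
penalized_nll(100, cd_penalty(0.5))
```

A penalty of 1 leaves the likelihood unchanged, while a penalty of 0 makes the penalized negative log-likelihood infinite and so excludes that shape value.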


Entropy Difference Test for rK4D Models

Description

Performs the entropy difference (ED) test for selecting the number of order statistics in the r-largest four-parameter kappa distribution (rK4D) model.

Usage

rk4dEd(data)

Arguments

data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest.

Details

The test compares the entropy of models fitted with r and r-1 order statistics and evaluates whether the additional order statistic provides significant information. The rK4D model is fitted using rk4d.fit, and the test statistic is computed from the fitted likelihood contributions of the models with r and r-1 order statistics.

Value

A list containing:

References

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

See Also

rk4d.fit, rk4dLh

Examples


x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
rk4dEd(x$rmat)


Sequential Entropy Difference Test for rK4D Models

Description

Performs the sequential entropy difference (ED) test for selecting the number of order statistics in the r-largest four-parameter kappa distribution (rK4D) model.

Usage

rk4dEdtest(data)

Arguments

data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one observation (or block), and columns must be ordered from largest to smallest.

Details

The function applies the entropy difference test (rk4dEd) sequentially for r = 2, \dots, R. The resulting p-values are then adjusted with the ForwardStop and StrongStop stopping rules, which are designed to control the false discovery rate and the familywise error rate, respectively, to help determine an appropriate value of r.

Value

A data frame containing:

References

Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

See Also

rk4dEd, rk4d.fit

Examples


x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
rk4dEdtest(x$rmat)

data(bangkok)
rk4dEdtest(bangkok)


Log-Likelihood Contributions for the rK4D Model

Description

Computes the observation-wise log-likelihood contributions for the r-largest four-parameter kappa distribution (rK4D) model.

Usage

rk4dLh(data, par)

Arguments

data

A numeric vector, matrix, or data frame of observations. If a vector is supplied, it is treated as a one-column matrix. If a matrix or data frame is supplied, each row is treated as one observation and columns represent decreasing order statistics.

par

A numeric vector of length 4 giving the location, scale, first shape, and second shape parameters.

Value

A numeric vector of log-likelihood contributions for each row of data. If invalid parameter combinations occur, the function returns a large penalty value.

References

Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

Examples

x <- rk4dr(n = 50, r = 3, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat, num_inits = 5)
rk4dLh(data = fit$data, par = fit$mle)


Random Generation from the Four-Parameter Kappa Distribution for r-Largest Order Statistics

Description

Generates random samples from the four-parameter kappa distribution for r-largest order statistics.

Usage

rk4dr(n, r, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)

Arguments

n

A positive integer specifying the number of observations.

r

A positive integer specifying the number of order statistics for each observation.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

shape1

A numeric value specifying the first shape parameter.

shape2

A numeric value specifying the second shape parameter.

Details

The function first generates independent uniform random variables and then constructs decreasing transformed variables recursively using the second shape parameter. These are transformed by the four-parameter kappa quantile function qk4d.

For valid generation with r > 1, the second shape parameter should satisfy shape2 < 1/(r-1).

Value

A list with components:

References

Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

Examples

x <- rk4dr(n = 50, r = 3, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
x$rmat


Fit the Logistic Distribution to r-Largest Order Statistics

Description

Fits the logistic distribution to r-largest order statistics using maximum likelihood estimation. Stationary and non-stationary models are supported through generalized linear modelling of the location and scale parameters.

Usage

rld.fit(
  xdat,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  mulink = identity,
  siglink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  show = TRUE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

xdat

A numeric vector, matrix, or data frame of observations. Each row should contain decreasing order statistics for a given year or block. The first column therefore contains block maxima. Only the first r columns are used in the fitted model. If r is NULL, all available columns are used.

r

The number of largest order statistics to use in the fitted model. If NULL, all columns of xdat are used.

ydat

A matrix or data frame of covariates for non-stationary modelling of the parameters, or NULL for a stationary model. The number of rows must match the number of rows of xdat.

mul, sigl

Integer vectors indicating which columns of ydat are used as covariates for the location and scale parameters, respectively.

mulink, siglink

Inverse link functions for the location and scale parameters, respectively.

num_inits

The number of initial parameter sets used in the optimization.

muinit, siginit

Numeric vectors giving initial values for the location and scale parameters. If NULL, default initial values based on L-moments are used.

show

Logical. If TRUE, details of the fitted model are printed.

method

Optimization method passed to optim for stationary fits.

maxit

Maximum number of iterations for optim.

...

Additional control arguments passed to the optimizer.
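
The row layout expected for xdat can be illustrated with base R. This is only a sketch: the raw data frame, its column names year and value, and the choice r = 3 are hypothetical.

```r
# Hypothetical raw data: 5 blocks (years) with 30 observations each.
set.seed(1)
raw <- data.frame(
  year  = rep(2001:2005, each = 30),
  value = rlogis(150, location = 10, scale = 2)
)

r <- 3
# For each block, keep the r largest values sorted in decreasing order.
top_r <- lapply(split(raw$value, raw$year),
                function(v) sort(v, decreasing = TRUE)[seq_len(r)])
xdat <- do.call(rbind, top_r)   # one row per block, r columns

# Column 1 holds the block maxima and every row is non-increasing.
stopifnot(all(apply(xdat, 1, function(row) all(diff(row) <= 0))))
xdat
```

A matrix built this way has the block maxima in its first column, so restricting the fit to a smaller r amounts to using only the leading columns.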

Value

A list with components including:

References

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

See Also

optim

Examples

x <- rldr(n = 50, r = 3, loc = 10, scale = 2)
fit <- rld.fit(x$rmat, num_inits = 5)

Profile Likelihood for Return Levels under the rLD Model

Description

Computes and plots the profile log-likelihood for a return level under a stationary r-largest logistic distribution (rLD) model fitted by rld.fit.

Usage

rld.prof(z, m, xlow, xup, conf = 0.95, nint = 100)

Arguments

z

An object returned by rld.fit. The fitted model must represent a stationary model.

m

A return period greater than 1. The profile likelihood is computed for the corresponding return level exceeded with probability 1/m.

xlow, xup

Lower and upper bounds of the return level grid over which the profile likelihood is evaluated.

conf

A numeric vector of confidence levels for profile likelihood confidence intervals.

nint

The number of grid points used to evaluate the profile likelihood.

Details

The function evaluates the profile log-likelihood over a grid of return level values and plots the resulting curve. Horizontal and vertical lines are added to indicate profile likelihood confidence intervals for the confidence levels specified in conf.
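
The grid evaluation described above can be sketched in base R. This is not the package's implementation: it assumes r = 1, uses the ordinary logistic density stats::dlogis as a stand-in for the model likelihood, and the names loglik, profile_at, and zgrid are hypothetical.

```r
set.seed(123)
# Simulated block maxima; r = 1 is assumed so that dlogis gives the likelihood.
x    <- rlogis(60, location = 10, scale = 2)
m    <- 100    # return period
conf <- 0.95

# Log-likelihood written in terms of the return level z:
# z = mu + sigma * log(m - 1), hence mu = z - sigma * log(m - 1).
loglik <- function(z, sigma) {
  sum(dlogis(x, location = z - sigma * log(m - 1), scale = sigma, log = TRUE))
}

# Profile out sigma (on the log scale) at a fixed return level z.
profile_at <- function(z) {
  optimize(function(ls) loglik(z, exp(ls)), interval = c(-5, 5),
           maximum = TRUE)$objective
}

zgrid <- seq(12, 35, length.out = 300)
prof  <- vapply(zgrid, profile_at, numeric(1))

# Grid values whose profile log-likelihood lies within qchisq(conf, 1) / 2
# of the maximum form the approximate confidence interval.
cutoff <- max(prof) - 0.5 * qchisq(conf, df = 1)
ci     <- range(zgrid[prof >= cutoff])

plot(zgrid, prof, type = "l",
     xlab = "Return level", ylab = "Profile log-likelihood")
abline(h = cutoff, lty = 2)
abline(v = ci, lty = 2)
```

The interval is only as precise as the grid spacing, which is why the range [xlow, xup] should bracket the interval and nint should be large enough to resolve its end points.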

Value

A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.

References

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

See Also

rld.fit, rld.rl

Examples

x <- rldr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rld.fit(x$rmat)
rld.prof(fit, m = 100, xlow = 12, xup = 25)


Return Levels for the Logistic Distribution

Description

Computes return levels and their standard errors for a stationary logistic model fitted by rld.fit.

Usage

rld.rl(z, year = c(20, 50, 100, 200), show = TRUE)

Arguments

z

An object returned by rld.fit. The fitted model should represent a stationary model.

year

A numeric vector of return periods for which return levels are to be computed.

show

Logical. If TRUE, the estimated return levels and their standard errors are printed.

Details

For a return period T, the return level is defined as the quantile exceeded with probability 1/T. Under the logistic distribution, the return level is

x_T = \mu + \sigma \log\left(\frac{1}{\exp(-\log(1-1/T)) - 1}\right) = \mu + \sigma \log(T - 1),

and standard errors are obtained using the delta method.
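
As a numerical check, the return level expression reduces to \mu + \sigma \log(T - 1), which is the logistic quantile at probability 1 - 1/T. The sketch below verifies this in base R and illustrates a delta-method standard error. The helper rl_logistic and the covariance matrix V are hypothetical values chosen for illustration, not output of rld.fit.

```r
# Return level under a logistic model with location mu and scale sigma.
rl_logistic <- function(mu, sigma, period) mu + sigma * log(period - 1)

mu <- 10; sigma <- 2
year <- c(20, 50, 100, 200)

# The documented expression, the simplified form, and the logistic quantile
# at probability 1 - 1/T all agree.
documented <- mu + sigma * log(1 / (exp(-log(1 - 1 / year)) - 1))
simplified <- rl_logistic(mu, sigma, year)
stopifnot(isTRUE(all.equal(documented, simplified)),
          isTRUE(all.equal(simplified, qlogis(1 - 1 / year, mu, sigma))))

# Delta method: the gradient of x_T with respect to (mu, sigma) is
# (1, log(T - 1)). V is a hypothetical covariance matrix of the estimates.
V <- matrix(c(0.20, 0.01,
              0.01, 0.05), nrow = 2, byrow = TRUE)
grad <- cbind(1, log(year - 1))
se   <- sqrt(rowSums((grad %*% V) * grad))   # sqrt of diag(grad V grad')

data.frame(year = year, rl = simplified, se = se)
```

Since the gradient component for the scale grows like log(T - 1), the standard error increases with the return period whenever the scale estimate is uncertain.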

Value

The input object z with two additional components, holding the estimated return levels and their standard errors.

References

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

See Also

rld.fit, rld.prof

Examples

x <- rldr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rld.fit(x$rmat, num_inits = 5)
out <- rld.rl(fit, year = c(20, 50, 100, 200))

Summary of Fitted rLD Models over Different Values of r

Description

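Summarizes fitted logistic distribution models for r-largest order statistics over r = 1, \dots, R, where R is the maximum number of order statistics set by the argument r (all available columns of data if r is NULL). For each value of r, the function fits the model using rld.fit and computes return levels using rld.rl.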

Usage

rld.summary(
  data,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  mulink = identity,
  siglink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  show = FALSE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

data

A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period.

r

Optional integer giving the maximum number of order statistics to summarize. If NULL, all available columns are used.

ydat

A matrix or data frame of covariates for generalized linear modelling of the parameters, or NULL for stationary fitting.

mul, sigl

Integer vectors indicating which columns of ydat are used for the location and scale parameters, respectively.

mulink, siglink

Inverse link functions for the location and scale parameters, respectively.

num_inits

Number of initial parameter sets used in optimization.

muinit, siginit

Optional initial values for the location and scale parameters.

show

Logical. If TRUE, print details from model fitting.

method

Optimization method passed to optim.

maxit

Maximum number of iterations for optimization.

...

Additional arguments passed to rld.fit.
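
The role of mul, sigl, mulink, and siglink can be sketched in base R. The sketch assumes that each parameter is obtained by applying the inverse link to a linear predictor made of an intercept plus the selected covariate columns, as is conventional for this kind of interface; this is an assumption rather than a description of the package internals, and the coefficient values are hypothetical.

```r
# Hypothetical covariate matrix with one column: a centred time trend.
n    <- 10
ydat <- matrix(seq(-1, 1, length.out = n), ncol = 1)

# Design matrices: an intercept plus the selected covariate columns
# (corresponding to mul = 1 for the location and sigl = 1 for the scale).
mumat  <- cbind(1, ydat[, 1, drop = FALSE])
sigmat <- cbind(1, ydat[, 1, drop = FALSE])

# Hypothetical coefficient values, for illustration only.
beta_mu  <- c(10, 0.5)
beta_sig <- c(log(2), 0.3)

mulink  <- identity   # the location is unrestricted
siglink <- exp        # keeps the scale strictly positive

mu    <- mulink(drop(mumat %*% beta_mu))
sigma <- siglink(drop(sigmat %*% beta_sig))
cbind(covariate = ydat[, 1], mu = mu, sigma = sigma)
```

With the default siglink = identity the scale is not constrained to be positive, so an exponential inverse link is a common choice when the scale depends on covariates.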

Value

A data frame containing:

References

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

Examples

x <- rldr(n = 50, r = 2, loc = 10, scale = 2)
rld.summary(x$rmat, num_inits = 5)

Random Generation from the Logistic Distribution for r-Largest Order Statistics

Description

Generates random samples from the logistic distribution for r-largest order statistics.

Usage

rldr(n, r, loc = 0, scale = 1)

Arguments

n

A positive integer specifying the number of observations.

r

A positive integer specifying the number of order statistics for each observation.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

Details

The function first generates independent uniform random variables and then constructs decreasing transformed variables recursively. These are transformed by the logistic quantile function qld.

Value

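A list with components including rmat, the matrix of generated r-largest order statistics (one row per observation) used in the Examples.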

References

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

Examples

x <- rldr(n = 50, r = 3, loc = 0, scale = 1)
x$rmat