1 Introduction

This vignette will explore the different methods for calculating the required sample size and number of trials to achieve a specific power in a meta-analysis. All methods are implemented in the RTSA package. These sample size calculations are intended for meta-analyses and not for single studies.

The methods for calculating sample size in a meta-analysis, which we call required information size (or required number of participants) differ depending on presence of heterogeneity. We will present required information size methods for meta-analyses with and without heterogeneity.

Some of the methods presented in this vignette are implemented in the original TSA stand-alone software [1], whereas others are only available in the RTSA.

library(RTSA)

2 Estimation of heterogeneity

We will be considering heterogeneity in this vignette. When heterogeneity is expected or present, we need to adjust our meta-analysis to accommodate this extra source of variation. When there is heterogeneity present in a meta-analysis, we estimate \(\tau^2\) and can use either \(D^2\) (diversity) or \(I^2\) (inconsistency) to try to quantify the size of heterogeneity relative to the total variation in the meta-analysis. All the heterogeneity metrics are however quite difficult to estimate in most meta-analyses and the estimates are often uncertain, especially for meta-analyses with few and/or small studies. Heterogeneity has an impact on both the interpretation of the meta-analysis results and on the power calculation as described in [2]. It is strongly recommended that this uncertainty must be reported and taken into consideration when both interpreting the results of the meta-analysis and when conducting the sample size (required information size) and trial size calculations. The RTSA package includes confidence intervals and standard errors on the three measures of heterogeneity - \(\tau^2\), \(I^2\), and \(Q^2\). These intervals are calculated via the confint.rma.uni() function from the metafor-package.

We will see in this vignette how estimate a sample and trial size for a meta-analysis including how to incorporate the estimation uncertainty of the heterogeneity. The original sample size calculation in the original stand-alone TSA software [1] did not incorporate the uncertainty.

We start with considering scenarios without heterogeneity when the sample and trial size calculations are performed prior initiation of the trials. A meta-analysis planned/design prior the initiation of the trials is called a prospective meta-analysis.

3 Prospective meta-analysis

Sample size calculations for prospective meta-analyses are done prior the conduct of the meta-analysis. If a meta-analysis already exists, how many more participants and trials that are needed for achieving a specific level of power can be calculated. This is investigated in the Retrospective meta-analysis Section 4.

3.1 No heterogeneity

For meta-analyses without heterogeneity, we will be fitting fixed-effect meta-analysis models, whereas random-effects meta-analysis models will be fitted for meta-analyses with heterogeneity.

For a fixed-effect meta-analysis the original TSA software [1] uses the sample size formula for a single trial with a normally distributed outcome. The required information size (RIS) is presented as the total number of participants needed to achieve a specific power. The total number of participants counts both the number of participants in the control and the intervention group. The formula is defined as:

\[\begin{align} RIS = 4 \cdot (z_{1-\alpha/side} + z_\beta)^2 \cdot \frac{\nu}{\theta^2}. \tag{3.1} \end{align}\]

For binary data \(\nu = p_0\cdot(1-p_0)\) with \(p_0 = (p_I + p_C)/2\) and \(\theta = p_C - p_I\) where \(p_I\) is the proportion of events in the intervention group and \(p_C\) is the proportion of events in the control group. For continuous data \(\theta\) is an a priori estimate of the difference in means between the two treatment groups and \(\nu\) is the assumed variance.

What we need to define to make the sample size calculation differs depending on whether our outcome is risk ratios (RR), odds ratios (OR), risk differences (RD), or mean differences (MD). For mean differences, we need prior specification of:

  • The minimum clinically relevant difference \(\theta\).
  • The expected standard deviation \(\nu\).
  • test type (one- or two-sided), type-I error and type-II error levels.

For RR, OR, or RD, we need specification of:

  • The minimum clinically relevant difference \(\theta\), expressed either as a RD, RR, or OR. For RR, it can also be expressed as a relative risk reduction (RRR).
  • The probability of event in the control group \(p_C\).
  • test type (one- or two-sided), type-I-error and type-II error levels.

Suppose we assume an effect of intervention compared with control for a dichotomous outcome resulting in a risk ratio of \(RR = 0.9\) with a common probability of event being \(p_C = 0.1\). To calculate the number of required participants, we use the RTSA function ris(). We set alpha to 0.05, beta to 0.1 and use a two-sided test, hence side = 2.

ris(outcome = "RR", mc = 0.9, pC = 0.1, alpha = 0.05, beta = 0.1, side = 2)
#> This is a prospective meta-analysis sample size calculation.
#> The sample size calculation assumes a 2-sided test, equal group sizes,
#> a type-I-error of 0.05 and a type-II-error of 0.1. 
#> The minimum clinical relevant value is set to: 0.9 for outcome metric RR.
#> Additional parameters for sample size are:
#> Probability of event in the control group: 0.1.
#> 
#> Fixed-effect required information size:
#> 36136 participants in total. 
#> 
#> For more information about the sample size calculation see vignette:
#> 'Calculating required sample size and required number of trials'.

3.2 Heterogeneity

The original TSA software formulas do not give an estimate of the required number of trials to achieve a specific power. However, recent papers have shown a need for a minimum number of trials to achieve a specific power when heterogeneity is present [3]. In this section, we will present a method for calculating the required number of trials.

We start with presenting the methods implemented in the original TSA software before presenting the newly added methods that calculate the required number of trials. For the required sample size calculation, TSA uses the following formulas depending on the choice of using either diversity \(D^2\) or inconsistency \(I^2\).

\[\begin{align} RIS_{D^2} = \frac{1}{1-D^2}\cdot 4 \cdot (z_{1-\alpha/2} + z_\beta)^2 \cdot \frac{\nu}{\theta^2}. \end{align}\]

\[\begin{align} RIS_{I^2} = \frac{1}{1-I^2}\cdot 4 \cdot (z_{1-\alpha/2} + z_\beta)^2 \cdot \frac{\nu}{\theta^2}. \end{align}\]

where \(D^2\) is the diversity, \(I^2\) is the inconsistency. Diversity and Inconsistency are calculated as:

\[\begin{align*} D^2 = \frac{\tau^2}{\tau^2 + \sigma^2_D}, \quad I^2 = \frac{\tau^2}{\tau^2 + \sigma^2_M}. \end{align*}\]

For more information about the two formulas, see [4].

An example of calculating RIS using RIS based on Diversity (\(D^2\)) or Inconsistency (\(I^2\)) is given below.

ris(outcome = "RR", mc = 0.9, pC = 0.2, fixed = FALSE, I2 = 0.2, D2 = 0.3,
    side = 2, alpha = 0.05, beta = 0.2)
#> This is a prospective meta-analysis sample size calculation.
#> The sample size calculation assumes a 2-sided test, equal group sizes,
#> a type-I-error of 0.05 and a type-II-error of 0.2. 
#> The minimum clinical relevant value is set to: 0.9 for outcome metric RR.
#> Additional parameters for sample size are:
#> Probability of event in the control group: 0.2.
#> 
#> Fixed-effect required information size:
#> 12080 participants in total. 
#> 
#> Random-effects required information size:
#> Adjusted by diversity (D^2): 17258 participants in total.
#> Adjusted by inconsistency (I^2): 15100 participants in total.
#> 
#> For more information about the sample size calculation see vignette:
#> 'Calculating required sample size and required number of trials'.

Using a simulation study, we investigate if the method will provide a required information size (RIS) to achieve sufficient power. Consider a scenario where \(RR = 0.9\), \(p_C = 0.1\), and \(\tau^2=0.05\). Each \(RIS\) formula is depending on \(\tau\) and \(D^2\) or \(I^2\), so we need a guess for an estimate of \(\tau\) and an estimate of \(\sigma_D\) or \(\sigma_M\). For each simulation we make an initial meta-analysis of 10 studies where each study has 500 participants. From the meta-analyses we estimate \(\tau\), \(\sigma_D\), and \(\sigma_M\). From these estimates, we can calculate \(RIS_{D^2}\) and \(RIS_{I^2}\) providing us with the needed number of participants. An additional trial is then added to achieve the RIS. Redoing this 10,000 times, we wish to see how many times the null hypothesis is rejected to investigate if we on average achieve the right power.

To investigate the effect of more trials, we also increase the number of added trials to the meta-analysis from 1 to 10.

Table 3.1: Power per model as a function of number of extra trials and RIS based on Diversity
Number of extra trials Fixed-effect Random-effects DL Random-effects HKSJ
1 0.889 0.295 0.217
2 0.851 0.334 0.236
3 0.834 0.394 0.277
4 0.821 0.388 0.307
5 0.829 0.425 0.331
6 0.818 0.397 0.324
7 0.838 0.487 0.412
8 0.797 0.455 0.377
9 0.841 0.484 0.415
10 0.814 0.476 0.420

We see that we do not reach 80% power for the two random-effects models.

We will now give the formula for calculating the minimum number of required trials. Let \(\tilde{\theta}\) be the intervention effect, which will be the \(\log(RR)\) in our case, \(\alpha\) and \(\beta\) are respectively the type-1 and type 2 error rates, and \(z_{x}\) is the quantile from the normal distribution at \(x\). Then we will need to fulfill the following equation, to ensure that we have the right error rates [3].

\[\frac{\tilde{\theta}}{\sqrt{\text{Var}(\tilde{\theta})}} = z_{1-\alpha/2}+z_{1-\beta}, \quad \text{where} \quad \text{Var}(\tilde{\theta}) = \left( \sum_k \frac{1}{2\cdot \sigma_k^2/ n_k + \tau^2} \right)^{-1}\]

Then, we will have the defined power, \(1-\beta\) when the following in-equality holds. Notice that in the simulation studies we know the true values of \(\tau^2\) and \(\theta\). Hence \(K\) will be the variable which will vary.

\[\begin{align} \tau^2 < \frac{\theta \cdot K}{\left(z_{1-\alpha/2}+z_{1-\beta}\right)^2} \end{align}\]

To simplify the formula, we assume all trials have the same variation of the estimated treatment effect and they are all of the same size, we can then calculate the number of participants to:

\[\begin{align} RIS_{New} = \frac{2\cdot \sigma^2}{\frac{\tilde{\theta}\cdot K}{\left(z_{1-\alpha/2}+z_{1-\beta}\right)^2} - \tau^2}. \end{align}\]

We compare the methods originally implemented in the TSA software with the new methods for calculating both the required number of participants and the required number of trials.

We set \(RR = 0.9\), \(p_C = 0.1\) and \(\tau^2 = 0.05\). With these values we get the following minimum number of required trials:

ris(outcome = "RR", mc = 0.9, fixed = FALSE, tau2 = 0.05, pC = 0.1,
    side = 2, alpha = 0.05, beta = 0.2)
#> This is a prospective meta-analysis sample size calculation.
#> The sample size calculation assumes a 2-sided test, equal group sizes,
#> a type-I-error of 0.05 and a type-II-error of 0.2. 
#> The minimum clinical relevant value is set to: 0.9 for outcome metric RR.
#> Additional parameters for sample size are:
#> Probability of event in the control group: 0.1.
#> 
#> Fixed-effect required information size:
#> 26993 participants in total. 
#> 
#> Random-effects required information size:
#> Adjusted by tau^2: 1502892 participants in total split over (at minimum) 36 trial(s).
#> 
#> For more information about the sample size calculation see vignette:
#> 'Calculating required sample size and required number of trials'.

The intended level of power is reached for each of the combinations of the number of trials and required participants per trial, as seen in Table 3.2. The results are based on 10,000 simulated meta-analyses. The calculated power is shown for a fixed-effect model and two random-effects models where one is using the DerSimonian-Laird estimator (DL) for heterogeneity and the other is adjusted with the Hartung-Knapp-Sidik-Jonkman (HKSJ) adjustment.

Table 3.2: Power per model as a function of number of trials and number of participants per trial
Number of trials Participants per trial Fixed-effect Random-effects DL Random-effects HKSJ
36 39381 0.9962 0.8063 0.7826
37 15476 0.9902 0.8025 0.7777
38 9630 0.9889 0.8000 0.7741
39 6990 0.9860 0.8026 0.7791

4 Retrospective meta-analysis

When we are updating a meta-analysis or want to make a meta-analysis of existing trials, we might be interested in how much more information is needed to achieve a certain level of power. Using information from already conducted trials in the sample and trial size calculations are problematic in terms of introducing bias. The sample size calculation becomes dependent on the specific selection of existing trials that might be published for showing promising results. Hence when conducting a sample size calculation based on earlier trials, there is a risk of the trials not being representative of the real world.

We will now show how to calculate whether the retrospective meta-analysis fulfills the power requirements.

In a fixed-effect meta-analysis, we can calculate the extra number of required participants using equation (3.1). By subtracting the number of acquired participants, we have an estimate for how many more participants we need to have a well-powered meta-analysis. As it is believed in the fixed-effect meta-analysis that the effect of interest is identical across trials, there is no requirement for the power to be achieved to make multiple additional studies. One suffices if it has the right sample size.

For a random-effects meta-analysis we use an updated version of the formulas used for prospective meta-analyses. Again we use the formulas from [3]. Using the following inequality we can estimate the remaining number of trials:

\[\begin{align} \tau^2 < \frac{K}{\left(z_{1-\alpha/2}+z_{1-\beta}\right)^2/\hat{\theta}^2-1/\sigma^2_{R}} \end{align}\]

Under simplifying assumptions, such as assuming all trials have the same variation of the estimated treatment effect and they are all of the same size, we can then calculate the number of participants to:

\[\begin{align} RIS_{New} = \frac{2\cdot \sigma^2}{\frac{ K}{\left(z_{1-\alpha/2}+z_{1-\beta}\right)^2/\tilde{\theta}^2-1/\sigma^2_{R}} - \tau^2}. \end{align}\]

In both formulas, the \(\sigma^2_{R}\) is the current estimate of the variance of the pooled effect in the random-effects meta-analysis. Using the metaanalysis function from the R package, the additional number of participants and trials can be extracted ...$ris

ma <- metaanalysis(data = perioOxy, outcome = "RR", mc = 0.8, beta = 0.1,
                   pC = 0.15)
ma$ris
#> This is a retrospective meta-analysis sample size calculation. 
#> The sample size calculation assumes a 2-sided test, equal group sizes,
#> a type-I-error of 0.05 and a type-II-error of 0.1 
#> 
#> Fixed-effect required information size:
#> The number of required participants for a fixed-effect meta-analysis is reached.
#> 
#> Random-effects required information size:
#> Adjusted by tau^2: 184896 participants are additionally required in total split over (at minimum) 9 trial(s).
#> Adjusted by diversity (D^2): 10441 participants in total are additionally required. 
#> Adjusted by inconsistency (I^2): 3344 participants in total are additionally required.
#> 
#> For more information about the sample size calculation see vignette: 
#> 'Calculating required sample size and required number of trials'.

As seen in the example we do not need more participants for a fixed-effect meta-analysis with 80% power and a RR set to 0.8. We find that we additionally need 184896 participants over 9 trials of equal size to have sufficient power given the estimate of \(\tau^2\).

If we take into consideration the uncertainty of the \(\tau^2\) estimate, we can compute the additional number of participants and trials given the lower and upper limit of the \(\tau^2\) estimate. For the lower limit of \(\tau^2\), we have that we need at minimum one more trials with 5428 participants in the trial:

ma$ris$NR_tau$NR_tau_ll
#> $minTrial
#> [1] 1
#> 
#> $nPax
#>                 [,1] [,2] [,3] [,4]
#> Trials             1    2    3    4
#> Pax per trial   5428 2135 1329  965
#> Total nr of pax 5428 4270 3987 3860
#> 
#> $tau2
#> [1] 0.002601661

If we increase the number of additional trials to four, we need 965 per trial.

In the worst case scenario we need way more. Here the minimum number of required trials are 47 additionally. This shows that we have a lot of uncertainty about the \(\tau^2\) estimate.

ma$ris$NR_tau$NR_tau_ul
#> $minTrial
#> [1] 47
#> 
#> $nPax
#>                     [,1]   [,2]  [,3]  [,4]
#> Trials                47     48    49    50
#> Pax per trial     372783   3485  1751  1169
#> Total nr of pax 17520801 167280 85799 58450
#> 
#> $tau2
#> [1] 0.3473498

Note that if the meta-analysis is sequential, the sample and trial size requirements for sufficient power are larger but often close to the requirements of a non-sequential meta-analysis.

5 References

  1. Thorlund K, Engstrøm J, Wetterslev J, Brok J, Imberger G, Gluud C (2011). User manual for trial sequential analysis (TSA). Copenhagen Trial Unit, Centre for Clinical Intervention Research, Copenhagen, Denmark. URL https://ctu.dk/wp-content/uploads/2021/03/2017-10-10-TSA-Manual-ENG_ER.pdf.

  2. Ioannidis JP, Patsopoulos NA, Evangelou E. Uncertainty in heterogeneity estimates in meta-analyses. BMJ. 2007 Nov 3;335(7626):914-6. doi: 10.1136/bmj.39343.408449.80. PMID: 17974687; PMCID: PMC2048840.

  3. Kulinskaya E, Wood J. Trial sequential methods for meta-analysis. Res Synth Methods. 2014 Sep;5(3):212-20. doi: 10.1002/jrsm.1104. Epub 2013 Nov 28. PMID: 26052847.

  4. Wetterslev J, Thorlund K, Brok J, Gluud C. Estimating required information size by quantifying diversity in random-effects model meta-analyses. BMC Med Res Methodol. 2009 Dec 30;9:86. doi: 10.1186/1471-2288-9-86. PMID: 20042080; PMCID: PMC2809074.