---
title: "Getting started with respondeR"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started with respondeR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```

```{r setup}
library(respondeR)
```

## The idea

Trials of continuous outcomes usually report a **mean change** and **standard
deviation** in each arm. Those are hard to communicate: a standardized mean
difference of 0.3 means little to a patient. A **responder analysis** translates
the continuous result into something concrete: the proportion of patients who
improve by at least a **minimal important difference (MID)**. It contrasts the
arms on familiar scales: risk difference, risk ratio, odds ratio, number needed
to treat.

respondeR does this from summary statistics alone, using the cut-point approach
of Anzures-Cabrera, Sarpatwari & Higgins (2011). It never needs individual
patient data.
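
To make the idea concrete, here is a minimal sketch of a cut-point calculation in base R. It assumes the change scores in an arm are approximately normal, so the responder proportion is the area of that distribution at or beyond the MID. The helper `prop_above()` and the arm summaries are hypothetical illustrations, not the package's internal code.

```{r}
# Hypothetical helper: share of a normal(mean, sd) distribution at or above `mid`.
# Normality of the change scores is an assumption of this sketch.
prop_above <- function(mean_change, sd_change, mid) {
  pnorm(mid, mean = mean_change, sd = sd_change, lower.tail = FALSE)
}

# Made-up arm summaries, for illustration only
p_treated <- prop_above(mean_change = 2, sd_change = 2, mid = 1)
p_control <- prop_above(mean_change = 1, sd_change = 2, mid = 1)
round(c(treated = p_treated, control = p_control), 3)
```

The same mean difference produces different responder proportions depending on where the MID falls relative to each arm's distribution, which is why the threshold has to be chosen on clinical grounds.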

## Data format

One row per study. The experimental arm columns end in `_e`, the control arm in
`_c`:

```{r}
sample_responder_data
```

| Column | Meaning |
|--------|---------|
| `study` | Study label |
| `change_e`, `change_c` | Mean change per arm |
| `sd_e`, `sd_c` | SD of change per arm |
| `n_e`, `n_c` | Sample size per arm |

## A first analysis

Suppose a change of at least `1` is clinically meaningful. With the default
settings, `responder_analysis()` returns one row per pooling method:

```{r}
res <- responder_analysis(sample_responder_data, mid = 1)
res[, c("method", "p_e", "p_c", "rd", "rd_lb", "rd_ub", "rr", "or", "nnt")]
```

* `p_e`, `p_c` are the experimental and control responder proportions (on the
  `[0, 1]` scale).
* `rd`, `rr`, `or`, `nnt` are the between-arm contrasts, each with a confidence
  interval (`*_lb`, `*_ub`).
* The `individual` method pools per-study risk differences, so it reports a
  pooled `rd` but leaves `p_e`/`p_c` as `NA`.
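
The contrasts listed above follow from the two responder proportions by the usual textbook definitions. The sketch below shows that arithmetic; `contrast_sketch()` is a hypothetical helper and the proportions are made up, so treat it as an illustration rather than the package's implementation (in particular, it computes no confidence intervals).

```{r}
# Hypothetical helper: between-arm contrasts from two responder proportions.
contrast_sketch <- function(p_e, p_c) {
  rd <- p_e - p_c
  c(rd  = rd,                                        # risk difference
    rr  = p_e / p_c,                                 # risk ratio
    or  = (p_e / (1 - p_e)) / (p_c / (1 - p_c)),     # odds ratio
    nnt = 1 / rd)                                    # number needed to treat (rd != 0)
}

# Made-up proportions: 60% responders on treatment, 40% on control
contrast_sketch(p_e = 0.6, p_c = 0.4)
```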

For a quick, readable summary use `format_responder_results()`:

```{r}
format_responder_results(res)
```

## Which way is "better"?

By default, a *higher* change counts as a response. For outcomes where *lower*
is better (pain, symptom scores), set `direction = "lower"`:

```{r}
responder_analysis(sample_responder_data, mid = 1, direction = "lower",
                   method = "individual")[, c("method", "rd", "rd_lb", "rd_ub")]
```
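
Under a normal model, switching direction amounts to taking the other tail of the distribution. The sketch below illustrates this with a hypothetical helper, `prop_responder()`, and a made-up arm. It assumes the threshold is expressed on the same signed scale as the change scores; it is not taken from the package source.

```{r}
# Hypothetical helper: responder proportion under a normal model, either direction.
prop_responder <- function(mean_change, sd_change, mid,
                           direction = c("higher", "lower")) {
  direction <- match.arg(direction)
  # "higher": response means change >= mid; "lower": response means change <= mid
  pnorm(mid, mean = mean_change, sd = sd_change,
        lower.tail = (direction == "lower"))
}

# Made-up arm: mean change of -2 (a reduction), SD 2, threshold -1
p_lower  <- prop_responder(-2, 2, mid = -1, direction = "lower")
p_higher <- prop_responder(-2, 2, mid = -1, direction = "higher")
round(c(lower = p_lower, higher = p_higher), 3)
```

For a continuous outcome the two tails sum to one, so choosing the wrong direction reports the non-responders instead of the responders.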

## Baseline risk: matched or median control

By default, each summary method pools the control arm the same way as the
experimental arm. To instead hold the baseline risk at the median control arm for
every summary method, as in the simulation study behind this package
(Sofi-Mahmudi, 2024), set `control = "median"`. This returns point estimates
only, because the median control arm has no variance model:

```{r}
responder_analysis(sample_responder_data, mid = 1, control = "median")[,
  c("method", "p_e", "p_c", "rd")]
```

## Per-study results and a forest plot

`responder_rd_individual()` returns the per-study risk differences that feed a
forest plot:

```{r}
responder_rd_individual(sample_responder_data, mid = 1)
```

```{r, fig.width = 6, fig.height = 3.2, fig.alt = "Forest plot of per-study responder risk differences"}
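# Per-study estimates plus the pooled estimate from the individual method
ps <- responder_rd_individual(sample_responder_data, mid = 1)
pooled <- responder_analysis(sample_responder_data, mid = 1, method = "individual")

# One row per study from top to bottom, with the pooled row last
y <- rev(seq_len(nrow(ps) + 1))
est <- c(ps$rd, pooled$rd) * 100        # proportions to percentages
lo  <- c(ps$ci_lb, pooled$rd_lb) * 100
hi  <- c(ps$ci_ub, pooled$rd_ub) * 100
labels <- c(as.character(ps$study), "Pooled")

op <- par(mar = c(4, 6, 1, 1))          # widen the left margin for study labels
plot(NA, xlim = range(c(lo, hi, 0)), ylim = c(0.5, length(y) + 0.5),
     yaxt = "n", xlab = "Risk difference (%)", ylab = "", bty = "n")
abline(v = 0, lty = 2, col = "grey60")  # line of no difference
segments(lo, y, hi, y, lwd = 2)         # confidence intervals
# Squares for individual studies, a larger diamond for the pooled estimate
points(est, y, pch = c(rep(15, nrow(ps)), 18), cex = c(rep(1.4, nrow(ps)), 2))
axis(2, at = y, labels = labels, las = 1, tick = FALSE)
par(op)                                 # restore the previous graphics settings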
```
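
The pooled row in the plot combines the per-study estimates. A common way to do that is fixed-effect inverse-variance weighting, sketched below in base R with made-up numbers. Whether respondeR's default pooling matches this calculation exactly is an assumption of the sketch; the object names are hypothetical.

```{r}
# Made-up per-study risk differences and standard errors (not package output)
rd_i <- c(0.10, 0.20, 0.15)
se_i <- c(0.05, 0.08, 0.06)

w_i       <- 1 / se_i^2                      # inverse-variance weights
rd_pooled <- sum(w_i * rd_i) / sum(w_i)      # precision-weighted mean
se_pooled <- sqrt(1 / sum(w_i))
ci_pooled <- rd_pooled + c(-1, 1) * qnorm(0.975) * se_pooled

round(c(rd = rd_pooled, lb = ci_pooled[1], ub = ci_pooled[2]), 3)
```

More precise studies (smaller standard errors) pull the pooled estimate toward themselves, and the pooled standard error is smaller than that of any single study.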

## Random effects and heterogeneity

When studies disagree, use random-effects pooling. respondeR reports Cochran's
Q, I-squared, tau-squared and a prediction interval:

```{r}
responder_analysis(sample_responder_data, mid = 1, method = "individual",
                   pooling = "random")[, c("method", "rd", "rd_lb", "rd_ub",
                                           "tau2", "i2", "pi_lb", "pi_ub")]
```
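
For readers who want to see how these quantities relate, the sketch below computes them with the DerSimonian-Laird estimator and the usual t-based prediction interval. The choice of DerSimonian-Laird is an assumption made for illustration (the package may use a different tau-squared estimator), `dl_sketch()` is a hypothetical helper, and the inputs are made up.

```{r}
# Hypothetical helper: DerSimonian-Laird random-effects summary.
# y = per-study estimates, v = their within-study variances.
dl_sketch <- function(y, v, level = 0.95) {
  k        <- length(y)
  w        <- 1 / v
  mu_fixed <- sum(w * y) / sum(w)
  q        <- sum(w * (y - mu_fixed)^2)              # Cochran's Q
  denom    <- sum(w) - sum(w^2) / sum(w)
  tau2     <- max(0, (q - (k - 1)) / denom)          # between-study variance
  i2       <- if (q > 0) max(0, (q - (k - 1)) / q) else 0
  w_re     <- 1 / (v + tau2)                         # random-effects weights
  mu       <- sum(w_re * y) / sum(w_re)
  se       <- sqrt(1 / sum(w_re))
  # Prediction interval uses a t distribution with k - 2 df, so it needs k >= 3
  half <- if (k >= 3) {
    qt(1 - (1 - level) / 2, df = k - 2) * sqrt(tau2 + se^2)
  } else {
    NA_real_
  }
  c(mu = mu, se = se, q = q, tau2 = tau2, i2 = i2,
    pi_lb = mu - half, pi_ub = mu + half)
}

# Three made-up studies with equal precision but different estimates
dl_out <- dl_sketch(y = c(0, 0.2, 0.4), v = rep(0.01, 3))
round(dl_out, 3)
```

The prediction interval is wider than the confidence interval because it adds the between-study variance to the uncertainty in the pooled mean.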

## A threshold-free alternative

Choosing a MID is sometimes contentious. The common-language effect size sidesteps
it entirely: the probability that a randomly chosen treated patient responds
better than a randomly chosen control.

```{r}
cles <- responder_cles(sample_responder_data)
sprintf("CLES = %.1f%% (%.1f%% to %.1f%%)",
        100 * cles$cles, 100 * cles$cles_lb, 100 * cles$cles_ub)
```
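
For two independent, normally distributed arms, this probability has a closed form. The sketch below shows it with a hypothetical helper, `cles_sketch()`, and made-up summaries. The normality and independence assumptions are those of the sketch, and `responder_cles()` may combine studies or handle direction differently.

```{r}
# Hypothetical helper: P(treated change > control change) for independent normals.
cles_sketch <- function(mean_e, sd_e, mean_c, sd_c) {
  pnorm((mean_e - mean_c) / sqrt(sd_e^2 + sd_c^2))
}

# Made-up summaries: treated arm improves by 2, control by 1, both with SD 2
cles_value <- cles_sketch(mean_e = 2, sd_e = 2, mean_c = 1, sd_c = 2)
round(cles_value, 3)
```

A value of 0.5 means the arms are indistinguishable; values further above 0.5 mean a treated patient is increasingly likely to show the larger change.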

## A real example: VAS pain after exercise therapy

The package bundles a real dataset, `vas_pain`: the 20 randomized trials of
exercise for spinal health pooled for the visual analogue scale (VAS) pain
outcome by Li, Bao, Wang and Zhao (2025). The change scores are post-treatment
minus baseline VAS on a 0 to 10 cm scale, so a more negative value means a
larger pain reduction. We therefore analyze with `direction = "lower"` and pass
the MID as a negative number. Taking a 1.5 cm reduction as the minimal
important difference gives `mid = -1.5`:

```{r}
res <- responder_analysis(vas_pain, mid = -1.5, direction = "lower",
                          pooling = "random", ci_method = "hksj")
format_responder_results(res)
```

When the per-study estimates are pooled (the `individual` method, the most
defensible), about 17 more exercise patients than controls per 100 reach a
1.5 cm pain reduction. The pool-then-dichotomize summaries give larger and more
dispersed values here (weighted about 21, unweighted about 32, median about 47
per 100). That spread is a sign of heterogeneity across the 20 trials, and the
individual method, which respects each trial's own scale, is the one to trust.
The threshold-free common-language effect size avoids picking a cut-point:

```{r}
cles <- responder_cles(vas_pain, direction = "lower")
sprintf("A treated patient has less pain than a control %.0f%% of the time (%.0f%% to %.0f%%)",
        100 * cles$cles, 100 * cles$cles_lb, 100 * cles$cles_ub)
```
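
The random-effects call in this example used `ci_method = "hksj"`. Assuming that label refers to the Hartung-Knapp-Sidik-Jonkman adjustment, the sketch below shows how such an interval is usually built: the variance of the pooled mean is rescaled by the observed spread of the study estimates, and a t quantile replaces the normal one. `hksj_sketch()` is a hypothetical helper with made-up inputs, not the package's code.

```{r}
# Hypothetical helper: Hartung-Knapp-Sidik-Jonkman interval for a random-effects mean.
# y = per-study estimates, v = within-study variances, tau2 = between-study variance.
hksj_sketch <- function(y, v, tau2, level = 0.95) {
  k  <- length(y)
  w  <- 1 / (v + tau2)                       # random-effects weights
  mu <- sum(w * y) / sum(w)
  # Variance based on the weighted spread of the estimates around the pooled mean
  var_hk <- sum(w * (y - mu)^2) / ((k - 1) * sum(w))
  half   <- qt(1 - (1 - level) / 2, df = k - 1) * sqrt(var_hk)
  c(mu = mu, lb = mu - half, ub = mu + half)
}

# Three made-up studies; tau2 is supplied rather than estimated here
hksj_out <- hksj_sketch(y = c(0, 0.2, 0.4), v = rep(0.01, 3), tau2 = 0.03)
round(hksj_out, 3)
```

With few studies the t quantile is large, so these intervals are typically wider than normal-based ones.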

(Data from Li et al. (2025), *Frontiers in Sports and Active Living*,
<https://doi.org/10.3389/fspor.2025.1614906>, Figure 3, reproduced under CC BY 4.0.)

## The Shiny application

Everything above is available in a point-and-click app:

```{r, eval = FALSE}
launch_responder_analysis()
```

The same tool runs in the browser, with no installation, at
<https://choxos.github.io/respondeR/app/>.

## Where next

See `vignette("methodology")` for the full statistical detail: each method's
estimator and variance, the relative measures, the SMD bridge, the logit,
MID-uncertainty and distribution options, the assumptions, and a method-choice
guide.