Suppose we observe \(n\) independent counts \(y_1, \dots, y_n\) drawn from a Poisson distribution with unknown rate \(\lambda > 0\):
\[ Y_i \mid \lambda \;\sim\; \mathrm{Poisson}(\lambda), \qquad i = 1,\dots,n. \]
The joint likelihood is
\[ p(y_1,\dots,y_n \mid \lambda) \;\propto\; e^{-n\lambda}\,\lambda^{\sum_i y_i}. \]
Everything relevant for updating beliefs about \(\lambda\) is captured by two sufficient statistics: \(n\) (number of observations) and \(\sum_i y_i\) (total observed count).
With exposure. In many applications the counts are observed over different exposure periods \(e_i\) (time, area, number of opportunities). The model becomes \(Y_i \mid \lambda \sim \mathrm{Poisson}(e_i \lambda)\), and the sufficient statistics become \(\sum_i e_i\) and \(\sum_i y_i\). An exposure-weighted example appears in Appendix A ((Albert 2009)).
The rate \(\lambda > 0\) calls for a positive prior. The Gamma distribution in shape–rate form is the natural conjugate choice:
\[ p(\lambda) \;=\; \frac{\beta^\alpha}{\Gamma(\alpha)}\,\lambda^{\alpha-1}\,e^{-\beta\lambda}, \qquad \lambda > 0, \]
with shape \(\alpha > 0\) and rate \(\beta > 0\). Its mean is \(\alpha/\beta\) and its variance is \(\alpha/\beta^2\).
Interpreting the parameters. Think of \(\alpha - 1\) as a “prior total count” and \(\beta\) as a “prior number of observations”. The prior mean \(\alpha/\beta\) is your pre-data estimate of the rate, and larger \(\beta\) encodes a stronger commitment to that estimate.
Note: many textbooks use shape–scale form (\(\theta = 1/\beta\)); glmbayes uses the shape–rate convention throughout.
Multiply the likelihood kernel \(e^{-n\lambda}\lambda^{\sum y_i}\) by the prior kernel \(\lambda^{\alpha-1}e^{-\beta\lambda}\):
\[ p(\lambda \mid y) \;\propto\; \lambda^{(\alpha + \sum y_i) - 1}\,e^{-(\beta + n)\lambda}. \]
This is a Gamma kernel, so the conjugate posterior is:
\[ \boxed{\lambda \mid y \;\sim\; \mathrm{Gamma}\!\left(\alpha + \textstyle\sum_i y_i,\;\; \beta + n\right).} \]
Update rule: add the total count to the shape, add the sample size (or total exposure) to the rate.
The posterior mean is
\[ \mathbb{E}[\lambda \mid y] \;=\; \frac{\alpha + \sum y_i}{\beta + n} \;=\; \frac{\beta}{\beta + n}\cdot\frac{\alpha}{\beta} \;+\; \frac{n}{\beta + n}\cdot\bar{y}, \]
again a weighted average of the prior mean and the data mean.
bayesrules illustrationThe bayesrules package (Johnson et al. 2022) provides
plot_gamma_poisson() and
summarize_gamma_poisson() for visualizing the Gamma–Poisson
update. The same scenario appears in the Bayes Rules! conjugate vignette
and in Chapter 12 of the book.
Scenario. You observe daily bicycle rental counts \(y_1, \dots, y_9\) from the first nine days of a year. The data are \(y = (1, 1, 1, 0, 0, 0, 0, 0, 0)\), giving \(\sum y = 3\) across \(n = 9\) days. Your prior belief is a \(\Gamma(3, 4)\) distribution on the daily rate \(\lambda\) (prior mean = 0.75 rentals/day).
br_y_df <- data.frame(y = c(1, 1, 1, rep(0, 6))) ## n = 9, sum(y) = 3
br_n <- nrow(br_y_df)
br_shape <- 3
br_rate <- 4
## Analytic posterior
post_shape_br <- br_shape + sum(br_y_df$y) ## 3 + 3 = 6
post_rate_br <- br_rate + br_n ## 4 + 9 = 13library(bayesrules)
plot_gamma_poisson(
shape = br_shape,
rate = br_rate,
sum_y = sum(br_y_df$y),
n = br_n,
prior = TRUE,
likelihood = TRUE,
posterior = TRUE
)summarize_gamma_poisson(shape = br_shape, rate = br_rate,
sum_y = sum(br_y_df$y), n = br_n)
#> model shape rate mean mode var sd
#> 1 prior 3 4 0.7500000 0.5000000 0.18750000 0.4330127
#> 2 posterior 6 13 0.4615385 0.3846154 0.03550296 0.1884223Reading the output. The posterior is \(\Gamma(6, 13)\) with mean \(6/13 \approx 0.46\). The data (mean \(\bar y = 0.33\)) have pulled the posterior mean down from the prior mean (0.75), and the posterior is noticeably tighter.
gp_analytic <- data.frame(
Example = "Bayes Rules! daily counts",
n = br_n,
`sum(y)` = sum(br_y_df$y),
Posterior = sprintf("Gamma(%d, %d)", post_shape_br, post_rate_br),
Mean = post_shape_br / post_rate_br,
SD = sqrt(post_shape_br) / post_rate_br,
check.names = FALSE
)
knitr::kable(gp_analytic, digits = 4,
caption = "Conjugate Gamma--Poisson posterior (Bayes Rules! scenario)")| Example | n | sum(y) | Posterior | Mean | SD |
|---|---|---|---|---|---|
| Bayes Rules! daily counts | 9 | 3 | Gamma(6, 13) | 0.4615 | 0.1884 |
glmb()The scalar Gamma–Poisson model maps to
glmb() with
family = poisson(link = "identity") and
dGamma(shape, rate, beta, Inv_Dispersion = FALSE). The
single coefficient is the rate \(\lambda\).
gp_beta <- matrix(br_shape / br_rate, 1L, 1L, dimnames = list(NULL, "(Intercept)"))
gp_pf <- dGamma(shape = br_shape, rate = br_rate,
beta = gp_beta, Inv_Dispersion = FALSE)
set.seed(2026)
fit_gp <- glmb(
n = 20000,
y ~ 1,
data = br_y_df,
weights = rep(1L, br_n),
family = poisson(link = "identity"),
pfamily = gp_pf
)
print(fit_gp)
#>
#> Call: glmb(formula = y ~ 1, family = poisson(link = "identity"), pfamily = gp_pf,
#> n = 20000, data = br_y_df, weights = rep(1L, br_n))
#>
#> Posterior Mean Coefficients:
#> (Intercept)
#> 0.4641
#>
#> Effective Number of Parameters: 0.5025621
#> Expected Residual Deviance: 7.462233
#> DIC: 13.96479gp_compare <- data.frame(
Example = "Bayes Rules! daily counts",
Posterior = gp_analytic$Posterior,
`Analytic Mean` = gp_analytic$Mean,
`Analytic SD` = gp_analytic$SD,
`glmb Post.Mean` = fit_gp$coef.means["(Intercept)"],
`glmb Post.Sd` = sd(fit_gp$coefficients[, "(Intercept)", drop = TRUE]),
check.names = FALSE
)
knitr::kable(gp_compare, digits = 4,
caption = "Analytic vs. glmb() posterior mean and SD")| Example | Posterior | Analytic Mean | Analytic SD | glmb Post.Mean | glmb Post.Sd | |
|---|---|---|---|---|---|---|
| (Intercept) | Bayes Rules! daily counts | Gamma(6, 13) | 0.4615 | 0.1884 | 0.4641 | 0.1875 |
The glmb Post.Mean and
glmb Post.Sd columns should match the
analytic Mean and SD to Monte Carlo
error because dGamma(Inv_Dispersion = FALSE) implements the
exact conjugate update.
The scalar Gamma–Poisson model is the conjugate prototype for
Poisson glmb() with
family = poisson(link = "identity"). In that
setting the single coefficient is the rate \(\lambda\), the prior is
dGamma(shape, rate, beta, Inv_Dispersion = FALSE), and the
posterior update is the closed form above.
For general Poisson regression with covariates (log
link), conjugacy is lost and glmb() uses envelope-based
accept–reject sampling with a dNormal() prior on the
regression coefficients (Chapter 10). The Bayes Rules!
equality_index Poisson regression example is worked in
Chapter 10, Appendix A.
The hearttransplants dataset (Albert 2020; Christiansen and Morris 1995)
records, for each of 94 U.S. hospitals, the exposure
e (expected deaths under the national baseline rate) and
observed deaths y within 30 days of
surgery. The model is \(y_i \mid \lambda_i
\sim \mathrm{Poisson}(e_i \lambda_i)\) where \(\lambda_i\) is the hospital’s standardized
mortality ratio.
library(LearnBayes)
data("hearttransplants")
cat(sprintf("%d hospitals | sum(y) = %d | sum(e) = %.0f\n",
nrow(hearttransplants), sum(hearttransplants$y), sum(hearttransplants$e)))
#> 94 hospitals | sum(y) = 277 | sum(e) = 294681Albert’s prior. Albert (Ch. 3.2) uses a \(\Gamma(16, 15174)\) prior on \(\lambda\), derived from historical national data (prior mean \(\approx 0.00105\)). He focuses on two specific hospitals:
| Hospital | Exposure \(e\) | Deaths \(y\) | Analytic posterior | Posterior mean | Posterior SD |
|---|---|---|---|---|---|
| A | 66 | 1 | \(\Gamma(17,\; 15240)\) | \(17/15240 \approx 0.001115\) | \(\sqrt{17}/15240 \approx 0.000271\) |
| B | 1767 | 4 | \(\Gamma(20,\; 16941)\) | \(20/16941 \approx 0.001181\) | \(\sqrt{20}/16941 \approx 0.000264\) |
ht_albert <- data.frame(
Hospital = c("A", "B"),
e = c(ex_A, ex_B),
y = c(yobs_A, yobs_B),
Posterior = c("Gamma(17, 15240)", "Gamma(20, 16941)"),
Mean = c(
(alpha0_ht + yobs_A) / (beta0_ht + ex_A),
(alpha0_ht + yobs_B) / (beta0_ht + ex_B)
),
SD = c(
sqrt(alpha0_ht + yobs_A) / (beta0_ht + ex_A),
sqrt(alpha0_ht + yobs_B) / (beta0_ht + ex_B)
),
check.names = FALSE
)
knitr::kable(ht_albert, digits = 6,
caption = "Albert Ch. 3.2: analytic posterior mean and SD (shape--rate Gamma)")| Hospital | e | y | Posterior | Mean | SD |
|---|---|---|---|---|---|
| A | 66 | 1 | Gamma(17, 15240) | 0.001115 | 0.000271 |
| B | 1767 | 4 | Gamma(20, 16941) | 0.001181 | 0.000264 |
glmb()The exposure enters as weights. The conjugate sampler
updates \(\Gamma(\alpha_0 + y,\; \beta_0 +
e)\) exactly.
ht_beta <- matrix(alpha0_ht / beta0_ht, 1L, 1L, dimnames = list(NULL, "(Intercept)"))
ht_pf <- dGamma(shape = alpha0_ht, rate = beta0_ht,
beta = ht_beta, Inv_Dispersion = FALSE)
set.seed(2026)
fit_A <- glmb(n = 20000, y ~ 1, data = data.frame(y = yobs_A),
weights = ex_A, family = poisson(link = "identity"), pfamily = ht_pf)
print(fit_A)
#>
#> Call: glmb(formula = y ~ 1, family = poisson(link = "identity"), pfamily = ht_pf,
#> n = 20000, data = data.frame(y = yobs_A), weights = ex_A)
#>
#> Posterior Mean Coefficients:
#> (Intercept)
#> 0.001121
#>
#> Effective Number of Parameters: 3.864282
#> Expected Residual Deviance: 768.8068
#> DIC: 904.6711set.seed(2026)
fit_B <- glmb(n = 20000, y ~ 1, data = data.frame(y = yobs_B),
weights = ex_B, family = poisson(link = "identity"), pfamily = ht_pf)
print(fit_B)
#>
#> Call: glmb(formula = y ~ 1, family = poisson(link = "identity"), pfamily = ht_pf,
#> n = 20000, data = data.frame(y = yobs_B), weights = ex_B)
#>
#> Posterior Mean Coefficients:
#> (Intercept)
#> 0.001185
#>
#> Effective Number of Parameters: 351.3413
#> Expected Residual Deviance: 101062.1
#> DIC: 107184ht_compare <- data.frame(
Hospital = c("A", "B"),
e = c(ex_A, ex_B),
y = c(yobs_A, yobs_B),
Posterior = c("Gamma(17, 15240)", "Gamma(20, 16941)"),
`Albert Mean` = ht_albert$Mean,
`Albert SD` = ht_albert$SD,
`glmb Post.Mean` = c(fit_A$coef.means["(Intercept)"],
fit_B$coef.means["(Intercept)"]),
`glmb Post.Sd` = c(sd(fit_A$coefficients[, "(Intercept)", drop = TRUE]),
sd(fit_B$coefficients[, "(Intercept)", drop = TRUE])),
check.names = FALSE
)
knitr::kable(ht_compare, digits = 6,
caption = "Albert analytic vs. glmb() posterior mean and SD")| Hospital | e | y | Posterior | Albert Mean | Albert SD | glmb Post.Mean | glmb Post.Sd |
|---|---|---|---|---|---|---|---|
| A | 66 | 1 | Gamma(17, 15240) | 0.001115 | 0.000271 | 0.001121 | 0.000270 |
| B | 1767 | 4 | Gamma(20, 16941) | 0.001181 | 0.000264 | 0.001185 | 0.000263 |
Each fit is for one hospital:
(Intercept) is the posterior draw for that hospital’s rate
\(\lambda\). The
glmb Post.Mean and
glmb Post.Sd columns should match Albert’s
Mean and SD to Monte Carlo error.
equality_index).rglmb).