Asymmetric Poisson Pseudo-Maximum Likelihood (APPML)

library(capybara)

Overview

Standard Poisson Pseudo-Maximum Likelihood (PPML) estimates the conditional mean of the outcome variable. fepoisson_asymmetric() extends this by fitting conditional expectiles — a generalization that lets you examine different parts of the conditional distribution with the same efficiency gains of high-dimensional fixed effects.

The implementation follows Bergstrand et al. (2025): instead of minimizing a symmetric loss, the estimator minimizes an asymmetric weighted loss controlled by a single parameter expectile (\(\tau \in (0, 1)\)):

The expectile is set through fit_control() and defaults to 0.5, so if you don’t specify it, fepoisson_asymmetric() will behave like standard PPML.

Estimating lower and upper expectiles

Shifting expectile away from 0.5 lets you trace how the entire conditional distribution of mpg varies with wt, after absorbing the cyl fixed effects.

ross2004_subset <- ross2004[ross2004$year %in% seq(1989, 1999, 5), ]
ross2004_subset$trade <- exp(ross2004_subset$ltrade)
ross2004_subset$exp_year <- paste0(ross2004_subset$ctry1, ross2004_subset$year)
ross2004_subset$imp_year <- paste0(ross2004_subset$ctry2, ross2004_subset$year)

# 10th expectile — sensitive to small values of mpg
fit10 <- fepoisson_asymmetric(
  trade ~ ldist + border + comlang + colony | exp_year + imp_year,
  data    = ross2004_subset,
  control = fit_control(expectile = 0.1)
)

summary(fit10)
Formula: trade ~ ldist + border + comlang + colony | exp_year + imp_year

Family: Poisson

Estimates:

|         | Estimate | Std. Error | z value     | Pr(>|z|)  |
|---------|----------|------------|-------------|-----------|
| ldist   |  -1.0916 |     0.0000 | -39358.0438 | 0.0000 ** |
| border  |   0.3419 |     0.0001 |   5729.9222 | 0.0000 ** |
| comlang |   0.3352 |     0.0001 |   5700.9311 | 0.0000 ** |
| colony  |   0.3942 |     0.0001 |   5416.8894 | 0.0000 ** |

Significance codes: ** p < 0.01; * p < 0.05; + p < 0.10

Fixed effects:
  exp_year: 457
  imp_year: 457

Number of observations: Full 21450; Missing 0; Perfect classification 0 

Number of Fisher Scoring iterations: 17
# 90th expectile — sensitive to large values of mpg
fit90 <- fepoisson_asymmetric(
  trade ~ ldist + border + comlang + colony | exp_year + imp_year,
  data    = ross2004_subset,
  control = fit_control(expectile = 0.9)
)

summary(fit90)
Formula: trade ~ ldist + border + comlang + colony | exp_year + imp_year

Family: Poisson

Estimates:

|         | Estimate | Std. Error | z value     | Pr(>|z|)  |
|---------|----------|------------|-------------|-----------|
| ldist   |  -0.9096 |     0.0000 | -45409.0443 | 0.0000 ** |
| border  |   0.3160 |     0.0000 |   7382.8698 | 0.0000 ** |
| comlang |   0.1897 |     0.0000 |   4789.5546 | 0.0000 ** |
| colony  |   0.4657 |     0.0001 |   7910.0068 | 0.0000 ** |

Significance codes: ** p < 0.01; * p < 0.05; + p < 0.10

Fixed effects:
  exp_year: 457
  imp_year: 457

Number of observations: Full 21450; Missing 0; Perfect classification 0 

Number of Fisher Scoring iterations: 18

Comparing coefficients across expectiles

A natural use-case is comparing how the effect of a regressor changes across the distribution. The table below collects the coefficient on wt at the three expectile levels:

summary_table(
  fit10, fit90,
  model_names = c("10th expectile", "90th expectile")
)
|     Variable     |   10th expectile   |   90th expectile   |
|------------------|--------------------|--------------------|
| ldist            |           -1.092** |           -0.910** |
|                  |            (0.000) |            (0.000) |
| border           |            0.342** |            0.316** |
|                  |            (0.000) |            (0.000) |
| comlang          |            0.335** |            0.190** |
|                  |            (0.000) |            (0.000) |
| colony           |            0.394** |            0.466** |
|                  |            (0.000) |            (0.000) |
|                  |                    |                    |
| Fixed effects    |                    |                    |
| exp_year         |                Yes |                Yes |
| imp_year         |                Yes |                Yes |
|                  |                    |                    |
| N                |             21,450 |             21,450 |

Standard errors in parenthesis
Significance levels: ** p < 0.01; * p < 0.05; + p < 0.10

The log-distance coefficient shrinks in magnitude from fit10 to fit90, which indicates that trade much less with each other at the top of the distribution compared to the bottom of the distribution.

Convergence diagnostics

The outer APPML loop updates observation weights until the coefficient vector stops changing. You can inspect convergence through the returned list elements:

cat("Outer loop converged:", fit10$conv_outer, "\n")
cat("Outer iterations:    ", fit10$iter_outer, "\n")
cat("Expectile used:      ", fit10$expectile, "\n")
Outer loop converged: TRUE 
Outer iterations:     4 
Expectile used:       0.1

References

Bergstrand, Jeffrey H., Matthew W. Clance, and JMC Santos Silva. “The tails of gravity: Using expectiles to quantify the trade-margins effects of economic integration agreements.” Journal of International Economics (2025): 104145.