There is not much reason to write another package for making simple
ordination plots with ggplot
. Gavin Simpson wrote his
ggvegan
package to provide autoplot
methods
for several classes of vegan
ordination objects with
ggplot
. Phyloseq
has been available for
several years now, and includes wrappers for making ordinations and
plotting them. The plot functions usually work with ordination objects
made with vegan
functions, too. The ggord
package makes bi- and tri-plots with ordinations made with
vegan
and other packages. And it is really not difficult to
write your own code - examples and scripts abound on the web. But I am
not aware of a package including functions for implementing several
vegan
style plots I frequently use. Fitting environmental
variables as vectors to an ordination with envfit
, for
example. Or delineating treatment groups with hulls, spiders, and
ellipses with vegan
’s family of ordi-
functions. Hence my foray into the field with this package,
ggordiplots
.
vegan
includes a family of functions for delineating
treatment groups in an ordination plot: ordihull
for
enclosing all points for a treatment within a polygon,
ordispider
for linking all points for a treatment to their
group centroid with line segments, and ordiellipse
for
scaling ellipses about group centroids in a variety of ways. I have
collected all of these features into a single function,
gg_ordiplot
; each feature may be turned on or off with a
logical as an argument. Usage (copied from the documentation available
by entering ?gg_ordiplot
) is:
gg_ordiplot(ord, groups, scaling = 1, choices = c(1, 2), kind = c("sd", "se", "ehull"),
conf = NULL, show.groups = "all", ellipse = TRUE, label = FALSE, hull = FALSE,
spiders = FALSE, plot = TRUE)
And the simple example given in the documentation is:
suppressPackageStartupMessages(library(vegan))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(ggordiplots))
data("dune")
data("dune.env")
dune.hel <- decostand(dune, method = "hellinger")
ord <- rda(dune.hel)
gg_ordiplot(ord, groups = dune.env$Management, pt.size = 3)
By default, ellipses indicating one standard deviation about the centroid are plotted for each group (i.e. kind = “sd”). This can be changed to give the standard errors about the group centroids:
If kind
equals “se”, or “sd”, confidence intervals may
be shown by setting conf
to a numeric value. For example,
show 95% confidence intervals for the standard errors:
Axes plotted may be chosen with choices of axes and ellipses (or
other features) limited to a subset of the treatment groups with
show.groups
. For example, plot ordination axes 2 and 3, and
draw ellipses for only BF and HF:
gg_ordiplot(ord, groups = dune.env$Management, choices = c(2, 3), show.groups = c("BF",
"HF"), kind = "se", pt.size = 3, conf = 0.95)
As a final example, indicate treatment groups with hulls, labels and spiders instead of ellipses.
Three functions in this package have to do with displaying
relationships between site ordinations (based on species presences or
abundances, i.e. community composition) and environmental variables.
vegan
’s envfit
function makes linear fits
between the variables and the ordination axes. These can then be
displayed as vectors originating at the center of the plot. Projections
of sites on these vectors should order the sites with respect to the
variable. But the fit may not be linear. For individual variables,
linearity can be checked graphically by making contour (or surface)
plots, and by making bubble plots. In the first case, contours of the
value of the variable are fit to the ordination plot. In the second, the
site symbols are sized in proportion to the value of the variable.
Usage for this function is:
gg_envfit(ord, env, groups = NA, scaling = 1, choices = c(1, 2), perm = 999, alpha = 0.05,
angle = 20, len = 0.5, unit = "cm", arrow.col = "red", pt.size = 3, plot = TRUE)
And a simple example copied from the documentation is:
data(varespec)
data(varechem)
vare.dist <- vegdist(varespec, method = "bray")
set.seed(123)
vare.mds <- monoMDS(vare.dist)
set.seed(123)
gg_envfit(ord = vare.mds, env = varechem, perm = 9999, pt.size = 2, alpha = 0.2)
As with gg_ordiplot
, the function silently returns a
list of the data frames (df_ord and df_arrows) used for the plot and the
plot itself. They can be captured, examined and used to make modified
plots as explained in the other vignette (“Modifying ggordiplots Plots”)
included in this package.
The print method for envfit
gives a table including the
r2 and p.value
(Pr>r) for each variable.
##
## ***VECTORS
##
## MDS1 MDS2 r2 Pr(>r)
## N -0.08414 0.99645 0.2245 0.0627 .
## P -0.79777 -0.60296 0.1891 0.1130
## K -0.91838 -0.39571 0.1549 0.1676
## Ca -0.87356 -0.48671 0.2836 0.0287 *
## Mg -0.95775 -0.28760 0.2134 0.0810 .
## S -0.46058 -0.88762 0.0502 0.5873
## Al 1.00000 0.00047 0.4819 0.0012 **
## Fe 0.99078 0.13548 0.4073 0.0032 **
## Mn -1.00000 0.00256 0.4182 0.0026 **
## Zn -0.99765 -0.06848 0.1390 0.2079
## Mo 0.77138 0.63638 0.0482 0.5980
## Baresoil -0.99753 0.07023 0.1641 0.1562
## Humdepth -0.91408 -0.40553 0.4759 0.0019 **
## pH 0.99417 -0.10786 0.2063 0.0925 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation: free
## Number of permutations: 9999
I used a fairly large value of alpha
in the
gg_envfit
plot above so that vectors for variables with
weak correlations would still be included in the plot. And of course
they all plotted as vectors, i.e. straight lines, because that is the
way the model works. Let’s compare variables with strong and weak
correlations using contour and bubble plots. For
gg_ordisurf
you may have to try different values of
binwidth
to get a plot that you like. The default value is
calculated as the difference between the minimum and maximum values of
the variable divided by 15. Larger values give fewer contours and vice
versa.
gg_ordisurf(ord = vare.mds, env.var = varechem$Al, binwidth = 20, pt.size = 1, var.label = "Aluminum")
## Warning: The dot-dot notation (`..level..`) was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(level)` instead.
## ℹ The deprecated feature was likely used in the ggordiplots package.
## Please report the issue at <https://github.com/jfq3/ggordiplots/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
The contours are close to parallel and both plots show that the Al concentration increases from left to right, in the same direction as the vector in the previous plot.
Here we can compare directions because we are using the same ordination data for all of the plots. That is, we are not re-running the ordination each time. If we were, we would have to be more careful in making comparisons with directions. The signs along the axes have no real meaning, and axes can flip unexpectedly. This is especially true of NMDS which is essentially a randomization process. This is why I set the randomization seed, because otherwise the axes could flip when I compiled the document and would not match the text I had written. But axes can flip with other methods, too, if different algorithms are used or samples are added or removed.
Now make the same types of plots for a variable with a weak correlation, % of ground that is bare soil.
This relationship is obviously not linear. The value peaks near the center of the first axis (i.e.is monotonic) and there is no clear relationship with the second axis.
I tried making plots for molybdenum (Mo), but the correlation was so bad that the function to fit contours failed. From the bubble plot we can see why.
gg_ordisurf
and gg_ordibubble
also silently
return lists of the data frames used and the plots made. You may
examine, use, and modify these with the methods you have learned
above.
ordicluster
is another member of vegan
’s
family of ordi-
functions. It overlays a cluster diagram on
an ordination plot. This is something I rarely use, but I have included
the function gg_ordicluster
to make the same plots with
ggplot2
. It accepts the same arguments as
ordicluster
(except display
cannot be changed
from “sites”), plus I have added functionality for showing treatment
groups by mapping them to symbols (shapes
in
ggplot2
). Usage is :
ord
is an ordination object, and cl
is the
result from hclust
based on the same distance matrix as the
ordination. Kindt & Coe (BiodiverisityR) recommend single linkage
clustering be used to evaluate how well ordination reflects clustering.
The simple example from the documentation is:
data("dune")
dune.bray <- vegdist(dune, method = "bray")
ord <- cmdscale(dune.bray, k = nrow(dune) - 1, eig = TRUE, add = TRUE)
## Warning in cmdscale(dune.bray, k = nrow(dune) - 1, eig = TRUE, add = TRUE):
## only 18 of the first 19 eigenvalues are > 0
And to demonstrate results with additional arguments:
See the other vignette in this package, “Modifying ggordiplots Plots,” for information on how all of the above plots may be customized.