Splits and Networx

Klaus Schliep

September 17, 2024

This tutorial gives a basic introduction for constructing phylogenetic networks and adding parameters to trees or networx objects using phangorn (K. P. Schliep 2011) in R. Splits graphs or phylogenetic networks are a useful way to display conflicting data or to summarize different trees. Here, we present two popular networks, consensus networks (Holland et al. 2004) and Neighbor-Net (Bryant and Moulton 2004).
Trees or networks are often missing either edge weights or edge support values. We show how to improve a tree/networx object by adding support values or estimating edge weights using non-negative Least-Squares (nnls).

We first load the phangorn package and a few data sets we use in this vignette.

library(phangorn)
data(Laurasiatherian)
data(yeast)

consensusNet

A consensusNet (Holland et al. 2004) is a generalization of a consensus tree. Instead of only representing splits (taxon bipartitions) occurring in at least 50% of the trees in a bootstrap or MCMC sample one can use a lower threshold and explore competing splits. Note that, in its basic implementation used here, the consensusNet edge lengths are proportional to the frequency of the corresponding splits in the provided list of trees.

The input for consensusNet is a list of trees i.e. an object of class multiPhylo.

set.seed(1)
bs <- bootstrap.phyDat(yeast, FUN = function(x)nj(dist.hamming(x)), 
    bs=100)
tree <- nj(dist.hamming(yeast))
par("mar" = rep(1, 4))
tree <- plotBS(tree, bs, "phylogram")

cnet <- consensusNet(bs, .3)
plot(cnet, show.edge.label=TRUE)

In many cases, consensusNet will return more than two incompatible (competing) splits. This cannot be plotted as a planar (2-dimensional) graph. Such as situation requires a n-dimensional graph, where the maximum number of dimensions equals the maximum number of incompatible splits. For example, if we have three alternative incompatible splits: (A,B)|(C,D) vs. (A,C)|(B,D) vs. (A,D)|(B,C), we need a 3-dimensional graph to show all three alternatives. A nice way to get still a good impression of the network is to plot it in 3D.

plot(cnet, "3D")
# rotate 3d plot
play3d(spin3d(axis=c(0,1,0), rpm=6), duration=10)
# create animated gif file 
movie3d(spin3d(axis=c(0,1,0), rpm=6), duration=10)

which will result in a spinning graph similar to this

rotatingNetworx
rotatingNetworx

neighborNet

The function neighborNet implements the popular method of Bryant and Moulton (2004). The Neighbor-Net algorithm is essentially a 2D-version of the Neighbor joining algorithm. The Neighbour-net is computed in two steps: the first computes a circular ordering of the taxa in the data set; the second step involves the estimation of edge weights using non-negative Least-Squares (nnls).

dm <- dist.hamming(yeast)
nnet <- neighborNet(dm)
par("mar" = rep(1, 4))
plot(nnet)

The advantage of Neighbor-Net is that it returns a circular split system which can be always displayed in a planar (2D) graph. The rendering of the networx is done using the the fantastic igraph package (Csardi and Nepusz 2006).

Adding support values

We can use the generic function addConfidences to add (branch) support values from a tree, i.e. an object of class phylo to a networx, splits or phylo object. The Neighbor-Net object we computed above provides no support values. We can add the support values from the tree we computed to the splits (edges) shared by both objects.

nnet <- addConfidences(nnet, tree)
par("mar" = rep(1, 4))
plot(nnet, show.edge.label=TRUE)

Analogously, we can also add support values to a tree:

tree2 <- rNNI(tree, 2)
tree2 <- addConfidences(tree2, tree)
# several support values are missing
par("mar" = rep(1, 4))
plot(tree2, show.node.label=TRUE)

Estimating edge weights (nnls)

Consensus networks, on the other hand, provide primarily information about support values corresponding to a split, but no information about the actual difference between the taxon bipartitions defined by that split. For example, one may be interested how the alternative support values correspond with the actual genetic distance between the involved taxa. Given a distance matrix, we can estimate edge weights using non-negative Least-Squares, and plot them onto the consensusNet splits graph.

cnet <- nnls.networx(cnet, dm)
par("mar" = rep(1, 4))
plot(cnet, show.edge.label=TRUE)

Import and export networks, advanced functions for networx objects

The functions read.nexus.networx and write.nexus.networx can read and write nexus files for or from SplitsTree (Huson and Bryant 2006). Check-out the new vignette IntertwiningTreesAndNetworks (K. Schliep et al. 2017) for additional functions, examples, and advanced application.

Session Information

sessionInfo()
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Debian GNU/Linux 12 (bookworm)
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.11.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.11.0
## 
## locale:
##  [1] LC_CTYPE=de_AT.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=de_AT.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=de_AT.UTF-8    LC_MESSAGES=de_AT.UTF-8   
##  [7] LC_PAPER=de_AT.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Europe/Vienna
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] phangorn_2.12.1 ape_5.8-0.1    
## 
## loaded via a namespace (and not attached):
##  [1] Matrix_1.7-0      gtable_0.3.5      jsonlite_1.8.8    dplyr_1.1.4      
##  [5] compiler_4.4.1    highr_0.11        tidyselect_1.2.1  Rcpp_1.0.13      
##  [9] parallel_4.4.1    jquerylib_0.1.4   scales_1.3.0      yaml_2.3.10      
## [13] fastmap_1.2.0     lattice_0.22-6    ggplot2_3.5.1     R6_2.5.1         
## [17] labeling_0.4.3    generics_0.1.3    igraph_2.0.3      knitr_1.48       
## [21] tibble_3.2.1      munsell_0.5.1     bslib_0.8.0       pillar_1.9.0     
## [25] rlang_1.1.4       utf8_1.2.4        fastmatch_1.1-4   cachem_1.1.0     
## [29] xfun_0.47         quadprog_1.5-8    sass_0.4.9        cli_3.6.3        
## [33] withr_3.0.1       magrittr_2.0.3    digest_0.6.37     grid_4.4.1       
## [37] rstudioapi_0.16.0 lifecycle_1.0.4   nlme_3.1-165      vctrs_0.6.5      
## [41] evaluate_0.24.0   glue_1.7.0        farver_2.1.2      codetools_0.2-20 
## [45] ggseqlogo_0.2     fansi_1.0.6       colorspace_2.1-1  rmarkdown_2.28   
## [49] tools_4.4.1       pkgconfig_2.0.3   htmltools_0.5.8.1

References

Bryant, David, and Vincent Moulton. 2004. Neighbor-Net: An Agglomerative Method for the Construction of Phylogenetic Networks.” Molecular Biology and Evolution 21 (2): 255–65. https://doi.org/10.1093/molbev/msh018.
Csardi, Gabor, and Tamas Nepusz. 2006. “The Igraph Software Package for Complex Network Research.” InterJournal Complex Systems: 1695. https://igraph.org/.
Holland, Barbara R., Katharina T. Huber, Vincent Moulton, and Peter J. Lockhart. 2004. “Using Consensus Networks to Visualize Contradictory Evidence for Species Phylogeny.” Molecular Biology and Evolution 21 (7): 1459–61. https://doi.org/10.1093/molbev/msh145.
Huson, D. H., and D. Bryant. 2006. “Application of Phylogenetic Networks in Evolutionary Studies.” Molecular Biology and Evolution 23 (2): 254–67.
Schliep, Klaus Peter. 2011. “Phangorn: Phylogenetic Analysis in R.” Bioinformatics 27 (4): 592–93. https://doi.org/10.1093/bioinformatics/btq706.
Schliep, Klaus, Alastair J. Potts, David A. Morrison, and Guido W. Grimm. 2017. “Intertwining Phylogenetic Trees and Networks.” Methods in Ecology and Evolution 8 (10): 1212–20.