| Type: | Package |
| Title: | MST-kNN Clustering Algorithm |
| Version: | 1.0.0 |
| Description: | Implements the MST-kNN clustering algorithm proposed by Inostroza-Ponta (2008) https://trove.nla.gov.au/work/28729389. The algorithm determines the number of clusters automatically by recursively intersecting the Minimum Spanning Tree (MST) and the k-Nearest Neighbor (kNN) proximity graphs constructed from a pairwise distance matrix. The value of k is selected via a connectivity criterion (the smallest k such that the kNN graph is connected, bounded by floor(log(n))). The package requires only a distance matrix as input and returns cluster assignments, an 'igraph' network, and partition metadata. |
| License: | GPL-2 |
| URL: | https://github.com/jorgeklz/package-mstknnclust, https://jorgeklz.github.io/package-mstknnclust/ |
| BugReports: | https://github.com/jorgeklz/package-mstknnclust/issues |
| Depends: | R (≥ 3.5.0) |
| Imports: | igraph |
| Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0) |
| VignetteBuilder: | knitr |
| Encoding: | UTF-8 |
| LazyData: | true |
| Config/testthat/edition: | 3 |
| Config/roxygen2/version: | 8.0.0 |
| NeedsCompilation: | no |
| Packaged: | 2026-05-13 01:35:28 UTC; jorge |
| Author: | Jorge Parraga-Alava |
| Maintainer: | Jorge Parraga-Alava <jorge.parraga@utm.edu.ec> |
| Repository: | CRAN |
| Date/Publication: | 2026-05-13 07:10:02 UTC |
Indo-European languages dataset
Description
Contains the pairwise distances between 84 Indo-European languages, computed as the mean percent difference in cognacy over the 200-word Swadesh list.
Usage
data(dslanguages)
Format
A data frame with 84 rows and 84 columns containing a distance matrix.
Details
Once the data set is loaded, it is available as an object of class data.frame named dslanguages.
References
Dyen, I., Kruskal, J., and Black, P. (1992). An Indoeuropean classification: A lexicostatistical experiment. Transactions of the American Philosophical Society, 82(5).
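Because the data set is stored as a data frame rather than a matrix, it can be useful to coerce and sanity-check it before clustering. The following is a minimal sketch in base R; `as_distance_matrix()` is a hypothetical helper, not a function of this package, and the toy data frame stands in for a real distance table such as dslanguages.

```r
# Hypothetical helper (not part of mstknnclust): coerce a square data frame
# of pairwise distances to a numeric matrix and run basic sanity checks.
as_distance_matrix <- function(x, tol = 1e-8) {
  m <- as.matrix(x)
  stopifnot(is.numeric(m), nrow(m) == ncol(m))   # numeric and square
  stopifnot(all(abs(diag(m)) <= tol))            # zero self-distance
  stopifnot(max(abs(m - t(m))) <= tol)           # symmetric
  m
}

# Toy stand-in for a distance data frame (made-up labels and values).
toy <- data.frame(
  A = c(0, 2, 5),
  B = c(2, 0, 4),
  C = c(5, 4, 0),
  row.names = c("A", "B", "C")
)
d_toy <- as_distance_matrix(toy)
```

The resulting matrix has the shape that mst.knn() documents for its distance.matrix argument. Whether a given data set passes the symmetry and zero-diagonal checks is something to verify rather than assume.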
Budding Yeast dataset
Description
Contains the expression levels of 2467 genes across 79 samples from 8 experiments on budding yeast: alpha factor (18 samples), cdc15 (15 samples), cold shock (4 samples), diauxic shift (7 samples), DTT shock (4 samples), elutriation (14 samples), heat shock (6 samples) and sporulation (11 samples).
Usage
data(dsyeastexpression)
Format
A data frame with 2467 rows and 79 columns.
Details
Once the data set is loaded, it is available as an object of class data.frame named dsyeastexpression.
Source
https://www.pnas.org/content/suppl/1998/12/08/95.25.14863.DC1/3917data.xls
References
Eisen, M. B., Spellman, P. T., Brown, P. O., and Botstein, D. (1998). Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences, 95(25):14863–14868.
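This data set holds raw expression values, not distances, so a distance matrix has to be derived before clustering. The sketch below uses a small random stand-in with the same orientation (genes in rows, samples in columns). The choice of Euclidean distance for samples and 1 minus Pearson correlation for genes is an assumption made for illustration; the documentation does not prescribe a metric.

```r
set.seed(1)
# Toy stand-in: 20 "genes" (rows) by 6 "samples" (columns), random values.
expr <- as.data.frame(
  matrix(rnorm(20 * 6), nrow = 20,
         dimnames = list(paste0("g", 1:20), paste0("s", 1:6)))
)

# Distances between samples: dist() compares rows, so transpose first.
d_samples <- as.matrix(stats::dist(t(expr), method = "euclidean"))

# Distances between genes as 1 - Pearson correlation (an assumed choice).
# cor() correlates columns, so the transposed table correlates genes.
d_genes <- 1 - stats::cor(t(expr))
```

Either matrix is square with one row and column per object, which is the input shape mst.knn() expects.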
Generates clustering results
Description
Generates clustering results
Usage
generate.results(g_clusters, distance.matrix)
Arguments
g_clusters |
igraph object with all clusters as connected components. |
distance.matrix |
The original distance matrix. |
Value
A list with cnumber, cluster, partition, csize, network.
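The Value section names the list elements without describing them. The sketch below shows one plausible reading, built in base R from a made-up membership vector. The content assigned to each name is an assumption based on the element names alone, and network is omitted because it would require an igraph object.

```r
# Made-up membership: object name -> cluster id (illustrative only).
membership <- c(a = 1, b = 1, c = 2, d = 3, e = 2)

res <- list(
  cnumber   = length(unique(membership)),         # assumed: number of clusters
  cluster   = membership,                         # assumed: cluster id per object
  partition = data.frame(                         # assumed: objects ordered by cluster
    object  = names(membership),
    cluster = unname(membership)
  )[order(membership), ],
  csize     = as.vector(table(membership))        # assumed: objects per cluster
)
```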
Performs the MST-kNN clustering algorithm
Description
Performs the MST-kNN clustering algorithm, which determines the number of clusters automatically by recursively intersecting the Minimum Spanning Tree (MST) and the k-Nearest Neighbor (kNN) graphs.
Usage
mst.knn(distance.matrix, suggested.k)
Arguments
distance.matrix |
A numeric matrix or data.frame with equal numbers of rows and columns representing pairwise distances between objects. |
suggested.k |
Optional. A numeric value representing the suggested number of nearest neighbours. |
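When suggested.k is omitted, the package description states that k is the smallest value for which the kNN graph is connected, bounded by floor(log(n)). A minimal base R sketch of that rule follows. The helper names are hypothetical and are not the package's internals, and linking two nodes when either one lists the other among its k nearest neighbours is an assumption about how the kNN graph is built.

```r
# Hypothetical helpers (not the package's internals), base R only.

# Symmetric kNN adjacency: i and j are linked if either is among the
# other's k nearest neighbours (assumed definition).
knn_adjacency <- function(d, k) {
  n <- nrow(d)
  adj <- matrix(FALSE, n, n)
  for (i in seq_len(n)) {
    nn <- order(d[i, ])
    nn <- nn[nn != i][seq_len(k)]   # drop self, keep the k closest
    adj[i, nn] <- TRUE
  }
  adj | t(adj)
}

# Breadth-first search from node 1 to test whether all nodes are reachable.
is_connected <- function(adj) {
  seen <- rep(FALSE, nrow(adj))
  seen[1] <- TRUE
  queue <- 1
  while (length(queue) > 0) {
    v <- queue[1]
    queue <- queue[-1]
    nb <- which(adj[v, ] & !seen)
    seen[nb] <- TRUE
    queue <- c(queue, nb)
  }
  all(seen)
}

# Smallest k giving a connected kNN graph, capped at floor(log(n)).
choose_k <- function(d) {
  k_max <- max(1, floor(log(nrow(d))))
  for (k in seq_len(k_max)) {
    if (is_connected(knn_adjacency(d, k))) return(k)
  }
  k_max
}

# Toy 1-D data: four tight pairs, so k = 1 only links points within a pair.
x_pairs <- c(0, 1, 10, 11, 20, 21, 30, 31)
k_pairs <- choose_k(as.matrix(stats::dist(x_pairs)))
```

The sketch assumes at least two objects. With the natural logarithm the cap grows slowly, for example floor(log(100)) is 4.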
Value
A list with elements cnumber, cluster, partition, csize, network.
Author(s)
Mario Inostroza-Ponta, Jorge Parraga-Alava, Pablo Moscato
Examples
set.seed(1987)
n <- 100; m <- 15
x <- matrix(runif(n * m, min = -5, max = 10), nrow = n, ncol = m)
d <- base::as.matrix(stats::dist(x, method = "euclidean"))
library("mstknnclust")
results <- mst.knn(d)
library("igraph")
plot(results$network,
     vertex.size = 8,
     vertex.color = igraph::components(results$network)$membership,
     layout = igraph::layout_with_fr(results$network, niter = 10000),
     main = paste("MST-kNN | clusters =", results$cnumber))
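To make the core idea concrete, the following base R sketch performs a single MST and kNN intersection on toy data and reads the clusters off as connected components. It illustrates the principle only: the helper names are hypothetical, k is fixed by hand, the kNN definition is an assumption, and the recursive re-application to each component described above is left out.

```r
# Hypothetical helpers (not the package's internals), base R only.

# Prim's algorithm: logical adjacency matrix of a minimum spanning tree.
mst_adjacency <- function(d) {
  n <- nrow(d)
  adj <- matrix(FALSE, n, n)
  in_tree <- c(TRUE, rep(FALSE, n - 1))
  best <- d[1, ]               # cheapest known link from the tree to each node
  parent <- rep(1L, n)
  for (step in seq_len(n - 1)) {
    cand <- which(!in_tree)
    v <- cand[which.min(best[cand])]
    adj[v, parent[v]] <- adj[parent[v], v] <- TRUE
    in_tree[v] <- TRUE
    closer <- !in_tree & d[v, ] < best
    best[closer] <- d[v, closer]
    parent[closer] <- v
  }
  adj
}

# Symmetric kNN adjacency (assumed definition: either node lists the other).
knn_adjacency <- function(d, k) {
  n <- nrow(d)
  adj <- matrix(FALSE, n, n)
  for (i in seq_len(n)) {
    nn <- order(d[i, ])
    nn <- nn[nn != i][seq_len(k)]
    adj[i, nn] <- TRUE
  }
  adj | t(adj)
}

# Label connected components with repeated breadth-first searches.
component_labels <- function(adj) {
  n <- nrow(adj)
  label <- integer(n)
  current <- 0L
  for (s in seq_len(n)) {
    if (label[s] != 0L) next
    current <- current + 1L
    label[s] <- current
    queue <- s
    while (length(queue) > 0) {
      v <- queue[1]
      queue <- queue[-1]
      nb <- which(adj[v, ] & label == 0L)
      label[nb] <- current
      queue <- c(queue, nb)
    }
  }
  label
}

x_toy <- c(0, 1, 3, 50, 51, 53)          # toy 1-D data with two distant groups
d_toy <- as.matrix(stats::dist(x_toy))
shared <- mst_adjacency(d_toy) & knn_adjacency(d_toy, k = 1)
labels_toy <- component_labels(shared)   # one label per object
```

The long MST edge that bridges the two groups is absent from the kNN graph, so the intersection drops it and the groups separate into different components.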
Generates the solution when only singletons are yielded
Description
Generates the solution when only singletons are yielded
Usage
only.single.graphs(total_nodos, nodos_singletons)
Arguments
total_nodos |
Total number of nodes in the data matrix. |
nodos_singletons |
List of nodes that form singleton clusters. |
Value
An object of class "igraph" as a network representing the clustering solution.