Clustering and visualization of time-series whole-brain activity data of C. elegans using WormTensor

Kentaro Yamamoto

Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research

Koki Tsuyuzaki

Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research

Itoshi Nikaido

Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research

Load libraries

Install the WormTensor package from CRAN or GitHub in advance, then run the code below in the R console.

library(WormTensor)

worm_download

worm_download retrieves data from figshare for a total of 28 animals (24 normal and 4 noisy). When called with no arguments, it downloads the mSBD distance matrices (covering the 24 normal animals).

object <- worm_download()

as_worm_tensor

as_worm_tensor generates a WormTensor object from distance matrices. The WormTensor object (an S4 class) is used by worm_membership, worm_clustering, worm_evaluate, and worm_visualize.

object <- as_worm_tensor(object$Ds)

worm_membership

worm_membership generates a membership tensor from a WormTensor object that holds distance matrices. Set the assumed number of clusters with k (k >= 2).

object <- worm_membership(object, k=6)
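
To make the idea of cluster membership concrete, the following base-R sketch turns one toy distance matrix into a membership matrix for a chosen k. It is only an illustration under stated assumptions: the toy data, the average-linkage hierarchical clustering, and the names `toy`, `membership`, and `co_membership` are hypothetical, and the way worm_membership builds its tensor internally may differ.

```r
set.seed(1)

# Toy data (hypothetical): 10 cells, 50 time frames each
toy <- matrix(rnorm(10 * 50), nrow = 10,
    dimnames = list(paste0("cell", 1:10), NULL))

# Pairwise Euclidean distances between cells
D <- dist(toy)

# Hierarchical clustering cut into k clusters (k >= 2)
k <- 3
cl <- cutree(hclust(D, method = "average"), k = k)

# One-hot membership matrix: rows are cells, columns are clusters
membership <- outer(cl, seq_len(k), "==") * 1

# Co-membership matrix: 1 if two cells fall in the same cluster, else 0
co_membership <- tcrossprod(membership)

table(cl)
```

Stacking one such matrix per animal along a third dimension is one way to picture a membership tensor.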

worm_clustering

worm_clustering is a function to generate a clustering result from a WormTensor object with a membership tensor.

object <- worm_clustering(object)

worm_evaluate

worm_evaluate is a function to generate an evaluation result from a WormTensor object with a worm_clustering result.

object <- worm_evaluate(object)

worm_visualize

worm_visualize is a function to visualize worm_clustering and worm_evaluate results.

object <- worm_visualize(object)

Figure1a : Silhouette plots

Figure1b : Dimensional reduction Plots colored by cluster

Figure1c : Dimensional reduction Plots colored by no. of identified cells

Figure1d : ARI between the merged result and each animal (with MCMI)

Pipe Operation

The above functions can also be run by connecting them with R’s native pipe.

worm_download()$Ds |>
    as_worm_tensor() |>
        worm_membership(k=6) |>
            worm_clustering() |>
                worm_evaluate() |>
                    worm_visualize() -> object
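
For readers new to the native pipe, here is a small base-R sketch of the same pattern, independent of WormTensor. `x |> f()` passes `x` as the first argument of `f`, and the trailing `->` assigns the final value to the name on its right. The native pipe requires R 4.1 or later; the variable names are illustrative only.

```r
# Nested call: read from the inside out
total_nested <- sum(sqrt(c(1, 4, 9)))

# Same computation with the native pipe, read top to bottom;
# `->` assigns the result to the name on the right
c(1, 4, 9) |>
    sqrt() |>
    sum() -> total_piped

identical(total_nested, total_piped)
```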

Pipe Operation (with Labels)

If you have a label for the cells, you can use it for external evaluation.

# Sample labels: cluster first to size the label vectors, then draw random labels
worm_download()$Ds |>
    as_worm_tensor() |>
        worm_membership(k=6) |>
            worm_clustering() -> object
labels <- list(
    label1 = sample(3, length(object@clustering), replace=TRUE),
    label2 = sample(4, length(object@clustering), replace=TRUE),
    label3 = sample(5, length(object@clustering), replace=TRUE))
# WormTensor (with Labels)
worm_download()$Ds |>
    as_worm_tensor() |>
        worm_membership(k=6) |>
            worm_clustering() |>
                worm_evaluate(labels) |>
                    worm_visualize() -> object_labels
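
External evaluation compares a clustering against given labels, and the adjusted Rand index (ARI) named in the figure captions is one common agreement measure. The base-R sketch below computes the ARI from a contingency table. The function name `adjusted_rand_index` is hypothetical and this is not necessarily how worm_evaluate computes its scores; it is meant only to show what the number measures.

```r
# Adjusted Rand index between two partitions of the same items
adjusted_rand_index <- function(x, y) {
    stopifnot(length(x) == length(y))
    tab <- table(x, y)
    # Number of unordered pairs among n items
    choose2 <- function(n) n * (n - 1) / 2
    sum_cells <- sum(choose2(tab))          # pairs together in both partitions
    sum_rows <- sum(choose2(rowSums(tab)))  # pairs together in x
    sum_cols <- sum(choose2(colSums(tab)))  # pairs together in y
    total <- choose2(length(x))
    expected <- sum_rows * sum_cols / total
    maximum <- (sum_rows + sum_cols) / 2
    # NaN when both partitions are trivial (maximum equals expected)
    (sum_cells - expected) / (maximum - expected)
}

# Identical partitions under different label names
ari_same <- adjusted_rand_index(c(1, 1, 2, 2, 3, 3),
    c("a", "a", "b", "b", "c", "c"))

# Partitions that disagree on every pair
ari_diff <- adjusted_rand_index(c(1, 1, 2, 2), c(1, 2, 1, 2))
```

An ARI of 1 means the two partitions agree exactly up to relabeling, values near 0 indicate chance-level agreement, and negative values indicate less agreement than expected by chance.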

Figure2a : Silhouette plots

Figure2b : Dimensional reduction Plots colored by cluster

Figure2c : Dimensional reduction Plots colored by no. of identified cells

Figure2d : ARI between the merged result and each animal (with MCMI)

Figure2e : Dimensional reduction Plots colored by label

Figure2f : Consistency of labels and cluster members

worm_distance

worm_distance lets you analyze your own time-series data with WormTensor. It converts a list of time-series data matrices (one per animal) into distance matrices, which can then be passed to the rest of the WormTensor workflow.

# Toy data (data of 3 animals)
n_cell_x <- 13
n_cell_y <- 24
n_cell_z <- 29
n_cells <- 30
n_time_frames <- 100

# animal_x : 13 cells, 100 time frames
animal_x <- matrix(runif(n_cell_x*n_time_frames),
    nrow=n_cell_x, ncol=n_time_frames)
rownames(animal_x) <- sample(seq(n_cells), n_cell_x)
colnames(animal_x) <- seq(n_time_frames)

# animal_y : 24 cells, 100 time frames
animal_y <- matrix(runif(n_cell_y*n_time_frames),
    nrow=n_cell_y, ncol=n_time_frames)
rownames(animal_y) <- sample(seq(n_cells), n_cell_y)
colnames(animal_y) <- seq(n_time_frames)

# animal_z : 29 cells, 100 time frames
animal_z <- matrix(runif(n_cell_z*n_time_frames),
    nrow=n_cell_z, ncol=n_time_frames)
rownames(animal_z) <- sample(seq(n_cells), n_cell_z)
colnames(animal_z) <- seq(n_time_frames)

# Input list for worm_distance
X <- list(animal_x=animal_x,
    animal_y=animal_y,
    animal_z=animal_z)

# Pipe Operation
# tsne.perplexity must be adjusted for data size
worm_distance(X, "mSBD") |>
    as_worm_tensor() |>
        worm_membership(k=6) |>
            worm_clustering() |>
                worm_evaluate() |>
                    worm_visualize(tsne.perplexity=5) -> object
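
To show the shape of data involved, the sketch below builds a small list of matrices in the same layout as the toy data (cells in rows, time frames in columns, cell identifiers as row names) and computes one cells-by-cells distance matrix per animal in base R. Plain Euclidean distance is used here as a stand-in and is not the mSBD distance; the helper `make_animal` and the names `X_toy` and `Ds_toy` are hypothetical.

```r
set.seed(42)

# Hypothetical helper: one animal with n_cell observed cells drawn
# from n_cells_total possible cell identifiers
make_animal <- function(n_cell, n_time, n_cells_total) {
    m <- matrix(runif(n_cell * n_time), nrow = n_cell, ncol = n_time)
    rownames(m) <- sample(seq(n_cells_total), n_cell)
    colnames(m) <- seq(n_time)
    m
}

X_toy <- list(
    animal_x = make_animal(5, 20, 10),
    animal_y = make_animal(8, 20, 10))

# One square distance matrix per animal (Euclidean stand-in, not mSBD)
Ds_toy <- lapply(X_toy, function(m) as.matrix(dist(m, method = "euclidean")))

# Each matrix is cells x cells and keeps the cell identifiers
sapply(Ds_toy, dim)
```

Because each animal can have a different set of observed cells, the matrices differ in size, and the row names are what allow cells to be matched across animals.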

Session Information

#> R Under development (unstable) (2022-07-07 r82559)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 22.04 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] WormTensor_0.1.1
#> 
#> loaded via a namespace (and not attached):
#>  [1] ade4_1.7-22        tidyselect_1.2.1   viridisLite_0.4.2  farver_2.1.2      
#>  [5] dplyr_1.1.4        fastmap_1.2.0      promises_1.3.0     shinyjs_2.1.0     
#>  [9] digest_0.6.36      mime_0.12          lifecycle_1.0.4    factoextra_1.0.7  
#> [13] cluster_2.1.3      magrittr_2.0.3     compiler_4.3.0     rlang_1.1.4       
#> [17] sass_0.4.9         tools_4.3.0        utf8_1.2.4         yaml_2.3.9        
#> [21] knitr_1.48         ggsignif_0.6.4     labeling_0.4.3     plyr_1.8.9        
#> [25] abind_1.4-5        Rtsne_0.17         withr_3.0.0        purrr_1.0.2       
#> [29] grid_4.3.0         stats4_4.3.0       fansi_1.0.6        ggpubr_0.6.0      
#> [33] xtable_1.8-4       e1071_1.7-14       colorspace_2.1-0   aricode_1.0.3     
#> [37] ggplot2_3.5.1      scales_1.3.0       iterators_1.0.14   MASS_7.3-57.1     
#> [41] cli_3.6.3          rmarkdown_2.27     generics_0.1.3     RcppParallel_5.1.8
#> [45] rstudioapi_0.13    RSpectra_0.16-2    reshape2_1.4.4     usedist_0.4.0     
#> [49] cachem_1.1.0       proxy_0.4-27       stringr_1.5.1      splines_4.3.0     
#> [53] modeltools_0.2-23  parallel_4.3.0     clValid_0.7        vctrs_0.6.5       
#> [57] Matrix_1.4-1       jsonlite_1.8.8     carData_3.0-5      car_3.1-2         
#> [61] ggrepel_0.9.5      rstatix_0.7.2      clue_0.3-65        foreach_1.5.2     
#> [65] jquerylib_0.1.4    tidyr_1.3.1        glue_1.7.0         dtw_1.23-1        
#> [69] codetools_0.2-18   cowplot_1.1.3      flexclust_1.4-2    uwot_0.2.2        
#> [73] stringi_1.8.4      gtable_0.3.5       rTensor_1.4.8      later_1.3.2       
#> [77] munsell_0.5.1      tibble_3.2.1       pillar_1.9.0       htmltools_0.5.8.1 
#> [81] clusterSim_0.51-4  R6_2.5.1           evaluate_0.24.0    shiny_1.8.1.1     
#> [85] lattice_0.20-45    backports_1.5.0    dtwclust_5.5.10    broom_1.0.6       
#> [89] httpuv_1.6.15      bslib_0.7.0        class_7.3-20.1     Rcpp_1.0.13       
#> [93] nlme_3.1-158       mgcv_1.8-40        xfun_0.46          pkgconfig_2.0.3