Taxonomy

The similarity matrix represents a graph with vertices and edges.
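To make this concrete, here is a minimal Python sketch with hypothetical toy data (the names `profiles` and `sim` are illustrative, not matric's API):

```python
import numpy as np

# Hypothetical toy data: 6 profiles (vertices), 10 features each.
rng = np.random.default_rng(0)
profiles = rng.normal(size=(6, 10))

# Cosine similarity between every pair of profiles; sim[i, j] is the
# weight of the edge between vertices i and j. The diagonal
# (self-similarity) is ignored by the metrics below.
normed = profiles / np.linalg.norm(profiles, axis=1, keepdims=True)
sim = normed @ normed.T
```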

Each vertex belongs to three nested sets:

- its replicate set,
- its group replicate set, and
- the set of all vertices.

We calculate metrics hierarchically:

- Level 1-0: metrics for an individual vertex, with respect to its replicate set
- Level 2-1: metrics for a replicate set, with respect to its group replicate set

We can aggregate each of these metrics to produce more metrics:

- Level 1: aggregations of Level 1-0 metrics across vertices
- Level 2: aggregations of Level 2-1 metrics across replicate sets

Consider a compound perturbation experiment done in replicates in a multi-well plate. Each compound belongs to one or more MOAs (mechanisms of action).
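As an illustration of the nested sets in this setting, here is a hypothetical six-well design in Python (the group-replicate convention, same MOA but different compound, is an assumption):

```python
import numpy as np

# Hypothetical design: two replicate wells for each of three compounds;
# c1 and c2 share MOA m1, c3 has MOA m2.
compound = np.array(["c1", "c1", "c2", "c2", "c3", "c3"])
moa = np.array(["m1", "m1", "m1", "m1", "m2", "m2"])

i = 0  # pick a vertex (well)
replicate_set = compound == compound[i]
replicate_set[i] = False  # a vertex is not its own replicate
# Assumed convention: group replicates share the MOA but not the compound.
group_replicate_set = (moa == moa[i]) & (compound != compound[i])
non_replicates = compound != compound[i]
```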

Further,

The metrics implemented in matric are defined below.

Level 1-0

Raw metrics

| Metric | Description |
|---|---|
| `sim_mean_i` | mean similarity of a vertex to its replicate vertices |

Related: `sim_median_i`, which uses the median instead of the mean.
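A sketch of how this could be computed under the toy setup above (illustrative code, not matric's implementation; `sim` and `compound` are the assumed variables from the earlier sketches):

```python
import numpy as np

def sim_mean_i(sim, compound, i, stat=np.mean):
    """Mean similarity of vertex i to its replicate vertices;
    pass stat=np.median for sim_median_i."""
    replicate_set = compound == compound[i]
    replicate_set[i] = False  # exclude self-similarity
    return stat(sim[i, replicate_set])

# e.g. sim_mean_i(sim, compound, 0) or sim_mean_i(sim, compound, 0, np.median)
```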

Scaled metrics

| Metric | Description |
|---|---|
| `sim_scaled_mean_non_rep_i` | `sim_mean_i` scaled using `sim_mean_stat_non_rep_i` and `sim_sd_stat_non_rep_i` |

where `sim_mean_stat_non_rep_i` and `sim_sd_stat_non_rep_i` are the mean and standard deviation of the vertex's similarity to its non-replicate vertices; the scaling is the z-score (`sim_mean_i` − `sim_mean_stat_non_rep_i`) / `sim_sd_stat_non_rep_i`.
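Under that z-score reading, a sketch with the same assumed variables (the sample standard deviation, `ddof=1`, is also an assumption):

```python
import numpy as np

def sim_scaled_mean_non_rep_i(sim, compound, i):
    """Shift and scale vertex i's mean replicate similarity by the
    mean and s.d. of its similarities to non-replicates."""
    replicate_set = compound == compound[i]
    replicate_set[i] = False
    non_rep = compound != compound[i]

    mean_i = sim[i, replicate_set].mean()  # sim_mean_i
    mu = sim[i, non_rep].mean()            # sim_mean_stat_non_rep_i
    sd = sim[i, non_rep].std(ddof=1)       # sim_sd_stat_non_rep_i
    return (mean_i - mu) / sd
```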

Related:

Rank-based and retrieval-based metrics

Consider a list of vertices comprising the vertex's replicate and non-replicate vertices, ranked by their similarity to the vertex.

| Metric | Description |
|---|---|
| `sim_ranked_relrank_mean_non_rep_i` | the mean percentile of the vertex's replicates in this list |
| `sim_retrieval_average_precision_non_rep_i` | the average precision on the list, with the replicates as the positive class |
| `sim_retrieval_r_precision_non_rep_i` | the R-precision on the list, likewise with the replicates as the positive class |
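A sketch of all three under the toy setup above (illustrative, not matric's code; the percentile convention, 0 = most similar, is an assumption):

```python
import numpy as np
from sklearn.metrics import average_precision_score

def rank_retrieval_metrics_i(sim, compound, i):
    others = np.arange(len(compound)) != i
    labels = (compound == compound[i])[others]  # True = replicate of i
    scores = sim[i, others]

    # sim_ranked_relrank_mean_non_rep_i: mean relative rank (0 = top)
    # of the replicates in the similarity-ranked list.
    order = np.argsort(-scores)                 # most similar first
    relrank = np.argsort(order) / (len(scores) - 1)
    relrank_mean = relrank[labels].mean()

    # sim_retrieval_average_precision_non_rep_i
    ap = average_precision_score(labels, scores)

    # sim_retrieval_r_precision_non_rep_i: precision among the top R
    # results, where R is the number of replicates.
    r = int(labels.sum())
    r_precision = labels[order][:r].mean()

    return relrank_mean, ap, r_precision
```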

Related:

Level 1 aggregations of Level 1-0 metrics

Note: these are Level 1 summaries of the scaling parameters; they are not themselves used for scaling:

Level 2-1

Raw metrics

| Metric | Description |
|---|---|
| `sim_mean_g` | mean similarity of the vertices in a replicate set to its group replicate vertices |

Related: `sim_median_g`, which uses the median instead of the mean.
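A sketch under the same assumed toy setup (the group-replicate convention, same MOA but different compound, remains an assumption):

```python
import numpy as np

def sim_mean_g(sim, compound, moa, c, stat=np.mean):
    """Mean similarity of compound c's wells to its group replicate
    wells; pass stat=np.median for sim_median_g."""
    members = compound == c
    group_reps = (moa == moa[members][0]) & ~members
    # All pairwise similarities between the replicate set and its
    # group replicates, averaged.
    return stat(sim[np.ix_(members, group_reps)])
```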

Scaled metrics

| Metric | Description |
|---|---|
| `sim_scaled_mean_non_rep_g` | `sim_mean_g` scaled using `sim_mean_stat_non_rep_g` and `sim_sd_stat_non_rep_g` |

where `sim_mean_stat_non_rep_g` and `sim_sd_stat_non_rep_g` are the mean and standard deviation of the replicate set's similarity to its non-replicate vertices; the scaling is the z-score (`sim_mean_g` − `sim_mean_stat_non_rep_g`) / `sim_sd_stat_non_rep_g`.

Related:

Rank-based and retrieval-based metrics

Consider a list of vertices comprising the replicate set's group replicate and non-replicate vertices, ranked by their similarity to the replicate set.

We define metrics similar to the corresponding Level 1-0 metrics:

Level 2 aggregations of Level 2-1 metrics

These are not implemented.

Addendum

This is a related discussion on metrics, from here.

We have a weighted graph where the vertices are perturbations with multiple labels (e.g. pathways, in the case of genetic perturbations), and the edge weights are the similarities between vertices (e.g. the cosine similarity between the image-based profiles of two CRISPR knockouts).

There are three levels of ranked lists of edges, each of which can produce global metrics (based on classification metrics like average precision or other so-called class probability metrics). These global metrics can be used to compare representations.

In all 3 cases, we pose it as a binary classification problem on the edges: an edge is Class 1 (positive) if its two vertices share a label, and Class 0 otherwise.

The three levels of ranked lists of edges, along with the metrics they induce, are listed below.

(Not all the metrics are useful, and some may be very similar to others. I have highlighted the ones I think are useful.)

0. Global: a single list, comprising all edges
   a. We can directly compute a single global metric from this list.
1. Label-specific: one list per label, comprising all edges that have at least one vertex with the label
   a. We can compute a label-specific metric from each list, with an additional constraint on Class 1 edges: both vertices should share the label being evaluated.
   b. We can then (weighted) average the label-specific metrics to get a single global metric.
   c. We can also compute a global metric directly across all the label-specific lists.
2. Sample-specific: one list per sample, comprising all edges that have that sample as a vertex
   a. We can compute a sample-specific metric from each list.
   b. We can then average the sample-specific metrics to get a label-specific metric, filtered as in 1.a, although this may not be quite as straightforward; 2.d might be better.
   c. We can further (weighted) average these label-specific metrics to get a single global metric.
   d. We can also compute a label-specific metric directly across the sample-specific lists, filtered as in 1.a.
   e. We can also directly average the sample-specific metrics to get a single global metric.
   f. We can also compute a single global metric directly across all the sample-specific lists.
   g. We can also (weighted) average the label-specific metrics from 2.d to get a single global metric.
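As an illustration of the first two levels, here is a hedged Python sketch using average precision as the metric (all names are assumptions, not an existing API; one label per vertex for simplicity, and each label-specific list is assumed to contain at least one Class 1 edge):

```python
import numpy as np
from sklearn.metrics import average_precision_score

def edge_metrics(sims, labels_a, labels_b):
    """sims: edge weights; labels_a/labels_b: the label of each edge's
    two endpoints."""
    y = labels_a == labels_b  # Class 1 edges

    # 0.a Global: one list of all edges, one micro metric.
    global_ap = average_precision_score(y, sims)

    # 1.a Label-specific: one list per label (edges touching the label);
    # Class 1 additionally requires both endpoints to carry that label.
    per_label = {}
    for lab in np.unique(np.concatenate([labels_a, labels_b])):
        touches = (labels_a == lab) | (labels_b == lab)
        y_lab = (labels_a == lab) & (labels_b == lab)
        per_label[lab] = average_precision_score(y_lab[touches], sims[touches])

    # 1.b Macro global: (optionally weighted) average of the label APs.
    macro_ap = float(np.mean(list(per_label.values())))

    return global_ap, per_label, macro_ap
```

The sample-specific level (2) follows the same pattern, with one list per vertex rather than one per label.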

Notes:

Categorization based on https://scikit-learn.org/stable/modules/model_evaluation.html#multiclass-and-multilabel-classification (I did not double-check; there could be errors)

| Index | Averaging | Metric type |
|---|---|---|
| 0.a | micro | global |
| 1.a | micro | label-specific |
| 1.b | macro | global |
| 1.c | micro | global |
| 2.b | macro | label-specific |
| 2.c | macro of macro-label-specific | global |
| 2.d | micro | label-specific |
| 2.e | macro | global |
| 2.f | micro | global |
| 2.g | macro of micro-label-specific | global |
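The micro/macro distinction in the table can be seen in a small toy example (illustrative numbers only): micro pools the label-specific lists into one list before scoring, while macro scores each list and then averages.

```python
import numpy as np
from sklearn.metrics import average_precision_score

# Two label-specific lists: true classes and similarity scores.
y1, s1 = np.array([1, 0, 0]), np.array([0.9, 0.8, 0.1])
y2, s2 = np.array([1, 1, 0]), np.array([0.4, 0.3, 0.2])

# Micro: concatenate the lists, then score once.
micro = average_precision_score(np.concatenate([y1, y2]),
                                np.concatenate([s1, s2]))
# Macro: score each list, then average.
macro = np.mean([average_precision_score(y1, s1),
                 average_precision_score(y2, s2)])
print(micro, macro)  # ~0.81 vs 1.0: generally not equal
```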