Introduction

BioGSP provides Spectral Graph Wavelet Transform (SGWT) analysis for spatial biological data. This package enables multi-scale analysis of spatial patterns using graph signal processing techniques.

Key Capabilities

  • Object-oriented SGWT workflow with clear separation of concerns
  • Multi-scale spatial analysis using spectral graph wavelets
  • Flexible data input with custom column naming
  • Energy-normalized similarity analysis with comprehensive metrics
  • Multiple wavelet kernels (Mexican Hat, Meyer, Heat)
  • Comprehensive visualization tools
  • Detailed inverse transform results (low-pass, band-pass approximations)

New Workflow Overview

The new workflow consists of five main steps:

  1. initSGWT(): Initialize SGWT object with data and parameters
  2. checkKband(): Analyze k-band limited property of signals (optional)
  3. runSpecGraph(): Build spectral graph structure
  4. runSGWT(): Perform forward and inverse SGWT transforms
  5. runSGCC(): Calculate energy-normalized weighted similarity

Setup

# Load required packages
library(BioGSP)
library(ggplot2)
library(patchwork)
library(viridis)

# Set random seed for reproducibility
set.seed(123)

New SGWT Workflow Demonstration

Generate Example Pattern

# Create a spatial pattern with concentric circles
demo_pattern <- simulate_multiscale(
  grid_size = 20,        # Moderate size for fast computation
  n_centers = 1,         # Single center pattern
  Ra_seq = 5,            # Inner circle radius
  n_steps = 1,           # Single step (one pattern)
  outer_start = 10,      # Outer ring radius
  seed = 123
)[[1]]
## Generating 1 patterns...
##   |                                                                              |                                                                      |   0%  |                                                                              |======================================================================| 100%
## 
## Multicenter multiscale simulation completed!
# Display pattern structure
cat("Generated pattern dimensions:", nrow(demo_pattern), "x", ncol(demo_pattern), "\n")
## Generated pattern dimensions: 400 x 4
cat("Column names:", paste(colnames(demo_pattern), collapse = ", "), "\n")
## Column names: X, Y, signal_1, signal_2
cat("Signals available:", sum(demo_pattern$signal_1), "inner pixels,", sum(demo_pattern$signal_2), "outer pixels\n")
## Signals available: 81 inner pixels, 234 outer pixels

Visualize Input Pattern

# Single combined visualization (one plot for this section)
demo_pattern$pattern_type <- ifelse(
  demo_pattern$signal_1 == 1,
  "Inner Circle",
  ifelse(demo_pattern$signal_2 == 1, "Outer Ring", "Background")
)

p_input <- ggplot(demo_pattern, aes(X, Y, fill = pattern_type)) +
  geom_tile() +
  scale_fill_manual(
    values = c("Background" = "white", "Inner Circle" = "#e31a1c", "Outer Ring" = "#1f78b4"),
    name = "Pattern"
  ) +
  coord_fixed() +
  theme_void() +
  ggtitle("Input Pattern (Signal 1 + Signal 2)")

print(p_input)

Step 1: Initialize SGWT Object

# Initialize SGWT object with custom column names
SG <- initSGWT(
  data.in = demo_pattern,
  x_col = "X",           # Custom X coordinate column name
  y_col = "Y",           # Custom Y coordinate column name
  signals = c("signal_1", "signal_2"),  # Analyze both signals
  J = 2,                 # Number of wavelet scales
  scaling_factor = 1,    # Scale progression factor
  kernel_type = "mexican_hat"
)

# Display initialized object
print(SG)
## SGWT Object
## ===========
## Data:
##   Dimensions: 400 x 5 
##   Coordinates: X , Y 
##   Signals: signal_1, signal_2 
## 
## Parameters:
##   k (neighbors): mexican_hat 
##   J (scales): 2 
##   Kernel type: mexican_hat 
##   Laplacian type: 
## 
## Status:
##   Graph computed: FALSE 
##   Forward computed: FALSE 
##   Inverse computed: FALSE

Step 2: Build Spectral Graph

# Build spectral graph structure
SG <- runSpecGraph(SG, k = 12, laplacian_type = "normalized", length_eigenvalue = 200, verbose = TRUE)
## Building graph from spatial coordinates...
## Computing Laplacian and eigendecomposition...
## Auto-generated scales: 1.0188, 1.0188 
## Graph construction completed.
# Check updated object
cat("Graph construction completed!\n")
## Graph construction completed!
cat("Adjacency matrix dimensions:", dim(SG$Graph$adjacency_matrix), "\n")
## Adjacency matrix dimensions: 400 400
cat("Number of eigenvalues computed:", length(SG$Graph$eigenvalues), "\n")
## Number of eigenvalues computed: 200

Visualize Fourier Modes (Eigenvectors)

# Visualize Fourier modes (eigenvectors) - 5 low and 5 high frequency modes
fourier_modes <- plot_FM(SG, mode_type = "both", n_modes = 2, ncol = 2)

cat("Fourier Modes Visualization:\n")
## Fourier Modes Visualization:
cat("- Low-frequency modes (2-6): Smooth spatial patterns, excluding DC component\n")
## - Low-frequency modes (2-6): Smooth spatial patterns, excluding DC component
cat("- High-frequency modes (96-100): Fine-detailed spatial patterns\n")
## - High-frequency modes (96-100): Fine-detailed spatial patterns
cat("- Each mode shows its eigenvalue (λ) indicating frequency content\n")
## - Each mode shows its eigenvalue (λ) indicating frequency content

Step 3: Run SGWT Analysis

# Perform SGWT forward and inverse transforms
SG <- runSGWT(SG, verbose = TRUE)
## Performing SGWT analysis for 2 signals...
## Using batch processing for efficiency...
## SGWT analysis completed.
# Display final object with all results
print(SG)
## SGWT Object
## ===========
## Data:
##   Dimensions: 400 x 5 
##   Coordinates: X , Y 
##   Signals: signal_1, signal_2 
## 
## Parameters:
##   k (neighbors): mexican_hat 
##   J (scales): 2 
##   Kernel type: mexican_hat 
##   Laplacian type: 
##   Scales: 1.0188, 1.0188 
## 
## Status:
##   Graph computed: TRUE 
##   Forward computed: TRUE 
##   Inverse computed: TRUE 
## 
## Reconstruction Errors:
##    signal_1 : 0.129674 
##    signal_2 : 0.234629

Visualize SGWT Decomposition

# Plot SGWT decomposition results for signal_1
decomp_plots <- plot_sgwt_decomposition(
  SG = SG,
  signal_name = "signal_1",
  plot_scales = 1,       # One plot for this section
  ncol = 1
)

print(decomp_plots)
## TableGrob (4 x 1) "arrange": 4 grobs
##               z     cells    name           grob
## original      1 (1-1,1-1) arrange gtable[layout]
## scaling       2 (2-2,1-1) arrange gtable[layout]
## wavelet_1     3 (3-3,1-1) arrange gtable[layout]
## reconstructed 4 (4-4,1-1) arrange gtable[layout]

Energy Analysis

# Analyze energy distribution across scales for signal_1
energy_analysis <- sgwt_energy_analysis(SG, "signal_1")
print(energy_analysis)
##       scale    energy energy_ratio scale_value   signal
## 1  low_pass 57.784192   0.95227414    1.018772 signal_1
## 2 wavelet_1  1.448008   0.02386293    1.018772 signal_1
## 3 wavelet_2  1.448008   0.02386293    1.018772 signal_1
# Create energy distribution plot
energy_plot <- ggplot(energy_analysis, aes(x = scale, y = energy_ratio)) +
  geom_bar(stat = "identity", fill = "steelblue", alpha = 0.7) +
  geom_text(aes(label = paste0(round(energy_ratio * 100, 1), "%")), 
            vjust = -0.5, size = 3) +
  labs(title = "Energy Distribution Across SGWT Scales (Signal 1)",
       x = "Scale", y = "Energy Ratio") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(energy_plot)

Advanced Features Demonstration

Different Wavelet Kernels

# Compare different kernel types
kernels <- c("mexican_hat", "meyer", "heat")
kernel_results <- list()

for (kn in kernels) {
  # Initialize with different kernel
  SG_temp <- initSGWT(
    data.in = demo_pattern,
    x_col = "X", y_col = "Y",
    signals = "signal_1",
    J = 3,                    # Reduced for faster computation
    kernel_type = kn
  )
  
  # Run full workflow
  SG_temp <- runSpecGraph(SG_temp, k = 12, laplacian_type = "normalized", length_eigenvalue = 100,verbose = FALSE)
  SG_temp <- runSGWT(SG_temp, verbose = FALSE)
  
  kernel_results[[kn]] <- SG_temp
}

# Compare reconstruction errors
comparison_df <- data.frame(
  Kernel = kernels,
  RMSE = sapply(kernel_results, function(x) x$Inverse$signal_1$reconstruction_error),
  stringsAsFactors = FALSE
)

print("Kernel Performance Comparison:")
## [1] "Kernel Performance Comparison:"
print(comparison_df)
##                  Kernel      RMSE
## mexican_hat mexican_hat 0.2113782
## meyer             meyer 0.2019181
## heat               heat 0.1928501
# Plot comparison
ggplot(comparison_df, aes(x = Kernel, y = RMSE, fill = Kernel)) +
  geom_bar(stat = "identity", alpha = 0.7) +
  geom_text(aes(label = round(RMSE, 6)), vjust = -0.5) +
  scale_fill_viridis_d() +
  labs(title = "SGWT Reconstruction Error by Kernel Type",
       x = "Kernel Type", y = "RMSE") +
  theme_minimal()

Step 4: Pattern Similarity Analysis

# Calculate similarity between signal_1 and signal_2 in same object
similarity_within <- runSGCC("signal_1", "signal_2", SG = SG, return_parts = TRUE)

cat("Pattern Similarity Analysis (within same graph):\n")
## Pattern Similarity Analysis (within same graph):
cat(sprintf("Overall similarity: %.4f\n", similarity_within$S))
## Overall similarity: -0.5930
cat(sprintf("Low-frequency similarity: %.4f\n", similarity_within$c_low))
## Low-frequency similarity: -0.5996
cat(sprintf("Non-low-frequency similarity: %.4f\n", similarity_within$c_nonlow))
## Non-low-frequency similarity: -0.5040
cat(sprintf("Energy weights - Low: %.3f, Non-low: %.3f\n", 
           similarity_within$w_low, similarity_within$w_NL))
## Energy weights - Low: 0.932, Non-low: 0.068
# Generate a second pattern for cross-comparison
pattern_2 <- simulate_multiscale(
  grid_size = 40,
  n_centers = 1,
  Ra_seq = 8,            # Different inner radius
  n_steps = 1,           # Single step
  outer_start = 15,      # Different outer radius
  seed = 456
)[[1]]
## Generating 1 patterns...
##   |                                                                              |                                                                      |   0%  |                                                                              |======================================================================| 100%
## 
## Multicenter multiscale simulation completed!
# Create second SGWT object
SG2 <- initSGWT(pattern_2, x_col = "X", y_col = "Y", signals = "signal_1", 
                J = 4)
SG2 <- runSpecGraph(SG2, k = 12, laplacian_type = "normalized",length_eigenvalue = 100, verbose = FALSE)
SG2 <- runSGWT(SG2, verbose = FALSE)

# Note: Cross-object similarity comparison removed due to different eigenvalue counts
# between SG (200 eigenvalues) and SG2 (100 eigenvalues) causing dimension mismatch

# Visualize both patterns for comparison (single faceted plot for this section)
pattern_compare <- rbind(
  data.frame(
    X = demo_pattern$X,
    Y = demo_pattern$Y,
    signal_1 = demo_pattern$signal_1,
    Pattern = "Pattern A (R_in=5, R_out=10)"
  ),
  data.frame(
    X = pattern_2$X,
    Y = pattern_2$Y,
    signal_1 = pattern_2$signal_1,
    Pattern = "Pattern B (R_in=8, R_out=15)"
  )
)

pattern_compare$Signal <- ifelse(pattern_compare$signal_1 == 1, "Signal", "Background")

ggplot(pattern_compare, aes(X, Y, fill = Signal)) +
  geom_tile() +
  scale_fill_manual(values = c("Background" = "white", "Signal" = "#377eb8")) +
  coord_fixed() +
  facet_wrap(~ Pattern) +
  theme_void() +
  theme(legend.position = "none") +
  ggtitle("Pattern Comparison (signal_1)")

Low-frequency Only Analysis

# Compare using only low-frequency components
similarity_low <- runSGCC("signal_1", "signal_2", SG = SG, low_only = TRUE, return_parts = TRUE)

cat("Low-frequency Only Similarity Analysis:\n")
## Low-frequency Only Similarity Analysis:
cat(sprintf("Low-frequency similarity: %.4f\n", similarity_low$c_low))
## Low-frequency similarity: -0.5996
cat(sprintf("Overall score (same as low-freq): %.4f\n", similarity_low$S))
## Overall score (same as low-freq): -0.5996
cat("Note: Non-low-frequency components are ignored (c_nonlow = NA)\n")
## Note: Non-low-frequency components are ignored (c_nonlow = NA)
# Compare with full analysis
cat("\nComparison:\n")
## 
## Comparison:
cat(sprintf("Full analysis similarity: %.4f\n", similarity_within$S))
## Full analysis similarity: -0.5930
cat(sprintf("Low-only similarity: %.4f\n", similarity_low$S))
## Low-only similarity: -0.5996
cat(sprintf("Difference: %.4f\n", abs(similarity_within$S - similarity_low$S)))
## Difference: 0.0065

Multiple Signal Analysis

# Analyze energy distribution for both signals
energy_signal1 <- sgwt_energy_analysis(SG, "signal_1")
energy_signal2 <- sgwt_energy_analysis(SG, "signal_2")

# Combine for comparison
energy_comparison <- rbind(energy_signal1, energy_signal2)

# Plot comparison
ggplot(energy_comparison, aes(x = scale, y = energy_ratio, fill = signal)) +
  geom_bar(stat = "identity", position = "dodge", alpha = 0.7) +
  geom_text(aes(label = paste0(round(energy_ratio * 100, 1), "%")), 
            position = position_dodge(width = 0.9), vjust = -0.5, size = 3) +
  scale_fill_viridis_d() +
  labs(title = "Energy Distribution Comparison: Signal 1 vs Signal 2",
       x = "Scale", y = "Energy Ratio", fill = "Signal") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Similarity Space Visualization with Multiple Patterns

# Generate 9 paired patterns using simulate_multiscale
cat("Generating 9 paired patterns for similarity analysis...\n")
## Generating 9 paired patterns for similarity analysis...
patterns_9 <- simulate_multiscale(
  grid_size = 40,
  n_centers = 1,
  Ra_seq = c(3, 6, 9),     # Inner circle radii
  n_steps = 3,             # 3 shrinkage steps
  outer_start = 20,        # Fixed starting outer radius
  seed = 123
)
## Generating 9 patterns...
##   |                                                                              |                                                                      |   0%  |                                                                              |========                                                              |  11%  |                                                                              |================                                                      |  22%  |                                                                              |=======================                                               |  33%  |                                                                              |===============================                                       |  44%  |                                                                              |=======================================                               |  56%  |                                                                              |===============================================                       |  67%  |                                                                              |======================================================                |  78%  |                                                                              |==============================================================        |  89%  |                                                                              |======================================================================| 100%
## 
## Multicenter multiscale simulation completed!
cat("Generated", length(patterns_9), "patterns\n")
## Generated 9 patterns
cat("Pattern names:", names(patterns_9), "\n")
## Pattern names: Ra_3_Step_1 Ra_3_Step_2 Ra_3_Step_3 Ra_6_Step_1 Ra_6_Step_2 Ra_6_Step_3 Ra_9_Step_1 Ra_9_Step_2 Ra_9_Step_3
# Create SGWT objects for all 9 patterns and compute similarities
similarity_results <- list()
sgwt_objects <- list()

cat("\nProcessing patterns and computing SGWT analysis...\n")
## 
## Processing patterns and computing SGWT analysis...
# Process each pattern
for (i in seq_along(patterns_9)) {
  pattern_name <- names(patterns_9)[i]
  pattern_data <- patterns_9[[i]]
  
  cat("Processing", pattern_name, "...\n")
  
  # Create SGWT object
  SG_temp <- initSGWT(
    data.in = pattern_data,
    x_col = "X", 
    y_col = "Y", 
    signals = c("signal_1", "signal_2"),
    J = 4,
    kernel_type = "heat"
  )
  
  # Build graph and run SGWT
  SG_temp <- runSpecGraph(SG_temp, k = 12, laplacian_type = "normalized", 
                         length_eigenvalue = 50, verbose = FALSE)
  SG_temp <- runSGWT(SG_temp, verbose = FALSE)
  
  sgwt_objects[[pattern_name]] <- SG_temp
  
  # Compute within-pattern similarity (signal_1 vs signal_2)
  sim_within <- runSGCC("signal_1", "signal_2", SG = SG_temp, return_parts = TRUE)
  similarity_results[[pattern_name]] <- sim_within
}
## Processing Ra_3_Step_1 ...
## Processing Ra_3_Step_2 ...
## Processing Ra_3_Step_3 ...
## Processing Ra_6_Step_1 ...
## Processing Ra_6_Step_2 ...
## Processing Ra_6_Step_3 ...
## Processing Ra_9_Step_1 ...
## Processing Ra_9_Step_2 ...
## Processing Ra_9_Step_3 ...
# Print similarity results summary
cat("\nSimilarity Results Summary (signal_1 vs signal_2 within each pattern):\n")
## 
## Similarity Results Summary (signal_1 vs signal_2 within each pattern):
similarity_df <- data.frame(
  Pattern = names(similarity_results),
  S = sapply(similarity_results, function(x) x$S),
  c_low = sapply(similarity_results, function(x) x$c_low),
  c_nonlow = sapply(similarity_results, function(x) x$c_nonlow),
  w_low = sapply(similarity_results, function(x) x$w_low),
  w_NL = sapply(similarity_results, function(x) x$w_NL),
  stringsAsFactors = FALSE
)

# Extract Ra and Step values for better labeling
similarity_df$Ra <- as.numeric(gsub(".*Ra_([0-9.]+)_Step.*", "\\1", similarity_df$Pattern))
similarity_df$Step <- as.numeric(gsub(".*Step_([0-9]+)$", "\\1", similarity_df$Pattern))
similarity_df$Label <- paste0("Ra=", similarity_df$Ra, ",Step=", similarity_df$Step)

# Create similarity space visualization
similarity_plot <- visualize_similarity_xy(
  similarity_results,
  point_size = 4,
  point_color = "steelblue",
  add_diagonal = TRUE,
  add_axes_lines = TRUE,
  title = "SGWT Similarity Space: 9 Paired Patterns (signal_1 vs signal_2)",
  show_labels = TRUE
)

print(similarity_plot)

# Analysis of patterns
cat("\nPattern Analysis:\n")
## 
## Pattern Analysis:
cat("Ra (Inner Radius) Effect:\n")
## Ra (Inner Radius) Effect:
for (ra in unique(similarity_df$Ra)) {
  subset_data <- similarity_df[similarity_df$Ra == ra, ]
  cat(sprintf("  Ra=%g: Mean c_low=%.3f, Mean c_nonlow=%.3f, Mean S=%.3f\n", 
              ra, mean(subset_data$c_low), mean(subset_data$c_nonlow), mean(subset_data$S)))
}
##   Ra=3: Mean c_low=0.459, Mean c_nonlow=0.216, Mean S=0.382
##   Ra=6: Mean c_low=0.205, Mean c_nonlow=-0.153, Mean S=0.097
##   Ra=9: Mean c_low=-0.018, Mean c_nonlow=-0.300, Mean S=-0.102
cat("\nStep (Shrinkage Step) Effect:\n")
## 
## Step (Shrinkage Step) Effect:
for (step in unique(similarity_df$Step)) {
  subset_data <- similarity_df[similarity_df$Step == step, ]
  cat(sprintf("  Step=%g: Mean c_low=%.3f, Mean c_nonlow=%.3f, Mean S=%.3f\n", 
              step, mean(subset_data$c_low), mean(subset_data$c_nonlow), mean(subset_data$S)))
}
##   Step=1: Mean c_low=-0.266, Mean c_nonlow=-0.489, Mean S=-0.334
##   Step=2: Mean c_low=0.185, Mean c_nonlow=-0.256, Mean S=0.050
##   Step=3: Mean c_low=0.726, Mean c_nonlow=0.508, Mean S=0.661
cat("\nSimilarity Space Interpretation:\n")
## 
## Similarity Space Interpretation:
cat("- Each point represents similarity between signal_1 (inner circle) and signal_2 (outer ring)\n")
## - Each point represents similarity between signal_1 (inner circle) and signal_2 (outer ring)
cat("- Color indicates inner radius (Ra): different colors for different inner radii\n")
## - Color indicates inner radius (Ra): different colors for different inner radii
cat("- Shape indicates shrinkage step: circle/triangle/square for different steps\n")
## - Shape indicates shrinkage step: circle/triangle/square for different steps
cat("- Position shows frequency-domain similarity characteristics\n")
## - Position shows frequency-domain similarity characteristics

Advanced Workflow Features

Custom Parameters

# Demonstrate advanced parameter customization
SG_advanced <- initSGWT(
  data.in = demo_pattern,
  x_col = "X", y_col = "Y",
  signals = "signal_1",
  J = 6,                     # More scales for finer analysis
  scaling_factor = 1.5,      # Closer scales
  kernel_type = "heat"       # Heat kernel
)

SG_advanced <- runSpecGraph(SG_advanced, k = 15, laplacian_type = "randomwalk",length_eigenvalue = 30, verbose = FALSE)
SG_advanced <- runSGWT(SG_advanced, verbose = FALSE)

cat("Advanced Parameters Results:\n")
## Advanced Parameters Results:
cat("Number of scales:", length(SG_advanced$Parameters$scales), "\n")
## Number of scales: 6
cat("Scales:", paste(round(SG_advanced$Parameters$scales, 4), collapse = ", "), "\n")
## Scales: 0.5399, 0.36, 0.24, 0.16, 0.1067, 0.0711
cat("Reconstruction error:", round(SG_advanced$Inverse$signal_1$reconstruction_error, 6), "\n")
## Reconstruction error: 0.538056

Summary

Key Takeaways

  1. New Object-Oriented Workflow: Clear separation of initialization, graph building, analysis, and similarity computation
  2. Flexible Input: Supports custom column naming for coordinates and signals
  3. Multiple Kernels: Choose from Mexican Hat, Meyer, or Heat kernel families
  4. Comprehensive Similarity: Energy-normalized weighted similarity with detailed diagnostics
  5. Advanced Analysis: Low-frequency only analysis, cross-object comparison, multiple signal support

New Workflow Benefits

# Complete workflow in 5 clear steps:
SG <- initSGWT(data, signals = c("signal1", "signal2"))           # 1. Initialize
kband <- checkKband(SG, signals = c("signal1", "signal2"))       # 2. Check k-band (optional)
SG <- runSpecGraph(SG, k = 12, laplacian_type = "normalized")    # 3. Build graph  
SG <- runSGWT(SG)                                                 # 4. Run SGWT
similarity <- runSGCC("signal1", "signal2", SG = SG)             # 5. Compare patterns

Parameter Guidelines

  • Small datasets (<200 points): k=8-12, J=3-4
  • Medium datasets (200-1000 points): k=12-20, J=4-6
  • Large datasets (>1000 points): k=15-25, J=5-8

Next Steps

  • Explore your own spatial datasets using the demonstrated workflow
  • Experiment with different kernel types and parameters
  • Use comprehensive similarity analysis for comparative studies
  • Refer to function documentation for detailed parameter specifications

BioGSP Package - Biological Graph Signal Processing for Spatial Data Analysis