Overview of R analysis for evaluating the in vitro tumorigenicity of CSC candidate using the 3D multi-spheroid model by determining object area using 3D-SiSP analytical method.

The in vitro tumorigenic assay is a crucial step in verifying CSC properties. In our study, we focus on the usage of the CSC biosensor SORE6, however, other CSC markers can also be utilized with our analysis platform. SORE6 serves as the CSC biosensor in this example, with a green fluorescence protein (GFP) acting as the biosensor protein in this system. Cells containing the SORE6 biosensor were sorted into GFP-positive (POS) and GFP-negative (NEG) populations using fluorescence-activated cell sorting, expected to represent CSCs and non-CSCs, respectively. Then, the expected CSC and non-CSC populations were separately cultured in the 3D multi-spheroid model at various initial seeding densities. According to the 3D-SiSP analytical method, the tumorigenicity of CSC and non-CSC were compared by object area.

According to image analysis, all cells in each well were nuclear stained, enabling the measurement of the area of all objects within the well (Figure 1). To evaluate the in vitro tumorigenicity of cancer stem cell (CSC) candidates, we developed a high-throughput analysis to determine objects area using 3D Surface Integrative Spheroid Profiling (3D-SiSP) analytical method.

In our analysis pipeline, the area of all objects was included without cut-off concern. Subsequently, the objects in each initial cell seeding condition were used for object area calculation and compared between the GFP-positive and GFP-negative groups. Statistical tests were performed by comparing the CSC and non-CSC groups in each cell seeding density. Additionally, box plots were generated for data visualization.

Figure 1: Image analysis of each 3D object
Figure 1: Image analysis of each 3D object. (Created by BioRender.com / Mahidol University)

1. Analysis overview

Here, the analysis composed of 4 main sections below.

  1. Section 1: Define information of the dataset.
  2. Section 2: Create analysis pipeline.
    • Part 2.1: Create a function (Object.Area) for mean calculation of area of all objects (ObjectArea).
    • Part 2.2: Perform the analysis for all conditions using the created function (Object.Area).
    • Part 2.3: Create summary analysis
  3. Section 3: Statistic test
    • Part 3.1: Normality test
    • Part 3.2: Statistical test
  4. Section 4: Data visualization.

2. Inputs

The plate map was arranged as shown in the picture below. Biosensor groups (POS and NEG) and initial cell seeding density were arranged accordingly. According to the image analysis, the area of all objects were measured and stored in a WellName.csv file, well by well.

In the WellName.csv files contains 4 columns, which are described as below.

  1. Object_Area_um2 = Area of each object (μm²)
  2. CellSeedingDensity = Initial cell seeding density
  3. BiosensorGroup = CSC biosensor groups (POS or NEG)
  4. Replicate = Technical replicate of the experiment

Here, we also provide a set of example inputs, where each WellName.csv file contains information about the objects in the respective well.

Figure 2: Plate map of in vitro tumorigenic assay
Figure 2: Plate map of in vitro tumorigenic assay. (Created by BioRender.com / Mahidol University)

Analysis for evaluating the in vitro tumorigenicity of CSC candidate using the 3D multi-spheroid model by determining object area using 3D-SiSP analytical method.

Import R library

library(dplyr)
library(ggplot2)
library(ggpubr)
library(rstatix)

Section 1: Define information of the dataset.

To define information of the dataset.

1.1 Define working directory: Output_Directory
1.2 Define input directory: Input_Directory
1.3 Define the cell line name: CellLine_Name

# Dataset example
  ## Output directory
    Output_Directory  <- 'path/to/directory'
    setwd(Output_Directory)
  
  ## Input directory
    Input_Directory   <- 'path/to/directory'
  
  CellLine_Name <- "CellLineX"

Section 2: Create analysis pipeline.

Part 2.1: Create a function (Object.Area) for mean calculation of area of all objects (ObjectArea).

Object.Area <- function(Well_FileName, Input_Directory, CellLine_Name)
{
  # Import each `.csv` file.
    Well_FilePath <- file.path(Input_Directory, Well_FileName)
    Well_File     <- read.csv(Well_FilePath)

  # Extract information (Initial cell seeding density, Biosensor group) from the file.
    CellSeedingDensity <- Well_File$CellSeedingDensity[1]
    BiosensorGroup <- Well_File$BiosensorGroup[1]
    
  # Extract area of all objects from the file.
    EachWell_ObjectArea <- Well_File$Object_Area_um2
    EachWell_ObjectArea <- as.vector(EachWell_ObjectArea) 
    
  # Calculate mean of area of all objects in the well
    Mean.Object_Area <- mean(EachWell_ObjectArea)
  
  # Create a data frame to store object area information.
     Result <- data.frame(FileName           = Well_FileName, 
                          CellSeedingDensity = CellSeedingDensity, 
                          BiosensorGroup      = BiosensorGroup, 
                          CellLine           = CellLine_Name, 
                          ObjectArea    = Mean.Object_Area
                          )
  
  return(Result)
}

Part 2.2: Perform the analysis for all conditions using the created function (Object.Area)

  # Create a blank data frame to store `EachWell_Object_Area` information.
    AllCondition_Object_Area <- data.frame()
  
  # Load the input data.
    FileName_list <- list.files(Input_Directory)
    FileName_list <- as.list(FileName_list)
  
  
    for (Well in FileName_list) {
      # Calculate the mean of object area in each well by `Object.Area` function.
        EachWell_Object_Area     <- Object.Area(Well, Input_Directory, CellLine_Name)
        
      # Store `EachWell_Object_Area` into the created data frame (`AllCondition_Object_Area`).
        AllCondition_Object_Area <- rbind(AllCondition_Object_Area, 
                                              EachWell_Object_Area
                                             )
    }

  # Reorder the data according to `BiosensorGroup` and `CellSeedingDensity`.
      AllCondition_Object_Area <- with(AllCondition_Object_Area, 
                                          AllCondition_Object_Area[order(BiosensorGroup, CellSeedingDensity), ]
                                        )
  
  # Write the `AllCondition_Object_Area` data into a `.csv` file. 
    write.csv(AllCondition_Object_Area, 
                paste0(CellLine_Name, "_Raw_ObjectArea.csv"), row.names = FALSE)
    
    print(paste0("The .csv file was written as '", CellLine_Name, "_Raw_ObjectArea.csv'"))
## [1] "The .csv file was written as 'CellLineX_Raw_ObjectArea.csv'"

Observe the AllCondition_Object_Area data.

head(AllCondition_Object_Area)
##    FileName CellSeedingDensity BiosensorGroup  CellLine ObjectArea
## 2   B11.csv                 63            NEG CellLineX   850.4693
## 12  C11.csv                 63            NEG CellLineX   946.3654
## 22  D11.csv                 63            NEG CellLineX   610.4338
## 32  E11.csv                 63            NEG CellLineX   556.7345
## 42  F11.csv                 63            NEG CellLineX   803.2494
## 52  G11.csv                 63            NEG CellLineX   331.7023

Part 2.3: Create summary analysis.

For the summary analysis of the spheroid count (AllCondition_Object_Area), the statistics comprising mean and standard deviation were calculated and written as a .csv file.

Summary_ObjectArea <- AllCondition_Object_Area%>%
                            group_by(CellLine, 
                                     BiosensorGroup, 
                                     CellSeedingDensity) %>%
                            summarise_at(vars(ObjectArea), 
                                         list(Mean = mean, 
                                              SD   = sd))
  
# Write the `Summary_Result` into a `.csv` file.
  write.csv(Summary_ObjectArea, 
              paste0(CellLine_Name, "_Summary_ObjectArea.csv"), row.names = FALSE)
  
  print(paste0("Summary of the object area analysis was written as '", CellLine_Name, "_Summary_ObjectArea.csv'"))
## [1] "Summary of the object area analysis was written as 'CellLineX_Summary_ObjectArea.csv'"

Observe the Summary_ObjectArea data.

head(Summary_ObjectArea)
## # A tibble: 6 × 5
## # Groups:   CellLine, BiosensorGroup [2]
##   CellLine  BiosensorGroup CellSeedingDensity  Mean    SD
##   <chr>     <chr>                       <int> <dbl> <dbl>
## 1 CellLineX NEG                            63  683. 226. 
## 2 CellLineX NEG                           125  715. 164. 
## 3 CellLineX NEG                           250  801.  94.4
## 4 CellLineX NEG                           500 1431. 234. 
## 5 CellLineX NEG                          1000 2251. 175. 
## 6 CellLineX POS                            63  742. 239.

Section 3: Statistical test

3.1 Normality test

To determine whether the data follows a normal distribution, the normality test should be performed.

  shapiro.test(AllCondition_Object_Area$ObjectArea)
## 
##  Shapiro-Wilk normality test
## 
## data:  AllCondition_Object_Area$ObjectArea
## W = 0.88436, p-value = 3.71e-05

For the example dataset, the result showed that the data do not follow a normal distribution.

3.2 Statistical test

According to the non-normal distribution of the data, the Wilcoxon rank-sum test was used for statistical testing by comparing the spheroid count between the biosensor groups in the same cell seeding density.

If your data follows a normal distribution, a parametric test like the “T-test” should be used instead.
Here, we also provide the choice of statistical tests as shown below.

  # T-test
    Stat_ObjectArea_Ttest <- compare_means(ObjectArea ~ BiosensorGroup, 
                                          data     = AllCondition_Object_Area, 
                                          group.by = "CellSeedingDensity", 
                                          method   = "t.test"
                                        )

  # Wilcoxon rank-sum test
    Stat_ObjectArea_Wilcox <- compare_means(ObjectArea ~ BiosensorGroup, 
                                          data     = AllCondition_Object_Area, 
                                          group.by = "CellSeedingDensity", 
                                          method   = "wilcox.test"
                                        )
 
  # Write the `Stat_ObjectArea` into a `.csv` file. 
    write.csv(Stat_ObjectArea_Wilcox, 
                paste0(CellLine_Name, "_Stat_ObjectArea.csv"), row.names = FALSE)
    
    print(paste0("Statistical test of the object area was written as '", CellLine_Name, "_Stat_ObjectArea.csv'"))
## [1] "Statistical test of the object area was written as 'CellLineX_Stat_ObjectArea.csv'"

Observe the Stat_ObjectArea_Wilcox data.

 Stat_ObjectArea_Wilcox 
## # A tibble: 5 × 9
##   CellSeedingDensity .y.    group1 group2       p p.adj p.format p.signif method
##                <int> <chr>  <chr>  <chr>    <dbl> <dbl> <chr>    <chr>    <chr> 
## 1                 63 Objec… NEG    POS    0.589   0.59  0.5887   ns       Wilco…
## 2                125 Objec… NEG    POS    0.00866 0.035 0.0087   **       Wilco…
## 3                250 Objec… NEG    POS    0.00866 0.035 0.0087   **       Wilco…
## 4                500 Objec… NEG    POS    0.00216 0.011 0.0022   **       Wilco…
## 5               1000 Objec… NEG    POS    0.00866 0.035 0.0087   **       Wilco…

Section 4: Data visualization

Data visualization is presented as a faceted box plot in an object named Tumorigenicity_BoxPlot and saved as a .png file.

# Define `CellSeedingDensity` and `BiosensorGroup` as factors.
  AllCondition_Object_Area$CellSeedingDensity <- factor(AllCondition_Object_Area$CellSeedingDensity, 
                                                              levels = c("1000", "500", "250", "125", "63")
                                                           )

  AllCondition_Object_Area$BiosensorGroup <- factor(AllCondition_Object_Area$BiosensorGroup, 
                                                        levels = c("POS", "NEG")
                                                      )

# Create a box plot  which faceted by `CellSeedingDensity`.
  Tumorigenicity_BoxPlot <- ggboxplot(AllCondition_Object_Area, 
                                      x = "BiosensorGroup", 
                                      y = "ObjectArea",
                                      color = "BiosensorGroup", 
                                      fill = "BiosensorGroup", 
                                      facet.by = "CellSeedingDensity", 
                                      short.panel.labs = TRUE, 
                                      add = "jitter") +
    
                                scale_color_manual(values = c("#154360", 
                                                              "#9A7D0A")) + 
    
                                scale_fill_manual(values = c("#0073C2FF", 
                                                             "#EFC000FF")) +
    
                                labs(x = "GFP expression", 
                                     y = "Number of spheroids",
                                     title = "In vitro tumorigenicity") +
    
                                theme(plot.title = element_text(hjust = 0.5)) +
                                guides(fill = guide_legend(title = NULL)) +
    
                                facet_wrap(~CellSeedingDensity) +
                                facet_grid(rows = NULL, 
                                           cols = vars(CellSeedingDensity)) +
    
                                stat_compare_means(label = "p.format", 
                                                   label.x.npc = 0.8, label.y.npc =0.9) +
    
                                theme(legend.position = "none") 
     
  # Save the box plot as a `.png` file.
    ggsave(paste0(CellLine_Name, "_Plot_ObjectArea.png"), 
             width = 10 , height = 3.5, 
             dpi   = 300,  units   = "in")

    print(paste0("The .png file was saved as '", CellLine_Name, "'plot_ObjectArea.png'"))
## [1] "The .png file was saved as 'CellLineX'plot_ObjectArea.png'"


Figure 4: In vitro tumorigenic assay of cell line X. The spheroids were determined by area cut-off 8,000 μm².