What are GO, MSigDB, KEGG, ORA, and GSEA?

What do these terms mean in transcriptomics analysis?

Let’s start with the full name of each:

  1. Gene Ontology (GO)
  2. Molecular Signatures Database (MSigDB)
  3. Kyoto Encyclopedia of Genes and Genomes (KEGG)
  4. Over-Representation Analysis (ORA)
  5. Gene Set Enrichment Analysis (GSEA)
Picture generated by Gemini, Prompt by Krittiyabhorn Kongtanawanich.

Grouping by usage

1. Sources of gene sets -> GO, MSigDB, and KEGG

1.1 Gene Ontology (GO)

1.2 Kyoto Encyclopedia of Genes and Genomes (KEGG)

1.3 Molecular Signatures Database (MSigDB)

MSigDB is a collection of annotated gene sets that can be used for gene set enrichment analyses. It includes gene sets derived from various sources, including GO terms, KEGG pathways, and other curated datasets. Researchers use MSigDB to access predefined gene sets for enrichment analyses.
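To make "predefined gene sets" concrete: MSigDB distributes its collections as GMT files, where each line holds a set name, a description field, and then the member genes, all separated by tabs. Here is a minimal sketch of reading such a file into Python. The set names and gene symbols in `example_gmt` are made up for illustration, and `parse_gmt` is a hypothetical helper, not part of any MSigDB tool.

```python
from typing import Dict, Set


def parse_gmt(text: str) -> Dict[str, Set[str]]:
    """Parse GMT-formatted text into {gene_set_name: set_of_genes}."""
    gene_sets: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        fields = line.split("\t")
        # A valid GMT line has a name, a description, and at least one gene.
        if len(fields) < 3:
            continue
        name, _description, *genes = fields
        # Drop empty fields that trailing tabs can produce.
        gene_sets[name] = {gene for gene in genes if gene}
    return gene_sets


# Toy GMT content (hypothetical names, not real MSigDB entries).
example_gmt = (
    "TOY_SET_A\tmade-up description\tGENE1\tGENE2\tGENE3\n"
    "TOY_SET_B\tmade-up description\tGENE3\tGENE4\n"
)

gene_sets = parse_gmt(example_gmt)
for name, genes in sorted(gene_sets.items()):
    print(name, sorted(genes))
```

With a real download you would pass the file's contents in place of `example_gmt`. The resulting dictionary of sets is the shape that both ORA and GSEA need as input.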

1.4 Other sources of gene sets


2. Application in Analyses -> ORA and GSEA

2.1 Over-Representation Analysis (ORA)

How the genes in ORA were defined. Credit pic
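ORA asks whether a gene set appears in your list of selected genes (for example, differentially expressed genes) more often than chance would predict. It is commonly computed with a one-sided hypergeometric test, which is equivalent to a one-sided Fisher's exact test. The sketch below assumes that approach. The function name `ora_p_value` and the toy gene names are hypothetical, and a real analysis would also correct for testing many gene sets at once.

```python
from math import comb
from typing import Set


def ora_p_value(background: Set[str], gene_set: Set[str], selected: Set[str]) -> float:
    """One-sided hypergeometric p-value: P(overlap >= observed overlap)."""
    # Restrict everything to the background so the counts are consistent.
    gene_set = gene_set & background
    selected = selected & background

    n_background = len(background)
    n_in_set = len(gene_set)
    n_selected = len(selected)
    n_overlap = len(gene_set & selected)

    total = comb(n_background, n_selected)
    largest_possible = min(n_in_set, n_selected)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_selected - k)
        for k in range(n_overlap, largest_possible + 1)
    )
    return tail / total


# Toy example: 10 background genes, a 4-gene set, 3 selected genes, 2 of them in the set.
background = {f"G{i}" for i in range(1, 11)}
toy_set = {"G1", "G2", "G3", "G4"}
selected = {"G1", "G2", "G9"}

p = ora_p_value(background, toy_set, selected)
print(f"ORA p-value: {p:.4f}")
```

Note that ORA only sees which genes passed your cutoff. The size of each gene's expression change is ignored once the gene is in or out of the selected list.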

2.2 Gene Set Enrichment Analysis (GSEA)

Overview of GSEA. Credit pic
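Unlike ORA, GSEA uses the whole ranked gene list rather than a cutoff. The classic statistic walks down the ranked list, stepping up when a gene belongs to the set (by an amount weighted by that gene's score) and stepping down otherwise. The enrichment score is the largest deviation of that running sum from zero. Below is a minimal sketch of that running-sum calculation only. The function name `enrichment_score`, the `weight` parameter, and the toy data are assumptions for illustration. The permutation step used to judge significance is not shown.

```python
from typing import List, Set


def enrichment_score(
    ranked_genes: List[str],
    scores: List[float],
    gene_set: Set[str],
    weight: float = 1.0,
) -> float:
    """Running-sum enrichment score over a list already sorted by score."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    # Each hit contributes in proportion to |score| ** weight.
    hit_weights = [abs(s) ** weight if h else 0.0 for s, h in zip(scores, hits)]
    total_hit_weight = sum(hit_weights)
    if total_hit_weight == 0:
        raise ValueError("all genes in the set have a score of zero")

    running = 0.0
    best = 0.0
    for w, is_hit in zip(hit_weights, hits):
        if is_hit:
            running += w / total_hit_weight
        else:
            running -= 1.0 / n_miss
        # Keep the value with the largest absolute deviation from zero.
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list: most up-regulated first, most down-regulated last.
ranked_genes = ["A", "B", "C", "D", "E", "F"]
scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

es_top = enrichment_score(ranked_genes, scores, {"A", "B"})
es_bottom = enrichment_score(ranked_genes, scores, {"E", "F"})
print(es_top, es_bottom)
```

A set clustered at the top of the ranking gets a positive score and a set clustered at the bottom gets a negative one, which is how GSEA separates up-regulated from down-regulated gene sets.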

Summary of the connections

In summary, GO provides functional annotations, KEGG offers pathway information, and MSigDB aggregates these and other curated resources into ready-to-use gene sets. ORA and GSEA are the methods that use these gene sets to interpret gene expression data in a biological context.

The difference between ORA and GSEA. Credit pic