Skip to contents
  • GSEASummaryPlot is used to plot a summary of the results of a GSEA analysis.

  • GSEAPlot is used to plot the results of a GSEA analysis.

  • PrepareFGSEA is used to prepare the result of GSEA from the fgsea package for plotting.

Usage

PrepareFGSEAResult(data, gene_ranks, gene_sets)

GSEASummaryPlot(
  data,
  gene_ranks = "@gene_ranks",
  gene_sets = "@gene_sets",
  top_term = 10,
  metric = "p.adjust",
  cutoff = 0.05,
  character_width = 50,
  line_plot_size = 0.25,
  metric_name = paste0("-log10(", metric, ")"),
  nonsig_name = "Non-significant",
  linewidth = 0.2,
  line_by = c("prerank", "running_score"),
  title = NULL,
  subtitle = NULL,
  xlab = NULL,
  ylab = NULL,
  alpha = 0.6,
  aspect.ratio = 1,
  legend.position = "right",
  legend.direction = "vertical",
  theme = "theme_this",
  theme_args = list(),
  palette = "Spectral",
  palcolor = NULL,
  seed = 8525,
  ...
)

GSEAPlot(
  data,
  gene_ranks = "@gene_ranks",
  gene_sets = "@gene_sets",
  sample_coregenes = FALSE,
  line_width = 1.5,
  line_alpha = 1,
  line_color = "#6BB82D",
  n_coregenes = 10,
  genes_label = NULL,
  label_fg = "black",
  label_bg = "white",
  label_bg_r = 0.1,
  label_size = 4,
  title = NULL,
  subtitle = NULL,
  xlab = NULL,
  ylab = NULL,
  combine = TRUE,
  nrow = NULL,
  ncol = NULL,
  byrow = TRUE,
  seed = 8525,
  ...
)

Arguments

data

A data frame of GSEA results For example, from DOSE::gseDO(). Required columns are ID, Description, NES, p.adjust, pvalue. The ID column is used to match the gene sets.

gene_ranks

A numeric vector of gene ranks with genes as names The gene ranks are used to plot the gene sets. If gene_ranks is a character vector starting with @, the gene ranks will be taken from the attribute of data.

gene_sets

A list of gene sets, typically from a record of a GMT file The names of the list should match the ID column of data. If gene_sets is a character vector starting with @, the gene sets will be taken from the attribute of data. The GSEA plots will be plotted for each gene set. So, the number of plots will be the number of gene sets. If you only want to plot a subset of gene sets, you can subset the gene_sets before passing it to this function.

top_term

An integer to select the top terms

metric

The metric to use for the significance of the terms

cutoff

The cutoff for the significance of the terms The terms will not be filtered with this cutoff; they are only filtered by the top_term ranked by the metric. The cutoff here is used to show the significance of the terms on the plot. For the terms that are not significant, the color will be grey.

character_width

The width of the characters in the y-axis

line_plot_size

The size of the line plots

metric_name

The name of the metric to show in the color bar

nonsig_name

The name of the legend for the nonsignificant terms

linewidth

The width of the lines in the line plots

line_by

The method to calculate the line plots.

  • prerank: Use the gene ranks as heights to plot the line plots.

  • running_score: Use the running score to plot the line plots.

title

A character string specifying the title of the plot. A function can be used to generate the title based on the default title. This is useful when split_by is used and the title needs to be dynamic.

subtitle

A character string specifying the subtitle of the plot.

xlab

A character string specifying the x-axis label.

ylab

A character string specifying the y-axis label.

alpha

A numeric value specifying the transparency of the plot.

aspect.ratio

A numeric value specifying the aspect ratio of the plot.

legend.position

A character string specifying the position of the legend. if waiver(), for single groups, the legend will be "none", otherwise "right".

legend.direction

A character string specifying the direction of the legend.

theme

A character string or a theme class (i.e. ggplot2::theme_classic) specifying the theme to use. Default is "theme_this".

theme_args

A list of arguments to pass to the theme function.

palette

A character string specifying the palette to use. A named list or vector can be used to specify the palettes for different split_by values.

palcolor

A character string specifying the color to use in the palette. A named list can be used to specify the colors for different split_by values. If some values are missing, the values from the palette will be used (palcolor will be NULL for those values).

seed

The random seed to use. Default is 8525.

...

Additional arguments.

sample_coregenes

A logical value to sample the core genes from the core_enrichment; if FALSE, the first n_coregenes will be used

line_width

The width of the line in the running score plot

line_alpha

The alpha of the line in the running score plot

line_color

The color of the line in the running score plot

n_coregenes

The number of core genes to label

genes_label

The genes to label. If set, n_coregenes will be ignored

label_fg

The color of the label text

label_bg

The background color of the label

label_bg_r

The radius of the background color of the label

label_size

The size of the label text

combine

Whether to combine the plots into one when facet is FALSE. Default is TRUE.

nrow

A numeric value specifying the number of rows in the facet.

ncol

A numeric value specifying the number of columns in the facet.

byrow

A logical value indicating whether to fill the plots by row.

Value

A data frame with the desired columns for plotting and the gene ranks and gene sets as attributes

Examples

# \donttest{
data(gsea_example)
GSEASummaryPlot(gsea_example)

GSEASummaryPlot(gsea_example, line_by = "running_score")

GSEASummaryPlot(gsea_example, cutoff = 0.01)

# }
# \donttest{
GSEAPlot(gsea_example, gene_sets = attr(gsea_example, "gene_sets")[1])

GSEAPlot(gsea_example, gene_sets = attr(gsea_example, "gene_sets")[1:4])

# }