Skip to contents

Generate a grid of scatter correlation plots for all pairs of variables.

Usage

CorPairsPlot(
  data,
  columns = NULL,
  group_by = NULL,
  group_by_sep = "_",
  group_name = NULL,
  split_by = NULL,
  split_by_sep = "_",
  diag_type = NULL,
  diag_args = list(),
  layout = c(".\\", "\\.", "/.", "./"),
  cor_method = c("pearson", "spearman", "kendall"),
  cor_palette = "RdBu",
  cor_palcolor = NULL,
  cor_size = 3,
  cor_format = "corr: {round(corr, 2)}",
  cor_fg = "black",
  cor_bg = "white",
  cor_bg_r = 0.1,
  theme = "theme_this",
  theme_args = list(),
  palette = ifelse(is.null(group_by), "Spectral", "Paired"),
  palcolor = NULL,
  title = NULL,
  subtitle = NULL,
  facet_by = NULL,
  legend.position = "right",
  legend.direction = "vertical",
  seed = 8525,
  combine = TRUE,
  nrow = NULL,
  ncol = NULL,
  byrow = TRUE,
  axes = NULL,
  axis_titles = axes,
  guides = NULL,
  design = NULL,
  ...
)

Arguments

data

A data frame.

columns

The column names of the data to be plotted. If NULL, all columns, except group_by, will be used.

group_by

Columns to group the data for plotting For those plotting functions that do not support multiple groups, They will be concatenated into one column, using group_by_sep as the separator

group_by_sep

The separator for multiple group_by columns. See group_by

group_name

The name of the group in the legend.

split_by

The column(s) to split data by and plot separately.

split_by_sep

The separator for multiple split_by columns. See split_by

diag_type

The type of the diagonal plots. Available types: "density", "violin", "histogram", "box", "none".

diag_args

A list of additional arguments to be passed to the diagonal plots.

layout

The layout of the plots. Available layouts: ".\", "\.", "/.", "./".

  • '\' or '/' means the diagonal plots are on the top-left to bottom-right diagonal.

  • '.' means where the scatter plots are.

cor_method

The method to calculate the correlation. Available methods: "pearson", "spearman", "kendall". The correlation will be shown in the other triangle of the scatter plots.

cor_palette

The color palette for the correlation tile plots.

cor_palcolor

Custom colors used to create a color palette for the correlation tile plots.

cor_size

The size of the correlation text.

cor_format

The format of the correlation text. Default is "corr: %.2f". It will be formatted using sprintf(cor_format, corr).

cor_fg

The color of the correlation text.

cor_bg

The background color of the correlation text.

cor_bg_r

The radius of the background of the correlation text.

theme

A character string or a theme class (i.e. ggplot2::theme_classic) specifying the theme to use. Default is "theme_this".

theme_args

A list of arguments to pass to the theme function.

palette

A character string specifying the palette to use. A named list or vector can be used to specify the palettes for different split_by values.

palcolor

A character string specifying the color to use in the palette. A named list can be used to specify the colors for different split_by values. If some values are missing, the values from the palette will be used (palcolor will be NULL for those values).

title

A character string specifying the title of the plot. A function can be used to generate the title based on the default title. This is useful when split_by is used and the title needs to be dynamic.

subtitle

A character string specifying the subtitle of the plot.

facet_by

A character string specifying the column name of the data frame to facet the plot. Otherwise, the data will be split by split_by and generate multiple plots and combine them into one using patchwork::wrap_plots

legend.position

A character string specifying the position of the legend. if waiver(), for single groups, the legend will be "none", otherwise "right".

legend.direction

A character string specifying the direction of the legend.

seed

The random seed to use. Default is 8525.

combine

Whether to combine the plots into one when facet is FALSE. Default is TRUE.

nrow

A numeric value specifying the number of rows in the facet.

ncol

A numeric value specifying the number of columns in the facet.

byrow

A logical value indicating whether to fill the plots by row.

axes

A string specifying how axes should be treated. Passed to patchwork::wrap_plots(). Only relevant when split_by is used and combine is TRUE. Options are:

  • 'keep' will retain all axes in individual plots.

  • 'collect' will remove duplicated axes when placed in the same run of rows or columns of the layout.

  • 'collect_x' and 'collect_y' will remove duplicated x-axes in the columns or duplicated y-axes in the rows respectively.

axis_titles

A string specifying how axis titltes should be treated. Passed to patchwork::wrap_plots(). Only relevant when split_by is used and combine is TRUE. Options are:

  • 'keep' will retain all axis titles in individual plots.

  • 'collect' will remove duplicated titles in one direction and merge titles in the opposite direction.

  • 'collect_x' and 'collect_y' control this for x-axis titles and y-axis titles respectively.

guides

A string specifying how guides should be treated in the layout. Passed to patchwork::wrap_plots(). Only relevant when split_by is used and combine is TRUE. Options are:

  • 'collect' will collect guides below to the given nesting level, removing duplicates.

  • 'keep' will stop collection at this level and let guides be placed alongside their plot.

  • 'auto' will allow guides to be collected if a upper level tries, but place them alongside the plot if not.

design

Specification of the location of areas in the layout, passed to patchwork::wrap_plots(). Only relevant when split_by is used and combine is TRUE. When specified, nrow, ncol, and byrow are ignored. See patchwork::wrap_plots() for more details.

...

Additional arguments.

Value

A patch_work::wrap_plots object or a list of them if combine is FALSE.

Examples

# \donttest{
set.seed(8525)
data <- data.frame(x = rnorm(100))
data$y <- rnorm(100, 10, sd = 0.5)
data$z <- -data$x + data$y + rnorm(100, 20, 1)
data$g <- sample(1:4, 100, replace = TRUE)

CorPairsPlot(data, diag_type = "histogram", diag_args = list(bins = 30, palette = "Paired"),
 layout = "/.")


CorPairsPlot(data, group_by = "g", diag_type = "none", layout = "./",
 theme_args = list(axis.title = element_textbox(
     color = "black", box.color = "grey20", size = 16, halign = 0.5, fill = "grey90",
     linetype = 1, width = grid::unit(1, "npc"), padding = ggplot2::margin(5, 5, 5, 5))))


CorPairsPlot(data, group_by = "g", diag_type = "violin", layout = "\\.",
  cor_format = "{x}\n{y}\ncorr: {round(corr, 2)}")


CorPairsPlot(data, split_by = "g", diag_type = "none", layout = ".\\",
 legend.position = "bottom", legend.direction = "horizontal", group_name = "group")


CorPairsPlot(data, split_by = "g",
 palcolor = list("1" = "red", "2" = "blue", "3" = "green", "4" = "yellow"))

# }