Skip to contents

Word cloud plot to illustrate the count/frequency of words.

Usage

WordCloudPlot(
  data,
  word_by = NULL,
  sentence_by = NULL,
  count_by = NULL,
  score_by = NULL,
  count_name = NULL,
  score_name = NULL,
  split_by = NULL,
  split_by_sep = "_",
  words_excluded = plotthis::words_excluded,
  score_agg = mean,
  minchar = 2,
  word_size = c(2, 8),
  top_words = 100,
  facet_by = NULL,
  facet_scales = "fixed",
  facet_ncol = NULL,
  facet_nrow = NULL,
  facet_byrow = TRUE,
  theme = "theme_this",
  theme_args = list(),
  palette = "Spectral",
  palcolor = NULL,
  alpha = 1,
  palreverse = FALSE,
  aspect.ratio = 1,
  legend.position = "right",
  legend.direction = "vertical",
  title = NULL,
  subtitle = NULL,
  seed = 8525,
  combine = TRUE,
  nrow = NULL,
  ncol = NULL,
  byrow = TRUE,
  ...
)

Arguments

data

A data frame.

word_by

A character string of the column name to use as the word. A character column is expected.

sentence_by

A character string of the column name to split the sentence. A character column is expected. Either word_by or sentence_by should be specified.

count_by

A character string of the column name for the count of the word/sentence. A numeric column is expected. If NULL, the count of the word/sentence will be used.

score_by

A character string of the column name for the score of the word/sentence. A numeric column is expected, used for the color of the word cloud. If NULL, the score will be set to 1.

count_name

A character string to name the legend of count.

score_name

A character string to name the legend of score.

split_by

The column(s) to split data by and plot separately.

split_by_sep

The separator for multiple split_by columns. See split_by

words_excluded

A character vector of words to exclude from the word cloud.

score_agg

A function to aggregate the scores. Default is mean.

minchar

A numeric value specifying the minimum number of characters for the word.

word_size

A numeric vector specifying the range of the word size.

top_words

A numeric value specifying the number of top words to show.

facet_by

A character string specifying the column name of the data frame to facet the plot. Otherwise, the data will be split by split_by and generate multiple plots and combine them into one using patchwork::wrap_plots

facet_scales

Whether to scale the axes of facets. Default is "fixed" Other options are "free", "free_x", "free_y". See ggplot2::facet_wrap

facet_ncol

A numeric value specifying the number of columns in the facet. When facet_by is a single column and facet_wrap is used.

facet_nrow

A numeric value specifying the number of rows in the facet. When facet_by is a single column and facet_wrap is used.

facet_byrow

A logical value indicating whether to fill the plots by row. Default is TRUE.

theme

A character string or a theme class (i.e. ggplot2::theme_classic) specifying the theme to use. Default is "theme_this".

theme_args

A list of arguments to pass to the theme function.

palette

A character string specifying the palette to use. A named list or vector can be used to specify the palettes for different split_by values.

palcolor

A character string specifying the color to use in the palette. A named list can be used to specify the colors for different split_by values. If some values are missing, the values from the palette will be used (palcolor will be NULL for those values).

alpha

A numeric value specifying the transparency of the plot.

palreverse

A logical value to reverse the palette colors.

aspect.ratio

A numeric value specifying the aspect ratio of the plot.

legend.position

A character string specifying the position of the legend. if waiver(), for single groups, the legend will be "none", otherwise "right".

legend.direction

A character string specifying the direction of the legend.

title

A character string specifying the title of the plot. A function can be used to generate the title based on the default title. This is useful when split_by is used and the title needs to be dynamic.

subtitle

A character string specifying the subtitle of the plot.

seed

The random seed to use. Default is 8525.

combine

Whether to combine the plots into one when facet is FALSE. Default is TRUE.

nrow

A numeric value specifying the number of rows in the facet.

ncol

A numeric value specifying the number of columns in the facet.

byrow

A logical value indicating whether to fill the plots by row.

...

Additional arguments.

Value

A ggplot object or wrap_plots object or a list of ggplot objects

Examples

data <- data.frame(
   word = c("apple", "banana", "cherry", "date", "elderberry"),
   count = c(10, 20, 30, 40, 50),
   score = c(1, 2, 3, 4, 5)
)
WordCloudPlot(data, word_by = "word", count_by = "count", score_by = "score")