Converts raw input (long data frame, wide data frame, named list, or an
existing UpsetPlotData object) into the internal format expected by
UpsetPlotAtomic(). This is the data-preparation workhorse
that handles all input format parsing.
Usage
prepare_upset_data(
data,
in_form = "auto",
group_by = NULL,
group_by_sep = "_",
id_by = NULL,
specific = TRUE
)Arguments
- data
A data frame, a named list of element vectors, or an
UpsetPlotDataobject.- in_form
A character string specifying the input format. One of
"auto"(default; detect fromdatastructure),"long","wide","list", or"upset".- group_by
A character vector of column name(s) defining the sets. In long format, this is the column with set labels. In wide format, these are the set-membership columns. When
NULLin wide format, all columns exceptid_byare used. Ignored for list and upset inputs.- group_by_sep
A character string to concatenate multiple
group_bycolumns whenin_form = "long". Default:"_".- id_by
A character string specifying the column name for instance identifiers. Required for long format; optional for wide format (a synthetic
.idcolumn is created if omitted).- specific
A logical value. When
TRUE(default), only specific intersections are returned (elements belonging exclusively to the shown set combination). WhenFALSE, all overlapping items are included. See https://github.com/gaospecial/ggVennDiagram/issues/64.
Value
An UpsetPlotData object — a data frame with an
Intersection list-column of set labels and one row per element.
A "group_order" attribute preserves the original set order.
Input formats
Long format — one row per (set, element) pair:
group_by id_by
A a1
A a2
B a1
B a3
Requires both group_by and id_by.
Wide format — each row is an element, each set has a logical or 0/1 membership column:
A B
TRUE TRUE
TRUE FALSE
FALSE TRUE
The set columns are identified by group_by (or all columns except
id_by when group_by = NULL).
List format — a named list of element vectors:
UpsetPlotData — a pre-processed object (returned by this function)
is returned unchanged (with a warning if group_by is also provided).
