datar.apis.forcats

module

datar.apis.forcats

Functions
  • fct_anon(_f, prefix) (Any) Anonymise factor levels
  • fct_c(*fs) (Any) Concatenate factors, combining levels
  • fct_collapse(_f, other_level, **kwargs) (Any) Collapse factor levels into manually defined groups
  • fct_count(_f, sort, prop) (Any) Count entries in a factor
  • fct_cross(*fs, sep, keep_empty) (Any) Combine levels from two or more factors to create a new factor
  • fct_drop(_f, only) (Any) Drop unused levels
  • fct_expand(_f, *additional_levels) (Any) Add additional levels to a factor
  • fct_explicit_na(_f, na_level) (Any) Make missing values explicit
  • fct_infreq(_f, ordered) (Any) Reorder factor levels by frequency
  • fct_inorder(_f, ordered) (Any) Reorder factor levels by first appearance
  • fct_inseq(_f, ordered) (Any) Reorder factor levels by sequence
  • fct_lump(_f, n, prop, w, other_level, ties_method) (Any) Lump together factor levels into "other"
  • fct_lump_lowfreq(_f, other_level) (Any) Lumps together the least frequent levels, ensuring that "other" is still the smallest level.
  • fct_lump_min(_f, min_, w, other_level) (Any) Lumps levels that appear fewer than min_ times.
  • fct_lump_n(_f, n, w, other_level) (Any) Lumps all levels except for the n most frequent.
  • fct_lump_prop(_f, prop, w, other_level) (Any) Lumps levels that appear fewer than prop * n times.
  • fct_match(_f, lvls) (Any) Test for presence of levels in a factor
  • fct_other(_f, keep, drop, other_level) (Any) Replace levels with "other"
  • fct_recode(_f, *args, **kwargs) (Any) Change factor levels by hand
  • fct_relabel(_f, _fun, *args, **kwargs) (Any) Automatically relabel factor levels, collapse as necessary
  • fct_relevel(_f, *lvls, after) (Any) Reorder factor levels by hand
  • fct_reorder(_f, _x, *args, _fun, _desc, **kwargs) (Any) Reorder factor levels by a function (default: median)
  • fct_reorder2(_f, _x, *args, _fun, _desc, **kwargs) (Any) Reorder factor levels by a function (default: last2)
  • fct_rev(_f) (Any) Reverse the order of the levels of a factor
  • fct_shift(_f, n) (Any) Shift the levels of a factor
  • fct_shuffle(_f) (Any) Shuffle the levels of a factor
  • fct_unify(fs, levels) (Any) Unify the levels in a list of factors
  • fct_unique(_f) (Any) Unique values of a factor
  • first2(_x, _y) (Any) Find the first element of _y ordered by _x
  • last2(_x, _y) (Any) Find the last element of _y ordered by _x
  • lvls_expand(_f, new_levels) (Any) Expands the set of levels; the new levels must include the old levels.
  • lvls_reorder(_f, idx, ordered) (Any) Leaves values of a factor as they are, but changes the order of the levels by the given indices.
  • lvls_revalue(_f, new_levels) (Any) Changes the values of existing levels; there must be one new level for each old level.
  • lvls_union(fs) (Any) Find all levels in a list of factors
function

datar.apis.forcats.fct_relevel(_f, *lvls, after=None)

Reorder factor levels by hand

Parameters
  • _f A factor (categorical), or a string vector
  • *lvls Either a function (in which case len(lvls) should equal 1) or the new levels. A function will be called with the current levels as input, and its return value (which must be a character vector) will be used to relevel the factor. Any levels not mentioned will be left in their existing order, by default after the explicitly mentioned levels.
  • after (int, optional) Where should the new values be placed?
Returns (Any)

The factor with levels replaced
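
The placement rule described for *lvls and after can be sketched in plain Python. This is not datar's implementation: the helper name relevel is hypothetical, it works on a bare list of levels, and it assumes that after=None puts the named levels at the front (as forcats does in R) and that after=k inserts them after the k-th remaining level.

```python
def relevel(levels, *lvls, after=None):
    """Return a new level order with the named levels moved (illustrative only)."""
    if len(lvls) == 1 and callable(lvls[0]):
        # A single function receives the current levels and returns the ones to move.
        moved = list(lvls[0](list(levels)))
    else:
        moved = list(lvls)
    # Levels that were not mentioned keep their existing relative order.
    rest = [lv for lv in levels if lv not in moved]
    pos = 0 if after is None else after  # assumption: None means "at the front"
    return rest[:pos] + moved + rest[pos:]


levels = ["a", "b", "c", "d"]
print(relevel(levels, "c", "b"))                              # ['c', 'b', 'a', 'd']
print(relevel(levels, "a", after=2))                          # ['b', 'c', 'a', 'd']
print(relevel(levels, lambda lv: sorted(lv, reverse=True)))   # ['d', 'c', 'b', 'a']
```

Unknown level names and negative values of after are not handled in this sketch.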

function

datar.apis.forcats.fct_inorder(_f, ordered=None)

Reorder factor levels by first appearance

Parameters
  • _f A factor
  • ordered (bool, optional) A logical which determines the "ordered" status of the output factor.
Returns (Any)

The factor with levels reordered

function

datar.apis.forcats.fct_infreq(_f, ordered=None)

Reorder factor levels by frequency

Parameters
  • _f A factor
  • ordered (bool, optional) A logical which determines the "ordered" status of the output factor.
Returns (Any)

The factor with levels reordered

function

datar.apis.forcats.fct_inseq(_f, ordered=None)

Reorder factor levels by sequence

Parameters
  • _f A factor
  • ordered (bool, optional) A logical which determines the "ordered" status of the output factor.
Returns (Any)

The factor with levels reordered
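
The three orderings used by fct_inorder, fct_infreq and fct_inseq can be compared with a small plain-Python sketch. The helper names are hypothetical and operate on lists rather than factors; tie-breaking in levels_infreq (ties keep their existing order) and the numeric parsing in levels_inseq are assumptions of the sketch, not documented datar behavior.

```python
from collections import Counter


def levels_inorder(values):
    # Order of first appearance in the data.
    return list(dict.fromkeys(values))


def levels_infreq(values, levels):
    # Most frequent first; sorted() is stable, so ties keep their current order.
    counts = Counter(values)
    return sorted(levels, key=lambda lv: -counts[lv])


def levels_inseq(levels):
    # Numeric order of the level labels (assumes every label parses as a number).
    return sorted(levels, key=float)


values = ["b", "b", "a", "c", "c", "c"]
print(levels_inorder(values))                   # ['b', 'a', 'c']
print(levels_infreq(values, ["a", "b", "c"]))   # ['c', 'b', 'a']
print(levels_inseq(["10", "2", "1"]))           # ['1', '2', '10']
```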

function

datar.apis.forcats.fct_reorder(_f, _x, *args, _fun=None, _desc=False, **kwargs)

Reorder factor levels by a function (default: median)

Parameters
  • _f A factor
  • _x The data to be used to reorder the factor
  • *args Extra arguments to be passed to _fun
  • _fun (optional) A function to be used to reorder the factor
  • _desc (bool, optional) If True, the factor will be reordered in descending order
  • **kwargs Extra keyword arguments to be passed to _fun
Returns (Any)

The factor with levels reordered
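
As a rough illustration of reordering levels by a summary of _x, the sketch below groups the data by level, applies a summary function (median by default, matching the documented default), and sorts the levels by the result. The function name reorder_levels and its list-based inputs are assumptions for illustration only.

```python
from statistics import median


def reorder_levels(f, x, fun=median, desc=False):
    """Return the levels of f ordered by fun applied to x within each level."""
    groups = {}
    for level, value in zip(f, x):
        groups.setdefault(level, []).append(value)
    summary = {level: fun(vals) for level, vals in groups.items()}
    return sorted(summary, key=summary.get, reverse=desc)


f = ["a", "a", "b", "b", "c", "c"]
x = [5, 7, 1, 9, 2, 4]
print(reorder_levels(f, x))              # medians a=6, b=5, c=3 -> ['c', 'b', 'a']
print(reorder_levels(f, x, desc=True))   # ['a', 'b', 'c']
print(reorder_levels(f, x, fun=max))     # maxima a=7, b=9, c=4 -> ['c', 'a', 'b']
```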

function

datar.apis.forcats.fct_reorder2(_f, _x, *args, _fun=None, _desc=False, **kwargs)

Reorder factor levels by a function (default: last2)

Parameters
  • _f A factor
  • _x The data to be used to reorder the factor
  • *args Extra arguments to be passed to _fun
  • _fun (optional) A function to be used to reorder the factor
  • _desc (bool, optional) If True, the factor will be reordered in descending order
  • **kwargs Extra keyword arguments to be passed to _fun
Returns (Any)

The factor with levels reordered

function

datar.apis.forcats.fct_shuffle(_f)

Shuffle the levels of a factor

Parameters
  • _f A factor
Returns (Any)

The factor with levels shuffled

function

datar.apis.forcats.fct_rev(_f)

Reverse the order of the levels of a factor

Parameters
  • _f A factor
Returns (Any)

The factor with levels reversed

function

datar.apis.forcats.fct_shift(_f, n=1)

Shift the levels of a factor

Parameters
  • _f A factor
  • n (int, optional) The number of levels to shift
Returns (Any)

The factor with levels shifted
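
A minimal sketch of a level shift, assuming the direction used by forcats in R (positive n rotates levels to the left, negative n to the right). Whether datar follows the same direction is an assumption here, and shift_levels is a hypothetical helper acting on a plain list.

```python
def shift_levels(levels, n=1):
    """Rotate the list of levels by n positions (left for positive n)."""
    if not levels:
        return []
    k = n % len(levels)
    return levels[k:] + levels[:k]


print(shift_levels(["a", "b", "c", "d"]))        # ['b', 'c', 'd', 'a']
print(shift_levels(["a", "b", "c", "d"], -1))    # ['d', 'a', 'b', 'c']
```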

function

datar.apis.forcats.first2(_x, _y)

Find the first element of _y ordered by _x

Parameters
  • _x The vector used to order _y
  • _y The vector to get the first element of
Returns (Any)

First element of _y ordered by _x

function

datar.apis.forcats.last2(_x, _y)

Find the last element of _y ordered by _x

Parameters
  • _x The vector used to order _y
  • _y The vector to get the last element of
Returns (Any)

Last element of _y ordered by _x
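
The two helpers first2 and last2 amount to "sort _y by _x, then take one end". The plain-Python versions below are a sketch of that idea on lists, not datar's code; missing-value handling is not modeled.

```python
def first2(x, y):
    """First element of y after ordering by x."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    return y[order[0]]


def last2(x, y):
    """Last element of y after ordering by x."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    return y[order[-1]]


years = [2020, 2018, 2019]
scores = [7.5, 3.0, 9.1]
print(first2(years, scores))  # 3.0 (the score for 2018)
print(last2(years, scores))   # 7.5 (the score for 2020)
```

Since last2 is the documented default summary for fct_reorder2, that function effectively orders levels by the _y value found at the largest _x within each level.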

function

datar.apis.forcats.fct_anon(_f, prefix='')

Anonymise factor levels

Parameters
  • _f A factor.
  • prefix (str, optional) A character prefix to insert in front of the random labels.
Returns (Any)

The factor with levels anonymised

function

datar.apis.forcats.fct_recode(_f, *args, **kwargs)

Change factor levels by hand

Parameters
  • _f A factor
  • *args, **kwargs A sequence of named character vectors where the name gives the new level and the value gives the old level. Levels not otherwise mentioned are left as is. Levels can be removed by naming them NULL. Because NULL/None cannot be the name of a keyword argument, such a replacement has to be given as a dict (i.e. fct_recode(x, {NULL: "apple"})). To replace multiple old values with the same new value, use a set/list/numpy.ndarray (i.e. fct_recode(x, fruit=["apple", "banana"])). This is safe because a set/list/numpy.ndarray is not hashable and so cannot itself be a level of a factor. Do NOT use a tuple, as it is hashable!
    Note that the name-value order is the reverse of dplyr.recode() and dplyr.recode_factor().
Returns (Any)

The factor recoded with given recodings
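
The new-name-to-old-value direction is easy to get backwards, so here is a plain-Python sketch of the mapping. The recode helper is hypothetical and takes a single dict of {new_level: old_level or list of old levels}; it models only the value replacement, not datar's factor handling.

```python
def recode(values, mapping):
    """Replace old levels with new ones; mapping is {new: old or [old, ...]}."""
    old_to_new = {}
    for new, old in mapping.items():
        olds = old if isinstance(old, (list, set)) else [old]
        for o in olds:
            old_to_new[o] = new
    # Values that are not mentioned are left as they are.
    return [old_to_new.get(v, v) for v in values]


values = ["apple", "bear", "banana", "dear"]
print(recode(values, {"fruit": ["apple", "banana"], "animal": "bear"}))
# ['fruit', 'animal', 'fruit', 'dear']
print(recode(values, {None: "dear"}))
# ['apple', 'bear', 'banana', None]
```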

function

datar.apis.forcats.fct_collapse(_f, other_level=None, **kwargs)

Collapse factor levels into manually defined groups

Parameters
  • _f A factor
  • other_level (optional) Level that replaces all levels not named in kwargs. If not given, those levels are left as they are.
  • **kwargs The levels to collapse, given as name=[old_level, old_level1, ...]. The old levels will be replaced with name.
Returns (Any)

The factor with levels collapsed.
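
A small sketch of the collapsing rule, using a hypothetical collapse helper on a plain list of values. It assumes that levels not named in any group are left alone unless other_level is given.

```python
def collapse(values, other_level=None, **groups):
    """Map each old level to its group name; optionally lump the rest."""
    old_to_new = {old: new for new, olds in groups.items() for old in olds}
    out = []
    for v in values:
        if v in old_to_new:
            out.append(old_to_new[v])
        elif other_level is not None:
            out.append(other_level)
        else:
            out.append(v)
    return out


values = ["a", "b", "c", "d"]
print(collapse(values, first=["a"], middle=["b", "c"]))
# ['first', 'middle', 'middle', 'd']
print(collapse(values, other_level="Other", first=["a"], middle=["b", "c"]))
# ['first', 'middle', 'middle', 'Other']
```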

function

datar.apis.forcats.fct_lump(_f, n=None, prop=None, w=None, other_level='Other', ties_method='min')

Lump together factor levels into "other"

Parameters
  • _f A factor
  • n (optional) Positive n preserves the most common n values. Negative n preserves the least common -n values. If there are ties, you will get at least abs(n) values.
  • prop (optional) Positive prop lumps values which do not appear at least prop of the time. Negative prop lumps values that do not appear at most -prop of the time.
  • w (optional) An optional numeric vector giving weights for the frequency of each value (not level) in _f.
  • other_level (optional) Value of level used for "other" values. Always placed at end of levels.
  • ties_method A character string specifying how ties are treated. One of: average, first, dense, max, and min.
Returns (Any)

The factor with levels lumped.

function

datar.apis.forcats.fct_lump_min(_f, min_, w=None, other_level='Other')

Lumps levels that appear fewer than min_ times.

Parameters
  • _f A factor
  • min_ Preserve levels that appear at least min_ number of times.
  • w (optional) An optional numeric vector giving weights for the frequency of each value (not level) in _f.
  • other_level (optional) Value of level used for "other" values. Always placed at end of levels.
Returns (Any)

The factor with levels lumped.
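
To illustrate how min_ and w interact, the sketch below sums a weight per value (1 each when w is omitted) and lumps every level whose total falls below min_. The helper lump_min is hypothetical and list-based; it does not reproduce datar's level ordering.

```python
from collections import Counter


def lump_min(values, min_, w=None, other_level="Other"):
    """Replace levels whose (weighted) count is below min_ with other_level."""
    weights = w if w is not None else [1] * len(values)
    totals = Counter()
    for v, wt in zip(values, weights):
        totals[v] += wt
    return [v if totals[v] >= min_ else other_level for v in values]


values = ["a", "a", "a", "b", "b", "c"]
print(lump_min(values, 2))
# ['a', 'a', 'a', 'b', 'b', 'Other']
print(lump_min(values, 3, w=[1, 1, 1, 1, 1, 5]))
# ['a', 'a', 'a', 'Other', 'Other', 'c']
```

The second call shows why the weights apply per value rather than per level: the single "c" entry carries a weight of 5 and is therefore kept.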

function

datar.apis.forcats.fct_lump_prop(_f, prop, w=None, other_level='Other')

Lumps levels that appear fewer than prop * n times.

Parameters
  • _f A factor
  • prop Positive prop lumps values which do not appear at least prop of the time. Negative prop lumps values that do not appear at most -prop of the time.
  • w (optional) An optional numeric vector giving weights for the frequency of each value (not level) in _f.
  • other_level (optional) Value of level used for "other" values. Always placed at end of levels.
Returns (Any)

The factor with levels lumped.

function

datar.apis.forcats.fct_lump_n(_f, n, w=None, other_level='Other')

Lumps all levels except for the n most frequent.

Parameters
  • _f A factor
  • n Positive n preserves the most common n values. Negative n preserves the least common -n values. If there are ties, you will get at least abs(n) values.
  • w (optional) An optional numeric vector giving weights for the frequency of each value (not level) in _f.
  • other_level (optional) Value of level used for "other" values. Always placed at end of levels.
  • ties_method A character string specifying how ties are treated. One of: average, first, dense, max, and min.
Returns (Any)

The factor with levels lumped.

function

datar.apis.forcats.fct_lump_lowfreq(_f, other_level='Other')

Lumps together the least frequent levels, ensuring that "other" is still the smallest level.

Parameters
  • _f A factor
  • other_level (optional) Value of level used for "other" values. Always placed at end of levels.
Returns (Any)

The factor with levels lumped.

function

datar.apis.forcats.fct_other(_f, keep=None, drop=None, other_level='Other')

Replace levels with "other"

Parameters
  • _f A factor
  • keep (optional), drop (optional) Pick one of keep and drop:
    • keep will preserve listed levels, replacing all others with other_level.
    • drop will replace listed levels with other_level, keeping all others as is.
  • other_level (optional) Value of level used for "other" values. Always placed at end of levels.
Returns (Any)

The factor with levels replaced.
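
The keep/drop choice can be sketched as follows. The helper name other and the error raised when both or neither argument is supplied are assumptions of this example, not documented datar behavior.

```python
def other(values, keep=None, drop=None, other_level="Other"):
    """Replace levels with other_level, driven by exactly one of keep or drop."""
    if (keep is None) == (drop is None):
        raise ValueError("supply exactly one of keep or drop")
    if keep is not None:
        # Everything not listed in keep is replaced.
        return [v if v in keep else other_level for v in values]
    # Only the levels listed in drop are replaced.
    return [other_level if v in drop else v for v in values]


values = ["a", "b", "c", "a"]
print(other(values, keep=["a"]))   # ['a', 'Other', 'Other', 'a']
print(other(values, drop=["a"]))   # ['Other', 'b', 'c', 'Other']
```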

function

datar.apis.forcats.fct_relabel(_f, _fun, *args, **kwargs)

Automatically relabel factor levels, collapse as necessary

Parameters
  • _f A factor
  • _fun A function to be applied to each level. Must accept the old levels and return a character vector of the same length as its input.
  • *args, **kwargs Additional arguments to _fun
Returns (Any)

The factor with levels relabeled
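
The "collapse as necessary" part means that when the relabeling function maps two old levels to the same label, they merge into one level. A plain-Python sketch, with a hypothetical relabel helper that takes the values and levels as separate lists:

```python
def relabel(values, levels, fun, *args, **kwargs):
    """Apply fun to the levels, then rewrite values and deduplicate levels."""
    new_labels = list(fun(levels, *args, **kwargs))
    if len(new_labels) != len(levels):
        raise ValueError("fun must return one label per existing level")
    mapping = dict(zip(levels, new_labels))
    # Duplicate labels collapse into a single level, keeping first-seen order.
    new_levels = list(dict.fromkeys(new_labels))
    return [mapping[v] for v in values], new_levels


values = ["a1", "b1", "a2"]
levels = ["a1", "a2", "b1"]
new_values, new_levels = relabel(values, levels, lambda lv: [s[0] for s in lv])
print(new_values)   # ['a', 'b', 'a']
print(new_levels)   # ['a', 'b']
```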

function

datar.apis.forcats.fct_expand(_f, *additional_levels)

Add additional levels to a factor

Parameters
  • _f A factor
  • *additional_levels Additional levels to add to the factor. Levels that already exist will be silently ignored.
Returns (Any)

The factor with levels expanded

function

datar.apis.forcats.fct_explicit_na(_f, na_level='(Missing)')

Make missing values explicit

This gives missing values an explicit factor level, ensuring that they appear in summaries and on plots.

Parameters
  • _f A factor
  • na_level (optional) Level to use for missing values. This is what NAs will be changed to.
Returns (Any)

The factor with an explicit level for missing values

function

datar.apis.forcats.fct_drop(_f, only=None)

Drop unused levels

Parameters
  • _f A factor
  • only (optional) A character vector restricting the set of levels to be dropped. If supplied, only levels that have no entries and appear in this vector will be removed.
Returns (Any)

The factor with unused levels dropped
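
The effect of only can be shown with a short sketch: a level is removed when it has no entries and, if only is supplied, also appears in only. The helper drop_levels is hypothetical and returns just the surviving levels.

```python
def drop_levels(values, levels, only=None):
    """Return the levels that remain after dropping unused ones."""
    used = set(values)
    return [
        lv for lv in levels
        if lv in used or (only is not None and lv not in only)
    ]


levels = ["a", "b", "c", "d"]
values = ["a", "b", "a"]
print(drop_levels(values, levels))               # ['a', 'b']
print(drop_levels(values, levels, only=["c"]))   # ['a', 'b', 'd']
```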

function

datar.apis.forcats.fct_unify(fs, levels=None)

Unify the levels in a list of factors

Parameters
  • fs A list of factors
  • levels (optional) Set of levels to apply to every factor. Defaults to the union of all factor levels.
Returns (Any)

A list of factors with the levels expanded

function

datar.apis.forcats.fct_c(*fs)

Concatenate factors, combining levels

This is a useful way of patching together factors from multiple sources that really should have the same levels but don't.

Parameters
  • *fs factors to concatenate
Returns (Any)

The concatenated factor

function

datar.apis.forcats.fct_cross(*fs, sep=':', keep_empty=False)

Combine levels from two or more factors to create a new factor

Computes a factor whose levels are all the combinations of the levels of the input factors.

Parameters
  • *fs factors to cross
  • sep (str, optional) A string to separate levels
  • keep_empty (bool, optional) If True, keep combinations with no observations as levels
Returns (Any)

The new factor
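
The difference keep_empty makes is easiest to see on a tiny example. The sketch below crosses two factors given as (values, levels) lists; the helper name cross, the two-factor limit, and the order of the resulting levels are artifacts of this sketch rather than documented datar behavior.

```python
from itertools import product


def cross(f1, levels1, f2, levels2, sep=":", keep_empty=False):
    """Combine two factors element-wise and compute the resulting levels."""
    values = [f"{a}{sep}{b}" for a, b in zip(f1, f2)]
    all_levels = [f"{a}{sep}{b}" for a, b in product(levels1, levels2)]
    if keep_empty:
        levels = all_levels
    else:
        observed = set(values)
        levels = [lv for lv in all_levels if lv in observed]
    return values, levels


f1, f2 = ["a", "a", "b"], ["x", "y", "x"]
print(cross(f1, ["a", "b"], f2, ["x", "y"]))
# (['a:x', 'a:y', 'b:x'], ['a:x', 'a:y', 'b:x'])
print(cross(f1, ["a", "b"], f2, ["x", "y"], keep_empty=True)[1])
# ['a:x', 'a:y', 'b:x', 'b:y']
```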

function

datar.apis.forcats.fct_count(_f, sort=False, prop=False)

Count entries in a factor

Parameters
  • _f A factor
  • sort (bool, optional) If True, sort the result so that the most common values float to the top
  • prop (optional) If True, compute the fraction of marginal table.
Returns (Any)

A data frame with columns f and n, plus p if prop is True
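
A plain-Python sketch of the counting logic, returning a list of row dicts in place of a data frame. The helper name count is hypothetical, and the sketch assumes at least one value so that the proportions are defined.

```python
from collections import Counter


def count(values, levels, sort=False, prop=False):
    """Count entries per level, including levels with no entries."""
    counts = Counter(values)
    rows = [{"f": lv, "n": counts[lv]} for lv in levels]
    if sort:
        rows.sort(key=lambda r: -r["n"])  # most common first
    if prop:
        total = sum(r["n"] for r in rows)
        for r in rows:
            r["p"] = r["n"] / total
    return rows


rows = count(["a", "b", "b", "b"], ["a", "b", "c"], sort=True, prop=True)
for row in rows:
    print(row)
# {'f': 'b', 'n': 3, 'p': 0.75}
# {'f': 'a', 'n': 1, 'p': 0.25}
# {'f': 'c', 'n': 0, 'p': 0.0}
```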

function

datar.apis.forcats.fct_match(_f, lvls)

Test for presence of levels in a factor

Do any of lvls occur in _f?

Parameters
  • _f A factor
  • lvls A vector specifying levels to look for.
Returns (Any)

A logical vector

function

datar.apis.forcats.fct_unique(_f)

Unique values of a factor

Parameters
  • _f A factor
Returns (Any)

The factor with the unique values in _f

function

datar.apis.forcats.lvls_reorder(_f, idx, ordered=None)

Leaves values of a factor as they are, but changes the order of the levels by the given indices

Parameters
  • _f A factor (or character vector).
  • idx An integer index, with one integer for each existing level.
  • ordered (bool, optional) A logical which determines the "ordered" status of the output factor. None preserves the existing status of the factor.
Returns (Any)

The factor with levels reordered

function

datar.apis.forcats.lvls_revalue(_f, new_levels)

Changes the values of existing levels; there must be one new level for each old level

Parameters
  • _f A factor
  • new_levels A character vector of new levels.
Returns (Any)

The factor with the new levels

function

datar.apis.forcats.lvls_expand(_f, new_levels)

Expands the set of levels; the new levels must include the old levels.

Parameters
  • _f A factor
  • new_levels The new levels. Must include the old ones
Returns (Any)

The factor with the new levels

function

datar.apis.forcats.lvls_union(fs)

Find all levels in a list of factors

Parameters
  • fs A list of factors
Returns (Any)

A list of all levels
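
Collecting every level across a list of factors can be sketched as an order-preserving union. Here each factor is represented only by its list of levels, the helper name union_levels is hypothetical, and first-seen ordering is an assumption of the sketch.

```python
def union_levels(level_lists):
    """Union of levels across factors, keeping the order of first appearance."""
    return list(dict.fromkeys(lv for levels in level_lists for lv in levels))


print(union_levels([["a", "b"], ["b", "c"], ["a", "d"]]))
# ['a', 'b', 'c', 'd']
```

A union like this is what fct_unify is documented to use when its levels argument is left at the default.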