base-funs
%run nb_helpers.py
from datar.all import *
debug_kwargs = {'prefix': '\n', 'sep': f'\n{"-" * 20}\n'}
nb_header(
    cut, diff, identity, expand_grid, outer,
    make_names, make_unique, rank,
)
★ cut
Cut a numeric vector into bins
Args:
x: A numeric vector
breaks: The cut points, or a single number giving the number of bins
labels: The labels for the resulting bins
include_lowest: Whether a value equal to the lowest break should be included
right: Whether the intervals are closed on the right (and open on the left)
dig_lab: The number of digits used when formatting the break values as labels
ordered_result: Whether to return an ordered factor
Returns:
The factor vector
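The binning rule can be illustrated with a small pure-Python sketch. This is not datar's implementation: the helper name `equal_width_bins` is hypothetical, and the sketch assumes equal-width bins, right-closed intervals, and a lowest edge nudged down by 0.1% of the range, which is chosen here only to match the `(0.99, 4.0]` label in the output shown further down this page.

```python
from bisect import bisect_left


def equal_width_bins(values, n_bins):
    """Hypothetical helper: assign each value to one of n_bins right-closed bins."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("values must span a non-zero range")
    width = (hi - lo) / n_bins
    # Assumption: lower the first edge slightly so the minimum value
    # falls inside the first right-closed interval (a, b].
    edges = [lo - (hi - lo) * 0.001]
    edges += [lo + i * width for i in range(1, n_bins + 1)]
    # For (a, b] intervals, a value equal to an edge belongs to the bin
    # that ends at that edge, which is what bisect_left gives us.
    codes = [bisect_left(edges, v) - 1 for v in values]
    return edges, codes


edges, codes = equal_width_bins(list(range(1, 11)), 3)
print(codes)
```

The bin codes group the values 1 to 4, 5 to 7, and 8 to 10, the same grouping as `cut(seq(1,10), 3)` in the example output.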
★ diff
Difference of a numeric vector
Args:
x: A numeric vector
lag: The lag to use. Could be negative. It always calculates x[lag:] - x[:-lag], even when lag is negative.
differences: The order of the difference
Returns:
An array of x[lag:] - x[:-lag]. If differences > 1, the rule is applied differences times to x.
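As a rough illustration of the rule above, the following pure-Python sketch applies `x[lag:] - x[:-lag]` element-wise and repeats it `differences` times. The name `lagged_diff` is hypothetical, the sketch handles only positive lags, and it is not taken from datar's source.

```python
def lagged_diff(x, lag=1, differences=1):
    """Hypothetical helper: apply x[lag:] - x[:-lag] `differences` times."""
    if lag < 1:
        raise ValueError("this sketch only covers positive lags")
    out = list(x)
    for _ in range(differences):
        # Pair each element with the one `lag` positions later and subtract.
        out = [later - earlier for earlier, later in zip(out[:-lag], out[lag:])]
    return out


print(lagged_diff([1, 2, 3]))                     # [1, 1]
print(lagged_diff([1, 4, 9, 16], differences=2))  # [2, 2]
print(lagged_diff([1, 4, 9, 16], lag=2))          # [8, 12]
```

Each pass shortens the sequence by `lag` elements, so the result has `len(x) - lag * differences` elements when that number is positive.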
★ identity
★ expand_grid
★ outer
Outer product of two vectors
Args:
x: A numeric vector
y: A numeric vector
fun: The function that computes the result from the elements of the first and second vectors. It has to be vectorized over its second argument and return a result with the same shape as y.
Returns:
The outer product
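The contract for `fun` can be sketched in plain Python: it is called once per element of `x`, receives the whole of `y` as its second argument, and must return one value per element of `y`. The helper `outer_sketch` below is hypothetical and uses nested lists instead of a data frame; the default of element-wise multiplication is an assumption based on the example output on this page.

```python
def outer_sketch(x, y, fun=None):
    """Hypothetical helper: build a len(x) by len(y) table from fun(x_i, y)."""
    if fun is None:
        # Assumed default: ordinary product, vectorized over the second argument.
        fun = lambda xi, ys: [xi * yj for yj in ys]
    rows = []
    for xi in x:
        row = list(fun(xi, y))
        if len(row) != len(y):
            raise ValueError("fun must return one value per element of y")
        rows.append(row)
    return rows


print(outer_sketch([1, 2], [3, 4]))  # [[3, 4], [6, 8]]
print(outer_sketch([1, 2], [3, 4], fun=lambda xi, ys: [xi + yj for yj in ys]))  # [[4, 5], [5, 6]]
```

Row `i` of the result corresponds to `x[i]` and column `j` to `y[j]`.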
★ make_names
★ make_unique
★ rank
debug(
    cut(seq(1, 10), 3),
    diff([1, 2, 3]),
    identity(1.23),
    expand_grid([1, 2], [3, 4]),
    outer([1, 2], [3, 4]),
    make_names([1, 2, 3]),
    make_unique([1, 1, 1]),
    rank([3, 4, 1, -1]),
    **debug_kwargs
)
[2022-12-02 13:26:45][datar][WARNING] New names:
[2022-12-02 13:26:45][datar][WARNING] * '_1' -> '__0'
[2022-12-02 13:26:45][datar][WARNING] * '_2' -> '__1'
[2022-12-02 13:26:45][datar][WARNING] * '_3' -> '__2'
[2022-12-02 13:26:45][datar][WARNING] New names:
[2022-12-02 13:26:45][datar][WARNING] * '_1' -> '__0'
[2022-12-02 13:26:45][datar][WARNING] * '_1' -> '__1'
[2022-12-02 13:26:45][datar][WARNING] * '_1' -> '__2'
cut(seq(1,10), 3)
--------------------
[(0.99, 4.0], (0.99, 4.0], (0.99, 4.0], (0.99, 4.0], (4.0, 7.0], (4.0, 7.0], (4.0, 7.0], (7.0, 10.0], (7.0, 10.0], (7.0, 10.0]]
Categories (3, interval[float64, right]): [(0.99, 4.0] < (4.0, 7.0] < (7.0, 10.0]]

diff([1, 2, 3])
--------------------
array([1, 1])

identity(1.23)
--------------------
1.23

expand_grid([1,2], [3,4])
--------------------
   _VAR_0  _VAR_1
  <int64> <int64>
0       1       3
1       1       4
2       2       3
3       2       4

outer([1,2], [3,4])
--------------------
        0       1
  <int64> <int64>
0       3       4
1       6       8

make_names([1, 2, 3])
--------------------
['__0', '__1', '__2']

make_unique([1, 1, 1])
--------------------
['__0', '__1', '__2']

rank([3, 4, 1, -1])
--------------------
array([3., 4., 2., 1.])