Backends
The datar package is a collection of APIs ported from several R packages. The APIs are defined in a backend-agnostic way, so the same functions can be used with different backends. Currently, datar supports the following backends:
numpy: Mostly the implementations of functions from datar.base.
pandas: Implementations using pandas as backend.
Installation of a backend
pip install -U "datar[pandas]"
Using desired backends
You can install multiple backends and enable only a subset of them.
from datar import options
options(backends=['pandas'])
# Then import the API functions
Writing a backend
A backend is implemented as a Simplug plugin. There are a number of hooks to be implemented.
Hooks
setup(): called before any API is imported. You can do some setup here.
get_versions(): return a dict of versions of the dependencies of the backend. The keys are the names of the packages, and the values are the versions.
load_dataset(name: str, metadata: Mapping): load a dataset, which can then be imported using from datar.data import <dataset>.
base_api(): load the implementation of datar.apis.base.
dplyr_api(): load the implementation of datar.apis.dplyr.
tibble_api(): load the implementation of datar.apis.tibble.
forcats_api(): load the implementation of datar.apis.forcats.
tidyr_api(): load the implementation of datar.apis.tidyr.
other_api(): load other backend-specific APIs.
c_getitem(item): load the implementation of datar.base.c.__getitem__ (c[...]).
operate(op: str, x: Any, y: Any = None): load the implementation of the operators.
Selecting a backend at runtime
You can pass the __backend keyword argument to an API call to select a backend at runtime.
from datar.tibble import tibble
tibble(..., __backend="pandas")
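One way to picture per-call selection is a function that pops the __backend keyword and looks up an implementation in a registry. The sketch below is only an illustration of that idea, not how datar dispatches internally; make_table, _IMPLS and DEFAULT_BACKEND are hypothetical names.

```python
from typing import Any, Callable, Dict, Tuple


def _numpy_impl(*cols: Any) -> Tuple[str, Tuple[Any, ...]]:
    # Hypothetical implementation: just records which backend ran.
    return ("numpy", cols)


def _pandas_impl(*cols: Any) -> Tuple[str, Tuple[Any, ...]]:
    # Hypothetical implementation: just records which backend ran.
    return ("pandas", cols)


# Hypothetical registry: backend name -> implementation of one API function.
_IMPLS: Dict[str, Callable[..., Any]] = {
    "numpy": _numpy_impl,
    "pandas": _pandas_impl,
}

DEFAULT_BACKEND = "numpy"  # assumption for this sketch only


def make_table(*cols: Any, **kwargs: Any) -> Any:
    """Dispatch to the implementation named by the __backend keyword."""
    backend = kwargs.pop("__backend", DEFAULT_BACKEND)
    if kwargs:
        raise TypeError(f"Unexpected keyword arguments: {sorted(kwargs)}")
    try:
        impl = _IMPLS[backend]
    except KeyError:
        raise ValueError(f"Unknown backend: {backend!r}") from None
    return impl(*cols)
```

Popping the keyword before forwarding the call keeps the backend choice out of the implementation's own arguments.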
Selecting a backend for operators
If you have multiple backends installed, you can select a backend for operators.
from datar.core.operator import DatarOperator
DatarOperator.backend = "pandas"
# Or use the context manager
with DatarOperator.with_backend("pandas"):
    data >> mutate(z=f.x + f.y)
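The two styles above differ in scope: assigning to backend changes the selection until it is reassigned, while the context manager applies it only inside the with block. A minimal sketch of that pattern, using a hypothetical BackendSelector class rather than datar's actual implementation, could look like this:

```python
from contextlib import contextmanager
from typing import Iterator, Optional


class BackendSelector:
    """Hypothetical stand-in for an object with a switchable backend."""

    # None means "no explicit choice" in this sketch.
    backend: Optional[str] = None

    @classmethod
    @contextmanager
    def with_backend(cls, backend: Optional[str]) -> Iterator[None]:
        """Temporarily set the backend, restoring the old value on exit."""
        previous = cls.backend
        cls.backend = backend
        try:
            yield
        finally:
            # Restore even if the body of the with block raised.
            cls.backend = previous
```

The try/finally is the important part of the design: the previous selection comes back even when the code inside the block fails.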
Selecting a backend for c[]
from datar.base import c
c.backend = "pandas"
# Or use the context manager
with c.with_backend("pandas"):
    data >> mutate(z=c[1:3])
Selecting a backend for numpy ufuncs
import numpy as np
from datar.apis.other import array_ufunc
array_ufunc.backend = "pandas"
# Or use the context manager
with array_ufunc.with_backend("pandas"):
    data >> mutate(z=np.sin(f.x))