rows
%run nb_helpers.py
from datar.all import *
nb_header(
    rows_insert,
    rows_update,
    rows_patch,
    rows_upsert,
    rows_delete,
    book='rows'
)
★ rows_insert
Insert rows from y into x
See original API
https://dplyr.tidyverse.org/reference/rows.html
Args:
x
: A data frame
y
: A data frame
by
: An unnamed character vector giving the key columns.
The key columns must exist in both x and y.
Keys typically uniquely identify each row, but this is only
enforced for the key values of y
By default, we use the first column in y, since the first column is
a reasonable place to put an identifier variable.
conflict
: How to handle conflicts
- "error": Throw an error
- "ignore": Ignore conflicts
copy
: If x and y are not from the same data source, and copy is True,
then y will be copied into the same src as x.
This allows you to join tables across srcs, but it is a potentially
expensive operation so you must opt into it.
in_place
: Should x be modified in place?
This may not be supported, depending on the backend implementation.
Returns:
A data frame with all existing rows and potentially new rows
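The conflict handling described above can be sketched in plain Python. This is only an illustration of the semantics, not datar's implementation: `insert_rows` and the list-of-dicts representation are hypothetical, `None` stands in for NA, and x is assumed to be non-empty.

```python
from typing import Any

Row = dict[str, Any]


def insert_rows(x: list[Row], y: list[Row], by: str, conflict: str = "error") -> list[Row]:
    """Hypothetical helper: append the rows of y whose key is not yet in x."""
    existing = {row[by] for row in x}
    if conflict == "error" and any(row[by] in existing for row in y):
        raise ValueError("Attempting to insert duplicate rows.")
    columns = list(x[0])  # assumes x is non-empty
    # Columns missing from y are filled with None (standing in for NA).
    new_rows = [
        {col: row.get(col) for col in columns}
        for row in y
        if row[by] not in existing
    ]
    return x + new_rows


data = [
    {"a": 1, "b": "a", "c": 0.5},
    {"a": 2, "b": "b", "c": 1.5},
    {"a": 3, "b": None, "c": 2.5},
]
inserted = insert_rows(data, [{"a": 4, "b": "z"}], by="a")
ignored = insert_rows(data, [{"a": 3, "b": "z"}], by="a", conflict="ignore")
```

With `conflict="ignore"`, the clashing row is dropped and x comes back unchanged.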
★ rows_update
Update rows in x with values from y
See original API
https://dplyr.tidyverse.org/reference/rows.html
Args:
x
: A data frame
y
: A data frame
by
: An unnamed character vector giving the key columns.
The key columns must exist in both x and y.
Keys typically uniquely identify each row, but this is only
enforced for the key values of y
By default, we use the first column in y, since the first column is
a reasonable place to put an identifier variable.
unmatched
: How should keys in y that are unmatched by the keys in x be handled?
- "error" (the default): Throw an error if any keys in y are unmatched by the keys in x.
- "ignore": Ignore rows in y with keys that are unmatched by the keys in x.
copy
: If x and y are not from the same data source, and copy is True,
then y will be copied into the same src as x.
This allows you to join tables across srcs, but it is a potentially
expensive operation so you must opt into it.
in_place
: Should x be modified in place?
This may not be supported, depending on the backend implementation.
Returns:
A data frame with values in matched rows overwritten and the number of rows preserved
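As a rough plain-Python sketch of these semantics (not datar's implementation): `update_rows` is a hypothetical helper working on lists of dicts, and the error message is illustrative only.

```python
from typing import Any

Row = dict[str, Any]


def update_rows(x: list[Row], y: list[Row], by: str, unmatched: str = "error") -> list[Row]:
    """Hypothetical helper: overwrite values in x with those from matching rows of y."""
    position = {row[by]: i for i, row in enumerate(x)}
    missing = [row[by] for row in y if row[by] not in position]
    if missing and unmatched == "error":
        # Illustrative message, not the one datar emits.
        raise ValueError(f"Keys in y not found in x: {missing}")
    result = [dict(row) for row in x]  # copy so x is left untouched
    for row in y:
        if row[by] in position:
            result[position[row[by]]].update(row)
    return result


data = [
    {"a": 1, "b": "a", "c": 0.5},
    {"a": 2, "b": "b", "c": 1.5},
    {"a": 3, "b": None, "c": 2.5},
]
updated = update_rows(data, [{"a": 2, "b": "z"}, {"a": 3, "b": "z"}], by="a")
skipped = update_rows(data, [{"a": 9, "b": "z"}], by="a", unmatched="ignore")
```

The number of rows never changes: unmatched keys either raise or are skipped.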
★ rows_patch
Patch rows in x with values from y
See original API
https://dplyr.tidyverse.org/reference/rows.html
Args:
x
: A data frame
y
: A data frame
by
: An unnamed character vector giving the key columns.
The key columns must exist in both x and y.
Keys typically uniquely identify each row, but this is only
enforced for the key values of y
By default, we use the first column in y, since the first column is
a reasonable place to put an identifier variable.
unmatched
: How should keys in y that are unmatched by the keys in x be handled?
- "error" (the default): Throw an error if any keys in y are unmatched by the keys in x.
- "ignore": Ignore rows in y with keys that are unmatched by the keys in x.
copy
: If x and y are not from the same data source, and copy is True,
then y will be copied into the same src as x.
This allows you to join tables across srcs, but it is a potentially
expensive operation so you must opt into it.
in_place
: Should x be modified in place?
This may not be supported, depending on the backend implementation.
Returns:
A data frame with NA values overwritten and the number of rows preserved
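The difference from an update is that only missing values are filled in. A minimal plain-Python sketch, assuming a hypothetical `patch_rows` helper, lists of dicts as the data representation, `None` standing in for NA, and columns of y that all exist in x:

```python
from typing import Any

Row = dict[str, Any]


def patch_rows(x: list[Row], y: list[Row], by: str, unmatched: str = "error") -> list[Row]:
    """Hypothetical helper: fill None values in x from matching rows of y."""
    position = {row[by]: i for i, row in enumerate(x)}
    missing = [row[by] for row in y if row[by] not in position]
    if missing and unmatched == "error":
        # Illustrative message, not the one datar emits.
        raise ValueError(f"Keys in y not found in x: {missing}")
    result = [dict(row) for row in x]  # copy so x is left untouched
    for row in y:
        i = position.get(row[by])
        if i is None:
            continue  # unmatched key, ignored
        for col, value in row.items():
            if result[i][col] is None:  # existing values are never overwritten
                result[i][col] = value
    return result


data = [
    {"a": 1, "b": "a", "c": 0.5},
    {"a": 2, "b": "b", "c": 1.5},
    {"a": 3, "b": None, "c": 2.5},
]
patched = patch_rows(data, [{"a": 2, "b": "z"}, {"a": 3, "b": "z"}], by="a")
```

Here the row with `a == 2` keeps its `"b"` value, while the missing value in the row with `a == 3` becomes `"z"`.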
★ rows_upsert
Upsert rows in x with values from y
See original API
https://dplyr.tidyverse.org/reference/rows.html
Args:
x
: A data frame
y
: A data frame
by
: An unnamed character vector giving the key columns.
The key columns must exist in both x and y.
Keys typically uniquely identify each row, but this is only
enforced for the key values of y
By default, we use the first column in y, since the first column is
a reasonable place to put an identifier variable.
copy
: If x and y are not from the same data source, and copy is True,
then y will be copied into the same src as x.
This allows you to join tables across srcs, but it is a potentially
expensive operation so you must opt into it.
in_place
: Should x be modified in place?
This may not be supported, depending on the backend implementation.
Returns:
A data frame with rows inserted or updated, depending on whether or not
the key value in y already exists in x. Key values in y must be unique.
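The insert-or-update behavior can be sketched in plain Python. This is an illustration under stated assumptions, not datar's implementation: `upsert_rows` is a hypothetical helper, rows are dicts, `None` stands in for NA, and x is assumed to be non-empty.

```python
from typing import Any

Row = dict[str, Any]


def upsert_rows(x: list[Row], y: list[Row], by: str) -> list[Row]:
    """Hypothetical helper: update rows whose key exists in x, insert the rest."""
    columns = list(x[0])  # assumes x is non-empty
    result = [dict(row) for row in x]  # copy so x is left untouched
    position = {row[by]: i for i, row in enumerate(result)}
    for row in y:
        if row[by] in position:
            result[position[row[by]]].update(row)
        else:
            position[row[by]] = len(result)
            # Columns missing from y are filled with None (standing in for NA).
            result.append({col: row.get(col) for col in columns})
    return result


data = [
    {"a": 1, "b": "a", "c": 0.5},
    {"a": 2, "b": "b", "c": 1.5},
    {"a": 3, "b": None, "c": 2.5},
]
upserted = upsert_rows(data, [{"a": key, "b": "z"} for key in (2, 3, 4)], by="a")
```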
★ rows_delete
Delete rows in x that match keys in y
See original API
https://dplyr.tidyverse.org/reference/rows.html
Args:
x
: A data frame
y
: A data frame
by
: An unnamed character vector giving the key columns.
The key columns must exist in both x and y.
Keys typically uniquely identify each row, but this is only
enforced for the key values of y
By default, we use the first column in y, since the first column is
a reasonable place to put an identifier variable.
unmatched
: How should keys in y that are unmatched by the keys in x be handled?
- "error" (the default): Throw an error if any keys in y are unmatched by the keys in x.
- "ignore": Ignore rows in y with keys that are unmatched by the keys in x.
copy
: If x and y are not from the same data source, and copy is True,
then y will be copied into the same src as x.
This allows you to join tables across srcs, but it is a potentially
expensive operation so you must opt into it.
in_place
: Should x be modified in place?
This may not be supported, depending on the backend implementation.
Returns:
A data frame with rows deleted
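A plain-Python sketch of key-based deletion, including multi-column keys. It is not datar's implementation: `delete_rows` is a hypothetical helper, rows are dicts, and `None` stands in for NA.

```python
from typing import Any

Row = dict[str, Any]


def delete_rows(x: list[Row], y: list[Row], by: list[str], unmatched: str = "error") -> list[Row]:
    """Hypothetical helper: drop the rows of x whose key columns match a row of y."""

    def key(row: Row) -> tuple:
        return tuple(row[col] for col in by)

    targets = {key(row) for row in y}  # columns of y outside `by` are ignored
    present = {key(row) for row in x}
    if unmatched == "error" and not targets <= present:
        raise ValueError("Attempting to delete missing rows.")
    return [row for row in x if key(row) not in targets]


data = [
    {"a": 1, "b": "a", "c": 0.5},
    {"a": 2, "b": "b", "c": 1.5},
    {"a": 3, "b": None, "c": 2.5},
]
remaining = delete_rows(data, [{"a": 2}, {"a": 3}], by=["a"])
```

With `by=["a", "b"]` and keys `(2, "b")` and `(3, "b")`, the second key has no match in x, so the sketch raises unless `unmatched="ignore"` is passed.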
data = tibble(a = seq(1, 3), b = c(letters[[0, 1]], NA), c = [.5, 1.5, 2.5])
data
|   | a | b | c |
|---|---|---|---|
|   | <int64> | <object> | <float64> |
| 0 | 1 | a | 0.5 |
| 1 | 2 | b | 1.5 |
| 2 | 3 | NaN | 2.5 |
rows_insert(data, tibble(a = 4, b = "z"))
[2022-12-02 14:22:14][datar][ INFO] Matching, by='a'
|   | a | b | c |
|---|---|---|---|
|   | <int64> | <object> | <float64> |
| 0 | 1 | a | 0.5 |
| 1 | 2 | b | 1.5 |
| 2 | 3 | NaN | 2.5 |
| 3 | 4 | z | NaN |
with try_catch():
    rows_insert(data, tibble(a = 3, b = "z"))
[2022-12-02 14:22:14][datar][ INFO] Matching, by='a'
[ValueError] Attempting to insert duplicate rows.
rows_update(data, tibble(a = [2,3], b = "z"))
[2022-12-02 14:22:15][datar][ INFO] Matching, by='a'
|   | a | b | c |
|---|---|---|---|
|   | <int64> | <object> | <float64> |
| 0 | 1 | a | 0.5 |
| 1 | 2 | z | 1.5 |
| 2 | 3 | z | 2.5 |
rows_update(data, tibble(b = "z", a = [2,3]), by = "a")
|   | a | b | c |
|---|---|---|---|
|   | <int64> | <object> | <float64> |
| 0 | 1 | a | 0.5 |
| 1 | 2 | z | 1.5 |
| 2 | 3 | z | 2.5 |
rows_patch(data, tibble(a = [2,3], b = "z"))
[2022-12-02 14:22:17][datar][ INFO] Matching, by='a'
|   | a | b | c |
|---|---|---|---|
|   | <int64> | <object> | <float64> |
| 0 | 1 | a | 0.5 |
| 1 | 2 | b | 1.5 |
| 2 | 3 | z | 2.5 |
rows_upsert(data, tibble(a = seq(2, 4), b = "z"))
[2022-12-02 14:22:18][datar][ INFO] Matching, by='a'
|   | a | b | c |
|---|---|---|---|
|   | <int64> | <object> | <float64> |
| 0 | 1 | a | 0.5 |
| 1 | 2 | z | 1.5 |
| 2 | 3 | z | 2.5 |
| 3 | 4 | z | NaN |
rows_delete(data, tibble(a = [2, 3]))
[2022-12-02 14:22:18][datar][ INFO] Matching, by='a'
|   | a | b | c |
|---|---|---|---|
|   | <int64> | <object> | <float64> |
| 0 | 1 | a | 0.5 |
rows_delete(data, tibble(a = [2, 3], b = "b"))
[2022-12-02 14:22:19][datar][ INFO] Matching, by='a'
[2022-12-02 14:22:19][datar][ INFO] Ignoring extra columns: ['b']
|   | a | b | c |
|---|---|---|---|
|   | <int64> | <object> | <float64> |
| 0 | 1 | a | 0.5 |
with try_catch():
    rows_delete(data, tibble(a = [2,3], b = "b"), by = c("a", "b"))
[ValueError] Attempting to delete missing rows.