DataModule
DataModule from plkit is a subclass of LightningDataModule from pytorch-lightning, so everything documented by pytorch-lightning for LightningDataModule applies here as well.
Other than that, we have two additional methods defined: data_reader to read the data into a collection and data_splits to split the data collection for training, validation and testing.
The data_reader method must be defined if you want to use the features of plkit's DataModule (automatic data splitting, for example). Once it is defined, you don't need to implement the prepare_data and setup methods that LightningDataModule expects.
data_reader
data_reader can either return the data as a list of samples or yield the samples one at a time to save memory.
Note
If data_reader yields samples (that is, it is a generator), plkit.data.IterDataset will be used, and the samples cannot be shuffled.
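The difference between the two reading styles can be sketched in plain Python. This does not use plkit at all: read_as_list and read_as_generator are hypothetical stand-ins for the two ways a data_reader could be written, and the sample layout is made up for illustration. How plkit itself detects a generator is not stated here, so the inspect check is only an assumption about one way to tell the two apart.

```python
import inspect
from typing import Iterator, List, Tuple

# (sample_name, feature): an illustrative sample layout, not a plkit type
Sample = Tuple[str, float]


def read_as_list() -> List[Sample]:
    """Eager reader: builds the whole collection in memory."""
    return [(f"s{i}", float(i)) for i in range(4)]


def read_as_generator() -> Iterator[Sample]:
    """Lazy reader: produces one sample at a time."""
    for i in range(4):
        yield (f"s{i}", float(i))


# A list has a length and supports random access, so it can be shuffled.
samples = read_as_list()
print(len(samples), samples[2])

# A generator can only be consumed in order, so there is nothing to shuffle.
stream = read_as_generator()
print(inspect.isgeneratorfunction(read_as_generator), next(stream))
```

The lazy version never holds more than one sample at a time, which is the memory saving mentioned above, at the cost of losing random access.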
Tip
You can yield multiple features, as well as the labels at the same time as a tuple. For example:
import plkit

class MyData(plkit.DataModule):
    ...
    def data_reader(self):
        # Each sample is a tuple: a name, two features and the label
        yield (sample_name, feature_a, feature_b, label)
Then in training_step, validation_step or test_step, you can easily decouple them like this:
class MyModule(plkit.Module):
    ...
    def training_step(self, batch, _):
        # Each variable has this batch of features
        sample_name, feature_a, feature_b, label = batch
The same applies when data_reader returns a list of samples instead of yielding them.
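Why the unpacking works can be illustrated without any framework: batching turns a list of per-sample tuples into one group per field. The sketch below uses a hypothetical transpose_batch helper to show only that regrouping. A real data loader would typically also convert numeric fields into tensors, which is left out here.

```python
from typing import List, Sequence, Tuple


def transpose_batch(samples: Sequence[Tuple]) -> Tuple[Tuple, ...]:
    """Group the i-th field of every sample together (hypothetical helper)."""
    return tuple(zip(*samples))


# Three samples shaped like (sample_name, feature_a, feature_b, label)
samples: List[Tuple] = [
    ("s0", 0.1, 1.0, 0),
    ("s1", 0.2, 2.0, 1),
    ("s2", 0.3, 3.0, 0),
]

batch = transpose_batch(samples)
sample_name, feature_a, feature_b, label = batch
print(sample_name)  # ('s0', 's1', 's2')
print(label)        # (0, 1, 0)
```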
data_splits
This method splits the data collection returned (or yielded) by data_reader. If data_tvt is specified in the configuration (see data_tvt in configuration), the collection will be split automatically based on data_tvt. Otherwise, you can split the data yourself by returning a dictionary like this from data_splits:
from plkit.data import DataModule, Dataset, IterDataset
class MyData(DataModule):
    ...
    def data_splits(self, data, stage):
        return {
            'train': Dataset(...),
            'val': Dataset(...), # or a list of datasets
            'test': Dataset(...), # or a list of datasets
        }
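As a rough sketch of what a manual split might compute, the plain-Python helper below slices a collection into the three parts by ratio. split_by_ratio and its parameters are hypothetical and not part of plkit. In a real data_splits, each slice would presumably be wrapped in a Dataset, whose constructor arguments are not shown in this document and are therefore left out.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")


def split_by_ratio(data: Sequence[T], train: float, val: float) -> Dict[str, List[T]]:
    """Slice data into train/val/test parts; test receives the remainder.

    Hypothetical helper for illustration, not plkit's implementation.
    """
    if train < 0 or val < 0 or train + val > 1:
        raise ValueError("train and val must be non-negative and sum to at most 1")
    items = list(data)
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return {
        "train": items[:n_train],
        "val": items[n_train:n_train + n_val],
        "test": items[n_train + n_val:],
    }


parts = split_by_ratio(range(10), train=0.7, val=0.2)
print({name: len(part) for name, part in parts.items()})
```

Slicing in order keeps the sketch deterministic. Shuffling the list before slicing would be a reasonable addition when the samples are stored in a meaningful order.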
Note
setup is only called at the fit stage. If you want it to run at the test stage as well, you will need to override the setup method.