class documentation

CachedDataSet is a dataset wrapper that caches saved data in memory, so that the user avoids I/O operations against slow storage media.

You can also specify a CachedDataSet in catalog.yml:

>>> test_ds:
>>>    type: CachedDataSet
>>>    versioned: true
>>>    dataset:
>>>       type: pandas.CSVDataSet
>>>       filepath: example.csv

Please note that if your dataset is versioned, versioned: true should be set on the CachedDataSet wrapper entry, as shown above, rather than on the wrapped dataset.
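
The same wrapper can also be created directly in Python code. The snippet below is an illustrative sketch rather than an excerpt from the Kedro documentation: the import paths shown (kedro.io for CachedDataSet, kedro.extras.datasets.pandas for CSVDataSet) are common ones but vary between Kedro versions, and example.csv is a placeholder:

>>> from kedro.io import CachedDataSet
>>> from kedro.extras.datasets.pandas import CSVDataSet
>>>
>>> # Wrap a file-backed dataset so repeated loads within a run hit memory
>>> # instead of the slower storage medium.
>>> cached_csv = CachedDataSet(dataset=CSVDataSet(filepath="example.csv"))
>>> df = cached_csv.load()        # first load reads example.csv from storage
>>> df_again = cached_csv.load()  # repeat loads are served from the in-memory cache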

Method __getstate__: Undocumented
Method __init__: Creates a new instance of CachedDataSet wrapping the provided dataset.
Static Method _from_config: Undocumented
Method _describe: Undocumented
Method _exists: Undocumented
Method _load: Undocumented
Method _release: Undocumented
Method _save: Undocumented
Constant _SINGLE_PROCESS: Undocumented
Instance Variable _cache: Undocumented
Instance Variable _dataset: Undocumented

Inherited from AbstractDataSet:

Class Method from_config: Create a data set instance using the configuration provided.
Method __str__: Undocumented
Method exists: Checks whether a data set's output already exists by calling the provided _exists() method.
Method load: Loads data by delegation to the provided load method.
Method release: Release any cached data.
Method save: Saves data by delegation to the provided save method.
Method _copy: Undocumented
Property _logger: Undocumented
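
To connect the inherited load, save, exists and release methods above with the caching behaviour described in the class docstring, here is a hedged usage sketch; the dict form of the dataset argument follows the constructor documented below, and the DataFrame and file name are placeholders:

>>> import pandas as pd
>>> from kedro.io import CachedDataSet
>>>
>>> df = pd.DataFrame({"a": [1, 2, 3]})
>>> cached = CachedDataSet(dataset={"type": "pandas.CSVDataSet", "filepath": "example.csv"})
>>> cached.save(df)     # persists example.csv and keeps the data in the in-memory cache
>>> cached.exists()     # delegates to the wrapped _exists(); True once the data has been saved
>>> cached.load()       # served from the cache instead of re-reading example.csv
>>> cached.release()    # drops any cached data
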
def __getstate__(self): (source)

Undocumented
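
The method itself is undocumented. A common reason for overriding __getstate__ in a caching wrapper is to drop the in-memory cache before pickling, so the object can be sent to another process without its cached payload. The following is a generic sketch of that pattern, not a copy of this class's implementation:

>>> def __getstate__(self):
>>>     # Copy the instance state and discard the cached data so the pickle
>>>     # stays small; a matching __setstate__ would recreate an empty cache.
>>>     state = self.__dict__.copy()
>>>     state["_cache"] = None
>>>     return state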

def __init__(self, dataset: Union[AbstractDataSet, Dict], version: Version = None, copy_mode: str = None): (source)

Creates a new instance of CachedDataSet wrapping the provided dataset (or dataset configuration).

Parameters
dataset (Union[AbstractDataSet, Dict]): A Kedro DataSet object to wrap, or a dictionary defining such a dataset (as in catalog.yml).
version (Version): If specified, should be an instance of kedro.io.core.Version. If its load attribute is None, the latest version will be loaded. If its save attribute is None, the save version will be autogenerated.
copy_mode (str): The copy mode used to copy the data. Possible values are: "deepcopy", "copy" and "assign". If not provided, it is inferred based on the data type.
Raises
ValueError: If the provided dataset is neither a valid dict/YAML representation of a dataset nor an actual dataset.
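
A short sketch of how the version and copy_mode arguments described above might be passed. Version is assumed to be importable from kedro.io (it is kedro.io.core.Version); the config values and file name are illustrative:

>>> from kedro.io import CachedDataSet, Version
>>>
>>> versioned_ds = CachedDataSet(
>>>     dataset={"type": "pandas.CSVDataSet", "filepath": "example.csv"},
>>>     version=Version(load=None, save=None),  # load the latest version, autogenerate the save version
>>>     copy_mode="assign",                     # hand the cached object back without copying
>>> )
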
@staticmethod
def _from_config(config, version): (source)

Undocumented

def _describe(self) -> Dict[str, Any]: (source)

Undocumented

def _exists(self) -> bool: (source)

Undocumented

def _load(self): (source)

Undocumented

def _release(self): (source)

Undocumented

def _save(self, data: Any): (source)

Undocumented
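
Both _load and _save are undocumented here. As background, a wrapper like this typically implements a read-through/write-through cache; the following is a generic sketch of that pattern, not a copy of the actual method bodies:

>>> def _save(self, data):
>>>     # Write-through: keep a copy in the in-memory cache and persist
>>>     # to the wrapped dataset.
>>>     self._cache.save(data)
>>>     self._dataset.save(data)
>>>
>>> def _load(self):
>>>     # Read-through: serve from the cache when it is populated; otherwise
>>>     # load from the wrapped dataset and populate the cache for next time.
>>>     if self._cache.exists():
>>>         return self._cache.load()
>>>     data = self._dataset.load()
>>>     self._cache.save(data)
>>>     return data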

_SINGLE_PROCESS: bool = (source)

Undocumented

Value
True

_cache = (source)

Undocumented

_dataset = (source)

Undocumented