class documentation
class CachedDataSet(AbstractDataSet):
CachedDataSet is a dataset wrapper that caches saved data in memory, so that the user avoids I/O operations on slow storage media.
You can also specify a CachedDataSet in catalog.yml:
>>> test_ds:
>>>    type: CachedDataSet
>>>    versioned: true
>>>    dataset:
>>>       type: pandas.CSVDataSet
>>>       filepath: example.csv
Please note that if your dataset is versioned, this should be indicated in the wrapper class as shown above.
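The caching behaviour described above can be illustrated with a small, self-contained sketch. It does not use Kedro and is not Kedro's implementation: `SlowStore` and `CachingWrapper` are hypothetical names, and the write-through and release behaviour shown here are assumptions made for the example.

```python
from typing import Any


class SlowStore:
    """Hypothetical stand-in for a dataset backed by slow storage."""

    def __init__(self) -> None:
        self._data: Any = None
        self.load_calls = 0  # counts how often the slow medium is read

    def save(self, data: Any) -> None:
        self._data = data

    def load(self) -> Any:
        self.load_calls += 1
        return self._data


class CachingWrapper:
    """Toy wrapper that keeps the last saved or loaded data in memory."""

    def __init__(self, dataset: SlowStore) -> None:
        self._dataset = dataset
        self._cache: Any = None
        self._has_cache = False

    def save(self, data: Any) -> None:
        # Assumption: write through to the wrapped dataset, then cache.
        self._dataset.save(data)
        self._cache = data
        self._has_cache = True

    def load(self) -> Any:
        # Only touch the wrapped dataset when nothing is cached yet.
        if not self._has_cache:
            self._cache = self._dataset.load()
            self._has_cache = True
        return self._cache

    def release(self) -> None:
        # Drop the in-memory copy; the next load reads from storage again.
        self._cache = None
        self._has_cache = False


store = SlowStore()
cached = CachingWrapper(store)
cached.save([1, 2, 3])
first = cached.load()
second = cached.load()
calls_while_cached = store.load_calls

cached.release()
after_release = cached.load()
calls_after_release = store.load_calls
```

In this sketch, loads that follow a save are served from memory, and the wrapped store is read only after `release()` has cleared the cache.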
Method | __getstate__ | Undocumented |
Method | __init__ | Creates a new instance of CachedDataSet pointing to the provided Python object. |
Static Method | _from_config | Undocumented |
Method | _describe | Undocumented |
Method | _exists | Undocumented |
Method | _load | Undocumented |
Method | _release | Undocumented |
Method | _save | Undocumented |
Constant | _SINGLE_PROCESS | Undocumented |
Instance Variable | _cache | Undocumented |
Instance Variable | _dataset | Undocumented |
Inherited from AbstractDataSet:
Class Method | from_config | Create a data set instance using the configuration provided. |
Method | __str__ | Undocumented |
Method | exists | Checks whether a data set's output already exists by calling the provided _exists() method. |
Method | load | Loads data by delegation to the provided load method. |
Method | release | Release any cached data. |
Method | save | Saves data by delegation to the provided save method. |
Method | _copy | Undocumented |
Property | _logger | Undocumented |
def __init__(self, dataset: Union[AbstractDataSet, Dict], version: Version = None, copy_mode: str = None):
Creates a new instance of CachedDataSet pointing to the provided Python object.
Parameters | |
dataset: Union[AbstractDataSet, Dict] | A Kedro DataSet object or a dictionary to cache. |
version: Version | If specified, should be an instance of kedro.io.core.Version. If its load attribute is None, the latest version will be loaded. If its save attribute is None, the save version will be autogenerated. |
copy_mode: str | The copy mode used to copy the data. Possible values are: "deepcopy", "copy" and "assign". If not provided, it is inferred based on the data type. |
Raises | |
ValueError | If the provided dataset is not a valid dict/YAML representation of a dataset or an actual dataset. |
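The three copy modes listed for copy_mode can be illustrated with Python's standard `copy` module. This is a minimal sketch that assumes the mode names map onto standard deep copy, shallow copy and plain assignment. `apply_copy_mode` is a hypothetical helper, not part of Kedro's API, and the inference of a default mode from the data type is not modelled.

```python
import copy
from typing import Any


def apply_copy_mode(data: Any, copy_mode: str) -> Any:
    """Hypothetical helper: return data according to the requested mode."""
    if copy_mode == "deepcopy":
        # Fully independent copy, including nested objects.
        return copy.deepcopy(data)
    if copy_mode == "copy":
        # Shallow copy: new outer container, shared nested objects.
        return copy.copy(data)
    if copy_mode == "assign":
        # No copy at all: the caller receives the same object.
        return data
    raise ValueError(f"Unknown copy mode: {copy_mode!r}")


original = {"rows": [1, 2]}
deep = apply_copy_mode(original, "deepcopy")
shallow = apply_copy_mode(original, "copy")
same = apply_copy_mode(original, "assign")
```

Under these assumptions, "deepcopy" isolates the cached data from later mutation, "copy" isolates only the outer container, and "assign" avoids any copying cost by sharing the object with the caller.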