package documentation
kedro.io provides functionality to read from and write to a number of data sets. At the core of the library is AbstractDataSet, the base class from which the various ``AbstractDataSet`` implementations are derived.
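The base-class pattern described above can be sketched without importing kedro at all. The example below is a simplified stand-in, not kedro's implementation: every name in it (SketchAbstractDataSet, SketchDataSetError, SketchMemoryDataSet) is hypothetical, and the split into public load/save methods that delegate to abstract _load/_save hooks is an assumption made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any

_EMPTY = object()  # sentinel meaning "nothing has been saved yet"


class SketchDataSetError(Exception):
    """Hypothetical stand-in for an error raised when I/O fails."""


class SketchAbstractDataSet(ABC):
    """Hypothetical base class: subclasses implement the abstract hooks."""

    @abstractmethod
    def _load(self) -> Any:
        """Return the stored data."""

    @abstractmethod
    def _save(self, data: Any) -> None:
        """Persist the given data."""

    def load(self) -> Any:
        # Wrap any failure in a single, predictable error type.
        try:
            return self._load()
        except Exception as exc:
            raise SketchDataSetError(f"Failed to load: {exc}") from exc

    def save(self, data: Any) -> None:
        try:
            self._save(data)
        except Exception as exc:
            raise SketchDataSetError(f"Failed to save: {exc}") from exc


class SketchMemoryDataSet(SketchAbstractDataSet):
    """Concrete data set that keeps its data in a Python attribute."""

    def __init__(self, data: Any = _EMPTY) -> None:
        self._data = data

    def _load(self) -> Any:
        if self._data is _EMPTY:
            raise ValueError("no data has been saved yet")
        return self._data

    def _save(self, data: Any) -> None:
        self._data = data


data_set = SketchMemoryDataSet()
data_set.save([1, 2, 3])
loaded = data_set.load()
```

Keeping the error wrapping in the base class means each concrete data set only has to implement the two hooks, and callers only need to handle one error type.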
Module | cached | This module contains CachedDataSet, a dataset wrapper which caches in memory the data saved, so that the user avoids I/O operations with slow storage media. |
Module | core | This module provides a set of classes which underpin the data loading and saving functionality provided by kedro.io. |
Module | data | DataCatalog stores instances of AbstractDataSet implementations to provide load and save capabilities from anywhere in the program. To use a DataCatalog, you need to instantiate it with a dictionary of data sets... |
Module | lambda | LambdaDataSet is an implementation of AbstractDataSet which allows for providing custom load, save, and exists methods without extending AbstractDataSet. |
Module | memory | MemoryDataSet is a data set implementation which handles in-memory data. |
Module | partitioned | PartitionedDataSet loads and saves partitioned file-like data using the underlying dataset definition. It also uses fsspec for filesystem level operations. |
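The caching-wrapper and custom-callable ideas from the module summaries above can be illustrated with a short, kedro-free sketch. All names here (SketchLambdaDataSet, SketchCachedDataSet, slow_load, slow_save) are hypothetical, and the caching policy shown (fill the cache on save, read through on the first load) is an assumption for illustration rather than a statement of how kedro behaves.

```python
from typing import Any, Callable, Dict, List

_MISSING = object()  # sentinel meaning "the cache is empty"


class SketchLambdaDataSet:
    """Hypothetical data set built from user-supplied callables."""

    def __init__(self, load: Callable[[], Any], save: Callable[[Any], None]) -> None:
        self._load_fn = load
        self._save_fn = save

    def load(self) -> Any:
        return self._load_fn()

    def save(self, data: Any) -> None:
        self._save_fn(data)


class SketchCachedDataSet:
    """Hypothetical wrapper that keeps an in-memory copy of the data."""

    def __init__(self, dataset: Any) -> None:
        self._dataset = dataset
        self._cache: Any = _MISSING

    def load(self) -> Any:
        # Only touch the wrapped (slow) data set when the cache is empty.
        if self._cache is _MISSING:
            self._cache = self._dataset.load()
        return self._cache

    def save(self, data: Any) -> None:
        # Write through to the wrapped data set, then remember the data.
        self._dataset.save(data)
        self._cache = data


slow_store: Dict[str, Any] = {}
load_calls: List[int] = []  # records each trip to the "slow" storage


def slow_load() -> Any:
    load_calls.append(1)
    return slow_store["value"]


def slow_save(data: Any) -> None:
    slow_store["value"] = data


wrapped = SketchCachedDataSet(SketchLambdaDataSet(load=slow_load, save=slow_save))
wrapped.save({"a": 1})
first = wrapped.load()
second = wrapped.load()
```

In this sketch both loads are served from memory, so slow_load is never called after the initial save.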
From __init__.py:
Class | AbstractDataSet | AbstractDataSet is the base class for all data set implementations. All data set implementations should extend this abstract class and implement the methods marked as abstract. If a specific dataset implementation cannot be used in conjunction with the ... |
Class | AbstractVersionedDataSet | AbstractVersionedDataSet is the base class for all versioned data set implementations. All data sets that implement versioning should extend this abstract class and implement the methods marked as abstract. |
Class | CachedDataSet | CachedDataSet is a dataset wrapper which caches in memory the data saved, so that the user avoids io operations with slow storage media. |
Class | DataCatalog | DataCatalog stores instances of AbstractDataSet implementations to provide load and save capabilities from anywhere in the program. To use a DataCatalog, you need to instantiate it with a dictionary of data sets... |
Class | IncrementalDataSet | IncrementalDataSet inherits from PartitionedDataSet, which loads and saves partitioned file-like data using the underlying dataset definition. For filesystem level operations it uses fsspec: https://github.com/intake/filesystem_spec... |
Class | LambdaDataSet | LambdaDataSet loads and saves data to a data set. It relies on delegating to specific implementation such as csv, sql, etc. |
Class | MemoryDataSet | MemoryDataSet loads and saves data from/to an in-memory Python object. |
Class | PartitionedDataSet | PartitionedDataSet loads and saves partitioned file-like data using the underlying dataset definition. For filesystem level operations it uses fsspec: https://github.com/intake/filesystem_spec. |
Class | Version | This namedtuple is used to provide load and save versions for versioned data sets. If Version.load is None, then the latest available version is loaded. If Version.save is None, then save version is formatted as YYYY-MM-DDThh... |
Exception | DataSetAlreadyExistsError | DataSetAlreadyExistsError raised by DataCatalog class in case of trying to add a data set which already exists in the DataCatalog. |
Exception | DataSetError | DataSetError raised by AbstractDataSet implementations in case of failure of input/output methods. |
Exception | DataSetNotFoundError | DataSetNotFoundError raised by DataCatalog class in case of trying to use a non-existing data set. |
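To make the catalog, version, and exception entries above more concrete, here is a minimal kedro-free sketch. Every name in it (SketchCatalog, SketchVersion, SketchHolder, SketchNotFoundError, SketchAlreadyExistsError, resolve_load_version) is hypothetical, and the version labels "001", "002", "010" are placeholders chosen only because they sort as strings; they are not the timestamp format the library uses.

```python
from collections import namedtuple
from typing import Any, Dict, List, Optional

# Hypothetical stand-in for a (load, save) version pair.
SketchVersion = namedtuple("SketchVersion", ["load", "save"])


class SketchNotFoundError(Exception):
    """Raised when a name is not registered in the catalog."""


class SketchAlreadyExistsError(Exception):
    """Raised when a name is registered a second time."""


class SketchHolder:
    """Tiny in-memory data set used only to exercise the catalog."""

    def __init__(self, data: Any = None) -> None:
        self._data = data

    def load(self) -> Any:
        return self._data

    def save(self, data: Any) -> None:
        self._data = data


class SketchCatalog:
    """Hypothetical registry mapping names to data set instances."""

    def __init__(self, data_sets: Optional[Dict[str, Any]] = None) -> None:
        self._data_sets: Dict[str, Any] = dict(data_sets or {})

    def add(self, name: str, data_set: Any) -> None:
        if name in self._data_sets:
            raise SketchAlreadyExistsError(f"'{name}' is already registered")
        self._data_sets[name] = data_set

    def _get(self, name: str) -> Any:
        if name not in self._data_sets:
            raise SketchNotFoundError(f"'{name}' is not registered")
        return self._data_sets[name]

    def load(self, name: str) -> Any:
        return self._get(name).load()

    def save(self, name: str, data: Any) -> None:
        self._get(name).save(data)


def resolve_load_version(version: SketchVersion, available: List[str]) -> str:
    # A load version of None means "use the latest available version".
    if version.load is None:
        return max(available)
    return version.load


catalog = SketchCatalog({"numbers": SketchHolder([1, 2, 3])})
catalog.save("numbers", [4, 5])
loaded = catalog.load("numbers")

labels = ["001", "002", "010"]
latest = resolve_load_version(SketchVersion(None, None), labels)
pinned = resolve_load_version(SketchVersion("001", None), labels)
```

The catalog is built from a dictionary of data sets, so the rest of a program can load and save by name without knowing which implementation sits behind each entry.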