kedro.io


package documentation


kedro.io provides functionality to read from and write to a number of data sets. At the core of the library is `AbstractDataSet`, the base class from which the various data set implementations derive.
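The base-class idea described above can be sketched in plain Python. This is a minimal illustration of the pattern, not kedro's code: every name in it (`SketchAbstractDataSet`, `SketchDataSetError`, `InMemoryListDataSet`) is hypothetical, and the `_load`/`_save` hook names are an assumption of the sketch.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Optional


class SketchDataSetError(Exception):
    """Hypothetical stand-in for an error raised when load/save fails."""


class SketchAbstractDataSet(ABC):
    """Hypothetical base class: subclasses fill in the abstract hooks."""

    @abstractmethod
    def _load(self) -> Any:
        """Read and return the data (implemented by subclasses)."""

    @abstractmethod
    def _save(self, data: Any) -> None:
        """Persist the data (implemented by subclasses)."""

    def load(self) -> Any:
        # Public entry point: delegate to the hook and wrap any failure.
        try:
            return self._load()
        except Exception as exc:
            raise SketchDataSetError(f"Failed while loading data: {exc}") from exc

    def save(self, data: Any) -> None:
        # Public entry point: delegate to the hook and wrap any failure.
        try:
            self._save(data)
        except Exception as exc:
            raise SketchDataSetError(f"Failed while saving data: {exc}") from exc


class InMemoryListDataSet(SketchAbstractDataSet):
    """Hypothetical concrete data set that keeps a list in memory."""

    def __init__(self) -> None:
        self._data: Optional[List[Any]] = None

    def _load(self) -> List[Any]:
        if self._data is None:
            raise ValueError("nothing has been saved yet")
        return list(self._data)  # return a copy so callers cannot mutate state

    def _save(self, data: List[Any]) -> None:
        self._data = list(data)
```

The point of the design is that callers only ever use `load` and `save`, so error handling lives in one place while each implementation supplies just the storage-specific hooks.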

Module	`cached_dataset`	This module contains `CachedDataSet`, a dataset wrapper that caches saved data in memory so that the user avoids I/O operations with slow storage media.
Module	`core`	This module provides a set of classes which underpin the data loading and saving functionality provided by `kedro.io`.
Module	`data_catalog`	`DataCatalog` stores instances of `AbstractDataSet` implementations to provide `load` and `save` capabilities from anywhere in the program. To use a `DataCatalog`, you need to instantiate it with a dictionary of data sets...
Module	`lambda_dataset`	`LambdaDataSet` is an implementation of `AbstractDataSet` which allows for providing custom load, save, and exists methods without extending `AbstractDataSet`.
Module	`memory_dataset`	`MemoryDataSet` is a data set implementation which handles in-memory data.
Module	`partitioned_dataset`	`PartitionedDataSet` loads and saves partitioned file-like data using the underlying dataset definition. It also uses `fsspec` for filesystem level operations.
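To make the caching-wrapper idea from the `cached_dataset` entry concrete, here is a rough plain-Python sketch. It is not kedro's `CachedDataSet` implementation; `SlowStore` and `CachingWrapper` are hypothetical names invented for illustration, and the read counter exists only to make the saved I/O visible.

```python
from typing import Any


class SlowStore:
    """Hypothetical slow backend that counts how often it is read."""

    def __init__(self, data: Any = None) -> None:
        self._data = data
        self.reads = 0

    def load(self) -> Any:
        self.reads += 1  # every call here stands for one slow I/O operation
        return self._data

    def save(self, data: Any) -> None:
        self._data = data


class CachingWrapper:
    """Illustrative wrapper that keeps an in-memory copy of the data."""

    _EMPTY = object()  # sentinel, so that a cached None is still a cache hit

    def __init__(self, wrapped: SlowStore) -> None:
        self._wrapped = wrapped
        self._cache: Any = self._EMPTY

    def save(self, data: Any) -> None:
        # Write through to the slow store and remember the data in memory.
        self._wrapped.save(data)
        self._cache = data

    def load(self) -> Any:
        # Only touch the slow store when nothing is cached yet.
        if self._cache is self._EMPTY:
            self._cache = self._wrapped.load()
        return self._cache


store = SlowStore()
cached = CachingWrapper(store)
cached.save({"rows": 3})
first = cached.load()   # served from memory, no read on the slow store
second = cached.load()  # served from memory again
```

In this sketch a save populates the cache, so later loads never reach the slow store; a wrapper created around an already-populated store reads it once and then serves from memory.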

From __init__.py:

Class	`AbstractDataSet`	`AbstractDataSet` is the base class for all data set implementations. All data set implementations should extend this abstract class and implement the methods marked as abstract. If a specific dataset implementation cannot be used in conjunction with the ...
Class	`AbstractVersionedDataSet`	`AbstractVersionedDataSet` is the base class for all versioned data set implementations. All data sets that implement versioning should extend this abstract class and implement the methods marked as abstract.
Class	`CachedDataSet`	`CachedDataSet` is a dataset wrapper that caches saved data in memory so that the user avoids I/O operations with slow storage media.
Class	`DataCatalog`	`DataCatalog` stores instances of `AbstractDataSet` implementations to provide `load` and `save` capabilities from anywhere in the program. To use a `DataCatalog`, you need to instantiate it with a dictionary of data sets...
Class	`IncrementalDataSet`	`IncrementalDataSet` inherits from `PartitionedDataSet`, which loads and saves partitioned file-like data using the underlying dataset definition. For filesystem level operations it uses `fsspec`: https://github.com/intake/filesystem_spec...
Class	`LambdaDataSet`	`LambdaDataSet` loads and saves data to a data set. It relies on delegating to a specific implementation such as csv, sql, etc.
Class	`MemoryDataSet`	`MemoryDataSet` loads and saves data from/to an in-memory Python object.
Class	`PartitionedDataSet`	`PartitionedDataSet` loads and saves partitioned file-like data using the underlying dataset definition. For filesystem level operations it uses `fsspec`: https://github.com/intake/filesystem_spec.
Class	`Version`	This namedtuple is used to provide load and save versions for versioned data sets. If `Version.load` is None, then the latest available version is loaded. If `Version.save` is None, then save version is formatted as YYYY-MM-DDThh...
Exception	`DataSetAlreadyExistsError`	`DataSetAlreadyExistsError` is raised by the `DataCatalog` class when trying to add a data set that already exists in the `DataCatalog`.
Exception	`DataSetError`	`DataSetError` is raised by `AbstractDataSet` implementations when an input/output method fails.
Exception	`DataSetNotFoundError`	`DataSetNotFoundError` is raised by the `DataCatalog` class when trying to use a data set that does not exist.
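The relationship between a catalog, its data sets, and the two catalog exceptions listed above can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the `DataCatalog` API: all class and method names here (`SketchCatalog`, `SketchMemoryDataSet`, `SketchAlreadyExistsError`, `SketchNotFoundError`, `add`) are hypothetical stand-ins.

```python
from typing import Any, Dict, Optional


class SketchAlreadyExistsError(Exception):
    """Hypothetical: raised when adding a name that is already registered."""


class SketchNotFoundError(Exception):
    """Hypothetical: raised when using a name that is not registered."""


class SketchMemoryDataSet:
    """Hypothetical data set that holds a Python object in memory."""

    def __init__(self, data: Any = None) -> None:
        self._data = data

    def load(self) -> Any:
        return self._data

    def save(self, data: Any) -> None:
        self._data = data


class SketchCatalog:
    """Hypothetical registry mapping names to data set instances."""

    def __init__(
        self, data_sets: Optional[Dict[str, SketchMemoryDataSet]] = None
    ) -> None:
        # Instantiate the catalog with a dictionary of data sets.
        self._data_sets = dict(data_sets or {})

    def add(self, name: str, data_set: SketchMemoryDataSet) -> None:
        if name in self._data_sets:
            raise SketchAlreadyExistsError(f"'{name}' is already registered")
        self._data_sets[name] = data_set

    def _get(self, name: str) -> SketchMemoryDataSet:
        try:
            return self._data_sets[name]
        except KeyError:
            raise SketchNotFoundError(f"'{name}' not found in the catalog") from None

    def load(self, name: str) -> Any:
        return self._get(name).load()

    def save(self, name: str, data: Any) -> None:
        self._get(name).save(data)


catalog = SketchCatalog({"cars": SketchMemoryDataSet(data=["audi", "volvo"])})
catalog.add("boats", SketchMemoryDataSet())
catalog.save("boats", ["dinghy"])
```

Looking data sets up by name is what lets `load` and `save` be called from anywhere in a program without passing the data set objects around.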