kedro
- Kedro is a framework that makes it easy to build robust and scalable data pipelines by providing uniform project templates, data abstraction, configuration and pipeline assembly.

config
- kedro.config provides functionality for loading Kedro configuration from different file formats.

abstract_config
- This module provides kedro.abstract_config with the baseline class model for a ConfigLoader implementation.

common
- This module contains methods and facade interfaces for various ConfigLoader implementations.

config
- This module provides kedro.config with the functionality to load one or more configuration files from specified paths.
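
As a rough illustration of how these loaders are used, here is a minimal sketch of ConfigLoader, assuming the default conf/ source directory with base and local environments; the glob patterns are assumptions matching common project defaults:

```python
from kedro.config import ConfigLoader

# Point the loader at the project's configuration directory.
conf_loader = ConfigLoader(conf_source="conf", env="local")

# Merge every file matching the given patterns across environments.
conf_catalog = conf_loader.get("catalog*", "catalog*/**")
conf_params = conf_loader.get("parameters*", "parameters*/**")
```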

omegaconf_config
- This module provides kedro.config with the functionality to load one or more configuration files of YAML or JSON type from specified paths through OmegaConf.

templated_config
- This module provides kedro.config with the functionality to load one or more configuration files from specified paths, and format template strings with the values from the passed dictionary.

extras
- kedro.extras provides functionality such as datasets and extensions.

datasets
- kedro.extras.datasets is where you can find all of Kedro's data connectors. These data connectors are implementations of the AbstractDataSet.

api
- APIDataSet loads data from HTTP(S) APIs and returns it as either a string or a JSON dict. It uses the Python requests library: https://requests.readthedocs.io/en/latest/

api_dataset
- APIDataSet loads data from HTTP(S) APIs. It uses the Python requests library: https://requests.readthedocs.io/en/latest/
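
A minimal sketch of APIDataSet usage; the endpoint URL and query parameters below are hypothetical, and params is forwarded to the underlying requests call:

```python
from kedro.extras.datasets.api import APIDataSet

data_set = APIDataSet(
    url="https://api.example.com/data",  # hypothetical endpoint
    params={"format": "json"},
)

response = data_set.load()  # a requests.Response
payload = response.json()
```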

biosequence
- AbstractDataSet implementation to read/write from/to a sequence file.

biosequence_dataset
- BioSequenceDataSet loads and saves bio-sequence objects to/from file.

dask
- Provides I/O modules using dask dataframe.

parquet_dataset
- ParquetDataSet is a data set used to load and save data to parquet files using Dask dataframes.

email
- AbstractDataSet implementations for managing email messages.

message_dataset
- EmailMessageDataSet loads/saves an email message from/to a file using an underlying filesystem (e.g. local, S3, GCS). It uses the email package in the standard library to manage email messages.

geopandas
- GeoJSONLocalDataset is an AbstractVersionedDataSet to save and load GeoJSON files.

geojson_dataset
- GeoJSONDataSet loads and saves data to a local geojson file. The underlying functionality is supported by geopandas, so it supports all allowed geopandas (pandas) options for loading and saving GeoJSON files.

holoviews
- AbstractDataSet implementation to save Holoviews objects as image files.

holoviews_writer
- HoloviewsWriter saves Holoviews objects as image file(s) to an underlying filesystem (e.g. local, S3, GCS).

json
- AbstractDataSet implementation to load/save data from/to a JSON file.

json_dataset
- JSONDataSet loads/saves data from/to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It uses native json to handle the JSON file.

matplotlib
- AbstractDataSet implementation to save matplotlib objects as image files.

matplotlib_writer
- MatplotlibWriter saves one or more Matplotlib objects as image files to an underlying filesystem (e.g. local, S3, GCS).

networkx
- AbstractDataSet implementation to save and load NetworkX graphs in JSON, GraphML and GML formats using NetworkX.

gml_dataset
- NetworkX GMLDataSet loads and saves graphs to a graph modelling language (GML) file using an underlying filesystem (e.g. local, S3, GCS). NetworkX is used to create GML data.

graphml_dataset
- NetworkX GraphMLDataSet loads and saves graphs to a GraphML file using an underlying filesystem (e.g. local, S3, GCS). NetworkX is used to create GraphML data.

json_dataset
- JSONDataSet loads and saves graphs to a JSON file using an underlying filesystem (e.g. local, S3, GCS). NetworkX is used to create JSON data.

pandas
- AbstractDataSet implementations that produce pandas DataFrames.

csv_dataset
- CSVDataSet loads/saves data from/to a CSV file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the CSV file.
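
A minimal sketch of round-tripping a DataFrame through CSVDataSet; the file path is hypothetical, and load_args/save_args are forwarded to pandas.read_csv and DataFrame.to_csv respectively:

```python
import pandas as pd
from kedro.extras.datasets.pandas import CSVDataSet

data_set = CSVDataSet(
    filepath="data/01_raw/cars.csv",  # hypothetical path; s3:// or gcs:// also work
    save_args={"index": False},
)

data_set.save(pd.DataFrame({"make": ["vw", "audi"], "wheels": [4, 4]}))
reloaded = data_set.load()
```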

excel_dataset
- ExcelDataSet loads/saves data from/to an Excel file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the Excel file.

feather_dataset
- FeatherDataSet is a data set used to load and save data to feather files using an underlying filesystem (e.g. local, S3, GCS). The underlying functionality is supported by pandas, so it supports all operations that pandas supports.

gbq_dataset
- GBQTableDataSet loads and saves data from/to Google BigQuery. It uses pandas-gbq to read from and write to BigQuery tables.

generic_dataset
- GenericDataSet loads/saves data from/to a data file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the type of read/write target.

hdf_dataset
- HDFDataSet loads/saves data from/to an HDF file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas.HDFStore to handle the HDF file.

json_dataset
- JSONDataSet loads/saves data from/to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the JSON file.

parquet_dataset
- ParquetDataSet loads/saves data from/to a Parquet file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the Parquet file.

sql_dataset
- SQLDataSet loads and saves data to a SQL backend.

xml_dataset
- XMLDataSet loads/saves data from/to an XML file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the XML file.

pickle
- AbstractDataSet implementation to load/save data from/to a Pickle file.

pickle_dataset
- PickleDataSet loads/saves data from/to a Pickle file using an underlying filesystem (e.g. local, S3, GCS). The underlying functionality is supported by the specified backend library passed in (defaults to the ...

pillow
- AbstractDataSet implementation to load/save image data.

image_dataset
- ImageDataSet loads/saves image data as a NumPy array from an underlying filesystem (e.g. local, S3, GCS). It uses Pillow to handle image files.

plotly
- AbstractDataSet implementations to load/save a plotly figure from/to a JSON file.

json_dataset
- JSONDataSet loads/saves a plotly figure from/to a JSON file using an underlying filesystem (e.g. local, S3, GCS).

plotly_dataset
- PlotlyDataSet generates a plot from a pandas DataFrame and saves it to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It loads the JSON into a plotly figure.

redis
- AbstractDataSet implementation to load/save data from/to a Redis database.

redis_dataset
- PickleDataSet loads/saves data from/to a Redis database. The underlying functionality is supported by the redis library, so it supports all options allowed when instantiating the Redis client via from_url and when setting a value.

spark
- Provides I/O modules for Apache Spark.

deltatable_dataset
- AbstractDataSet implementation to access DeltaTables using delta-spark.

spark_dataset
- AbstractVersionedDataSet implementation to access Spark dataframes using pyspark.
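
A sketch of SparkDataSet, assuming a running Spark session and a hypothetical path; file_format accepts anything Spark's DataFrameReader understands:

```python
from kedro.extras.datasets.spark import SparkDataSet

data_set = SparkDataSet(
    filepath="data/02_intermediate/flights.parquet",  # hypothetical path
    file_format="parquet",
    save_args={"mode": "overwrite"},
)

df = data_set.load()  # a pyspark.sql.DataFrame
```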

spark_hive_dataset
- AbstractDataSet implementation to access Spark dataframes using pyspark on Apache Hive.

spark_jdbc_dataset
- SparkJDBCDataSet loads and saves a PySpark DataFrame via JDBC.

svmlight
- AbstractDataSet implementation to load/save data from/to an svmlight/libsvm sparse data file.

svmlight_dataset
- SVMLightDataSet loads/saves data from/to an svmlight/libsvm file using an underlying filesystem (e.g. local, S3, GCS). It uses the sklearn functions dump_svmlight_file to save and load_svmlight_file to load a file.

tensorflow
- Provides I/O for TensorFlow models.

tensorflow_model_dataset
- TensorflowModelDataset is a data set implementation which can save and load TensorFlow models.

text
- AbstractDataSet implementation to load/save data from/to a text file.

text_dataset
- TextDataSet loads/saves data from/to a text file using an underlying filesystem (e.g. local, S3, GCS).

tracking
- Dataset implementations to save data for Kedro Experiment Tracking.

json_dataset
- JSONDataSet saves data to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It uses native json to handle the JSON file. The JSONDataSet is part of Kedro Experiment Tracking. The dataset is versioned by default.

metrics_dataset
- MetricsDataSet saves data to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It uses native json to handle the JSON file. The MetricsDataSet is part of Kedro Experiment Tracking. The dataset is versioned by default and only takes metrics of numeric values.

video
- Dataset implementation to load/save data from/to a video file.

video_dataset
- VideoDataSet loads/saves video data from an underlying filesystem (e.g. local, S3, GCS). It uses OpenCV VideoCapture to read and decode videos and OpenCV VideoWriter to encode and write video.

yaml
- AbstractDataSet implementation to load/save data from/to a YAML file.

yaml_dataset
- YAMLDataSet loads/saves data from/to a YAML file using an underlying filesystem (e.g. local, S3, GCS). It uses PyYAML to handle the YAML file.

extensions
- This module contains an IPython extension.

ipython
- This file and directory exist purely for backwards compatibility of the following: %load_ext kedro.extras.extensions.ipython and from kedro.extras.extensions.ipython import reload_kedro.

logging
- This module contains a logging handler class which produces coloured logs.

color_logger
- A logging handler class which produces coloured logs.

framework
- kedro.framework provides Kedro's framework components.

cli
- kedro.framework.cli implements commands available from Kedro's CLI.

catalog
- A collection of CLI commands for working with the Kedro catalog.

cli
- kedro is a CLI for managing Kedro projects.

hooks
- kedro.framework.cli.hooks provides primitives to use hooks to extend Kedro CLI's behaviour.

manager
- This module defines a dedicated hook manager for hooks that extend Kedro CLI behaviour.

markers
- This module provides markers to declare Kedro CLI's hook specs and implementations. For more information, please see [Pluggy's documentation](https://pluggy.readthedocs.io/en/stable/#marking-hooks).

specs
- A module containing specifications for all callable hooks in the Kedro CLI's execution timeline. For more information about these specifications, please visit [Pluggy's documentation](https://pluggy.readthedocs.io/en/stable/#specs).

jupyter
- A collection of helper functions to integrate with Jupyter/IPython, and CLI commands for working with the Kedro catalog.

micropkg
- A collection of CLI commands for working with Kedro micro-packages.

pipeline
- A collection of CLI commands for working with Kedro pipelines.

project
- A collection of CLI commands for working with a Kedro project.

registry
- A collection of CLI commands for working with registered Kedro pipelines.

starters
- kedro is a CLI for managing Kedro projects.

utils
- Utilities for use with click.

context
- kedro.framework.context provides functionality for loading a Kedro project context.

context
- This module provides the context for a Kedro project.

hooks
- kedro.framework.hooks provides primitives to use hooks to extend KedroContext's behaviour.

manager
- This module provides a utility function to retrieve the global hook_manager singleton in a Kedro execution process.

markers
- This module provides markers to declare Kedro's hook specs and implementations. For more information, please see [Pluggy's documentation](https://pluggy.readthedocs.io/en/stable/#marking-hooks).
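
A minimal sketch of a hook implementation using these markers; the class name and method body are hypothetical, but the method name must match one of the specs (here after_catalog_created):

```python
from kedro.framework.hooks import hook_impl

class MyProjectHooks:
    @hook_impl
    def after_catalog_created(self, catalog):
        # Runs once the DataCatalog has been assembled.
        print(f"Catalog entries: {catalog.list()}")
```

In a Kedro project, such a class would typically be registered via the HOOKS setting in the project's settings.py.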

specs
- A module containing specifications for all callable hooks in Kedro's execution timeline. For more information about these specifications, please visit [Pluggy's documentation](https://pluggy.readthedocs.io/en/stable/#specs).

project
- The kedro.framework.project module provides utilities to configure a Kedro project and access its settings.

session
- kedro.framework.session provides access to the KedroSession, which is responsible for the project lifecycle.

session
- This module implements the Kedro session, which is responsible for the project lifecycle.
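
A sketch of driving the project lifecycle programmatically; the project path is hypothetical, and the call pattern assumes a standard project layout:

```python
from pathlib import Path

from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project

project_path = Path("/path/to/my-project")  # hypothetical project
bootstrap_project(project_path)

with KedroSession.create(project_path=project_path) as session:
    context = session.load_context()  # the project's KedroContext
    session.run()  # executes the default pipeline
```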

shelvestore
- This module implements a dict-like store object used to persist Kedro sessions. It is separated from store.py to ensure it is only imported when exported explicitly.

store
- This module implements a dict-like store object used to persist Kedro sessions.

startup
- This module provides metadata for a Kedro project.

io
- kedro.io provides functionality to read and write to a number of data sets. At the core of the library is AbstractDataSet, which underpins the various data set implementations.

cached_dataset
- This module contains CachedDataSet, a dataset wrapper which caches saved data in memory, so that the user avoids I/O operations against slow storage media.

core
- This module provides a set of classes which underpin the data loading and saving functionality provided by kedro.io.

data_catalog
- DataCatalog stores instances of AbstractDataSet implementations to provide load and save capabilities from anywhere in the program. To use a DataCatalog, you need to instantiate it with a dictionary of data sets...
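
A minimal sketch of assembling and using a DataCatalog in code; the entry names and file path are hypothetical:

```python
import pandas as pd
from kedro.extras.datasets.pandas import CSVDataSet
from kedro.io import DataCatalog, MemoryDataSet

catalog = DataCatalog(
    data_sets={
        "cars": CSVDataSet(filepath="data/01_raw/cars.csv"),
        "scratch": MemoryDataSet(),
    }
)

catalog.save("scratch", pd.DataFrame({"a": [1, 2]}))
df = catalog.load("scratch")
```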

lambda_dataset
- LambdaDataSet is an implementation of AbstractDataSet which allows for providing custom load, save, and exists methods without extending AbstractDataSet.

memory_dataset
- MemoryDataSet is a data set implementation which handles in-memory data.

partitioned_dataset
- PartitionedDataSet loads and saves partitioned file-like data using the underlying dataset definition. It also uses fsspec for filesystem-level operations.
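
A sketch of lazy partition loading, assuming a hypothetical folder of CSV files; each file under path becomes one partition keyed by its relative path:

```python
from kedro.extras.datasets.pandas import CSVDataSet
from kedro.io import PartitionedDataSet

data_set = PartitionedDataSet(
    path="data/01_raw/monthly",  # local here, but s3:// or gcs:// also work
    dataset=CSVDataSet,
)

partitions = data_set.load()  # dict of partition id -> load callable
for partition_id, load_partition in partitions.items():
    df = load_partition()  # each partition is materialised lazily
```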

ipython
- This script creates an IPython extension to load Kedro-related variables into local scope.

pipeline
- kedro.pipeline provides functionality to define and execute data-driven pipelines.
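
A minimal sketch of defining a pipeline; the function and data set names are hypothetical, and execution order is resolved from the input/output names:

```python
from kedro.pipeline import Pipeline, node

def split(raw):
    # hypothetical processing step
    return raw[:1], raw[1:]

my_pipeline = Pipeline(
    [
        node(split, inputs="raw_data", outputs=["train", "test"], name="split"),
    ]
)
```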

modular_pipeline
- Helper to integrate modular pipelines into a master pipeline.
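
A sketch of namespacing with the modular-pipeline helper, reusing the hypothetical my_pipeline from above; inputs listed explicitly are left un-namespaced:

```python
from kedro.pipeline import pipeline

data_science = pipeline(
    my_pipeline,
    namespace="ds",       # prefixes node and data set names with "ds."
    inputs={"raw_data"},  # keep this input shared across namespaces
)
```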

runner
- kedro.runner provides runners that are able to execute Pipeline instances.

parallel_runner
- ParallelRunner is an AbstractRunner implementation. It can be used to run the Pipeline in parallel groups formed by toposort.

runner
- AbstractRunner is the base class for all Pipeline runner implementations.

sequential_runner
- SequentialRunner is an AbstractRunner implementation. It can be used to run the Pipeline in a sequential manner using a topological sort of provided nodes.
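
A minimal sketch tying the pieces above together, assuming the hypothetical my_pipeline and catalog from the earlier examples (with "raw_data" present in the catalog):

```python
from kedro.runner import SequentialRunner

runner = SequentialRunner()
# Returns any outputs not registered in the catalog.
outputs = runner.run(my_pipeline, catalog)
```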

thread_runner
- ThreadRunner is an AbstractRunner implementation. It can be used to run the Pipeline in parallel groups formed by toposort using threads.

utils
- This module provides a set of helper functions used across different components of the kedro package.

__main__
- Entry point when invoked with python -m kedro.