kedro
- Kedro is a framework that makes it easy to build robust and scalable data pipelines by providing uniform project templates, data abstraction, configuration and pipeline assembly.

config
- kedro.config provides functionality for loading Kedro configuration from different file formats.

abstract_config
- This module provides kedro.abstract_config with the baseline class model for a ConfigLoader implementation.

common
- This module contains methods and facade interfaces for various ConfigLoader implementations.

config
- This module provides kedro.config with the functionality to load one or more configuration files from specified paths.
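
As a rough illustration of how these loaders are used, here is a minimal sketch of ConfigLoader, assuming the default conf/ source directory with base and local environments; the glob patterns are assumptions matching common project defaults:

```python
from kedro.config import ConfigLoader

# Point the loader at the project's configuration directory.
conf_loader = ConfigLoader(conf_source="conf", env="local")

# Merge every file matching the given patterns across environments.
conf_catalog = conf_loader.get("catalog*", "catalog*/**")
conf_params = conf_loader.get("parameters*", "parameters*/**")
```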

omegaconf_config
- This module provides kedro.config with the functionality to load one or more configuration files of YAML or JSON type from specified paths through OmegaConf.

templated_config
- This module provides kedro.config with the functionality to load one or more configuration files from specified paths, and format template strings with the values from the passed dictionary.

extras
- kedro.extras provides functionality such as datasets and extensions.

datasets
- kedro.extras.datasets is where you can find all of Kedro's data connectors. These data connectors are implementations of the AbstractDataSet.

api
- APIDataSet loads data from HTTP(S) APIs and returns it as either a string or a JSON dict. It uses the Python requests library: https://requests.readthedocs.io/en/latest/

api_dataset
- APIDataSet loads data from HTTP(S) APIs. It uses the Python requests library: https://requests.readthedocs.io/en/latest/
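
A minimal sketch of APIDataSet usage; the endpoint URL and query parameters below are hypothetical, and params is forwarded to the underlying requests call:

```python
from kedro.extras.datasets.api import APIDataSet

data_set = APIDataSet(
    url="https://api.example.com/data",  # hypothetical endpoint
    params={"format": "json"},
)

response = data_set.load()  # a requests.Response
payload = response.json()
```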

biosequence
- AbstractDataSet implementation to read/write from/to a sequence file.

biosequence_dataset
- BioSequenceDataSet loads and saves bio-sequence objects to/from file.

dask
- Provides I/O modules using dask dataframe.

parquet_dataset
- ParquetDataSet is a data set used to load and save data to parquet files using Dask dataframes.

email
- AbstractDataSet implementations for managing email messages.

message_dataset
- EmailMessageDataSet loads/saves an email message from/to a file using an underlying filesystem (e.g. local, S3, GCS). It uses the email package in the standard library to manage email messages.

geopandas
- GeoJSONLocalDataset is an AbstractVersionedDataSet to save and load GeoJSON files.

geojson_dataset
- GeoJSONDataSet loads and saves data to a local geojson file. The underlying functionality is supported by geopandas, so it supports all allowed geopandas (pandas) options for loading and saving GeoJSON files.

holoviews
- AbstractDataSet implementation to save Holoviews objects as image files.

holoviews_writer
- HoloviewsWriter saves Holoviews objects as image file(s) to an underlying filesystem (e.g. local, S3, GCS).

json
- AbstractDataSet implementation to load/save data from/to a JSON file.

json_dataset
- JSONDataSet loads/saves data from/to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It uses native json to handle the JSON file.

matplotlib
- AbstractDataSet implementation to save matplotlib objects as image files.

matplotlib_writer
- MatplotlibWriter saves one or more Matplotlib objects as image files to an underlying filesystem (e.g. local, S3, GCS).

networkx
- AbstractDataSet implementation to save and load NetworkX graphs in JSON, GraphML and GML formats using NetworkX.

gml_dataset
- NetworkX GMLDataSet loads and saves graphs to a graph modelling language (GML) file using an underlying filesystem (e.g. local, S3, GCS). NetworkX is used to create GML data.

graphml_dataset
- NetworkX GraphMLDataSet loads and saves graphs to a GraphML file using an underlying filesystem (e.g. local, S3, GCS). NetworkX is used to create GraphML data.

json_dataset
- JSONDataSet loads and saves graphs to a JSON file using an underlying filesystem (e.g. local, S3, GCS). NetworkX is used to create JSON data.

pandas
- AbstractDataSet implementations that produce pandas DataFrames.

csv_dataset
- CSVDataSet loads/saves data from/to a CSV file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the CSV file.
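
A minimal sketch of round-tripping a DataFrame through CSVDataSet; the file path is hypothetical, and load_args/save_args are forwarded to pandas.read_csv and DataFrame.to_csv respectively:

```python
import pandas as pd
from kedro.extras.datasets.pandas import CSVDataSet

data_set = CSVDataSet(
    filepath="data/01_raw/cars.csv",  # hypothetical path; s3:// or gcs:// also work
    save_args={"index": False},
)

data_set.save(pd.DataFrame({"make": ["vw", "audi"], "wheels": [4, 4]}))
reloaded = data_set.load()
```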

excel_dataset
- ExcelDataSet loads/saves data from/to an Excel file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the Excel file.

feather_dataset
- FeatherDataSet is a data set used to load and save data to feather files using an underlying filesystem (e.g. local, S3, GCS). The underlying functionality is supported by pandas, so it supports all operations that pandas supports.

gbq_dataset
- GBQTableDataSet loads and saves data from/to Google BigQuery. It uses pandas-gbq to read from and write to BigQuery tables.

generic_dataset
- GenericDataSet loads/saves data from/to a data file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the type of read/write target.

hdf_dataset
- HDFDataSet loads/saves data from/to an HDF file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas.HDFStore to handle the HDF file.

json_dataset
- JSONDataSet loads/saves data from/to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the JSON file.

parquet_dataset
- ParquetDataSet loads/saves data from/to a Parquet file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the Parquet file.

sql_dataset
- SQLDataSet loads and saves data to a SQL backend.

xml_dataset
- XMLDataSet loads/saves data from/to an XML file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the XML file.

pickle
- AbstractDataSet implementation to load/save data from/to a Pickle file.

pickle_dataset
- PickleDataSet loads/saves data from/to a Pickle file using an underlying filesystem (e.g. local, S3, GCS). The underlying functionality is supported by the specified backend library passed in (defaults to the ...

pillow
- AbstractDataSet implementation to load/save image data.

image_dataset
- ImageDataSet loads/saves image data as a NumPy array from an underlying filesystem (e.g. local, S3, GCS). It uses Pillow to handle image files.

plotly
- AbstractDataSet implementations to load/save a plotly figure from/to a JSON file.

json_dataset
- JSONDataSet loads/saves a plotly figure from/to a JSON file using an underlying filesystem (e.g. local, S3, GCS).

plotly_dataset
- PlotlyDataSet generates a plot from a pandas DataFrame and saves it to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It loads the JSON into a plotly figure.

redis
- AbstractDataSet implementation to load/save data from/to a Redis database.

redis_dataset
- PickleDataSet loads/saves data from/to a Redis database. The underlying functionality is supported by the redis library, so it supports all options allowed when instantiating the Redis client via from_url and when setting a value.

spark
- Provides I/O modules for Apache Spark.

deltatable_dataset
- AbstractDataSet implementation to access DeltaTables using delta-spark.

spark_dataset
- AbstractVersionedDataSet implementation to access Spark dataframes using pyspark.
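
A sketch of SparkDataSet, assuming a running Spark session and a hypothetical path; file_format accepts anything Spark's DataFrameReader understands:

```python
from kedro.extras.datasets.spark import SparkDataSet

data_set = SparkDataSet(
    filepath="data/02_intermediate/flights.parquet",  # hypothetical path
    file_format="parquet",
    save_args={"mode": "overwrite"},
)

df = data_set.load()  # a pyspark.sql.DataFrame
```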

spark_hive_dataset
- AbstractDataSet implementation to access Spark dataframes using pyspark on Apache Hive.

spark_jdbc_dataset
- SparkJDBCDataSet loads and saves a PySpark DataFrame via JDBC.

svmlight
- AbstractDataSet implementation to load/save data from/to an svmlight/libsvm sparse data file.

svmlight_dataset
- SVMLightDataSet loads/saves data from/to an svmlight/libsvm file using an underlying filesystem (e.g. local, S3, GCS). It uses the sklearn functions dump_svmlight_file to save and load_svmlight_file to load a file.

tensorflow
- Provides I/O for TensorFlow models.

tensorflow_model_dataset
- TensorflowModelDataset is a data set implementation which can save and load TensorFlow models.

text
- AbstractDataSet implementation to load/save data from/to a text file.

text_dataset
- TextDataSet loads/saves data from/to a text file using an underlying filesystem (e.g. local, S3, GCS).

tracking
- Dataset implementations to save data for Kedro Experiment Tracking.

json_dataset
- JSONDataSet saves data to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It uses native json to handle the JSON file. The JSONDataSet is part of Kedro Experiment Tracking. The dataset is versioned by default.

metrics_dataset
- MetricsDataSet saves data to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It uses native json to handle the JSON file. The MetricsDataSet is part of Kedro Experiment Tracking. The dataset is versioned by default and only takes metrics of numeric values.

video
- Dataset implementation to load/save data from/to a video file.

video_dataset
- VideoDataSet loads/saves video data from an underlying filesystem (e.g. local, S3, GCS). It uses OpenCV VideoCapture to read and decode videos and OpenCV VideoWriter to encode and write video.

yaml
- AbstractDataSet implementation to load/save data from/to a YAML file.

yaml_dataset
- YAMLDataSet loads/saves data from/to a YAML file using an underlying filesystem (e.g. local, S3, GCS). It uses PyYAML to handle the YAML file.

extensions
- This module contains an IPython extension.

ipython
- This file and directory exist purely for backwards compatibility of the following: %load_ext kedro.extras.extensions.ipython and from kedro.extras.extensions.ipython import reload_kedro.

logging
- This module contains a logging handler class which produces coloured logs.

color_logger
- A logging handler class which produces coloured logs.

framework
- kedro.framework provides Kedro's framework components.

cli
- kedro.framework.cli implements commands available from Kedro's CLI.

catalog
- A collection of CLI commands for working with the Kedro catalog.

cli
- kedro is a CLI for managing Kedro projects.

hooks
- kedro.framework.cli.hooks provides primitives to use hooks to extend Kedro CLI's behaviour.

manager
- This module defines a dedicated hook manager for hooks that extend Kedro CLI behaviour.

markers
- This module provides markers to declare Kedro CLI's hook specs and implementations. For more information, please see [Pluggy's documentation](https://pluggy.readthedocs.io/en/stable/#marking-hooks).

specs
- A module containing specifications for all callable hooks in the Kedro CLI's execution timeline. For more information about these specifications, please visit [Pluggy's documentation](https://pluggy.readthedocs.io/en/stable/#specs).

jupyter
- A collection of helper functions to integrate with Jupyter/IPython, and CLI commands for working with the Kedro catalog.

micropkg
- A collection of CLI commands for working with Kedro micro-packages.

pipeline
- A collection of CLI commands for working with Kedro pipelines.

project
- A collection of CLI commands for working with a Kedro project.

registry
- A collection of CLI commands for working with registered Kedro pipelines.

starters
- kedro is a CLI for managing Kedro projects.

utils
- Utilities for use with click.

context
- kedro.framework.context provides functionality for loading a Kedro project context.

context
- This module provides the context for a Kedro project.

hooks
- kedro.framework.hooks provides primitives to use hooks to extend KedroContext's behaviour.

manager
- This module provides a utility function to retrieve the global hook_manager singleton in a Kedro execution process.

markers
- This module provides markers to declare Kedro's hook specs and implementations. For more information, please see [Pluggy's documentation](https://pluggy.readthedocs.io/en/stable/#marking-hooks).
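
A minimal sketch of a hook implementation using these markers; the class name and method body are hypothetical, but the method name must match one of the specs (here after_catalog_created):

```python
from kedro.framework.hooks import hook_impl

class MyProjectHooks:
    @hook_impl
    def after_catalog_created(self, catalog):
        # Runs once the DataCatalog has been assembled.
        print(f"Catalog entries: {catalog.list()}")
```

In a Kedro project, such a class would typically be registered via the HOOKS setting in the project's settings.py.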

specs
- A module containing specifications for all callable hooks in Kedro's execution timeline. For more information about these specifications, please visit [Pluggy's documentation](https://pluggy.readthedocs.io/en/stable/#specs).

project
- The kedro.framework.project module provides utilities to configure a Kedro project and access its settings.

session
- kedro.framework.session provides access to the KedroSession, which is responsible for the project lifecycle.

session
- This module implements the Kedro session, which is responsible for the project lifecycle.
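
A sketch of driving the project lifecycle programmatically; the project path is hypothetical, and the call pattern assumes a standard project layout:

```python
from pathlib import Path

from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project

project_path = Path("/path/to/my-project")  # hypothetical project
bootstrap_project(project_path)

with KedroSession.create(project_path=project_path) as session:
    context = session.load_context()  # the project's KedroContext
    session.run()  # executes the default pipeline
```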

shelvestore
- This module implements a dict-like store object used to persist Kedro sessions. It is separated from store.py to ensure it is only imported when exported explicitly.

store
- This module implements a dict-like store object used to persist Kedro sessions.

startup
- This module provides metadata for a Kedro project.

io
- kedro.io provides functionality to read and write to a number of data sets. At the core of the library is AbstractDataSet, which underpins the various data set implementations.

cached_dataset
- This module contains CachedDataSet, a dataset wrapper which caches saved data in memory, so that the user avoids I/O operations against slow storage media.

core
- This module provides a set of classes which underpin the data loading and saving functionality provided by kedro.io.

data_catalog
- DataCatalog stores instances of AbstractDataSet implementations to provide load and save capabilities from anywhere in the program. To use a DataCatalog, you need to instantiate it with a dictionary of data sets...
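
A minimal sketch of assembling and using a DataCatalog in code; the entry names and file path are hypothetical:

```python
import pandas as pd
from kedro.extras.datasets.pandas import CSVDataSet
from kedro.io import DataCatalog, MemoryDataSet

catalog = DataCatalog(
    data_sets={
        "cars": CSVDataSet(filepath="data/01_raw/cars.csv"),
        "scratch": MemoryDataSet(),
    }
)

catalog.save("scratch", pd.DataFrame({"a": [1, 2]}))
df = catalog.load("scratch")
```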

lambda_dataset
- LambdaDataSet is an implementation of AbstractDataSet which allows for providing custom load, save, and exists methods without extending AbstractDataSet.

memory_dataset
- MemoryDataSet is a data set implementation which handles in-memory data.

partitioned_dataset
- PartitionedDataSet loads and saves partitioned file-like data using the underlying dataset definition. It also uses fsspec for filesystem-level operations.
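
A sketch of lazy partition loading, assuming a hypothetical folder of CSV files; each file under path becomes one partition keyed by its relative path:

```python
from kedro.extras.datasets.pandas import CSVDataSet
from kedro.io import PartitionedDataSet

data_set = PartitionedDataSet(
    path="data/01_raw/monthly",  # local here, but s3:// or gcs:// also work
    dataset=CSVDataSet,
)

partitions = data_set.load()  # dict of partition id -> load callable
for partition_id, load_partition in partitions.items():
    df = load_partition()  # each partition is materialised lazily
```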

ipython
- This script creates an IPython extension to load Kedro-related variables into local scope.

pipeline
- kedro.pipeline provides functionality to define and execute data-driven pipelines.
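
A minimal sketch of defining a pipeline; the function and data set names are hypothetical, and execution order is resolved from the input/output names:

```python
from kedro.pipeline import Pipeline, node

def split(raw):
    # hypothetical processing step
    return raw[:1], raw[1:]

my_pipeline = Pipeline(
    [
        node(split, inputs="raw_data", outputs=["train", "test"], name="split"),
    ]
)
```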

modular_pipeline
- Helper to integrate modular pipelines into a master pipeline.
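
A sketch of namespacing with the modular-pipeline helper, reusing the hypothetical my_pipeline from above; inputs listed explicitly are left un-namespaced:

```python
from kedro.pipeline import pipeline

data_science = pipeline(
    my_pipeline,
    namespace="ds",       # prefixes node and data set names with "ds."
    inputs={"raw_data"},  # keep this input shared across namespaces
)
```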

runner
- kedro.runner provides runners that are able to execute Pipeline instances.

parallel_runner
- ParallelRunner is an AbstractRunner implementation. It can be used to run the Pipeline in parallel groups formed by toposort.

runner
- AbstractRunner is the base class for all Pipeline runner implementations.

sequential_runner
- SequentialRunner is an AbstractRunner implementation. It can be used to run the Pipeline in a sequential manner using a topological sort of provided nodes.
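
A minimal sketch tying the pieces above together, assuming the hypothetical my_pipeline and catalog from the earlier examples (with "raw_data" present in the catalog):

```python
from kedro.runner import SequentialRunner

runner = SequentialRunner()
# Returns any outputs not registered in the catalog.
outputs = runner.run(my_pipeline, catalog)
```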

thread_runner
- ThreadRunner is an AbstractRunner implementation. It can be used to run the Pipeline in parallel groups formed by toposort using threads.

utils
- This module provides a set of helper functions used across different components of the kedro package.

__main__
- Entry point when invoked with python -m kedro.