abc.ABC

kedro.io.AbstractDataSet
- AbstractDataSet is the base class for all data set implementations. All data set implementations should extend this abstract class and implement the methods marked as abstract. If a specific dataset implementation cannot be used in conjunction with the ...

kedro.extras.datasets.api.APIDataSet
- APIDataSet loads data from HTTP(S) APIs. It uses the Python requests library: https://requests.readthedocs.io/en/latest/

kedro.extras.datasets.biosequence.BioSequenceDataSet
- BioSequenceDataSet loads and saves data to a sequence file.

kedro.extras.datasets.dask.ParquetDataSet
- ParquetDataSet loads and saves data to Parquet file(s). It uses Dask remote data services to handle the corresponding load and save operations: https://docs.dask.org/en/latest/how-to/connect-to-remote-data.html ...

kedro.extras.datasets.pandas.GBQQueryDataSet
- GBQQueryDataSet loads data from a provided SQL query from Google BigQuery. It uses pandas.read_gbq, which itself uses pandas-gbq internally to read from a BigQuery table. Therefore it supports all allowed pandas options on ...

kedro.extras.datasets.pandas.GBQTableDataSet
- GBQTableDataSet loads and saves data from/to Google BigQuery. It uses pandas-gbq to read from and write to a BigQuery table.

kedro.extras.datasets.pandas.sql_dataset.SQLQueryDataSet
- SQLQueryDataSet loads data from a provided SQL query. It uses pandas.DataFrame internally, so it supports all allowed pandas options on read_sql_query. Since pandas uses SQLAlchemy behind the scenes, when instantiating ...

kedro.extras.datasets.pandas.sql_dataset.SQLTableDataSet
- SQLTableDataSet loads data from a SQL table and saves a pandas DataFrame to a table. It uses pandas.DataFrame internally, so it supports all allowed pandas options on the read_sql_table and to_sql methods. ...

kedro.extras.datasets.redis.PickleDataSet
- PickleDataSet loads/saves data from/to a Redis database. The underlying functionality is supported by the redis library, so it supports all allowed options for instantiating the Redis app via from_url and setting a value.

kedro.extras.datasets.spark.DeltaTableDataSet
- DeltaTableDataSet loads data into DeltaTable objects.

kedro.extras.datasets.spark.spark_jdbc_dataset.SparkJDBCDataSet
- SparkJDBCDataSet loads data from a database table accessible via a JDBC URL and connection properties, and saves the content of a PySpark DataFrame to an external database table via JDBC. It uses pyspark.sql.DataFrameReader ...

kedro.extras.datasets.spark.SparkHiveDataSet
- SparkHiveDataSet loads and saves Spark dataframes stored on Hive. This data set also handles some incompatible file types, such as partitioned Parquet on Hive, which will not normally allow upserts to existing data without a complete replacement of the existing file/partition.

kedro.extras.datasets.video.VideoDataSet
- VideoDataSet loads/saves video data from a given filepath as a sequence of PIL.Image.Image using OpenCV.
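The implementations above all fill in the load/save contract that AbstractDataSet defines. A minimal stdlib-only sketch of that contract (not kedro's real base class; all names here are illustrative):

```python
from abc import ABC, abstractmethod
from typing import Any, Dict

class SketchDataSet(ABC):
    """Illustrative stand-in for kedro.io.AbstractDataSet: subclasses
    must implement the abstract _load/_save/_describe methods."""

    @abstractmethod
    def _load(self) -> Any: ...

    @abstractmethod
    def _save(self, data: Any) -> None: ...

    @abstractmethod
    def _describe(self) -> Dict[str, Any]: ...

    def load(self) -> Any:
        # The public API delegates to the subclass implementation.
        return self._load()

    def save(self, data: Any) -> None:
        self._save(data)

class InMemoryDataSet(SketchDataSet):
    """Trivial implementation backed by an instance attribute."""

    def __init__(self):
        self._data = None

    def _load(self):
        return self._data

    def _save(self, data):
        self._data = data

    def _describe(self):
        return {"backend": "memory"}

ds = InMemoryDataSet()
ds.save({"rows": 3})
print(ds.load())  # {'rows': 3}
```

Leaving any abstract method unimplemented makes the subclass uninstantiable, which is how the base class enforces the contract.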
kedro.io.AbstractVersionedDataSet
- AbstractVersionedDataSet is the base class for all versioned data set implementations. All data sets that implement versioning should extend this abstract class and implement the methods marked as abstract.

kedro.extras.datasets.email.EmailMessageDataSet
- EmailMessageDataSet loads/saves an email message from/to a file using an underlying filesystem (e.g. local, S3, GCS). It uses the email package from the standard library to manage email messages.

kedro.extras.datasets.geopandas.GeoJSONDataSet
- GeoJSONDataSet loads/saves data to a GeoJSON file using an underlying filesystem (e.g. local, S3, GCS). The underlying functionality is supported by geopandas, so it supports all allowed geopandas (pandas) options for loading and saving GeoJSON files.

kedro.extras.datasets.holoviews.HoloviewsWriter
- HoloviewsWriter saves Holoviews objects to image file(s) in an underlying filesystem (e.g. local, S3, GCS).

kedro.extras.datasets.json.JSONDataSet
- JSONDataSet loads/saves data from/to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It uses the native json library to handle the JSON file.

kedro.extras.datasets.tracking.JSONDataSet
- JSONDataSet saves data to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It uses the native json library to handle the JSON file. The JSONDataSet is part of Kedro Experiment Tracking. The dataset is write-only and versioned by default.

kedro.extras.datasets.tracking.MetricsDataSet
- MetricsDataSet saves data to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It uses the native json library to handle the JSON file. The MetricsDataSet is part of Kedro Experiment Tracking. The dataset is write-only, versioned by default, and only takes metrics of numeric values.

kedro.extras.datasets.matplotlib.MatplotlibWriter
- MatplotlibWriter saves one or more Matplotlib objects as image files to an underlying filesystem (e.g. local, S3, GCS).

kedro.extras.datasets.networkx.GMLDataSet
- GMLDataSet loads and saves graphs to a GML file using an underlying filesystem (e.g. local, S3, GCS). NetworkX is used to create GML data. See https://networkx.org/documentation/stable/tutorial.html for details.

kedro.extras.datasets.networkx.GraphMLDataSet
- GraphMLDataSet loads and saves graphs to a GraphML file using an underlying filesystem (e.g. local, S3, GCS). NetworkX is used to create GraphML data. See https://networkx.org/documentation/stable/tutorial.html for details.

kedro.extras.datasets.networkx.JSONDataSet
- NetworkX JSONDataSet loads and saves graphs to a JSON file using an underlying filesystem (e.g. local, S3, GCS). NetworkX is used to create JSON data. See https://networkx.org/documentation/stable/tutorial.html for details.

kedro.extras.datasets.pandas.CSVDataSet
- CSVDataSet loads/saves data from/to a CSV file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the CSV file.

kedro.extras.datasets.pandas.ExcelDataSet
- ExcelDataSet loads/saves data from/to an Excel file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the Excel file.

kedro.extras.datasets.pandas.FeatherDataSet
- FeatherDataSet loads and saves data to a Feather file using an underlying filesystem (e.g. local, S3, GCS). The underlying functionality is supported by pandas, so it supports all allowed pandas options for loading and saving Feather files.

kedro.extras.datasets.pandas.GenericDataSet
- pandas.GenericDataSet loads/saves data from/to a data file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to dynamically select the appropriate type of read/write target on a best-effort basis.

kedro.extras.datasets.pandas.HDFDataSet
- HDFDataSet loads/saves data from/to an HDF file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas.HDFStore to handle the HDF file.

kedro.extras.datasets.pandas.JSONDataSet
- JSONDataSet loads/saves data from/to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the JSON file.

kedro.extras.datasets.pandas.ParquetDataSet
- ParquetDataSet loads/saves data from/to a Parquet file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the Parquet file.

kedro.extras.datasets.pandas.XMLDataSet
- XMLDataSet loads/saves data from/to an XML file using an underlying filesystem (e.g. local, S3, GCS). It uses pandas to handle the XML file.

kedro.extras.datasets.pickle.PickleDataSet
- PickleDataSet loads/saves data from/to a Pickle file using an underlying filesystem (e.g. local, S3, GCS). The underlying functionality is supported by the specified backend library passed in (defaults to the ...)

kedro.extras.datasets.pillow.ImageDataSet
- ImageDataSet loads/saves image data as numpy from an underlying filesystem (e.g. local, S3, GCS). It uses Pillow to handle image files.

kedro.extras.datasets.plotly.JSONDataSet
- JSONDataSet loads/saves a Plotly figure from/to a JSON file using an underlying filesystem (e.g. local, S3, GCS).

kedro.extras.datasets.plotly.PlotlyDataSet
- PlotlyDataSet generates a plot from a pandas DataFrame and saves it to a JSON file using an underlying filesystem (e.g. local, S3, GCS). It loads the JSON into a Plotly figure.

kedro.extras.datasets.spark.SparkDataSet
- SparkDataSet loads and saves Spark dataframes.

kedro.extras.datasets.svmlight.SVMLightDataSet
- SVMLightDataSet loads/saves data from/to an svmlight/libsvm file using an underlying filesystem (e.g. local, S3, GCS). It uses the sklearn functions dump_svmlight_file to save and load_svmlight_file to load a file.

kedro.extras.datasets.tensorflow.TensorFlowModelDataset
- TensorFlowModelDataset loads and saves TensorFlow models. The underlying functionality is supported by, and passes input arguments through to, the TensorFlow 2.X load_model and save_model methods.

kedro.extras.datasets.text.TextDataSet
- TextDataSet loads/saves data from/to a text file using an underlying filesystem (e.g. local, S3, GCS).

kedro.extras.datasets.yaml.YAMLDataSet
- YAMLDataSet loads/saves data from/to a YAML file using an underlying filesystem (e.g. local, S3, GCS). It uses PyYAML to handle the YAML file.
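The versioned datasets above all resolve a load version (the latest available one when none is given) and a save version (a UTC timestamp when none is given), in the spirit of the kedro.io.Version namedtuple listed later in this hierarchy. A stdlib-only sketch of that resolution logic, under the assumption that version strings are ISO-like timestamps that sort lexicographically in chronological order (the exact timestamp format is illustrative, not kedro's guaranteed format):

```python
from collections import namedtuple
from datetime import datetime, timezone

# Mirrors the shape of kedro.io.Version: load=None means "latest".
Version = namedtuple("Version", ["load", "save"])

def resolve_load_version(version, available):
    """Return the explicit load version if set, else the latest
    available one (lexicographic max of timestamp strings)."""
    if version.load is not None:
        return version.load
    return max(available)

def generate_save_version(version):
    """Return the explicit save version if set, else a fresh UTC timestamp."""
    if version.save is not None:
        return version.save
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H.%M.%S.%fZ")

versions = ["2023-01-01T10.00.00.000Z", "2023-02-01T10.00.00.000Z"]
print(resolve_load_version(Version(None, None), versions))
# -> 2023-02-01T10.00.00.000Z (the latest)
```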
kedro.io.CachedDataSet
- CachedDataSet is a dataset wrapper which caches in memory the data saved, so that the user avoids I/O operations with slow storage media.

kedro.io.LambdaDataSet
- LambdaDataSet loads and saves data to a data set. It relies on delegating to a specific implementation such as CSV, SQL, etc.

kedro.io.MemoryDataSet
- MemoryDataSet loads and saves data from/to an in-memory Python object.

kedro.io.PartitionedDataSet
- PartitionedDataSet loads and saves partitioned file-like data using the underlying dataset definition. For filesystem-level operations it uses fsspec: https://github.com/intake/filesystem_spec

kedro.io.IncrementalDataSet
- IncrementalDataSet inherits from PartitionedDataSet, which loads and saves partitioned file-like data using the underlying dataset definition. For filesystem-level operations it uses fsspec: https://github.com/intake/filesystem_spec ...
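A distinctive feature of PartitionedDataSet is that loading does not eagerly read every partition: it returns a mapping of partition id to a no-argument callable that loads that partition on demand. A stdlib-only sketch of that shape (the partition ids and data are hypothetical, and this is not kedro's implementation):

```python
from typing import Any, Callable, Dict

def load_partitioned(partitions: Dict[str, Any]) -> Dict[str, Callable[[], Any]]:
    """Sketch of the PartitionedDataSet load contract: return
    partition id -> lazy loader, rather than the data itself."""
    # `data=data` binds each value at definition time, so every
    # lambda closes over its own partition rather than the last one.
    return {pid: (lambda data=data: data) for pid, data in partitions.items()}

# Hypothetical partitions keyed by file name.
loaders = load_partitioned({"2023-01.csv": [1, 2], "2023-02.csv": [3]})

combined = []
for pid in sorted(loaders):
    combined.extend(loaders[pid]())  # each partition is materialised only here
print(combined)  # [1, 2, 3]
```

Deferring the read is what makes large partitioned inputs tractable: a node can iterate over partitions one at a time instead of holding them all in memory.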
kedro.runner.AbstractRunner
- AbstractRunner is the base class for all Pipeline runner implementations.

kedro.runner.ParallelRunner
- ParallelRunner is an AbstractRunner implementation. It can be used to run the Pipeline in parallel groups formed by toposort. Please note that this runner implementation validates datasets using the _validate_catalog ...

kedro.runner.SequentialRunner
- SequentialRunner is an AbstractRunner implementation. It can be used to run the Pipeline in a sequential manner using a topological sort of provided nodes.

kedro.runner.ThreadRunner
- ThreadRunner is an AbstractRunner implementation. It can be used to run the Pipeline in parallel groups formed by toposort using threads.
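All three runners order nodes by topological sort of their data dependencies. The same idea can be shown with the standard library's graphlib (the pipeline below is hypothetical; graphlib's CycleError plays the role that CircularDependencyError plays in kedro):

```python
from graphlib import TopologicalSorter, CycleError

# Node name -> set of upstream node names (a hypothetical pipeline).
dependencies = {
    "split": set(),
    "train": {"split"},
    "evaluate": {"train", "split"},
}

# A valid execution order: every node appears after its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['split', 'train', 'evaluate']

# A circular dependency makes ordering impossible; kedro reports the
# analogous situation as CircularDependencyError.
try:
    list(TopologicalSorter({"a": {"b"}, "b": {"a"}}).static_order())
except CycleError as err:
    print("cycle detected:", err.args[0])
```

ParallelRunner and ThreadRunner additionally group nodes whose dependencies are all satisfied and run each group concurrently; graphlib's `prepare()`/`get_ready()` API supports the same grouping.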
click.CommandCollection

kedro.framework.cli.utils.CommandCollection
- Modified from the Click one to still run the source groups' functions.

kedro.framework.cli.cli.KedroCLI
- A CommandCollection class to encapsulate the KedroCLI command loading.

click.exceptions.ClickException

kedro.framework.cli.utils.KedroCliError
- Exceptions generated from the Kedro CLI.

collections.abc.MutableMapping

kedro.framework.project._ProjectPipelines
- A read-only, lazy, dictionary-like object that holds the project pipelines. On configure it stores the pipelines module. On first data access, e.g. through __getitem__, it loads the registered pipelines and merges them with pipelines defined from hooks.

collections.abc.Sequence

kedro.extras.datasets.video.video_dataset.AbstractVideo
- Base class for the underlying video data.

kedro.extras.datasets.video.video_dataset.FileVideo
- A video object read from a file.

kedro.extras.datasets.video.video_dataset.GeneratorVideo
- A video object with frames yielded by a generator.

kedro.extras.datasets.video.video_dataset.SequenceVideo
- A video object read from an indexable sequence of frames.

collections.UserDict

kedro.config.AbstractConfigLoader
- Base class for all ConfigLoader implementations.

kedro.config.ConfigLoader
- Recursively scans directories (config paths) contained in conf_source for configuration files with a yaml, yml, json, ini, pickle, xml or properties extension, loads them, and returns them in the form of a config dictionary.

kedro.config.OmegaConfigLoader
- Recursively scans directories (config paths) contained in conf_source for configuration files with a yaml, yml or json extension, and loads and merges them through OmegaConf (https://omegaconf.readthedocs.io/...)

kedro.config.TemplatedConfigLoader
- Extension of the ConfigLoader class that allows template values, wrapped in brackets like ${...}, to be automatically formatted based on the configs.
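The ${...} substitution that TemplatedConfigLoader performs can be sketched with the standard library's string.Template, which uses the same placeholder syntax (the config keys and global values below are hypothetical, and this is not kedro's implementation):

```python
from string import Template

# Values that would normally come from a globals file (hypothetical).
global_values = {"bucket": "s3://my-bucket", "env": "dev"}

# A catalog-style config entry containing ${...} placeholders.
raw_config = {
    "type": "pandas.CSVDataSet",
    "filepath": "${bucket}/${env}/cars.csv",
}

def format_config(config, values):
    """Substitute ${...} placeholders in every string value of the config."""
    return {
        key: Template(val).substitute(values) if isinstance(val, str) else val
        for key, val in config.items()
    }

print(format_config(raw_config, global_values)["filepath"])
# -> s3://my-bucket/dev/cars.csv
```

string.Template raises KeyError for an unresolved placeholder, mirroring the fail-fast behaviour you want from config templating.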
kedro.framework.project._ProjectLogging
- No class docstring.

kedro.framework.session.store.BaseSessionStore
- BaseSessionStore is the base class for all session stores. BaseSessionStore is an ephemeral store implementation that doesn't persist the session data.

kedro.framework.session.shelvestore.ShelveStore
- Stores the session data on disk using the shelve package. This is an example of how to persist data on disk.

dynaconf.LazySettings

kedro.framework.project._ProjectSettings
- Defines all settings available for users to configure in Kedro, along with their validation rules and default values. Uses Dynaconf's LazySettings as base.

dynaconf.validator.Validator

kedro.framework.project._HasSharedParentClassValidator
- A validator to check that the parent of the default class is an ancestor of the settings value.

kedro.framework.project._IsSubclassValidator
- A validator to check if the supplied setting value is a subclass of the default class.

Exception

kedro.config.BadConfigException
- Raised when a configuration file cannot be loaded, for instance due to wrong syntax or poor formatting.

kedro.config.MissingConfigException
- Raised when no configuration files can be found within a config path.

kedro.framework.context.KedroContextError
- Error occurred when loading project and running context pipeline.

kedro.framework.session.session.KedroSessionError
- KedroSessionError is raised by KedroSession when multiple runs are attempted in one session.

kedro.io.DataSetError
- DataSetError is raised by AbstractDataSet implementations in case of failure of input/output methods.

kedro.io.core.VersionNotFoundError
- VersionNotFoundError is raised by AbstractVersionedDataSet implementations when no load versions are available for the data set.

kedro.io.DataSetAlreadyExistsError
- DataSetAlreadyExistsError is raised by the DataCatalog class when trying to add a data set which already exists in the DataCatalog.

kedro.io.DataSetNotFoundError
- DataSetNotFoundError is raised by the DataCatalog class when trying to use a non-existing data set.

kedro.pipeline.modular_pipeline.ModularPipelineError
- Raised when a modular pipeline is not adapted and integrated appropriately using the helper.

kedro.pipeline.pipeline.CircularDependencyError
- Raised when it is not possible to provide a topological execution order for nodes, due to a circular dependency in the node definitions.

kedro.pipeline.pipeline.ConfirmNotUniqueError
- Raised when two or more nodes that are part of the same pipeline attempt to confirm the same dataset.

kedro.pipeline.pipeline.OutputNotUniqueError
- Raised when two or more nodes that are part of the same pipeline produce outputs with the same name.

hdfs.InsecureClient
kedro.extras.datasets.spark.spark_dataset.KedroHdfsInsecureClient
- Subclasses hdfs.InsecureClient and implements the hdfs_exists and hdfs_glob methods required by SparkDataSet.

kedro.extras.datasets.video.video_dataset.SlicedVideo
- A representation of slices of other video types.

kedro.framework.cli.hooks.specs.CLICommandSpecs
- Namespace that defines all specifications for the Kedro CLI's lifecycle hooks.

kedro.framework.cli.starters._Prompt
- Represents a single CLI prompt for kedro new.

kedro.framework.cli.starters.KedroStarterSpec
- Specification of a custom kedro starter template.

kedro.framework.context.KedroContext
- KedroContext is the base class which holds the configuration and Kedro's main functionality.

kedro.framework.hooks.manager._NullPluginManager
- This class creates an empty hook_manager that will ignore all calls to hooks, allowing the runner to function if no hook_manager has been instantiated.

kedro.framework.hooks.specs.DataCatalogSpecs
- Namespace that defines all specifications for a data catalog's lifecycle hooks.

kedro.framework.hooks.specs.DatasetSpecs
- Namespace that defines all specifications for a dataset's lifecycle hooks.

kedro.framework.hooks.specs.KedroContextSpecs
- Namespace that defines all specifications for a Kedro context's lifecycle hooks.

kedro.framework.hooks.specs.NodeSpecs
- Namespace that defines all specifications for a node's lifecycle hooks.

kedro.framework.hooks.specs.PipelineSpecs
- Namespace that defines all specifications for a pipeline's lifecycle hooks.

kedro.framework.session.KedroSession
- KedroSession is the object responsible for managing the lifecycle of a Kedro run. Use KedroSession.create() as a context manager to construct a new KedroSession with the session data provided.

kedro.io.data_catalog._FrozenDatasets
- Helper class to access the underlying loaded datasets.

kedro.io.DataCatalog
- DataCatalog stores instances of AbstractDataSet implementations to provide load and save capabilities from anywhere in the program. To use a DataCatalog, you need to instantiate it with a dictionary of data sets ...

kedro.pipeline.node.Node
- Node is an auxiliary class facilitating the operations required to run user-provided functions as part of Kedro pipelines.

kedro.pipeline.pipeline.Pipeline
- A Pipeline defined as a collection of Node objects. This class treats nodes as part of a graph representation and provides inputs, outputs and execution order.
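How DataCatalog, Node and Pipeline fit together can be sketched in plain Python: a catalog maps names to data, a node is a function with named inputs and an output, and a pipeline runs nodes in order, wiring outputs back into the catalog. None of the classes or shapes below are kedro's real API; they are deliberately simplified:

```python
def run_pipeline(nodes, catalog):
    """Run (func, input_names, output_name) triples sequentially,
    reading inputs from and writing outputs to a dict-backed catalog."""
    for func, inputs, output in nodes:
        args = [catalog[name] for name in inputs]  # "load" each input
        catalog[output] = func(*args)              # "save" the output
    return catalog

catalog = {"raw": [3, 1, 2]}
nodes = [
    (sorted, ["raw"], "sorted"),                # first node: sort the input
    (lambda xs: xs[-1], ["sorted"], "maximum"), # second node: take the largest
]
print(run_pipeline(nodes, catalog)["maximum"])  # 3
```

In kedro the catalog entries are AbstractDataSet instances rather than raw values, so the same wiring transparently covers files, databases and in-memory data.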
kedro.runner.parallel_runner._SharedMemoryDataSet
- _SharedMemoryDataSet is a wrapper class for a shared MemoryDataSet in SyncManager. It is not inherited from the AbstractDataSet class.

logging.StreamHandler

kedro.extras.logging.ColorHandler
- A color log handler.

multiprocessing.managers.SyncManager

kedro.runner.parallel_runner.ParallelRunnerManager
- ParallelRunnerManager is used to create shared MemoryDataSet objects as default data sets in a pipeline.

namedtuple('Version', ['load', 'save'])

kedro.io.Version
- This namedtuple is used to provide load and save versions for versioned data sets. If Version.load is None, then the latest available version is loaded. If Version.save is None, then the save version is formatted as YYYY-MM-DDThh...

pluggy.PluginManager

kedro.framework.cli.hooks.CLIHooksManager
- Hooks manager to manage CLI hooks.

typing.Generic
kedro.io.AbstractDataSet
- AbstractDataSet is the base class for all data set implementations; its subclasses are listed in full under abc.ABC above.

typing.NamedTuple
kedro.framework.cli.pipeline.PipelineArtifacts
- An ordered collection of source_path, tests_path, config_paths.

kedro.framework.startup.ProjectMetadata
- Structure holding project metadata derived from pyproject.toml.