
AbstractRunner is the base class for all Pipeline runner implementations.

Method __init__: Instantiates the runner class.
Method create_default_data_set: Factory method for creating the default dataset for the runner.
Method run: Run the Pipeline using the datasets provided by catalog and save results back to the same objects.
Method run_only_missing: Run only the missing outputs from the Pipeline using the datasets provided by catalog, and save results back to the same objects.
Method _run: The abstract interface for running pipelines, assuming that the inputs have already been checked and normalized by run().
Method _suggest_resume_scenario: Suggest a command to the user to resume a run after it fails. The run should be started from the point closest to the failure for which persisted input exists.
Instance Variable _is_async: Whether node inputs and outputs are loaded and saved asynchronously (set from the is_async constructor argument).
Property _logger: Undocumented
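The summary above describes a template-method design: the public run() validates and normalizes inputs, then delegates execution to the abstract _run() that each concrete runner supplies. The sketch below illustrates that pattern with simplified stand-in classes; the names mirror the documented interface, but the bodies are illustrative assumptions, not the real Kedro implementation.

```python
from abc import ABC, abstractmethod

class SketchRunner(ABC):
    """Simplified stand-in for AbstractRunner's template-method pattern."""

    def __init__(self, is_async: bool = False):
        self._is_async = is_async

    def run(self, pipeline, catalog):
        # run() first checks that the catalog can satisfy the pipeline's
        # inputs (raising ValueError otherwise, as documented)...
        missing = set(pipeline["inputs"]) - set(catalog)
        if missing:
            raise ValueError(f"Pipeline input(s) not found: {missing}")
        # ...then delegates the actual execution to the abstract _run().
        return self._run(pipeline, catalog)

    @abstractmethod
    def _run(self, pipeline, catalog):
        ...

class SequentialSketchRunner(SketchRunner):
    def _run(self, pipeline, catalog):
        # Run each "node" (a plain function here) in order, keyed by its
        # output name, and return the results as a dictionary.
        return {name: fn(catalog) for name, fn in pipeline["nodes"]}
```

Concrete runners only override _run() (and create_default_data_set()); the validation logic in run() is inherited unchanged.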
def __init__(self, is_async: bool = False):

Instantiates the runner class.

Parameters
is_async (bool): If True, the node inputs and outputs are loaded and saved asynchronously with threads. Defaults to False.
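To make the is_async flag concrete, here is a hedged sketch of what thread-based loading might look like: with is_async=True, dataset loads are submitted to a thread pool instead of running one at a time. The function and data shapes are assumptions for illustration, not Kedro's actual internals.

```python
from concurrent.futures import ThreadPoolExecutor

def load_inputs(datasets, is_async=False):
    """Load every dataset; datasets maps name -> zero-arg loader callable.

    Illustrative only: sketches the synchronous vs. threaded behaviour
    implied by the is_async flag.
    """
    if not is_async:
        # Synchronous path: load datasets one after another.
        return {name: load() for name, load in datasets.items()}
    # Asynchronous path: submit all loads to a thread pool, then collect.
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(load) for name, load in datasets.items()}
        return {name: f.result() for name, f in futures.items()}
```

Threading helps when loaders are I/O-bound (files, databases, network), since the waits overlap rather than accumulate.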
@abstractmethod
def create_default_data_set(self, ds_name: str) -> AbstractDataSet:

Factory method for creating the default dataset for the runner.

Parameters
ds_name (str): Name of the missing dataset.
Returns
AbstractDataSet: An instance of an implementation of AbstractDataSet to be used for all unregistered datasets.
def run(self, pipeline: Pipeline, catalog: DataCatalog, hook_manager: PluginManager = None, session_id: str = None) -> Dict[str, Any]:

Run the Pipeline using the datasets provided by catalog and save results back to the same objects.

Parameters
pipeline (Pipeline): The Pipeline to run.
catalog (DataCatalog): The DataCatalog from which to fetch data.
hook_manager (PluginManager): The PluginManager to activate hooks.
session_id (str): The id of the session.
Returns
Dict[str, Any]: Any node outputs that cannot be processed by the DataCatalog. These are returned in a dictionary, where the keys are defined by the node outputs.
Raises
ValueError: Raised when Pipeline inputs cannot be satisfied.
def run_only_missing(self, pipeline: Pipeline, catalog: DataCatalog, hook_manager: PluginManager) -> Dict[str, Any]:

Run only the missing outputs from the Pipeline using the datasets provided by catalog, and save results back to the same objects.

Parameters
pipeline (Pipeline): The Pipeline to run.
catalog (DataCatalog): The DataCatalog from which to fetch data.
hook_manager (PluginManager): The PluginManager to activate hooks.
Returns
Dict[str, Any]: Any node outputs that cannot be processed by the DataCatalog. These are returned in a dictionary, where the keys are defined by the node outputs.
Raises
ValueError: Raised when Pipeline inputs cannot be satisfied.
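The core idea behind run_only_missing is to skip nodes whose outputs already exist in the catalog and execute only the rest. The sketch below demonstrates that idea with plain dictionaries and functions; the name and data shapes are illustrative assumptions, not Kedro's implementation (which also accounts for upstream dependencies of the missing outputs).

```python
def run_only_missing_sketch(nodes, catalog):
    """Run only nodes whose output is absent from the catalog.

    nodes: list of (output_name, fn) pairs, in execution order.
    catalog: dict of dataset name -> already-persisted data.
    Returns the outputs produced by this call.
    """
    results = {}
    for output_name, fn in nodes:
        if output_name in catalog:
            # Output already persisted: skip this node entirely.
            continue
        # Compute the missing output and save it back to the catalog,
        # mirroring "save results back to the same objects".
        catalog[output_name] = results[output_name] = fn(catalog)
    return results
```

This is what makes resuming cheap: re-running a pipeline after a partial failure re-executes only the work whose results were never persisted.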
@abstractmethod
def _run(self, pipeline: Pipeline, catalog: DataCatalog, hook_manager: PluginManager, session_id: str = None): (source)

The abstract interface for running pipelines, assuming that the inputs have already been checked and normalized by run().

Parameters
pipeline (Pipeline): The Pipeline to run.
catalog (DataCatalog): The DataCatalog from which to fetch data.
hook_manager (PluginManager): The PluginManager to activate hooks.
session_id (str): The id of the session.
def _suggest_resume_scenario(self, pipeline: Pipeline, done_nodes: Iterable[Node], catalog: DataCatalog): (source)

Suggest a command to the user to resume a run after it fails. The run should be started from the point closest to the failure for which persisted input exists.

Parameters
pipeline (Pipeline): The Pipeline of the run.
done_nodes (Iterable[Node]): The ``Node``s that executed successfully.
catalog (DataCatalog): The DataCatalog of the run.
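The resume suggestion boils down to a reachability question: among the nodes that did not complete, which can be started right away because all of their inputs have persisted copies? The sketch below shows that selection with plain names and sets; it is a simplified illustration of the idea, not Kedro's actual heuristic.

```python
def suggest_resume_nodes(all_nodes, done_nodes, persisted):
    """Pick remaining nodes whose inputs all have persisted copies.

    all_nodes: list of (node_name, input_names) pairs, in pipeline order.
    done_nodes: set of node names that executed successfully.
    persisted: set of dataset names with persisted (non-memory) copies.
    """
    remaining = [(name, ins) for name, ins in all_nodes if name not in done_nodes]
    # Resume candidates: remaining nodes whose inputs already exist on disk;
    # restarting "from" these covers everything downstream of the failure.
    return [name for name, ins in remaining if set(ins) <= persisted]
```

A runner would then format these node names into a suggested resume command for the user.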