kedro.runner.ThreadRunner

class documentation

class ThreadRunner(AbstractRunner): (source)

ThreadRunner is an AbstractRunner implementation. It can be used to run the Pipeline in parallel groups formed by toposort using threads.

Method	`__init__`	Instantiates the runner.
Method	`create_default_data_set`	Factory method for creating the default dataset for the runner.
Method	`_get_required_workers_count`	Calculate the max number of processes required for the pipeline
Method	`_run`	The abstract interface for running pipelines.
Instance Variable	`_max_workers`	Undocumented

Inherited from AbstractRunner:

Method	`run`	Run the `Pipeline` using the datasets provided by `catalog` and save results back to the same objects.
Method	`run_only_missing`	Run only the missing outputs from the `Pipeline` using the datasets provided by `catalog`, and save results back to the same objects.
Method	`_suggest_resume_scenario`	Suggest a command to the user to resume a run after it fails. The run should be started from the point closest to the failure for which persisted input exists.
Instance Variable	`_is_async`	Undocumented
Property	`_logger`	Undocumented

def __init__(self, max_workers: int = None, is_async: bool = False): (source) ¶

overrides kedro.runner.AbstractRunner.__init__

Instantiates the runner.

Parameters
max_workers:`int`	Number of worker processes to spawn. If not set, calculated automatically based on the pipeline configuration and CPU core count.
is_async:`bool`	If True, set to False, because `ThreadRunner` doesn't support loading and saving the node inputs and outputs asynchronously with threads. Defaults to False.
Raises
`ValueError`	bad parameters passed

def create_default_data_set(self, ds_name: str) -> MemoryDataSet: (source) ¶

overrides kedro.runner.AbstractRunner.create_default_data_set

Factory method for creating the default dataset for the runner.

Parameters
ds_name:`str`	Name of the missing dataset.
Returns
`MemoryDataSet`	An instance of `MemoryDataSet` to be used for all unregistered datasets.

def _get_required_workers_count(self, pipeline: Pipeline): (source) ¶

Calculate the max number of processes required for the pipeline

def _run(self, pipeline: Pipeline, catalog: DataCatalog, hook_manager: PluginManager, session_id: str = None): (source) ¶

overrides kedro.runner.AbstractRunner._run

The abstract interface for running pipelines.

Parameters
pipeline:`Pipeline`	The `Pipeline` to run.
catalog:`DataCatalog`	The `DataCatalog` from which to fetch data.
hook_manager:`PluginManager`	The `PluginManager` to activate hooks.
session_id:`str`	The id of the session.
Raises
`Exception`	in case of any downstream node failure.

Undocumented