class documentation

class ParallelRunner(AbstractRunner): (source)

ParallelRunner is an AbstractRunner implementation. It can be used to run the Pipeline in parallel groups formed by toposort. Please note that this runner implementation validates the catalog using the _validate_catalog method, which checks whether any of the datasets are single-process only via the _SINGLE_PROCESS dataset attribute.
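
As a quick orientation, the sketch below shows one way this runner is typically driven: a Pipeline and a DataCatalog are handed to run(), and any unregistered intermediate datasets are created on the fly via create_default_data_set. It is a minimal, illustrative example (a toy one-node pipeline rather than a real project) and assumes a Kedro version in which run() can be called without passing an explicit PluginManager.

    from kedro.io import DataCatalog, MemoryDataSet
    from kedro.pipeline import Pipeline, node
    from kedro.runner import ParallelRunner

    def double(x):
        return x * 2

    # One node reads "numbers" and produces "doubled"; "doubled" is not
    # registered in the catalog, so the runner backs it with a default
    # (shared-memory) dataset via create_default_data_set.
    pipeline = Pipeline([node(double, inputs="numbers", outputs="doubled")])
    catalog = DataCatalog({"numbers": MemoryDataSet(21)})

    if __name__ == "__main__":  # guard is required when worker processes are spawned
        runner = ParallelRunner(max_workers=2)
        runner.run(pipeline, catalog)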

Method __del__ Undocumented
Method __init__ Instantiates the runner by creating a Manager.
Method create_default_data_set Factory method for creating the default dataset for the runner.
Class Method _validate_catalog Ensure that all datasets are serialisable and that no non-proxied memory datasets are used as outputs, as their content will not be synchronised across processes.
Class Method _validate_nodes Ensure all tasks are serialisable.
Method _get_required_workers_count Calculate the maximum number of processes required for the pipeline, limited to the number of CPU cores.
Method _run The abstract interface for running pipelines.
Instance Variable _manager Undocumented
Instance Variable _max_workers Undocumented

Inherited from AbstractRunner:

Method run Run the Pipeline using the datasets provided by catalog and save results back to the same objects.
Method run_only_missing Run only the missing outputs from the Pipeline using the datasets provided by catalog, and save results back to the same objects.
Method _suggest_resume_scenario Suggest a command to the user to resume a run after it fails. The run should be started from the point closest to the failure for which persisted input exists.
Instance Variable _is_async Undocumented
Property _logger Undocumented
def __del__(self): (source)

Undocumented

def __init__(self, max_workers: int = None, is_async: bool = False): (source)

Instantiates the runner by creating a Manager.

Parameters
max_workers : int
    Number of worker processes to spawn. If not set, it is calculated automatically based on the pipeline configuration and the CPU core count. On Windows machines, max_workers cannot be larger than 61 and will be set to min(61, max_workers).
is_async : bool
    If True, the node inputs and outputs are loaded and saved asynchronously with threads. Defaults to False.
Raises
ValueError
    If bad parameters are passed.
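
To make the max_workers fallback and the Windows cap concrete, here is a rough, hypothetical sketch of how such a value could be resolved; the helper name and the validation are illustrative, not the library's code.

    import os
    import sys

    def _resolve_max_workers(max_workers=None):
        # Hypothetical helper mirroring the behaviour described above.
        if max_workers is not None and max_workers <= 0:
            raise ValueError("max_workers should be a positive integer")
        if max_workers is None:
            # Fall back to the CPU core count; the per-pipeline requirement is
            # applied separately (see _get_required_workers_count below).
            max_workers = os.cpu_count() or 1
        if sys.platform == "win32":
            # Windows process pools cannot exceed 61 workers, so cap the value.
            max_workers = min(61, max_workers)
        return max_workers
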
def create_default_data_set(self, ds_name: str) -> _SharedMemoryDataSet: (source)

Factory method for creating the default dataset for the runner.

Parameters
ds_name : str
    Name of the missing dataset.
Returns
_SharedMemoryDataSet
    An instance of _SharedMemoryDataSet to be used for all unregistered datasets.
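
In practice this means every dataset name that the pipeline references but the catalog does not contain gets backed by a shared-memory dataset, so worker processes and the parent can exchange its contents. A hypothetical sketch of that resolution step (the helper below is illustrative, not Kedro's internal code):

    def add_default_datasets(runner, catalog, pipeline):
        # Hypothetical helper: register a runner-specific default dataset for
        # every name the pipeline references but the catalog does not know about.
        unregistered = pipeline.data_sets() - set(catalog.list())
        for ds_name in unregistered:
            catalog.add(ds_name, runner.create_default_data_set(ds_name))
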
@classmethod
def _validate_catalog(cls, catalog: DataCatalog, pipeline: Pipeline): (source)

Ensure that all datasets are serialisable and that no non-proxied memory datasets are used as outputs, as their content will not be synchronised across processes.
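
A simplified illustration of what such a check involves is sketched below; only the _SINGLE_PROCESS attribute name is taken from the documentation above, and everything else (including the use of catalog internals) is illustrative rather than the library's actual code.

    from kedro.io import DataCatalog, MemoryDataSet

    def validate_catalog_sketch(catalog: DataCatalog, pipeline) -> None:
        # Illustrative only: reject datasets flagged as single-process, and plain
        # MemoryDataSets used as outputs, whose contents written in a worker
        # process would never reach the parent process.
        single_process = [
            name for name in catalog.list()
            if getattr(catalog._get_dataset(name), "_SINGLE_PROCESS", False)
        ]
        if single_process:
            raise AttributeError(
                f"These datasets cannot be used with multiprocessing: {single_process}"
            )
        memory_outputs = [
            name for name in catalog.list()
            if name in pipeline.all_outputs()
            and type(catalog._get_dataset(name)) is MemoryDataSet
        ]
        if memory_outputs:
            raise AttributeError(
                f"Non-proxied MemoryDataSets used as outputs: {memory_outputs}"
            )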

@classmethod
def _validate_nodes(cls, nodes: Iterable[Node]): (source)

Ensure all tasks are serialisable.
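
"Serialisable" here means the node, including its underlying function and arguments, can be pickled and sent to a worker process. A rough stand-in for such a check (not the library's exact code):

    import pickle
    from multiprocessing.reduction import ForkingPickler

    def validate_nodes_sketch(nodes) -> None:
        # Try to pickle each node the same way multiprocessing would;
        # anything that fails cannot be shipped to a worker process.
        unserialisable = []
        for node in nodes:
            try:
                ForkingPickler.dumps(node)
            except (AttributeError, pickle.PicklingError):
                unserialisable.append(node)
        if unserialisable:
            raise AttributeError(
                f"The following nodes cannot be serialised: {unserialisable}"
            )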

def _get_required_workers_count(self, pipeline: Pipeline): (source)

Calculate the maximum number of processes required for the pipeline, limited to the number of CPU cores.
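
One way to reason about this bound: after toposorting, nodes fall into groups whose members can run concurrently, so the widest group is an upper limit on useful parallelism. A hedged sketch of that idea (the exact formula in the library may differ):

    import os

    def required_workers_sketch(pipeline, max_workers=None) -> int:
        # The widest toposorted group is the largest number of nodes that can
        # ever run at the same time, so spawning more processes than that is wasted.
        widest_group = max(len(group) for group in pipeline.grouped_nodes)
        return min(widest_group, max_workers or os.cpu_count() or 1)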

def _run(self, pipeline: Pipeline, catalog: DataCatalog, hook_manager: PluginManager, session_id: str = None): (source)

The abstract interface for running pipelines.

Parameters
pipeline : Pipeline
    The Pipeline to run.
catalog : DataCatalog
    The DataCatalog from which to fetch data.
hook_manager : PluginManager
    The PluginManager to activate hooks.
session_id : str
    The id of the session.
Raises
AttributeError
    When the provided pipeline is not suitable for parallel execution.
RuntimeError
    If the runner is unable to schedule the execution of all pipeline nodes.
Exception
    In case of any downstream node failure.
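
For ParallelRunner this is where the parallel scheduling happens. The sketch below illustrates only the core idea described at the top of this page, running each toposorted group in a process pool; the real method also wires in the catalog, hooks and session id, all omitted here.

    from concurrent.futures import ProcessPoolExecutor, as_completed

    def run_in_groups_sketch(grouped_nodes, run_node, max_workers):
        # Illustrative scheduling loop: nodes in the same toposorted group have
        # no dependencies on one another, so a whole group can be submitted at once.
        with ProcessPoolExecutor(max_workers=max_workers) as pool:
            for group in grouped_nodes:
                futures = [pool.submit(run_node, node) for node in group]
                for future in as_completed(futures):
                    future.result()  # re-raises any worker failure in the parent
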
_manager = (source)

Undocumented

_max_workers = (source)

Undocumented