class documentation

class ExecutionEngine:

Undocumented

Method __init__ Undocumented
Method close Gracefully close the execution engine. If it has already been started, stop it. In all cases, close the spider and the downloader.
Method close_spider Close (cancel) the spider and clear all its outstanding requests.
Method crawl Inject the request into the spider <-> downloader pipeline.
Method download Return a Deferred that fires with a Response as its result. Only downloader middlewares are applied.
Method has_capacity Undocumented
Method open_spider Undocumented
Method pause Undocumented
Method schedule Undocumented
Method spider_is_idle Undocumented
Method start Undocumented
Method stop Gracefully stop the execution engine.
Method unpause Undocumented
Instance Variable crawler Undocumented
Instance Variable downloader Undocumented
Instance Variable logformatter Undocumented
Instance Variable paused Undocumented
Instance Variable running Undocumented
Instance Variable scheduler_cls Undocumented
Instance Variable scraper Undocumented
Instance Variable settings Undocumented
Instance Variable signals Undocumented
Instance Variable slot Undocumented
Instance Variable spider Undocumented
Instance Variable start_time Undocumented
Property open_spiders Undocumented
Method _download Undocumented
Method _downloaded Undocumented
Method _get_scheduler_class Undocumented
Method _handle_downloader_output Undocumented
Method _needs_backout Undocumented
Method _next_request Undocumented
Method _next_request_from_scheduler Undocumented
Method _schedule_request Undocumented
Method _spider_idle Called when a spider gets idle, i.e. when there are no remaining requests to download or schedule. It can be called multiple times. If a handler for the spider_idle signal raises a DontCloseSpider exception, the spider is not closed until the next loop and this function is guaranteed to be called (at least) once again...
Instance Variable _closewait Undocumented
Instance Variable _spider_closed_callback Undocumented
def __init__(self, crawler, spider_closed_callback: Callable):

Undocumented

def close(self) -> Deferred:

Gracefully close the execution engine. If it has already been started, stop it. In all cases, close the spider and the downloader.
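
The three cases in this description can be modeled with a small stand-in class. This is a simplified sketch, not the engine's actual implementation: `ToyEngine`, its `calls` list, and the `"shutdown"` reason string are assumptions made for illustration, and the real method returns a Deferred rather than a list.

```python
class ToyEngine:
    """Hypothetical stand-in that records which cleanup steps would run."""

    def __init__(self, running=False, spider=None):
        self.running = running
        self.spider = spider
        self.calls = []

    def stop(self):
        # In this model, stopping also closes the spider (if any) and the downloader.
        self.running = False
        self.calls.append("stop")
        if self.spider is not None:
            self.close_spider(self.spider, reason="shutdown")
        else:
            self.calls.append("close_downloader")

    def close_spider(self, spider, reason="cancelled"):
        # In this model, closing the spider also closes the downloader.
        self.calls.append(f"close_spider:{reason}")
        self.spider = None
        self.calls.append("close_downloader")

    def close(self):
        if self.running:
            # Already started: stop it, which cascades to spider and downloader.
            self.stop()
        elif self.spider is not None:
            # Never started, but a spider is open: close it.
            self.close_spider(self.spider, reason="shutdown")
        else:
            # Nothing running and no spider: only the downloader is left.
            self.calls.append("close_downloader")
        return self.calls
```

In every branch the recorded calls end with the downloader being closed, which mirrors the "in all cases" wording of the description.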

def close_spider(self, spider: Spider, reason: str = 'cancelled') -> Deferred:

Close (cancel) the spider and clear all its outstanding requests.
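
As a usage sketch, code that holds a reference to the engine could close the spider once some limit is reached. The helper name `close_if_over_budget`, the `budget` parameter, and the `"item_budget_reached"` reason string are hypothetical; only the `close_spider(spider, reason)` call is taken from the signature above.

```python
def close_if_over_budget(engine, spider, items_seen, budget):
    """Close the spider through the engine once items_seen reaches budget.

    Returns whatever close_spider returns (a Deferred, per the signature
    above), or None when the budget has not been reached yet.
    """
    if items_seen < budget:
        return None
    # The reason string is free-form; this one is made up for the example.
    return engine.close_spider(spider, reason="item_budget_reached")
```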

def crawl(self, request: Request, spider: Optional[Spider] = None):

Inject the request into the spider <-> downloader pipeline.
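
One way to use `crawl` is to hand the engine requests that were built outside the normal spider callbacks. The sketch below assumes a caller-maintained queue of URLs; `feed_pending`, `make_request`, and `limit` are hypothetical names, and `spider` is left at its default of None as the signature allows.

```python
from collections import deque


def feed_pending(engine, pending_urls: deque, make_request, limit=10):
    """Pass up to `limit` queued URLs to engine.crawl; return how many were sent."""
    sent = 0
    while pending_urls and sent < limit:
        url = pending_urls.popleft()
        # make_request is a caller-supplied factory, e.g. one that builds a Request.
        engine.crawl(make_request(url))
        sent += 1
    return sent
```

Capping the batch with `limit` keeps a single call from pushing an unbounded number of requests into the pipeline at once.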

def download(self, request: Request, spider: Optional[Spider] = None) -> Deferred:

Return a Deferred that fires with a Response as its result. Only downloader middlewares are applied.
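
Because the result arrives through a Deferred, callers typically attach a callback instead of reading a return value. The helper below is a minimal sketch under that assumption: `download_and_collect` and `sink` are hypothetical names, and the only calls it relies on are `engine.download(request)` from the signature above and Twisted's `Deferred.addCallback`.

```python
def download_and_collect(engine, request, sink):
    """Start a download through the engine and append the result to sink.

    Assumes the object returned by engine.download offers addCallback,
    as a Twisted Deferred does.
    """
    def _store(response):
        sink.append(response)
        return response  # pass the response on to any later callbacks

    d = engine.download(request)
    d.addCallback(_store)
    return d
```

Returning the response from `_store` matters: a callback's return value becomes the input of the next callback in the chain.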

def has_capacity(self) -> bool:

Undocumented

@inlineCallbacks
def open_spider(self, spider: Spider, start_requests: Iterable = (), close_if_idle: bool = True):

Undocumented

def pause(self):

Undocumented

def schedule(self, request: Request, spider: Spider):

Undocumented

def spider_is_idle(self, spider: Optional[Spider] = None) -> bool:

Undocumented

def start(self):

Undocumented

def stop(self) -> Deferred:

Gracefully stop the execution engine.

def unpause(self):

Undocumented

crawler =

Undocumented

downloader =

Undocumented

logformatter =

Undocumented

paused =

Undocumented

running =

Undocumented

scheduler_cls =

Undocumented

scraper =

Undocumented

settings =

Undocumented

signals =

Undocumented

slot =

Undocumented

spider =

Undocumented

start_time =

Undocumented

@property
open_spiders: list =

Undocumented

def _download(self, request: Request, spider: Optional[Spider]) -> Deferred:

Undocumented

def _downloaded(self, result: Union[Response, Request], request: Request, spider: Spider) -> Union[Deferred, Response]:

Undocumented

def _get_scheduler_class(self, settings: BaseSettings) -> type:

Undocumented

def _handle_downloader_output(self, result: Union[Request, Response, Failure], request: Request) -> Optional[Deferred]:

Undocumented

def _needs_backout(self) -> bool:

Undocumented

def _next_request(self):

Undocumented

def _next_request_from_scheduler(self) -> Optional[Deferred]:

Undocumented

def _schedule_request(self, request: Request, spider: Spider):

Undocumented

def _spider_idle(self):

Called when a spider gets idle, i.e. when there are no remaining requests to download or schedule. It can be called multiple times. If a handler for the spider_idle signal raises a DontCloseSpider exception, the spider is not closed until the next loop and this function is guaranteed to be called (at least) once again. A handler can raise CloseSpider to provide a custom closing reason.
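
The decision described above can be sketched as a small pure-Python model. It is an illustration only, not the engine's code: the two exception classes are local stand-ins for the ones named in the text, `decide_on_idle` is a hypothetical function, and both the `"finished"` default reason and the rule that DontCloseSpider wins over CloseSpider are assumptions of this sketch.

```python
class DontCloseSpider(Exception):
    """Local stand-in: a handler raises this to keep the spider open."""


class CloseSpider(Exception):
    """Local stand-in: a handler raises this to supply a closing reason."""

    def __init__(self, reason="cancelled"):
        super().__init__(reason)
        self.reason = reason


def decide_on_idle(handlers, default_reason="finished"):
    """Run every idle handler; return None to stay open, else the closing reason."""
    raised = []
    for handler in handlers:
        try:
            handler()
        except (DontCloseSpider, CloseSpider) as exc:
            # Collect instead of re-raising so that all handlers still run.
            raised.append(exc)
    if any(isinstance(exc, DontCloseSpider) for exc in raised):
        return None  # stay open; the idle check runs again on a later loop
    for exc in raised:
        if isinstance(exc, CloseSpider):
            return exc.reason
    return default_reason
```

A caller would invoke `decide_on_idle` each time the spider goes idle, which matches the note that the function can be called multiple times before the spider finally closes.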

_closewait =

Undocumented

_spider_closed_callback =

Undocumented