class documentation

class Scraper:

Undocumented
Method | __init__ | Undocumented
Method | call_spider | Undocumented
Method | close_spider | Close a spider being scraped and release its resources
Method | enqueue_scrape | Undocumented
Method | handle_spider_error | Undocumented
Method | handle_spider_output | Undocumented
Method | is_idle | Return True if there aren't any more spiders to process
Method | open_spider | Open the given spider for scraping and allocate resources for it
Instance Variable | concurrent_items | Undocumented
Instance Variable | crawler | Undocumented
Instance Variable | itemproc | Undocumented
Instance Variable | logformatter | Undocumented
Instance Variable | signals | Undocumented
Instance Variable | slot | Undocumented
Instance Variable | spidermw | Undocumented
Method | _check_if_closing | Undocumented
Method | _itemproc_finished | ItemProcessor finished for the given ``item`` and returned ``output``
Method | _log_download_errors | Log and silence errors that come from the engine (typically download errors that got propagated through here).
Method | _process_spidermw_output | Process each Request/Item (given in the output parameter) returned from the given spider
Method | _scrape | Handle the downloaded response or failure through the spider callback/errback
Method | _scrape2 | Handle the different cases of the request's result being a Response or a Failure
Method | _scrape_next | Undocumented
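Taken together, the table describes one pipeline: enqueue_scrape queues a downloaded result, _scrape hands it to the spider via call_spider, and handle_spider_output routes whatever the spider yields. The dispatch at the heart of call_spider can be sketched without Scrapy at all; the stand-in Response/Failure/Request classes below only mimic the shapes of scrapy.http and twisted.python.failure, and the synchronous return is an illustration (the real method returns a Twisted Deferred):

```python
# Illustrative stand-ins; these are NOT the real Scrapy/Twisted classes.
class Response:
    def __init__(self, url, request=None):
        self.url = url
        self.request = request

class Failure:
    def __init__(self, exc):
        self.value = exc

class Request:
    def __init__(self, url, callback=None, errback=None):
        self.url = url
        self.callback = callback
        self.errback = errback

def call_spider(result, request):
    # A Response goes to the request's callback, a Failure to its
    # errback — the same routing the real method performs inside a
    # Deferred chain.
    if isinstance(result, Response):
        return request.callback(result)
    return request.errback(result)

def parse(response):
    # A typical spider callback: turn the response into items.
    return [{"url": response.url}]

req = Request("http://example.com", callback=parse)
print(call_spider(Response("http://example.com"), req))
# [{'url': 'http://example.com'}]
```

In real code the callback/errback pair comes from the Request the spider scheduled, which is why a download failure can still reach spider code through an errback.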
def call_spider(self, result: Union[Response, Failure], request: Request, spider: Spider) -> Deferred:

Undocumented
def enqueue_scrape(self, result: Union[Response, Failure], request: Request, spider: Spider) -> Deferred:

Undocumented
def handle_spider_error(self, _failure: Failure, request: Request, response: Response, spider: Spider):

Undocumented
def handle_spider_output(self, result: Union[Iterable, AsyncIterable], request: Request, response: Response, spider: Spider) -> Deferred:

Undocumented
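This method is undocumented, but its role (per _process_spidermw_output's docstring below) is to walk the iterable the spider returned and route each element: Requests go back to be scheduled, everything else is treated as an item for the pipeline. A hedged sketch of that routing, using a stand-in Request class rather than Scrapy's, and plain lists in place of the real Deferred-based machinery:

```python
class Request:
    # Stand-in for scrapy.http.Request; illustration only.
    def __init__(self, url):
        self.url = url

def handle_spider_output(result):
    # Split the spider's yielded objects: Requests would go back to the
    # scheduler, anything else is handed to the item pipeline.
    requests, items = [], []
    for obj in result:
        if isinstance(obj, Request):
            requests.append(obj)
        else:
            items.append(obj)
    return requests, items

reqs, items = handle_spider_output([Request("http://a"), {"title": "x"}])
print(len(reqs), items)
# 1 [{'title': 'x'}]
```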
def _itemproc_finished(self, output: Any, item: Any, response: Response, spider: Spider):

ItemProcessor finished for the given ``item`` and returned ``output``
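Based on that docstring plus Scrapy's documented item-pipeline behaviour (a pipeline stage raises DropItem to discard an item), the branching here can be sketched as follows. The tuple return values are purely illustrative; the real method fires signals (item_dropped, item_error, item_scraped) and logs through the crawler's logformatter:

```python
class DropItem(Exception):
    """Raised by a pipeline stage to discard an item (as in Scrapy)."""

def itemproc_finished(output, item):
    # ``output`` is what the item pipeline produced for ``item``:
    # either the processed item, or an exception explaining failure.
    if isinstance(output, Exception):
        if isinstance(output, DropItem):
            return ("dropped", item)   # Scrapy fires item_dropped here
        return ("error", item)         # Scrapy fires item_error here
    return ("scraped", output)         # Scrapy fires item_scraped here

print(itemproc_finished({"id": 1}, {"id": 1}))
# ('scraped', {'id': 1})
```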
def _log_download_errors(self, spider_failure: Failure, download_failure: Failure, request: Request, spider: Spider) -> Union[Failure, None]:

Log and silence errors that come from the engine (typically download errors that got propagated through here).

spider_failure: the value passed into the errback of self.call_spider()

download_failure: the value passed into _scrape2() from ExecutionEngine._handle_downloader_output() as "result"
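A plausible reading of "log and silence": when the failure reaching the errback is the very download failure the engine already handed over, it is logged once and swallowed (None returned); a different failure means the spider's errback raised something new, which is propagated for normal error handling. A minimal sketch of that rule, with the identity comparison an assumption rather than a quote of Scrapy's exact code:

```python
import logging

logger = logging.getLogger("scraper.sketch")

def log_download_errors(spider_failure, download_failure):
    # The errback simply re-raised the engine's download failure:
    # log it and return None so it stops propagating ("silence").
    if spider_failure is download_failure:
        logger.error("Error downloading request: %r", download_failure)
        return None
    # Otherwise the spider's errback produced a new failure; hand it
    # back so it is handled like any other spider error.
    return spider_failure
```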
def _process_spidermw_output(self, output: Any, request: Request, response: Response, spider: Spider) -> Optional[Deferred]:

Process each Request/Item (given in the output parameter) returned from the given spider
def _scrape(self, result: Union[Response, Failure], request: Request, spider: Spider) -> Deferred:

Handle the downloaded response or failure through the spider callback/errback