class documentation

class CrawlerRunner:

Known subclasses: scrapy.crawler.CrawlerProcess


This is a convenient helper class that keeps track of, manages and runs crawlers inside an already set up :mod:`~twisted.internet.reactor`. The CrawlerRunner object must be instantiated with a :class:`~scrapy.settings.Settings` object. This class shouldn't be needed (since Scrapy is responsible for using it accordingly) unless you are writing scripts that manually handle the crawling process. See :ref:`run-from-script` for an example.
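For example, a minimal sketch of driving the runner from your own script with an externally managed reactor (the spider below is a hypothetical stand-in for any spider class you have defined)::

    import scrapy
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging

    class MySpider(scrapy.Spider):
        # Hypothetical spider, used only for illustration.
        name = "example"
        start_urls = ["https://example.com"]

        def parse(self, response):
            yield {"title": response.css("title::text").get()}

    configure_logging()            # CrawlerRunner does not configure logging for you
    runner = CrawlerRunner()
    d = runner.crawl(MySpider)     # returns a Deferred
    d.addBoth(lambda _: reactor.stop())
    reactor.run()                  # blocks until the crawl finishes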

Method __init__ Initialize the runner with the given settings.
Method crawl Run a crawler with the provided arguments.
Method create_crawler Return a :class:`~scrapy.crawler.Crawler` object.
Method join Returns a deferred that is fired when all managed :attr:`crawlers` have completed their executions.
Method stop Simultaneously stops all the crawling jobs taking place.
Class Variable crawlers Set of crawlers started by :meth:`crawl` and managed by this class.
Instance Variable bootstrap_failed Whether any crawl failed during startup.
Instance Variable settings The settings used to create crawlers.
Instance Variable spider_loader The spider loader instance built from the settings.
Property spiders Deprecated alias for :attr:`spider_loader`.
Static Method _get_spider_loader Get SpiderLoader instance from settings.
Method _crawl Start a crawler and track it until the crawl finishes.
Method _create_crawler Build a :class:`~scrapy.crawler.Crawler` for a spider class or spider name.
Instance Variable _active Set of deferreds for the crawls currently in progress.
Instance Variable _crawlers Backing set for the :attr:`crawlers` property.
def __init__(self, settings=None):

Initialize the runner with the given :class:`~scrapy.settings.Settings` object. A plain ``dict`` (or ``None``) is also accepted and converted to a :class:`~scrapy.settings.Settings` instance.
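As a quick illustration of the accepted ``settings`` values, based on the conversion described above::

    from scrapy.crawler import CrawlerRunner
    from scrapy.settings import Settings

    runner = CrawlerRunner()                                  # None -> default Settings
    runner = CrawlerRunner({"USER_AGENT": "mybot"})           # a plain dict is wrapped in Settings
    runner = CrawlerRunner(Settings({"LOG_LEVEL": "INFO"}))   # a Settings object is used as-is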

def crawl(self, crawler_or_spidercls, *args, **kwargs):

Run a crawler with the provided arguments.

It will call the given Crawler's :meth:`~Crawler.crawl` method, while keeping track of it so it can be stopped later.

If ``crawler_or_spidercls`` isn't a :class:`~scrapy.crawler.Crawler` instance, this method will try to create one using this parameter as the spider class given to it.

Returns a deferred that is fired when the crawling is finished.

:param crawler_or_spidercls: already created crawler, or a spider class or spider's name inside the project to create it
:type crawler_or_spidercls: :class:`~scrapy.crawler.Crawler` instance, :class:`~scrapy.spiders.Spider` subclass or string
:param args: arguments to initialize the spider
:param kwargs: keyword arguments to initialize the spider
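For instance, a sketch showing how the extra arguments reach the spider (the spider and its ``tag`` argument are hypothetical)::

    import scrapy
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner

    class QuotesSpider(scrapy.Spider):
        name = "quotes"
        start_urls = ["https://quotes.toscrape.com"]

        def __init__(self, tag=None, **kwargs):
            super().__init__(**kwargs)   # crawl()'s *args/**kwargs end up here
            self.tag = tag

        def parse(self, response):
            yield {"tag": self.tag, "url": response.url}

    runner = CrawlerRunner()
    d = runner.crawl(QuotesSpider, tag="humor")   # keyword args initialize the spider
    d.addBoth(lambda _: reactor.stop())           # fires when the crawl is finished
    reactor.run()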

def create_crawler(self, crawler_or_spidercls):

Return a :class:`~scrapy.crawler.Crawler` object.

* If ``crawler_or_spidercls`` is a Crawler, it is returned as-is.
* If ``crawler_or_spidercls`` is a Spider subclass, a new Crawler is constructed for it.
* If ``crawler_or_spidercls`` is a string, this function finds a spider with this name in a Scrapy project (using spider loader), then creates a Crawler instance for it.
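A sketch of the three cases (the spider class is hypothetical; the string form assumes a Scrapy project whose spider loader knows the name)::

    import scrapy
    from scrapy.crawler import CrawlerRunner

    class MySpider(scrapy.Spider):
        name = "example"

    runner = CrawlerRunner()

    c1 = runner.create_crawler(MySpider)   # Spider subclass -> new Crawler
    c2 = runner.create_crawler(c1)         # Crawler -> returned as-is
    assert c1 is c2
    # Inside a project with SPIDER_MODULES configured, a spider name works too:
    # c3 = runner.create_crawler("example")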

def join(self):

Returns a deferred that is fired when all managed :attr:`crawlers` have completed their executions.
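For example, running several spiders in the same process and waiting for all of them (both spiders are hypothetical)::

    import scrapy
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner

    class Spider1(scrapy.Spider):
        name = "spider1"
        start_urls = ["https://example.com"]

        def parse(self, response):
            pass

    class Spider2(Spider1):
        name = "spider2"
        start_urls = ["https://example.org"]

    runner = CrawlerRunner()
    runner.crawl(Spider1)
    runner.crawl(Spider2)
    d = runner.join()                     # fires once every crawl above has ended
    d.addBoth(lambda _: reactor.stop())
    reactor.run()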

def stop(self):

Simultaneously stops all the crawling jobs taking place. Returns a deferred that is fired when they have all ended.
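For instance, a sketch that shuts everything down after a timeout (the spider and the 60-second limit are arbitrary choices for illustration)::

    import scrapy
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner

    class SlowSpider(scrapy.Spider):
        name = "slow"                      # hypothetical long-running spider
        start_urls = ["https://example.com"]

        def parse(self, response):
            pass

    runner = CrawlerRunner()
    runner.crawl(SlowSpider)
    reactor.callLater(60, runner.stop)     # ask every crawl to shut down after 60s
    runner.join().addBoth(lambda _: reactor.stop())
    reactor.run()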

crawlers =

Set of :class:`crawlers <scrapy.crawler.Crawler>` started by :meth:`crawl` and managed by this class.

bootstrap_failed: bool =

True if any crawl finished without its spider ever being created, i.e. the crawler failed during startup.

settings =

The :class:`~scrapy.settings.Settings` object this runner was instantiated with; used to create crawlers.

spider_loader =

The spider loader instance built from the settings via :meth:`_get_spider_loader`.

spiders =

Deprecated alias for :attr:`spider_loader`; accessing it issues a deprecation warning.

@staticmethod
def _get_spider_loader(settings):

Get a SpiderLoader instance from the given settings.
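In recent Scrapy versions this helper essentially resolves the ``SPIDER_LOADER_CLASS`` setting and instantiates it; roughly (details vary by release)::

    from scrapy.utils.misc import load_object

    def _get_spider_loader(settings):
        cls_path = settings.get("SPIDER_LOADER_CLASS")
        loader_cls = load_object(cls_path)          # import the configured class
        return loader_cls.from_settings(settings.frozencopy())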

def _crawl(self, crawler, *args, **kwargs):

Start ``crawler``, adding it to :attr:`crawlers` and tracking the returned deferred in :attr:`_active`; both are cleaned up when the crawl finishes.

def _create_crawler(self, spidercls):

Resolve ``spidercls`` through the spider loader if it is a string, then return a new :class:`~scrapy.crawler.Crawler` for it.

_active =

Set of deferreds for the crawls currently in progress.

_crawlers =

Backing set exposed through the :attr:`crawlers` property.