class documentation

class CrawlerRunner:

Known subclasses: scrapy.crawler.CrawlerProcess


This is a convenient helper class that keeps track of, manages and runs crawlers inside an already set up :mod:`~twisted.internet.reactor`. The CrawlerRunner object must be instantiated with a :class:`~scrapy.settings.Settings` object. This class shouldn't be needed (since Scrapy is responsible for using it accordingly) unless you are writing scripts that manually handle the crawling process. See :ref:`run-from-script` for an example.
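For example, a minimal sketch of driving the runner from your own script with an externally managed reactor (the spider below is a hypothetical stand-in for any spider class you have defined)::

    import scrapy
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging

    class MySpider(scrapy.Spider):
        # Hypothetical spider, used only for illustration.
        name = "example"
        start_urls = ["https://example.com"]

        def parse(self, response):
            yield {"title": response.css("title::text").get()}

    configure_logging()            # CrawlerRunner does not configure logging for you
    runner = CrawlerRunner()
    d = runner.crawl(MySpider)     # returns a Deferred
    d.addBoth(lambda _: reactor.stop())
    reactor.run()                  # blocks until the crawl finishes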

Method __init__ Initialize the runner with the given settings.
Method crawl Run a crawler with the provided arguments.
Method create_crawler Return a :class:`~scrapy.crawler.Crawler` object.
Method join Returns a deferred that is fired when all managed :attr:`crawlers` have completed their executions.
Method stop Simultaneously stops all the crawling jobs taking place.
Class Variable crawlers Set of crawlers started by :meth:`crawl` and managed by this class.
Instance Variable bootstrap_failed Whether any crawl failed during startup.
Instance Variable settings The settings used to create crawlers.
Instance Variable spider_loader The spider loader instance built from the settings.
Property spiders Deprecated alias for :attr:`spider_loader`.
Static Method _get_spider_loader Get SpiderLoader instance from settings.
Method _crawl Start a crawler and track it until the crawl finishes.
Method _create_crawler Build a :class:`~scrapy.crawler.Crawler` for a spider class or spider name.
Instance Variable _active Set of deferreds for the crawls currently in progress.
Instance Variable _crawlers Backing set for the :attr:`crawlers` property.
def __init__(self, settings=None):

Initialize the runner with the given :class:`~scrapy.settings.Settings` object. A plain ``dict`` (or ``None``) is also accepted and converted to a :class:`~scrapy.settings.Settings` instance.
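As a quick illustration of the accepted ``settings`` values, based on the conversion described above::

    from scrapy.crawler import CrawlerRunner
    from scrapy.settings import Settings

    runner = CrawlerRunner()                                  # None -> default Settings
    runner = CrawlerRunner({"USER_AGENT": "mybot"})           # a plain dict is wrapped in Settings
    runner = CrawlerRunner(Settings({"LOG_LEVEL": "INFO"}))   # a Settings object is used as-is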

def crawl(self, crawler_or_spidercls, *args, **kwargs):

Run a crawler with the provided arguments.

It will call the given Crawler's :meth:`~Crawler.crawl` method, while keeping track of it so it can be stopped later.

If ``crawler_or_spidercls`` isn't a :class:`~scrapy.crawler.Crawler` instance, this method will try to create one using this parameter as the spider class given to it.

Returns a deferred that is fired when the crawling is finished.

:param crawler_or_spidercls: already created crawler, or a spider class or spider's name inside the project to create it
:type crawler_or_spidercls: :class:`~scrapy.crawler.Crawler` instance, :class:`~scrapy.spiders.Spider` subclass or string
:param args: arguments to initialize the spider
:param kwargs: keyword arguments to initialize the spider
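For instance, a sketch showing how the extra arguments reach the spider (the spider and its ``tag`` argument are hypothetical)::

    import scrapy
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner

    class QuotesSpider(scrapy.Spider):
        name = "quotes"
        start_urls = ["https://quotes.toscrape.com"]

        def __init__(self, tag=None, **kwargs):
            super().__init__(**kwargs)   # crawl()'s *args/**kwargs end up here
            self.tag = tag

        def parse(self, response):
            yield {"tag": self.tag, "url": response.url}

    runner = CrawlerRunner()
    d = runner.crawl(QuotesSpider, tag="humor")   # keyword args initialize the spider
    d.addBoth(lambda _: reactor.stop())           # fires when the crawl is finished
    reactor.run()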

def create_crawler(self, crawler_or_spidercls):

Return a :class:`~scrapy.crawler.Crawler` object.

* If ``crawler_or_spidercls`` is a Crawler, it is returned as-is.
* If ``crawler_or_spidercls`` is a Spider subclass, a new Crawler is constructed for it.
* If ``crawler_or_spidercls`` is a string, this function finds a spider with this name in a Scrapy project (using spider loader), then creates a Crawler instance for it.
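A sketch of the three cases (the spider class is hypothetical; the string form assumes a Scrapy project whose spider loader knows the name)::

    import scrapy
    from scrapy.crawler import CrawlerRunner

    class MySpider(scrapy.Spider):
        name = "example"

    runner = CrawlerRunner()

    c1 = runner.create_crawler(MySpider)   # Spider subclass -> new Crawler
    c2 = runner.create_crawler(c1)         # Crawler -> returned as-is
    assert c1 is c2
    # Inside a project with SPIDER_MODULES configured, a spider name works too:
    # c3 = runner.create_crawler("example")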

def join(self):

Returns a deferred that is fired when all managed :attr:`crawlers` have completed their executions.
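For example, running several spiders in the same process and waiting for all of them (both spiders are hypothetical)::

    import scrapy
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner

    class Spider1(scrapy.Spider):
        name = "spider1"
        start_urls = ["https://example.com"]

        def parse(self, response):
            pass

    class Spider2(Spider1):
        name = "spider2"
        start_urls = ["https://example.org"]

    runner = CrawlerRunner()
    runner.crawl(Spider1)
    runner.crawl(Spider2)
    d = runner.join()                     # fires once every crawl above has ended
    d.addBoth(lambda _: reactor.stop())
    reactor.run()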

def stop(self):

Simultaneously stops all the crawling jobs taking place. Returns a deferred that is fired when they have all ended.
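For instance, a sketch that shuts everything down after a timeout (the spider and the 60-second limit are arbitrary choices for illustration)::

    import scrapy
    from twisted.internet import reactor
    from scrapy.crawler import CrawlerRunner

    class SlowSpider(scrapy.Spider):
        name = "slow"                      # hypothetical long-running spider
        start_urls = ["https://example.com"]

        def parse(self, response):
            pass

    runner = CrawlerRunner()
    runner.crawl(SlowSpider)
    reactor.callLater(60, runner.stop)     # ask every crawl to shut down after 60s
    runner.join().addBoth(lambda _: reactor.stop())
    reactor.run()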

crawlers =

Set of :class:`crawlers <scrapy.crawler.Crawler>` started by :meth:`crawl` and managed by this class.

bootstrap_failed: bool =

True if any crawl finished without its spider ever being created, i.e. the crawler failed during startup.

settings =

The :class:`~scrapy.settings.Settings` object this runner was instantiated with; used to create crawlers.

spider_loader =

The spider loader instance built from the settings via :meth:`_get_spider_loader`.

spiders =

Deprecated alias for :attr:`spider_loader`; accessing it issues a deprecation warning.

@staticmethod
def _get_spider_loader(settings):

Get a SpiderLoader instance from the given settings.
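In recent Scrapy versions this helper essentially resolves the ``SPIDER_LOADER_CLASS`` setting and instantiates it; roughly (details vary by release)::

    from scrapy.utils.misc import load_object

    def _get_spider_loader(settings):
        cls_path = settings.get("SPIDER_LOADER_CLASS")
        loader_cls = load_object(cls_path)          # import the configured class
        return loader_cls.from_settings(settings.frozencopy())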

def _crawl(self, crawler, *args, **kwargs):

Start ``crawler``, adding it to :attr:`crawlers` and tracking the returned deferred in :attr:`_active`; both are cleaned up when the crawl finishes.

def _create_crawler(self, spidercls):

Resolve ``spidercls`` through the spider loader if it is a string, then return a new :class:`~scrapy.crawler.Crawler` for it.

_active =

Set of deferreds for the crawls currently in progress.

_crawlers =

Backing set exposed through the :attr:`crawlers` property.