This is a convenient helper class that keeps track of, manages and runs crawlers inside an already set up :mod:`~twisted.internet.reactor`. The CrawlerRunner object must be instantiated with a :class:`~scrapy.settings.Settings` object. This class shouldn't be needed (since Scrapy is responsible for using it accordingly) unless you are writing scripts that manually handle the crawling process. See :ref:`run-from-script` for an example.
- Method ``__init__``: Undocumented
- Method ``crawl``: Run a crawler with the provided arguments.
- Method ``create_crawler``: Return a :class:`~scrapy.crawler.Crawler` object.
- Method ``join``: Return a deferred that is fired when all managed crawlers have completed their executions.
- Method ``stop``: Simultaneously stop all the crawling jobs taking place.
- Class Variable ``crawlers``: Undocumented
- Instance Variable ``bootstrap_failed``: Undocumented
- Instance Variable ``settings``: Undocumented
- Instance Variable ``spider_loader``: Undocumented
- Property ``spiders``: Undocumented
- Static Method ``_get_spider_loader``: Get a SpiderLoader instance from settings.
- Method ``_crawl``: Undocumented
- Method ``_create_crawler``: Undocumented
- Instance Variable ``_active``: Undocumented
- Instance Variable ``_crawlers``: Undocumented
Run a crawler with the provided arguments.

It will call the given Crawler's :meth:`~Crawler.crawl` method, while keeping track of it so it can be stopped later.

If ``crawler_or_spidercls`` isn't a :class:`~scrapy.crawler.Crawler` instance, this method will try to create one using this parameter as the spider class given to it.

Returns a deferred that is fired when the crawling is finished.

:param crawler_or_spidercls: already created crawler, or a spider class or spider's name inside the project to create it
:type crawler_or_spidercls: :class:`~scrapy.crawler.Crawler` instance, :class:`~scrapy.spiders.Spider` subclass or string
:param args: arguments to initialize the spider
:param kwargs: keyword arguments to initialize the spider
Return a :class:`~scrapy.crawler.Crawler` object.

* If ``crawler_or_spidercls`` is a Crawler, it is returned as-is.
* If ``crawler_or_spidercls`` is a Spider subclass, a new Crawler is constructed for it.
* If ``crawler_or_spidercls`` is a string, this function finds a spider with this name in a Scrapy project (using spider loader), then creates a Crawler instance for it.
Returns a deferred that is fired when all managed :attr:`crawlers` have completed their executions.
Simultaneously stops all the crawling jobs taking place. Returns a deferred that is fired when they have all ended.