class documentation

class BaseScheduler: (source)

Known subclasses: scrapy.core.scheduler.Scheduler


The scheduler component is responsible for storing requests received from the engine, and feeding them back upon request (also to the engine).

The original sources of said requests are:

* Spider: ``start_requests`` method, requests created for URLs in the ``start_urls`` attribute, request callbacks
* Spider middleware: ``process_spider_output`` and ``process_spider_exception`` methods
* Downloader middleware: ``process_request``, ``process_response`` and ``process_exception`` methods

The order in which the scheduler returns its stored requests (via the ``next_request`` method) plays a great part in determining the order in which those requests are downloaded.

The methods defined in this class constitute the minimal interface that the Scrapy engine will interact with.
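As an illustration only (not part of Scrapy itself), a minimal in-memory implementation of that interface might look like the sketch below; the class name and attributes are made up for the example:

    from collections import deque
    from typing import Optional

    from scrapy import Request
    from scrapy.core.scheduler import BaseScheduler


    class SimpleFifoScheduler(BaseScheduler):
        """Toy scheduler that keeps requests in an in-memory FIFO queue."""

        def __init__(self) -> None:
            self.queue: "deque[Request]" = deque()

        def enqueue_request(self, request: Request) -> bool:
            # Accept everything; a real scheduler would usually consult a
            # dupefilter here and return False for duplicates.
            self.queue.append(request)
            return True

        def next_request(self) -> Optional[Request]:
            # FIFO order: requests come back roughly in the order the
            # engine handed them over.
            return self.queue.popleft() if self.queue else None

        def has_pending_requests(self) -> bool:
            return len(self.queue) > 0

Such a class could be enabled by pointing the ``SCHEDULER`` setting at its import path; ``from_crawler``, ``open`` and ``close`` fall back to the base-class defaults here.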

Class Method from_crawler Factory method that receives the current :class:`~scrapy.crawler.Crawler` object as its argument.
Method close Called when the spider is closed by the engine. It receives the reason the crawl finished as an argument and is a good place to run cleanup code.
Method enqueue_request Process a request received by the engine.
Method has_pending_requests Return ``True`` if the scheduler has enqueued requests, ``False`` otherwise.
Method next_request Return the next :class:`~scrapy.http.Request` to be processed, or ``None`` to indicate that there are no requests to be considered ready at the moment.
Method open Called when the spider is opened by the engine. It receives the spider instance as an argument and is a good place to run initialization code.
@classmethod
def from_crawler(cls, crawler: Crawler): (source)

Factory method that receives the current :class:`~scrapy.crawler.Crawler` object as its argument.
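For instance, a subclass might use this hook to read settings and keep a reference to the stats collector; the setting name and attributes below are hypothetical, not part of the Scrapy API:

    @classmethod
    def from_crawler(cls, crawler):
        scheduler = cls()
        # Illustrative setting name; getint() falls back to the default
        # when the setting is not defined.
        scheduler.max_queue_size = crawler.settings.getint("MYPROJECT_MAX_QUEUE_SIZE", 10000)
        scheduler.stats = crawler.stats
        return scheduler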

def close(self, reason: str) -> Optional[Deferred]: (source)

Called when the spider is closed by the engine. It receives the reason the crawl finished as an argument and is a good place to run cleanup code.

:param reason: a string describing why the spider was closed
:type reason: :class:`str`
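A sketch of a typical override, building on the in-memory queue and ``stats`` attribute from the earlier examples (both assumptions of this example, not part of the base class):

    def close(self, reason: str) -> Optional[Deferred]:
        # ``reason`` is a short string such as "finished" or "shutdown".
        # Record how many requests were still pending; return a Deferred
        # only if the cleanup is asynchronous, otherwise None is enough.
        if getattr(self, "stats", None):
            self.stats.set_value("myscheduler/pending_at_close", len(self.queue))
        return None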

@abstractmethod
def enqueue_request(self, request: Request) -> bool: (source)

Process a request received by the engine. Return ``True`` if the request is stored correctly, ``False`` otherwise. If ``False``, the engine will fire a ``request_dropped`` signal, and will not make further attempts to schedule the request at a later time. For reference, the default Scrapy scheduler returns ``False`` when the request is rejected by the dupefilter.
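A sketch that mirrors that contract, assuming ``self.df`` is a dupefilter (for example an ``RFPDupeFilter``) created during ``open`` or ``from_crawler``:

    def enqueue_request(self, request: Request) -> bool:
        # Reject duplicates unless the request opts out via dont_filter;
        # returning False makes the engine fire request_dropped and the
        # request will not be rescheduled later.
        if not request.dont_filter and self.df.request_seen(request):
            return False
        self.queue.append(request)
        return True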

@abstractmethod
def has_pending_requests(self) -> bool: (source)

Return ``True`` if the scheduler has enqueued requests, ``False`` otherwise.

@abstractmethod
def next_request(self) -> Optional[Request]: (source)

Return the next :class:`~scrapy.http.Request` to be processed, or ``None`` to indicate that there are no requests to be considered ready at the moment. Returning ``None`` implies that no request from the scheduler will be sent to the downloader in the current reactor cycle. The engine will continue calling ``next_request`` until ``has_pending_requests`` is ``False``.
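For example, a FIFO-based implementation (building on the earlier sketch) where returning ``None`` simply means nothing is ready yet:

    def next_request(self) -> Optional[Request]:
        if not self.queue:
            return None  # nothing ready this reactor cycle; not an error
        request = self.queue.popleft()
        if getattr(self, "stats", None):
            self.stats.inc_value("myscheduler/dequeued")  # illustrative stat key
        return request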

def open(self, spider: Spider) -> Optional[Deferred]: (source)

Called when the spider is opened by the engine. It receives the spider instance as an argument and is a good place to run initialization code.

:param spider: the spider object for the current crawl
:type spider: :class:`~scrapy.spiders.Spider`
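A sketch of a typical override, using the in-memory queue from the earlier examples; synchronous initialization can simply return ``None``, while asynchronous setup would return a ``Deferred``:

    def open(self, spider: Spider) -> Optional[Deferred]:
        # Keep a reference to the spider and build the queue used by the
        # other methods in these sketches.
        self.spider = spider
        self.queue = deque()
        return None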