Stats Collection

The Stats Collector is always available, so you can import it in your module and use its API (to increment or set new stat keys) regardless of whether stats collection is enabled. If it’s disabled, the API still works but collects nothing. This is aimed at simplifying Stats Collector usage: you should need no more than one line of code to collect a stat in your spider, Scrapy extension, or whatever code you’re using the Stats Collector from.
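As a rough illustration of that one-line usage, the sketch below increments a counter. The helper name count_item and the key custom_count are hypothetical; stats is assumed to be the Stats Collector object your component holds, and inc_value is the collector method that increments a key.

```python
def count_item(stats):
    """Increment a custom counter on the given Stats Collector.

    `count_item` and the key name are illustrative assumptions; `stats` is
    assumed to be a Scrapy Stats Collector.
    """
    # The single line that actually collects the stat.
    stats.inc_value("custom_count")
```

Because the call goes through the same API whether or not collection is enabled, the calling code needs no conditional around it.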

The Stats Collector is also very efficient when enabled, and its overhead is almost unnoticeable when disabled.

The Stats Collector keeps a stats table per open spider which is automatically opened when the spider is opened, and closed when the spider is closed.

Access the stats collector through the stats attribute. Here is an example of an extension that accesses stats:
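A minimal sketch of such an extension follows. It assumes the stats attribute lives on the crawler object that Scrapy passes to from_crawler; the class name is illustrative.

```python
class ExtensionThatAccessStats:
    """Extension that keeps a reference to the crawler's Stats Collector."""

    def __init__(self, stats):
        self.stats = stats

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls from_crawler when it builds the extension; the
        # collector is read from the crawler's `stats` attribute.
        return cls(crawler.stats)
```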

Set stat value:
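A small sketch, assuming stats is the collector obtained as described above. The helper name record_hostname and the key hostname are illustrative; set_value is the collector method that stores a value under a key.

```python
import socket


def record_hostname(stats):
    # Store (or overwrite) the value kept under the "hostname" key.
    stats.set_value("hostname", socket.gethostname())
```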

Set stat value only if greater than previous:
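A sketch using the collector's max_value method, again assuming stats is the collector. The helper name and the key max_items_scraped are hypothetical.

```python
def record_max_items(stats, value):
    # The stored value is replaced only when `value` is greater than it
    # (or when the key has not been set yet).
    stats.max_value("max_items_scraped", value)
```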

Set stat value only if lower than previous:
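The counterpart sketch uses min_value. As before, stats is assumed to be the collector, and the helper name and the key min_free_memory_percent are illustrative.

```python
def record_min_free_memory(stats, value):
    # The stored value is replaced only when `value` is lower than it
    # (or when the key has not been set yet).
    stats.min_value("min_free_memory_percent", value)
```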

Get stat value:
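A sketch of reading a single key with get_value, assuming stats is the collector. The helper name and the key custom_count are hypothetical; the default argument is what get_value returns when the key has never been set.

```python
def read_custom_count(stats):
    # Fall back to 0 when "custom_count" has not been recorded yet.
    return stats.get_value("custom_count", default=0)
```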

Get all stats:
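A sketch of reading every collected stat with get_stats, which returns a dict of stat keys and values. The helper name snapshot_stats is an assumption, and copying the dict is a design choice of this sketch rather than something the API requires.

```python
def snapshot_stats(stats):
    # Copy the mapping so later updates to the collector do not change
    # the snapshot held by the caller.
    return dict(stats.get_stats())
```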

Available Stats Collectors

Besides the basic StatsCollector, there are other Stats Collectors available in Scrapy which extend it. You can select which Stats Collector to use through the STATS_CLASS setting. The default Stats Collector used is the MemoryStatsCollector.
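For instance, a project's settings module could select a collector as sketched below. The value is the import path of the collector class, here the DummyStatsCollector listed in this section; placing the line in settings.py is an assumption about the project layout.

```python
# settings.py (assumed location of the project's settings)
# Import path of the Stats Collector class that Scrapy should use.
STATS_CLASS = "scrapy.statscollectors.DummyStatsCollector"
```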

MemoryStatsCollector

class scrapy.statscollectors.MemoryStatsCollector

A simple stats collector that keeps the stats of the last scraping run (for each spider) in memory after the spider is closed. The stats can be accessed through the spider_stats attribute, which is a dict keyed by spider domain name.
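A hedged sketch of reading those stored stats. It assumes the attribute in question is spider_stats on a MemoryStatsCollector instance; the helper name stats_of_last_run is hypothetical.

```python
def stats_of_last_run(collector, spider_name):
    # `spider_stats` is assumed to map each spider's name to the stats
    # dict of its last run; return an empty dict for unknown spiders.
    return collector.spider_stats.get(spider_name, {})
```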

This is the default Stats Collector used in Scrapy.

DummyStatsCollector

class scrapy.statscollectors.DummyStatsCollector