Spiders Contracts
This allows you to test each callback of your spider by hardcoding a sample URL and checking various constraints for how the callback processes the response. Each contract is prefixed with an @ and included in the docstring. See the following example:
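(The URL, item counts and field names in this sketch are illustrative placeholders.)

def parse(self, response):
    """This function parses a sample response. Some contracts are mingled
    with this docstring.

    @url http://www.example.com/foo/bar
    @cb_kwargs {"arg1": "value1", "arg2": "value2"}
    @returns items 1 16
    @returns requests 0 0
    @scrapes Title Author Year Price
    """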
This callback is tested using the following built-in contracts:
class scrapy.contracts.default.UrlContract
This contract (@url) sets the sample URL used when checking other contract conditions for this spider. This contract is mandatory. All callbacks lacking this contract are ignored when running the checks:
@url url
class scrapy.contracts.default.CallbackKeywordArgumentsContract
This contract (@cb_kwargs) sets the cb_kwargs attribute for the sample request. It must be a valid JSON dictionary.
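For example (argument name and value are illustrative):

@cb_kwargs {"arg1": "some value"}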
class scrapy.contracts.default.ReturnsContract
This contract (@returns) sets lower and upper bounds for the items and requests returned by the spider. The upper bound is optional:
@returns item(s)|request(s) [min [max]]
class scrapy.contracts.default.ScrapesContract
This contract (@scrapes) checks that all the items returned by the callback have the specified fields:
@scrapes field_1 field_2 ...
Use the check command to run the contract checks.
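For example, assuming a spider named example, the checks can be run with:

scrapy check example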
If you find you need more power than the built-in Scrapy contracts, you can create and load your own contracts in the project by using the SPIDER_CONTRACTS setting:
SPIDER_CONTRACTS = {
    'myproject.contracts.ItemValidate': 10,
}
Each contract must inherit from Contract and can override three methods:
class scrapy.contracts.Contract(method, *args)
Parameters
method (collections.abc.Callable) – callback function to which the contract is associated
args (list) – list of arguments passed into the docstring (whitespace separated)
adjust_request_args(args)
This receives a dict as an argument containing default arguments for the request object. Request is used by default, but this can be changed with the request_cls attribute. If multiple contracts in a chain have this attribute defined, the last one is used. It must return the same dict or a modified version of it.
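A minimal sketch of this hook, assuming a hypothetical @form_data contract that swaps the request class for FormRequest via request_cls:

from scrapy import FormRequest
from scrapy.contracts import Contract


class FormDataContract(Contract):
    """Hypothetical contract that turns the sample request into a POST
    FormRequest built from key=value pairs listed in the docstring.
    @form_data user=foo token=bar
    """

    name = 'form_data'
    request_cls = FormRequest

    def adjust_request_args(self, args):
        # args holds the keyword arguments used to build the sample request;
        # return the same dict or a modified version of it
        args['formdata'] = dict(pair.split('=', 1) for pair in self.args)
        return args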
pre_process(response)
This allows hooking in various checks on the response received from the sample request, before it is passed to the callback.
post_process(output)
This allows processing the output of the callback. Iterators are converted to lists before being passed to this hook.
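A minimal sketch of this hook, assuming a hypothetical @min_items contract that raises ContractFail (described just below) when the callback produces too few results:

from scrapy.contracts import Contract
from scrapy.exceptions import ContractFail


class MinItemsContract(Contract):
    """Hypothetical contract that fails when the callback yields fewer
    results than requested.
    @min_items 1
    """

    name = 'min_items'

    def post_process(self, output):
        # output is the callback output, already converted to a list
        minimum = int(self.args[0])
        if len(output) < minimum:
            raise ContractFail(f'expected at least {minimum} results, got {len(output)}')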
Raise ContractFail from pre_process or post_process if expectations are not met:
class scrapy.exceptions.ContractFail
Error raised in case of a failing contract
Here is a demo contract which checks the presence of a custom header in the response received:
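(The contract name @has_header and the X-CustomHeader header in this sketch are illustrative.)

from scrapy.contracts import Contract
from scrapy.exceptions import ContractFail


class HasHeaderContract(Contract):
    """Demo contract which checks the presence of a custom header
    @has_header X-CustomHeader
    """

    name = 'has_header'

    def pre_process(self, response):
        # self.args holds the header names listed after @has_header
        for header in self.args:
            if header not in response.headers:
                raise ContractFail(f'{header} not present')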
Detecting check runs
When scrapy check is running, the SCRAPY_CHECK environment variable is set to the true string. You can use os.environ to perform any change to your spiders or your settings when scrapy check is used:
import logging
import os

import scrapy


class ExampleSpider(scrapy.Spider):
    name = 'example'

    def __init__(self):
        if os.environ.get('SCRAPY_CHECK'):
            logging.info('this is a scrapy check run')