- Use request.meta instead of response.meta in the middleware
- Consider a response to be from Crawlera only if it contains the X-Crawlera-Version header
- Build the documentation in Travis CI and fail on documentation issues
- Update matrix of tests
- Add more stats to better understand the middleware's internal state.
- Log a warning when the https:// protocol is used.
- Default to the http:// protocol when none is provided, and log a warning about it.
- Fix duplicated requests when the response is not from Crawlera; this was causing an infinite loop of retries when dont_filter=True.
- Enable crawlera on demand by setting
- Remove username and password from settings, since they were removed from Crawlera.
- Include affected spider in logs.
- Handle situations where Crawlera is restarted and replies with 407s for a few minutes, by retrying the requests with exponential backoff.
- Correctly check for bans in Crawlera (jobs will not get banned on non-ban 503s).
- Exponential backoff when Crawlera doesn't have proxies available.
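The backoff behaviour in the entries above can be sketched as a standard exponential-backoff delay. This is an illustrative sketch only; the middleware's actual constants and jitter strategy may differ:

```python
import random

def backoff_delay(attempt, base=1.0, cap=60.0, jitter=0.5):
    # Double the delay on every failed attempt, cap it, then add
    # random jitter so many retrying requests don't fire in lockstep.
    delay = min(cap, base * (2 ** attempt))
    return delay + random.uniform(0, jitter * delay)
```

With these (assumed) defaults the delays grow 1s, 2s, 4s, ... until the 60s cap.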
- Fix dont_proxy=False header disabling Crawlera when it is enabled.
- Remove X-Crawlera-* headers when Crawlera is disabled.
- Introduction of the DEFAULT_CRAWLERA_HEADERS setting.
- Use CONNECT method to contact Crawlera proxy.
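Taken together, the entries above shape per-project and per-request configuration. A minimal settings sketch, assuming the `CRAWLERA_DEFAULT_HEADERS` spelling used in the scrapy-crawlera docs and the `dont_proxy` meta key (verify both against your installed version):

```python
# settings.py fragment (sketch; setting names per the scrapy-crawlera docs)
CRAWLERA_ENABLED = True
CRAWLERA_APIKEY = "<your API key>"

# Default headers merged into every proxied request
# (the DEFAULT_CRAWLERA_HEADERS change above).
CRAWLERA_DEFAULT_HEADERS = {
    "X-Crawlera-Profile": "desktop",
}

# Per-request opt-out: requests carrying dont_proxy in meta bypass
# Crawlera, and their X-Crawlera-* headers are then stripped.
# yield scrapy.Request(url, meta={"dont_proxy": True})
```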
- Trigger PyPI deployments after changes made to TOXENV in v1.2.3
- Multiple documentation fixes
- Test scrapy-crawlera on combinations of software used by Scrapinghub stacks
- Fix Crawlera error stats key in Python 3.
- Add support for Python 3.6.
- Fix release date in README.
- Recommend middleware order to be 610, to run before RedirectMiddleware.
- Change the default download timeout to 190s (3 minutes 10 seconds) instead of 1800s (30 minutes).
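The recommended order and timeout can be expressed in a project's settings.py. A sketch assuming the documented `scrapy_crawlera.CrawleraMiddleware` path; note that Scrapy's built-in RedirectMiddleware sits at order 600:

```python
# settings.py sketch: order 610 places the middleware just after
# RedirectMiddleware (600) on the request path, so its
# process_response runs first for incoming responses.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_crawlera.CrawleraMiddleware": 610,
}

# The middleware defaults the download timeout to 190s; the setting
# below (per the scrapy-crawlera docs) makes that explicit.
CRAWLERA_DOWNLOAD_TIMEOUT = 190
```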
- Test and advertise Python 3 compatibility.
- Clear Scrapy DNS cache for proxy URL in case of connection errors.
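The DNS-cache entry above can be sketched with a dict-like cache (Scrapy keeps one in `scrapy.resolver.dnscache`). The function name and sample host here are illustrative, not the middleware's actual internals:

```python
from urllib.parse import urlparse

def clear_proxy_dns(proxy_url, dnscache):
    # Drop the cached address for the proxy host so the next request
    # resolves it again after a connection error.
    host = urlparse(proxy_url).hostname
    dnscache.pop(host, None)

cache = {"proxy.crawlera.com": "64.58.117.1"}
clear_proxy_dns("http://proxy.crawlera.com:8010", cache)
# cache no longer holds the stale entry
```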
- Distribute plugin as universal wheel.