
Scrapy is using HTTP 1.0 by default

Developer https://www.devze.com 2023-04-03 22:57 Source: web

Related topics: python, scrapy

It looks like Scrapy uses HTTP 1.0 by default. Is there a setting to make it use HTTP 1.1 to send requests?

Thanks.

From http://dev.scrapy.org/wiki/ScrapyRecipes:

How to spoof requests to be HTTP 1.1 compliant

You can do this by overriding the Scrapy HTTP Client Factory with the following (undocumented) setting:

DOWNLOADER_HTTPCLIENTFACTORY = 'myproject.downloader.HTTPClientFactory'

Here's a possible implementation of the myproject.downloader module:

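# myproject/downloader.py
from scrapy.core.downloader.webclient import ScrapyHTTPClientFactory, ScrapyHTTPPageGetter


class PageGetter(ScrapyHTTPPageGetter):

    def sendCommand(self, command, path):
        # Write the request line with HTTP/1.1 instead of the default HTTP/1.0.
        # Bytes formatting works on Python 2 and on Python 3.5+, where the
        # transport accepts only bytes.
        self.transport.write(b'%s %s HTTP/1.1\r\n' % (command, path))


class HTTPClientFactory(ScrapyHTTPClientFactory):

    # Use the page getter above for every connection this factory opens.
    protocol = PageGetter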
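
To see what the override changes on the wire, here is a minimal standalone sketch that needs neither Scrapy nor Twisted. RecordingTransport, HTTP10Getter, HTTP11Getter and request_line are hypothetical names invented for this illustration; they only mimic the idea of a sendCommand method writing the request line to a transport, and are not part of either library.

```python
class RecordingTransport:
    """Hypothetical stand-in for a network transport: it only records writes."""

    def __init__(self):
        self.buffer = b""

    def write(self, data):
        self.buffer += data


class HTTP10Getter:
    """Mimics the default behaviour: the request line advertises HTTP/1.0."""

    def __init__(self, transport):
        self.transport = transport

    def sendCommand(self, command, path):
        self.transport.write(b"%s %s HTTP/1.0\r\n" % (command, path))


class HTTP11Getter(HTTP10Getter):
    """Same client, but sendCommand is overridden to advertise HTTP/1.1."""

    def sendCommand(self, command, path):
        self.transport.write(b"%s %s HTTP/1.1\r\n" % (command, path))


def request_line(getter_cls, command=b"GET", path=b"/"):
    """Return the bytes a getter class writes for one request line."""
    transport = RecordingTransport()
    getter_cls(transport).sendCommand(command, path)
    return transport.buffer


if __name__ == "__main__":
    print(request_line(HTTP10Getter))
    print(request_line(HTTP11Getter))
```

Only the version token in the request line differs between the two classes. That is why the recipe calls this spoofing: the rest of the client's behaviour is untouched.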