Scrapy 301 and 302 Redirects: Causes and Solutions
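According to the HTTP standard, a response is successful when its status code is in the 2xx range (200 to 299). Codes in the 3xx range, including 301 and 302, are redirects rather than successes.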
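To make the status-code ranges concrete, here is a minimal sketch in plain Python. It uses only the standard library and is not part of Scrapy; the helper names `status_class` and `is_success` are hypothetical, chosen just for this illustration.

```python
from http import HTTPStatus

# Hypothetical helpers for illustration only; they are not Scrapy APIs.
_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the class of an HTTP status code based on its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return _CLASSES[code // 100]


def is_success(code: int) -> bool:
    """Only 2xx codes count as a successful response."""
    return 200 <= code <= 299


if __name__ == "__main__":
    for code in (200, 301, 302):
        # HTTPStatus(code).phrase is the standard reason phrase for the code.
        print(code, HTTPStatus(code).phrase, status_class(code), is_success(code))
```

Under this classification, 301 and 302 fall in the redirection class, so a response with either code does not by itself deliver the requested page.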
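While a Scrapy spider is running, the target site may respond with 301 or 302 instead of the page you wanted. When that happens, the request has effectively failed. For example: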
2019-02-13 17:18:32 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2019-02-13 17:18:33 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to ://www.baidu.com/search/?lm=0&rn=10&pn=0&fr=search&ie=gbk&word=%D3%E2%C6%DA%C1%CB%D3%D0%BA%DC%C3%B4%CE%A3%BA%A6> from ://zhidao.baidu.com/search?lm=0&rn=10&pn=0&fr=search&ie=gbk&word=%D3%E2%C6%DA%C1%CB%D3%D0%BA%DC%C3%B4%CE%A3%BA%A6>
2019-02-13 17:18:36 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to ://www.baidu.com/forbiddenip/forbidden.html> from ://www.baidu.com/search/?lm=0&rn=10&pn=0&fr=search&ie=gbk&word=%D3%E2%C6%DA%C1%CB%D3%D0%BA%DC%C3%B4%CE%A3%BA%A6>
2019-02-13 17:18:40 [scrapy.core.engine] DEBUG: Crawled (200) ://www.baidu.com/forbiddenip/forbidden.html> (referer: None)
2019-02-13 17:18:41 [scrapy.core.engine] INFO: Closing spider (finished)
2019-02-13 17:18:41 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 2295