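Teacher, when I crawl Lagou (lagou.com) I set a User-Agent and a cookie, but I still get 302 redirects. (The pages still open fine in my browser, so I don't think my IP has been blocked.) Why is this happening?
Additional console output: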
2019-01-30 09:32:27 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/> (referer: None)
2019-01-30 09:32:28 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1> from <GET https://www.lagou.com/jobs/5455295.html>
2019-01-30 09:32:29 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1> (referer: https://www.lagou.com/)
2019-01-30 09:32:29 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2960414.html&t=1548811961&_ti=1> from <GET https://www.lagou.com/jobs/2960414.html>
2019-01-30 09:33:48 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1>
{'crawl_time': datetime.datetime(2019, 1, 30, 9, 32, 29, 341141), 'url': 'https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1', 'url_object_id': 'aef6e325d316968e50cd5cc87009e0f4'}
2019-01-30 09:33:50 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/2856806.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2019-01-30 09:33:52 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2960511.html&t=1548811961&_ti=1> from <GET https://www.lagou.com/jobs/2960511.html>
2019-01-30 09:33:54 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5357786.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2019-01-30 09:33:56 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5474115.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:33:57 [scrapy.extensions.logstats] INFO: Crawled 2 pages (at 2 pages/min), scraped 1 items (at 1 items/min)
2019-01-30 09:33:59 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5528540.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:34:02 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5197365.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:34:04 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/2844860.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:34:06 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5497110.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:34:10 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F3165552.html&t=1548812060&_ti=1> from <GET https://www.lagou.com/jobs/3165552.html>
2019-01-30 09:34:13 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5481459.html&t=1548812060&_ti=1> from <GET https://www.lagou.com/jobs/5481459.html>
2019-01-30 09:34:15 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5317219.html&t=1548812060&_ti=1> from <GET https://www.lagou.com/jobs/5317219.html>
2019-01-30 09:34:17 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F3189780.html&t=1548812060&_ti=1> from <GET https://www.lagou.com/jobs/3189780.html>
2019-01-30 09:34:20 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F3111672.html&t=1548812069&_ti=1> from <GET https://www.lagou.com/jobs/3111672.html>
2019-01-30 09:34:22 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2960511.html&t=1548811961&_ti=1> (referer: https://www.lagou.com/)
2019-01-30 09:34:24 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2960414.html&t=1548811961&_ti=1> (referer: https://www.lagou.com/)
2019-01-30 09:34:26 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2832873.html&t=1548812069&_ti=1> from <GET https://www.lagou.com/jobs/2832873.html>
2019-01-30 09:34:27 [scrapy.extensions.logstats] INFO: Crawled 4 pages (at 2 pages/min), scraped 1 items (at 0 items/min)
2019-01-30 09:34:29 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F3071732.html&t=1548812078&_ti=1> from <GET https://www.lagou.com/jobs/3071732.html>
2019-01-30 09:34:32 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5474115.html&t=1548812078&_ti=1> from <GET https://www.lagou.com/jobs/5474115.html>
2019-01-30 09:34:34 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5357786.html&t=1548812078&_ti=1> from <GET https://www.lagou.com/jobs/5357786.html>
2019-01-30 09:34:36 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2769395.html&t=1548812078&_ti=1> from <GET https://www.lagou.com/jobs/2769395.html>
Yes, the requests are being redirected here. Try logging in again and see whether the problem still occurs.
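If logging in again helps, the fresh cookie still has to reach the spider. The thread does not show how the UA and cookie were added, so what follows is only a sketch under assumptions: the name `LAGOU_CUSTOM_SETTINGS`, the placeholder strings, and the 3-second delay are all made up for illustration. One thing that may be worth checking is `COOKIES_ENABLED`: when Scrapy's cookie middleware is on, it manages the `Cookie` header itself, so a hand-written `Cookie` header may not be sent as written.

```python
# A sketch of per-spider overrides, NOT the course's actual configuration.
# All values below are placeholders or assumptions.
LAGOU_CUSTOM_SETTINGS = {
    # Turn off Scrapy's cookie middleware so the raw Cookie header below
    # is passed through unchanged.
    "COOKIES_ENABLED": False,
    # Assumed throttle; the ConnectionLost retries in the log suggest
    # that slowing down might help, but the right value is a guess.
    "DOWNLOAD_DELAY": 3,
    # Copy this from the same browser session that produced the cookie.
    "USER_AGENT": "<user-agent string copied from your browser>",
    "DEFAULT_REQUEST_HEADERS": {
        # Paste the cookie string captured right after logging in again.
        "Cookie": "<cookie string copied from your browser>",
        "Referer": "https://www.lagou.com/",
    },
}
```

To use it, assign the dict to the spider's `custom_settings` class attribute, then rerun and check whether the job URLs still show `Redirecting (302)` in the log.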
OK, teacher.
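A side note on the log above: the one scraped item has the `trackMid.html` URL as its `url`, so the spider parsed the tracking page rather than the job page. In each redirect, the originally requested URL is carried percent-encoded in the `f` query parameter. The helper below is a rough sketch for spotting this while debugging; `TRACK_PATH`, `is_tracking_redirect`, and `original_url` are hypothetical names, and only the standard library is used.

```python
from urllib.parse import parse_qs, urlparse

# Path of the tracking page, taken from the redirects in the log.
TRACK_PATH = "/utrack/trackMid.html"


def is_tracking_redirect(response_url):
    """Return True if the URL points at the trackMid.html tracking page."""
    return urlparse(response_url).path == TRACK_PATH


def original_url(response_url):
    """Return the URL that was originally requested.

    For a tracking-page URL the target is read from the `f` query
    parameter (parse_qs decodes the percent-encoding). Any other URL
    is returned unchanged.
    """
    parts = urlparse(response_url)
    if parts.path != TRACK_PATH:
        return response_url
    targets = parse_qs(parts.query).get("f")
    return targets[0] if targets else response_url
```

In a parse callback, a check like `is_tracking_redirect(response.url)` could be used to skip the page or log a warning instead of emitting an item for it.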