Urgent: getting 302 redirects when crawling cnblogs (博客园)

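I'm using this for my graduation project and my defense is coming up soon, and now this happens. After I start the crawler for cnblogs, the pages still display normally, but the crawl requests themselves get 302-redirected. Has the site added targeted anti-crawling measures because too many people are crawling it?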

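With the crawler running, I can still access cnblogs normally:
[Image]
Below is the output from the run. Could the teacher please suggest a fix? I'm really in a hurry. Thank you.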

2021-05-19 17:25:36 [scrapy.core.engine] INFO: Spider opened
2021-05-19 17:25:36 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-05-19 17:25:36 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
2021-05-19 17:25:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://news.cnblogs.com> (referer: None)
2021-05-19 17:25:37 [scrapy.dupefilters] DEBUG: Filtered duplicate request: <GET https://news.cnblogs.com/n/page/2/> - no more duplicates will be shown (see DUPEFILTER_DEBUG to show all duplicates)
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
2021-05-19 17:25:41 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://account.cnblogs.com:443/signin?ReturnUrl=https%3A%2F%2Fnews.cnblogs.com%2Fn%2F694199%2F> from <GET https://news.cnblogs.com/n/694199/>
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
2021-05-19 17:25:43 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://account.cnblogs.com:443/signin?ReturnUrl=https%3A%2F%2Fnews.cnblogs.com%2Fn%2F694200%2F> from <GET https://news.cnblogs.com/n/694200/>
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
2021-05-19 17:25:46 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://account.cnblogs.com:443/signin?ReturnUrl=https%3A%2F%2Fnews.cnblogs.com%2Fn%2F694201%2F> from <GET https://news.cnblogs.com/n/694201/>
2021-05-19 17:25:50 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://account.cnblogs.com:443/signin?ReturnUrl=https%3A%2F%2Fnews.cnblogs.com%2Fn%2F694202%2F> from <GET https://news.cnblogs.com/n/694202/>
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random
2021-05-19 17:25:54 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://account.cnblogs.com:443/signin?ReturnUrl=https%3A%2F%2Fnews.cnblogs.com%2Fn%2F694203%2F> from <GET https://news.cnblogs.com/n/694203/>
<fake_useragent.fake.FakeUserAgent object at 0x000001CEE72B4D48> random

Process finished with exit code -1

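This is urgent. I hope the teacher can reply soon with a solution. I expect many classmates who are using this for their graduation projects have run into the same problem. Please help.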


2 answers

bobby 2021-05-20 22:32:10

You could start by looking at section 5-2, which uses selenium to simulate a login and then crawls directly.
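As a rough illustration of the "log in with selenium, then crawl" idea, one common approach is to copy the browser's cookies into the Scrapy requests. The sketch below only covers the conversion step and uses the standard library alone. The helper name `selenium_cookies_to_dict`, the `domain_suffix` parameter, and the sample cookie values are all made up for illustration; it assumes the input has the list-of-dicts shape returned by selenium's `driver.get_cookies()`. It is not the code from section 5-2.

```python
def selenium_cookies_to_dict(cookies, domain_suffix="cnblogs.com"):
    """Convert selenium-style cookies (a list of dicts) into a plain
    name -> value dict, keeping only cookies for the given domain.

    Hypothetical helper for illustration; not part of selenium or Scrapy.
    """
    result = {}
    for cookie in cookies:
        # Cookie domains are often written with a leading dot (".cnblogs.com").
        domain = cookie.get("domain", "").lstrip(".")
        if domain == domain_suffix or domain.endswith("." + domain_suffix):
            result[cookie["name"]] = cookie["value"]
    return result


# Made-up sample data in the shape that driver.get_cookies() returns.
sample = [
    {"name": "session", "value": "abc123", "domain": ".cnblogs.com"},
    {"name": "tracker", "value": "zzz", "domain": "ads.example.com"},
]
cookie_dict = selenium_cookies_to_dict(sample)

# Assumed usage inside a spider (not executed here):
#   cookies = selenium_cookies_to_dict(driver.get_cookies())
#   yield scrapy.Request(url, cookies=cookies, callback=self.parse)
```

Filtering by domain keeps unrelated third-party cookies out of the crawl requests. Whether the copied cookies are enough to avoid the sign-in redirect depends on how the site checks the session, so treat this as a starting point.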

我没你笨 2021-05-20 10:21:18

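Some of the later data can only be requested successfully after logging in, which is why you are getting the 302. Find the first item that was 302-redirected and try opening it in a browser.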
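To find that first redirected item without scrolling through the log by hand, something like the following sketch could help. It assumes the log lines look like the `scrapy.downloadermiddlewares.redirect` lines in the question, and that a login redirect is one whose target is `account.cnblogs.com/signin` with a `ReturnUrl` query parameter. The function names are hypothetical.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Matches the redirect debug line format seen in the log in the question.
REDIRECT_RE = re.compile(r"Redirecting \((\d{3})\) to <GET (\S+)> from <GET (\S+)>")


def login_return_url(location):
    """If `location` is the cnblogs sign-in page, return the decoded
    ReturnUrl it carries; otherwise return None."""
    parts = urlsplit(location)
    if parts.hostname != "account.cnblogs.com" or not parts.path.startswith("/signin"):
        return None
    values = parse_qs(parts.query).get("ReturnUrl")
    return values[0] if values else None


def first_login_redirect(log_lines):
    """Return the first requested URL that was sent to the sign-in page,
    or None if no such redirect appears in the log."""
    for line in log_lines:
        match = REDIRECT_RE.search(line)
        if match and login_return_url(match.group(2)) is not None:
            return match.group(3)
    return None


log = [
    "2021-05-19 17:25:37 [scrapy.core.engine] DEBUG: Crawled (200) "
    "<GET https://news.cnblogs.com> (referer: None)",
    "2021-05-19 17:25:41 [scrapy.downloadermiddlewares.redirect] DEBUG: "
    "Redirecting (302) to <GET https://account.cnblogs.com:443/signin"
    "?ReturnUrl=https%3A%2F%2Fnews.cnblogs.com%2Fn%2F694199%2F> "
    "from <GET https://news.cnblogs.com/n/694199/>",
]
first = first_login_redirect(log)
```

Opening the returned URL in a browser where you are not logged in should show whether the page really requires sign-in. Inside a spider, the Scrapy request meta keys `dont_redirect` and `handle_httpstatus_list` can be used to receive the 302 response in the callback instead of following it, if you would rather skip such items.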
