A few minutes into the crawl it started throwing errors; it turned out Zhihu had detected abnormal activity.

http://img1.sycdn.imooc.com//szimg/5c34a26c0001222610010675.jpg
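What should I do at this point? Should I add an if check during the Selenium login that recognizes the captcha and fills it in, or is there some other approach?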

4 answers

bobby 2019-01-10 18:33:09

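When your code detects this page, it can automatically send the captcha to the 云打码 captcha recognition service, type in the result, and then carry on crawling.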
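The detect, recognize, submit, continue flow described here could be sketched roughly as follows. This is only an outline under assumptions: `clear_captcha` and its three callables are hypothetical names, not part of Selenium or of the recognition service. In a Selenium crawler, `get_captcha_src` would read the captcha img element's src (or return None when no captcha is shown), `recognize` would call the recognition service, and `submit` would type the answer and confirm.

```python
from typing import Callable, Optional


def clear_captcha(
    get_captcha_src: Callable[[], Optional[str]],
    recognize: Callable[[str], str],
    submit: Callable[[str], None],
    max_attempts: int = 3,
) -> bool:
    """Return True once no captcha is shown; give up after max_attempts tries."""
    for _ in range(max_attempts):
        src = get_captcha_src()
        if src is None:
            # No captcha on the page: the crawl can continue.
            return True
        answer = recognize(src)
        if answer:
            # Only submit when recognition actually returned something.
            submit(answer)
    # Final check after the last attempt.
    return get_captcha_src() is None
```

Passing the three steps in as callables keeps the retry logic independent of the browser, so it can be exercised with simple stand-in functions before wiring it into the spider.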

  • Asker HugoL #1
    The URL of the captcha img that I get through Selenium is base64-encoded, so it cannot be requested in a browser at all, which means I can't fetch the image and pass it to 云打码 for recognition. What should I do?
    2019-01-14 20:36:36
  • Asker HugoL #2
    The requests call fails with "requests.exceptions.InvalidSchema: No connection adapters were found for 'img_url'"
    2019-01-14 20:51:13
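This `InvalidSchema` error is what `requests` raises for any URL whose scheme is not `http://` or `https://`, since those are the only schemes it has connection adapters for by default. A src that embeds the image as base64 is a `data:` URI, so there is nothing to download: the bytes are already in the string. A minimal sketch of telling the two cases apart before calling `requests` (the helper name `is_data_uri` and the sample strings are illustrative, not from the thread):

```python
from urllib.parse import urlsplit


def is_data_uri(src: str) -> bool:
    """Return True when src embeds its content (data: URI) rather than naming a server."""
    return urlsplit(src).scheme.lower() == "data"


# Illustrative values only; a real src comes from the img element's src attribute.
embedded = "data:image/jpg;base64,R0lGODdh"
remote = "https://example.com/captcha.jpg"

for src in (embedded, remote):
    action = "decode locally" if is_data_uri(src) else "fetch over HTTP"
    print(urlsplit(src).scheme, "->", action)
```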
Asker HugoL 2019-01-16 22:20:21

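This is the version integrated into scrapy, which only returns an empty value: https://img1.sycdn.imooc.com//szimg/5c3f3da30001504c10010705.jpg
This is the standalone run, where recognition works correctly:
https://img1.sycdn.imooc.com//szimg/5c3f3da30001a90c10010894.jpg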

bobby 2019-01-15 16:51:58

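https://img1.sycdn.imooc.com//szimg/5c3d9eb20001163111520088.jpg This is the request I just analyzed. The src attribute here holds the complete base64-encoded image. Read that attribute and extract the base64 data from it; note that the data starts right after the comma. Copy the entire string after the comma, and you can save it directly as a jpg file like this: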

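import base64
# Base64 payload copied from the img src (everything after the comma); it must be one unbroken string literal.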
image = "R0lGODdhlgA8AIcAAP7+/gICAujo6NfX1/Pz8xcXF8jIyCcnJ0dHR5eXl1dXV3d3d2ZmZoeHhzQ0NLe3t6enp7e3t6qqqjw8PMHBwwAAAAAlgA8AEAI/wABCBxIsKDBgwgTKlzIsKHDhxAjSmxQoEGCABgDIHgAgQCAjyBBCkgwAECEAgFSqkzJgACDAAkKAJhJs6ZNAAQKBNjJQACAn0CDCh1KtKjRo0iFDlDgoECAAAoEAJhKdSqEAA0AaN0KQACAr2ABPCiQIECAAw0GKADAti2BCADiyp37AIDdu3gPBNi7VwCAv4AFLEBwQEGAw4gDKBgAoHFjAQEcBJgcoIADAJgza9YsIAEDAgBCix5NujTpBQcCqC4wAIDr17BdDwBAm3aCAgFy60ZAAIBv3wQYOCgQIACCBgYAKFdOYAGA59CfN0CAIEAABAIAaN+uvYGDAgcGAP8YT748AAMADARYv74BgPfw3x8IQB/BAAD48wsAwL+/f4AABA4kWJDgAAYQAiwsIADAQ4gQBQyA0KBAgAAHIChAAMCjRwEBAjg4MEAAAJQABjwA0NLlS5cRAswsoEAAAJw5cT4gAMDnz58CBABoYKDBggBJAxxAcCBAgAMJBhAAUNUqAAIGAGzl2tXrV7BhB0BwECDAAQcCAKxdC+HAAQkAADgIUNeBggAA9O4FMIDCgAEABA8mXNgwBAYBFAeIAMDxY8iRDQCgXLlBgAAHECxIAMDzZ88GHAQIcICBAQCpVa9m3dr1a9YEGiRwECBAgQcAABhQAIEAAODBhQ8HHiH/woMBBgYAAPBgAAEA0aVPpy5dAIIABgBs597dAADw4cWPFy8gwQMCANSvZ9/e/Xv48eXPp1/f/n38+fXv57//AcAAARgYQPBAAICEChcybOjwIcSIEidSrBgggYICBR5EECAgQQAEARAoCFBAAQEABACwBBABAMyYMAkwGPCgQICcCgDw7OlTAIQKBwoEAGD0KFKjAxwgGCDAwYQAAQoQMADgKgABAgBw7er1KwAJBgwQgIAgAIMGDRgwaFDggIMAARYAqGv3Lt68egEIaKAgQAAFAwAQLmz4MAEAihczSBDgQIACARJAAGD5MgADDAIEKABhwAAAokUnAGD69GkH/wcCBGiAAADs2LAfBKht2wEAAAQUBFjQgAAAAAwCEHdgYAKA5MqXM2dOAAD06NKnU69uYEGA7AESAOju3fsAAQAACEhwgIEBBQHWQwDg/j38+PEHCABg/759AQoKBOg/ACAAgQMFPgiAQAAAhQsZCgDw8CEBAwocBCgQAUBGjRkTAADQQIAACQsGEIggAEBKlStZtnSZskGBADMDALB5E6eAAAUC9OxJAQEAoQAEJHBwIEBSBAkiAHDqdMAAAFOpVgVwIECAAg0MAPD6FWxYrwQEADA7QIEDAxAKPFgAAG5cAAImBFgwQAAAvXsBJCAAAHBgwYMJFxYswEEAxQEKHP8YAAAy5AIJEgQIcKABAAAPAiwAIGCAAACjSZc2fRo1AAMBWDtYQABAbNmzadeWPSBAbgkAePf2/Ru4AAMAiBc3fhx5cuXLmTd3/vz4AAICAFS3bkAAAO3buXf3/l27AADjyZc3fx59evXr2bd3/x5+fPnz6ddnr4BBgQABHDRoAFBBgIEBHEQAgDChwoUMGzp8CDGixIkUAQgYACAjgAAQCgRgUEBBgQATDAAQIACAypUqISgIEKBAgJkBCAwYEEGBgwMBGCAIEGABgKFEixYdAOCBgwBMmRYIMACA1KlUq1qtOkDAggQDEAgw4CDAgQcAypo9izZtWgMTCrgNADf/boAGAgAsCIC3AYC9fAEIAAA4MOABCgIweOCgAQEEABo7fgwZAAEAlCtbBvAggOYDCSAAEDBgwYEDBAAIWLAggOoAEQC4fv2aQAQFBhQsSBBhQIICAQ4YYBBBgIAGCw4EUGAAgPLlzJs7Xy4ggQAHAQJAAIA9O3YBEgB4/w6g
AYDx5CMEMPAgQYACDwZEAAA//oMDAeozIAAgv34BAPr7BwhAYIIABQMcMABA4UIBDgIsgIAgwIIFASwyACAAwEYADAJ8DHAAwQAAJU2eRGlyAACWLV2+hNmSgAAEAQo0EABA506ePA0AAArUAIMARYsWCBABwFKmAgwEgAoVwgIA/1WrDgCQVWtWAQoaFAgQQAAAsmXJPgiQNi0BAG3dujWQAIKAAwHsPgCQV2/eCAECHEAgAMBgwgQAHEacWPHixQYaBIB8gAAAypUtEwCQGUCEAQwCfAb9AMBo0qMNGAAggAAA1q0BMAAQW7ZsAwsQBAigAMBu3rsjBChQQAIA4sWNDyAAQLkAAQGcQwAQXXp0AwUKIDBAAMB27gcAfAcfXvx48gAEFAiQPoACAO3dux9AAEACBBACHBiAIEAAAP37AyTAQIEEAgAOIkRoAADDhg4BQCgQYOIBAQAuYrx4IACAjh4/AngAoEGAAAcQKBAA4UCAlgcEAIgpE4AAAwBu4v/MqXMnz543BSxgMKEAhAEEACBNqvTBAQcRFgQIcCABAwBWrU4IoDVAgwECAIAFKwAA2bJmySY4EGBtAQBu37olIAAA3bp1BwAA8AABggIBFggwoKCBBAgJCAQIoGCAAACOHzsWAGAy5cqWL2PGPOBAgM4BGAAILVqBAQAABCAIoFp1AQYGAMCGvQCAggANChwQAGA3796+eQtAEGB4gAQAjiNPrhwAAQIAngNAEGB6gwELBADIrh0AAQYBDkAAQAAA+fIAJAgAoH49+/bu37NPECBAgQIJDADIr38BggIIACYYAABCgQUCEgBYIABAQ4cPITocIABARYsXA2TUKAD/QEePHz0OADCS5MgBAVAWWGAAQEuXLgUwUMCAAQCbN20mALCTZ0+fP4H6DDC0wAMIDAQAUKp0AAACTyEkOBAgAAAAEB4IALCVa1evAigAEDuWLNkFAdBOALCWbdu1AgDElTuX7oMACQDk1buXL98DAAAHFjyYcGHDgQUMALCYcWPHjyFHlixZwAEDADBnzqwAQGfPn0GH9hxgQgQAp1GjJiDAwAAAr2HHlj2bdm3bt3Hn1q0bAQIDAIAHFz6ceHHjx5EnV76ceXPnz6FHlz6denXr17Fn107AQAAJAMCHFz+efHnz59GnV7+efXv35wUoCBCAwQABAxYEYDAAQH///wABCBxIsKDBgwgTKlzIsKHDgwISCBAQwQCAixgzZhxAgUGAjyBDEoiAoIAEAQBSqlzJEgABBBMMAJhJs6bNmzhz6tzJs+fOBwcUGBhA4UCAo0cLAFjKlKmDAAsOMCgQoCqCAFgDHIAQIICCAAUIABhLtmzZAQkWBFi7NoEAAHDjyp1Lty7dAQYEABAAoK/fv4ADCw48gEEBCQcCHBDwQEECBwEKBFDAQIEBAAAGANjMebMACA8QBAhQgAEDAwBSqx7QQAGCAgECMABAu7bt2gIKNGjgIIDvAAggPABAvLjx48iNG0jQIIGDBgkWQFAQoHqABQIAaN/Ovbt37hICBP9gwOAAggcDFgRYz379AQDw4wMgAKC+/foUAjQgUCCAA4AJHAAgWNBgQQIRCiQA0NDhQwALAgSAAOFBAgUIDiwwMADARwQHAiQQAMDkSZQnETRAACABggAIFhwIwGDAAgQNEBwooEAAAKBBhQ4lClRBAQUHAgRY4CCBAABRAQhIcCBAAgBZtQJ4AMDrVwACEjAoEMBBgwYHDABg29btW7hxFQSgG+DAAAB5AUQI0EBAAwYMAgwmPGEAAMSJIRRAMKBBgQIKHgwwMEAAAMyZAUQ4cACBAQChRY8mXZqAAQULCgQIwADAa9iwFwwAUNs2AAEAdO8m8ABCAgQBDiBgUAD/wHHkEgIEUGAAwHPoAAYAoF7dOoQDAbQHIADA+3cADwIsoBDA/PkADRIIANCewIMA8QMcUGCAAAD8+fXvBzBAAEAAAgcSLGgQwIAFDRQEYDAAAMSIECMQAGDx
ogAAGjc2KMAAAYQFBQZIIADgJMoBCwIEOEBgAYCYMgkAqGnTpoIAOhUkAODzp08FAQIscBDggYIGAZYyGADgKYAAUgMsMHAAANasWrdqfQDgK9iwYseCbZCgQAAECQgAaOv27VsIAObOFcAgAN68ARoA6OuXgIADAQYHQPAAAGLEAwAwbsx4QIIFAQIgEADgMmYAAhwE6Nz5AYDQAAQQELBAwQAC/xAcBGgdIAKA2LJnNzAA4Dbu2wQA8O7t+zfw3gIcKAhwAAECAMqXM19OAAB06BQGBKhe/UAACgC2c6egIAD4Ag4SQABg3jwDAOrXqx/gwEGAAAcIAKhvv74BBgcWLEgAACAAgQMHChBgIEIAhQEgAHD4EACEAA4cBCiQAEBGjQkAdPT4EWRIkAMYIAiQAIIBACtZtgTwAEBMAAIeHAhwE6cBADt5CkBwIEBQBAIAFDVqYAAApUuVJgjwNIACAFOpTm0Q4ICBAAQAdPX6lQABAAISIAhwdgAAtWvVMgjwdoIAAHPpPgBwF29evXv3Eiiw4ECAABMAFDZ8mAABAIsbOP+AsCBA5AAHAFS2fBkzZgIKAHT23BkCAggLAgQYAAB1atQTAiCAAAB2bNkAKACwDUDAgQC7FQDw/ds3AQAADBxYIABAcgAEGgBw/hx6dOnTATwIcD3AgQEAuHfnTmAAAAAPGiwokCBBAPUCALR3/x6+ewIGANS3f9+AgwMBAjgQABCAwIECEwQwACChwoUDCAxI8MDAgwILFBwIgHEAgI0cNyYQACBkSAIRCgA4iTKlypUsTw44ECBmgAIGANi8aZOAAAMGFAT4ySCCAgQCABgFICBAgQYMCAB4ClWAAABUq1qlKiCAVq0GAHj96vXBAwBky5odIICCgAUB2gZwsAD/QYC5DgYIAIA3LwADBAD4/Qs4sODBhAEYOABgQoAABwA4fgwZgIEKDRoEuKzgQQEAnDlHCABaAQEBAwCYBpCAAIDVrFuvZlAgQAAHCggAuI0bgIAGAHr79k1AAgAACgowGADhgYMAzJknCODggQAA1KtTFwAgu/bt3Lt7/y6gAQQDAQo4AIA+vXoBCAIEKHAgQIAFCAYAuH8/gP79DQYAAAhAIAADAgAcRJgQwIIADQ8sABBR4kSKEgkQAABAgIAGCgIEeJBgAACSAARAcBBAAQECAFy+BEBgAACaNW3exJkT54ADAXwGWABA6FCiBgIUCJA0gAIBCAA8fQoBQoEA/1UPJCAAQOvWBAC8fgULoEAAsgEUAECbVu1atA8AvAUgIUAABQMMTJAAQO9eAAQcBFAwQAAAwoUBCIAAQPFixo0dP248YEEAypQBXMacoIABAAAGBAAdOgAA0qUTDBCA4EEEAwBcv4YdG3YDBQFsBzAAQPdu3r0FAAAePMDwAAcgOACQXHlyAQMkBGgAYAAA6tUBCACQXft27t29cx8AgYGDAgwYAECf3kEA9gIAGDgQwIGDBREiAMCfX//+/Q8AAAQgcKBAAgwCIEQYAQDDhg4dEgAgcSKABQEuFggAAQDHjhwJGAgQ4ECBAQBOogRAgACAli5fwowp86WDADZtKv8gAGDnTgMAFgQIgIAAAAgOChQAAOABgKZOn0J1KgAA1apWASAIoDXAAQBev4IFOwAA2bJlDUwI4ECAAQBu38IFIMCAAAB279pFAGAv375+/wLuKyAA4QILGCB4AGDxYgMLEBQI4CDBAAEGEDAYAGAz586eOQsAIHo0adIBTgc4AGA169arKQCILXv27AcOHBAAoHs37969IwgAIHw48eLGjxMXICCCgwAIGCAwAGD6dAMArmPPnj1ChAEAvhMAIH48AAIEAKBPr349AAYBAjwAIH8+fQAEAODPr38//gcFAC4AMJBgQYMFBwBQuJBhQ4cPIQIQkOBAAQYKDADQuJGNY0ePHwkAEDmSZEmTAAgESACAZcuWBgYAkDmTZk2bCxgIALCTZ8+eCwAEFTqUaFGjR5EmVbqU
qVEIFABElRqVwAIAV7Fm1bp1a4AADSIIICDAgAAAZ9GmVbuWbVu3b+HGlTtXwYEGAgDk1buXb1+/fwEHFjyYcGHDhxEnVryYcWPHjyFHljyZcmXLiQMCADs="
# Decode the base64 text back into bytes and write them to an image file.
with open("heiqie.jpeg", "wb") as fh:
    fh.write(base64.b64decode(image))

The image here is the string I got from my own request just now. Replace it with yours, then open the file and you will see the captcha.
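The same steps can also start from the full src value instead of a hand-copied string. The sketch below splits at the first comma, decodes the payload, and picks a file extension from the first bytes of the image. The names `decode_data_uri` and `save_captcha` are hypothetical, and the signature table covers only three common formats.

```python
import base64
from pathlib import Path

# Leading bytes ("magic numbers") of a few common image formats.
_SIGNATURES = {
    b"GIF87a": ".gif",
    b"GIF89a": ".gif",
    b"\xff\xd8\xff": ".jpg",
    b"\x89PNG\r\n\x1a\n": ".png",
}


def decode_data_uri(src: str) -> bytes:
    """Return the raw bytes embedded in a base64 data: URI."""
    header, sep, payload = src.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data: URI")
    return base64.b64decode(payload)


def save_captcha(src: str, stem: str = "captcha") -> Path:
    """Decode src and write it to <stem><ext>, choosing ext from the image signature."""
    data = decode_data_uri(src)
    # Fall back to .bin when the signature is not one of the known formats.
    ext = next((e for sig, e in _SIGNATURES.items() if data.startswith(sig)), ".bin")
    path = Path(f"{stem}{ext}")
    path.write_bytes(data)
    return path
```

The string in the snippet above begins with `R0lGODdh`, which decodes to `GIF87a`, so that particular captcha is a GIF even though it is written to a `.jpeg` file. Most viewers open it regardless, but it may be worth saving it with the matching extension if the recognition service looks at the file name.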

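  • Asker HugoL #1
    Thank you for the answer. The image problem is solved, but now when I call the 云打码 API inside scrapy to recognize the saved image, it returns an empty value every time, even with time.sleep added. The strange part is that the recognition code works and returns the captcha result when I run it on its own; it only fails once integrated into scrapy. I posted screenshots of the failure in my answer below.
    2019-01-16 22:18:22
  • bobby replying to asker HugoL #2
    Set breakpoints in the part of your scrapy code that calls the captcha recognition logic and step through it line by line to see which line of the recognition code is going wrong.
    2019-01-19 11:12:36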
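Alongside stepping through with breakpoints, it can help to confirm that the recognition code receives the same input in both environments. The thread does not say what caused the empty result, so the sketch below only records a few things that can differ between a standalone script and a Scrapy run: the working directory, and whether the image file exists and is non-empty at the moment it is read. `describe_captcha_file` is a hypothetical helper, not part of Scrapy or of the recognition service's API.

```python
import logging
import os

logger = logging.getLogger(__name__)


def describe_captcha_file(path: str) -> dict:
    """Collect basic facts about the file that is about to be sent for recognition."""
    abs_path = os.path.abspath(path)
    exists = os.path.isfile(abs_path)
    info = {
        "cwd": os.getcwd(),      # a relative path resolves against this directory
        "path": abs_path,
        "exists": exists,
        "size": os.path.getsize(abs_path) if exists else 0,
        "head": b"",             # first bytes of the file, e.g. b"GIF87a" for a GIF
    }
    if exists:
        with open(abs_path, "rb") as fh:
            info["head"] = fh.read(8)
    logger.info("captcha file check: %s", info)
    return info
```

Calling this right before the recognition request in both the standalone script and the spider, then comparing the two log lines, shows whether the two runs are actually reading the same file.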
Asker HugoL 2019-01-14 20:37:43

https://img1.sycdn.imooc.com//szimg/5c3c829500019c1e10010530.jpg
The captcha URL is base64-encoded and cannot be requested.
