Hi teacher, following the code from your latest video I tried writing it myself. The username and password are entered correctly, but when the login button is clicked the page shows the message "Missing argument grant_type". Even when I click the button manually, "Missing argument grant_type" still appears.
Haha, not yet. It still reports "Missing argument grant_type".
I downloaded the latest version of Chrome and the matching chromedriver, but it still doesn't work. Have you solved it yet?
Try downgrading Chrome to version 60 and using chromedriver 2.33.
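If you are not sure which versions Selenium actually ended up using, the short check below (my own sketch, not from the course; it assumes Selenium 3 and the chromedriver path E:/tmp/chromedriver.exe used in the code further down) prints what the driver reports, so you can confirm Chrome 60 and chromedriver 2.33 are really the ones being picked up:

# Minimal sketch: print the browser and chromedriver versions Selenium actually loads.
# Assumes Selenium 3 and a local chromedriver at E:/tmp/chromedriver.exe (adjust the path).
from selenium import webdriver

browser = webdriver.Chrome(executable_path="E:/tmp/chromedriver.exe")
caps = browser.capabilities
# Older Chrome builds report the browser version under "version",
# newer ones under "browserVersion".
print("Chrome:", caps.get("browserVersion") or caps.get("version"))
print("chromedriver:", caps.get("chrome", {}).get("chromedriverVersion"))
browser.quit()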
# -*- coding: utf-8 -*-
import re
import json
import datetime

try:
    import urlparse as parse
except:
    from urllib import parse

import scrapy
from scrapy.loader import ItemLoader
from items import ZhihuQuestionItem, ZhihuAnswerItem


class ZhihuSpider(scrapy.Spider):
    name = "zhihu_sel"
    allowed_domains = ["www.zhihu.com"]
    start_urls = ['https://www.zhihu.com/']

    # Request url for the first page of answers of a question
    start_answer_url = "https://www.zhihu.com/api/v4/questions/{0}/answers?sort_by=default&include=data%5B%2A%5D.is_normal%2Cis_sticky%2Ccollapsed_by%2Csuggest_edit%2Ccomment_count%2Ccollapsed_counts%2Creviewing_comments_count%2Ccan_comment%2Ccontent%2Ceditable_content%2Cvoteup_count%2Creshipment_settings%2Ccomment_permission%2Cmark_infos%2Ccreated_time%2Cupdated_time%2Crelationship.is_author%2Cvoting%2Cis_thanked%2Cis_nothelp%2Cupvoted_followees%3Bdata%5B%2A%5D.author.is_blocking%2Cis_blocked%2Cis_followed%2Cvoteup_count%2Cmessage_thread_token%2Cbadge%5B%3F%28type%3Dbest_answerer%29%5D.topics&limit={1}&offset={2}"

    headers = {
        "HOST": "www.zhihu.com",
        "Referer": "https://www.zhihu.com",
        'User-Agent': "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0"
    }

    custom_settings = {
        "COOKIES_ENABLED": True
    }

    def parse(self, response):
        """
        Extract all urls from the html page and follow them for further crawling.
        If an extracted url matches the /question/xxx format, download it and go
        straight into the parse function.
        """
        pass

    def parse_question(self, response):
        # Handle the question page and extract the question item from it
        pass

    def parse_answer(self, response):
        pass

    def start_requests(self):
        from selenium import webdriver
        browser = webdriver.Chrome(executable_path="E:/tmp/chromedriver.exe")

        browser.get("https://www.zhihu.com/signin")
        browser.find_element_by_css_selector(".SignFlow-accountInput.Input-wrapper input").send_keys("xxx")
        browser.find_element_by_css_selector(".SignFlow-password input").send_keys("xxx")
        browser.find_element_by_css_selector(".Button.SignFlow-submitButton").click()

        import time
        time.sleep(10)
        Cookies = browser.get_cookies()
        print(Cookies)
        cookie_dict = {}
        import pickle
        for cookie in Cookies:
            # Write each cookie to its own file
            f = open('H:/scrapy/ArticleSpider/cookies/zhihu/' + cookie['name'] + '.zhihu', 'wb')
            pickle.dump(cookie, f)
            f.close()
            cookie_dict[cookie['name']] = cookie['value']
        browser.close()
        return [scrapy.Request(url=self.start_urls[0], dont_filter=True, cookies=cookie_dict)]
Copy this code over and try running it. I just ran it here and it worked fine.
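As a side note, since the code above pickles each cookie to its own file, a small helper (hypothetical, not part of the course code) can read them back on later runs so the Selenium login only has to succeed once. A minimal sketch, assuming the files were written to H:/scrapy/ArticleSpider/cookies/zhihu/ as above:

# Minimal sketch (not from the video): load the cookies pickled by the code above
# so the browser login step can be skipped on later runs.
import os
import pickle


def load_zhihu_cookies(cookie_dir='H:/scrapy/ArticleSpider/cookies/zhihu/'):
    cookie_dict = {}
    for file_name in os.listdir(cookie_dir):
        if not file_name.endswith('.zhihu'):
            continue
        with open(os.path.join(cookie_dir, file_name), 'rb') as f:
            cookie = pickle.load(f)  # each file holds one cookie dict from get_cookies()
            cookie_dict[cookie['name']] = cookie['value']
    return cookie_dict


# In start_requests you could then skip the browser login when cookies already exist:
#     cookie_dict = load_zhihu_cookies()
#     if cookie_dict:
#         return [scrapy.Request(url=self.start_urls[0], dont_filter=True, cookies=cookie_dict)]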