How can I fetch the data from this site?

POST URL: http://www.jijiyouxuan.com/index.php?s=/index/search/goodlistnew.html
The data is visible in the browser's XHR requests.
Sending the POST from Scrapy returns a 500 error:
2021-11-01 23:25:05 [scrapy.core.engine] DEBUG: Crawled (500) <POST http://www.jijiyouxuan.com/index.php?s=/index/search/goodlistnew.html> (referer: None)
2021-11-01 23:25:05 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <500 http://www.jijiyouxuan.com/index.php?s=/index/search/goodlistnew.html>: HTTP status code is not handled or not allowed
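
The second log line comes from Scrapy's HttpErrorMiddleware, which by default drops non-2xx responses before they reach the callback. A minimal sketch of a settings fragment that lets the 500 response through so its body can be inspected in `parse` (this only helps with debugging; it does not by itself make the server return data):

```python
# settings.py fragment (sketch): pass HTTP 500 responses on to spider callbacks
# instead of letting HttpErrorMiddleware ignore them.
# A per-spider alternative is the class attribute `handle_httpstatus_list = [500]`.
HTTPERROR_ALLOWED_CODES = [500]
```

With this in place, `response.status` and `response.text` can be printed in the callback to see what the server actually sent back with the error.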
import scrapy
import undetected_chromedriver as uc


def start_requests(self):
    # Open the site in a real browser so it can set its cookies.
    browser = uc.Chrome()
    browser.get("http://www.jijiyouxuan.com/")
    input("Press Enter to continue: ")
    # Convert Selenium's list of cookie dicts into a name -> value mapping.
    cookie_dict = {cook["name"]: cook["value"] for cook in browser.get_cookies()}
    browser.quit()  # the browser is no longer needed once the cookies are copied

    print(cookie_dict)
    # Form fields sent with the search request.
    data = {
        'category_id': '1221',
        'brand_id': '0',
        'manner_id': '0',
        'material_id': '0',
        'size_id': '0',
        'other_id': '0',
        'bed_id': '0',
        'sofa_id': '0',
        'chandi_id': '0',
        'thickness_id': '0',
        'price1': '0',
        'price2': '0',
        'tags': '0',
        'wd': '',
        'page': '1',
        'order_by_field': 'default',
        'order_by_type': 'asc'
    }
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
    }
    for url in self.start_urls:
        yield scrapy.FormRequest(url=url, formdata=data, cookies=cookie_dict, headers=headers, callback=self.parse)


def parse(self, response):
    pass
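
Since the data shows up as an XHR call in the browser while the log above reports `referer: None`, one thing worth comparing is the set of headers the browser sends with that call. The sketch below uses only the standard library to build (not send) an equivalent POST so it can be checked against the browser's network panel. The `X-Requested-With` and `Referer` values and the `sid` cookie are assumptions for illustration, not something the site is confirmed to require, and `build_xhr_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

URL = "http://www.jijiyouxuan.com/index.php?s=/index/search/goodlistnew.html"


def cookies_to_header(cookies):
    """Join Selenium-style cookie dicts into a single Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


def build_xhr_request(url, form, cookies):
    """Build a form-encoded POST that carries typical XHR headers (assumed values)."""
    headers = {
        "X-Requested-With": "XMLHttpRequest",        # assumption: sent by the page's AJAX call
        "Referer": "http://www.jijiyouxuan.com/",     # assumption: the page that issues the call
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
    }
    if cookies:
        headers["Cookie"] = cookies_to_header(cookies)
    body = urlencode(form).encode("utf-8")
    # Passing data makes urllib treat the request as a POST.
    return Request(url, data=body, headers=headers)


# Placeholder form subset and cookie, for illustration only.
req = build_xhr_request(
    URL,
    {"category_id": "1221", "page": "1"},
    [{"name": "sid", "value": "abc"}],
)
print(req.get_method())
print(req.data)
print(req.header_items())
```

If the browser's request carries headers that the Scrapy request lacks, the same entries can be added to the `headers` dict passed to `scrapy.FormRequest`.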


1 answer

bobby 2021-11-03 20:56:13

It still doesn't work even with undetected chromedriver?
