Two questions

Hi, when I save the answer's content to Elasticsearch I get this error:

ValueError: too many values to unpack (expected 2)

Also, for questions whose content is empty I added a check, but it still raises an error:

Traceback (most recent call last):
  File "/Users/yujialian/.virtualenvs/article_spider/lib/python3.6/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/Users/yujialian/Documents/project/crawler/ArticalSpider/ArticalSpider/pipelines.py", line 122, in process_item
    item.save_to_es()
  File "/Users/yujialian/Documents/project/crawler/ArticalSpider/ArticalSpider/items.py", line 187, in save_to_es
    if self["content"]:
  File "/Users/yujialian/.virtualenvs/article_spider/lib/python3.6/site-packages/scrapy/item.py", line 59, in __getitem__
    return self._values[key]
KeyError: 'content'


Here's my code for the answer item:


def save_to_es(self):
   # Convert the item into its Elasticsearch document type
   answer = ZhihuAnswerType()
   answer.zhihu_id = self["zhihu_id"]
   answer.url = self["url"]
   answer.question_id = self["question_id"]
   answer.author_id = self["author_id"]
   answer.answer_excerpt = self["answer_excerpt"]
   answer.content = self["content"]
   answer.praise_num = self["praise_num"]
   answer.comments_num = self["comments_num"]
   answer.create_time = datetime.datetime.fromtimestamp(self["create_time"]).strftime(SQL_DATETIME_FORMAT)
   answer.update_time = datetime.datetime.fromtimestamp(self["update_time"]).strftime(SQL_DATETIME_FORMAT)
   answer.crawl_time = self["crawl_time"]
   answer.suggest = get_suggests(ZhihuAnswerType._doc_type.index, ((answer.answer_excerpt, 5)))

   answer.save()
   return

Code for the question item:

def save_to_es(self):
   # Convert the item into its Elasticsearch document type
   question = ZhihuQuestionType()
   question.zhihu_id = self["zhihu_id"][0]
   question.topics = ",".join(self["topics"])
   question.url = self["url"][0]
   question.title = "".join(self["title"])
   if self["content"]:
       question.content = "".join(self["content"])
   else:
       question.content = "EMPTY"
   question.answer_num = extract_num("".join(self["answer_num"]))
   question.comments_num = extract_num("".join(self["comments_num"]))
   if len(self["watch_user_num"]) == 2:
       question.watch_user_num = int(self["watch_user_num"][0])
       question.click_num = int(self["watch_user_num"][1])
   else:
       question.watch_user_num = int(self["watch_user_num"][0])
       question.click_num = 0
   question.crawl_time = datetime.datetime.now().strftime(SQL_DATETIME_FORMAT)
   question.suggest = get_suggests(ZhihuQuestionType._doc_type.index, ((question.title, 10), (question.topics, 7), (question.content, 5)))

   question.save()
   return

1 answer

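First question: "too many values to unpack (expected 2)" means the number of variables being assigned does not match the number of values supplied. Follow the stack trace to the line that raises the error and check what is being unpacked there.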
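
One likely cause, judging from the answer code in the question: `((answer.answer_excerpt, 5))` is not a tuple containing one pair, because extra parentheses alone do not create a tuple. The sketch below assumes `get_suggests` loops over its second argument with `for text, weight in info_tuple`; its body is not shown in the thread, so `collect_weighted` is a hypothetical stand-in for that loop.

```python
def collect_weighted(info_tuple):
    """Hypothetical stand-in for the loop assumed inside get_suggests."""
    pairs = []
    for text, weight in info_tuple:  # each element must be a (text, weight) pair
        pairs.append((text, weight))
    return pairs


excerpt = "a short answer excerpt"

# Extra parentheses do not create a tuple: this is just (excerpt, 5).
not_nested = ((excerpt, 5))

# A trailing comma creates a one-element tuple that holds the pair.
nested = ((excerpt, 5),)

try:
    # The first element is the excerpt string itself, so Python tries to
    # unpack its characters into (text, weight) and fails.
    collect_weighted(not_nested)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

result = collect_weighted(nested)
print(error_message)
print(result)
```

If that assumption holds, writing the call as `get_suggests(ZhihuAnswerType._doc_type.index, ((answer.answer_excerpt, 5),))` should avoid the error. The question item's call already passes several pairs, which is why it does not hit the same problem.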

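Second question: this is a KeyError. Step through it with a debugger to see exactly which line raises it, and confirm that the variable holds what you expect.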
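
The traceback in the question points at `if self["content"]:`, which suggests the `content` field was never set on the item, so the check itself raises before it can test for emptiness. A minimal sketch of a safer check follows, using plain dicts as stand-ins for the item; `scrapy.Item` offers the same dict-style `get`. The helper name `question_content` is made up for illustration.

```python
def question_content(item):
    """Return the joined content, or a placeholder when the field is missing or empty."""
    parts = item.get("content")  # returns None instead of raising KeyError
    if parts:
        return "".join(parts)
    return "EMPTY"


with_content = {"content": ["first ", "second"]}
without_content = {"title": ["no body text"]}  # 'content' was never populated
empty_content = {"content": []}

print(question_content(with_content))
print(question_content(without_content))
print(question_content(empty_content))
```

In `save_to_es` this would amount to replacing `if self["content"]:` with `if self.get("content"):`, leaving the rest of the method unchanged.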

  • Asker Grant_Lian #1
    Thank you very much!
    2017-06-01 17:06:41