Teacher, I get an error when downloading images - imooc (慕课网)

# Spider method
def start_requests(self):
    """Override start_requests so that JS is loaded through Splash."""
    for url in self.start_urls:
        yield SplashRequest(url, args={'wait': 0.5})

# This is the image pipeline
def get_media_requests(self, item, info):
    # This part runs without problems
    try:
        for image_url in item['image']:
            # yield SplashRequest(url=image_url,dont_process_response=True)
            yield Request(url=image_url)
    except Exception as e:
        pass

def get_images(self, response, request, info):
    # When debugging with breakpoints, this method was never executed
    path = self.file_path(request, response=response, info=info)
    orig_image = Image.open(BytesIO(response.body))
    width, height = orig_image.size
    if width < self.min_width or height < self.min_height:
        logging.warning("Image too small (%dx%d < %dx%d)" % (width, height, self.min_width, self.min_height))
    image, buf = self.convert_image(orig_image)
    yield path, image, buf

def file_path(self, request, response=None, info=None):
    # The image path produced here is fine
    img_path = super(OssImagePipeline, self).file_path(request, response, info)
    image_name = img_path.rsplit('/', 1)[-1] if '/' in img_path else img_path
    self.image_list.append(image_name)
    if self.folder:
        image_name = os.path.join(self.folder, image_name)
    print(image_name)  # no problem here
    return image_name

def item_completed(self, results, item, info):
    try:
        print(results,1111111111111111111111111111111111111111111111111111111111)
        print(item,22222222222222222222222222222222222222222222222222222222222)
        base_str = "https://xxx.oss-cn-xxx.aliyuncs.com/"
        # I need to store the images on OSS, so the path has to be joined to the base URL
        # results reports failures: after the image download, for some unknown reason,
        # it returned False, so image_path ends up empty here
        image_path = [base_str + x['path'] for ok, x in results if ok]
        item['image_path'] = image_path
        # print("+"*20,item['image_path'],"+"*20)
    except:
        raise DropItem("Item contains no images")
    else:
        return item

This is the exception that was reported:
2019-03-07 15:44:25 [scrapy.pipelines.files] WARNING: File (unknown-error): Error downloading image from <GET http://qw.lixia.gov.cn/picture/0/s_c39d2751a63f4c969116e11dd8d5ffdb.jpg> referred in <None>: 'splash'
2019-03-07 15:44:25 [scrapy.pipelines.files] WARNING: File (unknown-error): Error downloading image from <GET http://qw.lixia.gov.cn/picture/0/s_29d03c6e89bc413cba694a5e849ec1e3.jpg> referred in <None>: 'splash'
[(False, <twisted.python.failure.Failure scrapy.pipelines.files.FileException: >), (False, <twisted.python.failure.Failure scrapy.pipelines.files.FileException: >)]
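The list printed at the end of the log is what item_completed receives: one (ok, value) pair per image, where ok is False for a failed download. The sketch below reproduces only the filtering step from the pipeline, to show why image_path comes out empty when every download fails. The helper name build_image_paths and the sample path full/a.jpg are hypothetical, and plain Exception objects stand in for the twisted Failure values.

```python
def build_image_paths(results, base_url):
    """Return full URLs for the downloads that succeeded (ok is True)."""
    return [base_url + value["path"] for ok, value in results if ok]


BASE = "https://xxx.oss-cn-xxx.aliyuncs.com/"

# Both downloads failed, as in the log above (Exception stands in for Failure).
all_failed = [(False, Exception("download failed")),
              (False, Exception("download failed"))]

# Hypothetical case where one download succeeded.
one_ok = [(True, {"path": "full/a.jpg"}),
          (False, Exception("download failed"))]

print(build_image_paths(all_failed, BASE))
print(build_image_paths(one_ok, BASE))
```

Note that the comprehension itself raises nothing when every entry is False, so in this sketch an all-failed result simply yields an empty list. If items without images should be dropped, an explicit check for an empty list is probably needed.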

1 answer

Asker 我是一只有宝贝的熊 2019-03-07 17:19:08

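When I used SplashRequest to download the images, I tried printing response.body and found that the code never ran. With Request it does run, but it raises the error above. My guess is that overriding start_requests changed the native Request object that gets returned. Given this situation, how should I fix it?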

bobby #1
```
Add me on QQ at 442421039 and I'll take a look.
```
2019-03-10 11:56:18

Asker 我是一只有宝贝的熊 replying to bobby #2

def get_media_requests(self, item, info):
        try:
            for image_url in item['image']:
                # yield Request(image_url)
                yield SplashRequest(image_url, dont_process_response=True, endpoint='render.jpeg')
        except:
            pass
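I overrode this method. My reasoning was that, since start_requests uses SplashRequest, the upload failed because the pipeline was still using the native Request. That guess turned out to be correct, but the downloaded images have a problem: each one has a white border and its size is fixed at 1024x768. I don't understand what happens internally, and stepping through with breakpoints didn't reveal it.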

2019-03-26 17:34:15
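One possible reading of the fixed size: Splash's render.jpeg endpoint returns a screenshot of the rendered page, and its documented default viewport is 1024x768. That would match an image shown on a white page and captured at that size, rather than the original file. The sketch below only builds the equivalent HTTP API URL to make the parameters visible. The helper name splash_jpeg_url and the Splash address http://localhost:8050 are assumptions, and scrapy-splash itself may send the arguments differently.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def splash_jpeg_url(image_url, splash_base="http://localhost:8050", viewport=None):
    """Build a render.jpeg URL; with no viewport given, Splash applies its default."""
    params = {"url": image_url}
    if viewport is not None:
        # Format is "<width>x<height>", for example "800x600".
        params["viewport"] = viewport
    return splash_base + "/render.jpeg?" + urlencode(params)


url = splash_jpeg_url(
    "http://qw.lixia.gov.cn/picture/0/s_c39d2751a63f4c969116e11dd8d5ffdb.jpg"
)
print(url)
print(parse_qs(urlsplit(url).query))
```

If this reading is right, changing the viewport would change the screenshot size but would still not return the original image bytes.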
