
Hi teacher, the simulated login in this section throws an error. Could you help me resolve it? Thanks!

The error output is as follows:
2023-08-22 05:24:28 [scrapy.utils.log] INFO: Scrapy 2.9.0 started (bot: ArticleSpider)
2023-08-22 05:24:28 [scrapy.utils.log] INFO: Versions: lxml 4.9.3.0, libxml2 2.10.3, cssselect 1.2.0, parsel 1.8.1, w3lib 2.1.2, Twisted 22.10.0, Python 3.7.9 (tags/v3.7.9:13c94747c7, Aug 17 2020, 18:58:18) [MSC v.1900 64 bit (AMD64)], pyOpenSSL 23.2.0 (OpenSSL 3.1.2 1 Aug 2023), cryptography 41.0.3, Platform Windows-10-10.0.22000-SP0
2023-08-22 05:24:28 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'ArticleSpider',
'FEED_EXPORT_ENCODING': 'utf-8',
'NEWSPIDER_MODULE': 'ArticleSpider.spiders',
'REQUEST_FINGERPRINTER_IMPLEMENTATION': '2.7',
'ROBOTSTXT_OBEY': True,
'SPIDER_MODULES': ['ArticleSpider.spiders'],
'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'}
2023-08-22 05:24:28 [asyncio] DEBUG: Using selector: SelectSelector
2023-08-22 05:24:28 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.asyncioreactor.AsyncioSelectorReactor
2023-08-22 05:24:28 [scrapy.utils.log] DEBUG: Using asyncio event loop: asyncio.windows_events._WindowsSelectorEventLoop
2023-08-22 05:24:28 [scrapy.extensions.telnet] INFO: Telnet Password: 4379517276a4322e
2023-08-22 05:24:28 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
2023-08-22 05:24:29 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2023-08-22 05:24:29 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2023-08-22 05:24:29 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2023-08-22 05:24:29 [scrapy.core.engine] INFO: Spider opened
2023-08-22 05:24:29 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2023-08-22 05:24:29 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2023-08-22 05:24:29 [undetected_chromedriver.patcher] DEBUG: getting release number from /LATEST_RELEASE
2023-08-22 05:24:29 [scrapy.core.engine] ERROR: Error while obtaining start requests
Traceback (most recent call last):
File "d:\pro\python37\lib\urllib\request.py", line 1350, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "d:\pro\python37\lib\http\client.py", line 1277, in request
self._send_request(method, url, body, headers, encode_chunked)
File "d:\pro\python37\lib\http\client.py", line 1323, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "d:\pro\python37\lib\http\client.py", line 1272, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "d:\pro\python37\lib\http\client.py", line 1032, in _send_output
self.send(msg)
File "d:\pro\python37\lib\http\client.py", line 972, in send
self.connect()
File "d:\pro\python37\lib\http\client.py", line 1447, in connect
server_hostname=server_hostname)
File "d:\pro\python37\lib\ssl.py", line 423, in wrap_socket
session=session
File "d:\pro\python37\lib\ssl.py", line 870, in _create
self.do_handshake()
File "d:\pro\python37\lib\ssl.py", line 1139, in do_handshake
self._sslobj.do_handshake()
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "E:\Envs\article_spider\lib\site-packages\scrapy\core\engine.py", line 182, in next_request
request = next(self.slot.start_requests)
File "E:\Envs\ArticleSpider\ArticleSpider\spiders\jobbole.py", line 24, in start_requests
browser = uc.Chrome()
File "E:\Envs\article_spider\lib\site-packages\undetected_chromedriver\__init__.py", line 258, in __init__
self.patcher.auto()
File "E:\Envs\article_spider\lib\site-packages\undetected_chromedriver\patcher.py", line 157, in auto
release = self.fetch_release_number()
File "E:\Envs\article_spider\lib\site-packages\undetected_chromedriver\patcher.py", line 225, in fetch_release_number
return LooseVersion(urlopen(self.url_repo + path).read().decode())
File "d:\pro\python37\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "d:\pro\python37\lib\urllib\request.py", line 525, in open
response = self._open(req, data)
File "d:\pro\python37\lib\urllib\request.py", line 543, in _open
'_open', req)
File "d:\pro\python37\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "d:\pro\python37\lib\urllib\request.py", line 1393, in https_open
context=self._context, check_hostname=self._check_hostname)
File "d:\pro\python37\lib\urllib\request.py", line 1352, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10054] An existing connection was forcibly closed by the remote host>
2023-08-22 05:24:29 [scrapy.core.engine] INFO: Closing spider (finished)
2023-08-22 05:24:29 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'elapsed_time_seconds': 0.10559,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2023, 8, 21, 21, 24, 29, 229218),
'log_count/DEBUG': 4,
'log_count/ERROR': 1,
'log_count/INFO': 10,
'start_time': datetime.datetime(2023, 8, 21, 21, 24, 29, 123628)}
2023-08-22 05:24:29 [scrapy.core.engine] INFO: Spider closed (finished)

Process finished with exit code 0


1 Answer

bobby 2023-08-22 18:09:15

Paste the code for this spider here and I'll try running it locally.

  • Thanks for the reply, teacher. I've since solved the simulated login myself. The cause was a problem with the Chrome browser driver, and I fixed it with the following:

        from seleniumbase import Driver
        browser = Driver(uc=True, incognito=True)
        browser.get("https://account.cnblogs.com/signin")

    P.S.: if there is anything to improve, please point it out. Thanks!
    2023-08-24 05:15:29
  • bobby replied to asker weixin_慕盖茨0312499 #2
    OK...
    2023-08-27 22:19:48
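For readers hitting the same error: the traceback shows `uc.Chrome()` failing while `undetected_chromedriver`'s patcher fetches `/LATEST_RELEASE` over the network; `seleniumbase` manages its own driver download, which is why the workaround above sidesteps the problem. Below is a slightly fuller sketch of that workaround. Only the `Driver(uc=True, incognito=True)` call and the sign-in URL come from the reply above; the helper names and the manual-login/cookie-extraction flow are illustrative assumptions, not code from the course.

```python
def cookies_for_scrapy(selenium_cookies):
    """Convert Selenium's list-of-dicts cookie format into the plain
    name -> value mapping that scrapy.Request(cookies=...) accepts."""
    return {c["name"]: c["value"] for c in selenium_cookies}


def login_with_seleniumbase(signin_url="https://account.cnblogs.com/signin"):
    """Open an undetected-Chrome window via seleniumbase (hypothetical
    helper), wait for a manual login, then return session cookies."""
    from seleniumbase import Driver  # third-party; imported lazily so the
                                     # module still loads without it
    browser = Driver(uc=True, incognito=True)
    try:
        browser.get(signin_url)
        input("Log in in the browser window, then press Enter here...")
        return cookies_for_scrapy(browser.get_cookies())
    finally:
        browser.quit()
```

The returned dict could then be passed to the spider's requests via `scrapy.Request(url, cookies=...)`, assuming that is how the rest of the spider consumes the login state.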