(Solved) browser = uc.Chrome()

(Solved) Teacher!! browser = uc.Chrome() is failing. I added

    if __name__ == '__main__':
        freeze_support()

but it still doesn't work. The error output is below:
D:\Anaconda3\python.exe "D:\pycharm\PyCharm Community Edition 2020.2\plugins\python-ce\helpers\pydev\pydevd.py" --cmd-line --multiproc --qt-support=auto --client 127.0.0.1 --port 14261 --file D:/eswork-master/articles/main.py
pydev debugger: process 28624 is connecting
Connected to pydev debugger (build 202.6397.98)
2022-02-08 01:55:14 [scrapy.utils.log] INFO: Scrapy 2.5.1 started (bot: articles)
2022-02-08 01:55:14 [scrapy.utils.log] INFO: Versions: lxml 4.5.0.0, libxml2 2.9.9, cssselect 1.1.0, parsel 1.5.2, w3lib 1.21.0, Twisted 20.3.0, Python 3.7.3 (default, Mar 27 2019, 17:13:21) [MSC v.1915 64 bit (AMD64)], pyOpenSSL 19.1.0 (OpenSSL 1.1.1m 14 Dec 2021), cryptography 2.8, Platform Windows-10-10.0.17134-SP0
2022-02-08 01:55:14 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2022-02-08 01:55:14 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'articles',
'NEWSPIDER_MODULE': 'articles.spiders',
'SPIDER_MODULES': ['articles.spiders']}
2022-02-08 01:55:14 [scrapy.extensions.telnet] INFO: Telnet Password: 0597b55467723803
2022-02-08 01:55:14 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
2022-02-08 01:55:14 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-02-08 01:55:14 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-02-08 01:55:14 [scrapy.middleware] INFO: Enabled item pipelines:
['articles.pipelines.ElasticsearchPipeline']
2022-02-08 01:55:14 [scrapy.core.engine] INFO: Spider opened
2022-02-08 01:55:14 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-02-08 01:55:14 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-02-08 01:55:14 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://news.cnblogs.com/n/page/1> (referer: None)
2022-02-08 01:55:14 [undetected_chromedriver.patcher] DEBUG: getting release number from /LATEST_RELEASE
2022-02-08 01:55:15 [undetected_chromedriver.patcher] DEBUG: downloading from https://chromedriver.storage.googleapis.com/98.0.4758.80/chromedriver_win32.zip
2022-02-08 01:55:16 [undetected_chromedriver.patcher] DEBUG: unzipping C:\Users\Lenovo\AppData\Local\Temp\tmpzgwnf88f
2022-02-08 01:55:16 [undetected_chromedriver.patcher] INFO: patching driver executable C:\Users\Lenovo\appdata\roaming\undetected_chromedriver\chromedriver.exe
2022-02-08 01:55:21 [scrapy.utils.log] INFO: Scrapy 2.5.1 started (bot: articles)
2022-02-08 01:55:21 [scrapy.utils.log] INFO: Versions: lxml 4.5.0.0, libxml2 2.9.9, cssselect 1.1.0, parsel 1.5.2, w3lib 1.21.0, Twisted 20.3.0, Python 3.7.3 (default, Mar 27 2019, 17:13:21) [MSC v.1915 64 bit (AMD64)], pyOpenSSL 19.1.0 (OpenSSL 1.1.1m 14 Dec 2021), cryptography 2.8, Platform Windows-10-10.0.17134-SP0
2022-02-08 01:55:21 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2022-02-08 01:55:21 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'articles',
'NEWSPIDER_MODULE': 'articles.spiders',
'SPIDER_MODULES': ['articles.spiders']}
2022-02-08 01:55:21 [scrapy.extensions.telnet] INFO: Telnet Password: 03f6f96c1e07c8f7
2022-02-08 01:55:21 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
2022-02-08 01:55:21 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-02-08 01:55:21 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-02-08 01:55:21 [scrapy.middleware] INFO: Enabled item pipelines:
['articles.pipelines.ElasticsearchPipeline']
2022-02-08 01:55:21 [scrapy.core.engine] INFO: Spider opened
2022-02-08 01:55:21 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-02-08 01:55:21 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6024
2022-02-08 01:55:21 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://news.cnblogs.com/n/page/1> (referer: None)
2022-02-08 01:55:22 [undetected_chromedriver.patcher] DEBUG: getting release number from /LATEST_RELEASE
2022-02-08 01:55:22 [undetected_chromedriver.patcher] DEBUG: downloading from https://chromedriver.storage.googleapis.com/98.0.4758.80/chromedriver_win32.zip
2022-02-08 01:55:23 [undetected_chromedriver.patcher] DEBUG: unzipping C:\Users\Lenovo\AppData\Local\Temp\tmpp_33ux_j
2022-02-08 01:55:24 [undetected_chromedriver.patcher] INFO: patching driver executable C:\Users\Lenovo\appdata\roaming\undetected_chromedriver\chromedriver.exe
2022-02-08 01:55:24 [scrapy.core.scraper] ERROR: Spider error processing <GET https://news.cnblogs.com/n/page/1> (referer: None)
Traceback (most recent call last):
File "D:\Anaconda3\lib\site-packages\scrapy\utils\defer.py", line 120, in iter_errback
yield next(it)
File "D:\Anaconda3\lib\site-packages\scrapy\utils\python.py", line 353, in __next__
return next(self.data)
File "D:\Anaconda3\lib\site-packages\scrapy\utils\python.py", line 353, in __next__
return next(self.data)
File "D:\Anaconda3\lib\site-packages\scrapy\core\spidermw.py", line 56, in _evaluate_iterable
for r in iterable:
File "D:\Anaconda3\lib\site-packages\scrapy\spidermiddlewares\offsite.py", line 29, in process_spider_output
for x in result:
File "D:\Anaconda3\lib\site-packages\scrapy\core\spidermw.py", line 56, in _evaluate_iterable
for r in iterable:
File "D:\Anaconda3\lib\site-packages\scrapy\spidermiddlewares\referer.py", line 342, in <genexpr>
return (_set_referer(r) for r in result or ())
File "D:\Anaconda3\lib\site-packages\scrapy\core\spidermw.py", line 56, in _evaluate_iterable
for r in iterable:
File "D:\Anaconda3\lib\site-packages\scrapy\spidermiddlewares\urllength.py", line 40, in <genexpr>
return (r for r in result or () if _filter(r))
File "D:\Anaconda3\lib\site-packages\scrapy\core\spidermw.py", line 56, in _evaluate_iterable
for r in iterable:
File "D:\Anaconda3\lib\site-packages\scrapy\spidermiddlewares\depth.py", line 58, in <genexpr>
return (r for r in result or () if _filter(r))
File "D:\Anaconda3\lib\site-packages\scrapy\core\spidermw.py", line 56, in _evaluate_iterable
for r in iterable:
File "D:\eswork-master\articles\articles\spiders\pm_spider.py", line 49, in parse
browser = uc.Chrome()
File "D:\Anaconda3\lib\site-packages\undetected_chromedriver\__init__.py", line 357, in __init__
options.binary_location, *options.arguments
File "D:\Anaconda3\lib\site-packages\undetected_chromedriver\dprocess.py", line 34, in start_detached
daemon=True,
File "D:\Anaconda3\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
File "D:\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "D:\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "D:\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "D:\Anaconda3\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
_check_not_importing_main()
File "D:\Anaconda3\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:

    if __name__ == '__main__':
        freeze_support()

The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
2022-02-08 01:55:24 [scrapy.core.engine] INFO: Closing spider (finished)
2022-02-08 01:55:24 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 224,
'downloader/request_count': 1,
'downloader/request_method_count/GET': 1,
'downloader/response_bytes': 16567,
'downloader/response_count': 1,
'downloader/response_status_count/200': 1,
'elapsed_time_seconds': 2.543997,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2022, 2, 7, 17, 55, 24, 205104),
'httpcompression/response_bytes': 80409,
'httpcompression/response_count': 1,
'log_count/DEBUG': 4,
'log_count/ERROR': 1,
'log_count/INFO': 11,
'response_received_count': 1,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
'spider_exceptions/RuntimeError': 1,
'start_time': datetime.datetime(2022, 2, 7, 17, 55, 21, 661107)}
2022-02-08 01:55:24 [scrapy.core.engine] INFO: Spider closed (finished)
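
The RuntimeError at the end of the traceback is the standard Windows multiprocessing complaint: undetected_chromedriver launches chromedriver through a detached child process, Windows creates child processes by spawning (re-importing the entry module) rather than forking, so any top-level code in main.py that starts the crawl runs a second time in the child. That re-import is why the whole Scrapy startup banner appears twice in the log above. The guard therefore has to wrap the code that actually launches the crawl, not just the freeze_support() call. Below is a minimal sketch of a spawn-safe main.py, assuming the project starts the crawl through scrapy.cmdline.execute; the spider name "pm_spider" is only a guess from the spider file in the traceback:

    # main.py -- spawn-safe entry point for the crawl.
    # On Windows, multiprocessing re-imports this module in every child
    # process; the __main__ guard keeps that re-import from relaunching
    # the crawl.
    import os
    import sys
    from multiprocessing import freeze_support
    from scrapy.cmdline import execute

    if __name__ == '__main__':
        freeze_support()  # only needed when frozen into an .exe; harmless otherwise
        sys.path.append(os.path.dirname(os.path.abspath(__file__)))
        execute(["scrapy", "crawl", "pm_spider"])  # hypothetical spider name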

1 Answer

Asker 慕瓜9058083 2022-02-08 17:06:18

Solved.

  • bobby #1
    OK.
    2022-02-09 23:25:10
  • Jynine #2
    How did you solve it?
    2022-02-10 09:47:49
  • Asker 慕瓜9058083 replied to Jynine #3
    Modify the main file:
    import os
    import sys
    from scrapy.cmdline import execute

    if __name__ == '__main__':
        sys.path.append(os.path.dirname(os.path.abspath(__file__)))
        execute(["scrapy", "crawl", "xxxxx"])  # replace xxxxx with your own spider name
    2022-02-10 17:39:55
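
This fix works because spawn-based multiprocessing re-imports main.py inside the child process that undetected_chromedriver starts for the browser: with the execute(...) call moved under the __main__ guard, the re-import no longer relaunches the crawl (the log above shows Scrapy starting twice without it), so only the original process runs the spider and uc.Chrome() can create its child process cleanly.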