Hello teacher, I'm using undetected_chromedriver on Windows. My existing code runs fine, but as soon as I add options.add_argument("--headless"), I get this error:
builtins.TypeError: '<' not supported between instances of 'NoneType' and 'int'
From what I've found, this error is said to be a version problem, and the course mentioned that undetected_chromedriver downloads chromedriver automatically. Does that have any connection to the version of Chrome installed on my Windows machine (the browser I use day to day)? What kind of problem is this, and how should I fix it?
Code:
import undetected_chromedriver as uc
from undetected_chromedriver import ChromeOptions

def build_driver():
    options = ChromeOptions()
    options.add_argument('--disable-blink-features=AutomationControlled')
    options.add_argument('--disable-infobars')
    options.add_argument('--disable-popup-blocking')
    options.add_argument("--headless")
    return uc.Chrome(options)
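A commonly suggested workaround (hedged — behavior may differ across undetected_chromedriver releases) is to pass the browser's major version explicitly as the version_main keyword, so the patcher never ends up with None. A minimal sketch of extracting the major number from the full version string shown at chrome://version (the uc.Chrome call itself is left as a comment, since it needs a live browser; the version string below is a made-up example):

```python
def chrome_major_version(version_string: str) -> int:
    """Extract the major version (e.g. 111) from a full Chrome
    version string such as '111.0.5563.65'."""
    return int(version_string.split(".")[0])

# Example with a hypothetical version string from chrome://version:
major = chrome_major_version("111.0.5563.65")
print(major)  # 111

# Hypothetical usage inside build_driver (requires undetected_chromedriver
# and a matching Chrome install):
# return uc.Chrome(options=options, version_main=major)
```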
Full error:
C:\Users\Administrator\PycharmProjects\scrapy-venv\Scripts\python.exe C:\Users\Administrator\PycharmProjects\housecrawler\housecrawler\main.py
C:\Users\Administrator\PycharmProjects\housecrawler\housecrawler\items.py:34: ScrapyDeprecationWarning: scrapy.loader.processors.MapCompose is deprecated, instantiate itemloaders.processors.MapCompose instead.
default_input_processor = MapCompose(str.strip)
C:\Users\Administrator\PycharmProjects\housecrawler\housecrawler\items.py:36: ScrapyDeprecationWarning: scrapy.loader.processors.MapCompose is deprecated, instantiate itemloaders.processors.MapCompose instead.
price_in = MapCompose(lambda x: x.replace(',', ''))
C:\Users\Administrator\PycharmProjects\housecrawler\housecrawler\items.py:38: ScrapyDeprecationWarning: scrapy.loader.processors.Join is deprecated, instantiate itemloaders.processors.Join instead.
default_output_processor = Join()
2023-03-28 11:16:54 [scrapy.utils.log] INFO: Scrapy 2.7.1 started (bot: housecrawler)
2023-03-28 11:16:54 [scrapy.utils.log] INFO: Versions: lxml 4.9.2.0, libxml2 2.9.12, cssselect 1.2.0, parsel 1.7.0, w3lib 2.1.1, Twisted 22.10.0, Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)], pyOpenSSL 23.0.0 (OpenSSL 3.0.7 1 Nov 2022), cryptography 39.0.0, Platform Windows-10-10.0.19044-SP0
2023-03-28 11:16:54 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'housecrawler',
'COOKIES_ENABLED': False,
'DUPEFILTER_CLASS': 'scrapy_redis_bloomfilter.RFPDupeFilter',
'LOG_FILE': 'rent.log',
'LOG_LEVEL': 'ERROR',
'NEWSPIDER_MODULE': 'housecrawler.spiders',
'REQUEST_FINGERPRINTER_IMPLEMENTATION': '2.7',
'ROBOTSTXT_OBEY': True,
'SCHEDULER': 'scrapy_redis.scheduler.Scheduler',
'SPIDER_MODULES': ['housecrawler.spiders'],
'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'}
Unhandled error in Deferred:
Traceback (most recent call last):
File "C:\Users\Administrator\PycharmProjects\scrapy-venv\lib\site-packages\scrapy\crawler.py", line 220, in crawl
return self._crawl(crawler, *args, **kwargs)
File "C:\Users\Administrator\PycharmProjects\scrapy-venv\lib\site-packages\scrapy\crawler.py", line 224, in _crawl
d = crawler.crawl(*args, **kwargs)
File "C:\Users\Administrator\PycharmProjects\scrapy-venv\lib\site-packages\twisted\internet\defer.py", line 1947, in unwindGenerator
return _cancellableInlineCallbacks(gen)
File "C:\Users\Administrator\PycharmProjects\scrapy-venv\lib\site-packages\twisted\internet\defer.py", line 1857, in _cancellableInlineCallbacks
_inlineCallbacks(None, gen, status, _copy_context())
--- <exception caught here> ---
File "C:\Users\Administrator\PycharmProjects\scrapy-venv\lib\site-packages\twisted\internet\defer.py", line 1697, in _inlineCallbacks
result = context.run(gen.send, result)
File "C:\Users\Administrator\PycharmProjects\scrapy-venv\lib\site-packages\scrapy\crawler.py", line 115, in crawl
self.spider = self._create_spider(*args, **kwargs)
File "C:\Users\Administrator\PycharmProjects\scrapy-venv\lib\site-packages\scrapy\crawler.py", line 127, in _create_spider
return self.spidercls.from_crawler(self, *args, **kwargs)
File "C:\Users\Administrator\PycharmProjects\scrapy-venv\lib\site-packages\scrapy_redis\spiders.py", line 244, in from_crawler
obj = super(RedisSpider, cls).from_crawler(crawler, *args, **kwargs)
File "C:\Users\Administrator\PycharmProjects\scrapy-venv\lib\site-packages\scrapy\spiders\__init__.py", line 48, in from_crawler
spider = cls(*args, **kwargs)
File "C:\Users\Administrator\PycharmProjects\housecrawler\housecrawler\spiders\rent.py", line 43, in __init__
self.node_driver = build_driver()
File "C:\Users\Administrator\PycharmProjects\housecrawler\housecrawler\spiders\rent.py", line 41, in build_driver
return uc.Chrome(options)
File "C:\Users\Administrator\PycharmProjects\scrapy-venv\lib\site-packages\undetected_chromedriver\__init__.py", line 374, in __init__
if self.patcher.version_main < 108:
builtins.TypeError: '<' not supported between instances of 'NoneType' and 'int'
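The last two traceback lines can be reproduced in plain Python, which shows what is going on: when undetected_chromedriver's patcher fails to detect the installed Chrome's major version, version_main stays None, and comparing None with an int raises exactly this TypeError (a minimal illustration, not the library's actual code path):

```python
# What the patcher holds when Chrome version detection fails:
version_main = None

try:
    # Same shape of comparison as undetected_chromedriver's version check
    if version_main < 108:
        pass
except TypeError as exc:
    print(exc)  # '<' not supported between instances of 'NoneType' and 'int'
```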