selenium.common.exceptions.InvalidArgumentException: Message: invalid argument error when invoking get() with URLs read from a text file using Selenium Python

python list selenium for-loop selenium-webdriver

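I have a list of URLs in a .txt file that I would like to run through Selenium.

Say the file is named b.txt and contains 2 URLs (in exactly the format shown below): ,

What I am trying to do is have Selenium open both URLs (from the .txt file), but the code seems to fail every time it reaches the "driver.get" line.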

url = open ('b.txt','r')
url_rpt = url.read().split(",")
options = Options()
options.add_argument('--headless')
options.add_argument('--disable-gpu')
driver = webdriver.Chrome(chrome_options=options)
for link in url_rpt:
   driver.get(link)
driver.quit()

The result I get when running the code is:

Traceback (most recent call last):
  File "C:/Users/ASUS/PycharmProjects/XXXX/Test.py", line 22, in <module>
    driver.get(link)
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 333, in get
    self.execute(Command.GET, {'url': url})
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument
  (Session info: headless chrome=79.0.3945.117)

Any suggestions on how to rewrite the code?

This error message...

Traceback (most recent call last):
  .
    driver.get(link)
  .
    self.execute(Command.GET, {'url': url})
  .
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument
  (Session info: chrome=79.0.3945.117)

...implies that the url passed as an argument to get() was an invalid argument.

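I was able to reproduce the same traceback when the text file containing the list of urls had a space character after the separator that follows the last url. There may be a space character at the very end of b.txt, as in the following line (note the space after the final comma):
https://www.google.com/,https://www.bing.com/, 

Debugging: The ideal way to debug this is to print url_rpt, which reveals the space character as follows: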

Code block:

url = open('url_list.txt', 'r')
url_rpt = url.read().split(",")
print(url_rpt)

Console output:

['https://www.google.com/', 'https://www.bing.com/', ' ']

Solution: If you remove the space character from the end of the file, your own code executes perfectly:

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
url = open('url_list.txt', 'r')
url_rpt = url.read().split(",")
print(url_rpt)
# Visit each url in turn, then close the browser once the loop is done
for link in url_rpt:
    driver.get(link)
driver.quit()
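Instead of editing the file by hand, the list can also be cleaned while it is being parsed, so that a trailing separator or stray whitespace never reaches get(). The following is only a minimal sketch of that idea, not code from the original answer: parse_urls is a hypothetical helper name, and the sample string stands in for the contents of b.txt.

```python
def parse_urls(text, sep=","):
    """Split text on sep, strip surrounding whitespace, drop empty entries."""
    return [part.strip() for part in text.split(sep) if part.strip()]


# Stand-in for the file contents: note the trailing comma and space.
sample = "https://www.google.com/,https://www.bing.com/, "
print(parse_urls(sample))
```

With a helper like this, the loop would iterate over parse_urls(url.read()) instead of url.read().split(","), and the whitespace-only last entry would simply be skipped.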

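I also ran into a similar problem, where Selenium failed while opening a URL and printed the following message:

selenium.common.exceptions.InvalidArgumentException: Message: invalid argument (Session info: MicrosoftEdge=91.0.852.0)
After looking closely, I found that my url string was in "UTF-8" and contained a leading ZWNBSP character, so selenium could not accept the url (I was reading the list of urls from a file, which is what caused the error). In my opinion, selenium should report this error better (by saying that the URL argument is invalid).
To correct the problem, I cleaned my URL with the following code: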

url = url.encode('ascii', 'ignore').decode('unicode_escape')
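Note that encoding to ASCII with 'ignore' drops every non-ASCII character, not just the ZWNBSP. A narrower alternative is to remove only U+FEFF (the code point used for both ZWNBSP and the byte order mark). The sketch below assumes that this is the only offending character; clean_url is a hypothetical helper name, not part of Selenium.

```python
ZWNBSP = "\ufeff"  # zero width no-break space, also used as the byte order mark


def clean_url(raw):
    """Remove U+FEFF characters and surrounding whitespace from one URL."""
    return raw.replace(ZWNBSP, "").strip()


raw = "\ufeffhttps://www.google.com/\n"
print(repr(raw))             # repr() makes the otherwise invisible character visible
print(repr(clean_url(raw)))

# Decoding with the "utf-8-sig" codec skips a BOM at the very start of the data.
data = b"\xef\xbb\xbfhttps://www.bing.com/"
print(repr(data.decode("utf-8-sig")))
```

When the BOM only appears at the start of the file, opening the file with encoding="utf-8-sig" should be enough; the explicit replace also covers a U+FEFF that ends up in the middle of the list.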

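What do you mean by "fail"? Do you get an exception? If so, what are the message and the stack trace? We need that basic information.

In the for loop, above driver.get(link), add a line print(link).

What do you mean by "the code fails"? What is the error message? What happens if you run for url in url_rpt: print(url)? This may not be a Selenium problem at all, but could be related to how the URLs are entered and read. That would help narrow down whether Selenium is really throwing the error or whether the problem lies with the file.

I will update the post with this.

@Christine: Thanks! If I run for url in url_rpt: print(url), it returns both links. I realized there is a comma at the end of the list! Thanks a lot for highlighting this!!

I got the same error when I forgot to start the url with https://, same as @philomath.

I hit this exception in the driver.get() function and solved it by using http:// as the prefix (in my case http://localhost).

FYI, these extra characters (e.g. ZWNBSP) may not be visible if we just print the url to check.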
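The comments above point at two further causes: a missing http:// or https:// prefix, and characters that a plain print() does not show. As a rough sketch, each entry could be checked before it is handed to driver.get(). The name looks_navigable is hypothetical, and the check is deliberately narrow: it only accepts http and https URLs, although a browser can open other schemes as well.

```python
from urllib.parse import urlsplit


def looks_navigable(url):
    """Return True if url has an http(s) scheme and a host part."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# repr() shows whitespace and other hidden characters that print() would hide.
for candidate in ["https://www.google.com/", "localhost", "http://localhost", " "]:
    print(repr(candidate), looks_navigable(candidate))
```

Entries that fail the check can be logged with repr() and skipped, which makes a bad line in the file easier to spot than the bare "invalid argument" message.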