Python: How to use Selenium to open multiple URLs from a column in an xlsx file


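I am an absolute noob and I have the following scenario: I have an Excel file in which one column holds more than 4000 URLs, each in its own cell. Each URL links to a Facebook-like page where the user is asked to set a password. I need to use Python to retrieve each URL from the column, open it in Chrome, enter the same assigned password for every user, and then verify that it lands on the home page.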

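Step by step:

1 Open the Excel spreadsheet with openpyxl

2 Find the column with the URLs

3 Make a list of the URLs

4 Have Chrome open the first URL

6 Find the password field

7 Enter the same password for all users

8 Confirm that it lands on the home page

9 Loop over all the other URLs in the column until the end

10 Preferably get a report that confirms the number of failures (if any)

This is my code so far: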

# I can open the file
import openpyxl
wb = openpyxl.load_workbook('Test Sheet.xlsx')
type(wb)

# get the name of the sheet I need to work with
print(wb.sheetnames)

# output
# <Worksheet "Users">

# this loop prints the urls currently in my file
sheet = wb['Users']
for x in range(2, 4):
    print(x, sheet.cell(row=x, column=3).value)

# output
# 2 https://firstfacebookpage.com
# 3 https://secondfacebookpage.com


# I found this other way to retrieve the urls from the excel spreadsheet.
ws = wb['Users']
column = ws['C']
column_list = [cell.value for cell in column]
print(column_list)

# output while having only 2 urls in the test sheet.
# ['Claim Link', 'https://somefacebookurl.com', 'https://someotherfacebookurl.com', None, None, None,
# None, None, None, None, None, None, None, None, None, None, None, None, None, None]
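
The column output above still contains the header cell ('Claim Link') and a run of None values from empty cells, so it cannot be handed to the browser as is. Below is a minimal sketch of one way to reduce it to a list of URLs. The helper names clean_urls and load_urls are made up for illustration, and the rule "a URL is a string starting with http:// or https://" is an assumption about the sheet, not something stated in the question.

```python
def clean_urls(values):
    """Keep only cell values that look like URLs.

    Drops the header cell, empty cells (None) and anything that is not a
    string starting with http:// or https://.
    """
    urls = []
    for value in values:
        if not isinstance(value, str):
            continue  # None and numeric cells are skipped
        text = value.strip()
        if text.lower().startswith(("http://", "https://")):
            urls.append(text)
    return urls


def load_urls(path, sheet_name="Users", column_letter="C"):
    """Read one column of a workbook and return the URLs found in it."""
    # Imported here so clean_urls() can be used without openpyxl installed.
    import openpyxl

    wb = openpyxl.load_workbook(path)
    ws = wb[sheet_name]
    return clean_urls(cell.value for cell in ws[column_letter])
```

With the test sheet from the question, load_urls('Test Sheet.xlsx') should return just the two URLs, and that list can then be used in a for loop around driver.get.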

# This login, enter password, verify, close browser, works perfectly if I manually enter the url.
# (Locators use find_element(By...), because the find_element_by_* helpers were removed in Selenium 4.)
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://firstfacebookpage.com")

password_box = driver.find_element(By.CLASS_NAME, 'inputpassword')
password_box.send_keys("theonepassword")
print("Password entered")

login_box = driver.find_element(By.ID, 'u_0_9')
login_box.click()

print("Done")
driver.close()
print("Finished")


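Now I can't figure out a way to make "driver.get" take the URLs from the spreadsheet and loop these login steps. Since my file has more than 4000 URLs in the column, I would rather have the script do this for me.

Any help would be appreciated.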

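You can try pandas and xlrd (for .xlsx and .xlsm files, current pandas versions use openpyxl as the reader, since xlrd 2.0 and later only reads .xls).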

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException


df = pd.read_excel('myurls.xlsm')  # Get all the urls from the excel
mylist = df['urls'].tolist()  # urls is the column name

print(mylist)  # will print all the urls

# now loop through each url & perform actions.
for url in mylist:
    driver = webdriver.Chrome()  # a fresh browser for every url
    driver.get(url)

    try:
        # wait up to 3 seconds for a JavaScript alert and accept it if one shows up
        WebDriverWait(driver, 3).until(EC.alert_is_present(), 'Timed out waiting for alert.')

        alert = driver.switch_to.alert
        alert.accept()
        print("alert accepted")
    except TimeoutException:
        print("no alert")
    password_box = driver.find_element(By.CLASS_NAME, 'inputpassword')
    password_box.send_keys("theonepassword")
    print("Password entered")
    login_box = driver.find_element(By.ID, 'u_0_9')
    login_box.click()
    driver.quit()  # quit() also stops the chromedriver process started for this url


print("Done")
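
The loop above stops at the first URL that raises an error, and it does not produce the failure report asked for in step 10 of the question. Here is a rough sketch of one way to keep going and count failures. The names run_all, login_with_selenium and print_report are hypothetical; the locators 'inputpassword' and 'u_0_9' are copied from the question. The check that the browser really landed on the home page is left out, because the question does not say how that page can be recognised.

```python
def run_all(urls, visit):
    """Call visit(url) for every URL and collect the ones that raise.

    visit is any callable that performs the login for a single URL and
    raises an exception when something goes wrong.
    Returns a list of (url, error text) pairs for the failed URLs.
    """
    failures = []
    for url in urls:
        try:
            visit(url)
        except Exception as exc:
            # Record the problem and move on, so one bad URL
            # does not stop the remaining ones.
            failures.append((url, repr(exc)))
    return failures


def login_with_selenium(url, password="theonepassword"):
    """Open url in a new Chrome window, type the password and submit."""
    # Imported here so run_all() can be tried without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        driver.find_element(By.CLASS_NAME, "inputpassword").send_keys(password)
        driver.find_element(By.ID, "u_0_9").click()
    finally:
        driver.quit()  # always release the browser, even after an error


def print_report(total, failures):
    """Print how many URLs succeeded and list the ones that failed."""
    print(f"{total - len(failures)} of {total} URLs succeeded, {len(failures)} failed")
    for url, error in failures:
        print(f"FAILED {url}: {error}")
```

Usage would be along the lines of failures = run_all(mylist, login_with_selenium) followed by print_report(len(mylist), failures). Passing the login step in as a function also makes it possible to try the loop with a dummy function before pointing it at 4000 real URLs.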

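Why do you need Selenium to read the Excel file? Can't you just read it directly? Use Python to read the Excel file and store the values in a list/array/set.

I am using openpyxl to read the Excel file. I thought I could build something with it and then have Selenium open the URLs. I assumed the modules could talk to each other, but that may be an unrealistic idea.

I understood the line mylist = df['Users'].tolist() #urls is the column name as identifying the sheet. My sheet is named "Users", but it gives a KeyError. I wanted to paste the error log, but it seems to exceed the allowed number of characters.

Users is the column name. I have updated the answer. Sorry for the confusion.

Now it works with the first URL in the list, but it stops on the home page when the "Show notifications - Allow / Block" prompt appears a second time. Then it gives an error saying the element "inputpassword" cannot be found. I guess it is still trying to log in on the home page instead of closing the window and moving on to the next URL in the list, but I don't know why.

The alert is unexpected. Do you want to open the URLs one by one in the same browser, or in a new browser each time? If you use a new browser each time, does the alert still appear?

If by "new browser" you mean a new window or tab, I think that would be ideal. I am not sure whether it has to be done one at a time or whether I could set it up to process 5 or 10 URLs at once. Since it is a large number, that could save a lot of time.

The "allow notifications" message appears twice; the second time is after the first URL has loaded, the login has worked and it has landed on the home page. At that point the browser should close, reopen and go to the next URL. But the "allow notifications" pop-up is still on the screen and the script throws the error about "inputpassword".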
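
Two points from the comments can be sketched in code. First, the "Show notifications" prompt is a browser permission prompt rather than a JavaScript alert, so switch_to.alert will probably not reach it; a commonly used workaround is to start Chrome with the --disable-notifications switch. Second, a small thread pool can work through several URLs at once. This is only a sketch under those assumptions: make_driver and run_in_parallel are made-up names, and whether the site tolerates 5 or 10 simultaneous logins is not known from the thread.

```python
from concurrent.futures import ThreadPoolExecutor

# Chrome command-line switches applied to every browser that gets started.
CHROME_ARGUMENTS = ["--disable-notifications"]


def make_driver():
    """Start Chrome with the notification permission prompt switched off."""
    # Imported here so the rest of the sketch runs without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in CHROME_ARGUMENTS:
        options.add_argument(argument)
    return webdriver.Chrome(options=options)


def run_in_parallel(urls, visit, workers=5):
    """Call visit(url) for every URL using a pool of worker threads.

    Returns one (url, error) pair per URL, in the same order as urls;
    error is None when visit(url) finished without raising.
    """
    def safe_visit(url):
        try:
            visit(url)
            return (url, None)
        except Exception as exc:
            return (url, repr(exc))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(safe_visit, urls))
```

Here visit would be a function that calls make_driver(), performs the login for one URL and quits the driver again, so that every worker thread has its own browser. Sharing a single driver between threads is best avoided.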