Python 3.x: Only getting the first row of a table when scraping with Selenium (Python)


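I want to scrape the entire table from the following:

I managed to scrape it, but the result contains only the first row of the table instead of all 20 rows.

I looked at a similar question that was answered earlier and tried the solutions given there, but when I use "/" in the xpath, Selenium cannot find the element.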

for bod in driver.find_elements_by_xpath("//*[@id='proxylisttable']/tbody"):
    col = bod.find_elements_by_xpath("//*[@id='proxylisttable']/tbody/tr")
    for c in col:
        ip = c.find_element_by_xpath('//*[@id="proxylisttable"]/tbody/tr/td[1]')
        port = c.find_element_by_xpath('//*[@id="proxylisttable"]/tbody/tr/td[2]')
        code = c.find_element_by_xpath('//*[@id="proxylisttable"]/tbody/tr/td[3]')
        country = c.find_element_by_xpath('//*[@id="proxylisttable"]/tbody/tr/td[4][@class="hm"]')
        anonymity = c.find_element_by_xpath('//*[@id="proxylisttable"]/tbody/tr/td[5]')
        google = c.find_element_by_xpath('//*[@id="proxylisttable"]/tbody/tr/td[6][@class="hm"]')

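My code scrapes the first row 20 times instead of getting 20 different rows. The row values are read into ip, port, code, and so on. I have tried several kinds of xpath syntax, but the result is still the same.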

Modify the code so that the second for loop has an index that runs from 1 to the length of col, and use that index to find each column element:

ip = c.find_element_by_xpath('//*[@id="proxylisttable"]/tbody/tr[index]/td[1]')

Note: the line above is schematic; substitute the value of index into the string using valid Python syntax.
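As a rough illustration of the index idea above, the sketch below builds a 1-based row index into the path string. It is not Selenium code: it uses the standard library's xml.etree.ElementTree on a small, hypothetical two-row table (the HTML string and the IP values are made up for the example), so that it runs without a browser. ElementTree supports only a subset of XPath, so the paths here start with "./" rather than "//".

```python
import xml.etree.ElementTree as ET

# Hypothetical two-row table standing in for the real page.
HTML = """
<table id="proxylisttable">
  <tbody>
    <tr><td>10.0.0.1</td><td>8080</td></tr>
    <tr><td>10.0.0.2</td><td>3128</td></tr>
  </tbody>
</table>
"""

table = ET.fromstring(HTML)
row_count = len(table.findall("./tbody/tr"))

ips = []
# XPath positions are 1-based, so the index runs from 1 to row_count inclusive.
for index in range(1, row_count + 1):
    # Substitute the number into the path; a literal "tr[index]" would not work.
    xpath = "./tbody/tr[{}]/td[1]".format(index)
    ips.append(table.find(xpath).text)

print(ips)
```

With Selenium the same string-building step would apply to the xpath passed to find_element_by_xpath; only the lookup call differs.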

I think your problem is in this line:

col = bod.find_elements_by_xpath("//*[@id='proxylisttable']/tbody/tr")
The correct syntax is:

col = bod.find_elements_by_xpath("//*[@id='proxylisttable']/tbody/tr[insert count here]")
Like this:

table = driver.find_element_by_xpath("//*[@id='proxylisttable']/tbody")
rows = table.find_elements_by_xpath("//*[@id='proxylisttable']/tbody/tr")

# XPath positions are 1-based, so iterate from 1 to len(rows) inclusive.
for i in range(1, len(rows) + 1):
    # str(i) is required: concatenating an int to a str raises TypeError.
    row = table.find_element_by_xpath("//*[@id='proxylisttable']/tbody/tr[" + str(i) + "]")
    # row is a single WebElement, so it cannot be looped over.
    # Look up each cell relative to this row (note the leading ".").
    ip = row.find_element_by_xpath('./td[1]')
    port = row.find_element_by_xpath('./td[2]')
    code = row.find_element_by_xpath('./td[3]')
    country = row.find_element_by_xpath('./td[4][@class = "hm"]')
    anonymity = row.find_element_by_xpath('./td[5]')
    google = row.find_element_by_xpath('./td[6][@class = "hm"]')

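To handle dynamic elements, use WebDriverWait with visibility_of_all_elements_located to wait for the rows, then use the following xpath expressions, which are relative to each row: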

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

driver=webdriver.Chrome("path of the chrome driver")
driver.get('https://free-proxy-list.net/')

rows= WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@id='proxylisttable']/tbody//tr")))
for row in rows:
    ip=row.find_element_by_xpath('./td[1]').text
    port=row.find_element_by_xpath('./td[2]').text
    code=row.find_element_by_xpath('./td[3]').text
    country=row.find_element_by_xpath('./td[4]').get_attribute('textContent')
    Anonymity=row.find_element_by_xpath('./td[5]').text
    google=row.find_element_by_xpath('./td[6]').get_attribute('textContent')
    https=row.find_element_by_xpath('./td[7]').text
    lastchecked=row.find_element_by_xpath('./td[8]').get_attribute('textContent')
    print("IP :{}, Port:{}, code:{}, country:{}, Anonymity:{}, google:{}, https:{}, last_checked:{}".format(ip,port,code,country,Anonymity,google,https,lastchecked))
Output on the console:

IP :185.132.133.173, Port:8080, code:NL, country:Netherlands, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :181.112.225.78, Port:58948, code:EC, country:Ecuador, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :134.249.149.219, Port:35795, code:UA, country:Ukraine, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :195.20.30.54, Port:55182, code:UA, country:Ukraine, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :14.102.69.170, Port:53347, code:IN, country:India, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :182.53.193.108, Port:54543, code:TH, country:Thailand, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :159.224.221.175, Port:58299, code:UA, country:Ukraine, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :36.89.188.123, Port:49725, code:ID, country:Indonesia, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :103.231.163.58, Port:43620, code:BD, country:Bangladesh, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :114.130.92.14, Port:49167, code:BD, country:Bangladesh, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :177.54.200.10, Port:49501, code:BR, country:Brazil, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :77.38.21.239, Port:8080, code:SI, country:Slovenia, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :78.137.89.161, Port:8080, code:YE, country:Yemen, Anonymity:transparent, google:no, https:no, last_checked:1 minute ago
IP :103.216.147.49, Port:8080, code:IN, country:India, Anonymity:transparent, google:no, https:no, last_checked:1 minute ago
IP :195.250.188.210, Port:8080, code:EE, country:Estonia, Anonymity:transparent, google:no, https:no, last_checked:1 minute ago
IP :5.196.255.171, Port:3128, code:FR, country:France, Anonymity:transparent, google:no, https:no, last_checked:1 minute ago
IP :109.234.112.250, Port:46675, code:GE, country:Georgia, Anonymity:transparent, google:no, https:no, last_checked:1 minute ago
IP :186.225.48.178, Port:8080, code:BR, country:Brazil, Anonymity:transparent, google:no, https:no, last_checked:1 minute ago
IP :101.255.64.142, Port:35401, code:ID, country:Indonesia, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
IP :160.119.129.42, Port:57557, code:GN, country:Guinea, Anonymity:elite proxy, google:no, https:yes, last_checked:1 minute ago
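The reason the relative "./td[1]" paths above return a different value per row, while the question's code repeated the first row, can be sketched without a browser. The example below is an analogy only: it uses the standard library's xml.etree.ElementTree rather than Selenium, and the three-row table and its values are hypothetical. In Selenium, an xpath beginning with "//" is evaluated against the whole document even when called on a row element, which has the same effect as the first lookup here.

```python
import xml.etree.ElementTree as ET

# Hypothetical three-row table standing in for the real page.
HTML = """<table id="proxylisttable"><tbody>
<tr><td>10.0.0.1</td><td>8080</td></tr>
<tr><td>10.0.0.2</td><td>3128</td></tr>
<tr><td>10.0.0.3</td><td>80</td></tr>
</tbody></table>"""

table = ET.fromstring(HTML)
rows = table.findall("./tbody/tr")

# Problem pattern: the lookup ignores the current row and searches from the
# table, so it returns the first matching cell on every iteration.
repeated = [table.find("./tbody/tr/td[1]").text for _row in rows]

# Working pattern: the lookup starts from the current row.
per_row = [row.find("./td[1]").text for row in rows]

print(repeated)
print(per_row)
```

The first list contains the same address three times and the second contains one address per row, which mirrors the "first row 20 times" symptom and its fix.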

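Do you want to get the value of each cell?

I tried this approach, and unfortunately it now also returns only the first row. It does not increment the "i" at the end.

Are you sure i is not incrementing?

Yes :( Unfortunately it does not increment. I tried inserting i in other places in the code, but the result is the same. I think it has something to do with tr['+i+'].

I think i needs to be str(i).

I finally got it working! Thank you!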