Identifying table rows and table data with CSS selectors in Python

python selenium

I want to extract data from several table rows:

Case 1

 Onsite Service After Remote Diagnosis  April 19, 2014  April 19, 2017

Case 2

CAR                                     October 15, 2016    October 15, 2017    
Onsite Service After Remote Diagnosis   October 15, 2016    October 15, 2019

Case 3

NBD ProSupport                          July 16, 2008   July 15, 2011   
Onsite Service After Remote Diagnosis   July 16, 2008   July 15, 2011

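The information I need is on the row whose second td contains "Onsite Service After Remote Diagnosis". In each case, I want the date on the far right of that row.

Expected output: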

                      April 19, 2017
                    October 15, 2017
                       July 15, 2011

My code:

from selenium import webdriver
import time
from openpyxl import load_workbook

driver = webdriver.Chrome()


def scrape(codes):
    dates = []
    for i in range(len(codes)):
        driver.get("https://www.dell.com/support/home/us/en/19/product-support/"
                   "servicetag/%s/warranty?ref=captchasuccess" % codes[i])

        # Solve the captcha manually on the first page load
        if i == 0:
            print("You now have 120 seconds to solve the captcha")
            time.sleep(120)
            print("120 seconds passed")
        # Extract data
        expdate = driver.find_element_by_css_selector("#printdivid > div > div.not-annotated.hover > table:nth-child(3) > tbody > tr > td:nth-child(3)")
        print(expdate.get_attribute('innerText'))
    driver.close()

codes = ['159DT3J', '15FDBG2', '10V8YZ1']
scrape(codes)

My output:

April 19, 2014
October 15, 2016
July 16, 2008

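These values are taken from the first row displayed and from the first date td.

I have tried changing tbody > tr > td:nth-child(3), but it would be better to identify the cell based on the row's text and avoid errors.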

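Since you need the value from the row labelled "Onsite Service After Remote Diagnosis", I suggest replacing the line that finds the element with the following: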

expdate = driver.find_element_by_xpath("//td[text()='Onsite Service After Remote Diagnosis']/following-sibling::td")

Here we use an XPath locator and look for the td that follows the one containing the text "Onsite Service After Remote Diagnosis".
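For readers on a recent Selenium release: the `find_element_by_*` helpers were removed in Selenium 4.3, and the equivalent call is `driver.find_element(By.XPATH, ...)`. The following is a minimal sketch, assuming the label cell's text matches exactly and that the wanted date is the second `td` to its right. `sibling_xpath` and `read_end_date` are hypothetical helper names, not part of Selenium.

```python
def sibling_xpath(label, offset=2):
    """Build an XPath for the td that sits `offset` cells to the right of
    the td whose text is exactly `label`."""
    # Assumption: `label` contains no single quote, so simple quoting is enough.
    return "//td[text()='%s']/following-sibling::td[%d]" % (label, offset)


def read_end_date(driver, label="Onsite Service After Remote Diagnosis"):
    """Return the text of the matching cell using the Selenium 4 locator API."""
    # Imported here so the XPath helper can be used without Selenium installed.
    from selenium.webdriver.common.by import By

    cell = driver.find_element(By.XPATH, sibling_xpath(label))
    return cell.get_attribute("innerText").strip()
```

If the label cell carries leading or trailing whitespace, an exact `text()='...'` comparison will not match; `normalize-space()='...'` in the predicate is one possible alternative.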

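Is there a way to go to the next td? There are two, and the second one is what I'm looking for.

Yes, just add an index to the XPath: driver.find_element_by_xpath("(//td[text()='Onsite Service After Remote Diagnosis']/following-sibling::td)[2]") will do.

I added the index like this: "//td[text()='Onsite Service After Remote Diagnosis']/following-sibling::td[2]" and it works :) Done. Quick question: can regex be used with text()? Would that work?

You can read this thread for details:

Please post the HTML code of the table that contains the data in the example above.

@Jewlanu Do you want to extract the warranty expiration date text for the first item on each of these three pages?

@yong I have managed to solve this, thanks for your time.

@New contributor Yes, that is what I wanted to extract, but I managed to solve it :) Hopefully it works for all my tests.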
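On the regex question: as far as I know, the XPath 1.0 dialect that browsers evaluate for Selenium has no regular-expression function, only helpers such as `contains()` and `starts-with()`. One option is therefore to read the row texts and filter them in Python with `re`. This is a minimal sketch, assuming each row has already been turned into a list of cell strings (for example from the `td` elements of each `tr`). `end_date_for` is a hypothetical helper, and the sample rows are copied from case 2 above.

```python
import re


def end_date_for(rows, pattern):
    """Return the last cell of the first row that has a cell matching `pattern`.

    `rows` is a list of rows, each row a list of cell strings.
    Returns None when no row matches.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    for cells in rows:
        cleaned = [c.strip() for c in cells]
        if any(regex.search(c) for c in cleaned):
            return cleaned[-1]
    return None


case_2 = [
    ["CAR", "October 15, 2016", "October 15, 2017"],
    ["Onsite Service After Remote Diagnosis", "October 15, 2016", "October 15, 2019"],
]
print(end_date_for(case_2, r"onsite\s+service"))  # prints: October 15, 2019
```

Filtering in Python also tolerates differences in case and whitespace that an exact `text()=` comparison would miss.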