XPath: How to select related data (tags: xpath, xpath-1.0)


Is there a way to avoid using position()=4 explicitly and refer to the corresponding th element instead?

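I'm not sure this is the best solution, but you could try the following. Starting from the form with the column position hard-coded:

//th[not(contains(text(), "ddd"))] | //tr[2]/td[not(position()=4)]

the fixed position can be replaced with a lookup based on the th text. This version uses index-of, which requires XPath 2.0 or later:

//th[not(.="bbb") and not(.="ddd") and not(.="eee")] | //tr[2]/td[not(position()=index-of(//th, "bbb")) and not(position()=index-of(//th, "ddd")) and not(position()=index-of(//th, "eee"))]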
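To make the index-of lookup concrete, here is a minimal Python sketch of what that expression computes. It is only an illustration under stated assumptions: the two lists stand in for the th and td nodes, the header order and the cell values 222, 444 and 555 are assumed (only 111, 333 and 666 appear in the expected output), and index_of is a hypothetical helper that mimics the 1-based result of XPath's index-of.

```python
def index_of(seq, value):
    """Return the 1-based positions of value in seq, like XPath's index-of."""
    return [pos for pos, item in enumerate(seq, start=1) if item == value]


def select_related(headers, cells, excluded):
    """Keep the headers not in excluded, plus the cells at the same positions."""
    # Positions to drop are looked up from the header text, not hard-coded.
    dropped = {pos for name in excluded for pos in index_of(headers, name)}
    kept_headers = [h for h in headers if h not in excluded]
    kept_cells = [c for pos, c in enumerate(cells, start=1) if pos not in dropped]
    return kept_headers + kept_cells


# Stand-ins for the th and td nodes; 222, 444 and 555 are assumed values.
headers = ["aaa", "bbb", "ccc", "ddd", "eee", "fff"]
cells = ["111", "222", "333", "444", "555", "666"]

result = select_related(headers, cells, ("bbb", "ddd", "eee"))
print(result)
```

With these assumed inputs, "ddd" is found at position 4, which is the value that was hard-coded in the first expression.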

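If you need to stay on XPath 1.0, which is what the browser provides to Selenium, you can select both rows with simple expressions and do the filtering in Python:

from selenium.webdriver.common.by import By

# driver is an existing Selenium WebDriver with the page already loaded.
# find_elements(By.XPATH, ...) is the Selenium 4 spelling of find_elements_by_xpath.
# Get list of th elements
th_elements = driver.find_elements(By.XPATH, '//th')
# Get list of td elements in the second row
td_elements = driver.find_elements(By.XPATH, '//tr[2]/td')
# Get indexes of required th elements - [0, 2, 5]
ok_index = [i for i, th in enumerate(th_elements) if th.text not in ('bbb', 'ddd', 'eee')]
for i in ok_index:
    print(th_elements[i].text)
for i in ok_index:
    print(td_elements[i].text)

Output:

aaa
ccc
fff
111
333
666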
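The same index-based filtering can be tried without a browser. The sketch below uses Python's standard xml.etree.ElementTree on an inline table. The markup is a hypothetical reconstruction of the snippet (the header order and the cells 222, 444 and 555 are assumed), and real-world HTML that is not well-formed XML would need a proper HTML parser instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of the table; 222, 444 and 555 are assumed values.
HTML = """
<table>
  <tbody>
    <tr><th>aaa</th><th>bbb</th><th>ccc</th><th>ddd</th><th>eee</th><th>fff</th></tr>
    <tr><td>111</td><td>222</td><td>333</td><td>444</td><td>555</td><td>666</td></tr>
  </tbody>
</table>
"""

EXCLUDED = ("bbb", "ddd", "eee")

root = ET.fromstring(HTML.strip())

# ElementTree supports only a small XPath subset, which is enough here.
th_elements = root.findall(".//th")
td_elements = root.findall(".//tr[2]/td")

# 0-based indexes of the headers to keep.
ok_index = [i for i, th in enumerate(th_elements) if th.text not in EXCLUDED]

# Headers first, then the cells at the same indexes.
selected = [th_elements[i].text for i in ok_index]
selected += [td_elements[i].text for i in ok_index]

print(ok_index)
print(selected)
```

Using enumerate avoids calling list.index for every element and stays correct if two headers happen to have the same text.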

The same selection can also be written as a single expression without index-of, by counting the preceding th siblings to get each position. With sequence comparisons (XPath 2.0 or later):

//th[not(.=("bbb", "ddd", "eee"))] | //tr[2]/td[not(position()=(count(//th[.="bbb"]/preceding-sibling::th)+1, count(//th[.="ddd"]/preceding-sibling::th)+1, count(//th[.="eee"]/preceding-sibling::th)+1))]

XPath 1.0 has no sequence syntax, so there each comparison has to be spelled out:

//th[not(.="bbb") and not(.="ddd") and not(.="eee")] | //tr[2]/td[not(position()=count(//th[.="bbb"]/preceding-sibling::th)+1) and not(position()=count(//th[.="ddd"]/preceding-sibling::th)+1) and not(position()=count(//th[.="eee"]/preceding-sibling::th)+1)]

The output is:

<th>aaa</th>
<th>ccc</th>
<th>fff</th>
<td>111</td>
<td>333</td>
<td>666</td>

With XPath 3.0, you can structure it as:

let $th := //table/tbody/tr[1]/th,
    $filteredTh := $th[not(. = ("bbb", "ddd", "eee"))],
    $pos := $filteredTh!index-of($th, .)
return ($filteredTh, //table/tbody/tr[position() gt 1]/td[position() = $pos])

Comments:

It's good that you've included the XML and the expected output, but you haven't said what criteria the expected output meets, and it isn't obvious.

The criterion is to select every th and its corresponding td, excluding the th elements that contain "bbb", "ddd" or "eee" and their corresponding td elements.

What programming language are you using with Selenium?

@Andersson I'm using Python with chromedriver.

Thanks, but both XPath expressions show as invalid in Firepath?

I appreciate your suggestion, but XPath seems to be the most efficient approach. The HTML I posted is only a snippet of a much larger file, and I'm working with multiple deeply nested HTML files. That said, your previous answer was very close to what I wanted, except that the index-of function doesn't work in XPath 1.0. Do you know of a workaround?

Thanks, that works well. I'm using Selenium, so an XPath 1.0 solution would be ideal, thanks.
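When only XPath 1.0 is available, the count(.../preceding-sibling::th)+1 workaround needs one clause per excluded header, so the expression is easier to generate than to type. The following is a minimal sketch of that idea. build_xpath is a hypothetical helper (it is not part of Selenium or any XPath library), and it assumes the header texts contain no double quotes, since XPath 1.0 string literals have no escape mechanism.

```python
def build_xpath(excluded, row=2):
    """Build an XPath 1.0 union that skips the given headers and their cells.

    excluded: header texts to leave out (must not contain double quotes).
    row: 1-based index of the tr that holds the td cells.
    """
    if not excluded:
        raise ValueError("at least one header must be excluded")
    if any('"' in name for name in excluded):
        raise ValueError("header texts must not contain double quotes")

    # One test per excluded header on the th side ...
    th_tests = " and ".join(f'not(.="{name}")' for name in excluded)
    # ... and one position test per excluded header on the td side.
    td_tests = " and ".join(
        f'not(position()=count(//th[.="{name}"]/preceding-sibling::th)+1)'
        for name in excluded
    )
    return f"//th[{th_tests}] | //tr[{row}]/td[{td_tests}]"


xpath = build_xpath(["bbb", "ddd", "eee"])
print(xpath)
```

The resulting string can be passed to an XPath 1.0 engine as it is, for example as the second argument of Selenium's find_elements(By.XPATH, ...).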