Scrapy; XPath: search the tree for specific text and extract the text of the next node


xpath web-scraping text scrapy


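I am trying to scrape the weight of smartwatches from a website. Not all products on the site follow the same structure, so to get the weight of each product I tried a keyword search with XPath: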

//text()[contains(.,'Weight')]

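The problem is that this expression returns the text "Weight" itself, whereas what I want is the following node, which contains the actual weight value: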
<tbody>
 <tr>
   <th scope = "row">Weight</th>
   <td> 26.7 g</td>
 </tr>
</tbody>

Any suggestions? Thanks in advance.

You can use following-sibling::td:
from lxml import etree


txt = '''<tbody>
 <tr>
   <th scope = "row">Weight</th>
   <td> 26.7 g</td>
 </tr>
</tbody>'''

root = etree.fromstring(txt)

for td in root.xpath('//th[contains(., "Weight")]/following-sibling::td'):
    print(td.text)
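
Since the question mentions that product pages are not structured consistently, a slightly more defensive variant of the same idea may help. The sketch below is only an illustration: the helper name `extract_value` and the sample markup are assumptions, not part of the original answer. It parses with `lxml.html` (which tolerates unclosed tags), matches the label exactly with `normalize-space()` rather than `contains()`, and returns `None` when the label is absent.

```python
from lxml import html

# Hypothetical sample markup: the label has stray whitespace and the first
# row is never closed, which lxml's HTML parser repairs while parsing.
SAMPLE = '''<table><tbody>
 <tr>
   <th scope="row"> Weight </th>
   <td> 26.7 g</td>
 <tr>
   <th scope="row">Weight limit</th>
   <td> n/a</td>
 </tr>
</tbody></table>'''


def extract_value(page_source, label):
    """Return the stripped text of the first <td> following the <th>
    whose normalized text equals `label`, or None if there is no match."""
    tree = html.fromstring(page_source)
    cells = tree.xpath(
        # $label is an XPath variable filled in from the keyword argument.
        '//th[normalize-space(.) = $label]/following-sibling::td[1]',
        label=label,
    )
    if not cells:
        return None
    return cells[0].text_content().strip()


print(extract_value(SAMPLE, "Weight"))   # value of the "Weight" row only
print(extract_value(SAMPLE, "Height"))   # None: no such label on the page
```

The exact match keeps "Weight limit" from being picked up when searching for "Weight". In a Scrapy spider the same XPath expression should be usable through `response.xpath(...)`, for example with `/text()` appended and `.get()` called on the result.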

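Works great, thank you very much. For future reference: to get the preceding sibling, I just changed the code above to "/preceding-sibling::td".

@sophods Unfortunately, XPath does not support a function for searching preceding elements. This is a CSS/XPath cheat sheet: But in lxml there is the .getparent() function; from there you can search all siblings.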
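
On the preceding-sibling discussion in the comments: XPath 1.0 does define a `preceding-sibling` axis, and lxml evaluates it, so the first commenter's variant is valid. The sketch below compares that axis with the lxml element methods `getprevious()` and `getparent()`. The markup is a hypothetical layout, assumed only for illustration, in which the value cell comes before the label cell.

```python
from lxml import etree

# Hypothetical layout (an assumption): the value <td> precedes the label <th>.
txt = '''<tbody>
 <tr>
   <td> 26.7 g</td>
   <th scope="row">Weight</th>
 </tr>
</tbody>'''

root = etree.fromstring(txt)
th = root.xpath('//th[contains(., "Weight")]')[0]

# 1) Pure XPath: the nearest preceding <td> sibling of the label cell.
via_axis = th.xpath('preceding-sibling::td[1]')[0].text

# 2) lxml API: getprevious() returns the previous sibling element.
via_getprevious = th.getprevious().text

# 3) lxml API: go up with getparent() and walk over all sibling elements.
siblings = [el for el in th.getparent() if el is not th]
via_parent = siblings[0].text

# All three approaches reach the same cell.
print(via_axis.strip(), via_getprevious.strip(), via_parent.strip())
```

The XPath axis keeps everything in one expression, which is convenient inside a Scrapy selector chain; the `getparent()` route is mainly useful when the position of the value cell relative to the label is not known in advance.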
Output of the following-sibling snippet from the answer:
 26.7 g