Python: Using xpath locally / on a specific element


python xpath


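I'm trying to get the links from a page with xpath. The problem is that I only need the links inside a table, but if I apply the xpath expression to the whole page, I capture links I don't want.

For example: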

tree = lxml.html.parse(some_response)
links = tree.xpath("//a[contains(@href, 'http://www.example.com/filter/')]")

The problem is that this applies the expression to the whole document. I located the element I want, for example:

tree = lxml.html.parse(some_response)
root = tree.getroot()
table = root[1][5]  # for example
links = table.xpath("//a[contains(@href, 'http://www.example.com/filter/')]")

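But that seems to run the query against the whole document as well, since I am still capturing links outside the table. The lxml documentation says: "When xpath() is used on an element, the XPath expression is evaluated against the element (if relative) or against the root tree (if absolute)." So I am using an absolute expression and need to make it relative? Is that it?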

Basically, how can I filter only the elements that exist inside this table?

Your xpath starts with a slash (/) and is therefore absolute. Add a dot (.) in front to make it relative to the current element, i.e.

links = table.xpath(".//a[contains(@href, 'http://www.example.com/filter/')]")
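To make the difference concrete, here is a minimal sketch that runs both forms against the same table element. The HTML snippet and the two link targets are made up for illustration, and the table is located by id rather than by position; lxml is assumed to be installed.

```python
import lxml.html

# Hypothetical page: one matching link outside the table, one inside it.
HTML = """
<html><body>
  <a href="http://www.example.com/filter/outside">outside</a>
  <table id="my_table_id">
    <tr><td><a href="http://www.example.com/filter/inside">inside</a></td></tr>
  </table>
</body></html>
"""

root = lxml.html.fromstring(HTML)
table = root.get_element_by_id("my_table_id")

link_filter = "a[contains(@href, 'http://www.example.com/filter/')]"

# Absolute: evaluated from the document root, even when called on the table.
absolute_hrefs = [a.get("href") for a in table.xpath("//" + link_filter)]

# Relative: evaluated from the table element, at any depth below it.
relative_hrefs = [a.get("href") for a in table.xpath(".//" + link_filter)]

print(absolute_hrefs)
print(relative_hrefs)
```

With this input the absolute query should return both links, while the relative one should return only the link inside the table.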

Another option is to ask directly for elements inside the table. For example:

tree = lxml.html.parse(some_response)
links = tree.xpath("//table[**criteria**]//a[contains(@href, 'http://www.example.com/filter/')]")

Here, **criteria** is needed if the page contains many tables. Possible criteria include filtering by the table's id or class. For example:

links = tree.xpath("//table[@id='my_table_id']//a[contains(@href, 'http://www.example.com/filter/')]")
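As a rough sketch of this approach, the snippet below selects links through a table predicate in a single absolute expression, once by id and once by class. The HTML, the second table, and the class name "results" are hypothetical; only the id my_table_id comes from the example above. lxml is assumed to be installed.

```python
import lxml.html

# Hypothetical page with two tables; only the second one is wanted.
HTML = """
<html><body>
  <table id="other_table">
    <tr><td><a href="http://www.example.com/filter/other">other</a></td></tr>
  </table>
  <table id="my_table_id" class="results wide">
    <tr><td><a href="http://www.example.com/filter/wanted">wanted</a></td></tr>
  </table>
</body></html>
"""

root = lxml.html.fromstring(HTML)
link_filter = "a[contains(@href, 'http://www.example.com/filter/')]"

# Criterion 1: match the table by its id attribute.
by_id = root.xpath("//table[@id='my_table_id']//" + link_filter)

# Criterion 2: match the table by one of its space-separated class names.
# The concat/normalize-space idiom avoids matching e.g. "results-old".
class_test = "contains(concat(' ', normalize-space(@class), ' '), ' results ')"
by_class = root.xpath("//table[" + class_test + "]//" + link_filter)

hrefs_by_id = [a.get("href") for a in by_id]
hrefs_by_class = [a.get("href") for a in by_class]
print(hrefs_by_id)
print(hrefs_by_class)
```

Because the table predicate is part of the path, the expression can stay absolute and be run on the whole tree.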

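But if I add the dot, it doesn't seem to search recursively (as in, it only searches that element itself). At least that's how it looks to me, because once I made the filter relative it stopped working. Is there a way to make it search from the element outward, rather than only in the element?

@pvt pns That is a strong indication that something else is going wrong. Can you upload and link to a complete example? If you'd rather not, you can also contact me directly (click on my name for contact options).

@phihag: you wrote a "." (dot) in front to make it relative. You could also use descendant::a[contains(@href, …)] as a relative expression.

@pvt pns, it should be .//a to search at any depth within the current element.

But that doesn't work. I don't know why. Maybe I have the wrong element from the page.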
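One way to check the "wrong element" suspicion is to inspect what the positional lookup actually returned before querying it. The sketch below uses a made-up page and a made-up index, and also compares the .//a and descendant::a forms mentioned above; the shortened '/filter/' substring is only for brevity. lxml is assumed to be installed.

```python
import lxml.html

# Hypothetical page: the body's first child is a div, not the table.
HTML = """
<html><head><title>t</title></head><body>
  <div><a href="http://www.example.com/filter/outside">outside</a></div>
  <table>
    <tr><td><a href="http://www.example.com/filter/inside">inside</a></td></tr>
  </table>
</body></html>
"""

root = lxml.html.fromstring(HTML)

# Positional indexing depends on the exact page layout, so confirm which
# element it picked. The index used here is hypothetical.
candidate = root[1][0]
candidate_tag = candidate.tag
candidate_path = root.getroottree().getpath(candidate)
print(candidate_tag, candidate_path)

# Selecting the table by name is less fragile than counting children.
table = root.xpath("//table")[0]

# Both relative forms search at any depth below the table.
dot_form = [a.get("href") for a in table.xpath(".//a[contains(@href, '/filter/')]")]
axis_form = [a.get("href") for a in table.xpath("descendant::a[contains(@href, '/filter/')]")]
print(dot_form)
print(axis_form)
```

If the tag printed for the candidate is not table, a relative query on it will naturally come back empty or with unexpected links, which would match the symptom described in the comments.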