How to use regular expressions in XPath with Python lxml

python regex xpath

I'm trying to do this:

for element in root.xpath('//a[@id="hypProduct_[0-9]+"]'):

How can I use [0-9]+ in an XPath element selector (lxml)? The documentation states:

By default, XPath supports regular expressions in the EXSLT namespace:

>>> regexpNS = "http://exslt.org/regular-expressions"
>>> find = etree.XPath("//*[re:test(., '^abc$', 'i')]",
...                    namespaces={'re':regexpNS})

>>> root = etree.XML("<root><a>aB</a><b>aBc</b></root>")
>>> print(find(root)[0].text)
aBc

You can disable this with the boolean keyword argument regexp which defaults to True.

I don't follow the re:test part. Can someone explain it in the context of the documentation?

In your case, the expression would be:

//a[re:test(@id, "^hypProduct_[0-9]+$")]

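Basically, the re: prefix in re:test names the namespace in which the test function lives; the namespaces argument binds that prefix to the EXSLT regular-expressions URI. The first argument of re:test is the string to test: the . in the documentation example is the text content of the current element, while @id here is the value of the id attribute.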
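To make the arguments of re:test more concrete, here is a minimal sketch that reuses the XML from the documentation example and runs the same pattern with and without the 'i' (case-insensitive) flag. The variable names are my own and only for illustration.

```python
from lxml import etree

# Namespace URI of the EXSLT regular-expressions extension.
REGEXP_NS = "http://exslt.org/regular-expressions"

# Same document as in the lxml documentation example.
root = etree.XML("<root><a>aB</a><b>aBc</b></root>")

# re:test(string, pattern, flags): '.' is the text content of the
# element being tested, and 'i' makes the match case-insensitive.
insensitive = root.xpath("//*[re:test(., '^abc$', 'i')]",
                         namespaces={"re": REGEXP_NS})

# Same pattern without the flag: the match is case-sensitive.
sensitive = root.xpath("//*[re:test(., '^abc$')]",
                       namespaces={"re": REGEXP_NS})

tags_insensitive = [el.tag for el in insensitive]
tags_sensitive = [el.tag for el in sensitive]

print(tags_insensitive)  # ['b']
print(tags_sensitive)    # []
```

Only <b> matches with the flag, because its text "aBc" equals "abc" when case is ignored; <root> has the text content "aBaBc" and <a> has "aB", so neither satisfies the anchored pattern.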

Demo:

>>> from lxml.html import fromstring
>>> 
>>> data = '<a id="hypProduct_10">link1</a>'
>>> tree = fromstring(data)
>>> tree.xpath('//a[re:test(@id, "^hypProduct_[0-9]+$")]', namespaces={'re': "http://exslt.org/regular-expressions"})[0].attrib["id"]
'hypProduct_10'
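The loop from the question can then be written roughly as follows. This is only a sketch: the HTML fragment and its ids are made-up sample data, chosen so that some links match the pattern and some do not.

```python
from lxml.html import fromstring

# Namespace URI of the EXSLT regular-expressions extension.
REGEXP_NS = "http://exslt.org/regular-expressions"

# Hypothetical sample markup, not taken from the original question.
html = """<div>
<a id="hypProduct_10">link1</a>
<a id="hypProduct_x">link2</a>
<a id="hypProduct_22">link3</a>
<a id="other_5">link4</a>
</div>"""

root = fromstring(html)

matched_ids = []
# Keep only <a> elements whose id is "hypProduct_" followed by digits.
for element in root.xpath('//a[re:test(@id, "^hypProduct_[0-9]+$")]',
                          namespaces={"re": REGEXP_NS}):
    matched_ids.append(element.get("id"))

print(matched_ids)  # ['hypProduct_10', 'hypProduct_22']
```

The ^ and $ anchors matter here: without them, re:test would also accept ids that merely contain a matching substring.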