Python: How can I use BeautifulSoup to find bs4 XML attribute values by matching the attribute names with a regular expression at arbitrary XML depth?

I have the following bs4 element:

from bs4 import BeautifulSoup

html_doc = """
    <l2 attribute2="Output"><s3><Cell cell_value2="384.01"/></s3></l2>, 
    <l1><s3 attribute1="Cost"><s4><Cell cell_value1="2314.37"/></s4></s3></l1>
"""

soup = BeautifulSoup(html_doc, "html.parser")

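My question is: how can I use the regex re.compile(r'^attribute[0-9]$') to get these attribute values when attribute* may sit on the first tag (for example l1 or l2) or "deeper", for example on s3 or at any other arbitrary depth?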

I can do this if the attributes share the same name, or if they have different names but sit at the same depth level, but not when both vary at once.

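import re
from bs4 import BeautifulSoup

html_doc = """
    <l2 attribute2="Output"><s3><Cell cell_value2="384.01"/></s3></l2>,
    <l1><s3 attribute1="Cost"><s4><Cell cell_value1="2314.37"/></s4></s3></l1>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Matches attribute names that start with "attribute" followed by digits
r = re.compile(r"^attribute\d+")

out = []
# The lambda selects every tag, at any depth, that has a matching attribute name
for tag in soup.find_all(lambda tag: any(r.search(a) for a in tag.attrs)):
    for attr, value in tag.attrs.items():
        if r.search(attr):
            out.append(value)

print(out)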

Prints:

['Output', 'Cost']
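The same idea can be wrapped in a small helper that also reports where each match was found. This is only a sketch: the function name find_attr_values and its return shape are hypothetical, and it uses re.fullmatch so that the pattern behaves like the anchored ^attribute[0-9]$ from the question rather than the looser ^attribute\d+ used above.

```python
import re
from bs4 import BeautifulSoup


def find_attr_values(markup, pattern):
    """Return (tag name, attribute name, value) for each attribute whose
    name fully matches `pattern`. Hypothetical helper for illustration."""
    regex = re.compile(pattern)
    soup = BeautifulSoup(markup, "html.parser")
    results = []
    # find_all(True) yields every tag in document order, whatever its depth
    for tag in soup.find_all(True):
        for name, value in tag.attrs.items():
            if regex.fullmatch(name):
                results.append((tag.name, name, value))
    return results


html_doc = """
    <l2 attribute2="Output"><s3><Cell cell_value2="384.01"/></s3></l2>,
    <l1><s3 attribute1="Cost"><s4><Cell cell_value1="2314.37"/></s4></s3></l1>
"""

matches = find_attr_values(html_doc, r"attribute[0-9]")
print(matches)
# [('l2', 'attribute2', 'Output'), ('s3', 'attribute1', 'Cost')]
```

Because every tag is visited and filtered on its attribute names, the depth of the tag and the exact attribute name no longer matter; only the regex decides what is collected.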