Python: How do I convert a BeautifulSoup element to XPath?

For example:

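from bs4 import BeautifulSoup

# Triple quotes are needed because the markup spans lines and contains double quotes.
html = """
<ul>
     <li class="item-0"><a href="link1.html">first item</a></li>
     <li class="item-1"><a href="link2.html">second item</a></li>
     <li class="item-inactive"><a href="link3.html">third item</a></li>
     <li class="item-1"><a href="link4.html">fourth item</a></li>
     <li class="item-0"><a href="link5.html">fifth item</a>
</ul>"""
soup = BeautifulSoup(html, "lxml")  # parser named explicitly; lxml wraps the fragment in <html><body>
item_0 = soup.select_one('ul li.item-0')  # first <li> with class item-0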

Is there a function like soup_to_xpath(item_0) that can convert item_0 into '/html/body/ul/li[1]' or something similar?

I'm afraid it is not possible to use XPath in bs4.

The CSS selector you provided can be "translated" to XPath as follows:

//ul/li[@class="item-0"]
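
The hand-written XPath above is only an approximation of the CSS selector: it compares the whole class attribute and uses a child step, whereas 'ul li.item-0' matches a class token on any descendant. A small sketch of the difference, assuming lxml is installed; the markup is hypothetical, with an extra class added to the last item for illustration:

```python
from lxml import etree

# Hypothetical markup: the last <li> carries a second class on purpose.
html = """
<ul>
  <li class="item-0"><a href="link1.html">first item</a></li>
  <li class="item-1"><a href="link2.html">second item</a></li>
  <li class="item-0 active"><a href="link5.html">fifth item</a></li>
</ul>
"""

root = etree.HTML(html)

# Exact string comparison on the class attribute.
exact = root.xpath('//ul/li[@class="item-0"]')

# Token match on the class attribute, the test a CSS class selector performs.
token = root.xpath(
    '//ul//li[contains(concat(" ", normalize-space(@class), " "), " item-0 ")]'
)

print(len(exact), len(token))  # 1 2
```

The exact comparison misses the item whose class attribute is "item-0 active", which is why the generated expressions further down use the longer contains(...) form.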

You can use the following code:

from lxml import etree
from lxml.cssselect import CSSSelector # You might need to run "pip install cssselect"

sel = CSSSelector('ul li.item-0')
sel.path
Output:

"descendant-or-self::ul/descendant-or-self::*/li[@class and contains(concat(' ', normalize-space(@class), ' '), ' item-0 ')]"

lxml uses the cssselect module for this task:

In [1]: from cssselect import GenericTranslator, SelectorError

In [2]: expression = GenericTranslator().css_to_xpath('ul li.item-0')

In [3]: expression
Out[3]: "descendant-or-self::ul/descendant-or-self::*/li[@class and contains(concat(' ', normalize-space(@class), ' '), ' item-0 ')]"

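I just want to get the XPath. What does "the CSS selector you provided can be 'translated' to XPath" mean?

'ul li.item-0' is a CSS selector. Do you want to generate the XPath with BeautifulSoup, or with some other automated tool? Please clarify your question.

"Do you want to generate the XPath with BeautifulSoup, or with some other automated tool?" That is exactly what I want...

You can try lxml.etree to generate an absolute XPath expression with tree.getpath(element).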
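
A minimal sketch of that suggestion, assuming the markup is parsed with lxml instead of bs4. The variable names are illustrative; etree.HTML, getroottree and getpath are lxml calls:

```python
from lxml import etree

html = """
<ul>
     <li class="item-0"><a href="link1.html">first item</a></li>
     <li class="item-1"><a href="link2.html">second item</a></li>
     <li class="item-inactive"><a href="link3.html">third item</a></li>
     <li class="item-1"><a href="link4.html">fourth item</a></li>
     <li class="item-0"><a href="link5.html">fifth item</a></li>
</ul>"""

root = etree.HTML(html)      # lxml wraps the fragment in <html><body>
tree = root.getroottree()    # getpath is a method of the ElementTree

# Both <li> elements whose class attribute is exactly "item-0".
matches = root.xpath('//ul/li[@class="item-0"]')

first_path = tree.getpath(matches[0])
last_path = tree.getpath(matches[-1])
print(first_path)  # /html/body/ul/li[1]
print(last_path)   # /html/body/ul/li[5]
```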

The element there appears to be an lxml object. I want to convert a bs4 element to XPath, which is what the asker wants too.
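
bs4 has no built-in equivalent of getpath, but a positional path can be assembled by walking up the parents of a Tag. The helper below is a hypothetical sketch, not part of bs4; it borrows the name soup_to_xpath from the question and assumes the document has no namespaces. The <html><body> wrapper in the sample markup is added here so that the built-in html.parser yields a full path:

```python
from bs4 import BeautifulSoup, Tag


def soup_to_xpath(element: Tag) -> str:
    """Build an absolute, positional XPath for a bs4 Tag (hypothetical helper)."""
    steps = []
    node = element
    # The BeautifulSoup object at the top has no parent, so the walk stops there.
    while node.parent is not None:
        before = node.find_previous_siblings(node.name)
        after = node.find_next_siblings(node.name)
        if before or after:
            # XPath positions are 1-based and count same-name siblings only.
            steps.append(f"{node.name}[{len(before) + 1}]")
        else:
            steps.append(node.name)
        node = node.parent
    return "/" + "/".join(reversed(steps))


html = """<html><body>
<ul>
     <li class="item-0"><a href="link1.html">first item</a></li>
     <li class="item-1"><a href="link2.html">second item</a></li>
     <li class="item-inactive"><a href="link3.html">third item</a></li>
     <li class="item-1"><a href="link4.html">fourth item</a></li>
     <li class="item-0"><a href="link5.html">fifth item</a></li>
</ul>
</body></html>"""

soup = BeautifulSoup(html, "html.parser")
item_0 = soup.select_one("ul li.item-0")
last_item = soup.select("ul li.item-0")[-1]

print(soup_to_xpath(item_0))     # /html/body/ul/li[1]
print(soup_to_xpath(last_item))  # /html/body/ul/li[5]
```

The result depends on the parser: with html.parser a bare fragment would give '/ul/li[1]', while the lxml parser adds the html and body steps itself.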