How to parse HTML by class name with lxml.etree in Python

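I have the following:

import requests
from lxml import etree

req = requests.get(url)
tree = etree.HTML(req.text)

Now, instead of using XPath

tree.xpath(...)

I would like to know whether we can search by class name or id, as we do in BeautifulSoup:

soup.find('div', attrs={'class': 'myclass'})

I am looking for something similar in lxml.

In bs4, a far more concise way to do this is to use a CSS selector:

soup.select('div.myclass')  # == soup.find_all('div', attrs={'class': 'myclass'})

lxml provides cssselect both as a module (which actually translates CSS selectors into XPath expressions) and as a convenience method on Element objects:

import lxml.html

tree = lxml.html.fromstring(req.text)
for div in tree.cssselect('div.myclass'):
    print(div.text_content())  # process each matching element here

Alternatively, you can precompile the expression and apply it to your Element:

from lxml.cssselect import CSSSelector

selector = CSSSelector('div.myclass')
selection = selector(tree)  # list of matching elements

Why not use XPath? It seems to be exactly what you want.

You say you don't want to use XPath, but you don't explain why. If the goal is to search for a tag with a given class, XPath does that easily.

For example, to find a div whose class is "foo", you could do something like this:

tree.find(".//div[@class='foo']")

find() returns only the first match, and on an element the path has to be relative (".//div" rather than "//div"). To get every match, use findall() with the same path, or tree.xpath("//div[@class='foo']").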
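One difference between the two approaches is worth illustrating: the XPath test @class='foo' compares the whole attribute value, so it misses an element such as <div class="foo bar">, while a CSS class selector like div.foo matches it. The sketch below uses made-up sample markup and a hypothetical helper called find_by_class (neither comes from the posts above) to show one way of getting class-token matching with plain XPath, without the cssselect dependency. Treat it as an illustration rather than the only way to do this.

```python
from lxml import etree

# Hypothetical sample markup, used only for this illustration.
HTML = """
<html><body>
  <div class="foo">exact</div>
  <div class="foo bar">multiple classes</div>
  <div class="foobar">different class</div>
</body></html>
"""


def find_by_class(tree, tag, class_name):
    """Return all <tag> elements whose class list contains class_name."""
    # Pad the whitespace-normalized class attribute with spaces so that
    # class_name is matched as a whole token, not as a substring.
    expr = (
        "//%s[contains(concat(' ', normalize-space(@class), ' '), "
        "concat(' ', $cls, ' '))]" % tag
    )
    # lxml lets XPath variables ($cls) be passed as keyword arguments.
    return tree.xpath(expr, cls=class_name)


tree = etree.HTML(HTML)

# Exact string comparison on the attribute value.
exact = tree.xpath("//div[@class='foo']")
# Token-based comparison, closer to what the CSS selector div.foo does.
by_token = find_by_class(tree, "div", "foo")

print([d.text for d in exact])
print([d.text for d in by_token])
```

Passing the class name as an XPath variable avoids building the expression from quoted strings by hand. The tag name cannot be a variable, so it is still interpolated into the expression.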