Python: finding an XPath equivalent in BeautifulSoup to get a target tag
I have an HTML-like structure:
<html>
<body>
  <!-- Some tags -->
  <div class="main-sv">
    <div class="first-sv custom-sv">
      <!-- Some tags and content -->
    </div>
  </div>
</body>
</html>
Is there any other approach, similar to XPath in lxml?
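I have to use BeautifulSoup only.

No. As has been stated, BeautifulSoup does not support XPath lookups by itself. But here is a slightly simplified solution: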
from bs4 import BeautifulSoup

html = """
<div class="main-sv">
<div class="first-sv custom-sv">
<!-- Some tags and content -->
</div>
</div>
"""
soup = BeautifulSoup(html, 'html.parser')
# Check whether the first div inside div.main-sv carries the class 'first-sv'
print('first-sv' in soup.find('div', {'class': 'main-sv'}).find('div')['class'])
# prints True
# Alternative: use a CSS selector to get the first div inside the parent
parent = soup.find('div', {'class': 'main-sv'})
child = parent.select('div')[0]
print('first-sv' in child['class'])
# prints True
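
The two snippets above raise an exception if the parent is missing or the child has no class attribute, and they take the first descendant div rather than the first child tag. A more defensive sketch is shown below. The helper name first_child_has_class and the extra second-sv div are made up for illustration, and the select_one line assumes a bs4 release (4.7 or later) in which CSS selectors are handled by soupsieve, so that :first-child is available.

```python
from bs4 import BeautifulSoup

html = """
<div class="main-sv">
  <div class="first-sv custom-sv">
    <p>Some tags and content</p>
  </div>
  <div class="second-sv"></div>
</div>
"""


def first_child_has_class(soup, parent_class, wanted_class):
    """Hypothetical helper: True if the first child tag of the first
    element with parent_class carries wanted_class."""
    parent = soup.find(class_=parent_class)
    if parent is None:
        return False
    # find(True, recursive=False) returns the first direct child tag of any name
    first_child = parent.find(True, recursive=False)
    if first_child is None:
        return False
    # get() avoids a KeyError when the tag has no class attribute
    return wanted_class in first_child.get("class", [])


soup = BeautifulSoup(html, "html.parser")
print(first_child_has_class(soup, "main-sv", "first-sv"))
print(first_child_has_class(soup, "main-sv", "second-sv"))

# CSS-selector variant: ">" limits the match to direct children and
# :first-child requires the element to be the parent's first child tag
match = soup.select_one(".main-sv > .first-sv:first-child")
print(match is not None)
```

CSS selectors are probably the closest thing to XPath that BeautifulSoup offers on its own; for this HTML, the selector line expresses roughly the same check as the helper function.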