XPath: extract text from a div without attributes

I want to extract the content ("Content here") from the HTML below, using BeautifulSoup and XPath separately. How can I do that?

<div class="paragraph">
    <h1>Title here</h1>
    Content here
</div>

There are several ways to do this. Here are a few: use contents, next_element, next_sibling, or stripped_strings.

from bs4 import BeautifulSoup
html='''<div class="paragraph">
    <h1>Title here</h1>
    Content here
</div>'''

soup=BeautifulSoup(html,"html.parser")
print(soup.find('div',class_='paragraph').contents[2].strip())
print(soup.find('div',class_='paragraph').find('h1').next_element.next_element.strip())
print(soup.find('div',class_='paragraph').find('h1').next_sibling.strip())
print(list(soup.find('div',class_='paragraph').stripped_strings)[1])
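The index-based lookups above depend on the exact whitespace and child order inside the div. As a minimal sketch of a less position-dependent variant, assuming the wanted text is always a direct child of the div, the direct string children can be collected with find_all(string=True, recursive=False). The variable names div and direct_text are illustrative only.

```python
from bs4 import BeautifulSoup

html = '''<div class="paragraph">
    <h1>Title here</h1>
    Content here
</div>'''

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="paragraph")

# The div has three direct children: a whitespace string, the <h1> tag,
# and the string that holds the text we want (hence contents[2] above).
print([type(child).__name__ for child in div.contents])

# Keep only strings that are direct children of the div, so the text
# inside <h1> is excluded, then drop the whitespace-only pieces.
direct_text = " ".join(
    s.strip() for s in div.find_all(string=True, recursive=False) if s.strip()
)
print(direct_text)
```

This still works if extra whitespace or additional tags appear before the text, at the cost of joining all direct text pieces into one string.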

You can also use CSS selectors:
html='''<div class="paragraph">
    <h1>Title here</h1>
    Content here
</div>'''

soup=BeautifulSoup(html,"html.parser")
print(soup.select_one('.paragraph').contents[2].strip())
print(soup.select_one('.paragraph >h1').next_element.next_element.strip())
print(soup.select_one('.paragraph >h1').next_sibling.strip())
print(list(soup.select_one('.paragraph').stripped_strings)[1])
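The question also asks for an XPath version. The following is a sketch that assumes the third-party lxml package is installed; the variable names source, tree, by_child_text, and by_sibling are illustrative. It shows two XPath expressions that should select the loose text node: one picks the non-blank text children of the div, the other picks the first text node following the h1.

```python
from lxml import html as lxml_html

source = '''<div class="paragraph">
    <h1>Title here</h1>
    Content here
</div>'''

tree = lxml_html.fromstring(source)

# text() selects text nodes that are direct children of the div;
# [normalize-space()] filters out whitespace-only text nodes.
by_child_text = tree.xpath('//div[@class="paragraph"]/text()[normalize-space()]')

# Alternative: the first text node that follows <h1> as a sibling.
by_sibling = tree.xpath('//div[@class="paragraph"]/h1/following-sibling::text()[1]')

content = by_child_text[0].strip()
print(content)
print(by_sibling[0].strip())
```

Both expressions return a list of strings, so the result still needs strip() to remove the surrounding newlines and indentation.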
