Python: How can I use BeautifulSoup to get all the tags in HTML code, without their children?


python python-2.7


I want to walk through my HTML source and extract all the tags and text in it, but without their children.

For example, given this HTML:

<html>
<head>
<title>title</title>
</head>
<body>
Hello world
</body>
</html>

What I want to see is each tag on its own, without its descendants:

<html>
<head>
<title>
title
<body>
Hello world



How can I do this?

The idea is to iterate over all the nodes. For elements that have no child elements, also get the text:

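from bs4 import BeautifulSoup

html = "<html><head><title>title</title></head><body>Hello world</body></html>"
soup = BeautifulSoup(html, "html.parser")

for elm in soup():  # soup() is equivalent to soup.find_all()
    if not elm():  # elm() is equivalent to elm.find_all(); empty means no child tags
        print(elm.name, elm.get_text(strip=True))
    else:
        print(elm.name)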

Prints:

html
head
title title
body Hello world

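
The loop above only prints text for elements that contain no child tags, so text sitting next to child tags in the same element is skipped. As a possible variation, here is a minimal sketch that walks every node in document order and emits each tag and each non-empty text node on its own line, which matches the layout asked for in the question. The helper name `flatten` is hypothetical and not part of BeautifulSoup, and the `"html.parser"` backend is an assumption; other parsers may add or rearrange tags.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag


def flatten(html):
    """Return opening tags and non-empty text nodes in document order.

    `flatten` is an illustrative helper, not a BeautifulSoup API.
    """
    soup = BeautifulSoup(html, "html.parser")  # parser choice is an assumption
    lines = []
    for node in soup.descendants:  # every tag and string, depth first
        if isinstance(node, Tag):
            lines.append("<%s>" % node.name)
        elif isinstance(node, NavigableString):
            # Comments and similar nodes are NavigableString subclasses too;
            # filter them out here if they should not be reported.
            text = node.strip()
            if text:  # skip whitespace-only strings between tags
                lines.append(text)
    return lines


html = """<html>
<head>
<title>title</title>
</head>
<body>
Hello world
</body>
</html>"""

for line in flatten(html):
    print(line)
```

Because each text node is reported where it occurs, an element such as `<p>Hello <b>big</b> world</p>` would yield `<p>`, `Hello`, `<b>`, `big`, `world` rather than losing the text around the `<b>` tag.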