How to parse malformed HTML (no body tag) with BeautifulSoup in Python? (tags: python, lxml)

I have a lot of HTML files to parse, but they do not contain a body tag, and that causes problems for BeautifulSoup. See below:

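from bs4 import BeautifulSoup  # assumed import; the original does not show it

f = "somefile.html"
with open(f, "r") as fh:
    html = fh.read()
soup = BeautifulSoup(html)  # no parser named, so the library default is used
print(soup.prettify())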

Output>>>

<html>
 <head>
  <title>
   Test Results
  </title>
 </head>
</html>

If the "problem" raises an error, wouldn't wrapping the call in a try/except block and catching (or re-raising) the error keep this from happening?
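
One way to check whether the content after the head is really present in the file, independently of whichever tree builder BeautifulSoup picks, is to run the markup through the standard library's html.parser.HTMLParser, which reports tags as it meets them and does not require a body tag. What follows is only a minimal sketch: the sample markup and the names TextCollector and collect are assumptions made for illustration, since the original file is not shown.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Records start tags and non-blank text in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.texts = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace-only runs between tags
            self.texts.append(text)


# Hypothetical sample: a document with a head but no <body> tag.
SAMPLE = """<html>
<head><title>Test Results</title></head>
<table><tr><td>passed</td><td>42</td></tr></table>
</html>"""


def collect(html):
    """Feed the markup to the collector and return (tags, texts)."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.tags, parser.texts


tags, texts = collect(SAMPLE)
print(tags)
print(texts)
```

If the table content shows up here, naming the same backend explicitly with BeautifulSoup(html, "html.parser") may be worth trying, since that backend does not need a body tag to keep the rest of the document.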