Python: How to read HTML tags using BeautifulSoup


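I am trying to read HTML tags with BeautifulSoup and check whether certain tags are present or missing.

I read the file with BeautifulSoup and then use the result in a test file.

Here is what I tried, without success: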

class Testing(unittest.TestCase):
    @classmethod
    def setUp(name):

        name.html = None
        with open("index.html") as frd:
            name.html = frd.read()
            name.soup = BeautifulSoup(name.html)
        if not name.html:
            raise Exception('cant read')

    def testing(self)
        assert self.soup.find('html') == 'html'
        # Raises: error

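I cannot find the html tag in soup with find() (I tried printing the result to see the output, but that did not work either). How can I raise an exception if the html tag is missing from the HTML file?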

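Try this when using find(): it returns either a Tag object or None, never a plain string, so check the result against None. Here is what I suggest: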

try:
    # find() returns None when the tag is absent
    assert self.soup.find('html') is not None
except AssertionError:
    raise Exception("HTML Tag is missing!")
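
The same check can be sketched as a small standalone function. This is only an illustration under stated assumptions: the helper name `require_tag` and the inline HTML strings are hypothetical and not from the question. The built-in "html.parser" is passed explicitly because other parsers, such as lxml or html5lib, add a missing html element on their own, which would hide the problem being tested for.

```python
from bs4 import BeautifulSoup


def require_tag(markup, tag_name):
    """Return the first tag_name element, or raise if it is absent.

    Hypothetical helper, used only for illustration.
    """
    # html.parser does not invent missing <html>/<body> wrappers.
    soup = BeautifulSoup(markup, "html.parser")
    tag = soup.find(tag_name)  # a Tag object, or None when not found
    if tag is None:
        raise ValueError(f"<{tag_name}> tag is missing!")
    return tag


# Hypothetical sample inputs: one full document, one bare fragment.
complete = "<html><body><p>hello</p></body></html>"
fragment = "<p>hello</p>"

html_tag = require_tag(complete, "html")
print(html_tag.name)

try:
    require_tag(fragment, "html")
except ValueError as exc:
    print(exc)
```

Raising a specific exception type such as ValueError, rather than a bare Exception, lets callers catch only this failure.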

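What is the exact error you get? I see you are missing the : after the def!

assert self.soup.find('html') == 'html' AssertionError

OK, can you try the other answer I just posted?

Show more code. Specifically, where you initialize soup.

OK, try this answer, it works fine!

Sorry, it doesn't!! My file has an html tag, but it still raised the exception saying the HTML tag is missing. Got it working with the assertion against None :)

How can I use assertEqual here? I tried self.assertEqual(self.soup.find('html'), "html"). Is that right?

You can't do that: you would be comparing a Tag (or None) with the string "html", which is simply wrong!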
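
For the assertEqual question, one possible approach is to compare the tag's name attribute, which is a string, instead of the Tag object itself. The sketch below makes several assumptions: `SAMPLE_HTML` is a hypothetical stand-in for the contents of index.html, the class and method names are illustrative, and "html.parser" is chosen so that a missing html element is not added automatically.

```python
import unittest

from bs4 import BeautifulSoup

# Hypothetical stand-in for the contents of index.html.
SAMPLE_HTML = "<html><head><title>t</title></head><body></body></html>"


class HtmlTagTest(unittest.TestCase):
    def setUp(self):
        # setUp is an instance method, so @classmethod is not needed.
        self.soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

    def test_html_tag_present(self):
        tag = self.soup.find("html")
        # Fails with a readable message when find() returned None.
        self.assertIsNotNone(tag, "HTML tag is missing!")
        # Compare the tag's name (a string), not the Tag object.
        self.assertEqual(tag.name, "html")


# Run the test case programmatically so the script does not exit early.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(HtmlTagTest)
result = unittest.TextTestRunner().run(suite)
```

In a real test file the last two lines would usually be dropped and the tests run with `python -m unittest`.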