Python HTML parsing with BS4


I've run into a problem parsing HTML with Python and Beautiful Soup: I want to extract one very specific piece of data. This is the kind of markup I'm dealing with:

<div class="big_div">
   <div class="smaller div">
      <div class="other div">
         <div class="this">A</div>
         <div class="that">2213</div>
      </div>
      <div class="other div">
         <div class="this">B</div>
         <div class="that">215</div>
      </div>
      <div class="other div">
         <div class="this">C</div>
         <div class="that">253</div>
      </div>
   </div>
</div>



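As you can see, the same HTML structure repeats and only the values differ. My problem is locating one specific value: I want to get the 253 in the last section. I'd appreciate any help, since this comes up often when parsing HTML.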

Thanks in advance.

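So far I've tried to parse it, but because the class names are identical I don't know how to navigate to the right element. I've also tried a for loop, but made almost no progress.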

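You can pass the string argument to find:

from bs4 import BeautifulSoup

# Suppose html holds the HTML source of the page you want to scrape
# and req_text is the text you want to find (for example '253').
soup = BeautifulSoup(html, 'lxml')  # 'lxml' requires the lxml package
req_div = soup.find('div', string=req_text)

req_div will contain the div element you want.

If you don't know how, read the documentation or related material first.

Sorry, I'm new to Python and wasn't sure whether there was documentation for this. Thank you! I'm also curious whether there is a simpler way to parse through JavaScript, because I keep running into a problem when I type something along the lines of site = soup.find('script', type='text/javascript')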
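To make the find approach from the answer concrete, here is a minimal sketch run against the markup from the question. It assumes the divs are properly closed (the closing tags are my addition) and uses the built-in html.parser instead of lxml so nothing extra needs to be installed. The second and third lookups are alternatives that were not part of the answer. They may help when you know where the value sits or which label it belongs to, but not the value itself. The names by_text, by_position, and by_label are only illustrative.

```python
from bs4 import BeautifulSoup

# Markup modelled on the question; closing tags are assumed.
html = """
<div class="big_div">
  <div class="smaller div">
    <div class="other div">
      <div class="this">A</div>
      <div class="that">2213</div>
    </div>
    <div class="other div">
      <div class="this">B</div>
      <div class="that">215</div>
    </div>
    <div class="other div">
      <div class="this">C</div>
      <div class="that">253</div>
    </div>
  </div>
</div>
"""

# html.parser ships with Python; 'lxml' also works if it is installed.
soup = BeautifulSoup(html, "html.parser")

# 1) Match on the text itself, as in the answer.
by_text = soup.find("div", string="253")

# 2) Match on position: the last div whose class is "that".
by_position = soup.find_all("div", class_="that")[-1]

# 3) Match on the label: find the "C" div, then its sibling "that" div.
label = soup.find("div", class_="this", string="C")
by_label = label.find_next_sibling("div", class_="that")

print(by_text.get_text(), by_position.get_text(), by_label.get_text())
```

Which lookup fits depends on what stays stable on the real page: the value, the position of the block, or the label next to it.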