Python: How can I use BeautifulSoup to extract the value associated with a label cell from an HTML table?


I have the following HTML code:

<div class="data-table data-table_detailed">
    <div class="cell">
        <div class="cell_label"> Label1 </div>
        <div class="cell_value"> Value2 </div>
    </div>
    <div class="cell">
        <div class="cell_label"> Label2 </div>
        <div class="cell_value"> Value2 </div>
    </div>
    <div class="cell">
        <div class="cell_label"> Label3 </div>
        <div class="cell_value"> Value3 </div>
    </div>
</div>

But how do I get the value in the cell labeled Label2?

It would be easier to use CSS selectors.
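The comment above does not say which selector it had in mind, so the following is only one possible sketch of a CSS-selector approach. It assumes each label/value pair sits inside its own properly closed div with class cell; the sample markup and the name values_by_label are illustrative, not from the question.

```python
from bs4 import BeautifulSoup

# Sample markup modeled on the question; closing tags and values are assumed.
HTML = """
<div class="data-table data-table_detailed">
    <div class="cell">
        <div class="cell_label"> Label1 </div>
        <div class="cell_value"> Value1 </div>
    </div>
    <div class="cell">
        <div class="cell_label"> Label2 </div>
        <div class="cell_value"> Value2 </div>
    </div>
    <div class="cell">
        <div class="cell_label"> Label3 </div>
        <div class="cell_value"> Value3 </div>
    </div>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Build a label -> value mapping, one entry per cell.
values_by_label = {}
for cell in soup.select("div.data-table_detailed > div.cell"):
    label = cell.select_one("div.cell_label")
    value = cell.select_one("div.cell_value")
    if label is not None and value is not None:
        values_by_label[label.get_text(strip=True)] = value.get_text(strip=True)

print(values_by_label["Label2"])  # Value2
```

Building the whole mapping once is convenient when several labels need to be looked up; for a single lookup it does more work than necessary.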

You can use find_next_sibling():

from bs4 import BeautifulSoup

# page holds the HTML source as a string
soup = BeautifulSoup(page, "html.parser")
datatable = soup.find(class_="data-table data-table_detailed")
cell_labels = datatable.find_all(class_="cell_label")  # list of all label cells

for cell_label in cell_labels:
    if "Label2" in cell_label.text:
        print(cell_label.find_next_sibling("div", {"class": "cell_value"}).text)

# results
 Value2

This code finds the first div tag in the document with class cell_label whose (stripped) content is Label2:

>>> soup.find('div', class_='cell_label', string=lambda s: s.strip() == 'Label2').find_next_sibling().string
' Value2 '
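For repeated lookups, the one-liner above could be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the function name value_for_label and the sample markup are hypothetical, and it adds guards for a missing label and for tags whose .string is None.

```python
from bs4 import BeautifulSoup


def value_for_label(soup, label):
    """Return the stripped text of the cell_value next to the given label, or None."""
    label_div = soup.find(
        "div",
        class_="cell_label",
        # A tag's .string is None when it has several children,
        # so guard before calling strip().
        string=lambda s: s is not None and s.strip() == label,
    )
    if label_div is None:
        return None
    value_div = label_div.find_next_sibling("div", class_="cell_value")
    return value_div.get_text(strip=True) if value_div is not None else None


# Hypothetical sample markup for illustration.
html = (
    '<div class="cell">'
    '<div class="cell_label"> Label2 </div>'
    '<div class="cell_value"> Value2 </div>'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")
print(value_for_label(soup, "Label2"))   # Value2
print(value_for_label(soup, "Missing"))  # None
```

Returning None instead of raising keeps the caller in control of how a missing label is handled.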

If you only need to search within the first data-table element:

>>> table = soup.find(class_="data-table data-table_detailed")
>>> table.find('div', class_='cell_label', string=lambda s: s.strip() == 'Label2').find_next_sibling().string
' Value2 '

I think the OP is after the text in the "value" element, i.e. Value2 in this example.

Hi @arthurim, did this help solve the problem? If so, you can accept the answer to close the question.