Python: determining the class inside a TD tag

python python-3.x

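Using Python's BeautifulSoup, I'm trying to find all the <tr> tags of an HTML page. However, I want to filter out any <tr> tag that has a specific class in one of its <td> tags.

I tried to filter out the rows that contain the "Warning" class in a <td> tag with the code below: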

soup = BeautifulSoup(data, 'html.parser')
print(soup.find_all('tr', class_=lambda c: 'Warning' not in c))

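I know it isn't filtering out the "Warning" class because I'm passing 'tr' to the find_all function, but if I try to use 'td' instead it gives me:

TypeError: argument of type 'NoneType' is not iterable

Any ideas are appreciated.
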
from bs4 import BeautifulSoup

data = '''
<tr role="row" class="odd red" data-id="32">
   <td role="gridcell" class="Warning">33</td>
   <td role="gridcell">Ralph</td>
   <td role="gridcell">List 2</td>
   <td role="gridcell">FE</td>
   <td role="gridcell">07/12/1996</td>
</tr>
<tr role="row" class="even red" data-id="33">
   <td role="gridcell">34</td>
   <td role="gridcell">Mary</td>
   <td role="gridcell">List 2</td>
   <td role="gridcell">SOTLTM</td>
   <td role="gridcell">08/12/1996</td>
</tr>
<tr role="row" class="odd red" data-id="34">
   <td role="gridcell">35</td>
   <td role="gridcell">Tom</td>
   <td role="gridcell">List 2</td>
   <td role="gridcell">SOTLTM</td>
   <td role="gridcell">09/12/1996</td>
</tr>
'''

soup = BeautifulSoup(data, 'html.parser')
print(soup.find_all('td', class_=lambda c: 'Warning' not in c))

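class= is not an attribute of most of your <td> elements. For those elements, c is set to None inside the lambda, and 'Warning' not in None raises the TypeError. You can let elements without a class pass the filter automatically by testing c first: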
print(soup.find_all('td', class_=lambda c: not c or 'Warning' not in c))
#                                          ^^^^^^^^

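Output

[<td role="gridcell">Ralph</td>,
 <td role="gridcell">List 2</td>,
 <td role="gridcell">FE</td>,
 <td role="gridcell">07/12/1996</td>,
 <td role="gridcell">34</td>,
 <td role="gridcell">Mary</td>,
 <td role="gridcell">List 2</td>,
 <td role="gridcell">SOTLTM</td>,
 <td role="gridcell">08/12/1996</td>,
 <td role="gridcell">35</td>,
 <td role="gridcell">Tom</td>,
 <td role="gridcell">List 2</td>,
 <td role="gridcell">SOTLTM</td>,
 <td role="gridcell">09/12/1996</td>]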
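
To see why the original lambda fails, it can help to record what BeautifulSoup actually hands to the class_ callable. The following is a minimal sketch, not part of the original answer: the helper name record and the two-cell sample row are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical two-cell sample: one <td> has a class, the other does not.
html = '<table><tr><td class="Warning">33</td><td>Ralph</td></tr></table>'
soup = BeautifulSoup(html, "html.parser")

seen = []  # every value BeautifulSoup passes to the class_ callable

def record(css_class):
    """Remember the argument and reject the tag, so nothing short-circuits."""
    seen.append(css_class)
    return False

soup.find_all("td", class_=record)
print(seen)  # contains 'Warning' for the first cell and None for the second

# The None case is what breaks a bare membership test:
c = None
try:
    "Warning" not in c
except TypeError as exc:
    print("TypeError:", exc)
```

Guarding with not c (or c is not None) before the membership test is what keeps the lambda from ever evaluating 'Warning' in None.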


From here, we can apply the same condition to your main concern, which is filtering <tr> elements based on their child <td> elements:
soup = BeautifulSoup(data, 'html.parser')

for tr in soup.find_all('tr'):
    if not bool(tr.find_all('td', class_=lambda c: c and 'Warning' in c)):
        print(tr) # or print(tr.find_all('td')) if you'd like to 
                  # access only the children of the filtered <tr>s

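Output

<tr role="row" class="even red" data-id="33">
   <td role="gridcell">34</td>
   <td role="gridcell">Mary</td>
   <td role="gridcell">List 2</td>
   <td role="gridcell">SOTLTM</td>
   <td role="gridcell">08/12/1996</td>
</tr>
<tr role="row" class="odd red" data-id="34">
   <td role="gridcell">35</td>
   <td role="gridcell">Tom</td>
   <td role="gridcell">List 2</td>
   <td role="gridcell">SOTLTM</td>
   <td role="gridcell">09/12/1996</td>
</tr>
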
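
As an alternative sketch (not from the original answer), the row filter can also be written without a class_ callable by reading each cell's class list directly. With html.parser, td.get("class", []) returns a list of class names, or the empty list given as the default when the attribute is missing, so the test is an exact class-name match and never sees None. The helper name has_warning_cell and the shortened sample table are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical sample; the third row has a multi-class cell.
data = """
<table>
<tr class="odd"><td class="Warning">33</td><td>Ralph</td></tr>
<tr class="even"><td>34</td><td>Mary</td></tr>
<tr class="odd"><td class="note Warning">35</td><td>Tom</td></tr>
</table>
"""

def has_warning_cell(tr):
    """Return True if any <td> inside this row carries the 'Warning' class."""
    return any("Warning" in td.get("class", []) for td in tr.find_all("td"))

soup = BeautifulSoup(data, "html.parser")

# Keep only the rows in which no cell is flagged.
clean_rows = [tr for tr in soup.find_all("tr") if not has_warning_cell(tr)]
names = [[td.get_text() for td in tr.find_all("td")] for tr in clean_rows]
print(names)  # [['34', 'Mary']]
```

Keeping the check in a named function makes it easy to swap in a different class name or a different condition later.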
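Thanks, Goren. Ultimately, I'd like it to filter out the complete <tr> if it finds "Warning" in any of its <td>s. In other words, I'd like the result to look more like [... Mary, List 2, SOTLTM, 08/12/1996, 35, Tom, List 2, SOTLTM, 09/12/1996].

Please see my update, and let me know if it doesn't work for you.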