In pandas, how do I filter a DataFrame based on conditions specified for the values of multiple given columns?
I have a simple dataset of the following form:
import pandas as pd

df = pd.DataFrame(
    [
        ["Norway"     , 7.537,  0.039, 11  , 31],
        ["Denmark"    , 7.522, -0.004, 9   , 12],
        ["Switzerland", 7.494,  None , 15  , 50],
        ["Finland"    , 7.469,  None , None, 29],
        ["Netherlands", 7.377,  1    , None, 77],
    ],
    columns = [
        "country",
        "score A",
        "score B",
        "score C",
        "score D"
    ]
)
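How can I filter this dataset by placing conditions on the values of several columns at once? For example, say I want to exclude every row (country) for which both score B and score C are null. That should exclude only the Finland row.

When I try the following, I instead exclude every row that has a null in either of these columns, so the result contains only the Norway and Denmark rows: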
df[(df["score B"].notnull()) & (df["score C"].notnull())]
How can this be done?

Combine the two conditions with | (or) rather than & (and):
df[(df["score B"].notnull()) | (df["score C"].notnull())]
Output:
       country  score A  score B  score C  score D
0       Norway    7.537    0.039     11.0       31
1      Denmark    7.522   -0.004      9.0       12
2  Switzerland    7.494      NaN     15.0       50
4  Netherlands    7.377    1.000      NaN       77
Right? You only need to exclude the case where both are null (or have I misunderstood the question?).

You need:
df[~(df['score B'].isnull() & df['score C'].isnull())]
       country  score A  score B  score C  score D
0       Norway    7.537    0.039     11.0       31
1      Denmark    7.522   -0.004      9.0       12
2  Switzerland    7.494      NaN     15.0       50
4  Netherlands    7.377    1.000      NaN       77
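
Both answers apply the same rule: "not (B is null and C is null)" is equivalent to "B is not null or C is not null". As a minimal sketch, not taken from either answer, the same filter can probably also be written with DataFrame.dropna, using subset to name the columns and how="all" so that a row is dropped only when every listed column is null. The names cols, by_dropna, by_mask, and by_negation are illustrative only.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame(
    [
        ["Norway", 7.537, 0.039, 11, 31],
        ["Denmark", 7.522, -0.004, 9, 12],
        ["Switzerland", 7.494, None, 15, 50],
        ["Finland", 7.469, None, None, 29],
        ["Netherlands", 7.377, 1, None, 77],
    ],
    columns=["country", "score A", "score B", "score C", "score D"],
)

# Hypothetical helper name: the columns the condition applies to.
cols = ["score B", "score C"]

# Drop a row only when ALL of the listed columns are null.
by_dropna = df.dropna(subset=cols, how="all")

# Equivalent boolean mask: keep rows with at least one non-null value in cols.
by_mask = df[df[cols].notnull().any(axis=1)]

# Negated form from the second answer, for comparison.
by_negation = df[~(df["score B"].isnull() & df["score C"].isnull())]

print(by_dropna["country"].tolist())
```

The mask form with any(axis=1) may be easier to extend when more than two columns are involved, since only the list in cols has to change.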