Python 跨多个列应用贴图
我试图在pandas中的多个列上应用一个映射,以反映数据无效的情况。当数据在我的df['Count']列中无效时,我想将我的df['Value']、df['Lower Confidence Interval']、df['Upper Confidence Interval']和df['Denominator']列设置为-1 这是数据帧的一个示例:Python 跨多个列应用贴图,python,pandas,Python,Pandas,我试图在pandas中的多个列上应用一个映射,以反映数据无效的情况。当数据在我的df['Count']列中无效时,我想将我的df['Value']、df['Lower Confidence Interval']、df['Upper Confidence Interval']和df['Denominator']列设置为-1 这是数据帧的一个示例: Count Value Lower Confidence Interval Upper Confidence Interval De
Count Value Lower Confidence Interval Upper Confidence Interval Denominator
121743 54.15758428 53.95153779 54.36348867 224794
280 91.80327869 88.18009411 94.38654088 305
430 56.95364238 53.39535553 60.44152684 755
970 70.54545455 68.0815009 72.89492873 1375
nan
70 28.57142857 23.27957213 34.52488678 245
125 62.5 55.6143037 68.91456314 200
目前,我正在尝试:
set_minus_1s = {np.nan: -1, '*': -1, -1: -1}
然后:
以及获取错误:
ValueError: Must have equal len keys and value when setting with an iterable
有没有办法链接列引用以调用一次映射,而不是为每个列单独设置一行来调用set\u减号\u 1s
字典作为映射?我认为您可以在应用map
之后使用或替换所有未应用的行:
val = df['Count'].map(set_minus_1s)
print (val)
0 NaN
1 NaN
2 NaN
3 NaN
4 -1.0
5 NaN
6 NaN
Name: Count, dtype: float64
cols =['Value','Count','Lower Confidence Interval','Upper Confidence Interval','Denominator']
df[cols] = df[cols].where(val.isnull(), val, axis=0)
print (df)
Count Value Lower Confidence Interval Upper Confidence Interval \
0 121743.0 54.157584 53.951538 54.363489
1 280.0 91.803279 88.180094 94.386541
2 430.0 56.953642 53.395356 60.441527
3 970.0 70.545455 68.081501 72.894929
4 -1.0 -1.000000 -1.000000 -1.000000
5 70.0 28.571429 23.279572 34.524887
6 125.0 62.500000 55.614304 68.914563
Denominator
0 224794.0
1 305.0
2 755.0
3 1375.0
4 -1.0
5 245.0
6 200.0
对不起,我的代表太低了-现在不是!向上投票。
val = df['Count'].map(set_minus_1s)
print (val)
0 NaN
1 NaN
2 NaN
3 NaN
4 -1.0
5 NaN
6 NaN
Name: Count, dtype: float64
cols =['Value','Count','Lower Confidence Interval','Upper Confidence Interval','Denominator']
df[cols] = df[cols].where(val.isnull(), val, axis=0)
print (df)
Count Value Lower Confidence Interval Upper Confidence Interval \
0 121743.0 54.157584 53.951538 54.363489
1 280.0 91.803279 88.180094 94.386541
2 430.0 56.953642 53.395356 60.441527
3 970.0 70.545455 68.081501 72.894929
4 -1.0 -1.000000 -1.000000 -1.000000
5 70.0 28.571429 23.279572 34.524887
6 125.0 62.500000 55.614304 68.914563
Denominator
0 224794.0
1 305.0
2 755.0
3 1375.0
4 -1.0
5 245.0
6 200.0
cols = ['Value', 'Count', 'Lower Confidence Interval', 'Upper Confidence Interval', 'Denominator']
df[cols] = df[cols].mask(val.notnull(), val, axis=0)
print (df)
Count Value Lower Confidence Interval Upper Confidence Interval \
0 121743.0 54.157584 53.951538 54.363489
1 280.0 91.803279 88.180094 94.386541
2 430.0 56.953642 53.395356 60.441527
3 970.0 70.545455 68.081501 72.894929
4 -1.0 -1.000000 -1.000000 -1.000000
5 70.0 28.571429 23.279572 34.524887
6 125.0 62.500000 55.614304 68.914563
Denominator
0 224794.0
1 305.0
2 755.0
3 1375.0
4 -1.0
5 245.0
6 200.0