Pandas: how do I convert a month's values to NaN?
I have this df:
CODE DATE TMAX TMIN PP
0 000130 1991-01-01 32.6 23.4 0.0
1 000130 1991-01-02 31.2 22.4 0.0
2 000130 1991-01-03 32.0 NaN 0.0
3 000130 1991-01-04 32.2 23.0 0.0
4 000130 1991-01-05 30.5 22.0 0.0
... ... ... ... ...
10865 000130 2020-12-31 NaN NaN NaN
10866 000132 1991-01-01 35.2 NaN 0.0
10867 000132 1991-01-02 34.6 NaN 0.0
10868 000132 1991-01-03 35.8 NaN 0.0
10869 000132 1991-01-04 34.8 NaN 0.0
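For the PP column, I need to convert a month's PP data to NaN only when that month contains 3 or more NaN values (not necessarily consecutive). For example, if the PP column has 3 NaN values in January 1991 (they need not be consecutive), then every PP value for January 1991 must become NaN. The same applies to every month of every year. I need to do this in code. I thought of starting with df.groupby('CODE'), but I don't know how to continue. I would appreciate any help.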
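Thanks in advance.
Create a Series that counts the number of NaN values in each CODE x year-month group; you can then use that Series to mask the original column wherever the count is >= 3.
Here is some sample data in which [000130, 1991-01] should become NaN, while [000130, 2020-12] and [000132, 1991-01] stay unchanged: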
### Sample Data
CODE DATE TMAX TMIN PP
0 000130 1991-01-01 32.6 23.4 NaN
1 000130 1991-01-02 31.2 22.4 NaN
2 000130 1991-01-03 32.0 NaN 0.0
3 000130 1991-01-04 32.2 23.0 NaN
4 000130 1991-01-05 30.5 22.0 0.0
10865 000130 2020-12-31 NaN NaN 0.0
10866 000132 1991-01-01 35.2 NaN 0.0
10867 000132 1991-01-02 34.6 NaN NaN
10868 000132 1991-01-03 35.8 NaN 0.0
10869 000132 1991-01-04 34.8 NaN 0.0
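import pandas as pd
df['DATE'] = pd.to_datetime(df['DATE'])
# Count NaNs in PP per (CODE, year-month) group; dt.to_period('M') supplies the month key.
s = df['PP'].isnull().groupby([df['CODE'], df['DATE'].dt.to_period('M')]).transform('sum')
# Set PP to NaN wherever its month holds 3 or more NaNs.
df['PP'] = df['PP'].mask(s.ge(3))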
# CODE DATE TMAX TMIN PP
#0 000130 1991-01-01 32.6 23.4 NaN
#1 000130 1991-01-02 31.2 22.4 NaN
#2 000130 1991-01-03 32.0 NaN NaN
#3 000130 1991-01-04 32.2 23.0 NaN
#4 000130 1991-01-05 30.5 22.0 NaN
#10865 000130 2020-12-31 NaN NaN 0.0
#10866 000132 1991-01-01 35.2 NaN 0.0
#10867 000132 1991-01-02 34.6 NaN NaN
#10868 000132 1991-01-03 35.8 NaN 0.0
#10869 000132 1991-01-04 34.8 NaN 0.0
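For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same count-then-mask idea. The function name `mask_sparse_months`, its `col` and `min_nans` parameters, and the `sample` frame are hypothetical names introduced for illustration; the frame reproduces only the CODE, DATE and PP columns of the sample data above, and the month key is built with `dt.to_period('M')`.

```python
import numpy as np
import pandas as pd


def mask_sparse_months(df, col="PP", min_nans=3):
    """Return a copy of df in which `col` is NaN for every
    (CODE, year-month) group holding at least `min_nans` missing values.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    out = df.copy()
    dates = pd.to_datetime(out["DATE"])
    # Count missing values per station and calendar month,
    # broadcast back to one count per original row.
    nan_count = (
        out[col].isna()
        .groupby([out["CODE"], dates.dt.to_period("M")])
        .transform("sum")
    )
    # Keep values where the count is below the threshold, else NaN.
    out[col] = out[col].mask(nan_count >= min_nans)
    return out


# Reduced copy of the sample data (TMAX/TMIN omitted for brevity).
sample = pd.DataFrame({
    "CODE": ["000130"] * 6 + ["000132"] * 4,
    "DATE": ["1991-01-01", "1991-01-02", "1991-01-03", "1991-01-04",
             "1991-01-05", "2020-12-31",
             "1991-01-01", "1991-01-02", "1991-01-03", "1991-01-04"],
    "PP": [np.nan, np.nan, 0.0, np.nan, 0.0, 0.0, 0.0, np.nan, 0.0, 0.0],
})

result = mask_sparse_months(sample)
print(result)
```

Because the counts come from `transform('sum')`, they line up row for row with the original frame, so no merge is needed before masking. Grouping by both CODE and the month keeps stations independent: the single NaN for 000132 in 1991-01 is not added to the three NaNs for 000130 in the same month.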