Python: Fill NaN with the mean using groupby

My dataset looks like this:

Month DayOfWeek  Class A1  A2 ... A999
July  Monday     Bata  7   9  ... 5
July  Tuesay     Bata  3   1  ... 2
July  Sunday     Bata  4   5  ... 6
July  Monday     Adid  9   8  ... 5
July  Sunday     Adid  4   0  ... 4
Sept  Monday     Nike  7   5  ... 7
Sept  Sunday     Nike  8   3  ... 7
Sept  Satday     Adid  2   7  ... 7
Sept  Monday     Bata  8   9  ... 4
Oct   Monday     Nike  4   2  ... 5
Oct   Sunday     Bata  8   6  ... 3
July  Monday     Nike  NaN NaN    NaN
Sept  Sunday     Nike  NaN NaN    NaN
Oct   Satday     Nike  NaN NaN    NaN
Sept  Monday     Bata  NaN NaN    NaN
I want to fill the NaNs with the mean of the earlier records.

I know I can use

df['A1'] = df['A1'].fillna((df['A1'].mean()))
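But this is a poor approach, because I have more than 1000 columns and the number may grow later.

In addition,

I want to compute the mean based on Month and DayOfWeek.

For example, for the record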

July  Monday     Nike  NaN NaN    NaN
the mean should be taken only over the records with Month = July and DayOfWeek = Monday.

How can I do this?

Here you go:

df['A1'] = df.groupby(['Month','DayOfWeek'])['A1'].transform(lambda x: x.fillna(x.mean()))
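In the sample data, the group Month=Oct & DayOfWeek=Satday contains no non-null values, so the line above will still leave a null there. In that case, you may need a second pass that fills with the mean for the month or the mean for the DayOfWeek. The following snippet fills the remaining nulls with the mean of the record's month: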

df['A1'] = df.groupby('Month')['A1'].transform(lambda x: x.fillna(x.mean()))
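
Since the question mentions more than 1000 columns, here is a minimal sketch of applying the same two-step fill to every value column at once. The sample values are made up for illustration, and it assumes that every column other than Month, DayOfWeek and Class should be filled; adjust `key_cols` if your data differs.

```python
import numpy as np
import pandas as pd

# Small made-up sample in the shape of the question's data (values are illustrative).
df = pd.DataFrame({
    "Month":     ["July", "July", "Sept", "Oct", "July", "Oct"],
    "DayOfWeek": ["Monday", "Monday", "Sunday", "Monday", "Monday", "Satday"],
    "Class":     ["Bata", "Adid", "Nike", "Nike", "Nike", "Nike"],
    "A1":        [7.0, 9.0, 8.0, 4.0, np.nan, np.nan],
    "A2":        [9.0, 8.0, 3.0, 2.0, np.nan, np.nan],
})

# Assumption: everything except these columns is a numeric value column to fill.
key_cols = ["Month", "DayOfWeek", "Class"]
value_cols = [c for c in df.columns if c not in key_cols]

# First pass: mean of each (Month, DayOfWeek) group, aligned to the original rows.
group_means = df.groupby(["Month", "DayOfWeek"])[value_cols].transform("mean")

# Fallback: mean of each Month, used where the (Month, DayOfWeek) group is all NaN.
month_means = df.groupby("Month")[value_cols].transform("mean")

# Assign back into the value columns only, so the key columns are untouched.
df[value_cols] = df[value_cols].fillna(group_means).fillna(month_means)
print(df)
```

Using `transform("mean")` instead of a lambda avoids looping over columns in Python, which may matter with a wide frame. If a whole month has no values for some column, NaNs would still remain and a further fallback (for example the column's overall mean) would be needed.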

Upvote if this helps.

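Do you know about multi-level indexes? I once used them on a similar problem, and they helped me drill down to compute KPIs similar to the ones you are looking for…

Thanks, but after applying this line I lose the Month and DayOfWeek columns; they are no longer in the DataFrame?!! Is there any solution?

When I run these lines, all of my columns stay intact. Would you like to share what you wrote?

Running df = df.groupby('Month')... instead of df['A1'] = df.groupby('Month')... will make you lose the other columns.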
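
The difference described in the last comment can be illustrated with a short sketch (the three-row frame is made up for the example): the groupby/transform expression returns a Series holding only A1, so rebinding `df` to it would discard the other columns, while assigning into `df['A1']` keeps them.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame, just to show which object each expression produces.
df = pd.DataFrame({
    "Month":     ["July", "July", "Sept"],
    "DayOfWeek": ["Monday", "Monday", "Sunday"],
    "A1":        [7.0, np.nan, 8.0],
})

filled = df.groupby("Month")["A1"].transform(lambda x: x.fillna(x.mean()))

# `filled` is a Series with only the A1 values; writing `df = filled`
# would therefore drop Month and DayOfWeek.
print(type(filled).__name__)

# Assigning into the column keeps the rest of the DataFrame intact.
df["A1"] = filled
print(list(df.columns))
```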