Python: replace NaN values with the median of all rows, but only for selected rows?


This is a follow-up to my previous question: I now need to restrict the replacement to a specific selection of rows.

import numpy as np
import pandas as pd

np.random.seed(0)
rng = pd.date_range('2020-09-24', periods=20, freq='12min')  # 12-minute steps, i.e. 0.2 hours
df = pd.DataFrame({ 'Date': rng, 'Val': np.random.randn(len(rng)), 'Dist' :np.random.randn(len(rng)), 'Variant' : ["Red", "Blue", "Blue", "Yellow","Blue", "Blue", "Yellow", "Blue", "Yellow","Blue", "Red", "Red", "Red", "Red","Blue", "Blue",  "Yellow","Red",  "Yellow", "Yellow"]})
# Assign through .loc instead of chained indexing so that df itself is modified
df.loc[df.Dist <= -0.6, 'Dist'] = np.nan
df.loc[df.Val <= -0.5, 'Val'] = np.nan
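What I can't work out is how to compute the hourly median over all the values in a column, but use it to fill NaNs only in the rows where the Variant column is Red. As before, the medians for the Dist and Val columns are computed independently of each other.
NaN values in the Yellow and Blue rows should be left as they are.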

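Build a boolean index for the Red variant, compute the median with groupby using the Variant column as well as the hour, then update only the target index.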

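cols = ['Val', 'Dist']
idx_red = df.Variant.eq('Red')  # boolean mask of the Red rows
# Median per (hour, Variant) group, broadcast back to one value per original row
medians = df.groupby([df.Date.dt.floor('h'), 'Variant'])[cols].transform('median')
df.loc[idx_red, cols] = df.loc[idx_red, cols].fillna(medians[idx_red])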
Output:

                  Date       Val      Dist Variant
0  2020-09-24 00:00:00  1.764052       NaN     Red
1  2020-09-24 00:12:00  0.400157  0.653619    Blue
2  2020-09-24 00:24:00  0.978738  0.864436    Blue
3  2020-09-24 00:36:00  2.240893       NaN  Yellow
4  2020-09-24 00:48:00  1.867558  2.269755    Blue
5  2020-09-24 01:00:00       NaN       NaN    Blue
6  2020-09-24 01:12:00  0.950088  0.045759  Yellow
7  2020-09-24 01:24:00 -0.151357 -0.187184    Blue
8  2020-09-24 01:36:00 -0.103219  1.532779  Yellow
9  2020-09-24 01:48:00  0.410599  1.469359    Blue
10 2020-09-24 02:00:00  0.144044  0.154947     Red
11 2020-09-24 02:12:00  1.454274  0.378163     Red
12 2020-09-24 02:24:00  0.761038  0.266555     Red
13 2020-09-24 02:36:00  0.121675  0.266555     Red
14 2020-09-24 02:48:00  0.443863 -0.347912    Blue
15 2020-09-24 03:00:00  0.333674  0.156349    Blue
16 2020-09-24 03:12:00  1.494079  1.230291  Yellow
17 2020-09-24 03:24:00 -0.205158  1.202380     Red
18 2020-09-24 03:36:00  0.313068 -0.387327  Yellow
19 2020-09-24 03:48:00       NaN -0.302303  Yellow

Note: variants other than Red are not updated, and a Red row whose group contains only NaNs is not updated either (row 0 above).
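To make the effect of the grouping keys concrete, here is a small sketch on hand-picked data. The `toy` frame and its values are hypothetical, not the data above. It compares a median taken per hour and Variant with a median taken per hour over all variants, which is how the question's wording could also be read. In both cases only NaNs in Red rows are filled.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two hourly buckets, values chosen by hand.
toy = pd.DataFrame({
    'Date': pd.to_datetime([
        '2020-09-24 00:00', '2020-09-24 00:20', '2020-09-24 00:40',
        '2020-09-24 01:00', '2020-09-24 01:20', '2020-09-24 01:40',
    ]),
    'Val': [np.nan, 2.0, 4.0, 1.0, np.nan, 3.0],
    'Variant': ['Red', 'Blue', 'Blue', 'Red', 'Red', 'Blue'],
})

hour = toy['Date'].dt.floor('h')
# Rows that are Red and currently missing a value.
red_missing = toy['Variant'].eq('Red') & toy['Val'].isna()

# Median per (hour, Variant): a Red NaN can only be filled from Red rows.
per_variant = toy.groupby([hour, 'Variant'])['Val'].transform('median')
# Median per hour over every variant.
per_hour = toy.groupby(hour)['Val'].transform('median')

# Replace values only where red_missing is True; all other rows are untouched.
filled_variant = toy['Val'].mask(red_missing, per_variant)
filled_hour = toy['Val'].mask(red_missing, per_hour)

print(filled_variant.tolist())  # [nan, 2.0, 4.0, 1.0, 1.0, 3.0]
print(filled_hour.tolist())     # [3.0, 2.0, 4.0, 1.0, 2.0, 3.0]
```

In the first result row 0 stays NaN because the only Red value in that hour is itself missing. In the second it is filled from the Blue rows of the same hour. Which behaviour is wanted depends on whether other variants should contribute to the median.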

Thank you very much for this, especially for .eq, which I didn't know about.
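For anyone else new to it, `Series.eq` is the method form of the `==` comparison. A minimal sketch, using a made-up `variant` Series:

```python
import pandas as pd

# Hypothetical example column.
variant = pd.Series(['Red', 'Blue', 'Yellow', 'Red'])

mask_method = variant.eq('Red')    # method form
mask_operator = variant == 'Red'   # operator form

print(mask_method.tolist())               # [True, False, False, True]
print(mask_method.equals(mask_operator))  # True

# The method form chains without extra parentheses.
n_red = variant.eq('Red').sum()
print(n_red)  # 2
```

The two forms are interchangeable here, so the choice is mostly about readability when chaining.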