Python: groupby and fill NaN with the mean of the previous and next values in pandas

I am trying to fill NaN cells with the mean of the values before and after them:

   type     date        v1       v2
0     a  2018-09  21511.11  17696.8
1     a  2018-10       NaN      NaN
2     a  2018-11       NaN      NaN
3     a  2018-12  30319.98  24553.6
4     a  2019-01       NaN      NaN
5     a  2019-02       NaN      NaN
6     a  2019-03   7409.61   6110.0
7     a  2019-04       NaN      NaN
8     a  2019-05       NaN      NaN
9     a  2019-06  15212.51  12590.5
10    a  2019-07       NaN      NaN
11    a  2019-08       NaN      NaN
12    a  2019-09  23129.96  19160.9
13    a  2019-10       NaN      NaN
14    a  2019-11       NaN      NaN
15    b  2018-09  21511.11  17696.8
16    b  2018-10       NaN      NaN
17    b  2018-11       NaN      NaN
18    b  2018-12  30319.98  24553.6
19    b  2019-01       NaN      NaN
20    b  2019-02       NaN      NaN
21    b  2019-03   7409.61   6110.0
22    b  2019-04       NaN      NaN
23    b  2019-05       NaN      NaN
24    b  2019-06  15212.51  12590.5
25    b  2019-07       NaN      NaN
26    b  2019-08       NaN      NaN
27    b  2019-09  23129.96  19160.9
28    b  2019-10       NaN      NaN
29    b  2019-11       NaN      NaN

I tried the following code, based on a reference:

I got:

   type     date         v1        v2
0     a  2018-09  21511.110  17696.80
1     a  2018-10  25915.545  21125.20
2     a  2018-11  25915.545  21125.20
3     a  2018-12  30319.980  24553.60
4     a  2019-01  18864.795  15331.80
5     a  2019-02  18864.795  15331.80
6     a  2019-03   7409.610   6110.00
7     a  2019-04  11311.060   9350.25
8     a  2019-05  11311.060   9350.25
9     a  2019-06  15212.510  12590.50
10    a  2019-07  19171.235  15875.70
11    a  2019-08  19171.235  15875.70
12    a  2019-09  23129.960  19160.90
13    a  2019-10  22320.535  18428.85
14    a  2019-11  22320.535  18428.85
15    b  2018-09  21511.110  17696.80
16    b  2018-10  25915.545  21125.20
17    b  2018-11  25915.545  21125.20
18    b  2018-12  30319.980  24553.60
19    b  2019-01  18864.795  15331.80
20    b  2019-02  18864.795  15331.80
21    b  2019-03   7409.610   6110.00
22    b  2019-04  11311.060   9350.25
23    b  2019-05  11311.060   9350.25
24    b  2019-06  15212.510  12590.50
25    b  2019-07  19171.235  15875.70
26    b  2019-08  19171.235  15875.70
27    b  2019-09  23129.960  19160.90
28    b  2019-10  23129.960  19160.90
29    b  2019-11  23129.960  19160.90

But I don't know how to group by type and apply the code above. Can anyone help? Thanks.

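Add groupby with the list of columns to process. Also fill the first and last missing values of each group separately (a per-group back fill followed by a forward fill), so that values are never carried over from one group into another (this matters when a group has NaNs at its start or end):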

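g = df.groupby('type')[['v1', 'v2']]
# mean of the previous and next valid values within each group
df[['v1', 'v2']] = (g.ffill() + g.bfill()) / 2

# leading/trailing NaNs have only one neighbour: fill them within their own group
df[['v1', 'v2']] = df.groupby('type')[['v1', 'v2']].transform(lambda x: x.bfill().ffill())
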
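A minimal, self-contained sketch of the per-group fill, run on a small hypothetical frame (the data and the names filled and cols are made up for illustration, not taken from the question). It assumes a reasonably recent pandas and uses transform for the edge fill:

```python
import numpy as np
import pandas as pd

# Hypothetical data: two groups, each with an interior NaN and trailing NaNs
df = pd.DataFrame({
    "type": ["a"] * 5 + ["b"] * 5,
    "v1": [10.0, np.nan, 30.0, np.nan, np.nan,
           100.0, np.nan, 300.0, np.nan, np.nan],
})

cols = ["v1"]
g = df.groupby("type")[cols]

# Interior NaNs: average of the previous and next valid value in the same group
filled = (g.ffill() + g.bfill()) / 2

# Edge NaNs (still NaN after the step above): take the nearest valid value,
# again looking only inside the same group
filled = filled.groupby(df["type"]).transform(lambda s: s.bfill().ffill())

df[cols] = filled
print(df)
```

Without the grouping, the trailing NaNs of group a would be averaged with the first value of group b, which appears to be what happens in rows 13 and 14 of the output shown in the question.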
Solution for numeric columns only:

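import numpy as np

cols = df.select_dtypes(np.number).columns

g = df.groupby('type')[cols]
# mean of the previous and next valid values within each group
df[cols] = (g.ffill() + g.bfill()) / 2
# leading/trailing NaNs in a group: fill within that group only
df[cols] = df.groupby('type')[cols].transform(lambda x: x.bfill().ffill())
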
Just as you described:

 df[['v1','v2']] = (df.groupby('type')[['v1','v2']]
                      .agg(['bfill','ffill'])
                      .groupby(level=0, axis=1)
                      .mean()
                   )
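
The axis=1 argument of DataFrame.groupby is deprecated in recent pandas releases, so the chain above may warn or fail there. The following is a sketch of the same idea (average the back-filled and forward-filled values per group) that avoids axis=1. It assumes the frame has a unique index; the sample data is hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical data: interior NaNs in both groups, a trailing NaN in group "a"
df = pd.DataFrame({
    "type": ["a", "a", "a", "a", "b", "b", "b"],
    "v1": [1.0, np.nan, 3.0, np.nan, 10.0, np.nan, 30.0],
    "v2": [2.0, np.nan, 6.0, np.nan, 20.0, np.nan, 60.0],
})

cols = ["v1", "v2"]
g = df.groupby("type")[cols]

# Stack the per-group back fill and forward fill on top of each other;
# every original row label now appears twice
stacked = pd.concat([g.bfill(), g.ffill()])

# Average the two copies of each row; mean() skips NaN, so an edge NaN
# simply takes the single neighbour that exists inside its group
df[cols] = stacked.groupby(level=0).mean()
print(df)
```

Because mean skips NaN, this variant handles the first and last rows of each group without a separate fill step.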

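Use df.groupby('type') and apply your logic to the resulting grouped DataFrame.
What if I want to apply it to all numeric columns instead of naming v1, v2, etc.? Thanks. Should it be numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64'] with cols = df.select_dtypes(include=numerics).columns, or cols = df.select_dtypes(np.number).columns?
@ahbon - I think so; I think it also includes complex numbers ;)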
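
To illustrate the difference between the two column selections discussed in the comments, here is a small sketch on a hypothetical frame (the column names i, f and c are made up). It suggests that np.number selects a superset of the explicit dtype list, including complex columns:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: a string key plus integer, float and complex columns
df = pd.DataFrame({
    "type": ["a", "b"],
    "i": np.array([1, 2], dtype="int64"),
    "f": [0.5, 1.5],
    "c": [1 + 2j, 3 + 4j],
})

# Explicit list of dtype names, as in the comment
numerics = ["int16", "int32", "int64", "float16", "float32", "float64"]
cols_list = df.select_dtypes(include=numerics).columns.tolist()

# Generic numeric selection via the NumPy abstract type
cols_number = df.select_dtypes(include=np.number).columns.tolist()

print(cols_list)
print(cols_number)
```

If complex columns should not be filled, the explicit list (or an exclude argument) is the safer choice.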