Python pandas: returning from a groupby

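My goal is to take the data for each type and interpolate the missing values against a specific column. I have achieved that, but I am struggling to get back to the shape the DataFrame had before the interpolation.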

import pandas as pd

data = [
    {"type": "Car", "avg_speed": 30, "max_speed": 200},
    {"type": "Car", "avg_speed": 20, "max_speed": 100},
    {"type": "Car", "avg_speed": 25, "max_speed": None},
    {"type": "Plane", "avg_speed": 300, "max_speed": 2000},
    {"type": "Plane", "avg_speed": 200, "max_speed": 1000},
    {"type": "Plane", "avg_speed": 250, "max_speed": None}
]


df = pd.DataFrame(data)
print(df)
post_interp = df.groupby("type").apply(lambda x: x.set_index(
    'avg_speed').sort_index().interpolate(method='index'))
print(post_interp)
First print:

    type  avg_speed  max_speed
0    Car         30      200.0
1    Car         20      100.0
2    Car         25        NaN
3  Plane        300     2000.0
4  Plane        200     1000.0
5  Plane        250        NaN
Second print:

                  type  max_speed
type  avg_speed
Car   20           Car      100.0
      25           Car      150.0
      30           Car      200.0
Plane 200        Plane     1000.0
      250        Plane     1500.0
      300        Plane     2000.0

I want to get back to the shape of the DataFrame in the first print, but with the interpolated values filled in.

Add group_keys=False to the groupby call to avoid the duplicated index level, then finish with reset_index().
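The snippet that originally accompanied this suggestion is not preserved here, so the following is only a sketch of what it might look like. The helper name interpolate_group is hypothetical, and two details are assumptions rather than part of the answer: the columns are selected explicitly after groupby so the grouping column stays available inside apply across pandas versions, and only max_speed is interpolated so the text column type is never passed to interpolate.

```python
import pandas as pd

data = [
    {"type": "Car", "avg_speed": 30, "max_speed": 200},
    {"type": "Car", "avg_speed": 20, "max_speed": 100},
    {"type": "Car", "avg_speed": 25, "max_speed": None},
    {"type": "Plane", "avg_speed": 300, "max_speed": 2000},
    {"type": "Plane", "avg_speed": 200, "max_speed": 1000},
    {"type": "Plane", "avg_speed": 250, "max_speed": None},
]
df = pd.DataFrame(data)


def interpolate_group(group):
    # Hypothetical helper: use avg_speed as the x-axis for this group.
    group = group.set_index("avg_speed").sort_index()
    # Interpolate only the numeric column; leave "type" untouched.
    group["max_speed"] = group["max_speed"].interpolate(method="index")
    return group


post_interp = (
    # group_keys=False keeps "type" out of the result index.
    df.groupby("type", group_keys=False)[["type", "avg_speed", "max_speed"]]
      .apply(interpolate_group)
      # Turn the avg_speed index back into an ordinary column.
      .reset_index()
)
print(post_interp)
```

With group_keys=False the result carries only the avg_speed index, so a single reset_index() is enough to return to a flat frame.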

Another solution uses reset_index twice:

post_interp = (df.groupby("type")
                 .apply(lambda x: x.set_index('avg_speed')
                                   .sort_index()
                                   .interpolate(method='index'))
                 .reset_index(level=0, drop=True)
                 .reset_index())
Or you can create the index before the groupby:

post_interp = (df.set_index('avg_speed')
                 .sort_index()
                 .groupby("type", group_keys=False)
                 .apply(lambda x: x.interpolate(method='index'))
                 .reset_index())
print(post_interp)
   avg_speed   type  max_speed
0         20    Car      100.0
1         25    Car      150.0
2         30    Car      200.0
3        200  Plane     1000.0
4        250  Plane     1500.0
5        300  Plane     2000.0
Finally, if necessary, reorder the columns so they match the original DataFrame.
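The code for this last step is missing from the text, so here is one plausible way to do it, offered as a sketch. The small post_interp frame is rebuilt by hand to mirror the output shown above, and original_columns is an illustrative name standing in for df.columns.

```python
import pandas as pd

# Column order of the original DataFrame (stand-in for df.columns).
original_columns = ["type", "avg_speed", "max_speed"]

# A hand-built frame in the column order produced by reset_index().
post_interp = pd.DataFrame({
    "avg_speed": [20, 25, 30],
    "type": ["Car", "Car", "Car"],
    "max_speed": [100.0, 150.0, 200.0],
})

# Reorder the columns to match the original frame.
post_interp = post_interp.reindex(columns=original_columns)
print(post_interp)
```

Plain column selection, post_interp[original_columns], should give the same result when every column is present.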


Do you want to assign the new max_speed and avg_speed columns back to the original DataFrame?
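If the aim is indeed to write the interpolated values back onto the original rows, keeping the row order and index of the first print, something along these lines may work. This is a sketch rather than the answerer's method: it swaps apply for transform, and the names ordered, filled and result are illustrative.

```python
import pandas as pd

data = [
    {"type": "Car", "avg_speed": 30, "max_speed": 200},
    {"type": "Car", "avg_speed": 20, "max_speed": 100},
    {"type": "Car", "avg_speed": 25, "max_speed": None},
    {"type": "Plane", "avg_speed": 300, "max_speed": 2000},
    {"type": "Plane", "avg_speed": 200, "max_speed": 1000},
    {"type": "Plane", "avg_speed": 250, "max_speed": None},
]
df = pd.DataFrame(data)

# Sort so each group's rows are contiguous and ordered by avg_speed.
# The original integer index is kept and used to restore the order later.
ordered = df.sort_values(["type", "avg_speed"])

# Interpolate max_speed within each type, with avg_speed as the x-axis.
filled = (
    ordered.set_index("avg_speed")
           .groupby("type")["max_speed"]
           .transform(lambda s: s.interpolate(method="index"))
)

# Write the values back by position, then restore the original row order.
result = ordered.assign(max_speed=filled.to_numpy()).sort_index()
print(result)
```

Because transform returns one value per input row, the result lines up with the sorted frame and no index levels need to be dropped afterwards.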