Pandas: linear 1-D interpolation over multiple datasets using a loop (pandas, python-2.7, scipy, linear-interpolation)

pandas python-2.7

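I would like to perform linear interpolation using the scipy.interpolate library. The dataset looks something like this:

I want to use the interpolating function to find the missing Y values for this dataset:

Only 3 runs are shown here, but the dataset I am actually working with has 1000 runs. I would appreciate any suggestions on how to build the interpolating functions iteratively.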

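from scipy.interpolate import interp1d

InterpolatedFunction = {}  # one interpolating function per run
for RUNNumber in range(TotalRuns):
    # X and Y should hold the points that belong to this run
    InterpolatedFunction[RUNNumber] = interp1d(X, Y)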

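As I understand it, you need to define a separate interpolating function for each run and then apply those functions to a second DataFrame. I defined a DataFrame df with columns ['X', 'Y', 'RUN'] and a second DataFrame new_df with columns ['X', 'Y_interpolation', 'RUN'].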

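from scipy.interpolate import interp1d

interpolating_functions = dict()
# Assumes runs are numbered 1..max_runs; the +1 keeps the last run in the loop
for run_number in range(1, max_runs + 1):
    run_data = df[df['RUN'] == run_number][['X', 'Y']]
    interpolating_functions[run_number] = interp1d(run_data['X'], run_data['Y'])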
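If the run numbers are not a contiguous range, grouping on the RUN column avoids having to know them in advance. The following is only a sketch: the sample values in df are made up for illustration and are not the asker's data.

```python
import pandas as pd
from scipy.interpolate import interp1d

# Hypothetical sample data: three runs, each with three known (X, Y) points
df = pd.DataFrame({
    'RUN': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'X':   [0.0, 5.0, 10.0, 0.0, 5.0, 10.0, 0.0, 5.0, 10.0],
    'Y':   [0.0, 10.0, 20.0, 1.0, 2.0, 3.0, 5.0, 5.0, 5.0],
})

# groupby yields one (run_number, sub-frame) pair per distinct RUN value,
# so every run present in the data gets exactly one interpolating function
interpolating_functions = {
    run_number: interp1d(run_data['X'], run_data['Y'])
    for run_number, run_data in df.groupby('RUN')
}
```

This builds the same kind of dictionary as the loop, keyed by whatever run numbers actually occur in df.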

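Now that we have an interpolating function for each run, we can use them to fill the 'Y_interpolation' column of the new DataFrame. This can be done with the apply function, which takes a function and applies it to every row of the DataFrame. Let's define an interpolate function that takes a row of new_df and uses its X value and run number to compute the interpolated Y value.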

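def interpolate(row):
    # Look up the interpolating function that belongs to this row's run
    int_func = interpolating_functions[row['RUN']]
    # Calling the interp1d object is the public API; for a scalar input it
    # returns a 0-d array, so convert it to a plain float
    return float(int_func(row['X']))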
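One thing to watch for: by default interp1d raises a ValueError when asked for an X outside the range it was built from. If new_df may contain such values, a possible workaround is to let the function return NaN instead. The sketch below uses made-up points, not the asker's data.

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical known points for a single run
f = interp1d([0.0, 5.0, 10.0], [0.0, 10.0, 20.0],
             bounds_error=False, fill_value=np.nan)

inside = float(f(2.5))    # within [0, 10]: linearly interpolated
outside = float(f(12.0))  # beyond the data: NaN instead of an exception
```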

Now we just use apply with the interpolate function defined above:

new_df['Y_interpolation'] = new_df.apply(interpolate,axis=1)
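With 1000 runs, a row-by-row apply can get slow. A possible alternative, sketched here with made-up sample data, is to process each run in one vectorized call to np.interp. Note that this swaps in a different function from the one used above: np.interp clamps to the end values outside the known X range instead of raising an error, and it expects the known X values in increasing order.

```python
import numpy as np
import pandas as pd

# Hypothetical known points for three runs
df = pd.DataFrame({
    'RUN': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'X':   [0.0, 5.0, 10.0, 0.0, 5.0, 10.0, 0.0, 5.0, 10.0],
    'Y':   [0.0, 10.0, 20.0, 1.0, 2.0, 3.0, 5.0, 5.0, 5.0],
})
# Hypothetical query points whose Y values are missing
new_df = pd.DataFrame({'RUN': [1, 2, 3, 1], 'X': [2.5, 7.5, 4.0, 10.0]})

new_df['Y_interpolation'] = np.nan
for run_number, rows in new_df.groupby('RUN'):
    # np.interp needs the known X values sorted in increasing order
    known = df[df['RUN'] == run_number].sort_values('X')
    new_df.loc[rows.index, 'Y_interpolation'] = np.interp(
        rows['X'].values, known['X'].values, known['Y'].values)
```

Each run is handled in a single array operation, so the Python-level loop runs once per run rather than once per row.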

I am using pandas version 0.20.3, and this gives me a new_df that looks like this:

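Thanks a lot @A.Entuluva for the general idea on how to approach this problem. However, I get the error "ValueError: Wrong number of items passed 2, placement implies 1" when I try to use the apply function. Could you share your code? I am using: new_df1 = {'X': [1,4,12998,1,4,12998,1,4,12998], 'RUN': [1,1,1,1,2,2,2,2,3,3,3]} new_df = pd.DataFrame(new_df1) new_df['Y_interpolation'] = new_df.apply(interpolate, axis=1)

Your code runs fine for me. What version of pandas are you using? The syntax of apply may have changed.

Hi @A.Entuluva, I am using 0.16.2. I will try the same approach after upgrading to 0.20.3. Thanks.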