Python: row-wise prediction on a DataFrame by passing sklearn.predict to df.apply


python pandas scikit-learn


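Suppose we have a pandas DataFrame and a scikit-learn model that was trained (fit) on that DataFrame. Is there a way to make row-wise predictions? The use case is filling null values in the DataFrame using the sklearn model's predict function.

I expected this to be possible with the pandas apply function (with axis=1), but I keep getting dimension errors.

Using pandas version '0.22.0' and sklearn version '0.19.1'.

A simple example: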

import pandas as pd
from sklearn.cluster import KMeans

data = [[x,y,x*y] for x in range(1,10) for y in range(10,15)]

df = pd.DataFrame(data,columns=['input1','input2','output'])

model = KMeans()
# KMeans is unsupervised; the second argument (y) is accepted but ignored
model.fit(df[['input1','input2']],df['output'])

# Each row reaches model.predict as a 1D Series, which triggers the error below
df['predictions'] = df[['input1','input2']].apply(model.predict,axis=1)

This produces the following dimension error:

ValueError: ('Expected 2D array, got 1D array instead:\narray=[ 1. 
10.].\nReshape your data either using array.reshape(-1, 1) if your data has 
a single feature or array.reshape(1, -1) if it contains a single sample.', 
'occurred at index 0')

Running predict on the whole columns works fine:

df['predictions'] = model.predict(df[['input1','input2']])

However, I would like the flexibility to use it row by row.

I have tried several ways of reshaping the data first, for example:

import numpy as np

def reshape_predict(df):
    return model.predict(np.reshape(df.values,(1,-1)))

df[['input1','input2']].apply(reshape_predict,axis=1)

This runs without an error but only returns the inputs, whereas I expected it to return a column of output values (as arrays).

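Solution:

Thanks to Yakym for providing a working solution! After trying a few variations on his suggestion, the simplest solution is to wrap the row values in square brackets (I had tried this before, but without the 0 index on the prediction, and had no luck).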
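The square-bracket variant is not shown in the post, so the following is only a sketch of what it might look like. The `n_clusters`, `n_init` and `random_state` arguments, the names `X`, `row_wise` and `batch`, and fitting on a NumPy array rather than the DataFrame are assumptions made here for reproducibility.

```python
import pandas as pd
from sklearn.cluster import KMeans

data = [[x, y, x * y] for x in range(1, 10) for y in range(10, 15)]
df = pd.DataFrame(data, columns=["input1", "input2", "output"])
X = df[["input1", "input2"]]

# Assumed settings: fixed seed and cluster count so the result is repeatable
model = KMeans(n_clusters=3, n_init=10, random_state=0)
model.fit(X.to_numpy())

# [row.to_numpy()] is a list holding one sample, i.e. a 2D input of shape (1, 2);
# predict returns an array of length 1, so [0] extracts the single label
row_wise = X.apply(lambda row: model.predict([row.to_numpy()])[0], axis=1)

# Batch prediction on all rows, for comparison
batch = model.predict(X.to_numpy())
```

Under these assumptions, `row_wise` should hold the same labels as `batch`, just computed one row at a time.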

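In a bit more detail: you can turn each row into a 2D array by adding a new axis to its values. You then have to access the prediction with the index 0: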

df["predictions"] = df[["input1", "input2"]].apply(
    lambda s: model.predict(s.values[None])[0], axis=1
)
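To make the shape change concrete, here is a small NumPy-only illustration. The array `row` is a stand-in for the values of one DataFrame row and is not taken from the post.

```python
import numpy as np

# Stand-in for s.values of a single row: one sample with two features
row = np.array([1.0, 10.0])

as_2d = row[None]               # indexing with None adds a leading axis
reshaped = row.reshape(1, -1)   # the form suggested by the error message
bracketed = np.array([row])     # wrapping the row in a list has the same effect

print(row.shape)    # (2,)
print(as_2d.shape)  # (1, 2)
```

All three forms describe a single sample with two features, which is the 2D layout that predict expects.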

Thank you, Yakym! I had tried various forms of lambda functions without success, but your suggestion worked!
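For the null-filling use case mentioned in the question, a batched alternative to row-wise apply might be to predict only for the rows that are missing a value. This is a sketch rather than something from the thread: the `cluster` column, the rows that are blanked out, and the KMeans settings are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

data = [[x, y, x * y] for x in range(1, 10) for y in range(10, 15)]
df = pd.DataFrame(data, columns=["input1", "input2", "output"])
features = ["input1", "input2"]

# Assumed settings, chosen only to make the run repeatable
model = KMeans(n_clusters=3, n_init=10, random_state=0)
model.fit(df[features].to_numpy())

# Hypothetical setup: a label column in which every fifth entry is missing
df["cluster"] = model.predict(df[features].to_numpy()).astype(float)
df.loc[df.index[::5], "cluster"] = np.nan

# Select the rows with a missing label and predict for just those rows in one call
missing = df["cluster"].isna()
df.loc[missing, "cluster"] = model.predict(df.loc[missing, features].to_numpy())
```

Selecting rows with a boolean mask keeps the input 2D, so no reshaping or `[0]` indexing is needed.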
