Python: Saving prediction results to CSV


I am storing the results of an sklearn regression model in the variable prediction:

prediction = regressor.predict(data[['X']])
print(prediction)
The printed prediction values look like this:

[ 266.77832991  201.06347505  446.00066136  499.76736079  295.15519906
  214.50514991  422.1043505   531.13126879  287.68760191  201.06347505
  402.68859792  478.85808879  286.19408248  192.10235848]
Then I tried to save the results to a local CSV file using the to_csv function:

prediction.to_csv('C:/localpath/test.csv')
But I get this error:

AttributeError: 'numpy.ndarray' object has no attribute 'to_csv'
I am using Pandas/Numpy/SKlearn. Any ideas for a basic fix?

You can use pandas. As mentioned, numpy arrays do not have a to_csv function.

import numpy as np
import pandas as pd
# to_csv returns None when given a path, so the result is not assigned back
pd.DataFrame(prediction, columns=['predictions']).to_csv('prediction.csv')

Add ".T" to the DataFrame if you want the values written as a row instead of a column.
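A minimal sketch of both layouts, assuming a short stand-in array in place of the real regressor.predict output. The file names and the temporary directory are hypothetical, and index=False is an assumed preference that leaves the row index out of the file.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in for the array returned by regressor.predict(...)
prediction = np.array([266.77832991, 201.06347505, 446.00066136])

out_dir = tempfile.mkdtemp()  # hypothetical output location
col_path = os.path.join(out_dir, "prediction_column.csv")
row_path = os.path.join(out_dir, "prediction_row.csv")

# Column layout: one value per line under a "predictions" header
frame = pd.DataFrame(prediction, columns=["predictions"])
frame.to_csv(col_path, index=False)

# Row layout: transpose so all values end up on a single line
frame.T.to_csv(row_path, header=False, index=False)

# Read both files back to check what was written
column_back = pd.read_csv(col_path)
row_back = pd.read_csv(row_path, header=None)
```

Reading the files back gives a 3x1 frame for the column layout and a 1x3 frame for the transposed one.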

You can use the numpy.savetxt function:

numpy.savetxt('C:/localpath/test.csv', prediction, delimiter=',')
To load the CSV file, you can use the numpy.genfromtxt function:

numpy.genfromtxt('C:/localpath/test.csv', delimiter=',')
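The save and load calls above can be checked with a small round trip. This is only a sketch: the array is a stand-in for the real predictions, the temporary path is hypothetical, and fmt="%.8f" is an assumed precision chosen to keep the file readable (numpy.savetxt defaults to scientific notation).

```python
import os
import tempfile

import numpy as np

# Stand-in for the array returned by regressor.predict(...)
prediction = np.array([266.77832991, 201.06347505, 446.00066136])

csv_path = os.path.join(tempfile.mkdtemp(), "test.csv")  # hypothetical path

# Write one value per line with eight decimal places
np.savetxt(csv_path, prediction, delimiter=",", fmt="%.8f")

# Load the values back as a 1-D float array
loaded = np.genfromtxt(csv_path, delimiter=",")
```

The loaded array is one-dimensional, so a column vector saved this way comes back flattened.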

This is a rather verbose solution, but you can even use it in production.

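First, save the model:

import joblib
import numpy as np
import pandas as pd

joblib.dump(regressor, "regressor.sav")
Save the columns in order:

pd.DataFrame(X_train.columns).to_csv("feature_list.csv", index=False)
Save the data types of the training set:

pd.DataFrame(X_train.dtypes).reset_index().to_csv("data_types.csv", index=False)
To use it again: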

feature_list = pd.read_csv("feature_list.csv")
feature_list = pd.Index(list(feature_list["0"]))

add_cols = list(feature_list.difference(X_test.columns))

drop_cols = list(X_test.columns.difference(feature_list))

for col in add_cols:
    X_test[col] = np.nan

for col in drop_cols:
    X_test = X_test.drop(col, axis = 1)

# reorder columns
X_test = X_test[feature_list]

types = pd.read_csv("data_types.csv")
for i in range(len(types)):
    X_test[types.iloc[i,0]] = X_test[types.iloc[i,0]].astype(types.iloc[i,1])
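As a possible shorthand for the add/drop/reorder loops above, DataFrame.reindex with a columns argument does all three in one call. The sketch below uses hypothetical column names and dtypes in place of the saved feature_list.csv and data_types.csv, so it is an illustration of the idea, not the answer's exact code.

```python
import numpy as np
import pandas as pd

# Hypothetical training-time column order and dtypes
feature_list = pd.Index(["age", "income", "score"])
dtypes = {"age": "float64", "income": "float64", "score": "float64"}

# Hypothetical test frame: "score" is missing, "extra" is unexpected,
# and the column order differs from training
X_test = pd.DataFrame({"income": [10, 20], "extra": ["a", "b"], "age": [30, 40]})

# reindex adds missing columns filled with NaN, drops unknown columns,
# and puts the rest in the training order; astype then restores the dtypes
X_aligned = X_test.reindex(columns=feature_list).astype(dtypes)
```

One caveat applies to both versions: a column that was an integer type during training cannot be cast back if it was filled with NaN, so missing integer features need a fill value first.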
Make predictions:

regressor = joblib.load("regressor.sav")
predictions = regressor.predict(X_test)
Save the prediction results:

res = pd.DataFrame(predictions)
res.index = X_test.index  # keep the original index so rows can be compared later
res.columns = ["prediction"]
res.to_csv("prediction_results.csv")

Enjoy the end-to-end model/prediction saver code.

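If I want to combine the predictions with a unique identifier from X_test (an "id" column, not the index), will each prediction be matched to the correct row? For example:
output = pd.DataFrame(data={"id": X_test["id"], "Prediction": y_pred})
output.to_csv(path_or_buf="..\\output\\results.csv", index=False, quoting=3, sep=";")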
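To make the pairing concrete, here is a minimal sketch with a hypothetical three-row X_test that has a non-default index, and a stand-in y_pred array. It assumes y_pred is a plain NumPy array in the same row order as X_test, which is what predict returns.

```python
import numpy as np
import pandas as pd

# Hypothetical test frame with a shuffled index and an "id" column
X_test = pd.DataFrame(
    {"id": [101, 102, 103], "X": [1.0, 2.0, 3.0]},
    index=[7, 3, 9],
)
y_pred = np.array([0.1, 0.2, 0.3])  # stand-in for regressor.predict(X_test[["X"]])

# A NumPy array carries no index, so it is paired with the ids by position
output = pd.DataFrame(data={"id": X_test["id"], "Prediction": y_pred})
```

If y_pred were a pandas Series with its own index, pandas would align on index labels instead of position, so converting it with to_numpy() first keeps the pairing positional.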
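If X_test and y_pred have the same length, the answer is yes.
After loading, I had to reshape the data, i.e. "pred_train = np.genfromtxt('…d1.csv', delimiter=',').reshape(-1, 1)". Is there a way to save and load the data without having to worry about reshaping it?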
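One possible answer, if the file does not have to be a CSV: numpy.save and numpy.load store the array in NumPy's binary .npy format, which keeps the shape and dtype. The sketch below uses a stand-in column vector and a hypothetical temporary path.

```python
import os
import tempfile

import numpy as np

# Stand-in predictions, already shaped as a column vector
pred_train = np.array([266.77832991, 201.06347505, 446.00066136]).reshape(-1, 1)

npy_path = os.path.join(tempfile.mkdtemp(), "pred_train.npy")  # hypothetical path

# The .npy format records shape and dtype alongside the data
np.save(npy_path, pred_train)
restored = np.load(npy_path)
```

The restored array has the same (3, 1) shape as the one that was saved, so no reshape is needed. The trade-off is that a .npy file is not human-readable the way a CSV is.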