Python: How to join predictions with the test input data in sklearn

python pandas scikit-learn

I want to join the predictions from my model with the input data used with sklearn in Python. The code is:

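from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hold out part of the data, fit on the rest, then predict on the held-out rows
x_train, x_test, y_train, y_test = train_test_split(x_mat, y, test_size=test_size)
mdl = RandomForestRegressor(max_depth=max_depth, n_estimators=n_estimators, n_jobs=n_jobs)
mdl.fit(x_train, y_train)
y_predict = mdl.predict(x_test)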

The problem is that the two variables have different formats. The input data y_test {Series} is a series like this:

2018-07-01T00:00:00Z 375.25

2018-12-23T00:00:00Z 306.13

2018-11-13T00:00:00Z 542.74

2018-12-11T00:00:00Z 556.73

But the prediction y_predict {ndarray} is an array like this:

[374.35747933 303.1865425 559.07108139 545.67544684]

I would like to obtain a DataFrame such as:

2018-07-01T00:00:00Z 375.25 374.35747933

2018-12-23T00:00:00Z 306.13 303.1865425

2018-11-13T00:00:00Z 542.74 559.07108139

2018-12-11T00:00:00Z 556.73 545.67544684

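so that I can compare the inputs and predictions visually, row by row, and/or plot them.

I would like to keep the timestamp index, but I fear that may be a separate problem, because I tried the following: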

dataset = pd.concat([pd.Series(y_predict), y_test], axis=1, ignore_index=True)

But the result places one series beneath the other.

Thanks in advance.

To keep the timestamp index, you can convert the series to a DataFrame and add a column:

results = y_test.to_frame()
results['prediction'] = y_predict
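
For reference, here is a small self-contained sketch of why the concat attempt in the question stacks the two series and how the approach above avoids it. The numbers are copied from the question. The timestamps are kept as plain strings and the series is named "actual" purely for illustration; both are assumptions, not part of the original code. pd.Series(y_predict) gets a default 0, 1, 2, ... index, and concat with axis=1 aligns rows by index label, so no label matches the timestamps and the result contains the union of both indexes.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the question's variables (values copied from the post).
# In the real code these come from train_test_split and mdl.predict.
y_test = pd.Series(
    [375.25, 306.13, 542.74, 556.73],
    index=[
        "2018-07-01T00:00:00Z",
        "2018-12-23T00:00:00Z",
        "2018-11-13T00:00:00Z",
        "2018-12-11T00:00:00Z",
    ],
    name="actual",  # illustrative name, not in the original post
)
y_predict = np.array([374.35747933, 303.1865425, 559.07108139, 545.67544684])

# The attempt from the question: the prediction Series is indexed 0..3,
# y_test is indexed by timestamps, so concat produces 8 rows padded with NaN.
stacked = pd.concat([pd.Series(y_predict), y_test], axis=1, ignore_index=True)

# The approach from the answer: assigning an ndarray as a column is
# positional, so the timestamp index of y_test is kept as-is.
results = y_test.to_frame()
results["prediction"] = y_predict

# An equivalent concat that works: give the predictions the same index first.
aligned = pd.concat(
    [y_test, pd.Series(y_predict, index=y_test.index, name="prediction")],
    axis=1,
)

print(stacked.shape)  # (8, 2)
print(results)
```

The positional assignment relies on y_predict being in the same row order as y_test, which holds here because predict returns one value per row of x_test, in order.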

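df = pd.DataFrame({'predict': y_predict, 'test': y_test})?

Will this keep the index?

Thanks @QuangHoang, it's really simple and it works.

@vlizana, Quang Hoang's solution also keeps the timestamps.

It works too. I had tried a similar solution, but I did not convert it to a DataFrame, which is why I was getting frustrated. Many thanks.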
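
As a quick check of the DataFrame constructor suggested in the comments, the sketch below builds the frame from the question's numbers and then prepares it for a side-by-side comparison. Parsing the timestamps with pd.to_datetime, sorting by date, and the abs_error column are illustrative additions, not something from the thread. When a dict mixes a Series and an ndarray of the same length, pandas takes the index from the Series and fills the array in positionally.

```python
import numpy as np
import pandas as pd

# Values copied from the question; the datetime parsing is an assumption
# made here so that the rows can be sorted chronologically.
y_test = pd.Series(
    [375.25, 306.13, 542.74, 556.73],
    index=pd.to_datetime(
        [
            "2018-07-01T00:00:00Z",
            "2018-12-23T00:00:00Z",
            "2018-11-13T00:00:00Z",
            "2018-12-11T00:00:00Z",
        ]
    ),
)
y_predict = np.array([374.35747933, 303.1865425, 559.07108139, 545.67544684])

# The constructor from the comment: the index comes from y_test,
# and y_predict is matched to it row by row.
df = pd.DataFrame({"predict": y_predict, "test": y_test})

# Sort chronologically (train_test_split shuffles rows by default)
# and add a hypothetical error column for easier comparison.
df = df.sort_index()
df["abs_error"] = (df["predict"] - df["test"]).abs()

print(df)
# With matplotlib installed, df[["test", "predict"]].plot() would draw
# both columns against the timestamp index.
```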