ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all() (Python, SageMaker, XGBoost)

python pandas numpy

I am trying to get predictions from a saved model file.

My dataframe df contains the following data:

target,feature1,feature2
0,0.8571428571428571,31.72975
0,2.0,27.525
0,1.0,47.11675
0,1.0,29.15
0,0.0,42.483000000000004
0,2.0,40.85825
0,0.0,34.97525
1,0.8571428571428571,31.72975
1,0.0,0.0
0,0.8571428571428571,31.72975

Here is my code:

import numpy as np
import pickle as pkl
import pandas as pd
from sklearn.model_selection import train_test_split
import xgboost as xgb


df  = pd.read_csv("input.csv")

# Handle Nan values if any 
df.replace([np.inf, -np.inf], np.nan, inplace=True)
df.fillna(df.mean(), inplace=True)
df[~df.isin([np.nan, np.inf, -np.inf]).any(1)].astype(np.float64)

# Split Features and Target    
X = df.drop(columns="target")
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Load the model from disk
model = pkl.load(open('xgboost-model', 'rb'))
results = model.predict(X_test, y_test)

I get this error:

Traceback (most recent call last):
  File "code.py", line 32, in <module>
    results = model.predict(X_test, y_test)
  File "C:\Anaconda3\envs\myenv\lib\site-packages\xgboost\core.py", line 1348, in predict
    if output_margin:
  File "C:\Anaconda3\envs\myenv\lib\site-packages\pandas\core\generic.py", line 1330, in __nonzero__
    f"The truth value of a {type(self).__name__} is ambiguous. "
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What am I missing here?

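Use

results = model.predict(xgb.DMatrix(X_test.values))

because you want to make predictions on the test dataset X_test only. As the traceback shows, the second positional argument of predict is output_margin, so passing y_test there makes xgboost evaluate a pandas Series in "if output_margin:", which raises the ValueError.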
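The failure can be reproduced without XGBoost or the model file. The sketch below uses a hypothetical stand-in function, predict, that only mirrors the order of the first two parameters seen in the traceback (data, then output_margin). It is not the library's implementation, and the small X_test and y_test are made-up values.

```python
import pandas as pd


def predict(data, output_margin=False):
    """Hypothetical stand-in: same first two parameters as in the traceback."""
    # A pandas Series has no single truth value, so this check raises
    # ValueError when a Series is passed as output_margin.
    if output_margin:
        return "margin"
    return "prediction"


# Made-up data, shaped like the question's features and target.
X_test = pd.DataFrame({"feature1": [2.0, 0.0], "feature2": [27.525, 42.483]})
y_test = pd.Series([0, 1], name="target")

try:
    # y_test lands in the output_margin slot, as in the question.
    predict(X_test, y_test)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Passing only the features avoids the truth-value check on a Series.
ok = predict(X_test)
print(ok)
```

The labels are not an input to prediction at all. They are only needed afterwards, when the predictions are compared against them.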

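results then contains the model's predictions, which can be compared against the actual test labels y_test, preferably using an evaluation metric.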
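As a minimal sketch of such a comparison, the snippet below computes accuracy with NumPy. It assumes the model was trained with a binary logistic objective, so that predict returns probabilities, and that 0.5 is an acceptable decision threshold. Neither assumption is confirmed in the question. The arrays are made-up stand-ins for results and y_test.

```python
import numpy as np

# Hypothetical stand-ins for `results` and `y_test.values`.
predicted_probabilities = np.array([0.10, 0.80, 0.35, 0.60])
actual_labels = np.array([0, 1, 0, 0])

# Assumed threshold: probabilities of 0.5 or more are treated as class 1.
predicted_labels = (predicted_probabilities >= 0.5).astype(int)

# Accuracy is the fraction of predictions that match the true labels.
accuracy = float((predicted_labels == actual_labels).mean())
print(f"accuracy = {accuracy:.2f}")  # accuracy = 0.75
```

With a target as imbalanced as the sample data in the question, accuracy alone can be misleading, so a metric such as precision, recall or AUC may be a better fit.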

I get another error: TypeError: ('Expecting data to be a DMatrix object, got: ')

Updated my answer.

It should actually be model.predict(xgb.DMatrix(X_test.values)). Can you update the answer? I will accept it. Thanks for your time.