
Python xgboost: AttributeError: 'DMatrix' object has no attribute 'handle'


This problem is really strange, because the same code works fine with other datasets.

Full code:

import numpy as np
import pandas as pd
import xgboost as xgb
# sklearn.cross_validation was removed in scikit-learn 0.20; model_selection replaces it
from sklearn.model_selection import train_test_split

# # Split the Learning Set
X_fit, X_eval, y_fit, y_eval= train_test_split(
    train, target, test_size=0.2, random_state=1
)

clf = xgb.XGBClassifier(missing=np.nan, max_depth=6, 
                        n_estimators=5, learning_rate=0.15, 
                        subsample=1, colsample_bytree=0.9, seed=1400)

# fitting
clf.fit(X_fit, y_fit, early_stopping_rounds=50, eval_metric="logloss", eval_set=[(X_eval, y_eval)])
#print y_pred
y_pred= clf.predict_proba(test)[:,1]
The last line raises the following error (full output provided):

What is wrong here? I don't know how to solve this problem.


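UPD1: This is actually a Kaggle problem:

The problem here is related to the initial data: some values are float or integer, and some are objects. That is why we need to cast them: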

from sklearn import preprocessing 
for f in train.columns: 
    if train[f].dtype=='object': 
        lbl = preprocessing.LabelEncoder() 
        lbl.fit(list(train[f].values)) 
        train[f] = lbl.transform(list(train[f].values))

for f in test.columns: 
    if test[f].dtype=='object': 
        lbl = preprocessing.LabelEncoder() 
        lbl.fit(list(test[f].values)) 
        test[f] = lbl.transform(list(test[f].values))

train.fillna((-999), inplace=True) 
test.fillna((-999), inplace=True)

train=np.array(train) 
test=np.array(test) 
train = train.astype(float) 
test = test.astype(float)
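
One caveat with the snippet above: it fits a separate LabelEncoder on train and on test, so the same string can end up with a different integer in each frame. A minimal sketch of one way around that, assuming it is acceptable to fit the encoder on the values from both frames together. The toy frames and their values are made up for illustration and are not the Kaggle data:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy frames standing in for the real train/test data.
train = pd.DataFrame({
    "v1": [1.0, 2.0, 3.0],
    "v3": pd.Series(["A", "B", "C"], dtype=object),
})
test = pd.DataFrame({
    "v1": [4.0, 5.0],
    "v3": pd.Series(["C", "A"], dtype=object),
})

for col in train.columns:
    if train[col].dtype == "object":
        # Fit once on the values seen in either frame,
        # so both frames share a single string-to-integer mapping.
        encoder = LabelEncoder()
        encoder.fit(list(train[col].values) + list(test[col].values))
        train[col] = encoder.transform(list(train[col].values))
        test[col] = encoder.transform(list(test[col].values))

print(train["v3"].tolist())  # [0, 1, 2]
print(test["v3"].tolist())   # [2, 0]
```

With per-frame encoders, "C" would be 2 in train but 1 in test in this toy example, which silently changes the meaning of the feature at prediction time.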

You may also want to look at the categorical-variable solution, shown below:

for col in train.select_dtypes(include=['object']).columns:
    train[col] = train[col].astype('category')
    test[col] = test[col].astype('category')

# Encoding categorical features
for col in train.select_dtypes(include=['category']).columns:
    train[col] = train[col].cat.codes
    test[col] = test[col].cat.codes

train.fillna((-999), inplace=True) 
test.fillna((-999), inplace=True)

train=np.array(train) 
test=np.array(test) 
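
Note that `.cat.codes` numbers the categories of each column independently, and it represents missing values as -1, so a later `fillna(-999)` does not touch them. A small sketch, assuming you want train and test to share one set of categories; the toy frames are hypothetical and `shared` is just an illustrative name:

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical toy frames; the test frame contains a missing value.
train = pd.DataFrame({"v3": pd.Series(["A", "B", "C"], dtype=object)})
test = pd.DataFrame({"v3": pd.Series(["C", "A", None], dtype=object)})

for col in train.select_dtypes(include=["object"]).columns:
    # Build one categorical dtype from the labels of both frames,
    # so the integer codes mean the same thing in train and test.
    labels = pd.concat([train[col], test[col]]).dropna().unique()
    shared = CategoricalDtype(categories=sorted(labels))
    train[col] = train[col].astype(shared).cat.codes
    test[col] = test[col].astype(shared).cat.codes

print(train["v3"].tolist())  # [0, 1, 2]
print(test["v3"].tolist())   # [2, 0, -1]  (-1 marks the missing value)
```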

What is the output of X_fit.dtypes and X_eval.dtypes?

This is for X_fit.dtypes: target int64, v1 float64, v2 float64, v3 int64, v4 float64. test even has object types.

Wow, thanks, I didn't know pandas had such a data type.
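
The dtype check suggested in the comments can be sketched as follows. The helper name `non_numeric_columns` and the toy frame are made up for illustration; the idea is simply to list the columns that still need encoding before the data is handed to xgboost:

```python
import pandas as pd


def non_numeric_columns(frame):
    """Return the names of columns whose dtype is not numeric (or boolean)."""
    return [
        col for col in frame.columns
        if not pd.api.types.is_numeric_dtype(frame[col])
    ]


# Hypothetical stand-in for the test frame from the question.
test = pd.DataFrame({
    "v1": [0.5, 1.5],
    "v3": pd.Series(["A", "B"], dtype=object),
})

print(test.dtypes)                # per-column dtypes, as asked for in the comments
print(non_numeric_columns(test))  # ['v3']
```

Running the same check on both the training and the prediction data helps catch the case described here, where the frame used for fitting is numeric but the one passed to `predict_proba` still has object columns.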