Python: CalibratedClassifierCV doesn't work with DataFrames in a pipeline?
I built a classification model using an sklearn pipeline. Now I want to run CalibratedClassifierCV to calibrate the probability predictions:
import pickle
import pandas as pd
from sklearn.calibration import CalibratedClassifierCV
model = pickle.load(open('model.pkl', 'rb'))
train = pd.read_csv('new_train.csv')
features = ['BVW_2C', 'PHIT_2C', 'SW_2C', 'VCARB_2C', 'VCLAY_2C', 'VKER_2C', 'VPYR_2C', 'VSAND_2C', 'GOLD_DTC', 'GOLD_DTS', 'GOLD_GR', 'GOLD_NPHI', 'GOLD_PEF', 'GOLD_RDEEI', 'GOLD_RHOB', 'HETI_NORM_GR']
clf_iso = CalibratedClassifierCV(model, cv=3, method='isotonic')
clf_iso.fit(train[features], train[['trouble']])
The pre-trained model is loaded from a file. It basically consists of a scaler and a gradient boosting classifier. Printing the model gives:
Pipeline(steps=[('preproc',
ColumnTransformer(transformers=[('num',
Pipeline(steps=[('scaler',
StandardScaler())]),
['BVW_2C', 'PHIT_2C', 'SW_2C',
'VCARB_2C', 'VCLAY_2C',
'VKER_2C', 'VPYR_2C',
'VSAND_2C', 'GOLD_DTC',
'GOLD_DTS', 'GOLD_GR',
'GOLD_NPHI', 'GOLD_PEF',
'GOLD_RDEEI', 'GOLD_RHOB',
'HETI_NORM_GR'])])),
('clf',
LGBMClassifier(max_depth=30, n_estimators=300,
num_leaves=200))])
For some reason, executing clf_iso.fit(train[features], train[['trouble']]) raises the following error:
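AttributeError Traceback (most recent call last)
/p/ressim02/analytics/AICOE/jiaod/geonet-ml/geo-env/lib/python3.7/site-packages/sklearn/utils/__init__.py in _get_column_indices(X, key)
424 try:
--> 425 all_columns = X.columns
426 except AttributeError:
AttributeError: 'numpy.ndarray' object has no attribute 'columns'
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-49-cfae22cb4b89> in <module>
1 clf_iso = CalibratedClassifierCV(model, cv=3, method='isotonic')
----> 2 clf_iso.fit(train[features], train[['trouble']])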
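train[features] is clearly a DataFrame (confirmed by type(train[features])). Why is it still being treated as a numpy array? If I simply retrain the loaded model on the same data with model.fit(train[features], train[['trouble']]), it works fine. Why doesn't it work with CalibratedClassifierCV?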
Edit
It looks like this is a known issue.
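The traceback suggests that CalibratedClassifierCV, in the scikit-learn version used here, converts the DataFrame to a numpy array before handing it to the pipeline, so the ColumnTransformer can no longer look up columns by name. One possible workaround, sketched below, is to select columns by integer position instead. Everything in this sketch is an assumption for illustration: the data is synthetic rather than new_train.csv, only three of the feature names are used, and GradientBoostingClassifier stands in for LGBMClassifier. Because the original model is loaded from a pickle, applying this would mean rebuilding the pipeline rather than reusing model.pkl. Upgrading scikit-learn may also resolve it, since newer releases reportedly pass the DataFrame through unchanged.

```python
import numpy as np
import pandas as pd
from sklearn.calibration import CalibratedClassifierCV
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for new_train.csv (hypothetical data, not the real file).
rng = np.random.default_rng(0)
features = ['BVW_2C', 'PHIT_2C', 'SW_2C']
train = pd.DataFrame(rng.normal(size=(300, len(features))), columns=features)
noise = rng.normal(scale=0.5, size=len(train))
train['trouble'] = (train['BVW_2C'] + noise > 0).astype(int)

# Select columns by integer position rather than by name, so the transformer
# still works if it receives a plain numpy array instead of a DataFrame.
positions = list(range(len(features)))

model = Pipeline(steps=[
    ('preproc', ColumnTransformer(transformers=[
        ('num', Pipeline(steps=[('scaler', StandardScaler())]), positions),
    ])),
    # Stand-in classifier (assumption); the original pipeline uses LGBMClassifier.
    ('clf', GradientBoostingClassifier(n_estimators=20, random_state=0)),
])

clf_iso = CalibratedClassifierCV(model, cv=3, method='isotonic')
# y is passed as a 1-D Series (train['trouble']) rather than a one-column DataFrame.
clf_iso.fit(train[features], train['trouble'])

# Calibrated class probabilities, one row per sample.
proba = clf_iso.predict_proba(train[features])
```

Positional selection relies on train[features] always being passed with the columns in the same order, so the features list should be kept fixed between fitting and prediction.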