How to handle string features in Python when using recursive feature elimination or model training with an SVM?

python pandas scikit-learn

I have data like this:

shift_id  user_id  status  organization_id  location_id  department_id  open_positions  city      zip    role_id  specialty_id  latitude  longitude  years_of_experience
2         9        S       1                1            19             1               brooklyn  48001  2        9             42.643    -82.583
6         60       S       12               19           20             1               test      68410  3        7             40.608    -95.856
9         61       S       12               19           20             1               new york  48001  1        7             42.643    -82.583
10        60       S       12               19           20             1               test      68410  3        7             40.608    -95.856
21        3        S       1                1            19             1               pune      48001  1        2             46.753    -89.584    0
4         7        S       1                1            19             1               needham   2494   4        4             42.292    -71.246    2

So it contains both string and numeric features.
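A quick way to confirm which columns hold strings is to inspect the column dtypes. The following is only a sketch: the frame `sample` is a hypothetical three-row excerpt of a few columns from the table above, not the real data.csv.

```python
import pandas as pd

# Hypothetical excerpt of the data shown above (not the real data.csv)
sample = pd.DataFrame({
    "shift_id": [2, 6, 9],
    "status": ["S", "S", "S"],
    "city": ["brooklyn", "test", "new york"],
    "latitude": [42.643, 40.608, 42.643],
})

# Non-numeric columns are the ones an SVM cannot consume without encoding
string_cols = sample.select_dtypes(exclude="number").columns.tolist()
numeric_cols = sample.select_dtypes(include="number").columns.tolist()

print(string_cols)   # ['status', 'city']
print(numeric_cols)  # ['shift_id', 'latitude']
```

On the full data, the same two calls applied to the loaded frame would list every column that needs encoding before it reaches the estimator.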

I want to perform feature elimination first, and then run an SVM on the result.

Here is my code:

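import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC

dataset = pd.read_csv("data.csv", header=0)
# Drop organization_id by keyword; recent pandas no longer accepts a positional axis
data = dataset.drop(columns='organization_id')
# data = data.fillna(0)
target = dataset.location_id
# dataset.head()
svm = LinearSVC()
# Keep the 3 highest-ranked features; current scikit-learn requires the keyword form
rfe = RFE(svm, n_features_to_select=3)
rfe = rfe.fit(data, target)  # fails here because of the string columns
print(rfe.support_)
print(rfe.ranking_)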

But because the status column contains string values, this raises:

ValueError: could not convert string to float: 'S'

Having string features like this is obviously common. What is the standard practice for handling this case?

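You can fix the error by encoding each categorical feature as an integer value, as in the code below.

If this approach does not work for you, I suggest trying a different approach instead.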

from sklearn.preprocessing import OrdinalEncoder

# Columns that hold string (categorical) values
features = ['status', 'city']

# Learn one integer code per distinct category, then overwrite
# the string columns in place with those codes
enc = OrdinalEncoder()
data[features] = enc.fit_transform(data[features])