Scikit-learn: how to predict on unseen data?


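Hi, I'm practicing with ML models and ran into a problem when trying to predict on unseen data. I get an error when one-hot encoding the categorical data.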

from sklearn.preprocessing import LabelEncoder,OneHotEncoder
labelencoder_x_1 = LabelEncoder() #will encode country
X[:,1] = labelencoder_x_1.fit_transform(X[:,1])

labelencoder_x_2 = LabelEncoder() #will encode Gender
X[:,2] = labelencoder_x_2.fit_transform(X[:,2])
onehotencoder_x = OneHotEncoder(categorical_features=[1])
X= onehotencoder_x.fit_transform(X).toarray()
X = X[:,1:]

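My X has 11 columns; the 2nd and 3rd columns are categorical (country and gender). The model runs fine, but when I try to test it against a random input, the one-hot encoding fails: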

input = [[619], ['France'], ['Male'],   [42],   [2],    [0.0],  [1],    [1],    [1],[101348.88]]

input[1] = labelencoder_x_1.fit_transform(input[1])
input[2] = labelencoder_x_2.fit_transform(input[2])
input= onehotencoder_x.fit_transform(input).toarray()

Error:

 C:\Anaconda3\lib\site-packages\sklearn\preprocessing\_encoders.py:451: 
  DeprecationWarning: The 'categorical_features' keyword is deprecated in version 0.20 
and will be removed in 0.22. You can use the ColumnTransformer instead.
  "use the ColumnTransformer instead.", DeprecationWarning)
Traceback (most recent call last):

      File "<ipython-input-44-44a43edf17aa>", line 1, in <module>
    input= onehotencoder_x.fit_transform(input).toarray()

  File "C:\Anaconda3\lib\site-packages\sklearn\preprocessing\_encoders.py", line 624, in 
 fit_transform
    self._handle_deprecations(X)

   File "C:\Anaconda3\lib\site-packages\sklearn\preprocessing\_encoders.py", line 453, in 
_handle_deprecations
     n_features = X.shape[1]

 AttributeError: 'list' object has no attribute 'shape'


I think this is because you have a nested list.

You should flatten the input list and use the flattened list for prediction:

input[1] = labelencoder_x_1.fit_transform(input[1])
input[2] = labelencoder_x_2.fit_transform(input[2])

input = [item for sublist in input for item in sublist]

input= onehotencoder_x.fit_transform(input).toarray()

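With a nested list like this, each inner list is treated as a separate item passed to fit_transform, but since each one holds a single element, it does not match the shape fit_transform is looking for, which is [1, 10] (1 row, 10 columns).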
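To make the shape mismatch concrete, here is a small sketch using the sample from the question. The variable names (`nested`, `as_column`, `as_row`) are made up for illustration, and `dtype=object` is my own choice so that NumPy does not cast the numbers to strings. Note that flattening alone gives a 1-D list, so it still has to be wrapped in an outer list to become a single row.

```python
import numpy as np

# The sample as written in the question: ten one-element lists.
nested = [[619], ['France'], ['Male'], [42], [2], [0.0], [1], [1], [1], [101348.88]]

# Interpreted as a 2-D array, this is ten rows with one column each.
as_column = np.array(nested, dtype=object)
print(as_column.shape)  # (10, 1)

# Flatten the inner lists, then wrap the result so it is one row of ten columns.
flat = [item for sublist in nested for item in sublist]
as_row = np.array([flat], dtype=object)
print(as_row.shape)  # (1, 10)
```

Avoiding the name `input` for the variable also keeps Python's built-in `input` function usable.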

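On flattening, I got AttributeError: 'list' object has no attribute 'lower'.

@ChinmayNayak, I have edited my answer. Interesting that it doesn't work. Edit: I checked my old code, and it seems I was using a Django function. My bad. However, this version of my answer, where you iterate over the list, should work for you.

Nope, still getting the list object error. AttributeError: 'list' object has no attribute 'shape'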
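The thread leaves the error unresolved, so here is a minimal sketch of one way around it, following the hint in the deprecation warning to use ColumnTransformer. Two things differ from the code in the question: the new sample is passed as a 2-D array (one row), and it goes through `transform` rather than `fit_transform`, so the categories learned from the training data are reused instead of being refit on a single row. The training rows below are invented for illustration and have only four columns; `sparse_threshold=0` and `handle_unknown='ignore'` are my own choices, not something from the original post.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training rows (score, country, gender, age), not the asker's data.
X_train = np.array([
    [619, 'France', 'Male', 42],
    [608, 'Spain', 'Female', 41],
    [502, 'Germany', 'Female', 39],
    [699, 'France', 'Male', 35],
], dtype=object)

# One-hot encode columns 1 and 2; pass the remaining columns through unchanged.
# sparse_threshold=0 makes the combined output a dense array.
encoder = ColumnTransformer(
    transformers=[('onehot', OneHotEncoder(handle_unknown='ignore'), [1, 2])],
    remainder='passthrough',
    sparse_threshold=0,
)

# Learn the categories from the training data only.
X_train_encoded = encoder.fit_transform(X_train)

# A new sample must be 2-D: one row, with the same column order as training.
new_sample = np.array([[650, 'Spain', 'Male', 30]], dtype=object)

# transform (not fit_transform) reuses the categories learned above.
new_encoded = encoder.transform(new_sample)

print(X_train_encoded.shape, new_encoded.shape)
```

The encoded row can then be passed to the fitted model's `predict`, provided the model was trained on the output of the same encoder. The one-hot columns come first in the output, followed by the passed-through columns, so any column dropping done at training time (such as `X = X[:,1:]` in the question) has to be repeated on the new row.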