How do I use a prediction model after one-hot encoding in Python?

I have created a prediction model for this dataset:

>>df.head()

    Service    Tasks Difficulty     Hours
0   ABC         24     1           0.833333
1   CDE         77     1           1.750000
2   SDE         90     3           3.166667
3   QWE         47     1           1.083333
4   ASD         26     3           1.000000

>>df.shape
(998,4)

>>X = df.iloc[:,:-1]
>>y = df.iloc[:,-1].values
>>from sklearn.compose import ColumnTransformer
>>from sklearn.preprocessing import OneHotEncoder
>>ct = ColumnTransformer([("cat", OneHotEncoder(),[0])], remainder="passthrough")
>>X = ct.fit_transform(X)  
>>x = X.toarray()
>>x = x[:,1:]

>>x.shape
(998,339)

>>from sklearn.ensemble import RandomForestRegressor
>>rf_model = RandomForestRegressor(random_state = 1)
>>rf_model.fit(x,y)

How can I use this model to predict the hours for user input in this format: [["SDE", 90, 3]]

I tried:
>>test_input = [["SDE", 90, 3]]
>>test_input = ct.fit_transform(test_input)
>>test_input = test_input[:,1:]

>>test_input[0]
array([24, 1], dtype=object)


>>predict_hours = rf_model.predict(test_input)
ValueError

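Since my dataset has many categorical values, it is not practical to enter the encoded value of "SDE" by hand. I need to convert "SDE" to its one-hot encoded form after receiving the input.

I don't know how to do this. Can anyone help?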

You can use a Pipeline to handle the preprocessing and prediction stages easily:

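import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import OneHotEncoder
from sklearn.model_selection import train_test_split

# I have created a dummy dataset
df = pd.read_csv('test.csv')
X = df.iloc[:, :-1]
y = df.iloc[:, -1].values

# Preprocessor
preprocessor = ColumnTransformer([("cat", OneHotEncoder(handle_unknown='ignore'), [0])], remainder="passthrough")

# Create the pipeline with the preprocessor and the regressor
pipeline = Pipeline([('preprocessor', preprocessor),
                     ('classifier', RandomForestRegressor(random_state=1))
                     ])

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5,
                                                    random_state=0)

# Fit the pipeline
pipeline.fit(X_train, y_train)

# Predict
print(pipeline.predict(X_test))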
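
To connect this back to the question's single user input, here is a minimal sketch of calling a fitted pipeline on one raw row. The small DataFrame is a hypothetical stand-in for the real 998-row dataset, and the step name "regressor" is chosen here for illustration; only the column layout follows the question.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in data; only the column layout matches the question.
df = pd.DataFrame({
    "Service": ["ABC", "CDE", "SDE", "QWE", "ASD", "SDE"],
    "Tasks": [24, 77, 90, 47, 26, 60],
    "Difficulty": [1, 1, 3, 1, 3, 2],
    "Hours": [0.83, 1.75, 3.17, 1.08, 1.00, 2.10],
})
X = df.iloc[:, :-1]
y = df.iloc[:, -1].values

pipeline = Pipeline([
    ("preprocessor", ColumnTransformer(
        [("cat", OneHotEncoder(handle_unknown="ignore"), [0])],
        remainder="passthrough")),
    ("regressor", RandomForestRegressor(random_state=1)),
])
pipeline.fit(X, y)

# Wrap the raw user input in a DataFrame with the training column names.
# The pipeline applies the already-fitted encoder before predicting.
user_input = pd.DataFrame([["SDE", 90, 3]], columns=X.columns)
predicted_hours = pipeline.predict(user_input)
print(predicted_hours)
```

Because the encoder is created with handle_unknown='ignore', a service name that never appeared in training is encoded as all zeros instead of raising an error.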

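Stack Overflow is not intended to replace existing documentation and tutorials. Many sites demonstrate one-hot encoding, and we expect you to work through those before posting here.

Do not use fit_transform() on both your training and prediction samples. Use fit() to fit the transformer to the training data, then use transform() on the training and test data once the transformer has been fitted.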
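
The fit-then-transform advice can be sketched without a Pipeline as follows. The training rows are hypothetical stand-ins for the real dataset, and sparse_threshold=0 is set here only so the transformer always returns a dense array; neither comes from the original post.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training data; only the column layout matches the question.
train = pd.DataFrame({
    "Service": ["ABC", "CDE", "SDE", "QWE", "ASD", "SDE"],
    "Tasks": [24, 77, 90, 47, 26, 60],
    "Difficulty": [1, 1, 3, 1, 3, 2],
    "Hours": [0.83, 1.75, 3.17, 1.08, 1.00, 2.10],
})
X_train = train.iloc[:, :-1]
y_train = train.iloc[:, -1].values

ct = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), [0])],
    remainder="passthrough",
    sparse_threshold=0,  # assumption for this sketch: always return a dense array
)

# Fit the transformer once, on the training data only.
ct.fit(X_train)
X_train_enc = ct.transform(X_train)

rf_model = RandomForestRegressor(random_state=1)
rf_model.fit(X_train_enc, y_train)

# For new input, call transform() only. Calling fit_transform() here would
# relearn the categories from a single row and change the number of columns.
new_input = pd.DataFrame([["SDE", 90, 3]], columns=X_train.columns)
new_enc = ct.transform(new_input)
predicted_hours = rf_model.predict(new_enc)
print(new_enc.shape, predicted_hours)
```

The encoded input has the same number of columns as the training matrix, which is the condition the model's predict() checks before it runs.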