Why doesn't predict give the expected result?


This should help illustrate the problem:

data <- data.frame(day_type = c("weekend", "weekend", "weekend","weekend",
                                "weekday", "weekday", "weekday", "weekday"),
                   vehicle = c("car", "car", "car", "car",
                               "bus", "bus", "bus", "bus"))

library(naivebayes)

model <- naive_bayes(vehicle ~ day_type, data = data)

predict(model, data.frame(day_type = "weekend"))
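# [1] bus
# Levels: bus car

You still get a prediction! That is because the values, even ones the model cannot make sense of, are encoded as the integers 1 and 2.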


So, as @Aaron suggested, make sure the factor levels match, or use character variables instead of factor variables.
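
The integer coding described above can be seen in base R without fitting any model. This is a minimal sketch, assuming the columns are factors (the `data.frame()` default before R 4.0.0, made explicit here with `factor()`) and that prediction looks values up by integer code rather than by label, as the answer describes. The variable names are illustrative.

```r
# Training column: levels are sorted alphabetically, so
# "weekday" gets code 1 and "weekend" gets code 2.
train_day <- factor(c("weekend", "weekend", "weekday", "weekday"))
levels(train_day)                          # "weekday" "weekend"

# A new one-row column built on its own has a single level,
# so "weekend" is stored as code 1 here.
new_day <- factor("weekend")
as.integer(new_day)                        # 1

# Looking that code up in the training levels lands on the wrong label.
levels(train_day)[as.integer(new_day)]     # "weekday"

# The same happens for a value that never occurred in training.
unseen_day <- factor("January")
levels(train_day)[as.integer(unseen_day)]  # "weekday"
```

This matches the outputs on this page: a lone `"weekend"` is treated like `"weekday"` and comes back as `bus`.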

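Could this be a factor level mismatch? Try making sure that the levels of day_type are the same in the training data and in the data you predict on. If it does not slow your process down much, I would suggest building the model with stringsAsFactors = FALSE in your data.frames. That avoids any problems created by levels, because you will be working with character variables.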
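
One way to make the levels match, sketched in base R: rebuild the new column as a factor using the levels of the training column. `stringsAsFactors = TRUE` is passed explicitly so the sketch behaves the same on any R version, and the object names `train` and `newdata` are illustrative, not from the original post.

```r
train <- data.frame(
  day_type = c("weekend", "weekend", "weekday", "weekday"),
  vehicle  = c("car", "car", "bus", "bus"),
  stringsAsFactors = TRUE
)

newdata <- data.frame(day_type = "weekend", stringsAsFactors = TRUE)
as.integer(newdata$day_type)  # 1: the lone level is coded as 1

# Re-create the factor with the training levels so labels and codes agree.
newdata$day_type <- factor(as.character(newdata$day_type),
                           levels = levels(train$day_type))
as.integer(newdata$day_type)  # 2: the same code "weekend" has in train
levels(newdata$day_type)      # "weekday" "weekend"
```

The realigned `newdata` can then be passed to `predict(model, newdata = newdata)` as before. Building both data frames with `stringsAsFactors = FALSE` instead keeps `day_type` as character, which is the other option suggested above.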
data <- data.frame(day_type = c("weekend", "weekend", "weekend","weekend",
                                "weekday", "weekday", "weekday", "weekday"),
                   vehicle = c("car", "car", "car", "car",
                               "bus", "bus", "bus", "bus"))

library(naivebayes)

model <- naive_bayes(vehicle ~ day_type, data = data)

dt_test1 = data.frame(day_type = "weekend")
dt_test2 = data.frame(day_type = "weekday")
dt_test3 = data.frame(day_type = c("weekend","weekday"))

predict(model, newdata = dt_test1)

# [1] bus
# Levels: bus car

predict(model, newdata = dt_test2)

# [1] bus
# Levels: bus car

predict(model, newdata = dt_test3)

# [1] car bus
# Levels: bus car

# Values the model never saw during training:
dt_test4 = data.frame(day_type = c("January","February"))

predict(model, newdata = dt_test4)

# [1] car bus
# Levels: bus car
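
Because unseen values such as `"January"` still produce a prediction, it can help to check the new data before calling `predict`. The following is a minimal base-R sketch; `check_levels` is a hypothetical helper written for this illustration and is not part of the naivebayes package.

```r
# Hypothetical helper: stop if new_col contains values absent from train_col.
check_levels <- function(train_col, new_col) {
  unseen <- setdiff(unique(as.character(new_col)),
                    unique(as.character(train_col)))
  if (length(unseen) > 0) {
    stop("Values not seen in training: ", paste(unseen, collapse = ", "))
  }
  invisible(TRUE)
}

train_day <- c("weekend", "weekend", "weekday", "weekday")

# Known values pass silently.
check_levels(train_day, c("weekend", "weekday"))

# Unseen values raise an error instead of yielding a silent prediction.
result <- tryCatch(
  check_levels(train_day, c("January", "February")),
  error = function(e) conditionMessage(e)
)
result  # "Values not seen in training: January, February"
```

Comparing the values as character strings sidesteps the integer codes entirely, so the check works whether the columns are factors or character vectors.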