R: Why does e1071 give NAs for this naive Bayes classifier prediction?


Fails:

library(e1071)

train.x <- data.frame(
  B=c(0,1,0),
  C=c(0,0,0),
  D=c(0,0,1),
  Z=c(1,0,0)
)

classifier <- naiveBayes(x=train.x, y=factor(c(TRUE, TRUE, FALSE)), laplace=1)  # Laplace smoothing (i.e. alpha) of 1
predict(classifier, train.x, type="raw")

     FALSE TRUE
[1,]    NA   NA
[2,]    NA   NA
[3,]    NA   NA

Works:

train.x <- data.frame(
  B=c(0,1,0,1),
  C=c(0,0,0,1),
  D=c(0,0,1,1),
  Z=c(1,0,0,1)
)

classifier <- naiveBayes(x=train.x, y=factor(c(TRUE, TRUE, FALSE, FALSE)), laplace=1)  # Laplace smoothing (i.e. alpha) of 1
predict(classifier, train.x, type="raw")

              FALSE           TRUE
[1,] 0.000000002761 0.999999997239
[2,] 0.000000002761 0.999999997239
[3,] 0.997729292055 0.002270707945
[4,] 0.999999994295 0.000000005705

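For numeric variables, naiveBayes uses the mean and standard deviation of each variable to compute the probability of each variable within each class. Because you have only three training examples, the standard deviation for at least one class must be undefined (the class for which you supplied two training examples is fine). Here the FALSE class has a single example, so its standard deviation is NA. You can see this by inspecting the classifier's tables attribute, which shows the mean and standard deviation: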

> classifier$tables
$B
                            B
factor(c(TRUE, TRUE, FALSE)) [,1]      [,2]
                       FALSE  0.0        NA
                       TRUE   0.5 0.7071068

$C
                            C
factor(c(TRUE, TRUE, FALSE)) [,1] [,2]
                       FALSE    0   NA
                       TRUE     0    0

$D
                            D
factor(c(TRUE, TRUE, FALSE)) [,1] [,2]
                       FALSE    1   NA
                       TRUE     0    0

$Z
                            Z
factor(c(TRUE, TRUE, FALSE)) [,1]      [,2]
                       FALSE  0.0        NA
                       TRUE   0.5 0.7071068
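
The NA entries in the tables above can be reproduced with base R alone. The following is a minimal sketch, not e1071's actual code: it assumes the per-class statistics are the ordinary sample mean and sd, and that the per-variable likelihood is a Gaussian density evaluated with dnorm. The variable names (class_mean, class_sd, dens_false) are made up for illustration.

```r
# Training data from the failing example
train.x <- data.frame(B = c(0, 1, 0), C = c(0, 0, 0), D = c(0, 0, 1), Z = c(1, 0, 0))
y <- factor(c(TRUE, TRUE, FALSE))

# Per-class sample mean and standard deviation of column B
class_mean <- sapply(split(train.x$B, y), mean)
class_sd   <- sapply(split(train.x$B, y), sd)
print(cbind(mean = class_mean, sd = class_sd))

# The FALSE class has one observation, so sd() returns NA for it.
# A Gaussian density evaluated with an NA standard deviation is also NA.
dens_false <- dnorm(0, mean = class_mean[["FALSE"]], sd = class_sd[["FALSE"]])
print(dens_false)
```

Once one class's likelihood is NA, normalising the class scores against each other cannot yield a number for either class, which is consistent with every cell of the prediction being NA.
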
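naiveBayes distinguishes between numeric and categorical variables, and the probabilities for categorical variables are computed without a standard deviation. So if you convert your data to logical, it works: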

train.x <- sapply(train.x, as.logical)
classifier <- naiveBayes(x=train.x, y=factor(c(TRUE, TRUE, FALSE)), laplace=1)
predict(classifier, train.x, type="raw")
         FALSE       TRUE
[1,] 0.4705882 0.52941176
[2,] 0.4705882 0.52941176
[3,] 0.9142857 0.08571429
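
Note that sapply(train.x, as.logical) returns a logical matrix rather than a data frame. If you would rather keep a data frame, one possible alternative is to convert each column to a factor. This is only a sketch of the conversion step: train.f is a hypothetical name, and the predictions naiveBayes produces from factor columns have not been checked here and need not match the numbers above.

```r
# Training data from the failing example
train.x <- data.frame(B = c(0, 1, 0), C = c(0, 0, 0), D = c(0, 0, 1), Z = c(1, 0, 0))

# Convert every column to a factor, declaring both levels explicitly so that
# a column containing only 0 (such as C) still has levels "0" and "1"
train.f <- train.x
train.f[] <- lapply(train.x, factor, levels = c(0, 1))

str(train.f)

# None of the columns is numeric any more
any_numeric <- any(sapply(train.f, is.numeric))
print(any_numeric)
```

Assigning into train.f[] replaces the columns while keeping the data frame structure and the column names.
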

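My guess is that, in the first case, it may have something to do with the number of independent variables being greater than the number of training examples.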