R, naive Bayes from the e1071 library: the fitted model gives the prior probabilities as the prediction for every record

I am using naive Bayes from the e1071 library. I have the following toy dataset, named nb0.csv:

N_INQUIRIES_BIN,TARGET
1,0
2,1
2,0
1,0
1,0
1,0
1,1

Then I use the following lines of code:

library(e1071)
data = read.csv('d:/nb0.csv')
model <- naiveBayes(as.factor(data[, 'N_INQUIRIES_BIN']), data[, 'TARGET'])

However, when I predict on the training data, I get the prior probabilities as the prediction for every record:

> predict(model, as.factor(data[, 'N_INQUIRIES_BIN']), type='raw')
             0         1
[1,] 0.7142857 0.2857143
[2,] 0.7142857 0.2857143
[3,] 0.7142857 0.2857143
[4,] 0.7142857 0.2857143
[5,] 0.7142857 0.2857143
[6,] 0.7142857 0.2857143
[7,] 0.7142857 0.2857143

Is this an implementation bug, or am I missing something obvious?

By the way, everything is fine.

Correct answer

Code

library(e1071)
data = read.csv('d:/nb0.csv')

data$N_INQUIRIES_BIN <- as.factor(data$N_INQUIRIES_BIN)

model <- naiveBayes(TARGET ~ ., data)
predict(model, data, type='raw')
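
As a sanity check on the formula-based call above, the posteriors can be worked out by hand. This is a minimal sketch in base R, assuming no Laplace smoothing and a single categorical predictor; the toy data is typed inline (the name toy is only for illustration), so neither the CSV file nor e1071 is needed. For this dataset it should give 0.8 / 0.2 for records with N_INQUIRIES_BIN = 1 and 0.5 / 0.5 for records with N_INQUIRIES_BIN = 2.

```r
# Toy data typed inline so the sketch needs no file and no extra packages.
toy <- data.frame(
  N_INQUIRIES_BIN = factor(c(1, 2, 2, 1, 1, 1, 1)),
  TARGET          = factor(c(0, 1, 0, 0, 0, 0, 1))
)

# Prior P(TARGET) and class-conditional P(N_INQUIRIES_BIN | TARGET).
prior <- prop.table(table(toy$TARGET))
cond <- prop.table(table(toy$TARGET, toy$N_INQUIRIES_BIN), margin = 1)

# Unnormalised posterior: each TARGET row of cond is scaled by its prior.
joint <- cond * as.vector(prior)

# Normalise per bin level, then transpose so rows are bin levels
# and columns are TARGET classes.
posterior <- t(prop.table(joint, margin = 2))
print(posterior)
```

With only one predictor, the naive Bayes posterior reduces to the share of each TARGET class within each bin level, which is why the numbers are so round.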

This was too long for a comment, so I am posting it as an answer. I see two or three things that could be changed:

First: I suggest calling as.factor() outside the model, like this:

data$N_INQUIRIES_BIN <- as.factor(data$N_INQUIRIES_BIN)

Second: the call to naiveBayes itself. Your call was:

model <- naiveBayes(as.factor(data[, 'N_INQUIRIES_BIN']), data[, 'TARGET'])

#Here I can't claim this is the model you are looking for, but for illustration purposes:
model <- naiveBayes(N_INQUIRIES_BIN ~ ., data = data)

Note again that the conditional and prior probabilities computed with this function call differ from yours:

model <- naiveBayes(N_INQUIRIES_BIN ~ ., data = data)
model
#
#Naive Bayes Classifier for Discrete Predictors
#
#Call:
#naiveBayes.default(x = X, y = Y, laplace = laplace)
#
#A-priori probabilities:
#Y
#        1         2
#0.7142857 0.2857143
#
#Conditional probabilities:
#   TARGET
#Y   [,1]      [,2]
#  1  0.2 0.4472136
#  2  0.5 0.7071068

Finally, predict (again, following the example in the help file):

#Here, all of the dataset is taken into account
predict(model, data, type='raw')
#             1         2
#[1,] 0.8211908 0.1788092
#[2,] 0.5061087 0.4938913
#[3,] 0.8211908 0.1788092
#[4,] 0.8211908 0.1788092
#[5,] 0.8211908 0.1788092
#[6,] 0.8211908 0.1788092
#[7,] 0.5061087 0.4938913

For completeness, and regarding the subject of the post: the formula in the model above is not the one the OP wanted. Here is the actual call:

#Keep the as.factor call outside of the model
data$N_INQUIRIES_BIN <- as.factor(data$N_INQUIRIES_BIN)
#Explicitly state the formula in naiveBayes
#Note that the specified column is TARGET and not N_INQUIRIES_BIN
model <- naiveBayes(TARGET ~ ., data)
#Predict with the model, using the whole dataset
predict(model, data, type='raw')
#Yields the following:
#       0   1
#[1,] 0.8 0.2
#[2,] 0.5 0.5
#[3,] 0.5 0.5
#[4,] 0.8 0.2
#[5,] 0.8 0.2
#[6,] 0.8 0.2
#[7,] 0.8 0.2

Comments

I think your naiveBayes call may be wrong. Note that the linked examples always use a formula (yours does not), and naiveBayes only accepts data.frames or arrays (so data[, 'TARGET'] may not work).

The formula does not have to be set explicitly, as the iris example shows. The same example uses iris[, 5], so data[, 'TARGET'] must work the same way. To be safe, I checked model, changed the code according to your instructions, and everything seems to work. I am new to this, so I don't know what exactly triggered the problem: the formula (although there is an example that does not use a formula), the use of a data frame, or something else? Thanks.

I see that the formula call is actually the other way around. I will edit the answer with the correct data. Glad to help!

Thanks. I have also added the correct answer to the bottom of the question.
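
On the question of what triggered the problem, one plausible explanation is that predict() on a naiveBayes model matches predictors to the new data by column name. This is an assumption on my part and is not confirmed anywhere in this thread. A bare factor carries no column name, so as.data.frame() derives one from the expression at the call site. The base R sketch below only illustrates that naming behaviour. The functions fit_side and predict_side are hypothetical stand-ins, not e1071 functions.

```r
# Hypothetical stand-ins for a fitting function and a prediction function
# that each coerce their input with as.data.frame().
fit_side <- function(x) names(as.data.frame(x))
predict_side <- function(newdata) names(as.data.frame(newdata))

bin <- factor(c(1, 2, 2, 1, 1, 1, 1))

# For a bare factor, the column name comes from the argument expression,
# so the two sides end up with different names for the same data.
train_name <- fit_side(bin)
new_name <- predict_side(bin)
print(train_name)
print(new_name)
print(identical(train_name, new_name))

# With a data frame, the column name travels with the data.
df <- data.frame(N_INQUIRIES_BIN = bin)
df_names_match <- identical(fit_side(df), predict_side(df))
print(df_names_match)
```

If name matching is indeed how prediction works, a model fitted on a bare factor would find no matching predictor column at prediction time and would fall back to the priors. Passing a data frame to both the fit and the prediction avoids the mismatch.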