Python NLTK Naive Bayes classifier: what is the underlying computation this classifier uses to classify input?

I am using the Naive Bayes classifier in Python NLTK to compute the probability distribution for the following example:

import nltk


def main():
    # Each training item is a (featureset, label) pair.
    train = [
        (dict(feature=1), 'class_x'),
        (dict(feature=0), 'class_x'),
        (dict(feature=0), 'class_y'),
        (dict(feature=0), 'class_y'),
    ]

    # A single unlabeled featureset to classify.
    test = [dict(feature=1)]

    classifier = nltk.classify.NaiveBayesClassifier.train(train)

    print("classes available: ", sorted(classifier.labels()))

    print("input assigned to: ", classifier.classify_many(test))

    # Print the posterior probability of each class for every test input.
    for pdist in classifier.prob_classify_many(test):
        print("probability distribution: ")
        print('%.4f %.4f' % (pdist.prob('class_x'), pdist.prob('class_y')))


if __name__ == '__main__':
    main()

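The training data set contains two classes (class_x and class_y), each with two inputs. For class_x, the feature has value 1 in the first input and 0 in the second. For class_y, the feature has value 0 in both inputs. The test data set consists of a single input whose feature has value 1.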

Running the code produces this output:

classes available:  ['class_x', 'class_y']
input assigned to:  ['class_x']
probability distribution:
0.7500 0.2500

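To obtain the probability (or likelihood) of each class, the classifier should multiply the prior of that class (0.5 in this example) by the probability of each feature value given that class. Smoothing should also be taken into account.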
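As a rough sketch of that computation, the snippet below works through the example by hand without calling NLTK. It assumes add-0.5 smoothing (expected likelihood estimation, which I understand to be the default estimator of NaiveBayesClassifier.train) applied to both the class priors and the feature probabilities, with the number of bins taken as the number of distinct values seen for the feature. The names GAMMA, smoothed_prob and posterior are made up for illustration and are not part of NLTK, and handling of missing features is left out.

```python
from collections import Counter

GAMMA = 0.5  # assumption: add-0.5 (expected likelihood) smoothing

train = [
    ({"feature": 1}, "class_x"),
    ({"feature": 0}, "class_x"),
    ({"feature": 0}, "class_y"),
    ({"feature": 0}, "class_y"),
]
test = {"feature": 1}


def smoothed_prob(count, total, bins, gamma=GAMMA):
    """Lidstone-style estimate: (count + gamma) / (total + gamma * bins)."""
    return (count + gamma) / (total + gamma * bins)


def posterior(train, featureset):
    """Hypothetical helper: normalized P(label) * product of P(value | label)."""
    label_counts = Counter(label for _, label in train)
    scores = {}
    for label, label_count in label_counts.items():
        # Smoothed prior of the class.
        score = smoothed_prob(label_count, len(train), len(label_counts))
        for name, value in featureset.items():
            # Bins = distinct values observed for this feature in training.
            seen_values = {fs[name] for fs, _ in train if name in fs}
            # How often this value occurs together with this label.
            value_count = sum(
                1 for fs, lab in train if lab == label and fs.get(name) == value
            )
            score *= smoothed_prob(value_count, label_count, len(seen_values))
        scores[label] = score
    # Normalize so the class scores sum to 1.
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


result = posterior(train, test)
print("%.4f %.4f" % (result["class_x"], result["class_y"]))
```

Under these assumptions the smoothed priors are both 0.5, P(feature=1 | class_x) = 1.5/3 and P(feature=1 | class_y) = 0.5/3, so the normalized scores come out as 0.75 and 0.25, matching the output above.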

I usually use a formula like the following (or a similar variant):