Python: float division by zero error related to ngram and nltk

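My task is to use 10-fold cross-validation on a corpus with unigram, bigram and trigram taggers and compare their accuracy. However, I am getting a float division error. All of this code except the for loop was given to me by the person who set the task, so the error is probably in the loop. For now I only use the first 1000 sentences to test the program; that line will be removed once I know the program runs.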

import codecs
mypath = "/Users/myname/Desktop/"
corpusFile = codecs.open(mypath + "estonianSample.txt",mode="r",encoding="latin-1")
sentences = [[tuple(w.split("/")) for w in line[:-1].split()] for line in corpusFile.readlines()]
corpusFile.close()


from math import ceil
N=len(sentences)
chunkSize = int(ceil(N/10.0))


sentences = sentences[:1000]

chunks=[sentences[i:i+chunkSize] for i in range(0, N, chunkSize)]

for i in range(10):

    training = reduce(lambda x,y:x+y,[chunks[j] for j in range(10) if j!=i])
    testing = chunks[i]

from nltk import UnigramTagger,BigramTagger,TrigramTagger
t1 = UnigramTagger(training)
t2 = BigramTagger(training,backoff=t1)
t3 = TrigramTagger(training,backoff=t2)

t3.evaluate(testing)
The error says:

runfile('/Users/myname/pythonhw3.py', wdir='/Users/myname')
Traceback (most recent call last):
  File "<ipython-input-1-921164840ebd>", line 1, in <module>
    runfile('/Users/myname/pythonhw3.py', wdir='/Users/myname') 
  File "/Users/myname/anaconda/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 580, in runfile
    execfile(filename, namespace)
  File "/Users/myname/pythonhw3.py", line 34, in <module>
    t3.evaluate(testing)
  File "/Users/myname/anaconda/lib/python2.7/site-packages/nltk/tag/api.py", line 67, in evaluate
    return accuracy(gold_tokens, test_tokens)
  File "/Users/myname/anaconda/lib/python2.7/site-packages/nltk/metrics/scores.py", line 40, in accuracy
    return float(sum(x == y for x, y in izip(reference, test))) / len(test)    
ZeroDivisionError: float division by zero

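The error occurs because the test set passed to evaluate is empty: the last frame of the traceback divides by len(test), which is zero.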

The specific line causing the problem is

t3.evaluate(testing)
What you can do is

try:
    t3.evaluate(testing)
except ZeroDivisionError:
    # Do whatever you want it to do
    print(0)
This worked for me. Give it a try.


This answer comes four years late, but hopefully someone will find it helpful.
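
Catching the exception hides the symptom, so it may also be worth checking why testing ends up empty. A likely cause, assuming the full corpus has well over 1000 sentences: N and chunkSize are computed before sentences = sentences[:1000], so the slices taken at offsets beyond 1000 are empty, and because the taggers are built after the loop, testing is the last chunk. The sketch below measures the length on the list that is actually split. The helper names make_folds and cross_validation_splits and the stand-in data are hypothetical, not part of the question or of NLTK.

```python
from math import ceil


def make_folds(sentences, k=10):
    """Split sentences into at most k contiguous, non-empty chunks."""
    n = len(sentences)  # measured on the list that is actually split
    if n == 0:
        return []
    chunk_size = int(ceil(n / float(k)))
    return [sentences[i:i + chunk_size] for i in range(0, n, chunk_size)]


def cross_validation_splits(sentences, k=10):
    """Yield one (training, testing) pair per fold."""
    folds = make_folds(sentences, k)
    for i, testing in enumerate(folds):
        # Training data is every sentence from every fold except fold i.
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield training, testing


# Stand-in data (hypothetical): 1000 one-word tagged sentences,
# in the same (word, tag) shape the question builds from its corpus file.
sample = [[("w%d" % i, "TAG")] for i in range(1000)]
splits = list(cross_validation_splits(sample, k=10))
```

Each (training, testing) pair could then be used inside the loop to build the UnigramTagger, BigramTagger and TrigramTagger from the question and to call t3.evaluate(testing) once per fold, so that every fold contributes an accuracy value instead of only the last one.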

Can you post the full output of the error?
Edited! Added the full output!