Stanford NER does not extract percentages correctly

Tags: stanford-nlp, named-entity-recognition

I am trying to extract percentages with the Stanford NER tagger, but it does not extract them correctly.

from nltk.tag import StanfordNERTagger

inp_str = 'total revenue received was one hundred and twenty five percent 125% for last financial year'
# Whitespace split: '125%' stays a single token
split_inp_str = inp_str.split()
st = StanfordNERTagger('english.muc.7class.distsim.crf.ser.gz')
print(st.tag(split_inp_str))
This produces the following output:

[('total', 'O'), ('revenue', 'O'), ('received', 'O'), ('was', 'O'), ('one', 'O'), ('hundred', 'O'), ('and', 'O'), ('twenty', 'O'), ('five', 'PERCENT'), ('percent', 'PERCENT'), ('125%', 'O'), ('for', 'O'), ('last', 'O'), ('financial', 'O'), ('year', 'O')]

Why is 125% not extracted?

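You need to tokenize the sentence rather than split() it. str.split() leaves '125%' as a single token, whereas word_tokenize separates it into '125' and '%'. Try the following code: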

from nltk import word_tokenize
from nltk.tag import StanfordNERTagger

inp_str = 'total revenue received was one hundred and twenty five percent 125% for last financial year'
# word_tokenize splits '125%' into '125' and '%'
split_inp_str = word_tokenize(inp_str)
st = StanfordNERTagger('english.muc.7class.distsim.crf.ser.gz')
print(st.tag(split_inp_str))
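
To see the difference between the two inputs without installing NLTK or the Stanford jars, here is a small sketch that compares a whitespace split with a simple regex tokenizer. Note that simple_tokenize is a hypothetical stand-in written for illustration; it is not NLTK's word_tokenize and only mimics the one behaviour that matters here, namely separating the percent sign from the number.

```python
import re

def split_tokens(text):
    """Whitespace split, as in the question: '125%' stays one token."""
    return text.split()

def simple_tokenize(text):
    """Hypothetical stand-in for a Treebank-style tokenizer.

    Matches either a run of word characters or a single
    non-space symbol, so '%' becomes a token of its own.
    """
    return re.findall(r"\w+|[^\w\s]", text)

inp_str = ('total revenue received was one hundred and twenty five '
           'percent 125% for last financial year')

naive = split_tokens(inp_str)
tokens = simple_tokenize(inp_str)

print('125%' in naive)   # True: number and sign are fused
print(tokens[10:12])     # ['125', '%']
```

Printing the token list before passing it to st.tag() is a quick way to check whether the tagger is being given the input you expect.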

When I use Stanford CoreNLP 3.7.0, "125%" is tagged as PERCENT. I am running the Java code. If you are using NLTK, I am not entirely sure what is being run.