Stanford NLP: error when creating an NER model with Stanford NER
While creating an NER model, I get the following error message:
Exception in thread "main" java.lang.RuntimeException: Got NaN for prob in CRFLogConditionalObjectiveFunction.calculate() - this may well indicate numeric underflow due to overly long documents.
at edu.stanford.nlp.ie.crf.CRFLogConditionalObjectiveFunction.calculate(CRFLogConditionalObjectiveFunction.java:427)
at edu.stanford.nlp.optimization.AbstractCachingDiffFunction.ensure(AbstractCachingDiffFunction.java:140)
at edu.stanford.nlp.optimization.AbstractCachingDiffFunction.valueAt(AbstractCachingDiffFunction.java:145)
at edu.stanford.nlp.optimization.QNMinimizer.lineSearchMinPack(QNMinimizer.java:1460)
at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:1008)
at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:857)
at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:851)
at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:93)
at edu.stanford.nlp.ie.crf.CRFClassifier.trainWeights(CRFClassifier.java:1919)
at edu.stanford.nlp.ie.crf.CRFClassifier.train(CRFClassifier.java:1726)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.train(AbstractSequenceClassifier.java:758)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.train(AbstractSequenceClassifier.java:746)
at edu.stanford.nlp.ie.crf.CRFClassifier.main(CRFClassifier.java:3034)
To create the NER model, I simply used the Java command from the Stanford NER website [here].
The Java command is:
java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop 06012017_training.prop
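In addition, the TSV file used to train the NER model is 35.369 MB.
I am trying to create just a single tag, named "SYS".
How can I get past this error and successfully create the NER model?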
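Thanks in advance.

@stanfordnlphelp Just answering my own question: once I split off all the punctuation and then removed it, I no longer got any error. Also, tokenizing with the following command works well:
java -cp stanford-ner.jar edu.stanford.nlp.process.PTBTokenizer jane-austen-emma-ch1.txt > jane-austen-emma-ch1.tok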
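
For anyone who wants to try the same workaround, here is a minimal sketch of the punctuation-removal step in Python. It is not the code I actually ran, and it rests on some assumptions: the training file is tab-separated with the token in the first column (the usual word/label layout), "punctuation" means tokens made up entirely of ASCII punctuation characters, and blank lines should be kept as they are. The function names and file paths are made up for illustration.

```python
import string


def is_punctuation(token: str) -> bool:
    """Return True if the token is non-empty and made only of ASCII punctuation."""
    return bool(token) and all(ch in string.punctuation for ch in token)


def drop_punctuation_rows(lines):
    """Yield TSV lines whose first column is not a punctuation-only token.

    Blank lines are passed through unchanged (assumption: they carry
    meaning in the training file and should not be dropped).
    """
    for line in lines:
        stripped = line.rstrip("\n")
        if not stripped.strip():
            yield "\n"
            continue
        # Assumption: the token is the first tab-separated column.
        token = stripped.split("\t", 1)[0]
        if not is_punctuation(token):
            yield stripped + "\n"


def filter_file(src_path: str, dst_path: str) -> None:
    """Write a copy of src_path to dst_path without punctuation-only rows."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(drop_punctuation_rows(src))
```

Calling `filter_file("train.tsv", "train.nopunct.tsv")` (both names hypothetical) would write the filtered copy, and the `trainFile` entry in the .prop file would then need to point at the new file. Tokens that merely contain punctuation, such as `U.S.`, are left in place by this check.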