Stanford NLP: FileNotFoundException when processing Chinese text

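I have been trying to use the Chinese version of Stanford CoreNLP with their existing Chinese models.

When I run the suggested command, I always get a java.io.FileNotFoundException for this file: /u/nlp/data/chinese/distsim/xin_cmn_2000-2010.ldc.seg.utf8.all-c1000

Here is the full stack trace: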

Registering annotator segment with class edu.stanford.nlp.pipeline.ChineseSegmenterAnnotator
Adding annotator segment
Loading Segmentation Model [edu/stanford/nlp/models/segmenter/chinese/ctb.gz]...Loading classifier from edu/stanford/nlp/models/segmenter/chinese/ctb.gz ...
Loading Chinese dictionaries from 1 files:
  edu/stanford/nlp/models/segmenter/chinese/dict-chris6.ser.gz

loading dictionaries from edu/stanford/nlp/models/segmenter/chinese/dict-chris6.ser.gz...Done. Unique words in ChineseDictionary is: 423200
done [19.6 sec].
done. Time elapsed: 19670 ms
Adding annotator ssplit
edu.stanford.nlp.pipeline.AnnotatorImplementations:ssplit.boundaryTokenRegex=[.]|[!?]+|[。]|[!?]+

Adding annotator pos 
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/chinese-distsim/chinese-distsim.tagger ... done [2.8 sec]. 
Adding annotator ner 
Loading classifier from edu/stanford/nlp/models/ner/chinese.misc.distsim.crf.ser.gz ... 
Loading distsim lexicon from /u/nlp/data/chinese/distsim/xin_cmn_2000-2010.ldc.seg.utf8.all-c1000 ... 

edu.stanford.nlp.io.RuntimeIOException: java.io.FileNotFoundException: /u/nlp/data/chinese/distsim/xin_cmn_2000-2010.ldc.seg.utf8.all-c1000 (No such file or directory)

    at edu.stanford.nlp.io.IOUtils.inputStreamFromFile(IOUtils.java:481)
    at edu.stanford.nlp.io.IOUtils.readerFromFile(IOUtils.java:522)
    at edu.stanford.nlp.objectbank.ReaderIteratorFactory$ReaderIterator.setNextObject(ReaderIteratorFactory.java:189)
    at edu.stanford.nlp.objectbank.ReaderIteratorFactory$ReaderIterator.<init>(ReaderIteratorFactory.java:161)
    at edu.stanford.nlp.objectbank.ReaderIteratorFactory.iterator(ReaderIteratorFactory.java:98)
    at edu.stanford.nlp.objectbank.ObjectBank$OBIterator.<init>(ObjectBank.java:404)
    at edu.stanford.nlp.objectbank.ObjectBank.iterator(ObjectBank.java:242)
    at edu.stanford.nlp.ie.NERFeatureFactory.initLexicon(NERFeatureFactory.java:474)
    at edu.stanford.nlp.ie.NERFeatureFactory.init(NERFeatureFactory.java:382)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.reinit(AbstractSequenceClassifier.java:172)
    at edu.stanford.nlp.ie.crf.CRFClassifier.loadClassifier(CRFClassifier.java:2619)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1666)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1721)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1708)
    at edu.stanford.nlp.ie.crf.CRFClassifier.getClassifier(CRFClassifier.java:2836)
    at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifierFromPath(ClassifierCombiner.java:189)
    at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifiers(ClassifierCombiner.java:173)
    at edu.stanford.nlp.ie.ClassifierCombiner.<init>(ClassifierCombiner.java:113)
    at edu.stanford.nlp.ie.NERClassifierCombiner.<init>(NERClassifierCombiner.java:65)
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.ner(AnnotatorImplementations.java:99)
    at edu.stanford.nlp.pipeline.AnnotatorFactories$6.create(AnnotatorFactories.java:319)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:289)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:126)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:122)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.main(StanfordCoreNLP.java:1056)
Caused by: java.io.FileNotFoundException: /u/nlp/data/chinese/distsim/xin_cmn_2000-2010.ldc.seg.utf8.all-c1000 (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at edu.stanford.nlp.io.IOUtils.inputStreamFromFile(IOUtils.java:475)
    ... 25 more

Loading classifier from edu/stanford/nlp/models/ner/chinese.misc.distsim.crf.ser.gz ...
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: java.io.FileNotFoundException
    at edu.stanford.nlp.pipeline.AnnotatorFactories$6.create(AnnotatorFactories.java:321)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:289)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:126)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:122)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.main(StanfordCoreNLP.java:1056)
Caused by: java.io.FileNotFoundException
    at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifierFromPath(ClassifierCombiner.java:199)
    at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifiers(ClassifierCombiner.java:173)
    at edu.stanford.nlp.ie.ClassifierCombiner.<init>(ClassifierCombiner.java:113)
    at edu.stanford.nlp.ie.NERClassifierCombiner.<init>(NERClassifierCombiner.java:65)
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.ner(AnnotatorImplementations.java:99)
    at edu.stanford.nlp.pipeline.AnnotatorFactories$6.create(AnnotatorFactories.java:319)
    ... 5 more
Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to edu.stanford.nlp.classify.LinearClassifier
    at edu.stanford.nlp.ie.ner.CMMClassifier.loadClassifier(CMMClassifier.java:1070)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1666)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1721)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1708)
    at edu.stanford.nlp.ie.ner.CMMClassifier.getClassifier(CMMClassifier.java:1116)
    at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifierFromPath(ClassifierCombiner.java:195)
    ... 10 more
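
The output above contains two separate traces. The first ends at the FileNotFoundException for the distsim lexicon, while the cause chain of the final "Exception in thread main" ends at a ClassCastException. Looking only at the innermost cause of the last exception would therefore miss the file path. As a rough sketch for reading such nested "Caused by" sections programmatically, the snippet below lists every link in a cause chain, outermost first. The class name CauseChain and the method describe are hypothetical, a plain RuntimeException stands in for CoreNLP's RuntimeIOException, and the path in main is a placeholder rather than the real lexicon path.

```java
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;

public class CauseChain {
    /** Returns one "ClassName: message" entry per throwable, outermost first. */
    public static List<String> describe(Throwable t) {
        List<String> entries = new ArrayList<>();
        // Follow getCause() links until there is no deeper cause.
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            entries.add(cur.getClass().getName() + ": " + cur.getMessage());
        }
        return entries;
    }

    public static void main(String[] args) {
        // Placeholder exception pair; RuntimeException stands in for RuntimeIOException.
        Throwable inner = new FileNotFoundException(
                "/hypothetical/lexicon/path (No such file or directory)");
        Throwable outer = new RuntimeException("wrapper", inner);
        for (String entry : describe(outer)) {
            System.out.println(entry);
        }
    }
}
```

In a real program, describe would be called from a catch block around the pipeline construction, so that each wrapped exception and its message is logged on its own line.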