FileNotFoundException on tmp/roth_sentences.ser when training a Stanford relation extractor model


I am trying to train my own relation extraction model, as described, but I keep getting a strange error.

My properties file:

#Below are some basic options. See edu.stanford.nlp.ie.machinereading.MachineReadingProperties class for more options.

# Pipeline options
annotators = pos, lemma, parse
parse.maxlen = 100

# MachineReading properties. You need one class to read the dataset into correct format. See edu.stanford.nlp.ie.machinereading.domains.ace.AceReader for another example.
datasetReaderClass = edu.stanford.nlp.ie.machinereading.domains.roth.RothCONLL04Reader

readerLogLevel = INFO
#Data directory for training. The datasetReaderClass reads data from this path and makes corresponding sentences and annotations.
trainPath = ../re-training-data.corp

#Whether to crossValidate, that is evaluate, or just train.
crossValidate = false
kfold = 10

#Change this to true if you want to use CoreNLP pipeline generated NER tags. The default model generated with the relation extractor release uses the CoreNLP pipeline provided tags (option set to true)
trainUsePipelineNER=true

# where to save training sentences. uses the file if it exists, otherwise creates it.
serializedTrainingSentencesPath = tmp/roth_sentences.ser

serializedEntityExtractorPath = tmp/roth_entity_model.ser

# where to store the output of the extractor (sentence objects with relations generated by the model). This is what you will use as the model when using 'relation' annotator in the CoreNLP pipeline.
serializedRelationExtractorPath = tmp/kpl-relation-model-pipeline.ser

# uncomment to load a serialized model instead of retraining
# loadModel = true

#relationResultsPrinters = edu.stanford.nlp.ie.machinereading.RelationExtractorResultsPrinter,edu.stanford.nlp.ie.machinereading.domains.roth.RothResultsByRelation. For printing output of the model.
relationResultsPrinters = edu.stanford.nlp.ie.machinereading.RelationExtractorResultsPrinter

#In this domain, this is trivial since all the entities are given (or set using CoreNLP NER tagger).
entityClassifier = edu.stanford.nlp.ie.machinereading.domains.roth.RothEntityExtractor

extractRelations = true
extractEvents = false

#We are setting the entities beforehand so the model does not learn how to extract entities etc.
extractEntities = false

#Opposite of crossValidate.
trainOnly=true

# The set chosen by feature selection using RothCONLL04:
relationFeatures = arg_words,arg_type,dependency_path_lowlevel,dependency_path_words,surface_path_POS,entities_between_args,full_tree_path
Here is what I run in the terminal:

sudo java -cp stanford-corenlp-3.7.0.jar:stanford-corenlp-3.7.0-models.jar edu.stanford.nlp.ie.machinereading.MachineReading --arguments kpl-re-model.properties
And the result:

PERCENTAGE OF TRAIN: 1.0
The reader log level is set to INFO
Adding annotator pos
Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0.8 sec].
Adding annotator lemma
Adding annotator parse
Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [0.6 sec].
Jan 17, 2017 4:55:06 PM edu.stanford.nlp.ie.machinereading.MachineReading makeResultsPrinters
INFO: Making result printers from 
Jan 17, 2017 4:55:06 PM edu.stanford.nlp.ie.machinereading.MachineReading makeResultsPrinters
INFO: Making result printers from edu.stanford.nlp.ie.machinereading.RelationExtractorResultsPrinter
Jan 17, 2017 4:55:06 PM edu.stanford.nlp.ie.machinereading.MachineReading makeResultsPrinters
INFO: Making result printers from 
Jan 17, 2017 4:55:06 PM edu.stanford.nlp.ie.machinereading.MachineReading loadOrMakeSerializedSentences
INFO: Parsing corpus sentences...
Jan 17, 2017 4:55:06 PM edu.stanford.nlp.ie.machinereading.MachineReading loadOrMakeSerializedSentences
INFO: These sentences will be serialized to /home/ubuntu/stanford-corenlp-full-2016-10-31/tmp/roth_sentences.ser
Jan 17, 2017 4:55:06 PM edu.stanford.nlp.ie.machinereading.domains.roth.RothCONLL04Reader read
INFO: Reading file: ../re-training-data.corp
Jan 17, 2017 4:55:07 PM edu.stanford.nlp.ie.machinereading.GenericDataSetReader preProcessSentences
SEVERE: GenericDataSetReader: Started pre-processing the corpus...
Jan 17, 2017 4:55:07 PM edu.stanford.nlp.ie.machinereading.GenericDataSetReader preProcessSentences
INFO: Annotating dataset with edu.stanford.nlp.pipeline.StanfordCoreNLP@5f9d02cb
Jan 17, 2017 4:58:32 PM edu.stanford.nlp.ie.machinereading.GenericDataSetReader preProcessSentences
SEVERE: GenericDataSetReader: Pre-processing complete.
Jan 17, 2017 4:58:32 PM edu.stanford.nlp.ie.machinereading.GenericDataSetReader parse
SEVERE: Changing NER tags using the CoreNLP pipeline.
Replacing old annotator "parse" with signature [edu.stanford.nlp.pipeline.ParserAnnotator#parse.maxlen:100;#] with new annotator with signature [edu.stanford.nlp.pipeline.ParserAnnotator##]
Adding annotator pos
Adding annotator lemma
Adding annotator ner
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [1.4 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [0.5 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [0.5 sec].
Jan 17, 2017 4:58:45 PM edu.stanford.nlp.ie.machinereading.MachineReading loadOrMakeSerializedSentences
INFO: Done. Parsed 1183 sentences.
Jan 17, 2017 4:58:45 PM edu.stanford.nlp.ie.machinereading.MachineReading loadOrMakeSerializedSentences
INFO: Serializing parsed sentences to /home/ubuntu/stanford-corenlp-full-2016-10-31/tmp/roth_sentences.ser...
Exception in thread "main" java.io.FileNotFoundException: tmp/roth_sentences.ser (No such file or directory)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
    at edu.stanford.nlp.io.IOUtils.writeObjectToFile(IOUtils.java:77)
    at edu.stanford.nlp.io.IOUtils.writeObjectToFile(IOUtils.java:63)
    at edu.stanford.nlp.ie.machinereading.MachineReading.loadOrMakeSerializedSentences(MachineReading.java:914)
    at edu.stanford.nlp.ie.machinereading.MachineReading.run(MachineReading.java:270)
    at edu.stanford.nlp.ie.machinereading.MachineReading.main(MachineReading.java:111)
The error says it cannot find "tmp/roth_sentences.ser", but that makes no sense, because it is supposed to create that file.

Any ideas?

Thanks!
Simon.

I think if you change tmp/roth_sentences.ser to roth_sentences.ser, it should work. My guess is that the problem is that /home/ubuntu/stanford-corenlp-full-2016-10-31/tmp does not exist, so the program crashes when it tries to write the file.
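The underlying behavior is not specific to CoreNLP: Java's `FileOutputStream` (which `IOUtils.writeObjectToFile` uses, per the stack trace) never creates missing parent directories, and reports a missing directory as a `FileNotFoundException` on the file itself. A minimal sketch of both the failure and the fix (the class name `SerializeDirDemo` and the temp-directory setup are mine, for illustration):

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SerializeDirDemo {
    public static void main(String[] args) throws IOException {
        // Work under a fresh temp directory so the demo is reproducible.
        Path base = Files.createTempDirectory("mr-demo");
        File out = base.resolve("tmp").resolve("roth_sentences.ser").toFile();

        // FileOutputStream does NOT create missing parent directories:
        // with no tmp/ it throws the same FileNotFoundException
        // ("... (No such file or directory)") seen in the training log.
        boolean failedWithoutDir = false;
        try (FileOutputStream fos = new FileOutputStream(out)) {
            // only reached if tmp/ already existed
        } catch (FileNotFoundException e) {
            failedWithoutDir = true;
        }

        // Creating the parent directory first lets the write succeed.
        out.getParentFile().mkdirs();
        try (FileOutputStream fos = new FileOutputStream(out)) {
            fos.write(0);
        }
        System.out.println(failedWithoutDir + " " + out.exists());
    }
}
```

Note that the `tmp/` in the properties file is resolved relative to the JVM's working directory, so the directory has to exist under whatever directory you launch `java` from, not necessarily next to the properties file.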

That worked! Strangely enough, I had the same suspicion before posting this question, so I tried creating a tmp directory, but it didn't work then. Anyway, thank you!