How do I train a custom model in OpenNLP?


I want to train my own custom model. Where do I start?

I am using this sample data to train the model:

<START:meaningless>Took connection and<END>  selected the Text in the Letter Template and cleared the Formatting of Text to Normal.

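Basically, I want to identify certain meaningless text in a given input.

I tried the following sample code from the OpenNLP developer documentation, but I get an error: the model is not compatible with the name finder.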

Charset charset = Charset.forName("UTF-8");
ObjectStream<String> lineStream =
    new PlainTextByLineStream(new FileInputStream("mynewmodel.train"), charset);
ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);

TokenNameFinderModel model;

try {
    model = NameFinderME.train("en", "meaningless", sampleStream,
        Collections.<String, Object>emptyMap(), 100, 5);
} finally {
    sampleStream.close();
}

try {
    modelOut = new BufferedOutputStream(new FileOutputStream(modelFile));
    model.serialize(modelOut);
} finally {
    if (modelOut != null)
        modelOut.close();
}

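Possible problem: you are not giving the trainer cleanly tokenized text. If I understand the documentation correctly, PlainTextByLineStream needs whitespace-separated tokens, and that includes the <START:...> and <END> tags. So use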

<START:meaningless> Took connection and <END>

instead of

<START:meaningless>Took connection and<END>
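
The spacing rule described above can be checked mechanically before training. What follows is only a minimal sketch in plain Java with no OpenNLP dependency. The class NameSampleTagSpacing and its two methods are hypothetical helpers, not part of OpenNLP, and the regular expression is an assumption about what the tags look like. It only puts whitespace around the tags; it does not tokenize the rest of the sentence (for example, it leaves "Normal." as one token).

```java
import java.util.regex.Pattern;

// Hypothetical helper (not an OpenNLP class): puts whitespace around
// <START:type> and <END> tags in a line of name finder training data.
final class NameSampleTagSpacing {

    // Assumed tag shapes: <START>, <START:someType>, and <END>.
    private static final Pattern TAG =
        Pattern.compile("<START(:[^>\\s]+)?>|<END>");

    private NameSampleTagSpacing() {
    }

    // Surrounds every tag with spaces, then collapses runs of whitespace.
    static String normalize(String line) {
        String spaced = TAG.matcher(line).replaceAll(" $0 ");
        return spaced.trim().replaceAll("\\s+", " ");
    }

    // True when every tag in the line is already set off by whitespace.
    static boolean isCleanlyTagged(String line) {
        String collapsed = line.trim().replaceAll("\\s+", " ");
        return collapsed.equals(normalize(line));
    }

    public static void main(String[] args) {
        String line = "<START:meaningless>Took connection and<END>  selected the Text";
        System.out.println(isCleanlyTagged(line));
        System.out.println(normalize(line));
    }
}
```

Running each line of the training file through normalize before training, or rejecting lines for which isCleanlyTagged returns false, would catch this kind of formatting slip early.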

One question: what type of file is "mynewmodel.train"?