Stanford NLP: How to use IOB-style encoding in CoreNLP

I am trying to train my NER model using a training set like the one shown below:

British B-company 
Broadcasting    I-company 
Corporation I-company 
British nationality
public  B-orgTpye
service I-orgType
broadcaster I-orgTpye
headquartered   HQ
London  city
Newyork city
American    B-company
Airlines    I-company   
Jaguar  auto
Mercedes    auto
McLaren auto
When I run the CRF classifier, it does not recognize the B- and I- prefixes. It treats them as independent token labels.
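Separately from the prefix problem, the training set above mixes the spellings `orgTpye` and `orgType`, so those labels cannot form one entity type. The following is a minimal sketch of a label sanity check in plain Java, assuming the convention in which every entity starts with `B-`. The class `IobLabelCheck` and its `check` method are hypothetical names for illustration and are not part of CoreNLP.

```java
import java.util.ArrayList;
import java.util.List;

final class IobLabelCheck {
    /** Returns one message per I-X label that does not continue the currently open B-X entity. */
    static List<String> check(String[] labels) {
        List<String> problems = new ArrayList<>();
        String open = null; // type of the entity opened by the last B- label, or null
        for (int i = 0; i < labels.length; i++) {
            String label = labels[i];
            if (label.startsWith("B-")) {
                open = label.substring(2);
            } else if (label.startsWith("I-")) {
                if (!label.substring(2).equals(open)) {
                    String expected = (open == null) ? "any entity" : "B-" + open;
                    problems.add("line " + (i + 1) + ": " + label + " does not continue " + expected);
                }
            } else {
                open = null; // an unprefixed label closes the open entity
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Label column of the training set above, in order.
        String[] labels = {
            "B-company", "I-company", "I-company", "nationality",
            "B-orgTpye", "I-orgType", "I-orgTpye", "HQ", "city", "city",
            "B-company", "I-company", "auto", "auto", "auto"
        };
        for (String problem : check(labels)) {
            System.out.println(problem);
        }
    }
}
```

On the labels above this reports the `I-orgType` row, whose type does not match the preceding `B-orgTpye`.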

Below is my classifier code:

    import java.util.Properties;
    import edu.stanford.nlp.ie.NERClassifierCombiner;

    // ModelLocation, customModelFile and GenericNERModel_A are model paths defined
    // elsewhere; the expressions that produced them were cut off in the post.
    String[] String2StringArray = {
        "The British Broadcasting Corporation is a British public service broadcaster headquartered at Broadcasting House in London"
    };

    Properties props = new Properties();

    String basedir = ModelLocation;
    props.setProperty("ner.model", customModelFile);
    props.setProperty("ner.model", basedir); // overwrites the value set on the previous line
    props.setProperty("ner.combinationMode", "HIGH_RECALL");
    props.setProperty("ner.useSUTime", "true");
    props.setProperty("sutime.includeRange", "true");
    props.setProperty("ner.applyNumericClassifiers", "true");

    StringBuilder classifierOutputAsString = new StringBuilder();
    /* Combining different classifier models */
    //NERClassifierCombiner classifierCombiner = new NERClassifierCombiner(props);
    NERClassifierCombiner classifierCombiner = new NERClassifierCombiner(true, true, GenericNERModel_A, customModelFile);

    // Tag each input string and collect the inline-XML output.
    for (String str : String2StringArray) {
        String classifiedToken = classifierCombiner.classifyWithInlineXML(str);
        classifierOutputAsString.append(classifiedToken);
    }

    System.out.println(classifierOutputAsString.toString());
The output is shown below:

The <ORGANIZATION>British Broadcasting Corporation</ORGANIZATION> is a <nationality>British</nationality> <B-orgTpye>public</B-orgTpye> <I-orgType>service</I-orgType> <I-orgTpye>broadcaster</I-orgTpye> <HQ>headquartered</HQ> <city>at</city> <ORGANIZATION>Broadcasting House</ORGANIZATION> in <LOCATION>London</LOCATION>

Based on an earlier answer by Christopher Manning, I added these lines to the properties file:

    props.setProperty("entitySubclassification", "IOB1");
    props.setProperty("retainEntitySubclassification", "true");
    props.setProperty("mergeTags", "true");

Now it uses IOB-style encoding.
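To illustrate what the B- and I- prefixes are meant to express, here is a minimal sketch in plain Java, independent of CoreNLP, that merges IOB-tagged tokens into entity spans. The class `IobSpans` and its `group` method are hypothetical names for illustration, not part of the Stanford API. The sketch assumes that a label without a prefix marks a single-token entity and that `O` marks a token outside any entity.

```java
import java.util.ArrayList;
import java.util.List;

final class IobSpans {
    /** Groups tokens labelled B-X, I-X, ... into one span of type X. */
    static List<String> group(String[] tokens, String[] labels) {
        List<String> spans = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        String type = null; // type of the span being built, or null if none
        for (int i = 0; i < tokens.length; i++) {
            String label = labels[i];
            // An I-X label extends the current span only if that span has type X.
            if (label.startsWith("I-") && label.substring(2).equals(type)) {
                text.append(' ').append(tokens[i]);
                continue;
            }
            // Any other label closes the current span, if there is one.
            if (type != null) {
                spans.add(type + ": " + text);
            }
            text.setLength(0);
            if (label.equals("O")) {
                type = null;
            } else {
                boolean prefixed = label.startsWith("B-") || label.startsWith("I-");
                type = prefixed ? label.substring(2) : label;
                text.append(tokens[i]);
            }
        }
        if (type != null) {
            spans.add(type + ": " + text);
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] tokens = {"British", "Broadcasting", "Corporation", "is", "London"};
        String[] labels = {"B-company", "I-company", "I-company", "O", "city"};
        for (String span : group(tokens, labels)) {
            System.out.println(span);
        }
    }
}
```

With these inputs the three `company` tokens end up in a single span, which is the behavior the B-/I- labels in the training set are intended to produce.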

Thanks. This led me to check the following location: