How do I train an NER model in spaCy with different beam objective parameters?


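I am trying to update a pre-trained spaCy model (en_core_web_md) with a few rounds of beam-objective training (rather than beam_width=1), but I can't find the right way to pass different parameters into **cfg so that the model actually uses them for training (at this point).

This is my latest attempt: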

pipe_exceptions = ["ner", "trf_wordpiecer", "trf_tok2vec"]
other_pipes = [pipe for pipe in nlp.pipe_names if pipe not in pipe_exceptions]
# only train NER
with nlp.disable_pipes(*other_pipes), warnings.catch_warnings():
    # show warnings for misaligned entity spans once
    warnings.filterwarnings("once", category=UserWarning, module='spacy')

    # TRY TO FORCE BEAM TRAINING INSTEAD OF GREEDY METHOD
    nlp.use_params({'ner':{'beam_width':16, 'beam_density':0.0001}})
    print(nlp.meta) 

    sizes = compounding(1.0, 4.0, 1.001)
    # batch up the examples using spaCy's minibatch
    for itn in range(n_iter):
        random.shuffle(TRAIN_DATA_2)
        batches = minibatch(TRAIN_DATA_2, size=sizes)
        losses = {}
        for batch in batches:
            texts, annotations = zip(*batch)
            nlp.update(
                texts,
                annotations,
                sgd=optimizer,
                drop=0.35,
                losses=losses,
            )
        print("Losses", losses)

However, after training, the model/ner/cfg file still lists:

{
"beam_width":1,
"beam_density":0.0,
"beam_update_prob":1.0,
...

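So I have a few questions:

Can I update an existing greedily trained model with a new beam objective?

If so, how do I correctly change these training parameters (and confirm that they have changed)?

If not, how do I correctly change these training parameters (and confirm that they have changed) for a new model trained from scratch?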

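Why do this? I am trying to train a model that provides probabilities for its NER decisions, which I can then show to my users. One post and a few others show how to use beam_parse to get post-hoc probabilities from a greedy model. However, they all mention that the greedy model was not trained with a global objective, so those scores are not especially meaningful unless you also run some iterations of beam training.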

I found the answer. This is the syntax for modifying the config parameters:

nlp.entity.cfg['beam_width'] = 16
nlp.entity.cfg['beam_density'] = 0.0001

I put these lines before optimizer = nlp.resume_training(), and the values were then used for training.
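As a minimal sketch of the same idea, the two assignments can be wrapped in a small helper that also returns the resulting settings, so they can be printed before training starts. The helper name set_beam_cfg is hypothetical, and the sketch assumes that nlp.get_pipe("ner") returns the same component that nlp.entity refers to in the answer above, with a dict-like cfg attribute.

```python
def set_beam_cfg(nlp, beam_width=16, beam_density=0.0001):
    """Write beam settings into the NER component's cfg (hypothetical helper).

    Assumes nlp.get_pipe("ner") returns the NER component and that the
    component exposes a dict-like `cfg` attribute, as in the answer above.
    """
    ner = nlp.get_pipe("ner")
    ner.cfg["beam_width"] = beam_width
    ner.cfg["beam_density"] = beam_density
    # Return a copy so the caller can print or log what is now configured.
    return dict(ner.cfg)


# Usage, assuming `nlp` is an already loaded pipeline:
#   print(set_beam_cfg(nlp))
#   optimizer = nlp.resume_training()
```

Calling the helper before nlp.resume_training() mirrors the ordering described in the answer. Printing the returned dict is only a quick sanity check on the in-memory values, not proof that the beam objective was used during the update.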

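Are you starting training with something like nlp.resume_training(**cfg) or nlp.begin_training(**cfg)? Have you tried passing your parameters there?

@SergeyBushmanov I tried nlp.resume_training(beam_width=16, beam_density=0.0001) and spacy.load(model, beam_width=16, beam_density=0.0001). Both completed the model run, but the cfg file in the result shows beam_width=1. I also tried nlp.update(texts, annotations, sgd=optimizer, drop=0.35, losses=losses, component_cfg={'ner': {'beam_width': 16, 'beam_density': 0.0001}}), but the run failed because 'ner.update' does not accept additional arguments.

Try posting your question on the linked site.

Done.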
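The check mentioned in the question and comments, looking at the saved ner/cfg file to see which beam values were written, can be scripted. This is a rough sketch that assumes the model was saved to a directory containing a JSON file at ner/cfg, as in the model/ner/cfg path quoted in the question. The function names read_ner_cfg and beam_settings are hypothetical.

```python
import json
from pathlib import Path


def read_ner_cfg(model_dir):
    """Load the NER component's saved cfg as a dict.

    Assumes the saved model directory contains a JSON file at ner/cfg.
    """
    cfg_path = Path(model_dir) / "ner" / "cfg"
    with cfg_path.open(encoding="utf8") as f:
        return json.load(f)


def beam_settings(model_dir):
    """Return only the beam-related keys; keys missing from the file map to None."""
    cfg = read_ner_cfg(model_dir)
    keys = ("beam_width", "beam_density", "beam_update_prob")
    return {key: cfg.get(key) for key in keys}


# Usage, assuming the model was saved to ./model:
#   print(beam_settings("model"))
```

If beam_width still comes back as 1 after saving, the settings did not reach the component's cfg before the model was written to disk.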