Stanford NLP: is it possible to get the set of tokens with a specific named entity tag that make up a phrase?

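I'm using the Stanford CoreNLP parser to go through some text that contains date phrases, such as "the second Monday in October" and "the past year". The library appropriately tags each token as a DATE named entity, but is there a way to get the whole date phrase programmatically? It's not just dates: ORGANIZATION named entities behave the same way ("The International Olympic Committee", for example, could be identified in a given text sample).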

After the Stanford annotators and classifiers are loaded, the following output is produced:

DATE: Thanksgiving
DATE: Thanksgiving
DATE: the
DATE: second
DATE: Monday
DATE: in
DATE: October
DATE: the
DATE: past
DATE: year

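I feel the library must be recognizing these phrases and using them for the named entity tagging, so the question is whether that data is kept and made available somehow through the API.

Thanks,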

Kevin

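After discussion on the mailing list, I found that the API does not support this. My solution was to keep the state of the last NE (named entity) tag and build up a string whenever necessary. John B. from the NLP mailing list was helpful in answering my question.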
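The "keep the state of the last NE" approach described above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: the class name `EntityPhraseMerger` and the record `EntityPhrase` are hypothetical, and the tokens and tags are passed in as plain parallel lists so the example runs without CoreNLP on the classpath. In a real pipeline they would presumably come from each `CoreLabel` (for instance via `word()` and `ner()`), with "O" marking tokens outside any entity.

```java
import java.util.ArrayList;
import java.util.List;

class EntityPhraseMerger {

    /** One merged phrase: the shared NER tag and the joined token text. */
    record EntityPhrase(String tag, String text) {}

    /**
     * Merges runs of adjacent tokens that share the same entity tag.
     * Tokens tagged "O" (or null) are treated as non-entities and end a run.
     */
    static List<EntityPhrase> merge(List<String> tokens, List<String> tags) {
        if (tokens.size() != tags.size()) {
            throw new IllegalArgumentException("tokens and tags must be the same length");
        }
        List<EntityPhrase> phrases = new ArrayList<>();
        String currentTag = null;                 // tag of the run being built, if any
        StringBuilder current = new StringBuilder();

        for (int i = 0; i < tokens.size(); i++) {
            String tag = tags.get(i);
            boolean isEntity = tag != null && !"O".equals(tag);

            if (isEntity && tag.equals(currentTag)) {
                // Same tag as the previous token: extend the current phrase.
                current.append(' ').append(tokens.get(i));
                continue;
            }
            // The tag changed: emit the finished phrase, then start over.
            if (currentTag != null) {
                phrases.add(new EntityPhrase(currentTag, current.toString()));
            }
            current.setLength(0);
            currentTag = isEntity ? tag : null;
            if (isEntity) {
                current.append(tokens.get(i));
            }
        }
        if (currentTag != null) {                 // flush the last open run
            phrases.add(new EntityPhrase(currentTag, current.toString()));
        }
        return phrases;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("the", "second", "Monday", "in", "October", "was", "cold");
        List<String> tags = List.of("DATE", "DATE", "DATE", "DATE", "DATE", "O", "O");
        for (EntityPhrase p : merge(tokens, tags)) {
            System.out.println(p.tag() + ": " + p.text());
        }
    }
}
```

One limitation of this approach: two distinct entities of the same type that sit directly next to each other are merged into a single phrase, because the tags alone carry no boundary information. Running the merge per sentence reduces, but does not remove, that risk.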

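Thanks very much, I was going to do the same. However, the Stanford NER API supports classifyToCharOffset (or something like that; the method on AbstractSequenceClassifier is named classifyToCharacterOffsets) to get the whole phrase. I don't know, maybe it is just an implementation of your idea :D.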
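To show how such character offsets would be consumed, here is a small sketch. It assumes the classifier hands back (label, begin, end) results with an exclusive end offset, which is how `classifyToCharacterOffsets` reports entities as `Triple<String, Integer, Integer>` values. To keep the example runnable without CoreNLP or a model file, the hypothetical record `Span` stands in for that triple and the offsets are written by hand rather than produced by a classifier.

```java
import java.util.ArrayList;
import java.util.List;

class OffsetPhraseExtractor {

    /** Stand-in for one (label, begin, end) result; end is exclusive. */
    record Span(String label, int begin, int end) {}

    /** Cuts each labelled span out of the original text as "LABEL: phrase". */
    static List<String> extract(String text, List<Span> spans) {
        List<String> phrases = new ArrayList<>();
        for (Span s : spans) {
            phrases.add(s.label() + ": " + text.substring(s.begin(), s.end()));
        }
        return phrases;
    }

    public static void main(String[] args) {
        String text = "We met on the second Monday in October.";
        // Hand-written offsets for illustration; a real run would take them
        // from the classifier instead of computing them with indexOf.
        int begin = text.indexOf("the second");
        int end = text.indexOf(".");
        List<Span> spans = List.of(new Span("DATE", begin, end));
        extract(text, spans).forEach(System.out::println);
    }
}
```

Because the offsets index into the original string, the extracted phrase keeps the source spacing and punctuation exactly, which joining tokens with a single space does not guarantee.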

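The named entity tagger and the part-of-speech tagger are distinct algorithms in the CoreNLP pipeline, and it appears that integrating them is left to the API consumer.

Please forgive my C#, but here is a simple class: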

    public class NamedNounPhrase
    {
        public NamedNounPhrase()
        {
            Phrase = string.Empty;
            Tags = new List<string>();
        }

        public string Phrase { get; set; }

        public IList<string> Tags { get; set; }

    }

And the code to find all the top-level noun phrases and their associated named entity tags:

    private void _monkey()
    {

        ...

        var nounPhrases = new List<NamedNounPhrase>();

        foreach (CoreMap sentence in sentences.toArray())
        {
            var tree =
                (Tree)sentence.get(new TreeCoreAnnotations.TreeAnnotation().getClass());

            if (null != tree)
                _walk(tree, nounPhrases);
        }

        foreach (var nounPhrase in nounPhrases)
            Console.WriteLine(
                "{0} ({1})",
                nounPhrase.Phrase,
                string.Join(", ", nounPhrase.Tags)
                );
    }

    private void _walk(Tree tree, IList<NamedNounPhrase> nounPhrases)
    {
        if ("NP" == tree.value())
        {
            var nounPhrase = new NamedNounPhrase();

            foreach (Tree leaf in tree.getLeaves().toArray())
            {
                var label = (CoreLabel) leaf.label();
                nounPhrase.Phrase += (string) label.get(new CoreAnnotations.TextAnnotation().getClass()) + " ";
                nounPhrase.Tags.Add((string) label.get(new CoreAnnotations.NamedEntityTagAnnotation().getClass()));
            }

            nounPhrases.Add(nounPhrase);
        }
        else
        {
            foreach (var child in tree.children())
            {
                _walk(child, nounPhrases);
            }
        }
    }

Hope that helps.
