How do I use a custom TokensRegex rules annotator with the Stanford CoreNLP server?

The TokensRegex rules color annotator (stanford-corenlp-full-2016-10-31/TokensRegex/color.rules.txt) loads successfully when CoreNLP is run from the command line, but fails on the web server with java.lang.IllegalArgumentException: Unknown annotator: color.

Setup

# custom.properties
annotators=tokenize,ssplit,pos,lemma,ner,regexner,color
customAnnotatorClass.color = edu.stanford.nlp.pipeline.TokensRegexAnnotator
color.rules = tokensregex/color.rules.txt
Command line

$ java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props custom.properties -file ./tokensregex/color.input.txt -outputFormat text
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator color with class edu.stanford.nlp.pipeline.TokensRegexAnnotator
...
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator color
[main] INFO edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor - Reading TokensRegex rules from tokensregex/color.rules.txt
[main] INFO edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor - Read 7 rules

# color.input.txt.output
Sentence #1 (9 tokens):
Both blue and light blue are nice colors.
[Text=Both CharacterOffsetBegin=0 CharacterOffsetEnd=4 PartOfSpeech=CC Lemma=both NamedEntityTag=O]
[Text=blue CharacterOffsetBegin=5 CharacterOffsetEnd=9 PartOfSpeech=JJ Lemma=blue NamedEntityTag=COLOR NormalizedNamedEntityTag=#0000FF]
...
Server

$ java -mx2g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -c custom.properties
$ wget --post-data 'Both blue and light blue are nice colors.' 'localhost:9000/?properties={"annotators":"tokenize,ssplit,pos,lemma,ner,regexner,color","outputFormat":"json"}' -O -

    HTTP request sent, awaiting response... 500 Internal Server Error
        2016-11-05 14:41:27 ERROR 500: Internal Server Error.
    
    java.lang.IllegalArgumentException: Unknown annotator: color
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.ensurePrerequisiteAnnotators(StanfordCoreNLP.java:304)
        at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.getProperties(StanfordCoreNLPServer.java:713)
        at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.handle(StanfordCoreNLPServer.java:540)
        at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
        at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
        at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
        at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
        at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
        at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    
Solution

    Include the custom annotator properties in the request:
    wget --post-data 'Both blue and light blue are nice colors.' 'localhost:9000/?properties={"color.rules":"tokensregex/color.rules.txt","customAnnotatorClass.color":"edu.stanford.nlp.pipeline.TokensRegexAnnotator","annotators":"tokenize,ssplit,pos,lemma,ner,regexner,color","enforceRequirements":"false","outputFormat":"json"}' -O -
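
As a rough illustration of the request above, the sketch below builds the same properties query string in Python with only the standard library. The names build_request_url, annotate, and COLOR_PROPS are made up for this example; the property keys and values are copied from the wget command, and the base URL assumes the server listens on localhost:9000 as in the question. Letting urlencode escape the JSON sidesteps the shell quoting that makes the wget form easy to get wrong.

```python
import json
import urllib.parse
import urllib.request

# Properties copied from the wget request; every value is sent as a string.
# The rules path is assumed to be resolvable by the server process.
COLOR_PROPS = {
    "annotators": "tokenize,ssplit,pos,lemma,ner,regexner,color",
    "customAnnotatorClass.color": "edu.stanford.nlp.pipeline.TokensRegexAnnotator",
    "color.rules": "tokensregex/color.rules.txt",
    "enforceRequirements": "false",
    "outputFormat": "json",
}


def build_request_url(props, base="http://localhost:9000/"):
    """Return `base` with `props` JSON-encoded in the `properties` query parameter."""
    query = urllib.parse.urlencode({"properties": json.dumps(props)})
    return f"{base}?{query}"


def annotate(text, props=COLOR_PROPS, base="http://localhost:9000/"):
    """POST `text` to a running CoreNLP server and return the parsed JSON reply."""
    request = urllib.request.Request(
        build_request_url(props, base),
        data=text.encode("utf-8"),  # supplying a body makes this a POST
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server started as in the question, annotate("Both blue and light blue are nice colors.") would send the same request as the wget command; build_request_url can be checked on its own without a server.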

    Adding "enforceRequirements":"false" gave me a new error, java.lang.IllegalArgumentException: No annotator named color. After some searching, however, I found that the CoreNLP server was not loading the custom annotator properties, so I had to include the color annotator properties in the request. Do ner, regexner, and color all work for you?