How to access the Lucene POS attribute (Japanese Kuromoji analyzer)


lucene


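I'm trying to tokenize Japanese text and extract the part-of-speech (POS) attribute of each token. Kuromoji/Lucene ships with an attribute implementation that should provide the POS data, but I can't extract it: I get a NullPointerException on the pos.getPartOfSpeech() line. The CharTermAttribute prints fine. What am I missing or doing wrong?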

    String content = "こんばんは今日寒かったですね今日、頂いたお菓子があまりにも美味しくて上り羊羹御利益ありそうな、ネーミングぷるんぷるんの、上品な水羊羹です！そして、スイーツもう一品！先日アップしたお友達の干し芋。";
    Analyzer analyzer = new JapaneseAnalyzer();
    TokenStream stream = analyzer.tokenStream("TEXT", content);

    Iterator<AttributeImpl> it = stream.getAttributeImplsIterator();
    while (it.hasNext()) {
        AttributeImpl attr = it.next();
        System.out.println(attr.getClass());
    }

    CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
    PartOfSpeechAttributeImpl pos = stream.getAttribute(PartOfSpeechAttributeImpl.class);
    stream.reset();
    while (stream.incrementToken()) {
        System.out.println("[" + term.toString() + "]: ");
        System.out.println(pos.getPartOfSpeech());
    }
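I also followed the advice in other Stack Overflow posts and called addAttribute() instead of getAttribute() for PartOfSpeechAttributeImpl. That gives me an IllegalArgumentException, even though this AttributeImpl implements a Lucene Attribute.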
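FYI: we currently use Lucene 6.0.0. Indexing and searching Japanese text works fine, because the Kuromoji package is included in the Lucene distribution by default (you only have to select the JapaneseAnalyzer). This tokenization step happens outside of indexing and searching, so it is not tied to a particular field; it serves a different purpose.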

Thanks.
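PartOfSpeechAttributeImpl is the wrong class to pass. It should be PartOfSpeechAttribute. Both addAttribute() and getAttribute() expect the attribute interface, not its implementation class: getAttribute() finds nothing under the Impl class and returns null (hence the NullPointerException), and addAttribute() rejects anything that is not an interface extending Attribute.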

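    // stream, analyzer, result and the Token class come from the surrounding method (not shown).
    CharTermAttribute cattr = stream.addAttribute(CharTermAttribute.class);
    // Request the attribute by its interface, not by the Impl class.
    PartOfSpeechAttribute pattr = stream.addAttribute(PartOfSpeechAttribute.class);
    try {
        stream.reset();
        while (stream.incrementToken()) {
            // The POS tag is a hyphen-separated string; split it into its parts.
            String[] pos = pattr.getPartOfSpeech().split("-");
            Token token = new Token(cattr.toString(), pos);
            result.add(token);
        }
        stream.end();
        stream.close();
    } catch (IOException e) {
        return result;
    } finally {
        analyzer.close();
    }
    return result;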
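The split("-") step above can be tried on its own, without Lucene on the classpath. The following is a minimal sketch, assuming the POS tags are hyphen-separated strings such as 名詞-一般 or 助詞-格助詞-一般; the class name PosTagSplitSketch and its helper methods are hypothetical and not part of Lucene or Kuromoji.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits a hyphen-separated POS tag.
public class PosTagSplitSketch {

    /** Returns the levels of the tag, coarsest first; empty list for null or empty input. */
    public static List<String> levels(String posTag) {
        if (posTag == null || posTag.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(posTag.split("-"));
    }

    /** Returns only the coarsest level (the part before the first hyphen), or "" if there is none. */
    public static String coarse(String posTag) {
        List<String> parts = levels(posTag);
        return parts.isEmpty() ? "" : parts.get(0);
    }

    public static void main(String[] args) {
        // Sample tags are assumptions for illustration, not output captured from the question's text.
        String[] samples = {"名詞-一般", "助詞-格助詞-一般", "動詞-自立"};
        for (String tag : samples) {
            System.out.println(tag + " -> " + levels(tag) + " (coarse: " + coarse(tag) + ")");
        }
    }
}
```

Guarding against null is a defensive choice in this sketch, so that a token without a POS tag does not cause another NullPointerException.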

The IllegalArgumentException reported in the question:

    java.lang.IllegalArgumentException: addAttribute() only accepts an interface that extends Attribute, but org.apache.lucene.analysis.ja.tokenattributes.PartOfSpeechAttributeImpl does not fulfil this contract.
        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:210)
        at ...
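Both failures in the question (the null from getAttribute() and the exception from addAttribute()) are consistent with attributes being stored under their interface class. The sketch below is a toy model of that idea using only the JDK. It is not Lucene's actual code: ToyAttributeSource, PosAttribute and PosAttributeImpl are hypothetical stand-ins, and the real AttributeSource is more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: shows why an Impl class cannot be used as the lookup key.
public class AttributeLookupSketch {

    /** Stand-in for Lucene's Attribute marker interface. */
    public interface Attribute {}

    /** Hypothetical attribute interface, playing the role of PartOfSpeechAttribute. */
    public interface PosAttribute extends Attribute {
        String getPartOfSpeech();
    }

    /** Hypothetical implementation, playing the role of PartOfSpeechAttributeImpl. */
    public static class PosAttributeImpl implements PosAttribute {
        @Override
        public String getPartOfSpeech() {
            return "noun-general"; // placeholder value
        }
    }

    public static class ToyAttributeSource {
        // Instances are stored under the interface class, never under the Impl class.
        private final Map<Class<? extends Attribute>, Attribute> byInterface = new HashMap<>();

        public <T extends Attribute> T addAttribute(Class<T> type) {
            if (!type.isInterface() || !Attribute.class.isAssignableFrom(type)) {
                throw new IllegalArgumentException(
                    "addAttribute() only accepts an interface that extends Attribute, but "
                        + type.getName() + " does not fulfil this contract.");
            }
            Attribute existing = byInterface.get(type);
            if (existing == null) {
                existing = createImpl(type);
                byInterface.put(type, existing);
            }
            return type.cast(existing);
        }

        /** Returns null when nothing is stored under the given key. */
        public <T extends Attribute> T getAttribute(Class<T> type) {
            return type.cast(byInterface.get(type));
        }

        // A real factory would be generic; this toy one knows a single interface.
        private static Attribute createImpl(Class<? extends Attribute> type) {
            if (PosAttribute.class.equals(type)) {
                return new PosAttributeImpl();
            }
            throw new IllegalArgumentException("No implementation known for " + type.getName());
        }
    }

    public static void main(String[] args) {
        ToyAttributeSource source = new ToyAttributeSource();

        // Works: the key is the interface.
        PosAttribute pos = source.addAttribute(PosAttribute.class);
        System.out.println(pos.getPartOfSpeech());

        // Looking up by the Impl class finds nothing, so this prints null.
        System.out.println(source.getAttribute(PosAttributeImpl.class));

        // Adding by the Impl class is rejected outright.
        try {
            source.addAttribute(PosAttributeImpl.class);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the toy model, as in the question, dereferencing the null returned by the Impl-class lookup would raise a NullPointerException, while requesting the attribute through its interface succeeds.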
