Java: dictionary example based on uimaFIT code

I am taking a look at uimaFIT, and I am having quite a bit of difficulty adding a dictionary-based annotator.

This is my best shot so far:

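import java.io.IOException;

import org.apache.uima.UimaContext;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
// The uimaFIT base class, not org.apache.uima.analysis_component.JCasAnnotator_ImplBase
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.descriptor.ExternalResource;
import org.apache.uima.jcas.JCas;
import org.apache.uima.resource.DataResource;
import org.apache.uima.resource.ResourceInitializationException;
// Dictionary, DictionaryBuilder, HashMapDictionaryBuilder and DictionaryFileParserImpl are
// assumed to come from the Apache UIMA DictionaryAnnotator add-on; Location is the project's own JCas type.

public class LocationAnnotator extends JCasAnnotator_ImplBase {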

    public static final String RES_DICTIONARY = "dictionary";

    @ExternalResource(key = RES_DICTIONARY)
    private DataResource resource;
    private Dictionary dictionary;

    @Override
    public void initialize(UimaContext context) throws ResourceInitializationException {
        super.initialize(context);
        try {
            DictionaryBuilder dictBuilder = new HashMapDictionaryBuilder();
            // create dictionary file parser
            DictionaryFileParserImpl fileParser = new DictionaryFileParserImpl();
            fileParser.parseDictionaryFile(resource.getUri().getPath(), resource.getInputStream(), dictBuilder);
            dictionary = dictBuilder.getDictionary();
        } catch (IOException e) {
            // Pass the cause along so the original I/O error is not lost
            throw new ResourceInitializationException(e);
        }
    }

    @Override
    public void process(JCas cas) throws AnalysisEngineProcessException {
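        String docText = cas.getDocumentText();
        // Resume each search after the previous token so that a repeated word
        // is annotated at its own offset rather than at its first occurrence.
        int searchFrom = 0;
        for (String line : docText.split("\n")) {
            for (String word : line.split(" ")) {
                int pos = docText.indexOf(word, searchFrom);
                searchFrom = pos + word.length();
                if (dictionary.contains(word)) {
                    Location annotation = new Location(cas, pos, pos + word.length());
                    annotation.addToIndexes();
                }
            }
        }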

    }
}
This is how I run the engine:

CollectionReaderDescription reader = CollectionReaderFactory.createReaderDescription(CvReader.class, CvReader.PARAM_INPUT_FILE, "docs/simple-doc.txt");

AnalysisEngineDescription tokenizer = AnalysisEngineFactory.createEngineDescription(LocationAnnotator.class);
ExternalResourceFactory.bindResource(tokenizer, LocationAnnotator.RES_DICTIONARY, "META-INF/dictionaries/location.dict.xml");

for (JCas cas : SimplePipeline.iteratePipeline(reader, tokenizer)) {
    for (Location location : JCasUtil.select(cas, Location.class)) {
        System.out.println("Found location: " + location.getCoveredText());
    }
}
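Isn't there a more elegant way than this? I don't like the initialization; I would rather have the dictionary initialized through an annotation such as @ExternalResource.

If someone could give me a simpler example, I would be very glad. Thanks.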

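uimaFIT annotations such as @ExternalResource only work if a) you inherit from the uimaFIT version of JCasAnnotator_ImplBase, and b) you call super.initialize(…) when you override the initialize(…) method in a derived class.

Thanks! I adjusted b) in the code above. It looks better, but direct instantiation is not possible, I guess?

What do you mean by "direct instantiation"?

I was hoping that an annotation similar to @ExternalResource also existed for the dictionary, so that I would not need to initialize it myself in the initialize(UimaContext context) method.

That is a question of how you implement the external resource. You can implement a smarter external resource, for example by implementing the dictionary itself as an external resource and putting the loading logic inside it. That makes the code of the external resource more complex, but the component code becomes more compact.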
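
To illustrate the idea of putting the loading logic inside the resource, here is a minimal plain-Java sketch. It is not the asker's code and does not use the UIMA API: WordListDictionary, its load(InputStream) method and the one-entry-per-line file format are all assumptions made for illustration (the original code parses a location.dict.xml file with DictionaryFileParserImpl instead).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical "smart" dictionary resource: it owns its loading logic,
 * so a component that uses it only ever calls contains(...).
 * Assumed file format: one dictionary entry per line, UTF-8.
 */
class WordListDictionary {

    private final Set<String> entries = new HashSet<>();

    /** Reads all non-blank lines of the stream as dictionary entries. */
    void load(InputStream in) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String entry = line.trim();
                if (!entry.isEmpty()) {
                    entries.add(entry);
                }
            }
        } catch (IOException e) {
            // Keep the cause; a UIMA resource would wrap it in a
            // ResourceInitializationException instead.
            throw new UncheckedIOException(e);
        }
    }

    /** Exact, case-sensitive lookup of a single word. */
    boolean contains(String word) {
        return entries.contains(word);
    }

    public static void main(String[] args) {
        WordListDictionary dictionary = new WordListDictionary();
        byte[] data = "Berlin\nParis\n".getBytes(StandardCharsets.UTF_8);
        dictionary.load(new ByteArrayInputStream(data));
        System.out.println(dictionary.contains("Paris"));
        System.out.println(dictionary.contains("Rome"));
    }
}
```

In a uimaFIT setup, such a class would presumably implement org.apache.uima.resource.SharedResourceObject and do this work in its load(DataResource) method, reading from the DataResource's input stream. The annotator could then declare the field annotated with @ExternalResource using the dictionary type itself and would no longer need to override initialize(…). How the resource class is bound to the engine description depends on the uimaFIT version, so check ExternalResourceFactory for the matching method.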