Java: dictionary example based on uimaFIT code
I am having a look at […] and am having quite some difficulty adding a […]. This is my best shot so far:
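import java.io.IOException;

import org.apache.uima.UimaContext;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
// The dictionary classes are assumed to come from the UIMA DictionaryAnnotator add-on.
import org.apache.uima.annotator.dict_annot.dictionary.Dictionary;
import org.apache.uima.annotator.dict_annot.dictionary.DictionaryBuilder;
import org.apache.uima.annotator.dict_annot.dictionary.impl.DictionaryFileParserImpl;
import org.apache.uima.annotator.dict_annot.dictionary.impl.HashMapDictionaryBuilder;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.descriptor.ExternalResource;
import org.apache.uima.jcas.JCas;
import org.apache.uima.resource.DataResource;
import org.apache.uima.resource.ResourceInitializationException;

public class LocationAnnotator extends JCasAnnotator_ImplBase {

    public static final String RES_DICTIONARY = "dictionary";

    // Injected by uimaFIT when super.initialize(context) runs.
    @ExternalResource(key = RES_DICTIONARY)
    private DataResource resource;

    private Dictionary dictionary;

    @Override
    public void initialize(UimaContext context) throws ResourceInitializationException {
        super.initialize(context);
        try {
            DictionaryBuilder dictBuilder = new HashMapDictionaryBuilder();
            // create dictionary file parser
            DictionaryFileParserImpl fileParser = new DictionaryFileParserImpl();
            fileParser.parseDictionaryFile(resource.getUri().getPath(), resource.getInputStream(), dictBuilder);
            dictionary = dictBuilder.getDictionary();
        } catch (IOException e) {
            // Keep the original cause instead of throwing an empty exception.
            throw new ResourceInitializationException(e);
        }
    }

    @Override
    public void process(JCas cas) throws AnalysisEngineProcessException {
        String docText = cas.getDocumentText();
        // Search from the end of the previous token, so that a repeated word
        // is annotated at its own offset and not at its first occurrence.
        int searchFrom = 0;
        for (String line : docText.split("\n")) {
            for (String word : line.split(" ")) {
                int pos = docText.indexOf(word, searchFrom);
                searchFrom = pos + word.length();
                if (dictionary.contains(word)) {
                    Location annotation = new Location(cas, pos, pos + word.length());
                    annotation.addToIndexes();
                }
            }
        }
    }
}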
This is how I run the engine:
CollectionReaderDescription reader = CollectionReaderFactory.createReaderDescription(
        CvReader.class, CvReader.PARAM_INPUT_FILE, "docs/simple-doc.txt");

AnalysisEngineDescription tokenizer = AnalysisEngineFactory.createEngineDescription(LocationAnnotator.class);
ExternalResourceFactory.bindResource(
        tokenizer, LocationAnnotator.RES_DICTIONARY, "META-INF/dictionaries/location.dict.xml");

for (JCas cas : SimplePipeline.iteratePipeline(reader, tokenizer)) {
    for (Location location : JCasUtil.select(cas, Location.class)) {
        System.out.println("Found location: " + location.getCoveredText());
    }
}
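Is there no more elegant way to do this? I do not like the initialization. I would rather initialize the Dictionary through an annotation such as @ExternalResource.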
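I would be glad if someone could give me a simpler example. Thanks.

uimaFIT annotations such as @ExternalResource only work if a) you inherit from the uimaFIT version of JCasAnnotator_ImplBase, and b) you call super.initialize(…) when overriding the initialize(…) method in your derived class.

Thanks! I adjusted b) in the code above. It looks better, but direct instantiation is not possible, I guess?

What do you mean by "direct instantiation"?

I was hoping that something similar to the @ExternalResource annotation also existed for the dictionary, so that I would not need to initialize it myself in the initialize(UimaContext context) method.

This is a question of how the external resource is implemented. You can implement smarter external resources, for example by implementing the dictionary itself as an external resource and putting the loading logic inside it. That makes the code of the external resource more complex, but the component code becomes more compact.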
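
The trade-off described in the answer can be tried out without UIMA on the classpath. The following is a minimal plain-Java sketch, not the uimaFIT API: WordListResource, LocationMatcher and DictionaryResourceDemo are hypothetical names, and a one-entry-per-line word list stands in for the XML dictionary file from the question. The loading logic lives entirely in the resource class, so the component only queries it.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical resource that owns the dictionary loading logic.
 * Assumption: in a uimaFIT project this class would also implement
 * org.apache.uima.resource.SharedResourceObject, and its
 * load(DataResource) method would pass the resource's input stream here.
 */
class WordListResource {
    private final Set<String> words = new HashSet<>();

    /** Reads one entry per line (assumed format); blank lines are skipped. */
    void load(InputStream in) {
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String entry = line.trim();
                if (!entry.isEmpty()) {
                    words.add(entry);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean contains(String word) {
        return words.contains(word);
    }
}

/** Hypothetical stand-in for the annotator: it no longer parses anything itself. */
class LocationMatcher {
    // In the annotator this would be the field carrying @ExternalResource.
    private final WordListResource dictionary;

    LocationMatcher(WordListResource dictionary) {
        this.dictionary = dictionary;
    }

    /** Returns {begin, end} offsets of each whitespace-separated token found in the dictionary. */
    List<int[]> findLocations(String text) {
        List<int[]> spans = new ArrayList<>();
        Matcher tokens = Pattern.compile("\\S+").matcher(text);
        while (tokens.find()) {
            if (dictionary.contains(tokens.group())) {
                spans.add(new int[] {tokens.start(), tokens.end()});
            }
        }
        return spans;
    }
}

class DictionaryResourceDemo {
    public static void main(String[] args) {
        WordListResource dictionary = new WordListResource();
        dictionary.load(new ByteArrayInputStream(
                "Berlin\nParis\n".getBytes(StandardCharsets.UTF_8)));

        String text = "Berlin and Paris\nthen Berlin";
        for (int[] span : new LocationMatcher(dictionary).findLocations(text)) {
            System.out.println("Found location: " + text.substring(span[0], span[1])
                    + " at [" + span[0] + ", " + span[1] + ")");
        }
    }
}
```

Because the matcher reports the offsets of each token it visits, a word that occurs twice yields two distinct spans. In an actual uimaFIT setup, the annotator field would presumably be declared with the resource class as its type instead of DataResource, and the binding in the pipeline code would name that class alongside the dictionary URL. Treat that wiring as an assumption to check against the uimaFIT documentation.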