Java Lucene full-text search only works for labels that exactly match the search string
I'm having some trouble with full-text search using Apache Lucene. I can retrieve names when I type the whole label, e.g. "cat", but typing "c" yields no results. I am using RDF4J. This is the SPARQL query I use:
SELECT DISTINCT ?e2 ?altLabel ?label ?description WHERE
{
  {
    ?e2 search:matches ?match .
    ?match search:query ?string ;
           search:property ?labelIri ;
           search:snippet ?altLabel
  }
  ?e2 ?labelIri ?label .
}
The LuceneSailConnection then translates it to:
Distinct
   Projection
      ProjectionElemList
         ProjectionElem "e2"
         ProjectionElem "label"
         ProjectionElem "description"
      Extension
         ExtensionElem (description)
            Var (name=description)
         Join
            Join
               Join
                  StatementPattern
                     Var (name=e2)
                     Var (name=_const_232d65d1_uri, value=http://www.openrdf.org/contrib/lucenesail#matches, anonymous)
                     Var (name=match)
                  StatementPattern
                     Var (name=match)
                     Var (name=_const_802884e6_uri, value=http://www.openrdf.org/contrib/lucenesail#query, anonymous)
                     Var (name=string)
               StatementPattern
                  Var (name=match)
                  Var (name=_const_f59a94f7_uri, value=http://www.openrdf.org/contrib/lucenesail#property, anonymous)
                  Var (name=labelIri)
            StatementPattern
               Var (name=e2)
               Var (name=labelIri)
               Var (name=label)
This is the code used to index the concepts in the knowledge base and their labels:
@Override
public void indexLocalKb(KnowledgeBase aKb) throws IOException
{
    Analyzer analyzer = new StandardAnalyzer();
    Directory directory = FSDirectory
            .open(new File(luceneIndexDir, aKb.getRepositoryId()).toPath());
    IndexWriter indexWriter = new IndexWriter(directory, new IndexWriterConfig(analyzer));
    try (RepositoryConnection conn = getConnection(aKb)) {
        RepositoryResult<Statement> stmts = RdfUtils
                .getStatementsSparql(conn, null, aKb.getLabelIri(), null,
                        Integer.MAX_VALUE, false, null);
        while (stmts.hasNext()) {
            Statement stmt = stmts.next();
            String id = stmt.getSubject().stringValue();
            String label = stmt.getObject().stringValue();
            String predicate = stmt.getPredicate().stringValue();
            indexEntity(id, label, predicate, indexWriter);
        }
    }
    indexWriter.close();
}

private void indexEntity(String aId, String aLabel, String aPredictate,
        IndexWriter aIndexWriter)
{
    try {
        String FIELD_ID = "id";
        String FIELD_CONTENT = "label";
        Document doc = new Document();
        doc.add(new StringField(FIELD_ID, aId, Field.Store.YES));
        doc.add(new StringField(FIELD_CONTENT, aLabel, Field.Store.YES));
        aIndexWriter.addDocument(doc);
        aIndexWriter.commit();
        log.info("Entity indexed with id [{}] and label [{}], predicate [{}]",
                aId, aLabel, aPredictate);
    }
    catch (IOException e) {
        log.error("Could not index entity with id [{}] and label [{}]", aId, aLabel);
    }
}
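You have to use Lucene query syntax: search for c* instead of c.

You should at least mention which API you are using, since Lucene full-text search is not part of the SPARQL standard. (I'm guessing Sesame or RDF4J.)

I'd say that if you want to search for things starting with c, then according to Lucene query syntax the query has to be c*. See section 5.1.2, Full text search.

@rec That sounds like the correct answer to me. Do you want to post it as an answer?

@rec Thanks, that works.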
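For anyone wiring this into a search-as-you-type box, here is a minimal sketch of how the raw input could be turned into a prefix query string before it is bound to ?string. The class and method names (PrefixQueryExample, toPrefixQuery) are made up for illustration and are not part of Lucene or RDF4J. The escaped character list is an assumption based on the characters the classic Lucene query syntax treats as special, so check it against the Lucene version you actually run.

```java
class PrefixQueryExample {

    // Assumption: characters with special meaning in the classic Lucene query syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\/";

    /**
     * Turns raw user input into a prefix query string, e.g. "c" becomes "c*".
     * Special characters are backslash-escaped so they are matched literally.
     * Returns an empty string when there is nothing to search for.
     */
    public static String toPrefixQuery(String userInput) {
        if (userInput == null) {
            return "";
        }
        String trimmed = userInput.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (char ch : trimmed.toCharArray()) {
            if (SPECIAL.indexOf(ch) >= 0) {
                sb.append('\\'); // escape so the character is not read as an operator
            }
            sb.append(ch);
        }
        // The trailing wildcard is what makes this a prefix search.
        return sb.append('*').toString();
    }

    public static void main(String[] args) {
        System.out.println(toPrefixQuery("c"));   // prints c*
        System.out.println(toPrefixQuery("cat")); // prints cat*
    }
}
```

The resulting string is what you would pass as the value of ?string in the query from the question. Note that with multi-word input only the last word gets the wildcard ("black cat" becomes "black cat*"), which may or may not be what you want.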