Jackrabbit Oak Lucene index and SQL2 query for full-text search in txt and pdf files
I am trying to implement full-text search over file contents using Oak version 1.16.0. I tried to create an index, as described in the Oak documentation, that indexes all properties:
/oak:index/assetType
  - jcr:primaryType = "oak:QueryIndexDefinition"
  - type = "lucene"
  - compatVersion = 2
  - async = "async"
  + indexRules
    - jcr:primaryType = "nt:unstructured"
    + nt:base
      + properties
        - jcr:primaryType = "nt:unstructured"
        + allProps
          - name = ".*"
          - isRegexp = true
          - nodeScopeIndex = true
public static void createIndex(Repository repository) {
    Session session = null;
    try {
        session = repository.login();
        Node root = session.getRootNode();
        Node index = root.getNode("oak:index");
        Node lucineIndex = index.addNode("assetType", "oak:QueryIndexDefinition");
        lucineIndex.setProperty("compatVersion", "2");
        lucineIndex.setProperty("type", "lucene");
        lucineIndex.setProperty("async", "async");
        Node rules = lucineIndex.addNode("indexRules", "nt:unstructured");
        Node base = rules.addNode("nt:base");
        Node properties = base.addNode("properties", "nt:unstructured");
        Node allProps = properties.addNode("allProps");
        allProps.setProperty("jcr:content", "*");
        allProps.setProperty("isRegexp", true);
        allProps.setProperty("nodeScopeIndex", true);
        session.save();
    } catch (LoginException e) {
        e.printStackTrace();
    } catch (RepositoryException e) {
        e.printStackTrace();
    } finally {
        session.logout();
    }
}
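As transcribed here, createIndex does not set quite the same properties as the documented definition: the rule gets a jcr:content property where the definition has name, and compatVersion is written as a string rather than a number. The sketch below is a plain-Java way to spot such differences and does not touch Oak at all. The class IndexDefinitionDiff, its diff method, and the flattening of the index node and the allProps rule into one map are assumptions made for illustration only; whether any of these differences explains the empty result is not something it can tell.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, not part of Oak or JCR.
class IndexDefinitionDiff {

    /** Lists every property that is missing, unexpected, or differs in Java type or value. */
    static List<String> diff(Map<String, Object> documented, Map<String, Object> actual) {
        List<String> problems = new ArrayList<>();
        // Walk the union of both key sets in sorted order.
        TreeSet<String> names = new TreeSet<>(documented.keySet());
        names.addAll(actual.keySet());
        for (String name : names) {
            Object want = documented.get(name);
            Object got = actual.get(name);
            if (got == null) {
                problems.add("missing: " + name);
            } else if (want == null) {
                problems.add("unexpected: " + name);
            } else if (!want.getClass().equals(got.getClass())) {
                problems.add("type differs: " + name);
            } else if (!want.equals(got)) {
                problems.add("value differs: " + name);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Properties from the documented definition (index node and allProps rule, flattened).
        Map<String, Object> documented = new LinkedHashMap<>();
        documented.put("type", "lucene");
        documented.put("compatVersion", 2L);
        documented.put("async", "async");
        documented.put("name", ".*");
        documented.put("isRegexp", true);
        documented.put("nodeScopeIndex", true);

        // Properties as createIndex sets them, per the transcription of the question.
        Map<String, Object> actual = new LinkedHashMap<>();
        actual.put("type", "lucene");
        actual.put("compatVersion", "2");
        actual.put("async", "async");
        actual.put("jcr:content", "*");
        actual.put("isRegexp", true);
        actual.put("nodeScopeIndex", true);

        diff(documented, actual).forEach(System.out::println);
    }
}
```

Running main reports compatVersion as a type difference, jcr:content as unexpected, and name as missing.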
public static void saveFileIfNotExist(byte[] rawFile, String fileName, String folderName, String mimeType, Repository repository) {
    Session session = null;
    try {
        session = repository.login(new SimpleCredentials("admin", "admin".toCharArray()));
        Node root = session.getRootNode();
        Binary binary = session.getValueFactory().createBinary(new ByteArrayInputStream(rawFile));
        if (!root.hasNode(folderName)) {
            System.out.println("No folder");
            Node folder = root.addNode(folderName, "nt:folder");
            Node file = folder.addNode(fileName, "nt:file");
            Node content = file.addNode("jcr:content", "nt:resource");
            content.setProperty("jcr:mimeType", mimeType);
            content.setProperty("jcr:data", binary);
        } else {
            System.out.println("Folder exists");
        }
        session.save();
    }
    catch (RepositoryException e) {
        e.printStackTrace();
    } finally {
        session.logout();
    }
}
File content:
An implementation of the Value interface must override the inherited method
Object.equals(Object) so that, given Value instances V1 and V2,
V1.equals(V2) will return true if.
DocumentNodeStore rdb = new DocumentNodeStore(new RDBDocumentNodeStoreBuilder().setRDBConnection(dataSource));
Repository repo = new Jcr(new Oak(rdb)).with(new OpenSecurityProvider()).createRepository();
createIndex(repo);
byte[] rawFile = readBytes("D:\\file.txt");
saveFileIfNotExist(rawFile, "txt_folder", "text_file", "text/plain", repo);
Session session = null;
try {
    session = repo.login();
    Node root = session.getRootNode();
    Node index = root.getNode("oak:index");
    QueryManager queryManager = session.getWorkspace().getQueryManager();
    Query query = queryManager.createQuery("SELECT * FROM [nt:resource] AS s WHERE CONTAINS(s.*, '*so*') option(traversal warn)", Query.JCR_SQL2);
    QueryResult result = query.execute();
    RowIterator ri = result.getRows();
    while (ri.hasNext()) {
        Row row = ri.nextRow();
        System.out.println("Row: " + row.toString());
    }
} catch (RepositoryException e) {
    e.printStackTrace();
}
finally {
    session.logout();
    ((RepositoryImpl) repo).shutdown();
    rdb.dispose();
}
But nothing is returned, and a warning appears in the log:
2019-10-02 18:27:35,821 [main] WARN QueryImpl - Traversal query (query without index): SELECT * FROM [nt:resource] AS s WHERE CONTAINS(s.*, '*so*') option(traversal warn); consider creating an index
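For reference, the statement shown in the warning can be assembled from its parts, which makes it easier to try other search expressions. This is a minimal sketch in plain Java: the class FullTextQuery and its methods are hypothetical names, the option(traversal warn) suffix is left out, and the only escaping handled is doubling single quotes inside the SQL2 string literal. It does not validate the full-text search expression itself.

```java
// Hypothetical helper, not part of Oak or JCR.
class FullTextQuery {

    /** Wraps a value in single quotes, doubling any embedded single quote. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Builds a node-scope full-text JCR-SQL2 statement for the given node type. */
    static String contains(String nodeType, String searchExpression) {
        return "SELECT * FROM [" + nodeType + "] AS s WHERE CONTAINS(s.*, "
                + quote(searchExpression) + ")";
    }

    public static void main(String[] args) {
        // Prints the same statement as in the question, minus the traversal option.
        System.out.println(contains("nt:resource", "*so*"));
    }
}
```

The resulting string would be passed to queryManager.createQuery(..., Query.JCR_SQL2) as in the question.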
I haven't gone through all of the code snippets carefully, but one thing that seems to be missing is setting up an async indexer (your index definition has async = "async"). I'm just typing this off the top of my head, but do something like
new Jcr(new Oak(rdb)).with(new OpenSecurityProvider()).withAsyncIndexing("async", 5) // 5 is the period, in seconds, at which the async indexer runs
By the way, because it is an async index, you will need to wait a while before results show up in queries. However, even while the results are not showing up yet, the query should still pick up your index.
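Because an async index lags behind the saved content, a test that queries right after session.save() may see nothing yet. One way to account for that is to repeat the query until rows appear or a timeout passes. The sketch below shows only that waiting loop in plain Java; AwaitResults and pollUntilNonEmpty are hypothetical names, and the supplier is a stand-in that, in real use, would execute the JCR query and collect its rows.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of Oak or JCR.
class AwaitResults {

    /**
     * Calls the supplier until it returns a non-empty list or the timeout elapses.
     * Returns the last list obtained, which may still be empty.
     */
    static <T> List<T> pollUntilNonEmpty(Supplier<List<T>> query, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        List<T> rows = query.get();
        while (rows.isEmpty() && System.nanoTime() < deadline) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up early.
                Thread.currentThread().interrupt();
                return rows;
            }
            rows = query.get();
        }
        return rows;
    }

    public static void main(String[] args) {
        // Stand-in for a real query: returns a row only from the third call onward.
        int[] calls = {0};
        Supplier<List<String>> fakeQuery = () -> ++calls[0] < 3
                ? Collections.<String>emptyList()
                : Collections.singletonList("row-1");
        System.out.println(pollUntilNonEmpty(fakeQuery, 5_000, 10));
    }
}
```

With the 5-second indexing period suggested above, a timeout comfortably longer than that period would be a reasonable starting point.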
Thanks. I added the Lucene providers
LuceneIndexProvider provider = new LuceneIndexProvider(); repository = new Jcr(new Oak(rdb)).with(new OpenSecurityProvider()).with(new LuceneIndexEditorProvider()).with((QueryIndexProvider) provider).withAsyncIndexing("async", 5).createRepository();
and I can see in the logs that it tries to build the index. But the query result is still empty, and the warning message is still in the log.