Lucene full-text search

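I am using Lucene 3.3.0 with Java. I am facing the following problem and do not know whether there is a solution for it.

I index the following text with StandardAnalyzer: "The boy played so hard to win the game", and then I run a query for "play". Lucene finds hits only when I use the WildcardQuery builder.

The problem is that when I search for "boy game", it finds no hits at all.

Is there a way to make Lucene do something like a context search to solve this problem?

Thanks, Samer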

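1) Query "play": StandardAnalyzer does not perform stemming, so you have to either use a wildcard or supply exactly the term that was indexed. Without stemming, "play" and "played" are completely different terms.
If you want "title:play" to work, you can build your own analyzer by combining the components of StandardAnalyzer (tokenizer, filters) with a stemming filter.
2) Query "boy game": are you sure your PDF is being parsed correctly? Try printing the "args" argument that is passed to lucene().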
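
To make the first point concrete, here is a minimal sketch in plain Java (no Lucene on the classpath) of why an exact-term lookup misses "play" while a stemmed lookup finds it. The class `StemmingSketch`, its methods, and the three-suffix stemmer are hypothetical and deliberately naive; they only imitate the idea of running a stemming step on both the indexed tokens and the query term. In Lucene itself you would presumably chain a real stemming filter such as `PorterStemFilter` into a custom analyzer instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class StemmingSketch {

    // Lowercase and split on non-letters: a rough stand-in for what a
    // tokenizer plus lowercase filter yields for plain English text.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Hypothetical toy stemmer: strips a few common suffixes.
    // A real stemmer (for example Porter's) has many more rules.
    static String stem(String token) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (token.length() > suffix.length() + 2 && token.endsWith(suffix)) {
                return token.substring(0, token.length() - suffix.length());
            }
        }
        return token;
    }

    // Exact-term lookup: the query term must equal an indexed token.
    static boolean matchesExact(String text, String term) {
        return tokenize(text).contains(term.toLowerCase(Locale.ROOT));
    }

    // Stemmed lookup: the same stemmer is applied to tokens and query term.
    static boolean matchesStemmed(String text, String term) {
        String wanted = stem(term.toLowerCase(Locale.ROOT));
        for (String t : tokenize(text)) {
            if (stem(t).equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String doc = "The boy played so hard to win the game";
        System.out.println("exact 'play':   " + matchesExact(doc, "play"));
        System.out.println("stemmed 'play': " + matchesStemmed(doc, "play"));
    }
}
```

The key design point is symmetry: whatever normalization is applied at index time has to be applied to the query as well, which is why the answer recommends using one analyzer for both.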
@Skaffman, I added the code. I use Tika to parse the PDF files that contain the data.

private static void addDoc(IndexWriter w, String value) throws IOException {
    Document doc = new Document();
    doc.add(new Field("title", value, Field.Store.YES, Field.Index.ANALYZED));
    w.addDocument(doc);
}

@SuppressWarnings("deprecation")
public static void lucene(String args, String query) throws IOException, ParseException {
    // 0. Specify the analyzer for tokenizing text.
    // The same analyzer should be used for indexing and searching
    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);

    // 1. create the index
    Directory index = new RAMDirectory();

    // the boolean arg in the IndexWriter ctor means to
    // create a new index, overwriting any existing index
    IndexWriter w = new IndexWriter(index, analyzer, true,
            IndexWriter.MaxFieldLength.UNLIMITED);
    String[] splitOnLinefeed = args.split("\n");
    for (int i = 0; i < splitOnLinefeed.length; i++) {
        addDoc(w, splitOnLinefeed[i]);
    }
    w.close();

    // 2. query
    // appending "*" turns only the last term of the query into a prefix query
    String querystr = query + "*";

    // the "title" arg specifies the default field to use
    // when no field is explicitly specified in the query.
    Query q = new QueryParser(Version.LUCENE_CURRENT, "title", analyzer)
            .parse(querystr);

    // 3. search
    IndexSearcher searcher = new IndexSearcher(index, true);
    ScoreDoc[] hits = searcher.search(q, 100).scoreDocs;

    // 4. display results
    System.out.println("Found " + hits.length + " hit(s).");
    for (int i = 0; i < hits.length; ++i) {
        int docId = hits[i].doc;
        Document d = searcher.doc(docId);
        System.out.println((i + 1) + ". " + d.get("title"));
    }

    // searcher can only be closed when there
    // is no need to access the documents any more.
    searcher.close();
}

public static void main(String[] args) throws Exception {
    // parse(...) is the asker's Tika-based PDF text extraction helper (not shown)
    lucene(parse("Test.pdf"), "boy game");
}
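
Following the answer's second suggestion, one way to check the extracted text is to dump it line by line before indexing, since lucene() splits on "\n" and indexes every line as its own document. The sketch below is plain Java; the class `ExtractedTextCheck`, the method `linesContaining`, and the sample string are hypothetical stand-ins, because the real output of the Tika-based parse(...) helper is not shown in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class ExtractedTextCheck {

    // Returns the 1-based numbers of the lines that contain the term
    // (case-insensitive substring match), splitting on "\n" as lucene() does.
    static List<Integer> linesContaining(String extracted, String term) {
        List<Integer> result = new ArrayList<>();
        String needle = term.toLowerCase(Locale.ROOT);
        String[] lines = extracted.split("\n");
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].toLowerCase(Locale.ROOT).contains(needle)) {
                result.add(i + 1);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the text the PDF extraction returns; the real content is unknown.
        String extracted = "The boy played so hard\nto win the game\n";

        // Print each future document in brackets so stray whitespace is visible.
        String[] lines = extracted.split("\n");
        for (int i = 0; i < lines.length; i++) {
            System.out.println((i + 1) + ": [" + lines[i] + "]");
        }
        System.out.println("boy  -> lines " + linesContaining(extracted, "boy"));
        System.out.println("game -> lines " + linesContaining(extracted, "game"));
    }
}
```

If neither term shows up in any line of the real extracted text, the missing hits point to the PDF extraction step rather than to the query.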