Error in Java Lucene text search

java lucene

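I am new to text search and am working through some Lucene examples. I found an example at this link and tried it in the Eclipse IDE, but it gives some errors. I have also imported all the related jar files.

The code is as follows:

public class InMemoryExample {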

public static void main(String[] args) {
      // Construct a RAMDirectory to hold the in-memory representation
      // of the index.
       RAMDirectory idx = new RAMDirectory();

      try {
         // Make an writer to create the index
         IndexWriter writer =
                 new IndexWriter(idx, new StandardAnalyzer(Version.LUCENE_48),
                         IndexWriter.MaxFieldLength.LIMITED);

         // Add some Document objects containing quotes
         writer.addDocument(createDocument("Theodore Roosevelt",
                 "It behooves every man to remember that the work of the " +
                         "critic, is of altogether secondary importance, and that, " +
                         "in the end, progress is accomplished by the man who does " +
                         "things."));
         writer.addDocument(createDocument("Friedrich Hayek",
                 "The case for individual freedom rests largely on the " +
                         "recognition of the inevitable and universal ignorance " +
                         "of all of us concerning a great many of the factors on " +
                         "which the achievements of our ends and welfare depend."));
         writer.addDocument(createDocument("Ayn Rand",
                 "There is nothing to take a man's freedom away from " +
                         "him, save other men. To be free, a man must be free " +
                         "of his brothers."));
         writer.addDocument(createDocument("Mohandas Gandhi",
                 "Freedom is not worth having if it does not connote " +
                         "freedom to err."));

         // Optimize and close the writer to finish building the index
         writer.optimize();
         writer.close();

         // Build an IndexSearcher using the in-memory index
         Searcher searcher = new IndexSearcher(idx);

         // Run some queries
         search(searcher, "freedom");
         search(searcher, "free");
         search(searcher, "progress or achievements");

         searcher.close();
      }
      catch (IOException ioe) {
         // In this example we aren't really doing an I/O, so this
         // exception should never actually be thrown.
         ioe.printStackTrace();
      }
      catch (ParseException pe) {
         pe.printStackTrace();
      }
   }

   /**
    * Make a Document object with an un-indexed title field and an
    * indexed content field.
    */
   private static Document createDocument(String title, String content) {
      Document doc = new Document();

      // Add the title as an unindexed field...

      doc.add(new Field("title", title, Field.Store.YES, Field.Index.NO));


      // ...and the content as an indexed field. Note that indexed
      // Text fields are constructed using a Reader. Lucene can read
      // and index very large chunks of text, without storing the
      // entire content verbatim in the index. In this example we
      // can just wrap the content string in a StringReader.
      doc.add(new Field("content", content, Field.Store.YES, Field.Index.ANALYZED));

      return doc;
   }

   /**
    * Searches for the given string in the "content" field
    */
   private static void search(Searcher searcher, String queryString)
           throws ParseException, IOException {

      // Build a Query object
      //Query query = QueryParser.parse(
      QueryParser parser = new QueryParser("content", new StandardAnalyzer(Version.LUCENE_48));
      Query query = parser.parse(queryString);


      int hitsPerPage = 10;
      // Search for the query
      TopScoreDocCollector collector = TopScoreDocCollector.create(5 * hitsPerPage, false);
      searcher.search(query, collector);

      ScoreDoc[] hits = collector.topDocs().scoreDocs;

      int hitCount = collector.getTotalHits();
      System.out.println(hitCount + " total matching documents");

      // Examine the Hits object to see if there were any matches

      if (hitCount == 0) {
         System.out.println(
                 "No matches were found for \"" + queryString + "\"");
      } else {
         System.out.println("Hits for \"" +
                 queryString + "\" were found in quotes by:");

         // Iterate over the Documents in the Hits object
         for (int i = 0; i < hitCount; i++) {
            // Document doc = hits.doc(i);
            ScoreDoc scoreDoc = hits[i];
            int docId = scoreDoc.doc;
            float docScore = scoreDoc.score;
            System.out.println("docId: " + docId + "\t" + "docScore: " + docScore);

            Document doc = searcher.doc(docId);

            // Print the value that we stored in the "title" field. Note
            // that this Field was not indexed, but (unlike the
            // "contents" field) was stored verbatim and can be
            // retrieved.
            System.out.println("  " + (i + 1) + ". " + doc.get("title"));
            System.out.println("Content: " + doc.get("content"));            
         }
      }
      System.out.println();
   } }

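Modified code, after applying the suggested changes:

public class InMemoryExample {

 public static void main(String[] args) throws Exception {
       RAMDirectory idx = new RAMDirectory();

       IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_48, new StandardAnalyzer(Version.LUCENE_48));
       IndexWriter writer = new IndexWriter(idx, cfg);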

       // Add some Document objects containing quotes
       writer.addDocument(createDocument("Theodore Roosevelt",
               "It behooves every man to remember that the work of the " +
                       "critic, is of altogether secondary importance, and that, " +
                       "in the end, progress is accomplished by the man who does " +
                       "things."));
       writer.addDocument(createDocument("Friedrich Hayek",
               "The case for individual freedom rests largely on the " +
                       "recognition of the inevitable and universal ignorance " +
                       "of all of us concerning a great many of the factors on " +
                       "which the achievements of our ends and welfare depend."));
       writer.addDocument(createDocument("Ayn Rand",
               "There is nothing to take a man's freedom away from " +
                       "him, save other men. To be free, a man must be free " +
                       "of his brothers."));
       writer.addDocument(createDocument("Mohandas Gandhi",
               "Freedom is not worth having if it does not connote " +
                       "freedom to err."));

       // Commit and close the writer to finish building the index
       writer.commit();
       writer.close();

       // Build an IndexSearcher using the in-memory index
       IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(idx));

       // Run some queries
       search(searcher, "freedom");
       search(searcher, "free");
       search(searcher, "progress or achievements");

       //searcher.close();



 }

 /**
  * Make a Document object with an un-indexed title field and an
  * indexed content field.
  */
 private static Document createDocument(String title, String content) {
    Document doc = new Document();

    // Add the title as an unindexed field...

    doc.add(new Field("title", title, Field.Store.YES, Field.Index.NO));


    // ...and the content as an indexed field. Note that indexed
    // Text fields are constructed using a Reader. Lucene can read
    // and index very large chunks of text, without storing the
    // entire content verbatim in the index. In this example we
    // can just wrap the content string in a StringReader.
    doc.add(new Field("content", content, Field.Store.YES, Field.Index.ANALYZED));

    return doc;
 }

 /**
  * Searches for the given string in the "content" field
  */
 private static void search(IndexSearcher searcher, String queryString)
         throws ParseException, IOException {

    // Build a Query object
    //Query query = QueryParser.parse(
    QueryParser parser = new QueryParser("content", new StandardAnalyzer(Version.LUCENE_48));
    Query query = parser.parse(queryString);


    int hitsPerPage = 10;
    // Search for the query
    TopScoreDocCollector collector = TopScoreDocCollector.create(5 * hitsPerPage, false);
    searcher.search(query, collector);

    ScoreDoc[] hits = collector.topDocs().scoreDocs;

    int hitCount = collector.getTotalHits();
    System.out.println(hitCount + " total matching documents");

    // Examine the Hits object to see if there were any matches

    if (hitCount == 0) {
       System.out.println(
               "No matches were found for \"" + queryString + "\"");
    } else {
       System.out.println("Hits for \"" +
               queryString + "\" were found in quotes by:");

       // Iterate over the Documents in the Hits object
       for (int i = 0; i < hitCount; i++) {
          // Document doc = hits.doc(i);
          ScoreDoc scoreDoc = hits[i];
          int docId = scoreDoc.doc;
          float docScore = scoreDoc.score;
          System.out.println("docId: " + docId + "\t" + "docScore: " + docScore);

          Document doc = searcher.doc(docId);

          // Print the value that we stored in the "title" field. Note
          // that this Field was not indexed, but (unlike the
          // "contents" field) was stored verbatim and can be
          // retrieved.
          System.out.println("  " + (i + 1) + ". " + doc.get("title"));
          System.out.println("Content: " + doc.get("content"));            
       }
    }
    System.out.println();
 } }

This is the output:

Exception in thread "main" java.lang.VerifyError: class org.apache.lucene.analysis.SimpleAnalyzer overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(Unknown Source)
	at java.security.SecureClassLoader.defineClass(Unknown Source)
	at java.net.URLClassLoader.defineClass(Unknown Source)
	at java.net.URLClassLoader.access$100(Unknown Source)
	at java.net.URLClassLoader$1.run(Unknown Source)
	at java.net.URLClassLoader$1.run(Unknown Source)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(Unknown Source)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	at beehex.inmemeory.textsearch.InMemoryExample.search(InMemoryExample.java:98)
	at beehex.inmemeory.textsearch.InMemoryExample.main(InMemoryExample.java:58)

I don't see a third parameter on the IndexWriter constructor. You should adapt your code to the new Lucene API, like this:

IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_48, new StandardAnalyzer(Version.LUCENE_48));
IndexWriter writer = new IndexWriter(idx, cfg);

IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(idx));

Also, I would rather let my main method throw the exception, so the program fails outright, than catch exceptions here.
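To illustrate the difference between the two styles, here is a minimal sketch using only the JDK. The class FailFastDemo and its methods are hypothetical names made up for this illustration, and doWork simply simulates a failing indexing step; it is not Lucene code.

```java
import java.io.IOException;

public class FailFastDemo {

    /** Hypothetical stand-in for the indexing/search work; always fails for the demo. */
    static void doWork() throws IOException {
        throw new IOException("simulated index failure");
    }

    /** Catch-and-print style: the stack trace is printed, then execution carries on. */
    static void runSwallowing() {
        try {
            doWork();
        } catch (IOException e) {
            e.printStackTrace();
        }
        // Code placed here would still run, even though doWork() failed.
    }

    /** Propagating style: the failure reaches the caller (ultimately main). */
    static void runPropagating() throws IOException {
        doWork();
    }

    public static void main(String[] args) throws Exception {
        // An exception that escapes main ends the program with a stack trace.
        runPropagating();
    }
}
```

With the propagating style, the first failure stops the program at the point where it happened, which tends to make the real cause easier to spot while learning an API.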

Edit:

2. Remove the optimize call, since the IndexWriter class no longer has that method; I think commit will do the job here.

3. Define the IndexSearcher as follows:

IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(idx));

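What version of Lucene are you using?

I am using Lucene 4.8.1.

I removed the try-catch block and threw an exception. I modified the code as you suggested, but I still get the second, third, and fourth syntax errors mentioned in my post.

Try it now. Also, please read the following for future reference:

Uh... you can't close the searcher, so remove that line. Actually, just go to the link I pointed you to... and read some Java tutorials while you make the changes.

There was an error at searcher.close(); so I commented it out and ran the code. My output is in the original post. It gives an error. I have posted the modified code and output again. I apologize for the inconvenience. Thank you very much for your support. Please help me solve this problem.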
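A note on the VerifyError in the posted output: it names org.apache.lucene.analysis.SimpleAnalyzer, and as far as I know the 4.x jars do not ship a class under that name (SimpleAnalyzer lives in org.apache.lucene.analysis.core there). One possible reading, which is an assumption and not something confirmed in this thread, is that a jar from an older Lucene release is on the build path next to the 4.8.1 jars. A minimal JDK-only sketch for checking where each class file comes from, without loading the class; ClasspathCheck and locate are hypothetical names:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class ClasspathCheck {

    /**
     * Returns every classpath location that provides the given class file.
     * Only the .class resource is looked up; the class itself is not loaded,
     * so a class that fails verification can still be located.
     */
    static List<URL> locate(String className) {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = ClasspathCheck.class.getClassLoader();
        try {
            return Collections.list(loader.getResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.lucene.analysis.SimpleAnalyzer", // class named in the VerifyError
            "org.apache.lucene.analysis.Analyzer",
            "org.apache.lucene.index.IndexWriter"
        };
        for (String name : names) {
            // More than one URL for a name means duplicate copies on the classpath.
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Run it with the same build path as InMemoryExample. If the first name resolves to any jar, or another name resolves to more than one location, removing the older jar from the Eclipse build path would be the next thing to try.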