Error in Java Lucene text search
I am new to text search and am working through some Lucene examples. I found an example at this link and tried it in the Eclipse IDE, but it gives some errors. I have also imported all the relevant jar files. The code is as follows:
public class InMemoryExample {
public static void main(String[] args) {
// Construct a RAMDirectory to hold the in-memory representation
// of the index.
RAMDirectory idx = new RAMDirectory();
try {
// Make a writer to create the index
IndexWriter writer =
new IndexWriter(idx, new StandardAnalyzer(Version.LUCENE_48),
IndexWriter.MaxFieldLength.LIMITED);
// Add some Document objects containing quotes
writer.addDocument(createDocument("Theodore Roosevelt",
"It behooves every man to remember that the work of the " +
"critic, is of altogether secondary importance, and that, " +
"in the end, progress is accomplished by the man who does " +
"things."));
writer.addDocument(createDocument("Friedrich Hayek",
"The case for individual freedom rests largely on the " +
"recognition of the inevitable and universal ignorance " +
"of all of us concerning a great many of the factors on " +
"which the achievements of our ends and welfare depend."));
writer.addDocument(createDocument("Ayn Rand",
"There is nothing to take a man's freedom away from " +
"him, save other men. To be free, a man must be free " +
"of his brothers."));
writer.addDocument(createDocument("Mohandas Gandhi",
"Freedom is not worth having if it does not connote " +
"freedom to err."));
// Optimize and close the writer to finish building the index
writer.optimize();
writer.close();
// Build an IndexSearcher using the in-memory index
Searcher searcher = new IndexSearcher(idx);
// Run some queries
search(searcher, "freedom");
search(searcher, "free");
search(searcher, "progress or achievements");
searcher.close();
}
catch (IOException ioe) {
// In this example we aren't really doing an I/O, so this
// exception should never actually be thrown.
ioe.printStackTrace();
}
catch (ParseException pe) {
pe.printStackTrace();
}
}
/**
* Make a Document object with an un-indexed title field and an
* indexed content field.
*/
private static Document createDocument(String title, String content) {
Document doc = new Document();
// Add the title as an unindexed field...
doc.add(new Field("title", title, Field.Store.YES, Field.Index.NO));
// ...and the content as an indexed field. Note that indexed
// Text fields are constructed using a Reader. Lucene can read
// and index very large chunks of text, without storing the
// entire content verbatim in the index. In this example we
// can just wrap the content string in a StringReader.
doc.add(new Field("content", content, Field.Store.YES, Field.Index.ANALYZED));
return doc;
}
/**
* Searches for the given string in the "content" field
*/
private static void search(Searcher searcher, String queryString)
throws ParseException, IOException {
// Build a Query object
//Query query = QueryParser.parse(
QueryParser parser = new QueryParser("content", new StandardAnalyzer(Version.LUCENE_48));
Query query = parser.parse(queryString);
int hitsPerPage = 10;
// Search for the query
TopScoreDocCollector collector = TopScoreDocCollector.create(5 * hitsPerPage, false);
searcher.search(query, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
int hitCount = collector.getTotalHits();
System.out.println(hitCount + " total matching documents");
// Examine the Hits object to see if there were any matches
if (hitCount == 0) {
System.out.println(
"No matches were found for \"" + queryString + "\"");
} else {
System.out.println("Hits for \"" +
queryString + "\" were found in quotes by:");
// Iterate over the Documents in the Hits object
for (int i = 0; i < hitCount; i++) {
// Document doc = hits.doc(i);
ScoreDoc scoreDoc = hits[i];
int docId = scoreDoc.doc;
float docScore = scoreDoc.score;
System.out.println("docId: " + docId + "\t" + "docScore: " + docScore);
Document doc = searcher.doc(docId);
// Print the value that we stored in the "title" field. Note
// that this Field was not indexed, but (unlike the
// "contents" field) was stored verbatim and can be
// retrieved.
System.out.println(" " + (i + 1) + ". " + doc.get("title"));
System.out.println("Content: " + doc.get("content"));
}
}
System.out.println();
} }
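Modified code:
public class InMemoryExample {
public static void main(String[] args) throws Exception {
RAMDirectory idx = new RAMDirectory();
IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_48, new StandardAnalyzer(Version.LUCENE_48));
IndexWriter writer = new IndexWriter(idx, cfg);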
// Add some Document objects containing quotes
writer.addDocument(createDocument("Theodore Roosevelt",
"It behooves every man to remember that the work of the " +
"critic, is of altogether secondary importance, and that, " +
"in the end, progress is accomplished by the man who does " +
"things."));
writer.addDocument(createDocument("Friedrich Hayek",
"The case for individual freedom rests largely on the " +
"recognition of the inevitable and universal ignorance " +
"of all of us concerning a great many of the factors on " +
"which the achievements of our ends and welfare depend."));
writer.addDocument(createDocument("Ayn Rand",
"There is nothing to take a man's freedom away from " +
"him, save other men. To be free, a man must be free " +
"of his brothers."));
writer.addDocument(createDocument("Mohandas Gandhi",
"Freedom is not worth having if it does not connote " +
"freedom to err."));
// Commit and close the writer to finish building the index
writer.commit();
writer.close();
// Build an IndexSearcher using the in-memory index
IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(idx));
// Run some queries
search(searcher, "freedom");
search(searcher, "free");
search(searcher, "progress or achievements");
//searcher.close();
}
/**
* Make a Document object with an un-indexed title field and an
* indexed content field.
*/
private static Document createDocument(String title, String content) {
Document doc = new Document();
// Add the title as an unindexed field...
doc.add(new Field("title", title, Field.Store.YES, Field.Index.NO));
// ...and the content as an indexed (analyzed) field. It is also
// stored here, so it can be printed alongside the search results.
doc.add(new Field("content", content, Field.Store.YES, Field.Index.ANALYZED));
return doc;
}
/**
* Searches for the given string in the "content" field
*/
private static void search(IndexSearcher searcher, String queryString)
throws ParseException, IOException {
// Build a Query object
//Query query = QueryParser.parse(
QueryParser parser = new QueryParser("content", new StandardAnalyzer(Version.LUCENE_48));
Query query = parser.parse(queryString);
int hitsPerPage = 10;
// Search for the query
TopScoreDocCollector collector = TopScoreDocCollector.create(5 * hitsPerPage, false);
searcher.search(query, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
int hitCount = collector.getTotalHits();
System.out.println(hitCount + " total matching documents");
// Examine the Hits object to see if there were any matches
if (hitCount == 0) {
System.out.println(
"No matches were found for \"" + queryString + "\"");
} else {
System.out.println("Hits for \"" +
queryString + "\" were found in quotes by:");
// Iterate over the Documents in the Hits object
for (int i = 0; i < hits.length; i++) {
// Document doc = hits.doc(i);
ScoreDoc scoreDoc = hits[i];
int docId = scoreDoc.doc;
float docScore = scoreDoc.score;
System.out.println("docId: " + docId + "\t" + "docScore: " + docScore);
Document doc = searcher.doc(docId);
// Print the value that we stored in the "title" field. Note
// that this Field was not indexed, but (unlike the
// "contents" field) was stored verbatim and can be
// retrieved.
System.out.println(" " + (i + 1) + ". " + doc.get("title"));
System.out.println("Content: " + doc.get("content"));
}
}
System.out.println();
} }
Here is the output:
Exception in thread "main" java.lang.VerifyError: class org.apache.lucene.analysis.SimpleAnalyzer overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(Unknown Source)
at java.security.SecureClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.access$100(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at beehex.inmemeory.textsearch.InMemoryExample.search(InMemoryExample.java:98)
at beehex.inmemeory.textsearch.InMemoryExample.main(InMemoryExample.java:58)
I don't see a three-argument IndexWriter constructor in this version. You should adapt your code to the newer Lucene API, like this:
IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_48, new StandardAnalyzer(Version.LUCENE_48));
IndexWriter writer = new IndexWriter(idx, cfg);
IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(idx));
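Also, I would rather let the main method throw the exception and have the program fail outright than catch the exception here.
EDIT:
2. Remove the optimize call, since the IndexWriter class no longer has that method; I think commit will do the job here.
3. Define the IndexSearcher as follows: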
IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(idx));
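What version of Lucene are you using?
I am using Lucene 4.8.1. I removed the try-catch block and let the method throw the exception. I modified the code as you described, but I am still getting the second, third, and fourth syntax errors mentioned in my post.
Try it now. Also, read the following for future reference:
Uh... you can't close the searcher, so delete that line. Actually, just go to the link I pointed you to, and read some Java tutorials while you make the changes.
I got an error on searcher.close(); so I commented it out and ran the code. My output is in the original post; it gives an error. I have posted the modified code and the output again. I apologize for the inconvenience, and thank you very much for your support. Please help me solve this problem.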
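A note on the VerifyError in the posted output: a message of the form "class X overrides final method" usually suggests that class files from two incompatible versions of a library are on the classpath together. Here that would most likely be an older Lucene jar supplying SimpleAnalyzer next to the 4.8.1 jars. That diagnosis is an assumption, not something confirmed in this thread. The following stdlib-only sketch lists every classpath location that provides a given class file without loading the class, so duplicates become visible. The class name ClasspathCheck and the method findClassFiles are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Lucene): reports where a class file
// can be found on the classpath, without loading or verifying the class.
public class ClasspathCheck {

    // Returns the URL of every classpath entry that contains the class file.
    static List<String> findClassFiles(String className) {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = ClasspathCheck.class.getClassLoader();
        List<String> locations = new ArrayList<>();
        try {
            for (URL url : Collections.list(loader.getResources(resource))) {
                locations.add(url.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return locations;
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace in the question.
        String[] names = {
            "org.apache.lucene.analysis.Analyzer",
            "org.apache.lucene.analysis.SimpleAnalyzer"
        };
        for (String name : names) {
            System.out.println(name + ":");
            for (String location : findClassFiles(name)) {
                System.out.println("  " + location);
            }
        }
    }
}
```

Run it with the same classpath as the Eclipse project. If a class is listed under more than one jar, or the two classes come from jars with different version numbers, removing the older jar from the build path is a reasonable first thing to try.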