Java: loading a Lucene index previously written to HDFS into a RAMDirectory

Here is the error message:
Exception in thread "main" org.apache.lucene.index.IndexNotFoundException: no segments* file found in RAMDirectory@1cff1d4a lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@2ddf0c3: files: [/prod/hdfs/LUCENE/index/140601/_0.cfe, /prod/hdfs/LUCENE/index/140601/segments_2, /prod/hdfs/LUCENE/index/140601/_0.si, /prod/hdfs/LUCENE/index/140601/segments.gen, /prod/hdfs/LUCENE/index/140601/_0.cfs]
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:801)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)
I have committed and closed the IndexWriter properly.

Here is the searcher code:
public class SearchFiles {
    private SearchFiles() {}

    public static void main(String[] args) throws Exception {
        String filenm = "";
        // Creating a FileSystem object, to be able to work with HDFS
        Configuration config = new Configuration();
        config.set("fs.defaultFS", "hdfs://127.0.0.1:9000/");
        config.addResource(new Path("/usr/local/Cellar/hadoop/2.4.0/libexec/etc/hadoop/core-site.xml"));
        FileSystem dfs = FileSystem.get(config);
        FileStatus[] status = dfs.listStatus(new Path("/prod/hdfs/LUCENE/index/140601"));
        // Creating a RAMDirectory (memory) object, to be able to create the index in memory.
        RAMDirectory rdir = new RAMDirectory();
        // Getting the list of index files present in the directory into an array.
        FSDataInputStream filereader = null;
        for (int i = 0; i < status.length; i++) {
            // Reading data from the index files in the HDFS directory into the filereader object.
            filereader = dfs.open(status[i].getPath());
            int size = filereader.available();
            // Reading data from the file into a byte array.
            byte[] bytarr = new byte[size];
            filereader.read(bytarr, 0, size);
            // Creating a file in the RAM directory with the same name as the
            // index file present in the HDFS directory.
            filenm = new String(status[i].getPath().toString());
            String sSplitValue = filenm.substring(21, filenm.length());
            System.out.println(sSplitValue);
            IndexOutput indxout = rdir.createOutput(sSplitValue, null);
            // Writing data from the byte array to the file in the RAM directory
            indxout.writeBytes(bytarr, bytarr.length);
            indxout.flush();
            indxout.close();
        }
        filereader.close();
        // IndexReader indexReader = IndexReader.open(rdir);
        IndexReader indexReader = DirectoryReader.open(rdir);
        IndexSearcher searcher = new IndexSearcher(indexReader);
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
        QueryParser parser = new QueryParser(Version.LUCENE_47, "FUNDG_SRCE_CD", analyzer);
        Query query = parser.parse("D");
        TopDocs results = searcher.search(query, 1000);
        int numTotalHits = results.totalHits;
        TopDocs topDocs = searcher.search(query, 1000);
        ScoreDoc[] hits = topDocs.scoreDocs;
        // Printing the number of documents or entries that match the search query.
        System.out.println("Total Hits = " + numTotalHits);
        for (int j = 0; j < hits.length; j++) {
            int docId = hits[j].doc;
            Document d = searcher.doc(docId);
            System.out.println(d.get("FUNDG_SRCE_CD") + " " + d.get("ACCT_NUM"));
        }
    }
}
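A side note on the copy loop above: `available()` plus a single `read()` call is not guaranteed to return the whole file. A minimal sketch of a read-to-end loop, using plain `java.io` streams here as a stand-in for the HDFS `FSDataInputStream`:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadFully {
    // Reads every byte from the stream, regardless of how many bytes
    // each individual read() call happens to return.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100_000];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        byte[] copy = readFully(new ByteArrayInputStream(data));
        System.out.println(copy.length); // 100000
    }
}
```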
I don't think you should pass null as the IOContext argument to createOutput. Try IOContext.DEFAULT instead. I honestly don't know whether that will fix it, but it may be a step in the right direction.

Why not keep it simple? You can copy the index using the appropriate RAMDirectory constructor:
public static void main(String[] args) throws Exception {
    Directory fsDirectory = FSDirectory.open(new File("/prod/hdfs/LUCENE/index/140601"));
    Directory rdir = new RAMDirectory(fsDirectory, IOContext.DEFAULT);
    IndexReader indexReader = DirectoryReader.open(rdir);
    // etc.
}
Thanks for your help, femtoRgon. I tried passing an IOContext argument to createOutput; it didn't help :(. I'll try copying the index with the constructor. Did you spot anything else that could cause this? Thanks again for your help.

I somewhat doubt there is another problem, and frankly I didn't put much thought into it. Just open a RAMDirectory over the old directory, which replaces everything before `IndexReader indexReader = DirectoryReader.open(rdir);`, and defer the job of writing index files to Lucene, where it belongs. Honestly, digging into it any further seems like a waste of time.

I solved the problem. The issue was `String sSplitValue = filenm.substring(21, filenm.length());`.
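For reference, a hard-coded `substring(21, ...)` offset only yields a bare file name when the path prefix happens to be exactly 21 characters long; with Hadoop's API, `status[i].getPath().getName()` returns the final path component directly. A minimal sketch of the same idea using only the JDK (the HDFS-style paths below are just illustrations):

```java
public class LastComponent {
    // Returns everything after the last '/', i.e. the bare file name,
    // instead of relying on a hard-coded character offset.
    static String lastComponent(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        System.out.println(lastComponent("/prod/hdfs/LUCENE/index/140601/_0.cfe")); // _0.cfe
        System.out.println(lastComponent("hdfs://127.0.0.1:9000/prod/hdfs/LUCENE/index/140601/segments_2")); // segments_2
    }
}
```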