Hadoop: new entries not available after programmatically creating HFiles and loading them into HBase

Tags: hadoop, hbase, bulk-load

I'm trying to create HFiles programmatically and load them into a running HBase instance. I based my code on HFileOutputFormat and LoadIncrementalHFiles.

I managed to create the new HFile and ship it to the cluster. The new store file shows up in the cluster web UI, but the new key range is not available:

// Read the sample data and build one period-delimited row key per line.
InputStream stream = ProgrammaticHFileGeneration.class.getResourceAsStream("ga-hourly.txt");
BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
String line = null;

Map<byte[], String> rowValues = new HashMap<byte[], String>();

while((line = reader.readLine()) != null) {
    String[] vals = line.split(",");
    String row = new StringBuilder(vals[0]).append(".").append(vals[1])
        .append(".").append(vals[2]).append(".").append(vals[3]).toString();
    rowValues.put(row.getBytes(), line);
}
reader.close();

// HFile entries must be appended in sorted key order.
// byteArrComparator is a Comparator<byte[]> defined elsewhere.
List<byte[]> keys = new ArrayList<byte[]>(rowValues.keySet());
Collections.sort(keys, byteArrComparator);


// Spin up an in-process mini cluster and create the target table
// with a single column family, "data".
HBaseTestingUtility testingUtility = new HBaseTestingUtility();
testingUtility.startMiniCluster();

testingUtility.createTable("table".getBytes(), "data".getBytes());

// Write the HFile under a directory named after the column family,
// which is the layout LoadIncrementalHFiles expects.
HFile.Writer writer = new HFile.Writer(testingUtility.getTestFileSystem(),
    new Path("/tmp/hfiles/data/hfile"),
    HFile.DEFAULT_BLOCKSIZE, Compression.Algorithm.NONE, KeyValue.KEY_COMPARATOR);

// Append one KeyValue per row, in sorted order.
for(byte[] key : keys) {
    writer.append(new KeyValue(key, "data".getBytes(), "d".getBytes(), rowValues.get(key).getBytes()));
}

// Record the bulk-load time and flag the file as major-compacted.
writer.appendFileInfo(StoreFile.BULKLOAD_TIME_KEY, Bytes.toBytes(System.currentTimeMillis()));
writer.appendFileInfo(StoreFile.MAJOR_COMPACTION_KEY, Bytes.toBytes(true));
writer.close();

Configuration conf = testingUtility.getConfiguration();

// Point the bulk-load tool at the parent directory; it discovers the
// per-family subdirectories underneath it.
LoadIncrementalHFiles loadTool = new LoadIncrementalHFiles(conf);
HTable hTable = new HTable(conf, "table".getBytes());

loadTool.doBulkLoad(new Path("/tmp/hfiles"), hTable);

// Scan the family back to check whether the bulk-loaded rows are visible.
ResultScanner scanner = hTable.getScanner("data".getBytes());
Result next = null;
System.out.println("Scanning");
while((next = scanner.next()) != null) {
    System.out.format("%s %s\n", new String(next.getRow()),
        new String(next.getValue("data".getBytes(), "d".getBytes())));
}

Has anyone actually gotten this to work? I have a compilable/testable version available.

Take a look at the tests for LoadIncrementalHFiles in the HBase source code:
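From memory, the bulk-load test there creates its HFiles with a small helper roughly like the sketch below. Treat this as a minimal sketch, not the exact test code: the class name, row-key scheme, and row count are my own placeholders. The two details worth comparing against the code above are that the HFile is written under a subdirectory named after the column family, and that each KeyValue is given an explicit timestamp rather than the default LATEST_TIMESTAMP.

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.hfile.Compression;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.regionserver.StoreFile;
import org.apache.hadoop.hbase.util.Bytes;

public class BulkLoadSketch {

    // Write a small HFile the way the bulk-load tests do. The path should
    // point inside a directory named after the column family, e.g.
    // /tmp/hfiles/data/hfile for family "data". (Sketch only; names and
    // constants are placeholders, not the actual test code.)
    static void createHFile(FileSystem fs, Path path, byte[] family,
            byte[] qualifier, int numRows) throws IOException {
        HFile.Writer writer = new HFile.Writer(fs, path,
            HFile.DEFAULT_BLOCKSIZE, Compression.Algorithm.NONE,
            KeyValue.KEY_COMPARATOR);
        long now = System.currentTimeMillis();
        try {
            for (int i = 0; i < numRows; i++) {
                // Zero-padded keys so lexicographic order matches insert order.
                byte[] key = Bytes.toBytes(String.format("row%08d", i));
                // Explicit timestamp on every KeyValue.
                writer.append(new KeyValue(key, family, qualifier, now, key));
            }
        } finally {
            // Record the bulk-load time in the file info, as the tests do.
            writer.appendFileInfo(StoreFile.BULKLOAD_TIME_KEY, Bytes.toBytes(now));
            writer.close();
        }
    }
}

If your version ships the HFile command-line tool, dumping the metadata of both files (hbase org.apache.hadoop.hbase.io.hfile.HFile -m -f <path>) is a quick way to see what differs between a file produced this way and yours.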