Hadoop: new entries not available after programmatically creating HFiles and loading them into HBase

Tags: hadoop, hbase, bulk-load

I'm trying to create HFiles programmatically and load them into a running HBase instance. I based my code on HFileOutputFormat and LoadIncrementalHFiles.

I managed to create the new HFile and ship it to the cluster. The new store file shows up in the cluster web UI, but the new key range is not available:

// Read the sample data and build one period-delimited row key per line.
InputStream stream = ProgrammaticHFileGeneration.class.getResourceAsStream("ga-hourly.txt");
BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
String line = null;

Map<byte[], String> rowValues = new HashMap<byte[], String>();

while((line = reader.readLine()) != null) {
    String[] vals = line.split(",");
    String row = new StringBuilder(vals[0]).append(".").append(vals[1])
        .append(".").append(vals[2]).append(".").append(vals[3]).toString();
    rowValues.put(row.getBytes(), line);
}
reader.close();

// HFile entries must be appended in sorted key order.
// byteArrComparator is a Comparator<byte[]> defined elsewhere.
List<byte[]> keys = new ArrayList<byte[]>(rowValues.keySet());
Collections.sort(keys, byteArrComparator);


// Spin up an in-process mini cluster and create the target table
// with a single column family, "data".
HBaseTestingUtility testingUtility = new HBaseTestingUtility();
testingUtility.startMiniCluster();

testingUtility.createTable("table".getBytes(), "data".getBytes());

// Write the HFile under a directory named after the column family,
// which is the layout LoadIncrementalHFiles expects.
HFile.Writer writer = new HFile.Writer(testingUtility.getTestFileSystem(),
    new Path("/tmp/hfiles/data/hfile"),
    HFile.DEFAULT_BLOCKSIZE, Compression.Algorithm.NONE, KeyValue.KEY_COMPARATOR);

// Append one KeyValue per row, in sorted order.
for(byte[] key : keys) {
    writer.append(new KeyValue(key, "data".getBytes(), "d".getBytes(), rowValues.get(key).getBytes()));
}

// Record the bulk-load time and flag the file as major-compacted.
writer.appendFileInfo(StoreFile.BULKLOAD_TIME_KEY, Bytes.toBytes(System.currentTimeMillis()));
writer.appendFileInfo(StoreFile.MAJOR_COMPACTION_KEY, Bytes.toBytes(true));
writer.close();

Configuration conf = testingUtility.getConfiguration();

// Point the bulk-load tool at the parent directory; it discovers the
// per-family subdirectories underneath it.
LoadIncrementalHFiles loadTool = new LoadIncrementalHFiles(conf);
HTable hTable = new HTable(conf, "table".getBytes());

loadTool.doBulkLoad(new Path("/tmp/hfiles"), hTable);

// Scan the family back to check whether the bulk-loaded rows are visible.
ResultScanner scanner = hTable.getScanner("data".getBytes());
Result next = null;
System.out.println("Scanning");
while((next = scanner.next()) != null) {
    System.out.format("%s %s\n", new String(next.getRow()),
        new String(next.getValue("data".getBytes(), "d".getBytes())));
}

Has anyone actually gotten this to work? I have a compilable/testable version available.

Take a look at the tests for LoadIncrementalHFiles in the HBase source code:
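From memory, the bulk-load test there creates its HFiles with a small helper roughly like the sketch below. Treat this as a minimal sketch, not the exact test code: the class name, row-key scheme, and row count are my own placeholders. The two details worth comparing against the code above are that the HFile is written under a subdirectory named after the column family, and that each KeyValue is given an explicit timestamp rather than the default LATEST_TIMESTAMP.

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.hfile.Compression;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.regionserver.StoreFile;
import org.apache.hadoop.hbase.util.Bytes;

public class BulkLoadSketch {

    // Write a small HFile the way the bulk-load tests do. The path should
    // point inside a directory named after the column family, e.g.
    // /tmp/hfiles/data/hfile for family "data". (Sketch only; names and
    // constants are placeholders, not the actual test code.)
    static void createHFile(FileSystem fs, Path path, byte[] family,
            byte[] qualifier, int numRows) throws IOException {
        HFile.Writer writer = new HFile.Writer(fs, path,
            HFile.DEFAULT_BLOCKSIZE, Compression.Algorithm.NONE,
            KeyValue.KEY_COMPARATOR);
        long now = System.currentTimeMillis();
        try {
            for (int i = 0; i < numRows; i++) {
                // Zero-padded keys so lexicographic order matches insert order.
                byte[] key = Bytes.toBytes(String.format("row%08d", i));
                // Explicit timestamp on every KeyValue.
                writer.append(new KeyValue(key, family, qualifier, now, key));
            }
        } finally {
            // Record the bulk-load time in the file info, as the tests do.
            writer.appendFileInfo(StoreFile.BULKLOAD_TIME_KEY, Bytes.toBytes(now));
            writer.close();
        }
    }
}

If your version ships the HFile command-line tool, dumping the metadata of both files (hbase org.apache.hadoop.hbase.io.hfile.HFile -m -f <path>) is a quick way to see what differs between a file produced this way and yours.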