HBase Java API: retrieve all rows matching a partial row key


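With the Python happybase module, I can retrieve all rows whose row key starts with a given string (that is, search by partial row key).

Suppose my row keys have the format (ID | TYPE | DATE). I can find all rows with ID 1 and TYPE A like this: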

import happybase
connection = happybase.Connection('hmaster-host.com')
table = connection.table('table_name')
for key, data in table.scan(row_prefix="1|A|"):
    print key, data

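Here is the fully client-side Java program I have so far, for anyone trying to do basic operations with HBase from Java. With it I can only look up a row by its complete row key: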

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;
//class foo {
public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.addResource(new Path("C:\\core-site.xml"));
    conf.addResource(new Path("C:\\hbase-site.xml"));
    HTable table = new HTable(conf, "table_name");
    Result row = table.get(new Get(Bytes.toBytes("1|A|2014-01-01 00:00")));
    printRow(row); 
}
public static void printRow(Result result) {
    String returnString = "";
    returnString += Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("id"))) + ", ";
    returnString += Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("type"))) + ", ";
    returnString += Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("date")));
    System.out.println(returnString);
}
//}

Here "cf" is the name of the column family.
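HBase keeps rows sorted by row key, which is what makes a partial-key lookup a range walk rather than a full-table search. The following is a minimal plain-Java sketch of that idea, not HBase code: a `TreeMap` stands in for the table, and the class name `PrefixScanSketch`, the method `scanByPrefix`, and the sample values are all hypothetical. It also assumes ASCII keys, for which Java's `String` ordering matches byte ordering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a table: a TreeMap keeps its keys sorted,
// the way HBase keeps rows sorted by row key.
class PrefixScanSketch {

    // Returns the values of every entry whose key starts with the prefix.
    static List<String> scanByPrefix(TreeMap<String, String> table, String prefix) {
        List<String> matches = new ArrayList<>();
        // tailMap(prefix, true) starts at the first key >= prefix.
        for (Map.Entry<String, String> entry : table.tailMap(prefix, true).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break; // keys are sorted, so no later key can match
            }
            matches.add(entry.getValue());
        }
        return matches;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put("1|A|2014-01-01 00:00", "row-1");
        table.put("1|A|2014-01-02 00:00", "row-2");
        table.put("1|B|2014-01-01 00:00", "row-3");
        table.put("2|A|2014-01-01 00:00", "row-4");
        System.out.println(scanByPrefix(table, "1|A|")); // prints [row-1, row-2]
    }
}
```

The walk starts at the prefix and stops at the first key that no longer matches, so rows outside the `1|A|` range are never visited.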

Answers:

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;
//class foo {
public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    conf.addResource(new Path("C:\\core-site.xml"));
    conf.addResource(new Path("C:\\hbase-site.xml"));
    HTable table = new HTable(conf, "table_name");
    byte[] prefix = Bytes.toBytes("1|A|");
    // Start the scan at the prefix and keep only rows whose key begins with it.
    Scan scan = new Scan(prefix);
    Filter prefixFilter = new PrefixFilter(prefix);
    scan.setFilter(prefixFilter);
    ResultScanner resultScanner = table.getScanner(scan);
    printRows(resultScanner);
    //Result row = table.get(new Get(Bytes.toBytes("1|A|2014-01-01 00:00")));
    //printRow(row);
}
public static void printRows(ResultScanner resultScanner) {
    for (Iterator<Result> iterator = resultScanner.iterator(); iterator.hasNext();) {
        printRow(iterator.next());
    }
}
public static void printRow(Result result) {
    String returnString = "";
    returnString += Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("id"))) + ", ";
    returnString += Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("type"))) + ", ";
    returnString += Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("date")));
    System.out.println(returnString);
}
//}

Note that I use the setFilter method, whereas the answer below uses the addFilter method, because we are using different versions of the API.

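You are using the HTable get operation, so you only get one row back. A Get looks up a single row by its exact key, so it cannot match on a prefix.

If you want to return multiple rows, you should use a Scan: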

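byte[] prefix = Bytes.toBytes("1|A|");
Scan scan = new Scan(prefix); // the prefix doubles as the start row
PrefixFilter prefixFilter = new PrefixFilter(prefix);
scan.addFilter(prefixFilter); // per the comments, on HBase 0.98 this call is scan.setFilter(prefixFilter)
ResultScanner resultScanner = table.getScanner(scan);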

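Sorry about that. That's what happens when you edit something without an IDE :) (I had it as a String and then decided not to repeat Bytes.toBytes() :)

Is this the best/only/most efficient way to do a partial key scan?

Note that in HBase 0.98 (possibly earlier) the API was changed to setFilter: scan.setFilter(prefixFilter);

Did you run into problems with the Thrift server while using happybase? When I retrieve a large amount of data with happybase the Thrift server crashes, whereas small retrievals work.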
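On the efficiency question above, one alternative way to express the same key range is an explicit stop row instead of a filter. The sketch below only computes that stop row in plain Java; the class `PrefixStopRow` and the method `stopRowForPrefix` are hypothetical names, not part of HBase. It assumes row keys compare as unsigned bytes and that the client in use offers a start-row/stop-row scan (for example a `Scan(byte[] startRow, byte[] stopRow)` constructor), which should be checked against the HBase version at hand.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: derives an exclusive stop row for a prefix scan.
class PrefixStopRow {

    // Returns a key that sorts after every key starting with the prefix
    // (unsigned byte order). An empty array means "no upper bound",
    // which happens when the prefix is empty or consists only of 0xFF bytes.
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // bump the last remaining byte
        return stop;
    }

    public static void main(String[] args) {
        byte[] prefix = "1|A|".getBytes(StandardCharsets.US_ASCII);
        byte[] stop = stopRowForPrefix(prefix);
        // '|' is 0x7C, so the last byte becomes 0x7D, which is '}'.
        System.out.println(new String(stop, StandardCharsets.US_ASCII)); // prints 1|A}
    }
}
```

With the prefix as start row and this value as the exclusive stop row, the scan covers exactly the keys that begin with `1|A|`. Whether that is faster than the PrefixFilter approach on a given cluster is not something this sketch shows.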
