Setting a multi-prefix row filter on an HBase scanner in Java

java hadoop mapreduce hbase

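I want to create a scanner that returns the results of two prefix filters.
For example, I want all rows whose key starts with the string "x" or with the string "y".
At the moment I only know how to do this with a single prefix, like this: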

scan.setRowPrefixFilter(prefixFilter)

I just tried it, and it seems you cannot add a regular expression to RowPrefixFilter, so I think the solution is to use

scan.setRowPrefixFilter(Bytes.toBytes("x"))
scan.setRowPrefixFilter(Bytes.toBytes("y"))

This will get you the rows you want.

In this case you cannot use the setRowPrefixFilter API; you have to use the more general setFilter API, for example:

scan.setFilter(
  new FilterList(
    FilterList.Operator.MUST_PASS_ONE,
    new PrefixFilter(Bytes.toBytes("xx")),
    new PrefixFilter(Bytes.toBytes("yy"))
  )
);

I have implemented a filter over a batch of prefixes; maybe it helps you:

    // Row-key prefixes to match
    List<String> bindCodes = new ArrayList<>();
    bindCodes.add("CM0001");
    bindCodes.add("CE7563");
    bindCodes.add("DR6785");

    Scan scan = new Scan();
    scan.setCaching(50); // number of rows fetched per round trip to the server
    // Select a single column ...
    scan.addColumn(HTableColumnEnum.GPS_CF_1.getCfName().getBytes(), LOCATION_CREATE_DATE_ARRAY);
    // ... then the whole family. Note: as far as I can tell, addFamily on the
    // same family replaces the column selection made above.
    scan.addFamily(HTableColumnEnum.GPS_CF_1.getCfName().getBytes());

    // OR the filters together: a row passes if it matches at least one of them
    FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ONE);
    // Add one PrefixFilter per row-key prefix
    bindCodes.forEach(s -> filterList.addFilter(new PrefixFilter(Bytes.toBytes(s))));

    // Attach the filter list to the scan
    scan.setFilter(filterList);
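
To see what MUST_PASS_ONE means for a list of prefix filters without needing a cluster, here is a minimal plain-Java sketch. It does not use the HBase API at all: the class AnyPrefixMatch, its methods, and the row keys with a "_001" suffix are made up for illustration. It only mimics the rule "keep a row if its key starts with at least one of the prefixes".

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a FilterList of PrefixFilters combined with MUST_PASS_ONE.
class AnyPrefixMatch {

    // True if rowKey begins with all bytes of prefix.
    static boolean startsWith(byte[] rowKey, byte[] prefix) {
        if (rowKey.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (rowKey[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Keeps every row key that matches at least one prefix (logical OR).
    static List<String> keepMatching(List<String> rowKeys, List<String> prefixes) {
        List<String> kept = new ArrayList<>();
        for (String key : rowKeys) {
            byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
            for (String prefix : prefixes) {
                if (startsWith(keyBytes, prefix.getBytes(StandardCharsets.UTF_8))) {
                    kept.add(key);
                    break; // one matching prefix is enough
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up row keys, in sorted order as a table would store them.
        List<String> rows = Arrays.asList(
                "CE7563_001", "CM0001_001", "CM0002_001", "DR6785_001", "ZZ0000_001");
        List<String> prefixes = Arrays.asList("CM0001", "CE7563", "DR6785");
        System.out.println(keepMatching(rows, prefixes));
        // prints [CE7563_001, CM0001_001, DR6785_001]
    }
}
```

Note that every row key is examined here, including the ones that end up discarded.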

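If you do it that way, it will only return keys starting with "y", because you have overwritten "x". My goal is to get a single scan object that holds the results for both keys.

My bad, I forgot to add that you should run each scan separately. Try running one, storing the result, and then adding the results of the second.

A note on the performance of this solution (the FilterList approach): you are performing a full table scan and passing every row through these filters. In general this is very inefficient. Running multiple scans with scan.setRowPrefixFilter(prefix) may be faster if the table is large and there are only a few prefixes.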
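
The reason a prefix scan can be cheaper is that a prefix corresponds to a contiguous key range, so the scan can start at the prefix and stop at the first key that no longer matches. To my understanding, setRowPrefixFilter works along these lines by setting a start row and a stop row. Below is a minimal sketch of that range calculation in plain Java; the class PrefixRange and the method stopRowForPrefix are hypothetical names for illustration, not HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: turns a row-key prefix into a [start, stop) key range.
class PrefixRange {

    // Returns the smallest key that sorts after every key starting with prefix
    // (an exclusive stop row). An empty array means "no upper bound".
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0]; // empty or all-0xFF prefix: range is open-ended
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // increment the last remaining byte
        return stop;
    }

    public static void main(String[] args) {
        for (String p : new String[] {"x", "y", "CM0001"}) {
            byte[] stop = stopRowForPrefix(p.getBytes(StandardCharsets.UTF_8));
            System.out.println("[" + p + ", " + new String(stop, StandardCharsets.UTF_8) + ")");
        }
        // prints [x, y) then [y, z) then [CM0001, CM0002)
    }
}
```

With one such range per prefix, each scan only reads the rows inside its range, and the results of the separate scans can then be combined on the client.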