Apache Spark: How to use the HBase ColumnRangeFilter through Spark

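I am trying to work out how to use the HBase ColumnRangeFilter from Spark.
I had a look around, but the API I looked at does not include ColumnRangeFilter,
so I do not know how to apply this filter from Spark.

For example, I want to use a ColumnRangeFilter that starts at "20170225" and ends at "20170305".

I want to scan rows with the code below: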

val conf = HBaseConfiguration.create()
conf.set(TableInputFormat.INPUT_TABLE, "like_count")
val startRow = "001"
val endRow = "100"
conf.set(TableInputFormat.SCAN_ROW_START, startRow)
conf.set(TableInputFormat.SCAN_ROW_STOP, endRow)
sc.newAPIHadoopRDD(conf, classOf[TableInputFormat], classOf[ImmutableBytesWritable], classOf[Result])

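What code do I need to add?

If anyone has a suggestion, please let me know.

Build a Scan object with the start and stop rows (and the filter), serialize that Scan into the HBase configuration under TableInputFormat.SCAN, and then pass the configuration to TableInputFormat.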

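Where did you get the convertScanToString() function?

It is a custom method:

private static String convertScanToString(Scan scan) {
    try {
        ClientProtos.Scan proto = ProtobufUtil.toScan(scan);
        return Base64.encodeBytes(proto.toByteArray());
    } catch (Exception e) {
        e.printStackTrace();
        return "";
    }
}

Thank you very much. I ended up copying the logic from one of the HBase classes.

@alina If you found the answer helpful, please upvote it.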

Scan scan = new Scan(startRow, endRow);
scan.setMaxVersions(MAX_VERSIONS);

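// Keep only columns whose qualifier lies between "20170225" and "20170305" (both bounds inclusive).
// ColumnRangeFilter compares column qualifiers, not row keys.
scan.setFilter(new ColumnRangeFilter(Bytes.toBytes("20170225"), true, Bytes.toBytes("20170305"), true));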


HBaseConfiguration.merge(conf, HBaseConfiguration.create(conf));

conf.set(TableInputFormat.INPUT_TABLE, username + ":" + path);
conf.set(TableInputFormat.SCAN, convertScanToString(scan));


tableInputFormat.setConf(conf);
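
To make it easier to see which columns the ColumnRangeFilter in the answer would keep, here is a minimal plain-Java sketch of an inclusive or exclusive range check on column qualifiers. It is only a model and does not call HBase: the class name ColumnRangeSketch and the method inRange are hypothetical, and it assumes qualifiers are compared lexicographically as unsigned bytes, which is how HBase's Bytes.compareTo orders byte arrays.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone model of the range check; it does not use HBase classes.
final class ColumnRangeSketch {

    // Returns true when `qualifier` lies between `min` and `max`,
    // comparing the UTF-8 bytes lexicographically as unsigned values.
    static boolean inRange(String qualifier, String min, boolean minInclusive,
                           String max, boolean maxInclusive) {
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        int lower = Arrays.compareUnsigned(q, min.getBytes(StandardCharsets.UTF_8));
        int upper = Arrays.compareUnsigned(q, max.getBytes(StandardCharsets.UTF_8));
        boolean aboveMin = minInclusive ? lower >= 0 : lower > 0;
        boolean belowMax = maxInclusive ? upper <= 0 : upper < 0;
        return aboveMin && belowMax;
    }

    public static void main(String[] args) {
        // Sample qualifiers around the range from the question.
        List<String> qualifiers =
                List.of("20170224", "20170225", "20170301", "20170305", "20170306");
        for (String q : qualifiers) {
            if (inRange(q, "20170225", true, "20170305", true)) {
                System.out.println(q); // qualifier falls inside the range
            }
        }
    }
}
```

Because the qualifiers in the question are fixed-width yyyyMMdd strings, byte order matches date order, so the sample run prints 20170225, 20170301 and 20170305 and skips the two dates outside the range. Qualifiers of varying length would still be ordered byte by byte, which may not match numeric or chronological order.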