PySpark: pushing a date filter down to Apache Phoenix


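I'm trying to filter by date against Apache Phoenix from PySpark. The column in Phoenix was created as Date, and the filter value is a datetime. When I use explain, I see that Spark does not push the filter down to Phoenix. I have tried many combinations without luck.

Is there any way to do it?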

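import datetime

# Assumes an existing sqlContext and a zookepperServer host string.
df = sqlContext.read \
    .format("org.apache.phoenix.spark") \
    .option("table", "TABLENAME") \
    .option("zkUrl", zookepperServer + ":2181:/hbase-unsecure") \
    .load()
# printSchema() and explain() print directly and return None,
# which is why "None" follows each block in the output.
print(df.printSchema())

startValidation = datetime.datetime.now()

print(df.filter(df['FH'] > startValidation).explain(True))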
Result:

root
 |-- METER_ID: string (nullable = true)
 |-- FH: date (nullable = true)

None
== Parsed Logical Plan ==
'Filter (FH#53 > 1486726683446150)
+- Relation[METER_ID#52,FH#53,SUMMERTIME#54,MAGNITUDE#55,SOURCE#56,ENTRY_DATETIME#57,BC#58,T_VAL_AE#59,T_VAL_AI#60,T_VAL_R1#61,T_VAL_R2#62,T_VAL_R3#63,T_VAL_R4#64] PhoenixRelation(DAILYREADS,10.0.0.13:2181:/hbase-unsecure)

== Analyzed Logical Plan ==
METER_ID: string, FH: date, SUMMERTIME: string, MAGNITUDE: int, SOURCE: int, ENTRY_DATETIME: date, BC: string, T_VAL_AE: int, T_VAL_AI: int, T_VAL_R1: int, T_VAL_R2: int, T_VAL_R3: int, T_VAL_R4: int
Filter (cast(FH#53 as string) > cast(1486726683446150 as string))
+- Relation[METER_ID#52,FH#53,SUMMERTIME#54,MAGNITUDE#55,SOURCE#56,ENTRY_DATETIME#57,BC#58,T_VAL_AE#59,T_VAL_AI#60,T_VAL_R1#61,T_VAL_R2#62,T_VAL_R3#63,T_VAL_R4#64] PhoenixRelation(DAILYREADS,10.0.0.13:2181:/hbase-unsecure)

== Optimized Logical Plan ==
Filter (cast(FH#53 as string) > 2017-02-10 11:38:03.44615)
+- Relation[METER_ID#52,FH#53,SUMMERTIME#54,MAGNITUDE#55,SOURCE#56,ENTRY_DATETIME#57,BC#58,T_VAL_AE#59,T_VAL_AI#60,T_VAL_R1#61,T_VAL_R2#62,T_VAL_R3#63,T_VAL_R4#64] PhoenixRelation(DAILYREADS,10.0.0.13:2181:/hbase-unsecure)

== Physical Plan ==
Filter (cast(FH#53 as string) > 2017-02-10 11:38:03.44615)
+- Scan PhoenixRelation(DAILYREADS,10.0.0.13:2181:/hbase-unsecure)[METER_ID#52,FH#53,SUMMERTIME#54,MAGNITUDE#55,SOURCE#56,ENTRY_DATETIME#57,BC#58,T_VAL_AE#59,T_VAL_AI#60,T_VAL_R1#61,T_VAL_R2#62,T_VAL_R3#63,T_VAL_R4#64]
None
If I set the FH column as timestamp, the filter is pushed down, but it raises an exception:

Caused by: org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax error. Mismatched input. Expecting "RPAREN", got "12" at line 1, column 219.
    at org.apache.phoenix.exception.PhoenixParserException.newException(PhoenixParserException.java:33)
    at org.apache.phoenix.parse.SQLParser.parseStatement(SQLParser.java:111)
    at org.apache.phoenix.jdbc.PhoenixStatement$PhoenixStatementParser.parseStatement(PhoenixStatement.java:1280)
    at org.apache.phoenix.jdbc.PhoenixStatement.parseStatement(PhoenixStatement.java:1363)
    at org.apache.phoenix.jdbc.PhoenixStatement.compileQuery(PhoenixStatement.java:1373)
    at org.apache.phoenix.jdbc.PhoenixStatement.optimizeQuery(PhoenixStatement.java:1368)
    at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:122)
    ... 102 more
Caused by: MismatchedTokenException(106!=129)
    at org.apache.phoenix.parse.PhoenixSQLParser.recoverFromMismatchedToken(PhoenixSQLParser.java:360)
    at org.apache.phoenix.shaded.org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
    at org.apache.phoenix.parse.PhoenixSQLParser.not_expression(PhoenixSQLParser.java:6862)
    at org.apache.phoenix.parse.PhoenixSQLParser.and_expression(PhoenixSQLParser.java:6677)
    at org.apache.phoenix.parse.PhoenixSQLParser.or_expression(PhoenixSQLParser.java:6614)
    at org.apache.phoenix.parse.PhoenixSQLParser.expression(PhoenixSQLParser.java:6579)
    at org.apache.phoenix.parse.PhoenixSQLParser.single_select(PhoenixSQLParser.java:4615)
    at org.apache.phoenix.parse.PhoenixSQLParser.unioned_selects(PhoenixSQLParser.java:4697)
    at org.apache.phoenix.parse.PhoenixSQLParser.select_node(PhoenixSQLParser.java:4763)
    at org.apache.phoenix.parse.PhoenixSQLParser.oneStatement(PhoenixSQLParser.java:789)
    at org.apache.phoenix.parse.PhoenixSQLParser.statement(PhoenixSQLParser.java:508)
    at org.apache.phoenix.parse.SQLParser.parseStatement(SQLParser.java:108)
    ... 107 more

Thanks

Questions are usually asked on the Apache developer list, and issues are reported in JIRA. Your JIRA was answered (within 4 hours) with the following:


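Not a big fan of the shotgun approach of Twitter + Stack Overflow + dev list + JIRA. Please keep in mind that the people responding have day jobs.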

Sorry again about this. I didn't know where to ask, so I tried different places. Next time, JIRA only. Thanks.
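
The plans in the question show the comparison rewritten as cast(FH as string) > ..., because a date column is being compared with a timestamp value. One thing that may be worth trying is to compare the date column with a plain datetime.date, so that Spark has no reason to cast the column. The sketch below is only an illustration: as_date and filter_after are hypothetical helper names, df is assumed to be the DataFrame loaded in the question, and whether the Phoenix connector then pushes the filter down (and handles the date literal correctly) is not confirmed here and would need to be checked with explain.

```python
import datetime


def as_date(moment):
    """Truncate a datetime to a plain date; pass plain dates through unchanged."""
    # datetime.datetime is a subclass of datetime.date, so test for it first.
    if isinstance(moment, datetime.datetime):
        return moment.date()
    return moment


def filter_after(df, column, moment):
    """Filter `df` on `column` > `moment`, comparing a date with a date.

    `df` is assumed to be a Spark DataFrame such as the one in the question.
    """
    return df.filter(df[column] > as_date(moment))


# Hypothetical usage against the DataFrame from the question:
# filter_after(df, "FH", datetime.datetime.now()).explain(True)
```

If the physical plan still shows a separate Filter step with a cast on FH, the filter is not being pushed down and this approach does not help with that connector version.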