
Hadoop SparkSQL not working with Hive UDAF


I am using AWS EMR + Spark 1.6.1 + Hive 1.0.0.

I have this UDAF and have included it on Spark's classpath,

and registered it in Spark via sqlContext.sql("CREATE TEMPORARY FUNCTION maxrow AS 'some.cool.package.hive.udf.GenericUDAFMaxRow'").
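For reference, a minimal sketch (Scala, Spark 1.6.x) of that setup, assuming the UDAF jar is already on the driver/executor classpath (for example via spark-shell --jars); the class name is the one from the question, everything else is illustrative:

// Build a HiveContext, which is required for Hive UDAF support in Spark 1.6
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("udaf-registration"))
val sqlContext = new HiveContext(sc)

// Register the Hive UDAF as a temporary function for this session
sqlContext.sql(
  "CREATE TEMPORARY FUNCTION maxrow AS 'some.cool.package.hive.udf.GenericUDAFMaxRow'")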

However, when I invoke it in Spark with the query below,

CREATE VIEW VIEW_1 AS
      SELECT
        a.A,
        a.B,
        maxrow ( a.C,
                 a.D,
                 a.E,
                 a.F,
                 a.G,
                 a.H,
                 a.I
            ) as m
        FROM
            table_1 a
        JOIN
            table_2 b
        ON
                b.Z = a.D
            AND b.Y  = a.C
        JOIN dummy_table
        GROUP BY
            a.A,
            a.B
it gives me this error:

16/05/18 19:49:14 WARN RowResolver: Duplicate column info for a.A was overwritten in RowResolver map: _col0: string by _col0: string
16/05/18 19:49:14 WARN RowResolver: Duplicate column info for a.B was overwritten in RowResolver map: _col1: bigint by _col1: bigint
16/05/18 19:49:14 ERROR Driver: FAILED: SemanticException [Error 10002]: Line 16:32 Invalid column reference 'C'
org.apache.hadoop.hive.ql.parse.SemanticException: Line 16:32 Invalid column reference 'C'
                at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10643)
                at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10591)
                at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3656)
However, if I remove the GROUP BY clause and the aggregate function, it works. So I suspect SparkSQL is not recognizing it as an aggregate function.


Any help is appreciated. Thanks.

Actually, I found that the UDAF only fails when it is placed inside a CREATE VIEW statement. Using it in a plain SELECT statement on its own works fine.
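Based on that finding, one possible workaround (a sketch only, not verified on this exact EMR setup) is to run the UDAF in a plain SELECT and expose the result as a Spark temporary table instead of a Hive view; table and column names below are the ones from the question:

// Run the aggregation in a plain SELECT, where the UDAF works,
// then register the DataFrame as a temporary table in place of VIEW_1
val df = sqlContext.sql("""
  SELECT
    a.A, a.B,
    maxrow(a.C, a.D, a.E, a.F, a.G, a.H, a.I) AS m
  FROM table_1 a
  JOIN table_2 b
    ON b.Z = a.D AND b.Y = a.C
  JOIN dummy_table
  GROUP BY a.A, a.B
""")

df.registerTempTable("VIEW_1")   // downstream queries can reference VIEW_1 as before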