Hive query on a Hadoop Phoenix table throws ColumnNotFoundException

I am running an EMR cluster with HBase and Hive (hive-server2).

  • My ETL pipeline creates and populates a Phoenix table:

    CREATE TABLE IF NOT EXISTS UNMAPPED_FACTS (
      ACCOUNT VARCHAR NOT NULL,
      CONTAINER VARCHAR NOT NULL,
      UID_TYPE VARCHAR NOT NULL,
      UID VARCHAR NOT NULL,
      TS_EPOCH_MILLIS BIGINT NOT NULL,
      DP_KEY VARCHAR NOT NULL,
      DP_VALUE VARCHAR NOT NULL,
      CONSTRAINT pk PRIMARY KEY (ACCOUNT, CONTAINER, UID_TYPE, UID, TS_EPOCH_MILLIS, DP_KEY)
    ) SALT_BUCKETS = 256
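
    For reference, Phoenix tables are populated with UPSERT rather than INSERT; a minimal sketch of how the pipeline might write a row (the values here are hypothetical, not from the actual ETL job):

        -- Hypothetical example row; all primary-key columns must be supplied
        UPSERT INTO UNMAPPED_FACTS
          (ACCOUNT, CONTAINER, UID_TYPE, UID, TS_EPOCH_MILLIS, DP_KEY, DP_VALUE)
        VALUES
          ('acct-1', 'web', 'cookie', 'abc123', 1500000000000, 'country', 'US');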
    
  • Then I create an EXTERNAL table in the Hive metastore pointing to my
    Phoenix table:

    CREATE EXTERNAL TABLE IF NOT EXISTS unmapped_facts (
        account STRING,
        container STRING,
        uid_type STRING,
        uid STRING,
        ts_epoch_millis BIGINT,
        dp_key STRING,
        dp_value STRING
    )
    STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'
    TBLPROPERTIES (
        "phoenix.table.name" = "UNMAPPED_FACTS",
        "phoenix.zookeeper.quorum" = "${zookeeper_host}",
        "phoenix.zookeeper.znode.parent" = "/hbase",
        "phoenix.zookeeper.client.port" = "2181",
        "phoenix.rowkeys" = "ACCOUNT, CONTAINER, UID_TYPE, UID, TS_EPOCH_MILLIS, DP_KEY"
    );
    
  • Then I run Hive queries against it, for example:

    select * from unmapped_facts limit 10
    
  • All of this worked when I was using emr5.6.0 (Phoenix 4.9.0-HBase-1.2,
    HBase 1.2.3).

    I have now upgraded EMR to the latest 5.7.0 (Phoenix 4.11.0-HBase-1.3,
    HBase 1.3.1). Steps 1 and 2 still work fine, but running the query throws
    an exception (see below). I can run SQL queries against the Phoenix table
    with sqlline without any problems.

    Any help debugging this would be greatly appreciated.

    Caused by: org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): Undefined column. columnName=UNMAPPED_FACTS.account
            at org.apache.phoenix.schema.PTableImpl.getColumnForColumnName(PTableImpl.java:818) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.resolveColumn(FromCompiler.java:478) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.compile.TupleProjectionCompiler$ColumnRefVisitor.visit(TupleProjectionCompiler.java:208) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.compile.TupleProjectionCompiler$ColumnRefVisitor.visit(TupleProjectionCompiler.java:194) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.parse.ColumnParseNode.accept(ColumnParseNode.java:56) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.compile.TupleProjectionCompiler.createProjectedTable(TupleProjectionCompiler.java:109) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.compile.QueryCompiler.compileSingleFlatQuery(QueryCompiler.java:528) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.compile.QueryCompiler.compileSingleQuery(QueryCompiler.java:507) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.compile.QueryCompiler.compileSelect(QueryCompiler.java:202) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.compile.QueryCompiler.compile(QueryCompiler.java:157) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:475) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:441) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.jdbc.PhoenixStatement.compileQuery(PhoenixStatement.java:1648) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.jdbc.PhoenixStatement.compileQuery(PhoenixStatement.java:1641) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.jdbc.PhoenixStatement.optimizeQuery(PhoenixStatement.java:1635) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.hive.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:260) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.phoenix.hive.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:131) ~[phoenix-4.11.0-HBase-1.3-hive.jar:?]
            at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:372) ~[hive-exec-2.1.1-amzn-0.jar:2.1.1-amzn-0]
            at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:304) ~[hive-exec-2.1.1-amzn-0.jar:2.1.1-amzn-0]
            at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:459) ~[hive-exec-2.1.1-amzn-0.jar:2.1.1-amzn-0]
            at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428) ~[hive-exec-2.1.1-amzn-0.jar:2.1.1-amzn-0]
            at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) ~[hive-exec-2.1.1-amzn-0.jar:2.1.1-amzn-0]
            at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2098) ~[hive-exec-2.1.1-amzn-0.jar:2.1.1-amzn-0]
            at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:479) ~[hive-service-2.1.1-amzn-0.jar:2.1.1-amzn-0]
            ... 25 more
    

    I had the same problem. I solved it through this link:

    You just need to add the property "phoenix.column.mapping" and map each Hive table column to its Phoenix table column; the left-hand (Hive) column in each mapping pair must be lowercase.
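
    For the table above, that could look something like this (a sketch; I am assuming the usual comma-separated hiveColumn:PHOENIX_COLUMN pair format for the property value, and using Hive's ALTER TABLE to add it; dropping and recreating the external table with the property in its TBLPROPERTIES should work equally well):

        -- Add the Hive-to-Phoenix column mapping on the existing external table
        ALTER TABLE unmapped_facts SET TBLPROPERTIES (
            "phoenix.column.mapping" = "account:ACCOUNT,container:CONTAINER,uid_type:UID_TYPE,uid:UID,ts_epoch_millis:TS_EPOCH_MILLIS,dp_key:DP_KEY,dp_value:DP_VALUE"
        );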

    Worked like a charm.