JSON: unable to use a Unix timestamp column stored as a string data type


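I have a Hive table that loads JSON data. My JSON contains two values, both of which have the string data type. If I declare them as bigint, a select on this table fails with the following error: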

Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
 at [Source: java.io.ByteArrayInputStream@3b6c740b; line: 1, column: 21]
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions : uploadtimestamp
If I change both of them to string, it works fine.

Now, because these columns are strings, I cannot use the from_unixtime method on them.

If I try to alter the data type of these columns from string to bigint, I get the following error:

Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
 at [Source: java.io.ByteArrayInputStream@3b6c740b; line: 1, column: 21]
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions : uploadtimestamp
Below is my create table statement:

create table ABC
(
    uploadTimeStamp bigint
   ,PDID            string

   ,data            array
                    <
                        struct
                        <
                            Data:struct
                            <
                                unit:string
                               ,value:string
                               ,heading:string
                               ,loc:string
                               ,loc1:string
                               ,loc2:string
                               ,loc3:string
                               ,speed:string
                               ,xvalue:string
                               ,yvalue:string
                               ,zvalue:string
                            >
                           ,Event:string
                           ,PDID:string
                           ,`Timestamp`:string
                           ,Timezone:string
                           ,Version:string
                           ,pii:struct<dummy:string>
                        >
                    >
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' 
stored as textfile;
Is there any way to convert this string Unix timestamp to a standard time, or to handle these columns as bigint?

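  • If you are talking about Timestamp and Timezone, you can define them as int/bigint types.
    If you look at their definitions, you will see there are no qualifiers (") around the values, so they are numeric types in the JSON document:

    "timestamp":1488793268598,"timezone":330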



  • Even if you have defined Timestamp as a string, you can cast it to bigint before using it in a function that requires a bigint:

    cast(`Timestamp` as bigint)


  • FAILED: SemanticException [Error 10014]: Line 1:45 Wrong arguments 'timestamp': No matching method for class org.apache.hadoop.hive.ql.udf.UDFFromUnixTime with (string). Possible choices: _FUNC_(bigint) _FUNC_(bigint, string) _FUNC_(int) _FUNC_(int, string)


  • Which fields are you talking about? Please give their names and definitions in the JSON.

    | myjson.uploadtimestamp | myjson.pdid |myjson.data|

    |          1486631318873 |         123 | [{"data":{"unit":"rpm","value":"0","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E1","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}},{"data":{"unit":null,"value":null,"heading":"N","loc3":"false","loc":"14.022425","loc1":"78.760587","loc4":"false","speed":"10","x":null,"y":null,"z":null},"eventid":"E2","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.1","pii":{"dummy":null}},{"data":{"unit":null,"value":null,"heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":"1.1","y":"1.2","z":"2.2"},"eventid":"E3","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}},{"data":{"unit":"percentage","value":"50","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E4","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":null},{"data":{"unit":"kmph","value":"70","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E5","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}}] |

    
    hive> with t as (select '0' as `timestamp`) select from_unixtime(`timestamp`) from t;
    
    hive> with t as (select '0' as `timestamp`) select from_unixtime(cast(`timestamp` as bigint)) from t;
    OK
    1970-01-01 00:00:00
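
Applied to the table from the question, the same cast could look roughly like the sketch below. It assumes the table and column names from the create table statement above (ABC, uploadTimeStamp, data, `Timestamp`). It also assumes the values are in milliseconds, which the 13-digit sample 1486631318873 suggests. Since from_unixtime expects seconds, the value is divided by 1000 first. The aliases upload_time, event_time, d and e are only illustrative.

```sql
-- Top-level column: cast to bigint (a no-op if the column is already bigint),
-- then convert the assumed milliseconds to seconds before calling from_unixtime.
select
    uploadTimeStamp,
    from_unixtime(cast(cast(uploadTimeStamp as bigint) / 1000 as bigint)) as upload_time
from ABC;

-- Nested field: explode the array of structs, then apply the same cast
-- to the `Timestamp` field of each element.
select
    e.`Timestamp`,
    from_unixtime(cast(cast(e.`Timestamp` as bigint) / 1000 as bigint)) as event_time
from ABC
lateral view explode(data) d as e;
```

If the values turn out to be in seconds already, the division by 1000 should be dropped. The formatted result depends on the time zone Hive is running with.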