
Hadoop: Apache Pig does not parse a string as int/long


I'm new to Pig and am trying to do some basic analysis on a file containing events like the following:

1345477765  2012-08-20  08:49:24    servername  12.34.56.78 192.168.1.4 joebloggs   ManageSystem    Here's your message
I'm trying to load the file as follows:

logs = LOAD '/path/to/file' using PigStorage AS (loggedtime:long, serverdate:chararray, servertime:chararray, servername:chararray, externalip:chararray, internalip:chararray, username:chararray, systemtype:chararray,  message:chararray);
When I illustrate logs, everything looks fine:

     Illustrate logs
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    | logs     | loggedtime:long      | serverdate:chararray    | servertime:chararray    | servername:chararray    | externalip:chararray        | internalip:chararray      | username:chararray    | systemtype:chararray            | message:chararray                                                                                                                         | 
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    |          | 1345477765 | 2012-08-20   | 08:49:24       | servername | 12.34.56.78 | 192.168.1.4 | joebloggs   | ManageSystem | Here's your message  | 
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Likewise, when I describe logs, everything is as I would expect:

logs: {loggedtime: long,serverdate: chararray,servertime: chararray,servername: chararray,externalip: chararray,internalip: chararray,username: chararray,systemtype: chararray,message: chararray}
However, when I dump logs, loggedtime comes back empty:

dump logs;
(,2012-08-20,08:49:24,servername,12.34.56.78,192.168.1.4,joebloggs,ManageSystem,Here's your message)
So, perhaps unsurprisingly, my filter does not return any events:

specificlog = FILTER logs BY loggedtime == 1345477765;

Hopefully I'm missing something simple.

I eventually figured this out myself. To get the value parsed as a long, I had to put an "L" at the end of the number.

For example, after changing my source data to the following format, I was able to get it working:

1345477765L  2012-08-20  08:49:24    servername  12.34.56.78 192.168.1.4 joebloggs   ManageSystem    Here's your message

Hope this helps anyone with the same problem.

But what if you have millions of records and need to convert the strings to long? I don't think appending an "L" to every record is a good approach.
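
One way to avoid touching the source data, sketched here under some assumptions, is to load the timestamp as a chararray and cast it inside the script. The relation name `raw`, the use of `TRIM`, and the explicit tab delimiter (which is PigStorage's default) are my additions and not part of the original question. Whether the cast succeeds still depends on what is actually in the field; values that cannot be converted come out as null.

```pig
-- Load the first field as text instead of long (field names follow the question's schema).
raw = LOAD '/path/to/file' USING PigStorage('\t') AS (
    loggedtime:chararray, serverdate:chararray, servertime:chararray,
    servername:chararray, externalip:chararray, internalip:chararray,
    username:chararray, systemtype:chararray, message:chararray);

-- Strip surrounding whitespace (an assumed cause, not confirmed by the thread),
-- then cast the text to long explicitly.
logs = FOREACH raw GENERATE
    (long)TRIM(loggedtime) AS loggedtime,
    serverdate, servertime, servername, externalip,
    internalip, username, systemtype, message;

-- The trailing L here is Pig Latin's long literal in the script, not a change to the data.
specificlog = FILTER logs BY loggedtime == 1345477765L;
DUMP specificlog;
```

If loggedtime is still empty after the cast, dumping `raw` should at least show the text Pig actually read for that field, which may help pin down why it does not parse.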