Apache Pig: loading a file with a non-standard delimiter into Pig


I have a file that looks like the following, where the fields are separated by "?":

(01-01-2011-04:43:50?2521795691802591407?94.20.58.165?
When I use

mac = load 'Activity_1295336_01-01-2011.log.gz' using PigStorage('?');

I still cannot access the individual fields, for example:
mac$1

It works as expected on my side:

$> pig --version
Apache Pig version 0.9.2-cdh4.0.0 (rexported) 
compiled Jun 04 2012, 17:42:27

$> cat temp1
01-01-2011-04:43:50?2521795691802591407?94.20.58.165?

grunt> a = load '/temp1' using PigStorage('?') as (datetime, id, ip); 
grunt> dump a;
grunt> >> (01-01-2011-04:43:50,2521795691802591407,94.20.58.165,)
grunt> b = foreach a { funky = CONCAT(ip, '_-* FUNKY'); generate datetime, id, funky;}
grunt> dump b;
grunt> >> (01-01-2011-04:43:50,2521795691802591407,94.20.58.165_-* FUNKY)
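
If no schema is declared in the load statement, fields can still be referenced by position. The following is only a minimal sketch, assuming the same file name and '?' delimiter as in the question; the alias name ids is made up for illustration. Positional references such as $1 are used inside a foreach ... generate rather than appended directly to the relation name.

```pig
-- Assumes the file and '?' delimiter from the question.
mac = LOAD 'Activity_1295336_01-01-2011.log.gz' USING PigStorage('?');

-- No schema was declared, so fields are addressed by position:
-- $0 is the first field, $1 the second, and so on.
ids = FOREACH mac GENERATE $1;  -- "ids" is a hypothetical alias

DUMP ids;
```

Declaring a schema with as (...), as in the session above, lets the same fields be referenced by name instead.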

"?" is reserved for parameters.
I wanted to try changing the delimiter to check.