Apache Pig: flatten a bag without empty values
I have a Pig bag:
(1139-50052,Aquatic,Consumer,6,makarina,2,{(),(Unknown)})
(1139-50052,Aquatic,Consumer,6,jabong,2,{(),(),(),(Unknown)})
I need to flatten it, with no empty values:
(1139-50052,Aquatic,Consumer,6,makarina,2,Unknown)
(1139-50052,Aquatic,Consumer,6,jabong,2,Unknown)
One option is to pass the bag to the BagToString function, which discards the empty values, and then split the resulting string on the delimiter '_':
FLATTEN(STRSPLIT(BagToString(BagName),'_+'))
This also works for combinations other than your input, as the sample below shows.
Input:
1139-50052 Aquatic Consumer 6 makarina 2 {(),(Unknown)}
1139-50052 Aquatic Consumer 6 jabong 2 {(),(),(),(Unknown)}
1139-50052 Aquatic Consumer 6 test1 2 {(unknown1),(),(),(Unknown2)}
1139-50052 Aquatic Consumer 6 test2 2 {(unknown1),(unknown2),(),(Unknown3)}
Pig script:
A = LOAD 'input' USING PigStorage() AS (f0,f1,f2,f3,f4,f5,B:{T:(f7)});
-- BagToString joins the bag's values with '_'; STRSPLIT then splits on runs of '_'
B = FOREACH A GENERATE f0,f1,f2,f3,f4,f5,FLATTEN(STRSPLIT(BagToString(B),'_+'));
DUMP B;
Output:
(1139-50052,Aquatic,Consumer,6,makarina,2,Unknown)
(1139-50052,Aquatic,Consumer,6,jabong,2,Unknown)
(1139-50052,Aquatic,Consumer,6,test1,2,unknown1,Unknown2)
(1139-50052,Aquatic,Consumer,6,test2,2,unknown1,unknown2,Unknown3)
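If one row per value is acceptable instead of a variable number of trailing columns, a nested FOREACH with FILTER may be another way to drop the empty tuples before flattening. This is only a sketch, not part of the answer above: it assumes the empty tuples read back with a null f7, which I have not verified against this data, and the aliases C and nonempty are made up for illustration.

```pig
A = LOAD 'input' USING PigStorage() AS (f0,f1,f2,f3,f4,f5,B:{T:(f7)});
C = FOREACH A {
    -- assumption: an empty tuple () yields a null f7, so this keeps only real values
    nonempty = FILTER B BY f7 IS NOT NULL;
    -- FLATTEN on a bag emits one output row per remaining tuple
    GENERATE f0,f1,f2,f3,f4,f5, FLATTEN(nonempty);
};
DUMP C;
```

Under that assumption, the test1 row should come out as two rows (one ending in unknown1, one in Unknown2) rather than one row with two extra fields, and a row whose bag has no non-empty tuples would be dropped entirely, because FLATTEN of an empty bag emits nothing.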