Filter and replace a column in Pig
Suppose my data looks like this:
row1 cats val12 val13
row2 dogs val22 val23
row3 cats val32 val33
...
data = load 'file' AS (row:chararry, pets:charray, val2:charray, val3:charray);
Filter the data to keep only the 'cats' rows:
felines = filter data by (pets matches 'cats');
Now change 'cats' to 'lions':
lions = foreach felines generate replace (pets, 'cats', 'lions');
dump lions;
(lions)
(lions)
...
My goal is to create new rows to add to the table:
newFelines = foreach lions generate rows, lions, val1, val2;
Error ^^^^^
"Error during parsing. Scalars can be only used with projections"
How can I get a set containing the following new rows?
row1 lions val11 val12
row3 lions val31 val32
TIA
Line by line:
There is no 'chararry' or 'charray' data type; the type is chararray:
data = load 'file' USING PigStorage(' ') AS
(row:chararray, pets:chararray, val2:chararray, val3:chararray);
Extract the 'cats' rows:
felines = filter data by (pets matches 'cats');
Replacing 'cats' with 'lions' can be done like this:
lions = foreach felines generate row, REPLACE(pets, 'cats', 'lions'), val2, val3;
Or like this:
lions = foreach felines generate row, 'lions', val2, val3;
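Putting the steps above together, a minimal end-to-end sketch might look like the following. It assumes the input file is space-delimited, as the sample rows suggest. The `AS pets` alias and the use of `==` instead of `matches` are my own choices rather than part of the original statements: the alias keeps the generated column named, and `==` is enough for an exact string comparison.

```pig
-- Load the space-delimited file with the correctly spelled chararray type.
data = LOAD 'file' USING PigStorage(' ')
       AS (row:chararray, pets:chararray, val2:chararray, val3:chararray);

-- Keep only the rows whose pets column is exactly 'cats'.
felines = FILTER data BY pets == 'cats';

-- Re-emit every column, substituting the constant 'lions' for pets.
-- Fields are referenced by the names declared in the LOAD schema
-- (row, val2, val3), not by the name of a relation.
lions = FOREACH felines GENERATE row, 'lions' AS pets, val2, val3;

DUMP lions;
```

The failing statement in the question used `lions`, the name of a relation, inside GENERATE, along with `rows` and `val1`, which are not in the schema. As far as I can tell, Pig reads a bare relation name in that position as a scalar reference, which would explain the "Scalars can be only used with projections" message.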
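In this case you do not need to filter beforehand. Instead of
lions = foreach felines generate row, REPLACE(pets, 'cats', 'lions'), val2, val3;
it should be
lions = FOREACH data GENERATE row, REPLACE(pets, 'cats', 'lions'), val2, val3;
Note that without the filter, the 'dogs' rows stay in the output unchanged, so keep the filter if only the 'lions' rows are wanted.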