Using SUM() in Pig Latin

I've just started writing scripts in Pig, and I'm trying to sum an int column. My script looks like this:

DATA = LOAD 'SomeFile' as (fingerPrint, size, str1, str2);
groupedChunks = GROUP DATA BY fingerPrint;


uniqueChunks = FILTER groupedChunks BY COUNT(DATA)==1;
sizes = FOREACH uniqueChunks GENERATE MAX($.size) as size;
Now I have a table with a single column (the size column, if you will). Calling DESCRIBE on it produces the following output:
size:{size:int}

This is the step I need help with: how do I get the sum of all the sizes in this column?

Can you try this?

result = FOREACH (GROUP sizes ALL) GENERATE SUM(sizes);
DUMP result;
Update: complete code

input.txt

a       1       b       c
d       2       e       f
PigScript:

DATA = LOAD 'input.txt' as (fingerPrint, size, str1, str2);
groupedChunks = GROUP DATA BY fingerPrint;
uniqueChunks = FILTER groupedChunks BY COUNT(DATA)==1;
sizes = FOREACH uniqueChunks GENERATE MAX(DATA.size) as size;
result = FOREACH (GROUP sizes ALL) GENERATE SUM(sizes);
DUMP result;
Output:

(3.0)
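
The result above is a double because the LOAD statement declares no types, so Pig has to cast the untyped size field before aggregating. As a minimal sketch, assuming input.txt is tab-separated and the second field always holds an integer, the same script with a declared schema might look like this. The field types and the aliases allSizes and total are assumptions added for illustration, not part of the original script.

```pig
-- Assumption: input.txt is tab-separated and the second field is an integer.
DATA = LOAD 'input.txt' USING PigStorage('\t')
       AS (fingerPrint:chararray, size:int, str1:chararray, str2:chararray);
groupedChunks = GROUP DATA BY fingerPrint;
-- Keep only fingerprints that occur exactly once.
uniqueChunks = FILTER groupedChunks BY COUNT(DATA) == 1;
-- Each remaining bag holds a single tuple, so MAX simply extracts its size.
sizes = FOREACH uniqueChunks GENERATE MAX(DATA.size) AS size;
-- GROUP ... ALL collects every row into one bag so SUM can run over it.
allSizes = GROUP sizes ALL;
result = FOREACH allSizes GENERATE SUM(sizes.size) AS total;
DUMP result;
```

With size declared as int, the total should come back as a long rather than a double, and the implicit-cast warning should no longer be needed for this field.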
V = GROUP DATA ALL;
result = FOREACH V GENERATE SUM(DATA.size);

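I still get the error: Unable to open iterator for alias result. Does this have something to do with casting? For sizes I get the warning: Encountered Warning IMPLICIT_CAST_TO_LONG

a\t1\tb\tc d\t2\te\tf (by \t I mean a tab, I guess you know that).

Thank you. Do you think it's possible that the problem is the line sizes = FOREACH uniqueChunks GENERATE MAX(DATA.size) as size; ? Would it make a difference if it were: size = FOREACH uniqueChunks GENERATE MAX(DATA.size) as size;

But I just copied your script and it still doesn't work. The output says the input was read successfully, but Pig failed to produce any output, which makes me wonder whether this is a different problem in Hadoop or in Pig. The job itself always fails, and I still get the error: Unable to open iterator for alias result.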