Apache Pig: performing an aggregation on an inner join result
I performed an inner join on two tables. However, I am unable to compute a sum over one of the columns. The query I ran:
sample1 = load '/user/tweets/samples.csv' using PigStorage AS (line:chararray);
words = FOREACH sample1 GENERATE FLATTEN(TOKENIZE(REPLACE(LOWER(TRIM(line)),'[\\p{Punct},\\p{Cntrl}]',''))) AS word;
newinnerjoin = join words by word, wordlexion by lexword;
Below is the output of newinnerjoin:
(important,important,2)
(irritated,irritated,-3)
(promoting,promoting,1)
(promoting,promoting,1)
(appreciate,appreciate,2)
(confidence,confidence,2)
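I want to perform an aggregation on the third column of the inner join result.
So, I want to compute the sum as 2 + (-3) + 1 + 1 + 2 + 2 = 5.
Is there a way to do this without storing the inner join result in a CSV file?
Please advise.
Thanks.

Can you add the following three lines of code and tell me the result?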
A = GROUP newinnerjoin ALL;
B = FOREACH A GENERATE SUM(newinnerjoin.$2);
DUMP B;
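
The three lines above work directly on the newinnerjoin relation, so the join result does not need to be written to a CSV file first. If wordlexion was loaded with a schema, the same sum can be written with a field name instead of the positional $2. The sketch below assumes the third column is called score and is typed int; that name is hypothetical and does not appear in the original post.

```pig
-- Hypothetical schema: assumes wordlexion was loaded as
--   (lexword:chararray, score:int); "score" is an illustrative name only.
joined_all = GROUP newinnerjoin ALL;   -- a single group holding every joined row
total = FOREACH joined_all GENERATE SUM(newinnerjoin.wordlexion::score) AS total_score;
DUMP total;
```

For the six rows shown in the question this should give a total of 5. If the column was loaded without a declared type, Pig treats it as bytearray and SUM casts it to double, so the result would likely print as 5.0 rather than 5.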