Java Map-Reduce framework
I have a text file that reads ...
This is Apache pig, works like a charm.
I want to count the number of times each character is repeated. The result should be ...
T=count of T
H=count of H
A=count of A
B=
Can anyone tell me how to break my words into characters in Pig?
Any help would be greatly appreciated.
input.txt
This is Apache pig,
works like
a charm
PigScript:
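-- Load each line of the file as a single field.
A = LOAD 'input.txt' AS line;
-- Replacing the empty pattern puts a '\n' around every character of the line.
B = FOREACH A GENERATE (REPLACE(line,'','\n')) AS (word:chararray);
-- Split on '\n' and flatten, so each character becomes its own row.
C = FOREACH B GENERATE FLATTEN(TOKENIZE(word,'\n'));
-- Group identical characters, then count the rows in each group.
D = GROUP C BY $0;
E = FOREACH D GENERATE group,COUNT($1);
DUMP E;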
Output:
( ,6)
(,,1)
(A,1)
(T,1)
(a,3)
(c,2)
(e,2)
(g,1)
(h,3)
(i,4)
(k,2)
(l,1)
(m,1)
(o,1)
(p,2)
(r,2)
(s,3)
(w,1)
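For readers coming from the Java side, the same per-character count can be sketched in plain Java as a way to cross-check the Pig result on a small file. This is only an illustration, not a MapReduce job and not how Pig runs the script; the class name `CharCount` and the method `countChars` are hypothetical names, and the input lines are hard-coded copies of input.txt. It assumes Java 9 or later for `List.of`.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CharCount {
    // Hypothetical helper (not part of Pig or Hadoop): counts every character,
    // including spaces and punctuation, across the given lines.
    static Map<Character, Long> countChars(List<String> lines) {
        Map<Character, Long> counts = new TreeMap<>();
        for (String line : lines) {
            for (char ch : line.toCharArray()) {
                // Add 1 to the running count for this character (starting at 1 if absent).
                counts.merge(ch, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumption: these literals mirror the three lines of input.txt.
        List<String> lines = List.of("This is Apache pig,", "works like", "a charm");
        // Print in the same (char,count) shape that DUMP uses.
        countChars(lines).forEach((ch, n) -> System.out.println("(" + ch + "," + n + ")"));
    }
}
```

The nested loop plays the role of the REPLACE/TOKENIZE/FLATTEN steps, and `merge` plays the role of GROUP plus COUNT. Whitespace counts depend on any trailing spaces in the actual file, so they may differ from the literal lines used here.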