Apache Pig: dividing one count by another in a Pig script


I have the Pig script above, which computes two counts. Now I want to divide the ch count by the ca count and store the result in a file.
How can I do this?

There is no convenient built-in way to do this in Pig, but a JOIN on a constant key can help:

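Pig script:

-- Count the rows of ch.txt. The constant key 'same' gives the JOIN below something to match on.
ch = LOAD 'ch.txt';
ch_all = GROUP ch ALL;
ch_count = FOREACH ch_all GENERATE 'same' AS key, (DOUBLE) COUNT(ch) AS ct;

-- Do the same for ca.txt.
ca = LOAD 'ca.txt';
ca_all = GROUP ca ALL;
ca_count = FOREACH ca_all GENERATE 'same' AS key, (DOUBLE) COUNT(ca) AS ct;

-- Joining on the constant key puts both counts into a single tuple.
ca_ch = JOIN ch_count BY key, ca_count BY key;

-- The counts were cast to DOUBLE above, so this is not an integer division.
ca_ch_div = FOREACH ca_ch GENERATE ch_count::ct / ca_count::ct;

DUMP ca_ch_div;
-- To write the result to a file instead of printing it, replace DUMP with STORE, for example:
-- STORE ca_ch_div INTO 'ratio_out';  ('ratio_out' is a placeholder output directory)

Output (ch.txt has 2 rows, ca.txt has 3):

(0.66666666)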

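How can the result of the division be truncated to two decimal places?

One way is to write a UDF that calls String.format("%.2f", number). Note that String.format rounds the value rather than truncating it.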
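
The String.format suggestion can be sketched in plain Java as follows. This is only an illustration of the formatting logic, not a complete Pig UDF: the class name TwoDecimals and the methods roundTo2 and truncateTo2 are hypothetical, and the wrapper that a real UDF would need (a class extending org.apache.pig.EvalFunc) is left out so that the sketch runs with the JDK alone. It also shows one way to get true truncation, in case rounding is not what is wanted.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Locale;

// Hypothetical helper class; in a Pig UDF this logic would sit inside exec().
public class TwoDecimals {

    // Formats with two decimals. String.format rounds (half up), it does not truncate.
    public static String roundTo2(double value) {
        return String.format(Locale.ROOT, "%.2f", value);
    }

    // Cuts off everything after the second decimal without rounding.
    public static String truncateTo2(double value) {
        return BigDecimal.valueOf(value)
                .setScale(2, RoundingMode.DOWN)
                .toPlainString();
    }

    public static void main(String[] args) {
        double ratio = 2.0 / 3.0; // same ratio as in the answer: 2 rows / 3 rows
        System.out.println(roundTo2(ratio));    // prints 0.67
        System.out.println(truncateTo2(ratio)); // prints 0.66
    }
}
```

Locale.ROOT is passed so that the decimal separator is always a dot, regardless of the default locale of the machine running the job.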

Input files for the script above:

cat ch.txt
1
2
cat ca.txt
1
2
3