Hadoop Pig: how to reference a field after a join and a group
I have a Pig script in which win, request and response are just tables loaded directly from the file system. Basically, I want to sum the bid price after joining and grouping, but I get an error:
Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast.
My guess is that I am not referring to win.bid_price correctly.
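When you perform multiple joins, I recommend giving your fields unique identifiers (for example, bid_id) so that every name stays unambiguous after the join. Alternatively, you can use the "::" disambiguation operator, but that can get very messy. For example, with uniquely named fields: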
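-- Load each table with explicit types and field names that are unique across all three relations.
wins = LOAD '/user/hadoop/rtb/wins' USING PigStorage(',') AS (f1_w:int, f2_w:int, f3_w:chararray);
reqs = LOAD '/user/hadoop/rtb/reqs' USING PigStorage(',') AS (f1_r:int, f2_r:int, f3_r:chararray);
resps = LOAD '/user/hadoop/rtb/resps' USING PigStorage(',') AS (f1_rp:int, f2_rp:int, f3_rp:chararray);
-- Because no field name collides, the short names remain usable after each join.
wins_reqs = JOIN wins BY f1_w, reqs BY f1_r;
wins_reqs_reps = JOIN wins_reqs BY f1_r, resps BY f1_rp;
-- Group on a field from wins, then sum a numeric field from the grouped bag.
win_group = GROUP wins_reqs_reps BY (f3_w);
win_sum = FOREACH win_group GENERATE group, SUM(wins_reqs_reps.f2_w);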
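
For comparison, here is a rough sketch of the "::" alternative, for the case where the tables share field names. The schemas (bid_id, bid_price, campaign, ts, site, status) and the aliases wr, wrr and flat are assumptions made up for illustration, since the original script is not shown. The idea is to project and rename right after each join so that the prefixes do not pile up:

```pig
-- Hypothetical schemas: these field names are assumptions, not the asker's real ones.
wins  = LOAD '/user/hadoop/rtb/wins'  USING PigStorage(',') AS (bid_id:int, bid_price:int, campaign:chararray);
reqs  = LOAD '/user/hadoop/rtb/reqs'  USING PigStorage(',') AS (bid_id:int, ts:int, site:chararray);
resps = LOAD '/user/hadoop/rtb/resps' USING PigStorage(',') AS (bid_id:int, bid_price:int, status:chararray);

-- bid_id exists on both sides, so after the join it must be written as wins::bid_id or reqs::bid_id.
wins_reqs = JOIN wins BY bid_id, reqs BY bid_id;

-- Keep only the needed fields and give them plain names again.
wr = FOREACH wins_reqs GENERATE wins::bid_id AS bid_id, wins::bid_price AS win_price, wins::campaign AS campaign;

-- Second join: wr and resps both carry bid_id, so disambiguate once more when projecting.
wrr = JOIN wr BY bid_id, resps BY bid_id;
flat = FOREACH wrr GENERATE wr::campaign AS campaign, wr::win_price AS win_price;

-- Group and sum; inside the FOREACH the bag is referenced by the grouped relation's alias.
grp = GROUP flat BY campaign;
win_sum = FOREACH grp GENERATE group AS campaign, SUM(flat.win_price) AS total_price;

-- DESCRIBE prints the schema, which is the quickest way to check the exact field names.
DESCRIBE win_sum;
```

If SUM still reports that it cannot infer a matching function, running DESCRIBE on the relation just before the GROUP should show whether the summed field is numeric and what its fully qualified name is.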