AVG() error in Apache Pig


My code:

table = load '/tmp/baseball/Salaries.csv' using PigStorage(',') AS (yearID,teamID,lgID,playerID,salary);
table2 = load '/tmp/baseball/Fielding.csv' using PigStorage(',') AS      (playerID,yearID,stint,teamID,lgID,Pos,G,GS,InnOuts,PO,A,E,DP,PB,WP,SB,CS,ZR);
combined_table = JOIN table by (yearID,playerID),table2 by (yearID,playerID);
combined_table_grped = GROUP combined_table BY Pos;
avg = FOREACH combined_table_grped GENERATE     combined_table.Pos,AVG(combined_table.salary);
dump avg;
But I get:

<file 5_avg_salary_2.pig, line 5, column 57> Multiple matching functions for org.apache.pig.builtin.AVG with input schema: ({{(bytearray)}}, {{(double)}}). Please use an explicit cast.
More generally, I'm confused about when AVG can be used. In theory my code should work. Can anyone help? Thanks.

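This is a field type mismatch. No types are declared in the load statements, so every field, including salary, defaults to bytearray, and Pig cannot decide which AVG implementation to use. Try declaring the salary column as double in the schema. Alternatively, use an explicit cast to double before calling AVG, because leaving the field as the default bytearray throws the error shown in the question.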
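The explicit-cast route could look roughly like the sketch below. It keeps the untyped load statements from the question and adds one projection step that casts salary to double before grouping. The alias names typed, typed_grped and avg_salary are made up for illustration, and the sketch assumes that Pos and salary stay unambiguous after the join (each appears in only one of the two inputs).

```pig
-- Untyped loads, as in the question: every field defaults to bytearray.
table = load '/tmp/baseball/Salaries.csv' using PigStorage(',') AS (yearID,teamID,lgID,playerID,salary);
table2 = load '/tmp/baseball/Fielding.csv' using PigStorage(',') AS (playerID,yearID,stint,teamID,lgID,Pos,G,GS,InnOuts,PO,A,E,DP,PB,WP,SB,CS,ZR);
combined_table = JOIN table by (yearID,playerID), table2 by (yearID,playerID);

-- Keep only the two fields that are needed and cast salary explicitly.
-- Values that cannot be parsed as a double become null.
typed = FOREACH combined_table GENERATE Pos, (double)salary AS salary;

-- AVG now receives a bag of doubles, so only one implementation matches.
typed_grped = GROUP typed BY Pos;
avg_salary = FOREACH typed_grped GENERATE group AS Pos, AVG(typed.salary) AS avg_salary;
dump avg_salary;
```

Generating group instead of the bag projection combined_table.Pos emits the position once per row rather than a bag that repeats it for every joined record.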

It is good practice to give the field names in the AS part of the load statement data types, e.g. AS (yearID:chararray, …, salary:double). This helps Pig choose the right UDF (i.e. the one that matches the type). Pig fields default to bytearray, not chararray.
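Applied to the script from the question, declaring types at load time might look like the sketch below. Only the fields the script actually uses are typed here; the chararray choice for yearID follows the example above, the types for playerID and Pos are assumptions, and the remaining columns are left untyped (so still bytearray). The alias avg_salary is made up for illustration.

```pig
-- Typed schema: salary is a double from the start, so no cast is needed later.
table = load '/tmp/baseball/Salaries.csv' using PigStorage(',')
    AS (yearID:chararray, teamID, lgID, playerID:chararray, salary:double);
table2 = load '/tmp/baseball/Fielding.csv' using PigStorage(',')
    AS (playerID:chararray, yearID:chararray, stint, teamID, lgID, Pos:chararray,
        G, GS, InnOuts, PO, A, E, DP, PB, WP, SB, CS, ZR);

-- The join keys have the same declared types on both sides.
combined_table = JOIN table by (yearID, playerID), table2 by (yearID, playerID);
combined_table_grped = GROUP combined_table BY Pos;

-- combined_table.salary is now a bag of doubles, so AVG resolves unambiguously.
avg_salary = FOREACH combined_table_grped GENERATE group AS Pos, AVG(combined_table.salary) AS avg_salary;
dump avg_salary;
```

Running describe on an alias (for example describe combined_table;) prints its schema, which is a quick way to confirm that salary really is a double before AVG is called.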