Apache Pig: filtering rows on multiple column conditions


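I need to write a Pig script that computes the average of several columns and keeps only the rows in which every one of those column values is greater than the computed average. My script is: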

i2 = GROUP i1 all;
i3 = FOREACH i2 GENERATE
    AVG(i1.user_followers_count) AS avg_user_followers_count,
    AVG(i1.avl_user_follower_following_ratio) AS avg_avl_user_follower_following_ratio,
    AVG(i1.user_total_liked) AS avg_user_total_liked,
    AVG(i1.user_total_posts) AS avg_user_total_posts,
    AVG(i1.user_total_public_lists) AS avg_user_total_public_lists,
    AVG(i1.avl_user_total_retweets) AS avg_avl_user_total_retweets,
    AVG(i1.avl_user_total_likes) AS avl_user_total_likes,
    AVG(i1.avl_user_total_replies) AS avg_avl_user_total_replies,
    AVG(i1.avl_user_engagements) AS avl_avl_user_engagements,
    AVG(i1.user_reply_to_reply_count) AS avg_user_reply_to_reply_count;

top_inf = FILTER i1 BY (i1.user_followers_count > i3.avg_user_followers_count, i1.avl_user_total_retweets > i3. avg_avl_user_total_retweets, i1.avl_user_total_likes > i3.avg_avl_user_total_retweets);

But this produces an error:

ERROR 1200: <file user.pig, line 70, column 103>  mismatched input '>' expecting RIGHT_PAREN

What is the correct way to filter rows on multiple conditions?

Separate the conditions with AND rather than commas:

top_inf = FILTER i1 BY (i1.user_followers_count > i3.avg_user_followers_count) 
                   AND (i1.avl_user_total_retweets > i3.avg_avl_user_total_retweets) 
                   AND (i1.avl_user_total_likes > i3.avg_avl_user_total_retweets);

Do we need the parentheses around each condition?
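
For reference, here is a hedged sketch of the same AND-based filter. It has not been tested and assumes i1 and i3 are defined exactly as in the question. It differs from the answer in two ways. First, fields of the relation being filtered are referenced by bare name, because a relation-qualified reference such as i1.user_followers_count inside FILTER i1 is, as far as I know, treated as a scalar lookup and is likely to fail at runtime when i1 has more than one row. Second, the likes column is compared against the likes average, which the question's script aliases as avl_user_total_likes, on the assumption that this was the intent.

```pig
-- Sketch only: assumes i1 and i3 exist as defined in the question's script.
-- i3 comes from GROUP ... ALL, so it holds a single row and i3.<field>
-- can be used as a scalar inside the filter expression.
top_inf = FILTER i1 BY (user_followers_count > i3.avg_user_followers_count)
                   AND (avl_user_total_retweets > i3.avg_avl_user_total_retweets)
                   AND (avl_user_total_likes > i3.avl_user_total_likes);
```

On the parentheses: comparison operators should bind more tightly than AND in Pig Latin, so the parentheses around each condition are most likely optional and mainly aid readability. This is worth confirming against the documentation for the Pig version in use.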