Apache Pig: distinct count and string comparison inside a Pig Latin foreach loop
I am trying to filter users down to those who either have at least two countries in their profile or are from the USA. I tried this approach in Pig:
B = group A by userid;
C = foreach B {
    count = $1.country;
    count2 = distinct count;
    GENERATE (((SIZE(count2) > 1 OR count2.$0 != 'USA') ? group : null));
}
But this produces an error:
incompatible types in NotEqual Operator left hand side:bag :tuple(country:chararray) right hand side:chararray
I tried several other combinations, but had no luck.

Try this:
C =
    foreach (group A by userid)
    generate
        group as userid,
        COUNT(A) AS count,
        -- flatten only the country field so the single alias matches the schema
        FLATTEN(A.country) as country;
D = filter C by count > 1 OR country == 'USA';
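C is a relation with schema {userid:chararray, count:long, country:chararray}, where count is the number of countries associated with userid (assuming A holds one row per user and country). D is C filtered according to your criteria.

Thanks, this looks like a good solution. I was trying to solve it in a single loop and had not thought of it this way.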
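
For anyone who still wants to keep the logic in one nested foreach: the original error occurs because count2 is a bag, and a bag cannot be compared to a chararray with !=. A minimal sketch of one way around that is below. It assumes A has at least the fields userid and country, as in the question; the aliases countries, usa_rows, n_countries and has_usa are made up for illustration. It has not been run against the asker's data.

```pig
-- Assumes A has (at least) the fields userid and country, as in the question.
B = GROUP A BY userid;
C = FOREACH B {
        countries = DISTINCT A.country;            -- distinct countries for this user
        usa_rows  = FILTER A BY country == 'USA';  -- empty bag if the user has no USA row
        GENERATE group AS userid,
                 COUNT(countries) AS n_countries,
                 (IsEmpty(usa_rows) ? 0 : 1) AS has_usa;
}
-- Keep users with more than one distinct country, or with at least one USA row.
D = FILTER C BY n_countries > 1 OR has_usa == 1;
```

Unlike the flattened version, this yields one row per user, and because of the DISTINCT, duplicate country rows for the same user are not counted twice.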