Filter a list by a list of integers using Pig Latin

My list looks like this (lista.csv):

client-id    priority    client-start    assignment
12345        1            1250125125     13
1246         3            1250122156     27
12616        1            1250122351     3
...
I also have a second list, which is just a single column of IDs (listb.csv):

125125
124214
1246
125
...
What I want to do is filter the first list down to the clients whose IDs also appear in listb.

I tried something like this, but it doesn't work:

raw = LOAD 'lista.csv' USING PigStorage('\t') AS (client-id: int, priority: 
int, client-start: int, assignment: int); 
s4q = LOAD 'listb.csv' USING PigStorage('\t') AS (survs4id: int);
s4id = FOREACH s4q {
dd = FILTER raw by (client-id == s4q);
GENERATE dd;
}
DUMP dd;

Is there a way to solve this?

Join the two relations so that only the matching records remain. The join then acts as the filter.

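-- Pig field names cannot contain hyphens, so client-id and client-start are written with underscores here.
raw = LOAD 'lista.csv' USING PigStorage('\t') AS (client_id: int, priority: int, client_start: int, assignment: int);
s4q = LOAD 'listb.csv' USING PigStorage('\t') AS (survs4id: int);
-- An inner join keeps only the rows of raw whose client_id also appears in s4q.
s4id = JOIN raw BY client_id, s4q BY survs4id;
-- Project the four columns that came from raw; inside FOREACH, fields are referenced directly, without a relation prefix.
dd = FOREACH s4id GENERATE $0, $1, $2, $3;
DUMP dd;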
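
If listb.csv can contain the same ID more than once, a plain join will emit the matching client row once per duplicate. The variant below is a minimal sketch of one way to guard against that. It assumes the same file names and tab delimiter as above; the underscore field names and the aliases `listb_ids` and `joined` are illustrative and not from the original post. It also assumes the ID list is small enough to fit in memory, which is what the `'replicated'` join hint requires of the last relation listed.

```pig
raw = LOAD 'lista.csv' USING PigStorage('\t') AS (client_id: int, priority: int, client_start: int, assignment: int);
s4q = LOAD 'listb.csv' USING PigStorage('\t') AS (survs4id: int);

-- Drop repeated IDs so each matching client is emitted only once.
listb_ids = DISTINCT s4q;

-- 'replicated' asks Pig to load the last relation (the ID list) into memory
-- and perform the join on the map side.
joined = JOIN raw BY client_id, listb_ids BY survs4id USING 'replicated';

-- Keep only the columns that came from lista.csv; '::' disambiguates
-- field names after a JOIN.
dd = FOREACH joined GENERATE raw::client_id, raw::priority, raw::client_start, raw::assignment;
DUMP dd;
```

If the ID list is too large to fit in memory, dropping `USING 'replicated'` falls back to the default join while keeping the same result.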