Apache Pig: filtering a map by key or value

I can't figure out how to filter on a map's key or value in Apache Pig.

 a = LOAD '/user/maria_dev/timesheet2tuples.csv' using PigStorage('|')  AS (T:tuple(driverId:int,week:int),M:[mileslogged:int]);
 dump a;
 DESCRIBE  a;
Output:

((10,1),[70#3300])
((11,2),[70#3300])
((12,3),[60#2800])
((10,2),[70#3100])
((10,2),[70#3200])
((13,),[70#3300])
((14,),[70#3000])
((15,1),[80#3000])

a: {T: (driverId: int,week: int),M: map[mileslogged: int]}
Now, how do I filter on the map's value or key? I cannot find an example of this.

A lookup works when I already know the key.
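
For reference, a known-key lookup of that kind might look like the following minimal sketch. It assumes the relation `a` loaded above and uses the key '70' from the sample output purely for illustration; the relation name `known` is hypothetical.

```pig
-- Keep rows for driver 10 whose map entry under the known key '70' equals 3100.
-- M#'70' dereferences the map by key and yields null when the key is absent,
-- so rows that lack the key do not pass the comparison.
known = FILTER a BY (T.driverId == 10) AND (M#'70' == 3100);
DUMP known;
```

This only helps when the key is fixed in the script, which is exactly the limitation described here.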

However, I am looking for a solution where I do not know the key or value in advance.

Something like this:

B = FILTER a by (T.$0 == 10) AND (m#value == 3100) ;


B = FILTER a by (T.$0 == 10) AND (m#key == '70') ;

If you are using Pig 0.17, you can flatten the map, filter the resulting rows, and then put them back into a map:

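-- The file name must be a quoted string.
A = LOAD 'input.txt' using PigStorage() AS (T:tuple(driverId:int,week:int),M:[mileslogged:int]);
-- Flattening the map produces one row per entry, with fields M::key and M::value.
B = FOREACH A GENERATE T, FLATTEN(M);
C = FILTER B by T.driverId == 10 and M::value == 3100;
-- Rebuild a single-entry map from each surviving key/value pair.
D = FOREACH C GENERATE T, TOMAP(M::key,M::value) as M;
dump D;
describe D;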

Thanks @koji. I have Apache Pig version 0.16.0.2.6.5.0-292; I will update and test with your approach. This version throws an error on M::value.

Koji, I have installed the latest version of HDP, 3.0.1, but its Pig version is still "Apache Pig version 0.16.0.3.0.1.0-187 (rUnversioned directory) compiled Sep 19 2018, 10:13:33", and of course it throws the same error. Is there a solution for this version?