Hadoop: FILTER command in Pig returns 0 records


DUMP C returns 0 records, even though the file does contain records for 1956.

Script and sample data:

A = LOAD 'Batting.csv' USING PigStorage(',');
B = foreach A generate $0 as id:int,$1 as year:int,$8 as run:int;
C = FILTER B by year==1956;
DUMP B

playerID,yearID,stint,teamID,lgID,G,G_batting,AB,R,H,2B,3B,HR,RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP,G_old
aardsda01,2004,1,SFN,NL,11,11,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,11
aardsda01,2006,1,CHN,NL,45,43,2,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,45
aardsda01,2007,1,CHA,AL,25,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
aardsda01,2008,1,BOS,AL,47,5,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,5
aardsda01,2009,1,SEA,AL,73,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
aardsda01,2010,1,SEA,AL,53,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
aaronha01,1954,1,ML1,NL,122,122,468,58,131,27,6,13,69,2,2,28,39,,3,6,4,13,122
aaronha01,1955,1,ML1,NL,153,153,602,105,189,37,9,27,106,3,1,49,61,5,3,7,4,20,153
aaronha01,1956,1,ML1,NL,153,153,609,106,200,34,14,26,92,2,4,37,54,6,2,5,7,21,153
aaronha01,1957,1,ML1,NL,151,151,615,118,198,27,6,44,132,1,1,57,58,15,0,0,3,13,151
aaronha01,1958,1,ML1,NL,153,153,601,109,196,34,4,30,95,4,1,59,49,16,1,0,3,21,153
aaronha01,1959,1,ML1,NL,154,154,629,116,223,46,7,39,123,8,0,51,54,17,4,0,9,19,154
aaronha01,1960,1,ML1,NL,153,153,590,102,172,20,11,40,126,16,7,60,63,13,2,0,12,8,153
aaronha01,1961,1,ML1,NL,155,155,603,115,197,39,10,34,120,21,9,56,64,20,2,1,9,16,155
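
To double-check which zero-based positions `$1` and `$8` in the script refer to, the header can be numbered with standard tools. This is only a sketch: the `header` and `columns` variable names are made up for the illustration, and the header text is copied from the sample above rather than read from the real file.

```shell
# Header line copied from the sample data; in practice it would come from
# something like `head -n 1 Batting.csv` (assumption: the file still has its header).
header='playerID,yearID,stint,teamID,lgID,G,G_batting,AB,R,H,2B,3B,HR,RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP,G_old'

# One column name per line, prefixed with its zero-based position,
# which is how Pig's positional references ($0, $1, ...) count fields.
columns=$(printf '%s\n' "$header" | tr ',' '\n' | awk '{ printf "$%d %s\n", NR - 1, $0 }')
printf '%s\n' "$columns"
```

With this header the listing puts yearID at `$1` and R at `$8`, so the positions used in the script do point at the intended columns.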

Your B relation is not strictly necessary for testing whether the filter works.

DUMP B output posted in the question:

(zuvelpa01,1984,2)
(zuvelpa01,1985,16)
(zuvelpa01,1986,2)
(zuvelpa01,1987,2)
(zuvelpa01,1988,9)
(zuvelpa01,1989,10)
(zuvelpa01,1991,0)
(zuverge01,1951,0)
(zuverge01,1952,1)
(zuverge01,1954,1)
(zuverge01,1954,1)
(zuverge01,1955,0)
(zuverge01,1955,1)
(zuverge01,1956,0)
(zuverge01,1957,1)
(zuverge01,1958,0)
(zuverge01,1959,0)
(zwilldu01,1910,7)
(zwilldu01,1914,91)
(zwilldu01,1915,65)
(zwilldu01,1916,4)
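
You do need to remove the header line from the file, though. After that, the data can be cast to integers.

Alternatively, just use the command-line tools: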

$ cat batting.pig
A = LOAD 'Batting.csv' USING PigStorage(',');
C = FILTER A by (int)$1==1956;
\d C

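$ sed -i '' 1d Batting.csv
$ pig -f batting.pig
...
(aaronha01,1956,1,ML1,NL,153,153,609,106,200,34,14,26,92,2,4,37,54,6,2,5,7,21,153)

Comments:

Post sample data from your file; most likely you are not referencing the correct column. Run DUMP B to see what that relation actually contains.

I have edited both into the question details.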
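
The `sed -i '' 1d` call in the answer uses the BSD/macOS form of in-place editing; GNU sed takes `-i` without the empty argument. As a hedged alternative, the header can be dropped into a new file with `tail -n +2`, and the 1956 rows can be counted with `awk` as a sanity check outside Pig. The file names `sample_batting.csv` and `sample_noheader.csv` are hypothetical, and the three data rows are copied from the question's sample.

```shell
# Hypothetical scratch file: the header plus three rows from the question's sample.
cat > sample_batting.csv <<'EOF'
playerID,yearID,stint,teamID,lgID,G,G_batting,AB,R,H,2B,3B,HR,RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP,G_old
aaronha01,1955,1,ML1,NL,153,153,602,105,189,37,9,27,106,3,1,49,61,5,3,7,4,20,153
aaronha01,1956,1,ML1,NL,153,153,609,106,200,34,14,26,92,2,4,37,54,6,2,5,7,21,153
aaronha01,1957,1,ML1,NL,151,151,615,118,198,27,6,44,132,1,1,57,58,15,0,0,3,13,151
EOF

# Drop the first line (the header) without modifying the original file.
tail -n +2 sample_batting.csv > sample_noheader.csv

# Count rows whose second column (yearID) equals 1956.
# This mirrors the intent of FILTER A BY (int)$1 == 1956, but runs outside Pig.
matches=$(awk -F',' '$2 == 1956 { n++ } END { print n + 0 }' sample_noheader.csv)
echo "rows with yearID 1956: $matches"
```

If the count is non-zero here but Pig still returns nothing, the problem is more likely in the script (column position or cast) than in the data.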