Apache Pig: why does the DUMP operator return a path?

I have a simple Pig script:

CRE_28001 = LOAD '$input' USING  PigStorage(';') AS (CIA_CD_CRV_CIA:chararray,CIA_DA_EM_CRV:chararray,CIA_CD_CTRL_BLCE:chararray);  

-- Generate the file's columns

Data  = FOREACH CRE_28001 GENERATE
(chararray) CIA_CD_CRV_CIA AS CIA_CD_CRV_CIA,
(chararray) CIA_DA_EM_CRV AS CIA_DA_EM_CRV,
(chararray) CIA_CD_CTRL_BLCE AS CIA_CD_CTRL_BLCE,
(chararray) RUB_202 AS RUB_202;

-- Apply the required filter

CRE_28001_FILTER = FILTER Data BY (RUB_202 == '6');

LIMIT_DATA = LIMIT CRE_28001_FILTER 10;
DUMP LIMIT_DATA;

I am sure my filter is correct. The column RUB_202 has more than 100 rows whose value is "6". I have verified this several times.

Here is what I get:

Input(s):
Successfully read 444 records (583792 bytes) from: "/hdfs/data/adhoc/PR/02/RDO0/BB0/MGM28001-2019-08-19.csv"

Output(s):
Successfully stored 0 records in: "hdfs://ha-manny/hdfs/hadoop/pig/tmp/temp1618713487/tmp-1281522727"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1549794175705_3500029       ->      job_1549794175705_3500031,
job_1549794175705_3500031

Note that I never asked for the data to be stored in
hdfs://ha-manny/hdfs/hadoop/pig/tmp/temp1618713487/tmp-1281522727.

Why is this path generated automatically, and why can't I see any data with DESCRIBE or ILLUSTRATE either?


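After inspecting the output of the filter on its own, I found that the solution was to reference the columns by their positional index rather than by name. In other words: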

Data  = FOREACH CRE_28001 GENERATE
(chararray) $0 AS CIA_CD_CRV_CIA,
(chararray) $1 AS CIA_DA_EM_CRV,
(chararray) $2 AS CIA_CD_CTRL_BLCE,
(chararray) $3 AS RUB_202;

Then I applied the TRIM function, because the data in some columns contained spaces. That worked.
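
The two fixes above (positional references plus TRIM) could be combined roughly as follows. This is only a sketch under assumptions: it assumes the file really has a fourth field, and it loads without an AS schema so that $3 can be referenced, since the LOAD schema in the question declares only three fields. Which columns actually needed trimming is not stated above, so TRIM is applied to all of them here.

```pig
-- Sketch only: '$input' and the ';' delimiter are taken from the question.
-- Assumption: no AS schema on LOAD, so the fourth field is reachable as $3.
CRE_28001 = LOAD '$input' USING PigStorage(';');

-- Reference fields by position and strip leading/trailing whitespace.
Data = FOREACH CRE_28001 GENERATE
    TRIM((chararray) $0) AS CIA_CD_CRV_CIA,
    TRIM((chararray) $1) AS CIA_DA_EM_CRV,
    TRIM((chararray) $2) AS CIA_CD_CTRL_BLCE,
    TRIM((chararray) $3) AS RUB_202;

-- With the whitespace removed, the equality test can match values such as ' 6'.
CRE_28001_FILTER = FILTER Data BY (RUB_202 == '6');

LIMIT_DATA = LIMIT CRE_28001_FILTER 10;
DUMP LIMIT_DATA;
```

Trimming before the filter matters because a string comparison such as RUB_202 == '6' is exact, so a value with a stray space would not match.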