Pig 10 filter on a chararray: is not null, not '', not working
I'm new to Pig and have been playing around with it, and I've hit a roadblock. Suppose I have the following:
dump test;
(1,2014-04-08 12:09:23.0)
(2,2014-04-08 12:09:23.0)
(3,null)
(4,null)
I want to filter "test" to remove the null values, so I do this:
filter_test = filter test by test.column2 is not null;
which I expect to give me something like this:
(1,2014-04-08 12:09:23.0)
(2,2014-04-08 12:09:23.0)
But it returns the same thing as before. It doesn't remove the null rows.
I'm using Pig 10, and the date column is of type chararray.
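Thanks for your help.
Your column2 doesn't contain null values; it contains the string 'null'. See the examples below of null as a string versus a real null.
Example 1: null as a string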
input.txt
1,2014-04-08 12:09:23.0
2,2014-04-08 12:09:23.0
3,null
4,null
Pig script:
A = LOAD 'input.txt' USING PigStorage(',') AS (f1:int,f2:chararray);
B = FILTER A BY f2!='null';
DUMP B;
Output:
(1,2014-04-08 12:09:23.0)
(2,2014-04-08 12:09:23.0)
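For contrast, here is a rough sketch of what the `is not null` filter from the question does on this same Example 1 input. The relation name `C` is only illustrative. Because the third and fourth rows hold the four-character string 'null' rather than a real null, every row should pass the test and all four rows should come back, which matches what the question describes.

```pig
-- Same input.txt as Example 1: rows 3 and 4 contain the text 'null'.
A = LOAD 'input.txt' USING PigStorage(',') AS (f1:int,f2:chararray);
-- 'null' here is an ordinary chararray value, so IS NOT NULL is true for every row.
C = FILTER A BY f2 is not null;
DUMP C;
```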
Example 2: real null values
input.txt
1,2014-04-08 12:09:23.0
2,2014-04-08 12:09:23.0
3,
4,
Pig script:
A = LOAD 'input.txt' USING PigStorage(',') AS (f1:int,f2:chararray);
B = FILTER A BY f2 is not null;
DUMP B;
Output:
(1,2014-04-08 12:09:23.0)
(2,2014-04-08 12:09:23.0)
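If the input might mix both cases (some rows with an empty field, others with the literal text 'null'), the two conditions can be combined in one filter. This is a minimal sketch that assumes the same `input.txt` layout and schema as the examples above. It is not something the original poster confirmed running.

```pig
A = LOAD 'input.txt' USING PigStorage(',') AS (f1:int,f2:chararray);
-- Keep a row only if f2 is a real value: not a true null (empty field)
-- and not the literal string 'null'.
B = FILTER A BY (f2 is not null) AND (f2 != 'null');
DUMP B;
```

With either version of input.txt shown above, only rows 1 and 2 should remain.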
Awesome! That worked! I see now that NULL would be an empty bag/tuple. I really thought I had tried your approach, but I guess I hadn't. Thanks.