Apache Pig: loading data with a multi-character delimiter


Hi everyone, I have a question about loading data with Apache Pig. The file format looks like this:

"1","2","xx,yy","a,sd","3"

So I want to load it using the multi-character delimiter "," (two double quotes with a comma between them):

A = LOAD 'file.csv' USING PigStorage('","') AS (f1,f2,f3,f4,f5);

But PigStorage does not accept the multi-character delimiter ",". How can I do this? Thanks, everyone.

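PigStorage takes a single character as its delimiter. You will need to use the CSVLoader function from Piggybank instead. Download piggybank.jar, save it in the same folder as your Pig script, and register the jar in the script: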

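-- Make the Piggybank loaders available to this script.
REGISTER piggybank.jar;

DEFINE CSVLoader org.apache.pig.piggybank.storage.CSVLoader();

-- CSVLoader splits on commas and keeps commas inside double-quoted fields.
A = LOAD 'test1.txt' USING CSVLoader() AS (f1:int,f2:int,f3:chararray,f4:chararray,f5:int);
B = FOREACH A GENERATE f1,f2,f3,f4,f5;
DUMP B;
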
Another option is to load each record as a single line and then split it into fields yourself.
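
The function this answer originally pointed to is not preserved here, so the following is only a sketch of the single-line approach. It assumes the built-in TextLoader, STRSPLIT and REPLACE functions, and the relation names raw, split_rows and clean are made up for illustration. It reads each record as one chararray, splits on the three-character sequence ",", and then strips the quotes left over on the first and last fields.

```pig
-- Read every record as one chararray; TextLoader does no field splitting.
raw = LOAD 'file.csv' USING TextLoader() AS (line:chararray);

-- STRSPLIT takes a regular expression, so the whole "," sequence can be the separator.
-- The limit of 5 matches the five fields in the sample row.
split_rows = FOREACH raw GENERATE
    FLATTEN(STRSPLIT(line, '","', 5))
    AS (f1:chararray, f2:chararray, f3:chararray, f4:chararray, f5:chararray);

-- The first field keeps its opening quote and the last keeps its closing quote; remove both.
clean = FOREACH split_rows GENERATE
    REPLACE(f1, '"', '') AS f1, f2, f3, f4, REPLACE(f5, '"', '') AS f5;

DUMP clean;
```

Unlike CSVLoader, this sketch does not handle escaped quotes inside a field, and every field comes out as a chararray, so numeric columns need an explicit cast afterwards.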


This was one of my problems too, and it really helped me solve part of it. Very useful, thanks. If possible, please take a look at these comments: