
Hadoop Pig: changing the structure of a file

Can you help me change the structure of a file?

For example:

  • I have a string with two delimiters:

    1;2,3,4;2
    
  • I want to change it into rows with a single delimiter:

    1;2;2
    1;3;2
    1;4;2
    

  • Pig script:

     A = LOAD 'a.csv' USING PigStorage(';') AS (value1:chararray,value2:chararray,value3:chararray);
     B = FOREACH A GENERATE value1, FLATTEN(TOKENIZE(value2, ',')), value3;
     DUMP B;
    
    Input:

    1;2,3,4;2
    
    Output:

    (1,2,2)
    (1,3,2)
    (1,4,2)
    
    To write B out with ';' as the delimiter, use STORE:

     STORE B INTO 'requiredOutputLocation' USING PigStorage(';');
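
    Putting the load, split, and store steps together, a minimal end-to-end sketch might look like the following. It assumes the same 'a.csv' input as in the question and a Pig release whose TOKENIZE accepts a delimiter argument. The alias `item` is introduced here only for illustration, and 'requiredOutputLocation' is a placeholder path.

```pig
-- Load semicolon-delimited rows; every field is read as chararray.
A = LOAD 'a.csv' USING PigStorage(';')
    AS (value1:chararray, value2:chararray, value3:chararray);

-- TOKENIZE splits value2 on commas into a bag of single-field tuples.
-- FLATTEN un-nests that bag, so each element yields its own output row
-- alongside value1 and value3. The alias `item` is illustrative only.
B = FOREACH A GENERATE value1,
                       FLATTEN(TOKENIZE(value2, ',')) AS item,
                       value3;

-- Write the result with ';' between fields. The output path is a
-- placeholder; STORE fails if that location already exists.
STORE B INTO 'requiredOutputLocation' USING PigStorage(';');
```

    For the sample row 1;2,3,4;2, the stored output should contain the three rows asked for in the question: 1;2;2, 1;3;2 and 1;4;2.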
    

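    We can load the data with ';' (semicolon) as the delimiter, then use the TOKENIZE function to split the comma-separated values into a bag, and FLATTEN that bag so that each value ends up on its own row. I answered a similar question a few days ago, please check: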