Hadoop: How to merge rows (items) of the same relation in Apache Pig


I am new to Apache Pig. I have data like the following:

tempdata = 

(linsys4f-PORT42-0211201516244460,dnis=3007047505)
(linsys4f-PORT42-0211201516244460,incoming_tfn=8778816235,tfn_location=Ashburn Avaya,ivr_location=Ashburn Avaya,state=NC)
(linsys4f-PORT42-0211201516244460,language=ENGLISH)
(linsys4f-PORT42-0211201516244460,outcome=Transfer to CSR,exitType=Transfer,exitState=SETDIR2^7990019)
(linsys4f-PORT43-0211201516245465,outcome=Transfer to CSR,exitType=Transfer,exitState=SETDIR2^7990019)
(linsys4f-PORT44-0211201516291287,dnis=3007047505)
(linsys4f-PORT44-0211201516291287,incoming_tfn=8778816235,tfn_location=Ashburn Avaya,ivr_location=Ashburn Avaya,state=NC)
I need to merge the rows by key (linsys4f-PORT42-0211201516244460, linsys4f-PORT43-0211201516245465, and linsys4f-PORT44-0211201516291287).
The output should look like this:

(linsys4f-PORT42-0211201516244460,dnis=3007047505,incoming_tfn=8778816235,tfn_location=Ashburn Avaya,ivr_location=Ashburn Avaya,state=NC,language=ENGLISH,outcome=Transfer to CSR,exitType=Transfer,exitState=SETDIR2^7990019)

(linsys4f-PORT43-0211201516245465,dnis=3007047505,incoming_tfn=8778816235,tfn_location=Ashburn Avaya,ivr_location=Ashburn Avaya,state=NC,language=SPANISH)

(linsys4f-PORT43-0211201516245465,outcome=Transfer to CSR,exitType=Transfer,exitState=SETDIR2^7990019,dnis=3007047505,incoming_tfn=8778816235,tfn_location=Ashburn Avaya,ivr_location=Ashburn Avaya,state=NC).
How can I merge these rows? Any help would be greatly appreciated.

Try solving this with the grouping operator (the script below uses GROUP together with BagToTuple):

I have split your first field into link, port name, and port id to make things clearer.

A = LOAD '/home/coe_user_1/del/data.txt' USING PigStorage(',') AS
(port : CHARARRAY, dnis : CHARARRAY, incoming_tfn : CHARARRAY, tfn_location : CHARARRAY, ivr_location : CHARARRAY,state : CHARARRAY, language : CHARARRAY, outcome : CHARARRAY, exitType : CHARARRAY, exitState : CHARARRAY);

B = FOREACH A GENERATE
        FLATTEN(STRSPLIT(port, '-', 3)) AS (link: chararray, port: chararray, pid: chararray),  -- pid is too large for an int, so keep it as text
        dnis AS dnis,
        incoming_tfn AS incoming_tfn,
        tfn_location AS tfn_location,
        ivr_location AS ivr_location,
        state AS state,
        language AS language,
        outcome AS outcome,
        exitType AS exitType,
        exitState AS exitState;
C = FOREACH B GENERATE
        port AS port,
        --pid AS pid,
        dnis AS dnis,
        incoming_tfn AS incoming_tfn,
        tfn_location AS tfn_location,
        ivr_location AS ivr_location,
        state AS state,
        language AS language,
        outcome AS outcome,
        exitType AS exitType,
        exitState AS exitState;
D = GROUP C BY port;

E = FOREACH D GENERATE
        group AS port,FLATTEN(BagToTuple(C.dnis)) AS dnis, FLATTEN(BagToTuple(C.incoming_tfn)) AS incoming_tfn, FLATTEN(BagToTuple(C.tfn_location)) AS tfn_location, FLATTEN(BagToTuple(C.ivr_location)) AS ivr_location ,FLATTEN(BagToTuple(C.state)) AS state,FLATTEN(BagToTuple(C.language)) AS language, FLATTEN(BagToTuple(C.outcome)) AS outcome,FLATTEN(BagToTuple(C.exitType)) AS exitType,FLATTEN(BagToTuple(C.exitState)) AS exitState ;

DUMP E;
Output:

(PORT42,outcome=Transfer to CSR,language=ENGLISH,incoming_tfn=8778816235,dnis=3007047505,exitType=Transfer,,tfn_location=Ashburn Avaya,,exitState=SETDIR2^7990019,,ivr_location=Ashburn Avaya,,,,state=NC,,,,,,,,,,,,,,,,,,,,,)
(PORT43,outcome=Transfer to CSR,exitType=Transfer,exitState=SETDIR2^7990019,,,,,,)
(PORT44,incoming_tfn=8778816235,dnis=3007047505,tfn_location=Ashburn Avaya,,ivr_location=Ashburn Avaya,,state=NC,,,,,,,,,,,)
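
The output above contains many empty fields because each row carries a different number of key=value pairs, so they do not line up with the fixed column list given to PigStorage. As a possible alternative, here is a minimal sketch that keeps the whole key and treats everything after the first comma as a single string. It assumes the input file (the path 'data.txt' is hypothetical) holds one record per line without the surrounding parentheses, that every line has at least one attribute after the key, and that a Pig version providing the built-in BagToString is in use. The relation names raw, kv, grouped, and merged are made up for illustration.

```pig
-- Hypothetical input path; each line is assumed to look like: key,attr1=...,attr2=...
raw = LOAD 'data.txt' USING TextLoader() AS (line:chararray);

-- Split only once, at the first comma: the part before it is the key,
-- the part after it is the attribute list, kept as one string.
kv = FOREACH raw GENERATE
        FLATTEN(STRSPLIT(line, ',', 2)) AS (key:chararray, attrs:chararray);

-- Collect all attribute strings that share the same key.
grouped = GROUP kv BY key;

-- Join each group's attribute strings back into one comma-separated field.
merged = FOREACH grouped GENERATE
        group AS key,
        BagToString(kv.attrs, ',') AS attrs;

DUMP merged;
```

Pig does not guarantee the order of tuples inside a bag, so the attributes for a key may come out in a different order than in the input. Because the attributes stay in one string field, this sketch only merges rows; it does not parse the individual key=value pairs.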