Concurrency: Sorting data from multiple processes in Erlang


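I am writing a program that sorts ID numbers into rooms and groups. I am trying to sort the output of several parallel processes, but the result comes out like this: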

{'C003','Group A',1}
{'C002','Group B',3}
{'C015','Group C',5}
{'C016','Group D',7}
{'C003','Group A',2}
{'C002','Group B',4}
{'C015','Group C',6}
{'C016','Group D',8}
But I want it to look like this:

{'C003','Group A',1}
{'C003','Group A',2}
{'C002','Group B',3}
{'C002','Group B',4}
{'C015','Group C',5}
{'C015','Group C',6}
{'C016','Group D',7}   
{'C016','Group D',8}
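I thought I could send the numbers to a single process and have it print them, with their groups, in order, but I can't figure out how to do that while still letting the processes run in parallel. I think selective receive could be a solution, but I can't work out how to apply it either. Can anyone help?

Here is a piece of code from my main program. Note: I am new to Erlang.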

-module(ppp).

-compile([export_all]).

categorise(L) ->

     Size = len(L) div 4,
     Rem = len(L) rem 4,    
     spawn(ppp, cat, [self(), 'Group A', L, 0, Size + Rem]),
     spawn(ppp, cat, [self(), 'Group B', L, (Size + Rem), Size]),
     spawn(ppp, cat, [self(), 'Group C', L, (2*Size + Rem), Size]),
     spawn(ppp, cat, [self(), 'Group D', L, (3*Size  + Rem), Size]), 
     wait(4). 

wait(0) -> {done};
wait(N) ->
receive
    done -> wait(N-1)
end.

cat(P, Name, L, Start, Elements) ->     
     Extract = lists:split(Start, L),   
     Group = element(2, Extract),   
     AGroup = lists:sublist(Group, Elements),
     spawn(ppp, putInRoom, [P, Name, AGroup]).

putInRoom(P, _, []) -> P ! done;

putInRoom(P, GroupName, [H|T]) -> 

if GroupName == 'Group A' ->        
    io:format("~w~n", [{'C003', GroupName, H}]),        
    putInRoom(P, GroupName, T);
    GroupName == 'Group B' -> 
    io:format("~w~n", [{'C002', GroupName, H}]),
    putInRoom(P, GroupName, T);
    GroupName == 'Group C' ->
    io:format("~w~n", [{'C015', GroupName, H}]),
    putInRoom(P, GroupName, T);
    GroupName == 'Group D' ->
    io:format("~w~n", [{'C016', GroupName, H}]),
    putInRoom(P, GroupName, T)
end.

len(L) -> 
    count(L, 0).

count([], Acc) -> Acc;

count([_|T], Acc) -> count(T, Acc + 1).

Instead of spawning one putInRoom process per group, spawn one for each element in the group:

-module(ppp).

-compile([export_all]).

categorise(L) ->

     Size = len(L) div 4,
     Rem = len(L) rem 4,    
     spawn(ppp, cat, [self(), 'Group A', L, 0, Size + Rem]),
     spawn(ppp, cat, [self(), 'Group B', L, (Size + Rem), Size]),
     spawn(ppp, cat, [self(), 'Group C', L, (2*Size + Rem), Size]),
     spawn(ppp, cat, [self(), 'Group D', L, (3*Size  + Rem), Size]), 
     wait(len(L)).  % one done message arrives per element, not per group

wait(0) -> {done};
wait(N) ->
receive
    done -> wait(N-1)
after 2000 ->
     ok
end.

cat(P, Name, L, Start, Elements) ->
     % io:format("cat: ~p L: ~p~n", [P,L]),     
     Extract = lists:split(Start, L),   
     Group = element(2, Extract),   
     AGroup = lists:sublist(Group, Elements),
     % io:format("AGroup: ~p~n", [AGroup]),
     lists:map(fun(G) ->
                 spawn(ppp, putInRoom, [P, Name, G])
               end, AGroup).

putInRoom(P, GroupName, H) -> 
    % io:format("putInRoom: ~p, List: ~p~n", [GroupName, H]),
    if GroupName == 'Group A' ->        
        io:format("~w~n", [{'C003', GroupName, H}]);
        GroupName == 'Group B' -> 
        io:format("~w~n", [{'C002', GroupName, H}]);
        GroupName == 'Group C' ->
        io:format("~w~n", [{'C015', GroupName, H}]);
        GroupName == 'Group D' ->
        io:format("~w~n", [{'C016', GroupName, H}])
    end,
    P ! done.

len(L) -> 
    count(L, 0).

count([], Acc) -> Acc;

count([_|T], Acc) -> count(T, Acc + 1).
It prints the groups in order:

2> ppp:categorise(lists:seq(1,8)).
{'C003','Group A',1}
{'C003','Group A',2}
{'C002','Group B',3}
{'C002','Group B',4}
{'C015','Group C',5}
{'C015','Group C',6}
{'C016','Group D',7}
{'C016','Group D',8}
{done}
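Another option is to introduce a delay by adding timer:sleep(100) between spawning the cat processes. But then you lose some of the concurrency. So if you want the messages printed in order, you have to come up with a different algorithm.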

-module(ppp).

-compile([export_all]).

categorise(L) ->

     Size = len(L) div 4,
     Rem = len(L) rem 4,    
     spawn(ppp, cat, [self(), 'Group A', L, 0, Size + Rem]),
     timer:sleep(100),
     spawn(ppp, cat, [self(), 'Group B', L, (Size + Rem), Size]),
     timer:sleep(100),
     spawn(ppp, cat, [self(), 'Group C', L, (2*Size + Rem), Size]),
     timer:sleep(100),
     spawn(ppp, cat, [self(), 'Group D', L, (3*Size  + Rem), Size]), 
     wait(4). 

wait(0) -> {done};
wait(N) ->
receive
    done -> wait(N-1)
after 2000 ->
     ok
end.

cat(P, Name, L, Start, Elements) ->
     % io:format("cat: ~p L: ~p~n", [P,L]),     
     Extract = lists:split(Start, L),   
     Group = element(2, Extract),   
     AGroup = lists:sublist(Group, Elements),
     % io:format("AGroup: ~p~n", [AGroup]),
     Pid = spawn(fun putInRoom2/0),
     Pid ! {P, cat, {Name, AGroup}}.

putInRoom2() -> 
    receive
        {P, cat, {_GroupName, []}} -> P ! done;
        {P, cat, {GroupName, L}} ->
            F = fun(G, [H|T]) ->
                    if
                        GroupName == 'Group A' ->        
                            io:format("~w~n", [{'C003', GroupName, H}]);
                        GroupName == 'Group B' -> 
                            io:format("~w~n", [{'C002', GroupName, H}]);
                        GroupName == 'Group C' ->
                            io:format("~w~n", [{'C015', GroupName, H}]);
                        GroupName == 'Group D' ->
                            io:format("~w~n", [{'C016', GroupName, H}])
                    end,
                    self() ! {P, cat, {G, T}}
                end,
            F(GroupName, L),
            putInRoom2()
    end.

len(L) -> 
    count(L, 0).

count([], Acc) -> Acc;

count([_|T], Acc) -> count(T, Acc + 1).
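
One possible "different algorithm" is to stop printing inside the workers altogether. The sketch below is only an illustration and is not taken from any of the answers: the module name ppp_ordered and the helpers start_worker/3, collect/1 and room/1 are made up for this example. Each worker builds its rows concurrently and sends them back tagged with a unique reference, and the parent uses selective receive to pick the replies up in spawn order before printing.

```erlang
-module(ppp_ordered).
-export([categorise/1]).

%% Hypothetical variant of categorise/1: workers return their rows
%% instead of printing them, and the parent prints in group order.
categorise(L) ->
    Parent = self(),
    Len = length(L),
    Size = Len div 4,
    Rem = Len rem 4,
    %% {GroupName, Start, Count}, using the same slicing as the question.
    Specs = [{'Group A', 0, Size + Rem},
             {'Group B', Size + Rem, Size},
             {'Group C', 2 * Size + Rem, Size},
             {'Group D', 3 * Size + Rem, Size}],
    %% Spawn every worker first, so all four run concurrently.
    Refs = [start_worker(Parent, L, Spec) || Spec <- Specs],
    %% Selective receive: wait for each reference in spawn order,
    %% regardless of the order in which the replies arrive.
    Rows = lists:append([collect(Ref) || Ref <- Refs]),
    lists:foreach(fun(Row) -> io:format("~w~n", [Row]) end, Rows),
    Rows.

%% Spawns one worker and returns the reference that tags its reply.
start_worker(Parent, L, {Name, Start, Count}) ->
    Ref = make_ref(),
    spawn(fun() ->
              %% lists:sublist/3 is 1-based, hence Start + 1.
              Items = lists:sublist(L, Start + 1, Count),
              Parent ! {Ref, [{room(Name), Name, X} || X <- Items]}
          end),
    Ref.

%% Ref is already bound, so only the matching reply is taken from
%% the mailbox; other replies stay queued until their turn.
collect(Ref) ->
    receive
        {Ref, Rows} -> Rows
    end.

%% Room assignment copied from the if-expression in putInRoom.
room('Group A') -> 'C003';
room('Group B') -> 'C002';
room('Group C') -> 'C015';
room('Group D') -> 'C016'.
```

With this arrangement, ppp_ordered:categorise(lists:seq(1,8)) should print the rows for Group A first and Group D last without relying on any sleep, because the ordering comes from the parent's receive order rather than from timing.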

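Thanks, but why does this solution print in order? And won't creating a process for every element cause a performance overhead for larger lists?

It prints in order because it spawns the processes in order. In your code it actually prints column-wise: [1,2,3],[4,5,6] gives you 1,4,2,5,3,6. I'm not sure why. As for performance, in its current state the putInRoom function is very short-lived, so I don't think it will be too costly. You can measure the performance, though.

The order is not what you expect because the spawns are asynchronous. That is, you spawn cat with [1,2] and [3,4] as data; putInRoom displays 1, but because the recursion takes time and the code is asynchronous, the second spawn has already run and prints 3, and so on. After adding a delay, you can see the print order change. Basically, you want to look at it as spawning processes plus synchronous message passing.

Can anyone provide a better answer? I don't mind the downvotes, but I'm also trying to learn. :)

This has nothing to do with parallelism; it is about concurrency. The problem is synchronization, because there is none between the processes. You need to synchronize the output somehow. Using sleeps will not work under high load. So the answer given by Kadaj is correct, because he suggests spawning the processes synchronously.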
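
On the performance question raised in the comments, a rough way to measure the cost of one process per element is timer:tc/1 from the standard library. The sketch below is an assumption-laden illustration, not part of the original answers: the module name spawn_cost and the function measure/1 are made up, and the workers do nothing except report back, so it only gives a lower bound on the overhead.

```erlang
-module(spawn_cost).
-export([measure/1]).

%% Spawns N trivial processes, waits until each has replied, and
%% returns {Microseconds, ok} as produced by timer:tc/1.
measure(N) ->
    Parent = self(),
    Ref = make_ref(),
    timer:tc(fun() ->
                 [spawn(fun() -> Parent ! {Ref, done} end)
                  || _ <- lists:seq(1, N)],
                 wait(Ref, N)
             end).

%% Counts down one {Ref, done} message per spawned process.
wait(_Ref, 0) -> ok;
wait(Ref, N) ->
    receive
        {Ref, done} -> wait(Ref, N - 1)
    end.
```

Calling spawn_cost:measure(100000) in the shell and comparing the result against a plain sequential loop over the same list gives a first impression of whether the per-element processes matter for your list sizes.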