
Synchronizing messages sent/received with boost::mpi?


I invoke mpirun with -np 2. I refer to rank 0 as the master process and rank 1 as the slave process.

Goal:

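The master occasionally sends messages to the slave, such as mpi::send(1, UPDATE, data);. Other message types include DIE, COMPUTE, etc. These message types are const ints with unique values.

The slave runs an infinite loop, listening for any message from the master. When it receives a message, it sends an acknowledgement back to the master.

Implementation: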
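The post does not show how the message tags are declared. As a minimal sketch, they might look like the following. All numeric values, the kAckOffset constant, and the AckTagFor helper are assumptions made for illustration; the only hard requirement is that MPI tags are non-negative integers and that each one is unique.

```cpp
// Command tags (master -> slave). The numeric values are assumptions;
// the post only says they are const ints with unique values.
const int UPDATE = 1;
const int UPDATE2 = 2;
const int UPDATE3 = 3;
const int COMPUTE = 4;
const int DIE = 5;

// Reply tags (slave -> master), kept in a separate range so that a reply
// can never be mistaken for a command. kAckOffset is hypothetical.
const int kAckOffset = 100;
const int UPDATE_ACK = UPDATE + kAckOffset;
const int UPDATE2_ACK = UPDATE2 + kAckOffset;
const int UPDATE3_ACK = UPDATE3 + kAckOffset;
const int RESULT1 = 201;
const int RESULT2 = 202;

// Hypothetical helper: the acknowledgement tag for a given command tag.
inline int AckTagFor(int commandTag) { return commandTag + kAckOffset; }
```

Keeping commands and replies in disjoint ranges makes it easier to check, when reading logs, which side posted a given message.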

The slave runs:

...
int updateData, computeData;
// Post one non-blocking receive per command tag before entering the loop.
mpi::request updateRequest = world.irecv(0, UPDATE, updateData);
mpi::request computeRequest = world.irecv(0, COMPUTE, computeData);

do {
  cerr << "slave ready to take a command" << endl;
  if(updateRequest.test()) {
    cerr << "slave ireceived UPDATE" << endl;
    world.send(0, UPDATE_ACK, 0);
    cerr << "slave sent UPDATE_ACK" << endl;

    /* do something useful 
    ...
    ...
    */

    // Re-post the receive so that the next UPDATE can be matched.
    updateRequest = world.irecv(0, UPDATE, updateData);

  } else if (computeRequest.test()) {
    ...
  } else {
    boost::this_thread::sleep( boost::posix_time::seconds(1) );
  }
} while (true);  // the slave loops forever, as described above

While the master runs:

...
world.send(1, UPDATE, 10);
cerr << "master sent UPDATE" << endl;
int dummy;
world.recv(1, UPDATE_ACK, dummy);
cerr << "master received UPDATE_ACK" << endl;
...  
More context for the master code:

...
// update1
world.send(1, UPDATE, params);
cerr << "master sent UPDATE" << endl;
int dummy;
world.recv(1, UPDATE_ACK, dummy);
cerr << "master received UPDATE_ACK" << endl;

// update2
world.send(1, UPDATE2, params2);
cerr << "master sent UPDATE2" << endl;
world.recv(1, UPDATE2_ACK, dummy);
cerr << "master received UPDATE2_ACK" << endl;

// update3
world.send(1, UPDATE3, params3);
cerr << "master sent UPDATE3" << endl;
world.recv(1, UPDATE3_ACK, dummy);
cerr << "master received UPDATE3_ACK" << endl;

...

// training iterations
do {
  
  mpi::request computeRecvReq1, computeRecvReq2;
  std::map<int, int> result1, result2;

  // for each line in a text file, the master asks the slave(s)
  // to compute two things and aggregates the results
  for(unsigned sentId = 0; sentId != data.size(); sentId++) {

    // these two functions won't return until at least one slave is "idle"
    CollectSlavesWork1(computeRecvReq1, result1);
    CollectSlavesWork2(computeRecvReq2, result2);

    // async ask the slave to compute and async get the results
    world.isend(1, COMPUTE, sentId);
    computeRecvReq1 = world.irecv(1, RESULT1, result1);
    computeRecvReq2 = world.irecv(1, RESULT2, result2);

  }

  // based on the slave(s) work, the master updates params1 
  // and send them again to the slave(s)
  world.send(1, UPDATE, params);
  cerr << "master sent UPDATE" << endl;
  world.recv(1, UPDATE_ACK, dummy);              // PROBLEM HAPPENS HERE
  cerr << "master received UPDATE_ACK" << endl;


} while (!ModelIsConverged());

...  
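Output:

slave ready to take a command
master sent UPDATE
slave ireceived UPDATE
slave sent UPDATE_ACK
master received UPDATE_ACK
slave ready to take a command
slave ready to take a command
master sent UPDATE
slave ireceived UPDATE
slave sent UPDATE_ACK
slave ready to take a command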

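Problem:
The first time the master sends an UPDATE message, everything seems to work. On the second UPDATE, however, the master never receives the UPDATE_ACK, even though the slave reports having sent it.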

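Show more code from the master loop.

@hristoilev Done. See "More context for the master code". Thanks for your attention!

To be honest, I don't see any problem, other than the Boost MPI bindings, which I'm not familiar with. Which MPI implementation do you use?

I am using openmpi-1.6.3. Thanks again for your interest in this question.

For the record, I restructured my code to use so-called collective MPI operations such as reduce and broadcast, and found this to be much simpler than using point-to-point send/receive operations.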