How to scatter an array of strings in MPI C++

c++ mpi

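What I want to do is run a basic MapReduce operation over some strings. I want to:

1. Distribute a list of strings (evenly) to all of my processes.
2. In each process, map the received strings to objects of a custom class (for example, WordWithFrequency).
3. Gather the objects and send them out to the processes again for further operations.

This should be a simple task, but I could not find a correct way to do it. Here is my code: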

#include <iostream>
#include <fstream>
#include <mpi.h>
#include <vector>

...

int main(int argc, char *argv[]) {
    // Initialize the MPI environment
    MPI_Init(&argc, &argv);

    // Find out the process rank and the world size
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    vector<string> words = { "a", "bc", "d" };
    const int wordsLength = words.size();
    const int wordsPerProcess = wordsLength / world_size;

    string *subWords = new string[wordsPerProcess];
    MPI_Scatter(&words, wordsPerProcess, MPI_CHAR, subWords, wordsPerProcess, ???customDataType???, 0, MPI_COMM_WORLD);

    printf("Process %d got words:\n", world_rank);
    for (int i = 0; i < wordsPerProcess; ++i) {
        cout << subWords[i] << endl;
    }

    ...

With Boost.MPI this is a very simple task:

#include <boost/mpi.hpp>
...

int main(int argc, char *argv[]) {
    // Initialize the MPI environment.
    mpi::environment env(argc, argv);
    mpi::communicator world;

    vector<string> words = { "foo", "bar", "baz", "..." };
    const int wordCount = words.size();
    const int wordsPerProcess = wordCount / world.size();
    vector<vector<string> > wordsByProcess(world.size(), vector<string>());
    for (int j = 0; j < world.size(); ++j) {
        for (int k = 0, wordIndex = j * wordsPerProcess + k;
             k < wordsPerProcess && wordIndex < wordCount; ++k, ++wordIndex) {
            wordsByProcess[j].push_back(words[wordIndex]);
        }
    }

    vector<string> subWords;
    mpi::scatter(world, wordsByProcess, subWords, 0);
    // subWords is equal to wordsByProcess[world.rank()] here in every process.

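scatter takes a vector of elements, indexed by the rank of the process to which the corresponding value will be sent. For more details, see the Boost.MPI documentation.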
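The per-process vector described above can be built in several ways. The following is only a sketch of one option, not part of the original answer: a hypothetical helper, partitionWords, that fills one bucket per rank and hands any remainder to the lowest ranks, so that no word is dropped when the word count is not a multiple of the process count. The function name and the remainder policy are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split `words` into `processCount` buckets, where
// bucket i holds the words intended for rank i.
// The first (words.size() % processCount) buckets receive one extra word.
std::vector<std::vector<std::string>> partitionWords(
    const std::vector<std::string>& words, int processCount) {
    assert(processCount > 0);
    std::vector<std::vector<std::string>> buckets(processCount);
    const std::size_t base = words.size() / processCount;
    const std::size_t extra = words.size() % processCount;
    std::size_t next = 0;  // index of the first word not yet assigned
    for (int rank = 0; rank < processCount; ++rank) {
        const std::size_t count =
            base + (static_cast<std::size_t>(rank) < extra ? 1 : 0);
        buckets[rank].assign(words.begin() + next,
                             words.begin() + next + count);
        next += count;
    }
    return buckets;
}
```

On the root process, the result of partitionWords(words, world.size()) could presumably be passed to mpi::scatter in place of wordsByProcess from the answer's code.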
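Please try using std::vector instead of new[]. It will simplify your life considerably, because memory management problems are greatly reduced.

@tadman I think (in fact, I am sure) the problem lies in MPI_Scatter, not in the array. I am doing something completely wrong here, but I cannot find any example of scattering strings on the internet. P.S. std::vector did not help.

In this particular case it will not magically solve everything, but in the long run it will be useful, and you will not waste days tracking down memory leaks.

Possible duplicate. Long story short, MPI_Scatter() does not work with an array of pointers. If you really want to use MPI_Scatter(), you should use a 2D array of char (for example, an array of fixed-length "strings").

I solved my problem by using Boost.MPI, which can scatter a vector of strings without any extra work. It also makes serializing custom objects super easy. Thanks for the help!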
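For readers who want to stay with plain MPI, the fixed-length "string" array suggested in the comments above could look roughly like the sketch below. It only shows the packing and unpacking around the scatter; the slot width kSlot and the helper names packWords and unpackWords are hypothetical, and words longer than the slot are simply truncated. The MPI_Scatter call itself is shown in a comment and is not executed here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical slot width: every word occupies kSlot chars, NUL-padded.
constexpr std::size_t kSlot = 16;

// Pack words into one contiguous buffer of words.size() * kSlot chars.
// Words longer than kSlot - 1 chars are truncated (a limitation of this sketch).
std::vector<char> packWords(const std::vector<std::string>& words) {
    std::vector<char> buffer(words.size() * kSlot, '\0');
    for (std::size_t i = 0; i < words.size(); ++i) {
        const std::size_t len = std::min(words[i].size(), kSlot - 1);
        std::memcpy(buffer.data() + i * kSlot, words[i].data(), len);
    }
    return buffer;
}

// Rebuild `count` strings from a buffer produced by packWords.
std::vector<std::string> unpackWords(const std::vector<char>& buffer,
                                     std::size_t count) {
    assert(buffer.size() >= count * kSlot);
    std::vector<std::string> words;
    for (std::size_t i = 0; i < count; ++i) {
        // Each slot ends in at least one NUL, so construction stops at the padding.
        words.emplace_back(buffer.data() + i * kSlot);
    }
    return words;
}

// In an MPI program, the root would pack all words and every rank would call:
//   const int chunk = static_cast<int>(wordsPerProcess * kSlot);
//   std::vector<char> recv(chunk);
//   MPI_Scatter(sendBuffer.data(), chunk, MPI_CHAR,
//               recv.data(), chunk, MPI_CHAR, 0, MPI_COMM_WORLD);
//   std::vector<std::string> subWords = unpackWords(recv, wordsPerProcess);
```

Because every slot has the same width, each rank receives a plain block of MPI_CHAR and no custom datatype is needed. The cost is wasted padding and a hard upper bound on word length.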