Async in C++ matrix multiplication


c++ asynchronous


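I'm trying to implement concurrent matrix multiplication in my code. My teacher told me I have to use async in my do_multiply function, but I don't know how. I think do_multiply has to launch a thread and then wait for the result. This is my do_multiply function: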

    void do_multiply(matrix_wrap<T> result, matrix_wrap<T> lhs, matrix_wrap<T> rhs) {
        const unsigned height = result.get_height();
        const unsigned width = result.get_width();
        const unsigned span = lhs.get_width();
        assert(span == rhs.get_height());
        for (unsigned i = 0; i != height; ++i)
            for (unsigned j = 0; j != width; ++j) {
                result(i, j) = 0;
                for (unsigned k = 0; k != span; ++k)
                    result(i, j) += lhs(i, k) * rhs(k, j);
            }
    }

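I tried using async in place of this line:

    result(i, j) += lhs(i, k) * rhs(k, j);

but I don't really understand how it works, and all I get are errors, so the code doesn't build. Any suggestions?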

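Multiplying an MxN matrix by an NxP matrix produces an MxP matrix. Note that each element of the result is computed by its own independent operation, so you can try launching the computation of each element as a separate task. Because every task writes to a different memory location, there are no race conditions and you don't need to worry about synchronization.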

Make sure you wait for all tasks to finish before returning! It should look something like this:

void do_multiply(matrix_wrap<T> result, matrix_wrap<T> lhs, matrix_wrap<T> rhs) {
    // Here you should probably assert that result, lhs, and rhs have compatible
    // sizes.

    // Lambda that will be launched once for each output matrix element.
    auto compute_element = [&result, &lhs, &rhs](unsigned row, unsigned col)
    {
        T value = T(0);
        ... // compute value here.
        result(row, col) = value;
    };

    // Requires <future> and <vector>. get_height()/get_width() are the
    // accessors used in the question's matrix_wrap.
    std::vector<std::future<void>> tasks;
    tasks.reserve(result.get_height() * result.get_width());
    // Launch all async tasks.
    for (unsigned row = 0; row < result.get_height(); row++)
    {
        for (unsigned col = 0; col < result.get_width(); col++)
        {
            auto task = std::async(std::launch::async, compute_element, row, col);
            tasks.push_back(std::move(task));
        }
    }

    // task.get() only returns once the task has finished.
    for (auto& task : tasks)
    {
        task.get();
    }
    // Now all tasks are finished, so we can be sure that all elements
    // of the result matrix are populated.
}
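
For reference, here is a minimal self-contained sketch of the same per-element approach with the elided computation written out as the dot product from the question's inner loop. The `simple_matrix` class and the name `do_multiply_async` are hypothetical stand-ins, since the real `matrix_wrap<T>` is not shown; only `get_height()`, `get_width()` and `operator()(i, j)` are assumed to exist on it.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for the asker's matrix_wrap<T>: dense, row-major storage.
template <typename T>
class simple_matrix {
public:
    simple_matrix(unsigned height, unsigned width)
        : height_(height), width_(width),
          data_(static_cast<std::size_t>(height) * width) {}

    unsigned get_height() const { return height_; }
    unsigned get_width() const { return width_; }

    T& operator()(unsigned i, unsigned j) {
        return data_[static_cast<std::size_t>(i) * width_ + j];
    }
    const T& operator()(unsigned i, unsigned j) const {
        return data_[static_cast<std::size_t>(i) * width_ + j];
    }

private:
    unsigned height_;
    unsigned width_;
    std::vector<T> data_;
};

// One std::async task per output element, as in the answer above.
template <typename T>
void do_multiply_async(simple_matrix<T>& result,
                       const simple_matrix<T>& lhs,
                       const simple_matrix<T>& rhs) {
    const unsigned height = result.get_height();
    const unsigned width = result.get_width();
    const unsigned span = lhs.get_width();
    assert(lhs.get_height() == height);
    assert(rhs.get_width() == width);
    assert(rhs.get_height() == span);

    // Each task accumulates into a local and writes exactly one element,
    // so no two tasks touch the same memory location.
    auto compute_element = [&result, &lhs, &rhs, span](unsigned row, unsigned col) {
        T value = T(0);
        for (unsigned k = 0; k != span; ++k)
            value += lhs(row, k) * rhs(k, col);
        result(row, col) = value;
    };

    std::vector<std::future<void>> tasks;
    tasks.reserve(static_cast<std::size_t>(height) * width);
    for (unsigned row = 0; row != height; ++row)
        for (unsigned col = 0; col != width; ++col)
            tasks.push_back(std::async(std::launch::async, compute_element, row, col));

    // get() blocks until the task is done and rethrows any exception it threw.
    for (auto& task : tasks)
        task.get();
}
```

Passing `std::launch::async` explicitly matters here: with the default policy the implementation is allowed to defer the work until `get()` is called, which would run everything sequentially on the calling thread.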

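Note that for small matrices (fewer than a few hundred rows and columns), performing the multiplication this way will most likely be slower than doing it on a single thread.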
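
One common way to reduce the per-task overhead mentioned above is to give each task a contiguous block of rows instead of a single element. The sketch below is an illustration under assumptions, not a tuned implementation: `simple_matrix`, `do_multiply_chunked` and the `num_tasks` parameter are hypothetical names, and the default task count simply follows `std::thread::hardware_concurrency()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-in for matrix_wrap<T>: dense, row-major storage.
template <typename T>
struct simple_matrix {
    simple_matrix(unsigned height, unsigned width)
        : height_(height), width_(width),
          data_(static_cast<std::size_t>(height) * width) {}
    unsigned get_height() const { return height_; }
    unsigned get_width() const { return width_; }
    T& operator()(unsigned i, unsigned j) {
        return data_[static_cast<std::size_t>(i) * width_ + j];
    }
    const T& operator()(unsigned i, unsigned j) const {
        return data_[static_cast<std::size_t>(i) * width_ + j];
    }
    unsigned height_;
    unsigned width_;
    std::vector<T> data_;
};

// Splits the output rows into at most num_tasks contiguous blocks.
// num_tasks == 0 means "use the hardware thread count" (an assumed default).
template <typename T>
void do_multiply_chunked(simple_matrix<T>& result,
                         const simple_matrix<T>& lhs,
                         const simple_matrix<T>& rhs,
                         unsigned num_tasks = 0) {
    const unsigned height = result.get_height();
    const unsigned width = result.get_width();
    const unsigned span = lhs.get_width();
    assert(lhs.get_height() == height);
    assert(rhs.get_width() == width);
    assert(rhs.get_height() == span);

    // hardware_concurrency() may return 0 when the count is unknown.
    if (num_tasks == 0)
        num_tasks = std::max(1u, std::thread::hardware_concurrency());
    num_tasks = std::min(num_tasks, std::max(1u, height));
    const unsigned rows_per_task = (height + num_tasks - 1) / num_tasks;

    // Computes rows [first_row, last_row); blocks never overlap, so tasks
    // write to disjoint parts of result.
    auto compute_rows = [&result, &lhs, &rhs, width, span](unsigned first_row,
                                                           unsigned last_row) {
        for (unsigned i = first_row; i != last_row; ++i)
            for (unsigned j = 0; j != width; ++j) {
                T value = T(0);
                for (unsigned k = 0; k != span; ++k)
                    value += lhs(i, k) * rhs(k, j);
                result(i, j) = value;
            }
    };

    std::vector<std::future<void>> tasks;
    tasks.reserve(num_tasks);
    for (unsigned first = 0; first < height; first += rows_per_task) {
        const unsigned last = std::min(first + rows_per_task, height);
        tasks.push_back(std::async(std::launch::async, compute_rows, first, last));
    }
    for (auto& task : tasks)
        task.get();
}
```

With this layout the number of threads stays bounded by `num_tasks` regardless of matrix size, whereas the per-element version starts one task for every output entry.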
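You should at least add the errors you are getting.

What should I use instead of async? I'm new to C++ and I don't really know what works better for concurrency.

OK. I'll try it on my laptop later; for now I should study and rethink what I want to do. Thank you, Paul!

Instead of having each task compute just one (row, col) pair, you can group several pairs into a chunk per task.

I think my teacher wants something different. As I understand it, he wants (A+B)*(C+D), where A, B, C, D are matrices, to be evaluated like this: first (A+B) and (C+D) are computed concurrently, then the multiplication is performed. Should I use this code and then add a wait/sleep or something similar in the sum method?

In that case, you can launch do_add(A, B) as an async task and then execute do_add(C, D). After that, collect the result of the async task and perform the multiplication.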
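
The last suggestion could be sketched as follows. This is an illustration under assumptions: `simple_matrix`, the value-returning `do_add` and `do_multiply`, and `sum_then_multiply` are hypothetical, since the thread does not show the real signatures. No wait or sleep is needed in the sum method, because `future::get()` already blocks until the asynchronous sum is ready.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for matrix_wrap<T>: dense, row-major storage.
template <typename T>
struct simple_matrix {
    simple_matrix(unsigned height, unsigned width)
        : height_(height), width_(width),
          data_(static_cast<std::size_t>(height) * width) {}
    unsigned get_height() const { return height_; }
    unsigned get_width() const { return width_; }
    T& operator()(unsigned i, unsigned j) {
        return data_[static_cast<std::size_t>(i) * width_ + j];
    }
    const T& operator()(unsigned i, unsigned j) const {
        return data_[static_cast<std::size_t>(i) * width_ + j];
    }
    unsigned height_;
    unsigned width_;
    std::vector<T> data_;
};

// Element-wise sum; assumed signature, returns a new matrix.
template <typename T>
simple_matrix<T> do_add(const simple_matrix<T>& a, const simple_matrix<T>& b) {
    assert(a.get_height() == b.get_height() && a.get_width() == b.get_width());
    simple_matrix<T> sum(a.get_height(), a.get_width());
    for (unsigned i = 0; i != a.get_height(); ++i)
        for (unsigned j = 0; j != a.get_width(); ++j)
            sum(i, j) = a(i, j) + b(i, j);
    return sum;
}

// Plain single-threaded product; assumed signature, returns a new matrix.
template <typename T>
simple_matrix<T> do_multiply(const simple_matrix<T>& lhs, const simple_matrix<T>& rhs) {
    assert(lhs.get_width() == rhs.get_height());
    simple_matrix<T> product(lhs.get_height(), rhs.get_width());
    for (unsigned i = 0; i != lhs.get_height(); ++i)
        for (unsigned j = 0; j != rhs.get_width(); ++j) {
            T value = T(0);
            for (unsigned k = 0; k != lhs.get_width(); ++k)
                value += lhs(i, k) * rhs(k, j);
            product(i, j) = value;
        }
    return product;
}

// Evaluates (A + B) * (C + D) with the two sums running concurrently.
template <typename T>
simple_matrix<T> sum_then_multiply(const simple_matrix<T>& a, const simple_matrix<T>& b,
                                   const simple_matrix<T>& c, const simple_matrix<T>& d) {
    // Start A + B on another thread...
    std::future<simple_matrix<T>> left =
        std::async(std::launch::async, [&a, &b] { return do_add(a, b); });
    // ...while the calling thread computes C + D.
    simple_matrix<T> right = do_add(c, d);
    // get() waits for the async sum to finish, then hands back its result.
    return do_multiply(left.get(), right);
}
```

Only one extra thread is used here: the second sum runs on the calling thread, which would otherwise just sit idle waiting for the first one.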