C++11 multithreaded processing is slower than single-threaded


c++ multithreading c++11 graphics


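I am a beginner at multithreading. I have read some of the basics and tried to apply them to my project for object visualization. The problem is that my multithreaded solution is slower than the single-threaded one and I don't know why. In addition, my application exits with an unexpected code for an unknown reason. I give two cases below where I tried to achieve better performance. I would like to know what I have misunderstood and where my general approach is wrong. I include part of the source code and summarize all my questions at the end.

Here is my thread factory implementation (very basic, but it is only a start):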

threadfactory.h

#pragma once

#include <thread>
#include <vector>
#include "ThreadInterface.h"
#include "../MemoryManagement/MemoryMgr.h"
#include "../Logging/LoggingDefines.h"

class CThreadFactory : public CThreadIntearface
{
    public:
        CThreadFactory();
        CThreadFactory(BYTE max_threads);
        ~CThreadFactory();

        void Init(BYTE max_threads);
        void Clear(void);

        //update waves
        virtual void UpdateWavesInternalPoints(CWaves& waves);
        virtual void UpdateWavesNormals(CWaves& waves);

        //update vertices
        virtual void TransformVertices(const CObject& object, const vector<TVertex>& input, vector<XMFLOAT3>& output, const CXNAMatrix& matrix);

        static const char* GetHeapName(void) { return "Thread factory"; }
#if (defined(DEBUG) | defined(_DEBUG))
        /**
        *   Return class name. This function is compiled only in debug mode.
        *   \return class name
        */
        NAME_FUNC();
#endif

    private:
        void Join(vector<std::thread>& threads);
        void ReleaseThreads(vector<std::thread>& threads);

    private:
        UINT muiNumberofThreads;

    private:
        DECLARE_HEAP;
};

It looks like a third-party API (probably DX) is creating threads, but in the process manager I see only one thread being used. That could be a problem.

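Here are my questions:

Is my thread factory implementation really that wrong, or does updating 40000 vertices simply not need to be divided among more threads?

If I am running into locking, I would like to know why. The vertex transformation uses iterators and the vertex vector container is split into parts, so I should not have any locking.

I decided to create the threads on every function call for one reason. At first I had the thread vector container as a member of the thread factory class, but this caused a memory leak in debug mode (release mode did not have this problem). That happened with the plain declaration alone, without doing anything with it. I never found out why. Is anything else needed to release threads correctly?

Now my application ends with code 27, because all threads return this error code. What does that mean?

Strangely, when I use 8 threads (7 + the main thread on an 8-thread CPU), in debug mode I see all 8 threads doing something. In release mode, however, nothing changes compared with using only one thread (the main one). Is this wrong behaviour, or can it be expected for some reason?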
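The partitioning described in the questions above (a vertex container split into disjoint parts, one part per thread) can be sketched as follows. This is only a minimal illustration, not the code from the project: `ScaleValue`, `TransformRange` and `TransformParallel` are hypothetical names, and plain `float` values stand in for `TVertex` and `XMFLOAT3`. Each worker receives a half-open index range, so no two threads write the same element and no lock is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-vertex work (not the original TransformPoint).
inline float ScaleValue(float v, float factor) { return v * factor; }

// Transforms input[begin, end) into output[begin, end). The range is
// half-open: `end` itself is excluded.
void TransformRange(const std::vector<float>& input, std::vector<float>& output,
                    std::size_t begin, std::size_t end, float factor)
{
    for (std::size_t i = begin; i < end; ++i)
        output[i] = ScaleValue(input[i], factor);
}

// Splits the work into `thread_count` contiguous chunks. The calling thread
// processes the last chunk itself and then joins the workers.
void TransformParallel(const std::vector<float>& input, std::vector<float>& output,
                       unsigned thread_count, float factor)
{
    assert(thread_count >= 1);
    assert(output.size() == input.size());

    const std::size_t total = input.size();
    const std::size_t chunk = total / thread_count;

    std::vector<std::thread> workers;
    workers.reserve(thread_count - 1);

    std::size_t begin = 0;
    for (unsigned t = 0; t + 1 < thread_count; ++t)
    {
        // std::cref / std::ref pass the vectors by reference instead of copying them.
        workers.emplace_back(&TransformRange, std::cref(input), std::ref(output),
                             begin, begin + chunk, factor);
        begin += chunk;
    }

    // The last chunk also absorbs the remainder of the integer division.
    TransformRange(input, output, begin, total, factor);

    for (std::thread& w : workers)
        w.join();
}
```

A call such as `TransformParallel(in, out, 8, 2.0f)` starts seven workers and uses the calling thread as the eighth. Whether this beats a single loop for 40000 elements depends on how expensive the per-element work is compared with starting and joining the threads, which has to be measured.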

Sorry for the long text, but I wanted to be precise to avoid misunderstandings. Thank you for your answers.

Edit, 17 December 2014:

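I reimplemented the function used by the threads (and made it independent of the Wave class), with no shared object references or variables, but it still does not work. I don't understand why... Interestingly, when I set 8 threads, in the debug executable I see my Core i7 running at 100%, but with no benefit in frame rate. In the release executable I see only 4 threads running and 25% CPU usage.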
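CPU load and frame rate are indirect indicators, so it can help to time the update in isolation. The helpers below are a hedged sketch of one way to do that. All three function names are hypothetical and none of this comes from the project. `AverageMicroseconds` times any callable, `ThreadSpawnCostMicroseconds` estimates what starting and joining one empty thread costs, and `SuggestedThreadCount` wraps `std::thread::hardware_concurrency()`, which may return 0 when the value is not known.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Returns the average wall-clock time of one call to `work`, in microseconds.
template <typename Work>
double AverageMicroseconds(Work work, int repetitions)
{
    assert(repetitions > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repetitions; ++r)
        work();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / repetitions;
}

// Cost of starting and joining one thread that does nothing: a rough lower
// bound on the overhead paid per thread when threads are created on every call.
double ThreadSpawnCostMicroseconds(int repetitions)
{
    return AverageMicroseconds([] { std::thread t([] {}); t.join(); }, repetitions);
}

// Worker count suggested by the standard library, falling back to 1 when
// hardware_concurrency() reports 0 (value not computable).
unsigned SuggestedThreadCount()
{
    const unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1u : n;
}
```

Comparing `AverageMicroseconds` for the single-threaded update with the figure for the threaded one, in a release build, shows directly whether thread start-up dominates the work being split.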

New multithreaded function:

void UpdateWaveInteriorPoints(TVertexFieldIterator previous_vertex_field, TVertexFieldIterator actual_vertex_field, DWORD min_row, DWORD max_row, float k1, float k2, float k3, UINT column_count)
{
    if (min_row < 1)
        min_row = 1;

    /*if (max_row >(RowCount() - 1))
        max_row = (RowCount() - 1);*/

    for (DWORD i = min_row; i < max_row; ++i)
    {
        for (DWORD j = 1; j < column_count - 1; ++j)
        {
            // After this update we will be discarding the old previous
            // buffer, so overwrite that buffer with the new update.
            // Note how we can do this inplace (read/write to same element) 
            // because we won't need prev_ij again and the assignment happens last.

            // Note j indexes x and i indexes z: h(x_j, z_i, t_k)
            // Moreover, our +z axis goes "down"; this is just to 
            // keep consistent with our row indices going down.

            previous_vertex_field[i*column_count + j].Position.y =
                k1*previous_vertex_field[i*column_count + j].Position.y +
                k2*actual_vertex_field[i*column_count + j].Position.y +
                k3*(actual_vertex_field[(i + 1)*column_count + j].Position.y +
                actual_vertex_field[(i - 1)*column_count + j].Position.y +
                actual_vertex_field[i*column_count + j + 1].Position.y +
                actual_vertex_field[i*column_count + j - 1].Position.y);
        }
    }
}


Function that creates the threads:

TVertexFieldIterator tActualVertexIterator = waves.mpObjectMesh->mVertices.begin();
TVertexFieldIterator tPreviousVertexIterator = waves.GetPrevSolutionVertices().begin();
std::vector<std::thread> threads;
//std::vector<std::future<void>> threads;
UINT dwWavePartDifference = waves.RowCount() / muiNumberofThreads;

DWORD dwMinRow = 1, dwMaxRow = dwWavePartDifference;
DWORD dwVertexCount = dwWavePartDifference*waves.ColumnCount();

for (UINT i = 0; i < muiNumberofThreads - 1; i++)
{
    //threads.emplace_back(std::async( std::launch::async, &CWaves::UpdateWaveInteriorPoints, &waves, tPreviousVertexIterator, tActualVertexIterator, dwMinRow, dwMaxRow, waves.GetK1(), waves.GetK2(), waves.GetK3(), waves.ColumnCount() ));
    threads.emplace_back(std::thread(&UpdateWaveInteriorPoints, tPreviousVertexIterator, tActualVertexIterator, dwMinRow, dwMaxRow, waves.GetK1(), waves.GetK2(), waves.GetK3(), waves.ColumnCount()));

    tActualVertexIterator += dwVertexCount;
    tPreviousVertexIterator += dwVertexCount;
}

tPreviousVertexIterator -= waves.ColumnCount(); //row - 1
tActualVertexIterator -= waves.ColumnCount(); //row - 1
waves.UpdateWaveInteriorPoints(tPreviousVertexIterator, tActualVertexIterator, dwMinRow, dwMaxRow, waves.GetK1(), waves.GetK2(), waves.GetK3(), waves.ColumnCount());

for (UINT i = 0; i < muiNumberofThreads -1; i++)
{
    //threads[i].wait();
    threads[i].join();
}
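For comparison, the same row-wise split can be written with explicit row ranges instead of shifted iterators, which makes it easier to check that every interior row is updated exactly once. This is a sketch under stated assumptions and not the project's code: `UpdateRows` and `UpdateParallel` are hypothetical names, and two flat row-major `float` grids replace the `Position.y` fields of the vertex buffers. The arithmetic follows the stencil in `UpdateWaveInteriorPoints` above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Updates rows [row_begin, row_end) of `prev` in place, reading neighbours
// only from `curr`. Both grids are row-major and hold rows * cols floats.
void UpdateRows(std::vector<float>& prev, const std::vector<float>& curr,
                std::size_t row_begin, std::size_t row_end, std::size_t cols,
                float k1, float k2, float k3)
{
    for (std::size_t i = row_begin; i < row_end; ++i)
        for (std::size_t j = 1; j + 1 < cols; ++j)
        {
            const std::size_t c = i * cols + j;
            prev[c] = k1 * prev[c] + k2 * curr[c] +
                      k3 * (curr[c + cols] + curr[c - cols] + curr[c + 1] + curr[c - 1]);
        }
}

// Distributes the interior rows 1 .. rows-2 over `thread_count` threads.
// Workers write disjoint rows of `prev` and only read `curr`, so no
// synchronization is needed apart from the final join.
void UpdateParallel(std::vector<float>& prev, const std::vector<float>& curr,
                    std::size_t rows, std::size_t cols, unsigned thread_count,
                    float k1, float k2, float k3)
{
    assert(thread_count >= 1 && rows >= 2);
    assert(prev.size() == rows * cols && curr.size() == rows * cols);

    const std::size_t interior = rows - 2;
    const std::size_t per_thread = interior / thread_count;

    std::vector<std::thread> workers;
    workers.reserve(thread_count - 1);

    std::size_t row = 1;
    for (unsigned t = 0; t + 1 < thread_count; ++t)
    {
        workers.emplace_back(&UpdateRows, std::ref(prev), std::cref(curr),
                             row, row + per_thread, cols, k1, k2, k3);
        row += per_thread;
    }

    // The calling thread handles the last block, including any remainder rows.
    UpdateRows(prev, curr, row, rows - 1, cols, k1, k2, k3);

    for (std::thread& w : workers)
        w.join();
}
```

Because every cell is computed by the same expression regardless of which thread runs it, the result can be compared element by element with a single call to `UpdateRows(prev, curr, 1, rows - 1, ...)` to confirm that the split covers all rows.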



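Marek

@mareknr When I brought up your question, there were 10 related questions with answers in the sidebar, all of them about why a multithreaded implementation is slower than a single-threaded one. I think one or more of them will address your problem.
Here are links to a few of them: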

void CObject::TransformVerticesSet(vector<TVertex>::const_iterator input, vector<XMFLOAT3>::iterator output, UINT number_of_vertices, const CXNAMatrix& matrix) const
{
    for (UINT i = 0; i <= number_of_vertices; i++)
    {
        CMatrixTransformations::TransformPoint(input[i].Position, matrix, output[i]);
    }
}

The thread 0x229c has exited with code 27 (0x1b).
The thread 0x22dc has exited with code 27 (0x1b).
The thread 0x11ac has exited with code 27 (0x1b).
The thread 0x328c has exited with code 27 (0x1b).
The thread 0x205c has exited with code 27 (0x1b).
The thread 0xf4c has exited with code 27 (0x1b).
The thread 0x894 has exited with code 27 (0x1b).
The thread 0x3094 has exited with code 27 (0x1b).
The thread 0x2eb4 has exited with code 27 (0x1b).
The thread 0x2ef8 has exited with code 27 (0x1b).
The thread 0x22f4 has exited with code 27 (0x1b).
The thread 0x2810 has exited with code 27 (0x1b).
The thread 0x29e0 has exited with code 27 (0x1b).
The thread 0x2e54 has exited with code 27 (0x1b).
D3D11 WARNING: Process is terminating. Using simple reporting. Please call ReportLiveObjects() at runtime for standard reporting. [ STATE_CREATION WARNING #0: UNKNOWN]
D3D11 WARNING: Live Producer at 0x012F05A0, Refcount: 8. [ STATE_CREATION WARNING #0: UNKNOWN]
D3D11 WARNING:  Live Object at 0x012F1D38, Refcount: 0. [ STATE_CREATION WARNING #0: UNKNOWN]
D3D11 WARNING:  Live Object at 0x013BA3F8, Refcount: 0. [ STATE_CREATION WARNING #0: UNKNOWN]

The program '[13272] EngineDX.exe' has exited with code 27 (0x1b).
