C++ OpenCV: speeding up EM algorithm prediction

c++ image opencv


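I am using the cv::EM algorithm to perform Gaussian mixture model classification on an image stream. However, when I use the EM::predict method to classify pixels into the different models, it is too slow: a single 600x800 image takes about 3 seconds. The MOG background subtractor provided by OpenCV, on the other hand, performs this part very quickly, in only about 30 ms. So I decided to use its processing method to replace the EM::predict part. However, I do not know how to adapt it.

The code I use for the prediction part is as follows: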

cv::Mat floatSource;
source.convertTo ( floatSource, CV_32F );
cv::Mat samples ( source.rows * source.cols, 3, CV_32FC1 );

int idx = 0; 

for ( int y = 0; y < source.rows; y ++ )
{
    cv::Vec3f* row = floatSource.ptr <cv::Vec3f> (y);
    for ( int x = 0; x < source.cols; x ++ )
    {
        samples.at<cv::Vec3f> ( idx++, 0 ) = row[x];
    }
}

cv::EMParams params(2);  // num of mixture we use is 2 here
cv::ExpectationMaximization em ( samples, cv::Mat(), params );
cv::Mat means = em.getMeans();
cv::Mat weights = em.getWeights();

// pick the component with the larger weight
const int fgId = weights.at<float>(0) > weights.at<float>(1) ? 0 : 1;
idx = 0; 

for ( int y = 0; y < source.rows; y ++ )
{
    for ( int x = 0; x < source.cols; x ++ )
    {
        // one predict call per pixel: this is the slow part
        const int result = cvRound ( em.predict ( samples.row ( idx++ ), NULL ) );
    }
}

How can I change this part of the code so that I can use it in my first code in place of the em.predict part? Thanks in advance.

Update

I tried it myself by using the process8uC3 function in my code:

// MixData<T> is the per-Gaussian record (sortKey, weight, mean, var) from
// OpenCV's bgfg_gaussmix.cpp; it is not part of the public API and has to be copied.
cv::Mat fgImg ( 600, 800, CV_8UC3 );
cv::Mat bgImg ( 600, 800, CV_8UC3 );

double learningRate = 0.001;
int x, y, k, k1;
int rows = sourceMat.rows;  // source OpenCV matrix
int cols = sourceMat.cols;  // source OpenCV matrix
float alpha = (float) learningRate;
float T = 2.0f;    // background ratio threshold
float vT = 0.30f;  // variance threshold used for matching
int K = 3;         // number of Gaussians per pixel

const float w0 = (float) CV_BGFG_MOG_WEIGHT_INIT;
const float sk0 = (float) (CV_BGFG_MOG_WEIGHT_INIT/CV_BGFG_MOG_SIGMA_INIT);
const float var0 = (float) (CV_BGFG_MOG_SIGMA_INIT*CV_BGFG_MOG_SIGMA_INIT);
const float minVar = FLT_EPSILON;

for ( y = 0; y < rows; y ++ )
{
    const uchar* src = sourceMat.ptr < uchar > ( y );
    uchar* dst = fgImg.ptr < uchar > ( y );
    uchar* tmp = bgImg.ptr ( y );
    MixData<cv::Vec3f>* mptr = (MixData<cv::Vec3f>*)tmp;

    for ( x = 0; x < cols; x ++, mptr += K )
    {
        float wsum = 0, dw = 0;
        cv::Vec3f pix ( src[x*3], src[x*3+1], src[x*3+2] );

        // look for the first Gaussian that matches the current pixel
        for ( k = 0; k < K; k ++ )
        {
            float w = mptr[k].weight;
            cv::Vec3f mu = mptr[k].mean;
            cv::Vec3f var = mptr[k].var;
            cv::Vec3f diff = pix - mu;
            float d2 = diff.dot ( diff );

            if ( d2 < vT * ( var[0] + var[1] + var[2] ) )
            {
                // matched: update weight, mean and variance
                dw = alpha * ( 1.f - w );
                mptr[k].weight = w + dw;
                mptr[k].mean = mu + alpha * diff;
                var = cv::Vec3f ( std::max ( var[0] + alpha*(diff[0]*diff[0]-var[0]), minVar ),
                                  std::max ( var[1] + alpha*(diff[1]*diff[1]-var[1]), minVar ),
                                  std::max ( var[2] + alpha*(diff[2]*diff[2]-var[2]), minVar ) );

                mptr[k].var = var;
                mptr[k].sortKey = w/sqrt ( var[0] + var[1] + var[2] );

                // keep the Gaussians sorted by sortKey, largest first
                for ( k1 = k-1; k1 >= 0; k1 -- )
                {
                    if ( mptr[k1].sortKey > mptr[k1+1].sortKey )
                        break;
                    std::swap ( mptr[k1], mptr[k1+1] );
                }
                break;
            }
            wsum += w;
        }
        dst[x] = (uchar) (-(wsum >= T ));
        wsum += dw;

        if ( k == K )
        {
            // no Gaussian matched: replace the weakest one
            wsum += w0 - mptr[k-1].weight;
            mptr[k-1].weight = w0;
            mptr[k-1].mean = pix;
            mptr[k-1].var = cv::Vec3f ( var0, var0, var0 );
            mptr[k-1].sortKey = sk0;
        }
        else
            for ( ; k < K; k ++ )
            {
                mptr[k].weight *= dw;
                mptr[k].sortKey *= dw;
            }
    }
}

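It compiles without errors, but the result is a complete mess. I suspected this might be related to the values of T and vT and tried several other values, but it made no difference. So I believe that, even though it compiles, the way I am using it is wrong.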
Not a direct answer to your question, but a few remarks on your code:

int idx = 0; 

for ( int y = 0; y < source.rows; y ++ )
{
    cv::Vec3f* row = floatSource.ptr <cv::Vec3f> (y);
    for ( int x = 0; x < source.cols; x ++ )
    {
        samples.at<cv::Vec3f> ( idx++, 0 ) = row[x];
    }
}

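First, this loop can be replaced with a single call:

cv::Mat samples = floatSource.reshape(1, source.rows * source.cols);

This not only fixes your bug but also speeds up the process, because accessing pixels through Mat.at is not that fast, whereas reshape is an O(1) operation: the underlying matrix data is not touched, only the number of rows, columns and channels changes.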
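To illustrate why a reshape costs the same regardless of image size, here is a minimal sketch using a hypothetical `MatView` struct. It is not OpenCV's `cv::Mat`; it only mimics the idea of a header that points at a shared pixel buffer, so the names `MatView` and `reshapeView` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified matrix header: it does not own the pixels.
struct MatView {
    float* data;   // shared, contiguous pixel buffer
    int rows;
    int cols;
    int channels;
};

// Reinterpret the same buffer with a new channel count and row count.
// Only header fields are rewritten, so no pixel is read or copied.
MatView reshapeView(const MatView& m, int newChannels, int newRows) {
    const std::size_t total =
        static_cast<std::size_t>(m.rows) * m.cols * m.channels;
    const std::size_t perCol =
        static_cast<std::size_t>(newRows) * newChannels;
    assert(newChannels > 0 && newRows > 0);
    assert(total % perCol == 0);  // the new shape must cover the same elements

    MatView out = m;              // copies the pointer, not the data
    out.channels = newChannels;
    out.rows = newRows;
    out.cols = static_cast<int>(total / perCol);
    return out;
}
```

A 2x2 image with 3 interleaved channels reshaped with `reshapeView(img, 1, 4)` becomes a 4x3 single-channel view over the same buffer: one row per pixel, one column per channel, which is the sample layout the EM code expects.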
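Second, you can save some time by passing the full sample matrix to em::predict instead of one sample at a time. Currently you call em::predict once per row when a single call would do, and on top of that you call mat.row() once per row, which creates a temporary matrix (header) each time.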
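The idea of classifying all samples in one pass can be sketched without OpenCV. The code below is an illustration under stated assumptions, not the library's implementation: it assumes diagonal covariances, and the names `Gaussian3` and `classifyAll` are hypothetical. The per-component constants are computed once, and the inner loop reads the contiguous sample buffer directly with no temporary objects.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// One mixture component; a diagonal covariance is assumed for brevity.
struct Gaussian3 {
    float weight;
    float mean[3];
    float var[3];
};

// Label every 3-float sample with the index of its most likely component.
// `samples` holds n * 3 contiguous floats, `comps` must not be empty.
std::vector<int> classifyAll(const std::vector<float>& samples,
                             const std::vector<Gaussian3>& comps) {
    const std::size_t n = samples.size() / 3;

    // log(weight) - 0.5 * log(det(cov)) depends only on the component,
    // so it is computed once instead of once per pixel. The (2*pi) factor
    // is identical for all components and can be dropped.
    std::vector<float> logConst(comps.size());
    for (std::size_t k = 0; k < comps.size(); ++k) {
        const Gaussian3& g = comps[k];
        logConst[k] = std::log(g.weight) -
                      0.5f * (std::log(g.var[0]) + std::log(g.var[1]) +
                              std::log(g.var[2]));
    }

    std::vector<int> labels(n, 0);
    for (std::size_t i = 0; i < n; ++i) {
        const float* s = &samples[3 * i];
        float bestScore = -std::numeric_limits<float>::infinity();
        for (std::size_t k = 0; k < comps.size(); ++k) {
            const Gaussian3& g = comps[k];
            float d2 = 0.0f;  // squared Mahalanobis distance
            for (int c = 0; c < 3; ++c) {
                const float diff = s[c] - g.mean[c];
                d2 += diff * diff / g.var[c];
            }
            const float score = logConst[k] - 0.5f * d2;
            if (score > bestScore) {
                bestScore = score;
                labels[i] = static_cast<int>(k);
            }
        }
    }
    return labels;
}
```

With two components this amounts to a few dozen floating-point operations per pixel, so a hand-written loop of this kind is one way to check whether the time is really spent in the mixture evaluation or in per-call overhead.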
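One way to speed this up further is to parallelize the calls to predict, for example with TBB, which OpenCV itself uses (did you enable TBB when compiling OpenCV? predict may already be multithreaded; I have not checked this).

Take a look at the source code of GrabCut in OpenCV: modules/imgproc/src/grabcut.cpp.
That module contains a private class GMM, which implements training of a Gaussian mixture model and classification of samples. k-means is used to initialize the GMM. If you need faster initialization, you can try the k-means++ algorithm (see the generateCentersPP function in modules/core/src/matrix.cpp).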
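For reference, the k-means++ seeding idea can be sketched as follows. This is a simplified illustration and not a copy of OpenCV's generateCentersPP; the names `Point3` and `seedCentersPP` are hypothetical. Each new center is drawn with probability proportional to its squared distance from the nearest center already chosen, which tends to spread the initial centers across the data.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

using Point3 = std::array<float, 3>;

static float sqDist(const Point3& a, const Point3& b) {
    float d = 0.0f;
    for (int c = 0; c < 3; ++c) d += (a[c] - b[c]) * (a[c] - b[c]);
    return d;
}

// Returns the indices of up to k seed points chosen with k-means++ weighting.
std::vector<std::size_t> seedCentersPP(const std::vector<Point3>& pts,
                                       std::size_t k, std::mt19937& rng) {
    std::vector<std::size_t> centers;
    if (pts.empty() || k == 0) return centers;

    // First center: uniform random choice.
    std::uniform_int_distribution<std::size_t> pick(0, pts.size() - 1);
    centers.push_back(pick(rng));

    // d2[i] = squared distance from point i to its nearest chosen center.
    std::vector<float> d2(pts.size(), std::numeric_limits<float>::max());
    while (centers.size() < k) {
        double total = 0.0;
        for (std::size_t i = 0; i < pts.size(); ++i) {
            d2[i] = std::min(d2[i], sqDist(pts[i], pts[centers.back()]));
            total += d2[i];
        }
        if (total <= 0.0) break;  // every point coincides with a center

        // Draw a point with probability proportional to d2[i].
        std::uniform_real_distribution<double> u(0.0, total);
        const double r = u(rng);
        double acc = 0.0;
        std::size_t chosen = 0;
        for (std::size_t i = 0; i < pts.size(); ++i) {
            if (d2[i] <= 0.0f) continue;  // already a center, zero probability
            acc += d2[i];
            chosen = i;                   // last candidate with positive mass
            if (acc >= r) break;
        }
        centers.push_back(chosen);
    }
    return centers;
}
```

The returned indices can then serve as initial means for k-means or for the mixture model directly. The function returns fewer than k indices when the data contain fewer than k distinct points.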
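Thank you very much for your answer. I did as you said, using reshape and then passing the full matrix instead of each sample. But the result did not change much.

Is your OpenCV build optimized? The best way forward now, without having to rewrite everything, is:

Thanks for the suggestion. I wonder why the background subtraction provided by OpenCV works so fast. I think there must be a way to do this quickly, which is why I asked this question. Apart from parallelizing, I think it could be achieved if I could adapt this "process8uC3" function to my own code. Don't you think so?

I don't know; such a large difference between the two seems a bit odd. One reason might be that the number of mixtures used in background subtraction is smaller. But in your code you only use 2, right? Maybe something else hidden is taking up a lot of processing time. Have you tried profiling your code?

Yes, it is 2 in my case. I tried profiling my code with "VerySleepy" and got the following statistics: RtlUserThreadStart (time spent: 60.0% exclusive, 80.0% inclusive) SleepEx (20.0% exclusive, 20.0% inclusive) WaitForSingleObjectEx (20.0% exclusive, 20.0% inclusive) BaseThreadIni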