Segmentation fault in C++ OpenMP parallelization
I am parallelizing the following function with OpenMP:
float myfunc ( Class1 *class1, int *feas, int numfeas, float z, long *k, double cost, long iter, float e )
{
    long i;
    long x;
    double sum;
    sum = cost;
    while ( change/cost > 1.0*e ) {
        change = 0.0;
        intshuffle ( feas, numfeas );
        #pragma omp parallel for private (i,x) firstprivate(z,k) reduction (+:sum)
        for ( i=0; i<iter; i++ ) {
            x = i%numfeas;
            sum += pgain ( feas[x], class1, z, k );
        }
        cost -= sum;
    }
    return ( cost );
}
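I get a "double free or corruption (out): 0xb6a00468" error.
From my limited knowledge of OpenMP, I understand this is caused by incorrect memory accesses through the class1 and feas pointers.
For reference, I am also posting my Class1 definition: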
typedef struct {
float w;
float *c;
long a;
float co;
} Class1;
Please advise on the correct way to parallelize the function above.
Update:
The pgain function:
double pgain ( long x, Class1 *class1, double z, long int *numcenters )
{
    int i;
    int number_of_centers_to_close = 0;
    static double *work_mem;
    static double gl_cost_of_opening_x;
    static int gl_number_of_centers_to_close;
    int stride = *numcenters + 2;
    //make stride a multiple of CACHE_LINE
    int cl = CACHE_LINE/sizeof ( double );
    if ( stride % cl != 0 ) {
        stride = cl * ( stride / cl + 1 );
    }
    int K = stride - 2 ; // K==*numcenters
    //my own cost of opening x
    double cost_of_opening_x = 0;
    work_mem = ( double* ) malloc ( 2 * stride * sizeof ( double ) );
    gl_cost_of_opening_x = 0;
    gl_number_of_centers_to_close = 0;
    /*
     * For each center, we have a *lower* field that indicates
     * how much we will save by closing the center.
     */
    int count = 0;
    for ( int i = 0; i < class1->num; i++ ) {
        if ( is_center[i] ) {
            center_table[i] = count++;
        }
    }
    work_mem[0] = 0;
    //now we finish building the table. clear the working memory.
    memset ( switch_membership, 0, class1->num * sizeof ( bool ) );
    memset ( work_mem, 0, stride*sizeof ( double ) );
    memset ( work_mem+stride,0,stride*sizeof ( double ) );
    //my *lower* fields
    double* lower = &work_mem[0];
    //global *lower* fields
    double* gl_lower = &work_mem[stride];
    #pragma omp parallel for
    for ( i = 0; i < class1->num; i++ ) {
        float x_cost = dist ( class1->p[i], class1->p[x], class1->dim ) * class1->p[i].weight;
        float current_cost = class1->p[i].cost;
        if ( x_cost < current_cost ) {
            // point i would save cost just by switching to x
            // (note that i cannot be a median,
            // or else dist(p[i], p[x]) would be 0)
            switch_membership[i] = 1;
            cost_of_opening_x += x_cost - current_cost;
        } else {
            // cost of assigning i to x is at least current assignment cost of i
            // consider the savings that i's **current** median would realize
            // if we reassigned that median and all its members to x;
            // note we've already accounted for the fact that the median
            // would save z by closing; now we have to subtract from the savings
            // the extra cost of reassigning that median and its members
            int assign = class1->p[i].assign;
            lower[center_table[assign]] += current_cost - x_cost;
        }
    }
    // at this time, we can calculate the cost of opening a center
    // at x; if it is negative, we'll go through with opening it
    for ( int i = 0; i < class1->num; i++ ) {
        if ( is_center[i] ) {
            double low = z + work_mem[center_table[i]];
            gl_lower[center_table[i]] = low;
            if ( low > 0 ) {
                // i is a median, and
                // if we were to open x (which we still may not) we'd close i
                // note, we'll ignore the following quantity unless we do open x
                ++number_of_centers_to_close;
                cost_of_opening_x -= low;
            }
        }
    }
    //use the rest of working memory to store the following
    work_mem[K] = number_of_centers_to_close;
    work_mem[K+1] = cost_of_opening_x;
    gl_number_of_centers_to_close = ( int ) work_mem[K];
    gl_cost_of_opening_x = z + work_mem[K+1];
    // Now, check whether opening x would save cost; if so, do it, and
    // otherwise do nothing
    if ( gl_cost_of_opening_x < 0 ) {
        // we'd save money by opening x; we'll do it
        #pragma omp parallel for
        for ( int i = 0; i < class1->num; i++ ) {
            bool close_center = gl_lower[center_table[class1->p[i].assign]] > 0 ;
            if ( switch_membership[i] || close_center ) {
                // Either i's median (which may be i itself) is closing,
                // or i is closer to x than to its current median
                #pragma omp critical
                {
                    class1->p[i].cost = class1->p[i].weight * dist ( class1->p[i], class1->p[x], class1->dim );
                    class1->p[i].assign = x;
                }
            }
        }
        for ( int i = 0; i < class1->num; i++ ) {
            if ( is_center[i] && gl_lower[center_table[i]] > 0 ) {
                is_center[i] = false;
            }
        }
        if ( x >= 0 && x < class1->num ) {
            is_center[x] = true;
        }
        *numcenters = *numcenters + 1 - gl_number_of_centers_to_close;
    } else {
        gl_cost_of_opening_x = 0; // the value we'll return
    }
    free ( work_mem );
    return -gl_cost_of_opening_x;
}
The problem is probably a race condition in this part of pgain:
// Either i's median (which may be i itself) is closing,
// or i is closer to x than to its current median
class1->p[i].cost = class1->p[i].weight * dist ( class1->p[i], class1->p[x], class1->dim );
class1->p[i].assign = x;
Because class1 is a pointer, a firstprivate(class1) clause would privatize only the raw pointer, not the underlying resource it points to.
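To illustrate the point about privatized pointers, here is a minimal sketch. The Shared struct and the function name are hypothetical stand-ins, not part of the question's code. Each thread gets its own copy of the pointer s, yet every copy refers to the same object, so all writes land in shared memory. The sketch is race-free only because each iteration touches a distinct element.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the shared data; not the question's Class1.
struct Shared {
    std::vector<int> hits;
};

// firstprivate(s) copies the pointer value into each thread,
// but every copy still points at the same Shared object.
int count_writes_through_private_pointer(Shared *s) {
    assert(s != nullptr);
    const long n = static_cast<long>(s->hits.size());
    #pragma omp parallel for firstprivate(s)
    for (long i = 0; i < n; ++i) {
        s->hits[i] = 1;  // distinct element per iteration, so no race here
    }
    // The writes are visible after the loop because the pointee is shared.
    int total = 0;
    for (int h : s->hits) {
        total += h;
    }
    return total;
}
```

If two iterations wrote to the same element instead, the private pointer copies would offer no protection at all.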
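The solution to this problem depends heavily on the semantics of your program. If updates to *class1 may happen in any order and the resource really is meant to be shared, then the following modification is needed: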
// Either i's median (which may be i itself) is closing,
// or i is closer to x than to its current median
#pragma omp critical (LOCK_CLASS1)
{
    // Lock this part of the code for thread-safety
    class1->p[i].cost = class1->p[i].weight * dist ( class1->p[i], class1->p[x], class1->dim );
    class1->p[i].assign = x;
}
This will fix the code above. Otherwise, you should create a private copy of *class1 for each thread before calling pgain.
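The private-copy alternative could look roughly like the sketch below. Everything in it is an assumption made for illustration: Point is a simplified stand-in for the elements behind class1->p, try_candidate is a hypothetical replacement for pgain, and the absolute index difference stands in for dist. Note that this changes the semantics: each candidate is evaluated against the original data rather than against the result of earlier iterations, which may or may not be what the algorithm needs.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical, simplified stand-in for the points behind class1->p.
struct Point {
    double weight;
    double cost;
    long assign;
};

// Hypothetical replacement for pgain: reassigns points to candidate x
// when that is cheaper, and returns the total cost saved.
double try_candidate(std::vector<Point> &pts, long x) {
    assert(x >= 0 && x < static_cast<long>(pts.size()));
    double gain = 0.0;
    for (long i = 0; i < static_cast<long>(pts.size()); ++i) {
        // |i - x| is a placeholder for the real distance function
        const double c = pts[i].weight * std::fabs(static_cast<double>(i - x));
        if (c < pts[i].cost) {
            gain += pts[i].cost - c;
            pts[i].cost = c;
            pts[i].assign = x;
        }
    }
    return gain;
}

// Every iteration works on its own copy, so no locking is needed
// and the caller's data is never modified.
std::vector<double> gains_with_private_copies(const std::vector<Point> &points) {
    const long n = static_cast<long>(points.size());
    std::vector<double> gains(points.size(), 0.0);
    #pragma omp parallel for
    for (long x = 0; x < n; ++x) {
        std::vector<Point> local = points;    // private copy of the data
        gains[x] = try_candidate(local, x);   // distinct slot per iteration
    }
    return gains;
}
```

Copying the data on every iteration is the simplest form of this idea. Whether the cost of copying is acceptable depends on how large the data is compared with the work done per call.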
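Finally, note that the reasoning above holds for any resource. For example, the pointer float *c in Class1 has the same problem if it points to a shared resource and you do not synchronize the memory updates.

Comments:

Are you sure i has to be private? I'm pretty sure you can drop that clause and inline x. I need to review my OpenMP notes…

@Mysticial pgain is another function; its signature is double pgain(long x, Class1 *class1, double z, long int *numcent). But that shouldn't matter, since the code compiles and runs correctly without the OpenMP pragmas.

@millinon i doesn't need to be declared private, since OpenMP is smart enough to recognize it as private, but x does need to be private. The problem probably lies in the shared memory and in how feas and class1 are passed.

Is there any chance you are allocating memory inside that function? Is your memory allocator thread-safe?

@Mysticial Yes, I call malloc inside pgain. So what solution do you suggest for the parallelization problem?

I was unable to fix it. I have updated my original post with the new information.

I accept your answer. The cause of the segfault was indeed a race condition.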
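On the malloc question raised in the comments: pgain as posted keeps work_mem in a static local variable. Static locals are shared by all threads, so concurrent calls can overwrite each other's pointer and then free the same block twice, which would be consistent with the reported "double free or corruption" message. A minimal sketch of the safer pattern follows. The function names and the arithmetic are hypothetical and not taken from the question. The scratch buffer lives in automatic storage, so every call owns its own buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical worker: uses a scratch buffer that belongs to this call only.
// A static buffer pointer here would be shared by every thread.
double scaled_sum(long n, double scale) {
    assert(n >= 0);
    std::vector<double> work(static_cast<std::size_t>(n), 0.0);  // one buffer per call
    for (long i = 0; i < n; ++i) {
        work[static_cast<std::size_t>(i)] = static_cast<double>(i) * scale;
    }
    double sum = 0.0;
    for (double w : work) {
        sum += w;
    }
    return sum;  // the buffer is released automatically on return
}

// Calls the worker concurrently; safe because no state is shared between calls.
double total_over_calls(long calls, long n) {
    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (long c = 0; c < calls; ++c) {
        total += scaled_sum(n, 1.0);
    }
    return total;
}
```

With malloc and free, the same effect can be had by making the pointer a plain local variable rather than a static one.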