Warning: file_get_contents(/data/phpspider/zhask/data//catemap/4/c/61.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
C++ C优化-低级代码_C++_C_Optimization_Memory Management - Fatal编程技术网

C++ C优化-低级代码

C++ C优化-低级代码,c++,c,optimization,memory-management,C++,C,Optimization,Memory Management,我正在尝试编写一个与dlmalloc相当的内存分配器,dlmalloc是glibc中使用的malloc。dlmalloc是一个具有块拆分功能的最佳分配器,它在将块再次合并为大型块之前保留一个最近使用的块池。我正在编写的分配器是先适应样式的 我的问题有两个:(1)与glibc-malloc相比,我的代码的测试时间非常不规则;(2)有时,我的代码的平均运行时间将是glibc-malloc的3到4倍;(2) 这没什么大不了的,但我想理解为什么glibc malloc不会以同样的方式受苦。本文进一步展示

我正在尝试编写一个与dlmalloc相当的内存分配器,dlmalloc是glibc中使用的malloc。dlmalloc是一个具有块拆分功能的最佳分配器,它在将块再次合并为大型块之前保留一个最近使用的块池。我正在编写的分配器是先适应样式的

我的问题有两个:(1)与glibc-malloc相比,我的代码的测试时间非常不规则;(2)有时,我的代码的平均运行时间将是glibc-malloc的3到4倍;(2) 这没什么大不了的,但我想理解为什么glibc malloc不会以同样的方式受苦。本文进一步展示了(1)中描述的malloc和我的代码之间的行为示例。有时,一批1000个测试的平均时间会比malloc(上面的问题(2))高得多,有时平均时间是相同的。但是,对我的代码进行的一批测试的测试时间总是非常不规则(上面的问题(1));这意味着在一批测试中有时间跳跃,达到平均值的20倍,这些跳跃散布在其他常规(接近平均值)的时间中。glibc malloc不会这样做

我正在编写的代码如下

===================================

/* represent an allocated/unallocated  block of memory */
struct Block {

    /* previous allocated or unallocated block needed for consolidation but not used in allocation */
    Block* prev;
    /* 1 if allocated and 0 if not */
    unsigned int tagh;
   /* previous unallocated block */
   Block* prev_free;
   /* next unallocated block  */
   Block* next_free;
   /* size of current block */
   unsigned int size;
};

#define CACHE_SZ 120000000

/* array to be managed by allocator */
char arr[CACHE_SZ] __attribute__((aligned(4)));

/* initialize the contiguous memory located at arr for allocator */
void init_cache(){
/* setup list head node that does not change */
   Block* a = (Block*)  arr;
  a->prev = 0; 
  a->tagh = 1;
  a->prev_free = 0;
  a->size = 0;

/* setup the usable data block */
  Block* b = (Block*) (arr + sizeof(Block));
  b->prev = a; 
  b->tagh = 0;
  b->prev_free = a;
  b->size = CACHE_SZ - 3*sizeof(Block);
  a->next_free = b;

/* setup list tail node that does not change */
  Block* e = (Block*)((char*)arr + CACHE_SZ - sizeof(Block)); 
  e->prev = b;
  e->tagh = 1;
  e->prev_free = b;
  e->next_free = 0;
  e->size = 0;
  b->next_free = e;
}

char* alloc(unsigned int size){
  register Block* current = ((Block*) arr)->next_free; 
  register Block* new_block;

/* search for a first-fit block */

   while(current != 0){
       if( current->size >= size + sizeof(Block)) goto good;
       current = current->next_free;
   }

/* what to do if no decent size block found */
   if( current == 0) {
       return 0;
   }

/* good block found */
good:
/* if block size is exact return it */
   if( current->size == size){
       if(current->next_free != 0) current->next_free->prev_free = current->prev_free;
       if(current->prev_free != 0) current->prev_free->next_free = current->next_free;
       return (char* ) current + sizeof(Block);
   }

/* otherwise split the block */

   current->size -= size + sizeof(Block); 

    new_block = (Block*)( (char*)current + sizeof(Block) + current->size);
    new_block->size = size;
    new_block->prev = current;
    new_block->tagh = 1;
   ((Block*)((char*) new_block + sizeof(Block) + new_block->size ))->prev = new_block;

   return (char* ) new_block + sizeof(Block);
}

main(int argc, char** argv){
    init_cache();
    int count = 0;

/* the count considers the size of the cache arr */
    while(count < 4883){

/* the following line tests malloc; the quantity(1024*24) ensures word alignment */
   //char * volatile p = (char *) malloc(1024*24);
/* the following line tests above code in exactly the same way */
    char * volatile p = alloc(1024*24);
        count++;

    }
}
注意,malloc代码时间更为规则。在其他不可预测的时间,我的代码是0.070,而不是0.020,因此平均运行时间接近70毫秒,而不是25毫秒(问题(2),但这里没有显示。在这种情况下,我很幸运,它的运行速度接近malloc(25ms)的平均值


问题是,(1)我如何修改我的代码,使其具有更规则的时间,如glibc-malloc?(2)如果这是可能的,我如何使它比glibc malloc更快,因为我读到dlmalloc是一个典型的平衡分配器,并且不是最快的(仅考虑分割/最佳拟合/第一拟合分配器,而不是其他分配器)?

不要使用“实时”:尝试“用户”+“系统”。在大量迭代中求平均值。问题有两个方面:(a)处理器上的进程并不孤单,它会根据其他进程所做的事情而中断;(b)时间度量具有粒度。我不确定今天它是什么,但在以前它只是一个时间片的大小=>1/100秒。

对,我比较了两种解决方案,并在几个不同的变体中运行它们。我不确定问题出在哪里,但我的猜测是,大部分时间都花在“创建一个1200000000字节的大型连续板”上。如果我减小了大小,并且仍然执行相同数量的分配,那么时间就会缩短

指向这一点的另一个证据是
系统
时间是
真实
时间的很大一部分,其中
用户
时间几乎是零

现在,在我的系统上,一旦我在高内存负载下运行了几次这些东西,它就不会真的上下摆动那么多。这很有可能是因为一旦我换掉了内存中积累的一堆旧垃圾,系统就会有大量的“空闲”页面用于我的进程。当内存更受限时(因为我让系统去做一些其他事情,比如在我实验的“网站”上做一些数据库工作[这是一个真实网站的“沙盒”版本,所以它在数据库中有真实数据,可以快速填充内存等等],我会得到更多的变化,直到我再次清空内存

但我认为“神秘”的关键在于,系统时间是所用时间的很大一部分。另外值得注意的是,当对大数据块使用
malloc
时,内存实际上没有“真正分配”.当分配较小的块时,
malloc
似乎在某种程度上实际上更聪明,并且比“优化”的alloc更快-至少在更大的内存量上是如此。不要问我具体是如何工作的

以下是一些证据:

我将代码中的
main
更改为:

#define BLOCK_SIZE (CACHE_SZ / 5000)

int main(int argc, char** argv){
    init_cache();
    int count = 0;
    int failed = 0;
    size_t size = 0;

/* the count considers the size of the cache arr */
    while(count < int((CACHE_SZ / BLOCK_SIZE) * 0.96) ){

/* the following line tests malloc; the quantity(1024*24) ensures word alignment */
   //char * volatile p = (char *) malloc(1024*24);
/* the following line tests above code in exactly the same way */
    char * volatile p;
    if (argc > 1) 
        p = (char *)malloc(BLOCK_SIZE);
    else
        p = alloc(BLOCK_SIZE);
    if (p == 0)
    {
        failed++;
        puts("p = NULL\n");
    }
    count++;
    size += BLOCK_SIZE;
    }
    printf("Count = %d, total=%zd, failed=%d\n", count, size, failed);
}
使用
malloc
运行几次:

real    0m0.010s
user    0m0.000s
sys 0m0.009s
Count = 4800, total=11520000, failed=0

real    0m0.017s
user    0m0.001s
sys 0m0.015s
Count = 4800, total=11520000, failed=0

real    0m0.012s
user    0m0.001s
sys 0m0.010s
Count = 4800, total=11520000, failed=0

real    0m0.021s
user    0m0.007s
sys 0m0.013s
Count = 4800, total=11520000, failed=0

real    0m0.010s
user    0m0.001s
sys 0m0.008s
Count = 4800, total=11520000, failed=0

real    0m0.009s
user    0m0.001s
sys 0m0.007s
real    0m0.038s
user    0m0.001s
sys 0m0.036s
Count = 4800, total=115200000, failed=0

real    0m0.040s
user    0m0.001s
sys 0m0.037s
Count = 4800, total=115200000, failed=0

real    0m0.045s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.044s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.046s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.042s
user    0m0.000s
sys 0m0.042s
real    0m0.026s
user    0m0.004s
sys 0m0.021s
Count = 4800, total=115200000, failed=0

real    0m0.027s
user    0m0.002s
sys 0m0.023s
Count = 4800, total=115200000, failed=0

real    0m0.022s
user    0m0.002s
sys 0m0.018s
Count = 4800, total=115200000, failed=0

real    0m0.016s
user    0m0.001s
sys 0m0.015s
Count = 4800, total=115200000, failed=0

real    0m0.027s
user    0m0.002s
sys 0m0.024s
Count = 4800, total=115200000, failed=0
real    0m1.408s
user    0m0.002s
sys 0m1.395s
Count = 4800, total=1152000000, failed=0

real    0m1.517s
user    0m0.001s
sys 0m1.505s
Count = 4800, total=1152000000, failed=0

real    0m1.478s
user    0m0.000s
sys 0m1.466s
Count = 4800, total=1152000000, failed=0

real    0m1.401s
user    0m0.001s
sys 0m1.389s
Count = 4800, total=1152000000, failed=0

real    0m1.445s
user    0m0.002s
sys 0m1.433s
Count = 4800, total=1152000000, failed=0

real    0m1.468s
user    0m0.000s
sys 0m1.458s
Count = 4800, total=1152000000, failed=0
real    0m0.020s
user    0m0.002s
sys 0m0.017s
Count = 4800, total=1152000000, failed=0

real    0m0.022s
user    0m0.001s
sys 0m0.020s
Count = 4800, total=1152000000, failed=0

real    0m0.027s
user    0m0.005s
sys 0m0.021s
Count = 4800, total=1152000000, failed=0

real    0m0.029s
user    0m0.002s
sys 0m0.026s
Count = 4800, total=1152000000, failed=0

real    0m0.020s
user    0m0.001s
sys 0m0.019s
Count = 4800, total=1152000000, failed=0
将缓存大小增大10倍,将为
alloc
提供以下结果:

real    0m0.010s
user    0m0.000s
sys 0m0.009s
Count = 4800, total=11520000, failed=0

real    0m0.017s
user    0m0.001s
sys 0m0.015s
Count = 4800, total=11520000, failed=0

real    0m0.012s
user    0m0.001s
sys 0m0.010s
Count = 4800, total=11520000, failed=0

real    0m0.021s
user    0m0.007s
sys 0m0.013s
Count = 4800, total=11520000, failed=0

real    0m0.010s
user    0m0.001s
sys 0m0.008s
Count = 4800, total=11520000, failed=0

real    0m0.009s
user    0m0.001s
sys 0m0.007s
real    0m0.038s
user    0m0.001s
sys 0m0.036s
Count = 4800, total=115200000, failed=0

real    0m0.040s
user    0m0.001s
sys 0m0.037s
Count = 4800, total=115200000, failed=0

real    0m0.045s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.044s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.046s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.042s
user    0m0.000s
sys 0m0.042s
real    0m0.026s
user    0m0.004s
sys 0m0.021s
Count = 4800, total=115200000, failed=0

real    0m0.027s
user    0m0.002s
sys 0m0.023s
Count = 4800, total=115200000, failed=0

real    0m0.022s
user    0m0.002s
sys 0m0.018s
Count = 4800, total=115200000, failed=0

real    0m0.016s
user    0m0.001s
sys 0m0.015s
Count = 4800, total=115200000, failed=0

real    0m0.027s
user    0m0.002s
sys 0m0.024s
Count = 4800, total=115200000, failed=0
real    0m1.408s
user    0m0.002s
sys 0m1.395s
Count = 4800, total=1152000000, failed=0

real    0m1.517s
user    0m0.001s
sys 0m1.505s
Count = 4800, total=1152000000, failed=0

real    0m1.478s
user    0m0.000s
sys 0m1.466s
Count = 4800, total=1152000000, failed=0

real    0m1.401s
user    0m0.001s
sys 0m1.389s
Count = 4800, total=1152000000, failed=0

real    0m1.445s
user    0m0.002s
sys 0m1.433s
Count = 4800, total=1152000000, failed=0

real    0m1.468s
user    0m0.000s
sys 0m1.458s
Count = 4800, total=1152000000, failed=0
real    0m0.020s
user    0m0.002s
sys 0m0.017s
Count = 4800, total=1152000000, failed=0

real    0m0.022s
user    0m0.001s
sys 0m0.020s
Count = 4800, total=1152000000, failed=0

real    0m0.027s
user    0m0.005s
sys 0m0.021s
Count = 4800, total=1152000000, failed=0

real    0m0.029s
user    0m0.002s
sys 0m0.026s
Count = 4800, total=1152000000, failed=0

real    0m0.020s
user    0m0.001s
sys 0m0.019s
Count = 4800, total=1152000000, failed=0
使用
malloc

real    0m0.010s
user    0m0.000s
sys 0m0.009s
Count = 4800, total=11520000, failed=0

real    0m0.017s
user    0m0.001s
sys 0m0.015s
Count = 4800, total=11520000, failed=0

real    0m0.012s
user    0m0.001s
sys 0m0.010s
Count = 4800, total=11520000, failed=0

real    0m0.021s
user    0m0.007s
sys 0m0.013s
Count = 4800, total=11520000, failed=0

real    0m0.010s
user    0m0.001s
sys 0m0.008s
Count = 4800, total=11520000, failed=0

real    0m0.009s
user    0m0.001s
sys 0m0.007s
real    0m0.038s
user    0m0.001s
sys 0m0.036s
Count = 4800, total=115200000, failed=0

real    0m0.040s
user    0m0.001s
sys 0m0.037s
Count = 4800, total=115200000, failed=0

real    0m0.045s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.044s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.046s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.042s
user    0m0.000s
sys 0m0.042s
real    0m0.026s
user    0m0.004s
sys 0m0.021s
Count = 4800, total=115200000, failed=0

real    0m0.027s
user    0m0.002s
sys 0m0.023s
Count = 4800, total=115200000, failed=0

real    0m0.022s
user    0m0.002s
sys 0m0.018s
Count = 4800, total=115200000, failed=0

real    0m0.016s
user    0m0.001s
sys 0m0.015s
Count = 4800, total=115200000, failed=0

real    0m0.027s
user    0m0.002s
sys 0m0.024s
Count = 4800, total=115200000, failed=0
real    0m1.408s
user    0m0.002s
sys 0m1.395s
Count = 4800, total=1152000000, failed=0

real    0m1.517s
user    0m0.001s
sys 0m1.505s
Count = 4800, total=1152000000, failed=0

real    0m1.478s
user    0m0.000s
sys 0m1.466s
Count = 4800, total=1152000000, failed=0

real    0m1.401s
user    0m0.001s
sys 0m1.389s
Count = 4800, total=1152000000, failed=0

real    0m1.445s
user    0m0.002s
sys 0m1.433s
Count = 4800, total=1152000000, failed=0

real    0m1.468s
user    0m0.000s
sys 0m1.458s
Count = 4800, total=1152000000, failed=0
real    0m0.020s
user    0m0.002s
sys 0m0.017s
Count = 4800, total=1152000000, failed=0

real    0m0.022s
user    0m0.001s
sys 0m0.020s
Count = 4800, total=1152000000, failed=0

real    0m0.027s
user    0m0.005s
sys 0m0.021s
Count = 4800, total=1152000000, failed=0

real    0m0.029s
user    0m0.002s
sys 0m0.026s
Count = 4800, total=1152000000, failed=0

real    0m0.020s
user    0m0.001s
sys 0m0.019s
Count = 4800, total=1152000000, failed=0
再加上10倍的
alloc

real    0m0.010s
user    0m0.000s
sys 0m0.009s
Count = 4800, total=11520000, failed=0

real    0m0.017s
user    0m0.001s
sys 0m0.015s
Count = 4800, total=11520000, failed=0

real    0m0.012s
user    0m0.001s
sys 0m0.010s
Count = 4800, total=11520000, failed=0

real    0m0.021s
user    0m0.007s
sys 0m0.013s
Count = 4800, total=11520000, failed=0

real    0m0.010s
user    0m0.001s
sys 0m0.008s
Count = 4800, total=11520000, failed=0

real    0m0.009s
user    0m0.001s
sys 0m0.007s
real    0m0.038s
user    0m0.001s
sys 0m0.036s
Count = 4800, total=115200000, failed=0

real    0m0.040s
user    0m0.001s
sys 0m0.037s
Count = 4800, total=115200000, failed=0

real    0m0.045s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.044s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.046s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.042s
user    0m0.000s
sys 0m0.042s
real    0m0.026s
user    0m0.004s
sys 0m0.021s
Count = 4800, total=115200000, failed=0

real    0m0.027s
user    0m0.002s
sys 0m0.023s
Count = 4800, total=115200000, failed=0

real    0m0.022s
user    0m0.002s
sys 0m0.018s
Count = 4800, total=115200000, failed=0

real    0m0.016s
user    0m0.001s
sys 0m0.015s
Count = 4800, total=115200000, failed=0

real    0m0.027s
user    0m0.002s
sys 0m0.024s
Count = 4800, total=115200000, failed=0
real    0m1.408s
user    0m0.002s
sys 0m1.395s
Count = 4800, total=1152000000, failed=0

real    0m1.517s
user    0m0.001s
sys 0m1.505s
Count = 4800, total=1152000000, failed=0

real    0m1.478s
user    0m0.000s
sys 0m1.466s
Count = 4800, total=1152000000, failed=0

real    0m1.401s
user    0m0.001s
sys 0m1.389s
Count = 4800, total=1152000000, failed=0

real    0m1.445s
user    0m0.002s
sys 0m1.433s
Count = 4800, total=1152000000, failed=0

real    0m1.468s
user    0m0.000s
sys 0m1.458s
Count = 4800, total=1152000000, failed=0
real    0m0.020s
user    0m0.002s
sys 0m0.017s
Count = 4800, total=1152000000, failed=0

real    0m0.022s
user    0m0.001s
sys 0m0.020s
Count = 4800, total=1152000000, failed=0

real    0m0.027s
user    0m0.005s
sys 0m0.021s
Count = 4800, total=1152000000, failed=0

real    0m0.029s
user    0m0.002s
sys 0m0.026s
Count = 4800, total=1152000000, failed=0

real    0m0.020s
user    0m0.001s
sys 0m0.019s
Count = 4800, total=1152000000, failed=0
使用
malloc

real    0m0.010s
user    0m0.000s
sys 0m0.009s
Count = 4800, total=11520000, failed=0

real    0m0.017s
user    0m0.001s
sys 0m0.015s
Count = 4800, total=11520000, failed=0

real    0m0.012s
user    0m0.001s
sys 0m0.010s
Count = 4800, total=11520000, failed=0

real    0m0.021s
user    0m0.007s
sys 0m0.013s
Count = 4800, total=11520000, failed=0

real    0m0.010s
user    0m0.001s
sys 0m0.008s
Count = 4800, total=11520000, failed=0

real    0m0.009s
user    0m0.001s
sys 0m0.007s
real    0m0.038s
user    0m0.001s
sys 0m0.036s
Count = 4800, total=115200000, failed=0

real    0m0.040s
user    0m0.001s
sys 0m0.037s
Count = 4800, total=115200000, failed=0

real    0m0.045s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.044s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.046s
user    0m0.001s
sys 0m0.043s
Count = 4800, total=115200000, failed=0

real    0m0.042s
user    0m0.000s
sys 0m0.042s
real    0m0.026s
user    0m0.004s
sys 0m0.021s
Count = 4800, total=115200000, failed=0

real    0m0.027s
user    0m0.002s
sys 0m0.023s
Count = 4800, total=115200000, failed=0

real    0m0.022s
user    0m0.002s
sys 0m0.018s
Count = 4800, total=115200000, failed=0

real    0m0.016s
user    0m0.001s
sys 0m0.015s
Count = 4800, total=115200000, failed=0

real    0m0.027s
user    0m0.002s
sys 0m0.024s
Count = 4800, total=115200000, failed=0
real    0m1.408s
user    0m0.002s
sys 0m1.395s
Count = 4800, total=1152000000, failed=0

real    0m1.517s
user    0m0.001s
sys 0m1.505s
Count = 4800, total=1152000000, failed=0

real    0m1.478s
user    0m0.000s
sys 0m1.466s
Count = 4800, total=1152000000, failed=0

real    0m1.401s
user    0m0.001s
sys 0m1.389s
Count = 4800, total=1152000000, failed=0

real    0m1.445s
user    0m0.002s
sys 0m1.433s
Count = 4800, total=1152000000, failed=0

real    0m1.468s
user    0m0.000s
sys 0m1.458s
Count = 4800, total=1152000000, failed=0
real    0m0.020s
user    0m0.002s
sys 0m0.017s
Count = 4800, total=1152000000, failed=0

real    0m0.022s
user    0m0.001s
sys 0m0.020s
Count = 4800, total=1152000000, failed=0

real    0m0.027s
user    0m0.005s
sys 0m0.021s
Count = 4800, total=1152000000, failed=0

real    0m0.029s
user    0m0.002s
sys 0m0.026s
Count = 4800, total=1152000000, failed=0

real    0m0.020s
user    0m0.001s
sys 0m0.019s
Count = 4800, total=1152000000, failed=0
如果我们将代码更改为使
BLOCK\u SIZE
的常数为1000,则
alloc
malloc
之间的差异会小得多。以下是
alloc
结果:

 Count = 1080000, total=1080000000, failed=0

real    0m1.183s
user    0m0.028s
sys 0m1.137s
Count = 1080000, total=1080000000, failed=0

real    0m1.179s
user    0m0.017s
sys 0m1.143s
Count = 1080000, total=1080000000, failed=0

real    0m1.196s
user    0m0.026s
sys 0m1.152s
Count = 1080000, total=1080000000, failed=0

real    0m1.197s
user    0m0.023s
sys 0m1.157s
Count = 1080000, total=1080000000, failed=0

real    0m1.188s
user    0m0.021s
sys 0m1.147s
现在
malloc

Count = 1080000, total=1080000000, failed=0

real    0m0.582s
user    0m0.063s
sys 0m0.482s
Count = 1080000, total=1080000000, failed=0

real    0m0.586s
user    0m0.062s
sys 0m0.489s
Count = 1080000, total=1080000000, failed=0

real    0m0.582s
user    0m0.059s
sys 0m0.483s
Count = 1080000, total=1080000000, failed=0

real    0m0.590s
user    0m0.064s
sys 0m0.477s
Count = 1080000, total=1080000000, failed=0

real    0m0.586s
user    0m0.075s
sys 0m0.473s

20毫秒对于基准测试来说是一个非常短的时间-只有两个时间片-如果另一个任务需要一些CPU时间,你会得到更长的挂钟时间。要么增加测试的长度,要么使用
user
时间,而不是
real
时间进行比较。非常有趣的问题!我想指出,有很多在代码输出中有更多不规则的跳转。在我看来,当将输出与malloc outputan进行比较时,大于0.030s的所有内容都是不规则的。其他可能的问题是:内存池似乎有一个很大的静态分配—这可能在程序启动时没有完全连接,因此在运行时很可能会出现页面错误第一次访问新页面。@Paul R事实上,在使用perf分析后,页面错误不是问题,但这也是我的第一个猜测。与不规则和缓存未命中有更多的对应关系,但我不明白为什么会有更多缓存未命中。编辑:我现在将再次使用perf对此进行检查,但堆是相似的y被访问了,不是吗,因此会出现相同的页面错误?看看你是否能在web上的任何地方找到囤积分配器库和基准测试。编写和研究基准测试代码的人回答了你的问题,但我无法通过快速搜索再次找到它。这是我在~2000年代的主要项目的一部分。。。然而,SmartHeap最终获胜(ev)