C++: Comparing the run time of copying a memory buffer: memcpy vs. for-loop index copying
Tags: c++, visual-studio-2017, memcpy
I am trying to understand the difference in copy time between memcpy and copying a buffer by indexing in a for loop.
Results:
CopyingType     | MemoryType | RunMode | Elapsed Time (ms)
-----------------------------------------------------------
memcpy          | stack      | Debug   | x
forLoopIndexing | stack      | Debug   | 30x
memcpy          | stack      | Release | 0
forLoopIndexing | stack      | Release | 0
memcpy          | heap       | Release | 0
forLoopIndexing | heap       | Release | 2000
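This is the code I am running. Maybe I am doing something wrong.
It seems strange that copying a 500000-byte buffer 100000 times takes no time at all, or at least less than the timer resolution of the machine, which in my case is 16 ms.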
#include "stdafx.h"
#include <windows.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>

int main()
{
    DWORD baseTime; // GetTickCount() returns a DWORD
    const long packetLength = 500000;
    //char packet1[packetLength]; // stack
    //char packet2[packetLength]; // stack
    char *packet1 = (char*)calloc(packetLength, sizeof(char)); // heap
    char *packet2 = (char*)calloc(packetLength, sizeof(char)); // heap
    memset(packet1, 0, packetLength); // init
    memset(packet2, 0, packetLength); // init
    long NumPackets = 100000;
    long NumRuns = 10;
    for (long k = 0; k < NumRuns; k++)
    {
        // create packet
        printf("\npacket1:\n");
        for (long i = 0; i < packetLength; i++) {
            packet1[i] = (char)(i % 26 + 65);
        }
        printf("\nk:%ld\n", k);

        // index copy
        baseTime = GetTickCount();
        for (long j = 0; j < NumPackets; j++) {
            for (long i = 0; i < packetLength; i++) {
                packet2[i] = packet1[i];
            }
        }
        printf("Time(IndexCopy): %lu\n", GetTickCount() - baseTime);

        // memcpy
        memset(packet2, 0, packetLength); // reset
        baseTime = GetTickCount();
        for (long j = 0; j < NumPackets; j++) {
            memcpy(packet2, packet1, packetLength); // changed per PaulMcKenzie's comment
        }
        printf("Time(memcpy): %lu\n", GetTickCount() - baseTime);

        //printf("\npacket2\n");
        for (long i = 0; i < packetLength; i++) {
            //printf("%c", packet2[i]);
        }
    }
    int iHalt;
    scanf_s("%d", &iHalt);
    free(packet1); // heap variant only
    free(packet2); // heap variant only
    return 0;
}
"Maybe I am doing something wrong"
More to the point, you made a mistake in the code that uses memcpy:
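Check the generated assembly code; the copy may have been optimized away.
memcpy(packet2, packet1, sizeof(packet2)); does not do what you think it does, given that packet2 is a char*.
@a.a - I think you need to update your code so that you do not make a mistake like this in the third memcpy argument, since the OP is treating these as arrays.
The generated code most likely optimized the loop away, because the compiler detected that you do nothing with the result. If you actually did something with packet1 and packet2, perhaps printing their contents outside the timed block, you might get the results you expect. Compiler optimizers are smart these days; just because you have a GetTickCount call does not mean the compiler knows what your intent is. Do nothing with the variables, and you get nothing.
But when the memory is on the stack, the for-loop index copy also takes 0 time.
Regardless of the final results, the memcpy call is still wrong. You asked whether you were doing something wrong; yes, you are.
Yes, that applies to heap memory and memcpy... but what about stack memory, the commented-out code? I get 0 time for both indexing and memcpy.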
const long packetLength = 500000;
char *packet1 = (char*)calloc(packetLength, sizeof(char));
char *packet2 = (char*)calloc(packetLength, sizeof(char));
//...
for (long j = 0; j < NumPackets; j++) {
    // <-- Incorrect: sizeof(packet2) is the size of a char* (typically 4 or 8 bytes), not the buffer length
    memcpy(packet2, packet1, sizeof(packet2));
}

// Corrected: pass the buffer length explicitly
for (long j = 0; j < NumPackets; j++) {
    memcpy(packet2, packet1, packetLength);
}