Testing Erlang function performance with timer

I'm using timer:tc/3 to test a function's performance in a tight loop (say 5000 iterations):

{Duration_us, _Result} = timer:tc(M, F, [A])
This returns the function's duration (in microseconds) and its result. For the sake of argument, say the duration is N microseconds.

I then compute a simple average over the results of all iterations.
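For reference, the averaging loop described above could look roughly like the sketch below. The module name bench_avg and the function average_us/4 are hypothetical; the question does not show the actual loop, so treat this as one plausible reading of it.

```erlang
-module(bench_avg).
-export([average_us/4]).

%% Call M:F with the argument list Args, Iterations times,
%% and return the mean duration in microseconds.
%% (bench_avg and average_us/4 are illustrative names, not from the question.)
average_us(M, F, Args, Iterations) when is_integer(Iterations), Iterations > 0 ->
    Durations = [element(1, timer:tc(M, F, Args))
                 || _ <- lists:seq(1, Iterations)],
    lists:sum(Durations) / Iterations.
```

For example, bench_avg:average_us(lists, seq, [1, 1000], 5000) times lists:seq(1, 1000) over 5000 iterations.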

If I put a timer:sleep(1) call before the timer:tc/3 call, the average duration over all iterations is always greater than the average without the sleep:

timer:sleep(1),
timer:tc(M, F, [A]).
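This doesn't make much sense to me, because the timer:tc/3 call should be atomic and should not care about anything that happened before it.

Can anyone explain this strange behaviour? Is it somehow related to scheduling and reductions?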

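Measuring performance is a complicated task, especially on new hardware and modern operating systems. Many things can affect your results.

First, you are not alone. When you measure on a desktop or laptop, other processes, including system processes, can interfere with your measurement.

Second, there is the hardware itself. Modern CPUs have many cool features that control performance and power consumption. They can boost performance for a short time before they overheat, and they can boost performance when there is no work on other cores of the same chip or on other hyperthreads of the same core. On the other hand, they can enter a power-saving mode when there is not enough work, and the CPU does not react quickly enough to sudden changes.

It is hard to say whether this is your case, but it is naive to assume that previous work, or the lack of it, cannot affect your measurement. You should always take care to measure in a steady state for long enough (seconds at least) and remove as many other things that could affect the measurement as you can. (Also, don't forget GC in Erlang.)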
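One way to act on this advice is sketched below, under some assumptions: the module bench_steady and its functions are hypothetical, the warm-up count and sample count are left to the caller, and forcing a garbage collection before sampling is only one possible way to reduce GC noise. It reports the minimum, median, and maximum rather than just the mean, because a mean hides the kind of bimodal timings discussed on this page.

```erlang
-module(bench_steady).
-export([run/3, median/1]).

%% Hypothetical helper: run Fun() Warmup times without measuring,
%% then collect Samples timings (microseconds) with timer:tc/1.
run(Fun, Warmup, Samples)
  when is_function(Fun, 0), Warmup >= 0, Samples > 0 ->
    _ = [Fun() || _ <- lists:seq(1, Warmup)],   % discard warm-up runs
    erlang:garbage_collect(),                   % start sampling from a collected heap
    Times = [element(1, timer:tc(Fun)) || _ <- lists:seq(1, Samples)],
    #{min => lists:min(Times),
      median => median(Times),
      max => lists:max(Times)}.

%% Median of a non-empty list of numbers.
median(List) ->
    Sorted = lists:sort(List),
    Len = length(Sorted),
    Mid = Len div 2,
    case Len rem 2 of
        1 -> lists:nth(Mid + 1, Sorted);
        0 -> (lists:nth(Mid, Sorted) + lists:nth(Mid + 1, Sorted)) / 2
    end.
```

Pick Samples large enough that the whole run lasts at least a few seconds, for example bench_steady:run(fun() -> foo:baz(1000) end, 1000, 500000) with the foo module shown further down this page.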

Do you mean something like this:

4> foo:foo(10000).

where:

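-module(foo).
-export([foo/1, baz/1]).

%% Return the list of N individual timings and their mean, in microseconds.
foo(N) ->
    TL = bar(N),
    {TL, sum(TL) / N}.

%% Collect N timings of baz(1000), sleeping 1 ms before each measurement.
bar(0) -> [];
bar(N) ->
    timer:sleep(1),
    {D, _} = timer:tc(?MODULE, baz, [1000]),
    [D | bar(N - 1)].

%% The function under test: a plain countdown loop.
baz(0) -> ok;
baz(N) -> baz(N - 1).

sum([]) -> 0;
sum([H | T]) -> H + sum(T).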
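I tried this, and it's interesting. With the sleep statement in place, the average time returned by timer:tc/3 is 19 to 22 microseconds; with the sleep commented out, the average drops to 4 to 6 microseconds. Quite dramatic!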

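I noticed that there are artefacts in the timings, so runs like this one (the numbers are the individual microsecond timings returned by timer:tc/3) are not uncommon: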

---- snip ----
  5,5,5,6,5,5,5,6,5,5,5,6,5,5,5,5,4,5,5,5,5,5,4,5,5,5,5,6,5,5,
  5,6,5,5,5,5,5,6,5,5,5,5,5,6,5,5,5,6,5,5,5,5,5,5,5,5,5,5,4,5,
  5,5,5,6,5,5,5,6,5,5,7,8,7,8,5,6,5,5,5,6,5,5,5,5,4,5,5,5,5,
  14,4,5,5,4,5,5,4,5,4,5,5,5,4,5,5,4,5,5,4,5,4,5,5,5,4,5,5,4,
  5,5,4,5,4,5,5,4,4,5,5,4,5,5,4,4,4,4,4,5,4,5,5,4,5,5,5,4,5,5,
  4,5,5,4,5,4,5,5,5,4,5,5,4,5,5,4,5,4,5,4,5,4,5,5,4,4,4,4,5,4,
  5,5,54,22,26,21,22,22,24,24,32,31,36,31,33,27,25,21,22,21,
  24,21,22,22,24,21,22,21,24,21,22,22,24,21,22,21,24,21,22,21,
  23,27,22,21,24,21,22,21,24,22,22,21,23,22,22,21,24,22,22,21,
  24,21,22,22,24,22,22,21,24,22,22,22,24,22,22,22,24,22,22,22,
  24,22,22,22,24,22,22,21,24,22,22,21,24,21,22,22,24,22,22,21,
  24,21,23,21,24,22,23,21,24,21,22,22,24,21,22,22,24,21,22,22,
  24,22,23,21,24,21,23,21,23,21,21,21,23,21,25,22,24,21,22,21,
  24,21,22,21,24,22,21,24,22,22,21,24,22,23,21,23,21,22,21,23,
  21,22,21,23,21,23,21,24,22,22,22,24,22,22,41,36,30,33,30,35,
  21,23,21,25,21,23,21,24,22,22,21,23,21,22,21,24,22,22,22,24,
  22,22,21,24,22,22,22,24,22,22,21,24,22,22,21,24,22,22,21,24,
  22,22,21,24,21,22,22,27,22,23,21,23,21,21,21,23,21,21,21,24,
  21,22,21,24,21,22,22,24,22,22,22,24,21,22,22,24,21,22,21,24,
  21,23,21,23,21,22,21,23,21,23,22,24,22,22,21,24,21,22,22,24,
  21,23,21,24,21,22,22,24,21,22,22,24,21,22,21,24,21,22,22,24,
  22,22,22,24,22,22,21,24,22,21,21,24,21,22,22,24,21,22,22,24,
  24,23,21,24,21,22,24,21,22,21,23,21,22,21,24,21,22,21,32,31,
  32,21,25,21,22,22,24,46,5,5,5,5,5,4,5,5,5,5,6,5,5,5,5,5,5,4,
  6,5,5,5,6,5,5,5,5,5,5,5,6,5,5,5,5,4,5,4,5,5,5,5,6,5,5,5,5,5,
  5,5,6,5,5,5,5,5,5,5,6,5,5,5,5,4,6,4,6,5,5,5,5,5,5,4,6,5,5,5,
  5,4,5,5,5,5,5,5,6,5,5,5,5,4,5,5,5,5,5,5,6,5,5,5,5,5,5,5,6,5,
  5,5,5,4,5,5,6,5,5,5,6,5,5,5,5,5,5,5,6,5,5,5,6,5,5,5,5,5,5,5,
  6,5,5,5,5,4,5,4,5,5,5,5,6,5,5,5,5,5,5,4,5,4,5,5,5,5,5,6,5,5,
  5,5,4,5,4,5,5,5,5,6,5,5,5,5,5,5,5,6,5,5,5,5,5,5,5,6,5,5,5,5,
---- snip ----
I assume this is the effect you are referring to, though when you say always > N, is it always, or just mostly? I don't get it every time, anyway.

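The extract above was taken without the sleep. Typically, without the sleep, timer:tc/3 returns low times like 4 or 5 most of the time, but sometimes high times like 22; with the sleep in place, it usually returns high times like 22, with occasional batches of low times.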
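To put numbers on "mostly low" versus "mostly high", the raw timings can be split at a threshold. This is a minimal sketch: the module timing_bands and the function split/2 are hypothetical, and a threshold of around 10 microseconds is only an assumption based on the two clusters (about 5 and about 22) visible in the extract above.

```erlang
-module(timing_bands).
-export([split/2]).

%% Count how many timings are at or below Threshold (microseconds)
%% and how many are above it. Returns {LowCount, HighCount}.
split(Timings, Threshold) ->
    {Low, High} = lists:partition(fun(T) -> T =< Threshold end, Timings),
    {length(Low), length(High)}.
```

With the foo module above, {TL, _Mean} = foo:foo(10000), timing_bands:split(TL, 10) gives the size of each band for one run.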


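It's certainly not obvious why this happens, since a sleep really just means a yield. I wonder whether it all comes down to the CPU cache. After all, especially on a machine that isn't busy, you might expect the version without the sleep to execute most of the code in one go, without being moved to another core and without the core doing much else in between, so it makes the most of the caches. But when you sleep, and therefore yield, and come back later, the chance of a cache hit may be considerably lower.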
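The idea that a sleep is effectively a yield can be probed with a small experiment that times the same function after doing nothing, after erlang:yield(), and after timer:sleep(1). This is only a sketch for experimentation: the module yield_vs_sleep and its functions are hypothetical, and the outcome depends on the machine, the scheduler settings, and CPU power management, so it cannot by itself confirm or rule out the cache explanation.

```erlang
-module(yield_vs_sleep).
-export([compare/2, spin/1]).

%% Mean duration (microseconds) of spin(SpinCount) over Iterations runs,
%% with nothing, a yield, or a 1 ms sleep before each measurement.
compare(Iterations, SpinCount) when Iterations > 0 ->
    #{none  => mean_us(fun() -> ok end, Iterations, SpinCount),
      yield => mean_us(fun() -> erlang:yield() end, Iterations, SpinCount),
      sleep => mean_us(fun() -> timer:sleep(1) end, Iterations, SpinCount)}.

%% Run Before() ahead of every timed call, outside the timed region.
mean_us(Before, Iterations, SpinCount) ->
    Times = [begin
                 Before(),
                 element(1, timer:tc(?MODULE, spin, [SpinCount]))
             end || _ <- lists:seq(1, Iterations)],
    lists:sum(Times) / Iterations.

%% The function under test: a plain countdown loop, like baz/1 above.
spin(0) -> ok;
spin(N) -> spin(N - 1).
```

A call such as yield_vs_sleep:compare(5000, 1000) mirrors the loop sizes used on this page.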

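Yes, that's pretty much the general idea. I looked again, and I was averaging the results, which is more consistent with what you are seeing. I've updated the question to reflect the fact that I'm using averages.

Yes, I started out with averages too, of course, until I realized that the raw data had artefacts and was therefore just as interesting as the averages.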
The erlang:statistics function may be useful to you in performance testing.
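As a rough illustration, erlang:statistics/1 can be read before and after the code under test. The module bench_stats and the function measure/1 are hypothetical. Note that these counters cover the whole VM rather than a single process, and runtime and wall_clock are reported in milliseconds, so they are too coarse for a single microsecond-scale call and are better suited to a whole benchmark loop.

```erlang
-module(bench_stats).
-export([measure/1]).

%% Hypothetical helper: run Fun() and report VM-wide counters for that interval.
%% Each of these statistics returns {Total, SinceLastCall}.
measure(Fun) when is_function(Fun, 0) ->
    _ = erlang:statistics(runtime),       % reset the "since last call" values
    _ = erlang:statistics(wall_clock),
    _ = erlang:statistics(reductions),
    Result = Fun(),
    {_, RuntimeMs} = erlang:statistics(runtime),
    {_, WallClockMs} = erlang:statistics(wall_clock),
    {_, Reductions} = erlang:statistics(reductions),
    #{result => Result,
      runtime_ms => RuntimeMs,
      wall_clock_ms => WallClockMs,
      reductions => Reductions}.
```

For example, bench_stats:measure(fun() -> foo:foo(10000) end) reports the counters for one full run of the loop shown earlier.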