
Strange behavior of TParallel.For with the default thread pool


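I am trying out the parallel programming features of Delphi XE7 Update 1.

I created a simple TParallel.For loop that basically performs some dummy operations to pass the time.

I launched the program on an AWS instance with 36 vCPUs (c4.8xlarge) to see what the gain from parallel programming could be.

When I first start the program and execute the TParallel.For loop, I see a significant gain (although much smaller than I expected with 36 vCPUs).

If I don't close the program and run the pass again on the 36 vCPU machine shortly afterwards (e.g., immediately or some 10-20 seconds later), the parallel pass gets much worse:

Parallel matches: 23077169 in 2322ms
Single Threaded matches: 23077169 in 2316ms

If I don't close the program and wait a few minutes (not a few seconds, but a few minutes) before running the pass again, I again get the results I got when first starting the program (a 10x improvement in response time).

On the 36 vCPU machine the very first pass after starting the program is always faster, so the effect seems to appear only on later calls to TParallel.For within the same program run.

This is the sample code I am running: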

unit ParallelTests;

interface

uses
  Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes, Vcl.Graphics,
  System.Threading, System.SyncObjs, System.Diagnostics,
  Vcl.Controls, Vcl.Forms, Vcl.Dialogs, Vcl.StdCtrls;

type
  TForm1 = class(TForm)
    Button1: TButton;
    Memo1: TMemo;
    SingleThreadCheckBox: TCheckBox;
    ParallelCheckBox: TCheckBox;
    UnitsEdit: TEdit;
    Label1: TLabel;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.Button1Click(Sender: TObject);
var
  matches: integer;
  i,j: integer;
  sw: TStopWatch;
  maxItems: integer;
  referenceStr: string;

begin
  sw := TStopWatch.Create;

  maxItems := 5000;

  Randomize;
  SetLength(referenceStr, 120000);
  for i := 1 to 120000 do
    referenceStr[i] := Chr(Ord('a') + Random(26));

  if ParallelCheckBox.Checked then begin
    matches := 0;
    sw.Reset;
    sw.Start;
    TParallel.For(1, MaxItems,
      procedure (Value: Integer)
        var
          index: integer;
          found: integer;
        begin
          found := 0;
          for index := 1 to length(referenceStr) do begin
            if (((Value mod 26) + ord('a')) = ord(referenceStr[index])) then begin
              inc(found);
            end;
          end;
          TInterlocked.Add(matches, found);
        end);
    sw.Stop;
    Memo1.Lines.Add('Parallel matches: ' + IntToStr(matches) + ' in ' + IntToStr(sw.ElapsedMilliseconds) + 'ms');
  end;

  if SingleThreadCheckBox.Checked then begin
    matches := 0;
    sw.Reset;
    sw.Start;
    for i := 1 to MaxItems do begin
      for j := 1 to length(referenceStr) do begin
        if (((i mod 26) + ord('a')) = ord(referenceStr[j])) then begin
          inc(matches);
        end;
      end;
    end;
    sw.Stop;
    Memo1.Lines.Add('Single Threaded matches: ' + IntToStr(Matches) + ' in ' + IntToStr(sw.ElapsedMilliseconds) + 'ms');
  end;
end;

end.
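
Is this working as designed? I found an article suggesting that I let the library decide the thread pool, but I don't see the point of using parallel programming if I have to wait minutes between requests for them to be processed faster.

Am I missing something about how a TParallel.For loop is supposed to be used?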
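
For contrast with letting the library pick the pool, here is a minimal sketch of the same loop run on a dedicated TThreadPool that is passed to TParallel.For explicitly. The program name, the fixed count of 10 passes, and the worker limits (CPUCount and CPUCount * 2) are illustrative assumptions, not values taken from this thread, and nothing here establishes that a private pool avoids the slowdown. It is only a way to test whether the behavior follows the default pool.

```pascal
program CustomPoolSketch;  // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  System.Diagnostics,
  System.Threading;

const
  MaxItems = 5000;
  DataSize = 100000;

// Runs one parallel pass on the given pool and prints its duration.
procedure DoTest(Pool: TThreadPool);
var
  matches: Integer;
  i: Integer;
  sw: TStopwatch;
  referenceStr: string;
begin
  SetLength(referenceStr, DataSize);
  for i := Low(referenceStr) to High(referenceStr) do
    referenceStr[i] := Chr(Ord('a') + Random(26));

  matches := 0;
  sw := TStopwatch.StartNew;
  // Same loop body as in the question, but with an explicit pool argument.
  TParallel.For(1, MaxItems,
    procedure(Value: Integer)
    var
      index: Integer;
      found: Integer;
    begin
      found := 0;
      for index := Low(referenceStr) to High(referenceStr) do
        if ((Value mod 26) + Ord('a')) = Ord(referenceStr[index]) then
          Inc(found);
      AtomicIncrement(matches, found);
    end,
    Pool);
  Writeln('Parallel matches: ', matches, ' in ', sw.ElapsedMilliseconds, 'ms');
end;

var
  Pool: TThreadPool;
  Pass: Integer;
begin
  Randomize;
  Pool := TThreadPool.Create;
  try
    // Assumed limits, chosen only for the experiment. Both setters return
    // False when the pool rejects the requested value.
    if not Pool.SetMaxWorkerThreads(CPUCount * 2) then
      Writeln('SetMaxWorkerThreads rejected the value');
    if not Pool.SetMinWorkerThreads(CPUCount) then
      Writeln('SetMinWorkerThreads rejected the value');

    // Back-to-back passes, the pattern that degrades in the question.
    for Pass := 1 to 10 do
      DoTest(Pool);
  finally
    Pool.Free;
  end;
end.
```

If the timings of the later passes stay close to the first one with a private pool but degrade with the default pool, that would point at the pool's thread management rather than at the loop body.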

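Please note that I cannot reproduce this on an AWS m3.large instance (2 vCPUs according to AWS). In that case I always get a slight improvement, and I do not get a worse result on subsequent TParallel.For calls made shortly afterwards: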

Parallel matches: 23077054 in 2057ms
Single Threaded matches: 23077054 in 2900ms
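
So this effect seems to appear when many cores are available (36), which is a pity, because the whole point of parallel programming is to benefit from many cores. I wonder whether this is a library bug triggered by the high core count, or by the fact that the core count is not a power of 2 in this case.

UPDATE: After testing on various AWS instances with different vCPU counts, this seems to be the behavior:

  • 36 vCPUs (c4.8xlarge). You have to wait minutes between subsequent calls to a vanilla TParallel call (which makes it unusable for production)
  • 32 vCPUs (c3.8xlarge). You have to wait minutes between subsequent calls to a vanilla TParallel call (which makes it unusable for production)
  • 16 vCPUs (c3.4xlarge). You have to wait for less than a second. It could be usable if the load is low but response time still matters
  • 8 vCPUs (c3.2xlarge). It seems to work normally
  • 4 vCPUs (c3.xlarge). It seems to work normally
  • 2 vCPUs (m3.large). It seems to work normally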

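I created two test programs, based on yours, to compare System.Threading and OTL. I built with XE7 Update 1 and OTL r1397. The OTL source that I used corresponds to release 3.04. I built with the 32-bit Windows compiler, using release build options.

My test machine is a dual Intel Xeon E5530 running Windows 7 x64. The system has two quad-core processors. That is 8 processors in total, but the system reports 16 because of hyper-threading. Experience tells me that hyper-threading is just marketing talk, and I have never seen scaling beyond a factor of 8 on this machine.

Now for the two programs, which are almost identical.

System.Threading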

program SystemThreadingTest;

{$APPTYPE CONSOLE}

uses
  System.Diagnostics,
  System.Threading;

const
  maxItems = 5000;
  DataSize = 100000;

procedure DoTest;
var
  matches: integer;
  i, j: integer;
  sw: TStopWatch;
  referenceStr: string;
begin
  Randomize;
  SetLength(referenceStr, DataSize);
  for i := low(referenceStr) to high(referenceStr) do
    referenceStr[i] := Chr(Ord('a') + Random(26));

  // parallel
  matches := 0;
  sw := TStopWatch.StartNew;
  TParallel.For(1, maxItems,
    procedure(Value: integer)
    var
      index: integer;
      found: integer;
    begin
      found := 0;
      for index := low(referenceStr) to high(referenceStr) do
        if (((Value mod 26) + Ord('a')) = Ord(referenceStr[index])) then
          inc(found);
      AtomicIncrement(matches, found);
    end);
  Writeln('Parallel matches: ', matches, ' in ', sw.ElapsedMilliseconds, 'ms');

  // serial
  matches := 0;
  sw := TStopWatch.StartNew;
  for i := 1 to maxItems do
    for j := low(referenceStr) to high(referenceStr) do
      if (((i mod 26) + Ord('a')) = Ord(referenceStr[j])) then
        inc(matches);
  Writeln('Serial matches: ', matches, ' in ', sw.ElapsedMilliseconds, 'ms');
end;

begin
  while True do
    DoTest;
end.
OTL

program OTLTest;

{$APPTYPE CONSOLE}

uses
  Winapi.Windows,
  Winapi.Messages,
  System.Diagnostics,
  OtlParallel;

const
  maxItems = 5000;
  DataSize = 100000;

procedure ProcessThreadMessages;
var
  msg: TMsg;
begin
  while PeekMessage(Msg, 0, 0, 0, PM_REMOVE) and (Msg.Message <> WM_QUIT) do begin
    TranslateMessage(Msg);
    DispatchMessage(Msg);
  end;
end;

procedure DoTest;
var
  matches: integer;
  i, j: integer;
  sw: TStopWatch;
  referenceStr: string;
begin
  Randomize;
  SetLength(referenceStr, DataSize);
  for i := low(referenceStr) to high(referenceStr) do
    referenceStr[i] := Chr(Ord('a') + Random(26));

  // parallel
  matches := 0;
  sw := TStopWatch.StartNew;
  Parallel.For(1, maxItems).Execute(
    procedure(Value: integer)
    var
      index: integer;
      found: integer;
    begin
      found := 0;
      for index := low(referenceStr) to high(referenceStr) do
        if (((Value mod 26) + Ord('a')) = Ord(referenceStr[index])) then
          inc(found);
      AtomicIncrement(matches, found);
    end);
  Writeln('Parallel matches: ', matches, ' in ', sw.ElapsedMilliseconds, 'ms');

  ProcessThreadMessages;

  // serial
  matches := 0;
  sw := TStopWatch.StartNew;
  for i := 1 to maxItems do
    for j := low(referenceStr) to high(referenceStr) do
      if (((i mod 26) + Ord('a')) = Ord(referenceStr[j])) then
        inc(matches);
  Writeln('Serial matches: ', matches, ' in ', sw.ElapsedMilliseconds, 'ms');
end;

begin
  while True do
    DoTest;
end.

Now the output.

System.Threading output

Parallel matches: 19230817 in 374ms
Serial matches: 19230817 in 2423ms
Parallel matches: 19230698 in 374ms
Serial matches: 19230698 in 2409ms
Parallel matches: 19230556 in 368ms
Serial matches: 19230556 in 2433ms
Parallel matches: 19230635 in 2412ms
Serial matches: 19230635 in 2430ms
Parallel matches: 19230843 in 2441ms
Serial matches: 19230843 in 2413ms
Parallel matches: 19230905 in 2493ms
Serial matches: 19230905 in 2423ms
Parallel matches: 19231032 in 2430ms
Serial matches: 19231032 in 2443ms
Parallel matches: 19230669 in 2440ms
Serial matches: 19230669 in 2473ms
Parallel matches: 19230811 in 2404ms
Serial matches: 19230811 in 2432ms
....
Parallel matches: 19230667 in 422ms
Serial matches: 19230667 in 2475ms
Parallel matches: 19230663 in 335ms
Serial matches: 19230663 in 2438ms
Parallel matches: 19230889 in 395ms
Serial matches: 19230889 in 2461ms
Parallel matches: 19230874 in 391ms
Serial matches: 19230874 in 2441ms
Parallel matches: 19230617 in 385ms
Serial matches: 19230617 in 2524ms
Parallel matches: 19231021 in 368ms
Serial matches: 19231021 in 2455ms
Parallel matches: 19230904 in 357ms
Serial matches: 19230904 in 2537ms
Parallel matches: 19230568 in 373ms
Serial matches: 19230568 in 2456ms
Parallel matches: 19230758 in 333ms
Serial matches: 19230758 in 2710ms
Parallel matches: 19230580 in 371ms
Serial matches: 19230580 in 2532ms
Parallel matches: 19230534 in 336ms
Serial matches: 19230534 in 2436ms
Parallel matches: 19230879 in 368ms
Serial matches: 19230879 in 2419ms
Parallel matches: 19230651 in 409ms
Serial matches: 19230651 in 2598ms
Parallel matches: 19230461 in 357ms
....
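
As a reading aid for these timings, the speedup of a pass is simply the serial time divided by the parallel time. The small sketch below applies that to the first fast pair (374ms against 2423ms) and the first slow pair (2412ms against 2430ms) from the output. The program name and the Report helper are hypothetical and exist only for this illustration.

```pascal
program SpeedupSketch;  // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Prints serial time divided by parallel time for one pair of measurements.
procedure Report(const Phase: string; ParallelMs, SerialMs: Integer);
begin
  Writeln(Format('%s: speedup %.2fx', [Phase, SerialMs / ParallelMs]));
end;

begin
  // Timings copied from the System.Threading output above.
  Report('fast pass', 374, 2423);
  Report('slow pass', 2412, 2430);
end.
```

The fast pass works out to roughly 6.5x, which is in line with the remark that this machine never scales beyond a factor of 8, while the slow pass is about 1.0x, i.e. no faster than the serial loop.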