Mpi 如何正确理解IMB基准测试结果

Mpi 如何正确理解IMB基准测试结果,mpi,infiniband,Mpi,Infiniband,您好,目前我正在使用Infiniband并使用IMB基准测试性能,目前我正在测试并行传输测试 我想知道结果是否反映了8个进程的并行性能 对结果的解释太模糊,我无法理解。 由于每个结果中都提到了(6个额外的进程在MPI_屏障中等待),我怀疑它每个只运行2个进程 吞吐量列t_avg[usec]结果似乎得到了正确的结果,但我需要确保我理解正确 #----------------------------------------------------------------------------- #

您好,目前我正在使用Infiniband并使用IMB基准测试性能,目前我正在测试并行传输测试 我想知道结果是否反映了8个进程的并行性能

对结果的解释太模糊,我无法理解。 由于每个结果中都提到了(6个额外的进程在MPI_屏障中等待),我怀疑它每个只运行2个进程

吞吐量列t_avg[usec]结果似乎得到了正确的结果,但我需要确保我理解正确

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 8
#-----------------------------------------------------------------------------
上面这段话是否意味着我并行运行8个进程

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 4
# ( 4 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
这段话的意思是4个进程并行运行? 非常感谢熟悉IMB基准的人的帮助,谢谢

下面是完整的结果

# np - 8
#------------------------------------------------------------
#    Intel (R) MPI Benchmarks 2018, MPI-1 part
#------------------------------------------------------------
# Date                  : Mon Oct 16 14:14:20 2017
# Machine               : x86_64
# System                : Linux
# Release               : 4.4.0-96-generic
# Version               : #119-Ubuntu SMP Tue Sep 12 14:59:54 UTC 2017
# MPI Version           : 3.0
# MPI Thread Environment:


# Calling sequence was:

# ./IMB-MPI1 Sendrecv Exchange

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

# Sendrecv
# Exchange

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
# ( 6 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        13.85        13.85        13.85         0.00
            1         1000        12.22        12.22        12.22         0.16
            2         1000        10.08        10.08        10.08         0.40
            4         1000         9.43         9.43         9.43         0.85
            8         1000         8.89         8.91         8.90         1.80
           16         1000         8.70         8.71         8.71         3.67
           32         1000         9.00         9.00         9.00         7.11
           64         1000         8.82         8.82         8.82        14.51
          128         1000         8.90         8.90         8.90        28.77
          256         1000         8.98         8.98         8.98        56.99
          512         1000         9.78         9.78         9.78       104.75
         1024         1000        12.65        12.65        12.65       161.91
         2048         1000        18.31        18.32        18.31       223.63
         4096         1000        20.05        20.05        20.05       408.52
         8192         1000        21.15        21.16        21.16       774.11
        16384         1000        27.46        27.47        27.46      1193.05
        32768         1000        36.93        36.94        36.93      1774.31
        65536          640        60.56        60.59        60.57      2163.39
       131072          320       117.62       117.63       117.63      2228.57
       262144          160       202.67       202.68       202.67      2586.78
       524288           80       323.86       324.28       324.07      3233.56
      1048576           40       615.05       615.47       615.26      3407.42
      2097152           20      1214.74      1216.89      1215.82      3446.74
      4194304           10      2471.83      2488.45      2480.14      3371.02

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 4
# ( 4 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        11.14        11.15        11.15         0.00
            1         1000        11.16        11.16        11.16         0.18
            2         1000        11.11        11.12        11.12         0.36
            4         1000        11.10        11.11        11.10         0.72
            8         1000        11.03        11.04        11.03         1.45
           16         1000        11.21        11.22        11.22         2.85
           32         1000        11.81        11.81        11.81         5.42
           64         1000        11.58        11.58        11.58        11.05
          128         1000        11.77        11.78        11.78        21.72
          256         1000        11.88        11.89        11.89        43.05
          512         1000        13.03        13.03        13.03        78.57
         1024         1000        14.73        14.74        14.74       138.92
         2048         1000        19.37        19.39        19.38       211.24
         4096         1000        21.31        21.34        21.33       383.96
         8192         1000        26.19        26.22        26.20       624.84
        16384         1000        32.65        32.69        32.67      1002.26
        32768         1000        48.71        48.78        48.75      1343.52
        65536          640        75.14        75.22        75.18      1742.63
       131072          320       174.66       175.15       174.94      1496.65
       262144          160       301.22       302.02       301.44      1735.95
       524288           80       539.40       542.68       540.78      1932.21
      1048576           40      1015.45      1026.34      1020.59      2043.32
      2097152           20      1959.53      1985.57      1971.34      2112.39
      4194304           10      3549.00      3641.61      3590.76      2303.55

#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 8
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        12.81        12.83        12.82         0.00
            1         1000        12.82        12.84        12.83         0.16
            2         1000        12.73        12.75        12.74         0.31
            4         1000        12.82        12.85        12.84         0.62
            8         1000        12.87        12.88        12.87         1.24
           16         1000        12.83        12.86        12.84         2.49
           32         1000        13.25        13.28        13.26         4.82
           64         1000        13.44        13.46        13.45         9.51
          128         1000        13.49        13.51        13.50        18.94
          256         1000        13.72        13.74        13.73        37.27
          512         1000        13.69        13.71        13.70        74.72
         1024         1000        15.73        15.75        15.74       130.07
         2048         1000        20.72        20.76        20.74       197.28
         4096         1000        22.68        22.74        22.72       360.28
         8192         1000        29.48        29.52        29.50       555.04
        16384         1000        39.89        39.95        39.92       820.31
        32768         1000        57.38        57.48        57.43      1140.24
        65536          640        95.23        95.34        95.29      1374.78
       131072          320       214.61       215.16       214.83      1218.38
       262144          160       365.75       368.39       367.28      1423.18
       524288           80       679.82       687.10       683.13      1526.08
      1048576           40      1277.18      1309.22      1295.65      1601.83
      2097152           20      2292.99      2377.56      2339.35      1764.12
      4194304           10      4617.95      4919.67      4778.37      1705.12

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 2
# ( 6 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        12.41        12.42        12.42         0.00
            1         1000        12.47        12.48        12.47         0.32
            2         1000        11.93        11.94        11.94         0.67
            4         1000        11.95        11.96        11.95         1.34
            8         1000        11.91        11.92        11.92         2.69
           16         1000        11.97        11.98        11.97         5.34
           32         1000        12.80        12.81        12.80        10.00
           64         1000        12.84        12.84        12.84        19.93
          128         1000        12.90        12.91        12.91        39.67
          256         1000        12.90        12.91        12.91        79.34
          512         1000        14.04        14.04        14.04       145.82
         1024         1000        17.13        17.14        17.13       239.02
         2048         1000        21.06        21.06        21.06       389.05
         4096         1000        23.32        23.33        23.32       702.41
         8192         1000        28.07        28.07        28.07      1167.45
        16384         1000        37.81        37.82        37.82      1732.64
        32768         1000        55.23        55.24        55.24      2372.75
        65536          640       101.04       101.06       101.05      2593.84
       131072          320       212.88       212.88       212.88      2462.84
       262144          160       362.37       362.38       362.37      2893.62
       524288           80       668.88       668.89       668.88      3135.26
      1048576           40      1286.48      1287.81      1287.15      3256.92
      2097152           20      2463.56      2464.13      2463.84      3404.29
      4194304           10      4845.24      4854.75      4849.99      3455.83

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 4
# ( 4 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        16.46        16.46        16.46         0.00
            1         1000        16.42        16.43        16.42         0.24
            2         1000        16.17        16.17        16.17         0.49
            4         1000        16.17        16.17        16.17         0.99
            8         1000        16.19        16.20        16.20         1.98
           16         1000        16.21        16.22        16.22         3.94
           32         1000        17.20        17.21        17.20         7.44
           64         1000        17.09        17.10        17.10        14.97
          128         1000        17.24        17.25        17.25        29.68
          256         1000        17.40        17.41        17.40        58.83
          512         1000        17.59        17.61        17.60       116.32
         1024         1000        21.43        21.45        21.44       190.95
         2048         1000        29.49        29.50        29.49       277.71
         4096         1000        31.63        31.66        31.64       517.58
         8192         1000        36.70        36.72        36.71       892.41
        16384         1000        49.50        49.53        49.52      1323.07
        32768         1000        68.35        68.36        68.36      1917.38
        65536          640       108.80       108.85       108.82      2408.31
       131072          320       314.38       314.72       314.56      1665.91
       262144          160       521.71       522.24       521.94      2007.84
       524288           80       930.03       933.47       931.82      2246.62
      1048576           40      1729.81      1738.30      1734.66      2412.87
      2097152           20      3384.33      3414.99      3403.61      2456.41
      4194304           10      6972.50      7058.12      7028.16      2377.01

#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 8
#-----------------------------------------------------------------------------
       #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]   Mbytes/sec
            0         1000        18.91        18.93        18.92         0.00
            1         1000        19.06        19.08        19.07         0.21
            2         1000        18.91        18.92        18.92         0.42
            4         1000        19.07        19.09        19.08         0.84
            8         1000        18.81        18.83        18.82         1.70
           16         1000        19.02        19.03        19.03         3.36
           32         1000        19.85        19.85        19.85         6.45
           64         1000        19.76        19.78        19.77        12.94
          128         1000        19.94        19.96        19.95        25.65
          256         1000        20.16        20.18        20.17        50.75
          512         1000        20.50        20.51        20.50        99.86
         1024         1000        24.52        24.55        24.54       166.83
         2048         1000        36.35        36.39        36.37       225.14
         4096         1000        38.77        38.81        38.79       422.20
         8192         1000        44.79        44.82        44.81       731.12
        16384         1000        59.28        59.33        59.31      1104.68
        32768         1000        86.39        86.47        86.42      1515.87
        65536          640       142.47       142.60       142.53      1838.29
       131072          320       402.11       402.98       402.57      1301.04
       262144          160       648.90       650.30       649.68      1612.44
       524288           80      1209.17      1213.71      1211.74      1727.89
      1048576           40      2332.69      2355.17      2344.35      1780.89
      2097152           20      4686.88      4767.48      4733.77      1759.55
      4194304           10      9457.18      9674.69      9567.31      1734.13


# All processes entering MPI_Finalize

IMB
基准测试一次完成

  • 各种MPI子例程(
    MPI\u Sendrecv
    MPI\u交换
    此处)
  • 各种消息大小(从
    0
    4MB
    这里)
  • 各种通信器尺寸(
    2
    4
    8
由于使用
-np 8
调用一次
mpirun
,这意味着创建了
8
MPI任务。 因此,当测试尺寸
2
的通信器时,会在引擎盖下创建一个额外的尺寸
6
通信器,其
6
MPI任务只是挂在
MPI\u屏障上,因此会显示消息

# #processes = 2
# ( 6 additional processes waiting in MPI_Barrier)

我不熟悉您所问的技术或工具,但我觉得它更适合这个问题,因为它涉及专业计算技术,而且,嗯,这似乎不是。您所说的总吞吐量是什么意思
MPI\u Sendrecv()
是一种成对通信。。。