Why does linking to librt swap performance between g++ and clang?

I just found some test bench code from @tony-d that measures the overhead of virtual function calls. I ran the benchmark with g++:

$ g++ -O2 -o vdt vdt.cpp -lrt
$ ./vdt
virtual dispatch: 150000000 0.128562
switched: 150000000 0.0803207
overheads: 150000000 0.0543323
...
I got better performance than he did (a ratio of about 2), but then I checked with clang:

$ clang++-3.7 -O2 -o vdt vdt.cpp -lrt
$ ./vdt
virtual dispatch: 150000000 0.462368
switched: 150000000 0.0569544
overheads: 150000000 0.0509332
...
Now the ratio goes up to about 70.
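The question does not say how these ratios were computed. One reading that reproduces the figure of roughly 70 is to subtract the "overheads" time from both measurements before dividing. The sketch below assumes that reading; the names Run and dispatch_ratio are made up for illustration, and the numbers are copied from the two -lrt runs above.

```cpp
// Timings in seconds, as printed by ./vdt.
struct Run {
    double virtual_dispatch;
    double switched;
    double overheads;
};

// Assumed formula (not stated in the question): cost of virtual dispatch
// relative to the switch-based version, with loop overhead subtracted.
double dispatch_ratio(const Run& r) {
    return (r.virtual_dispatch - r.overheads) / (r.switched - r.overheads);
}

// Values from the g++ and clang++-3.7 runs linked with -lrt.
const Run kGccWithLrt{0.128562, 0.0803207, 0.0543323};
const Run kClangWithLrt{0.462368, 0.0569544, 0.0509332};
```

Under this assumption the g++ run comes out at about 2.9 and the clang run at about 68, which is in line with the "about 2" and "about 70" quoted in the question.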

Then I noticed the -lrt command-line argument, and after a bit of googling about librt, I tried without it for both g++ and clang:

$ g++ -O2 -o vdt vdt.cpp
$ ./vdt
virtual dispatch: 150000000 0.4661
switched: 150000000 0.0815865
overheads: 150000000 0.0543611
...
$ clang++-3.7 -O2 -o vdt vdt.cpp
$ ./vdt
virtual dispatch: 150000000 0.155901
switched: 150000000 0.0568319
overheads: 150000000 0.0492521
...
As you can see, the performance is swapped.

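From what I found about librt, it is needed for clock_gettime and other time-related functions (maybe I'm wrong, in which case please correct me!), but without -lrt the code compiles fine, and from what I can see the timings look correct.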
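For context on why the build succeeds without -lrt: since glibc 2.17 the clock_* functions live in the main C library, so linking librt for them is only needed on older systems. The benchmark source is not shown here, so the following is only a minimal sketch of clock_gettime-based timing of the kind the question implies; now_seconds and time_call are hypothetical helper names.

```cpp
#include <time.h>

// Hypothetical helper: current reading of the monotonic clock, in seconds.
double now_seconds() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<double>(ts.tv_sec) + static_cast<double>(ts.tv_nsec) * 1e-9;
}

// Hypothetical helper: run a callable once and return the elapsed time in seconds.
template <typename F>
double time_call(F&& f) {
    const double start = now_seconds();
    f();
    return now_seconds() - start;
}
```

On a system with glibc older than 2.17 this would have to be linked with -lrt; on newer ones the flag is accepted but not required for clock_gettime.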

Why does linking or not linking librt have such a large effect on the code?


Information about my system and compilers:

$ g++ --version
g++-5 (Ubuntu 5.3.0-3ubuntu1~14.04) 5.3.0 20151204
Copyright (C) 2015 Free Software Foundation, Inc.

$ clang++-3.7 --version
Debian clang version 3.7.1-svn254351-1~exp1 (branches/release_37) (based on LLVM 3.7.1)
Target: x86_64-pc-linux-gnu
Thread model: posix

$ uname -a
Linux ****** 3.13.0-86-generic #130-Ubuntu SMP Mon Apr 18 18:27:15 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

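My guess is that this has to do with the optimizer (if -lrt is specified, the optimizer has more data because a library is being linked, and it may optimize differently).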

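As for the difference, with my g++ (4.8.4) I get the same results with and without -lrt, but with clang (3.4-1ubuntu3) there is a difference. I ran this through perf stat, with the following results: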

$ g++ -O2 -o vdt vdt.cpp -lrt && perf stat -d ./vdt
virtual dispatch: 150000000 1.2304
switched: 150000000 0.131782
overheads: 150000000 0.0842732
virtual dispatch: 150000000 1.13689
switched: 150000000 0.137304
overheads: 150000000 0.0854806
virtual dispatch: 150000000 1.19261
switched: 150000000 0.133561
overheads: 150000000 0.0969093

 Performance counter stats for './vdt':

       4068.861539 task-clock (msec)         #    0.961 CPUs utilized          
             1,068 context-switches          #    0.262 K/sec                  
                 0 cpu-migrations            #    0.000 K/sec                  
               431 page-faults               #    0.106 K/sec                  
    11,977,128,883 cycles                    #    2.944 GHz                     [40.18%]
     6,088,274,331 stalled-cycles-frontend   #   50.83% frontend cycles idle    [39.92%]
     3,984,855,636 stalled-cycles-backend    #   33.27% backend  cycles idle    [39.98%]
     6,581,309,599 instructions              #    0.55  insns per cycle        
                                             #    0.93  stalled cycles per insn [50.06%]
     1,506,617,848 branches                  #  370.280 M/sec                   [50.12%]
       303,871,937 branch-misses             #   20.17% of all branches         [49.88%]
     2,708,080,460 L1-dcache-loads           #  665.562 M/sec                   [49.94%]
       559,844,530 L1-dcache-load-misses     #   20.67% of all L1-dcache hits   [50.28%]
                 0 LLC-loads                 #    0.000 K/sec                   [40.05%]
                 0 LLC-load-misses           #    0.00% of all LL-cache hits    [39.98%]

       4.232477683 seconds time elapsed

$ g++ -O2 -o vdt vdt.cpp && perf stat -d ./vdt
virtual dispatch: 150000000 1.11517
switched: 150000000 0.14231
overheads: 150000000 0.0840234
virtual dispatch: 150000000 1.11355
switched: 150000000 0.130082
overheads: 150000000 0.116934
virtual dispatch: 150000000 1.16225
switched: 150000000 0.13281
overheads: 150000000 0.0798615

 Performance counter stats for './vdt':

       4050.314222 task-clock (msec)         #    0.993 CPUs utilized          
               707 context-switches          #    0.175 K/sec                  
                 0 cpu-migrations            #    0.000 K/sec                  
               402 page-faults               #    0.099 K/sec                  
    12,213,599,260 cycles                    #    3.015 GHz                     [39.72%]
     6,987,416,990 stalled-cycles-frontend   #   57.21% frontend cycles idle    [40.25%]
     4,675,829,189 stalled-cycles-backend    #   38.28% backend  cycles idle    [40.17%]
     6,611,623,206 instructions              #    0.54  insns per cycle        
                                             #    1.06  stalled cycles per insn [50.54%]
     1,505,162,879 branches                  #  371.616 M/sec                   [50.48%]
       298,748,152 branch-misses             #   19.85% of all branches         [50.30%]
     2,710,580,651 L1-dcache-loads           #  669.227 M/sec                   [50.04%]
       551,212,908 L1-dcache-load-misses     #   20.34% of all L1-dcache hits   [49.86%]
                 3 LLC-loads                 #    0.001 K/sec                   [39.62%]
                 0 LLC-load-misses           #    0.00% of all LL-cache hits    [40.01%]

       4.080288324 seconds time elapsed

$ clang++ -O2 -o vdt vdt.cpp -lrt && perf stat -d ./vdt
virtual dispatch: 150000000 0.276252
switched: 150000000 0.11926
overheads: 150000000 0.0733678
virtual dispatch: 150000000 0.249832
switched: 150000000 0.0892711
overheads: 150000000 0.117108
virtual dispatch: 150000000 0.247705
switched: 150000000 0.109486
overheads: 150000000 0.0762541

 Performance counter stats for './vdt':

       1347.887606 task-clock (msec)         #    0.989 CPUs utilized          
               222 context-switches          #    0.165 K/sec                  
                 0 cpu-migrations            #    0.000 K/sec                  
               430 page-faults               #    0.319 K/sec                  
     3,558,892,668 cycles                    #    2.640 GHz                     [42.47%]
     1,316,787,839 stalled-cycles-frontend   #   37.00% frontend cycles idle    [42.61%]
       438,592,926 stalled-cycles-backend    #   12.32% backend  cycles idle    [40.57%]
     6,388,507,180 instructions              #    1.80  insns per cycle        
                                             #    0.21  stalled cycles per insn [50.49%]
     1,514,291,853 branches                  # 1123.456 M/sec                   [50.19%]
         1,095,265 branch-misses             #    0.07% of all branches         [48.66%]
     2,485,922,557 L1-dcache-loads           # 1844.310 M/sec                   [47.99%]
       577,213,257 L1-dcache-load-misses     #   23.22% of all L1-dcache hits   [48.20%]
                 2 LLC-loads                 #    0.001 K/sec                   [40.51%]
                 0 LLC-load-misses           #    0.00% of all LL-cache hits    [40.17%]

       1.362403811 seconds time elapsed

$ clang++ -O2 -o vdt vdt.cpp && perf stat -d ./vdt
virtual dispatch: 150000000 1.0894
switched: 150000000 0.0849747
overheads: 150000000 0.0726611
virtual dispatch: 150000000 1.03949
switched: 150000000 0.0849843
overheads: 150000000 0.0768674
virtual dispatch: 150000000 1.07786
switched: 150000000 0.0893431
overheads: 150000000 0.0725624

 Performance counter stats for './vdt':

       3667.235804 task-clock (msec)         #    0.993 CPUs utilized          
               356 context-switches          #    0.097 K/sec                  
                 0 cpu-migrations            #    0.000 K/sec                  
               402 page-faults               #    0.110 K/sec                  
    11,052,067,182 cycles                    #    3.014 GHz                     [39.98%]
     5,346,555,173 stalled-cycles-frontend   #   48.38% frontend cycles idle    [40.10%]
     3,480,506,097 stalled-cycles-backend    #   31.49% backend  cycles idle    [40.09%]
     6,351,819,740 instructions              #    0.57  insns per cycle        
                                             #    0.84  stalled cycles per insn [50.07%]
     1,524,106,229 branches                  #  415.601 M/sec                   [50.17%]
       299,296,742 branch-misses             #   19.64% of all branches         [50.05%]
     2,393,484,447 L1-dcache-loads           #  652.667 M/sec                   [49.93%]
       554,010,640 L1-dcache-load-misses     #   23.15% of all L1-dcache hits   [49.88%]
                 0 LLC-loads                 #    0.000 K/sec                   [40.33%]
                 0 LLC-load-misses           #    0.00% of all LL-cache hits    [39.83%]

       3.692786417 seconds time elapsed

What I can see is that there is a difference in branch prediction (branch-misses) with clang, which again points to the optimizer.
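To make the branch-miss gap concrete, the percentages perf prints can be recomputed from the raw counters. This is just the arithmetic, using the two clang runs above; BranchCounters and miss_rate_percent are illustrative names, not part of perf.

```cpp
#include <cstdint>

// Two counters taken from one `perf stat -d` run.
struct BranchCounters {
    std::uint64_t branches;
    std::uint64_t branch_misses;
};

// Share of branches that were mispredicted, in percent.
double miss_rate_percent(const BranchCounters& c) {
    return 100.0 * static_cast<double>(c.branch_misses) /
           static_cast<double>(c.branches);
}

// Counter values copied from the clang++ runs with and without -lrt.
const BranchCounters kClangWithLrt{1514291853ULL, 1095265ULL};
const BranchCounters kClangWithoutLrt{1524106229ULL, 299296742ULL};
```

The branch counts are almost identical in the two runs, while the miss rate goes from roughly 0.07% with -lrt to roughly 19.6% without it, so the slowdown tracks mispredictions rather than extra work.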

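Btw, in the first case (with rt) you are using -O3 with clang, and in the other case (without rt) -O2, so these results are not comparable. Could you try -O2 in the first case?

@axalis Sorry, I tested all cases with both -O2 and -O3 and the results were the same (I've corrected the question). Thanks for your answer. On my machine I evidently get very different results with g++ (I'm using g++ 5.3). With perf I reach roughly the same interpretation as you: there are a lot of branch misses in the bad cases (for g++ without -lrt, and for clang with -lrt). I still find this behavior of g++ and clang++ very strange (they are in fact completely opposite...)!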