C++ virtual function call overhead when the call is deterministic


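I know that virtual functions are essentially function pointers stored in a vtable, and that the extra indirection makes polymorphic calls slower. What I would like to know is how compilers optimize a call when it is deterministic. By deterministic, I mean the following cases: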

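a) The object is a value rather than a reference or pointer, so no polymorphism is possible.

b) The pointer or reference refers to a class that has no derived classes.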
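The two cases can be sketched roughly as follows. The question's own snippets are not reproduced here, so the definition of Foo and the helper names call_on_value and call_through_pointer are assumptions made purely for illustration.

```cpp
#include <memory>

// Hypothetical class for illustration: one virtual function, no derived classes.
struct Foo {
    virtual ~Foo() = default;
    virtual int DoSomething() { return 1; }
};

// Case a): the object is a value, so its dynamic type is always exactly Foo.
int call_on_value() {
    Foo myfoo;
    return myfoo.DoSomething();
}

// Case b): a pointer to a class that has no derived classes in this program.
int call_through_pointer() {
    std::unique_ptr<Foo> a = std::make_unique<Foo>();
    return a->DoSomething();
}
```

In both helpers the result is the same whichever way the call is dispatched; the only open question is whether the compiler emits a direct or an indirect call.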

In the first case, this is not a virtual call. The compiler calls Foo::DoSomething() directly.

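The second case is more complicated. For one thing, it is at best a link-time optimization, because within a single translation unit the compiler cannot know who else might inherit from the class. The other problem is shared libraries, which may also derive from the class without your executable knowing anything about it.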

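In general, though, this is a compiler optimization known as virtual function call elimination, or devirtualization, and it is to some extent an active research area. Some compilers do it to a degree, while others do not do it at all.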
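To make the idea concrete, here is a minimal hand-written sketch of what devirtualization amounts to. A qualified call such as p.Foo::DoSomething() bypasses virtual dispatch, which is the kind of direct call a compiler may substitute only when it can prove the dynamic type. The classes and the helper names virtual_call and qualified_call are assumptions for illustration, not taken from the question.

```cpp
// Hypothetical classes for illustration.
struct Foo {
    virtual ~Foo() = default;
    virtual int DoSomething() { return 1; }
};

struct Bar : Foo {
    int DoSomething() override { return 2; }
};

// Ordinary virtual call: dispatched on the dynamic type of p.
int virtual_call(Foo& p) { return p.DoSomething(); }

// Qualified call: always invokes Foo::DoSomething, with no vtable lookup.
int qualified_call(Foo& p) { return p.Foo::DoSomething(); }
```

Passing a Bar to both helpers gives different results, which is why the compiler may only make this substitution when it knows the object really is a Foo.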

See -fdevirtualize and -fdevirtualize-speculatively in GCC (g++). The names hint at the level of guarantee each one offers.

In Visual Studio 2013, virtual function calls are not optimized away even when the behavior is deterministic.

For example:

#include <iostream>

static int counter = 0;

struct Foo
{
    virtual void VirtualCall() { ++counter; }
    void RegularCall() { ++counter; }
};

int main()
{
    Foo* a = new Foo();
    a->VirtualCall(); // Overhead? The dynamic type of a cannot change here.
    a->RegularCall();
    std::cout << counter;
    return 0;
}

The machine code for the regular call shows that the function is inlined, with no function call at all:

 a->RegularCall()

  00         inc     DWORD PTR _counter

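Usually you can trust the compiler's optimizer to make the right choice, although this of course depends on the optimization settings.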

As a proof of concept, here is code that covers the different cases. It uses Foo and Bar, plus a Tzar class defined as follows:

struct Tzar : public Foo
{
    void DoSomething() override final;  // a virtual function that can't be overridden further
};

Foo* factory();
Bar* bar_factory();
Tzar* tsar_factory();

int main()
{
    Foo myfoo;
    myfoo.DoSomething();  // this is a direct call

    Foo* a = new Foo();
    a->DoSomething();  // Overhead only without optimisation: a is clearly a Foo, so Foo::DoSomething()

    Foo* b = new Bar();
    b->DoSomething();  // Overhead only without optimisation: b is clearly a Bar, so Bar::DoSomething()

    Bar* c = new Bar();
    c->DoSomething();  // Overhead only without optimisation: c is clearly a Bar, so Bar::DoSomething()

    Foo* d = factory();
    d->DoSomething();  // Overhead required: we don't know the type of d, unless global optimisation could predict it

    a = d;
    a->DoSomething();  // the unknown propagates to a, so now this call is indirect

    Foo* e = bar_factory();
    e->DoSomething();  // Overhead required: we don't know the type of e; it could be a Bar or a further derived class unknown in this compilation unit

    Foo* f = tsar_factory();
    f->DoSomething();  // Overhead could be optimised away: we don't know the type of f, but Tzar::DoSomething() can't be overridden further
                       // but currently it isn't

    return 0;
}

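The code that GCC 5.3.0 generates for all of these cases, without optimization, lets you see the assembly for each C++ statement. The first call is always a direct call: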

    lea     rax, [rbp-80]         ;  take the object pointer from the stack
    mov     rdi, rax              ;  set the this pointer of the invoking object
    call    Foo::DoSomething()    ; direct call to the function

Without optimization, all the other calls to DoSomething() use an indirect call. Here is the example for b->DoSomething():

    mov     rax, QWORD PTR [rbp-32]  ; load the object pointer from the stack
    mov     rax, QWORD PTR [rax]     ; load the vtable pointer from the object
    mov     rax, QWORD PTR [rax]     ; load the function address from the vtable
    mov     rdx, QWORD PTR [rbp-32]  ; reload the object pointer
    mov     rdi, rdx                 ; set the this pointer of the invoking object
    call    rax                      ; indirect call via register

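If you now set the optimization flag -O2 in the compiler options, you will see that most of the indirect calls are optimized away wherever the compiler can predict the actual type behind the polymorphic pointer. For the example above, the call becomes: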

    mov     rdi, rax            ;  set the this pointer of the invoking object
    call    Bar::DoSomething()  ; direct call !!

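When the compiler cannot safely predict the actual type, it uses an indirect call. For example, if you have a function bar_factory() that returns a Bar pointer, the compiler cannot know whether it returns a pointer to a Bar object or a pointer to an object of a class derived from Bar (which could be defined in another compilation unit and is unknown here).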
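A single-file sketch of that situation, under stated assumptions: HiddenBar stands in for a class defined in some other compilation unit, and this bar_factory takes a flag (unlike the one declared in the answer) only so that both outcomes can be exercised. All class bodies and the helper call_on_factory_result are hypothetical.

```cpp
#include <memory>

// Hypothetical classes for illustration.
struct Foo {
    virtual ~Foo() = default;
    virtual int DoSomething() { return 1; }
};

struct Bar : Foo {
    int DoSomething() override { return 2; }
};

// Stand-in for a class that the calling code would never see.
struct HiddenBar : Bar {
    int DoSomething() override { return 3; }
};

// In a real program only the declaration of the factory would be visible to the caller.
std::unique_ptr<Bar> bar_factory(bool hidden) {
    if (hidden) {
        return std::make_unique<HiddenBar>();
    }
    return std::make_unique<Bar>();
}

// The static type is Bar, but the dynamic type depends on what the factory did.
int call_on_factory_result(bool hidden) {
    std::unique_ptr<Bar> e = bar_factory(hidden);
    return e->DoSomething();
}
```

Because the result differs depending on what the factory returns, a compiler that sees only the factory's declaration has to keep the indirect call.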

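The only surprising point is the case where the virtual function is declared override final (the Tzar class in my example). Here you might expect the compiler to take advantage of the fact that DoSomething() cannot be overridden any further. But it is not required to do so.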
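As a minimal sketch of the two ways final can give the compiler that guarantee, the snippet below marks either the member function or the whole class as final. The function bodies, the Leaf class and the call_* helpers are assumptions for illustration; whether a given compiler actually emits a direct call here is up to the implementation.

```cpp
// Hypothetical classes for illustration.
struct Foo {
    virtual ~Foo() = default;
    virtual int DoSomething() { return 1; }
};

// final on the member function: no class derived from Tzar can override it.
struct Tzar : Foo {
    int DoSomething() override final { return 4; }
};

// final on the class: nothing can derive from Leaf at all.
struct Leaf final : Foo {
    int DoSomething() override { return 5; }
};

// The static type already pins down the target, so a direct call would be valid.
int call_tzar(Tzar& t) { return t.DoSomething(); }
int call_leaf(Leaf& l) { return l.DoSomething(); }

// Through a Foo reference the dynamic type is unknown again.
int call_foo(Foo& f) { return f.DoSomething(); }
```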

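Values are not polymorphic, so I don't really understand what you mean by a). There is no difference between Foo* a = new Foo() and Foo* b = new Foo(). What are you trying to illustrate here?

All of these can potentially be optimized, because the dynamic types of all the objects involved are known at compile time. Whether the compiler actually performs the optimization is another matter. As Yam Marcovic pointed out, this is an active research area, so the state of the art keeps changing.

You want to know about compiler optimizations, but I wonder what exactly your question is and what you have done yourself to answer it...

Is this the result with O3?

I know almost nothing about assembly. Is this (for the virtual call) a call through a "function pointer"?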
