Virtual function's disadvantage and real-life examples

Here's an example: https://godbolt.org/z/cKfn94MjP

In the function f_derived the compiler does not know what the d pointer really points to. It may be some other Derived2 class that inherits from Derived and overrides Foo to return a different value. So it generates an indirect branch through the vtable (with some extra optimizations): the load from the vtable happens in lines 5/6, and the jump to the loaded function pointer is in line 12.

But in the function f_derived_final the compiler sees that the Foo call on the df pointer can only end up in DerivedFinal's Foo, because nobody else can override the Foo method, even if somebody inherits from DerivedFinal. So it simply inlines the Foo() call, and no indirect branch happens. The code is simpler and smaller, and executes faster.

Edited by Mārtiņš Možeiko on
So that's what inline means! I used to think inline just meant copying the function body and pasting it into the caller :)
That is effectively what happens, and doing this makes a bunch of additional optimizations possible. This is the more valuable part of inlining.
Another important benefit of inlining, in addition to what ratchetfreak already mentioned, is register usage. You generally want to keep as much information in registers as possible, since registers are the fastest things to access. But when calling a function, the caller has to save any live caller-saved registers to the stack before the call and load them back afterward (since it doesn't know how the callee will use them), which takes time, increases code size, and gets in the way of optimal register usage. Eliminating this "preamble/postamble" code is a big part of the reason why optimizing compilers pretty much always inline small functions, even where there isn't anything to be gained from other optimizations like constant folding and loop-invariant hoisting.
So inline functions just use the same registers' information without the need to save it?
When the compiler inlines a function, the result is no different from code without a function call, just as you said before: "copy & paste" of its body. The compiler then sees everything: how variables are modified, who writes to memory, who reads from it, etc. So it can allocate registers more optimally without worrying about who might overwrite them (because no function is called). All that comes together with additional/better optimizations, of course.

So do inline functions have their own call stack? Because if you just do something similar to copy & paste, then what happens when I have some local variables, or return, break, and continue statements inside the function? Do those statements go to the outside scope?
That is kind of the wrong question to ask. Each executing thread has its own stack, a piece of memory where functions can write whatever they need. Functions don't get their own stack. When the compiler compiles a function, it can decide to put some variables or temporaries on the stack of the thread that called the function. This has nothing to do with inlining. The compiler will do whatever it needs with the stack to produce correctly working and optimal code for any function, inlined or not.

If you have functions like this:
int getSum(int len, int* arr)
{
  int sum = 0;
  for (int i=0; i<len; i++) sum += arr[i];
  return sum;
}

int sum100(int* arr)
{
  int s = getSum(100, arr);
  return s/5;
}


You can think of the inlining process as producing a function that looks like this:
int sum100_inlined(int* arr)
{
  int s;
  {
    int sum = 0;
    for (int i=0; i<100; i++) sum += arr[i];
    s = sum;
  }
  return s/5;
}


If you compile this function and look at the asm, you should see the exact same thing as for the sum100 function.


There is something that I still don't quite understand about inlining functions:

  1. Is inlining a function always good? Are there any cons to inlining a function? The compiler only refuses to inline a function when it literally can't do it, right?
  2. Are there any times when the compiler can remove the virtual function pointer itself because it knows it can always inline the virtual function? Or is that never allowed?
  3. Whether a function gets inlined or not depends on where it gets called, right? Some call sites can be inlined and some can't. What if the compiler decides that a function can always be inlined? Does the function still need to be compiled separately?
  4. This is a dumb question, but I'm guessing that if you assign a function to a function pointer then that function can't be inlined. But what if a function only gets called in 2 places: the first one can be inlined, the second one is through a function pointer. Can the compiler make the pointer point to the first place where it got inlined, or does it still need to inline the function, compile the function separately, and then make the pointer point to the compiled one?
  1. Not always. Inlining large functions increases code size a lot, and that may be bad for the instruction cache. Compilers decide on inlining based on function size, optimization level & other things.

  2. Compilers try as hard as they can to remove virtual function calls. This is called "devirtualization" and it is an important optimization in C++ compilers, and in Java, C# and similar languages too. Here's an example where the virtual call disappears, because the compiler figures out which concrete function is called: https://godbolt.org/z/1EnfrsYbz
    Here are two presentations from clang & llvm on how they do devirtualization optimizations:

  3. Depends on the function definition. If it has external linkage, the compiler will still produce the actual function code in the object file, because maybe some other TU is using it. If it has internal linkage, then no - it won't produce a function body. If you always want the latter, then use static inline instead of just inline.

  4. It can be inlined if the compiler can figure out exactly which function is called through the function pointer. Example: https://godbolt.org/z/nnoqnh96o
    Function pointers do not point to inlined places, they point to the actual function, even if other places get inlined. Example: https://godbolt.org/z/KWPMe5adG You can see that the number() function has the big() function inlined, but the number_runtime() function has a function pointer to the original big() function.


Replying to longtran2904 (#25762)

Thanks for the answers!

Compilers try as hard as they can to remove virtual functions. It is called "devirtualization" and is important optimization in C++ compilers. Java and C# and similar languages too. Here's example where virtual function disappears, because compiler figures out which concrete function is used as virtual function: https://godbolt.org/z/1EnfrsYbz Here are two presentations from clang & llvm how they do devirtualization optimizations:

Haven't watched the 2 clips yet, but does "devirtualization" here mean just not using the virtual pointer, or literally removing the virtual pointer from the base class and all the derived ones (and reducing the size of all the instances of those classes) when the compiler knows it can always inline the function?

Function pointers do not point to inlined places, they point to actual function. Even if other places get inlined.

Why is that? Why couldn't it point to the inlined one? Doing that would certainly reduce the overall code size, right? Especially in your example, where the function just returns an int, so pointing it to the inlined one would be very easy.


Replying to mmozeiko (#25763)

Devirtualization means changing virtual function calls (which are indirect) into direct calls, which opens up the possibility of inlining and of further optimizations that remove the call entirely.

Virtual table pointer will still be there. Afaik that never gets removed.

Function pointers do not point to inlined places, because that's not a real function anymore. Inlined code gets mixed with the code of wherever it was inlined, but when you call the function through a pointer you don't want to execute that other code, only the function's own.

Also, the C standard requires that two pointers compare equal only when they point to the same thing. So having the &number == &big expression be true would violate the C standard. They must be separate things.

My previous example is too simple to show how the inlined function's code changes. Yes, the produced code looks identical - but that can be de-duplicated by the linker (it will keep only one copy of identical functions).

Here's a better example: https://godbolt.org/z/jse649h8o

While the big() function is inlined into the foo() function, the bar() function refers to big() through a function pointer. That pointer cannot point anywhere inside foo() to return that 100000 number, because foo() performs a different calculation.


Replying to longtran2904 (#25772)

Yes, produced code looks identical - but that can be de-duplicated by the linker (it will keep only one copy of identical functions).

What did you mean here? What would the linker do?


Edited by longtran2904 on
Replying to mmozeiko (#25773)

If compiled functions (or global variables) contain identical bytes, the linker can keep only one copy in the final binary and adjust all references to point to the same location.

On Windows this is done if you use the /Gy cl.exe argument (which is on by default with /O1 or /O2) and the /OPT:ICF link.exe argument (which is on by default if /DEBUG is not specified).


Replying to longtran2904 (#25774)
  1. But doesn't that make the &ptr1 == &ptr2 expression evaluate to true?
  2. If the compiler can do that, then why can't it do the same for the inlined case? For example, a simple function that only returns 100 gets called in 2 places: the first one gets inlined, the second one is like your example (through a function pointer that can't be inlined). Why can't it just point the pointer to the first one?

Replying to mmozeiko (#25775)