22

When we create a member function for a class in C++, it has an implicit extra argument that is a pointer to the calling object -- referred to as `this`.

Is this true for every member function, even one that does not use the `this` pointer? For example, given the class

class foo
{
private:
    int bar;
public:
    int get_one()
    {
        return 1;  // Not using `this`
    }
    int get_bar()
    {
        return this->bar;  // Using `this`
    }
};

Would both the functions (get_one and get_bar) take this as an implicit parameter, even though only one of them actually uses it?
It seems like a bit of a waste to do so.

Note: I understand the correct thing to do would be to make get_one() static, and that the answer may be dependent on the implementation, but I'm just curious.

BeeOnRope
rtpax
  • 1
    By not making the function `static` there is the implication that you will use `this`. It is up to the programmer to add `static` to the signature of functions which do not depend on state in the instance. – Paul Rooney Jan 15 '17 at 23:53
  • 1
    `this` will only result in code generation if it is actually needed, which it isn't in the first case. –  Jan 15 '17 at 23:54
  • 2
    @latedeveloper Take into account that the compiler often doesn't know if a method needs `this`, especially if the function definition is in another source file. – NO_NAME Jan 16 '17 at 00:00
  • 5
    If the member function is inlined, which is the most likely in your example, then the question is moot. If the function cannot be inlined, because the definition and the use are in different translation units, then the compiler cannot know that the `this` value won't be needed. In short, if you care about this trivial time saving, declare the function static or make sure that it is always inlinable. – rici Jan 16 '17 at 00:02
  • @NO_NAME Good point - I was thinking about the function body rather than the call. –  Jan 16 '17 at 00:05
  • 2
    The question of whether "taking a parameter" correlates with being "a bit of a waste" is a question of code generation, not of language rules. There is no requirement from the language on any implementation to be wasteful. – Kerrek SB Jan 16 '17 at 00:13
  • Related: [In what scenario 'this' pointer is passed to the class methods?](http://stackoverflow.com/q/23165475/514235) – iammilind Jan 16 '17 at 06:53

4 Answers

22

Would both of the functions (get_one and get_bar) take this as an implicit parameter even though only get_bar uses it?

Yes (unless the compiler optimizes it away, which still doesn't mean you can call the function without a valid object).

It seems like a bit of a waste to do so

Then why is it a member if it doesn't use any member data? Sometimes, the correct approach is making it a free function in the same namespace.
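As a rough sketch of the two alternatives (the `demo` namespace and the helper `call_without_object` are made up for illustration, they are not from the question), a function that needs no object state can be a static member or a free function next to the class. The `static_assert`s show that only the non-static member has a type tied to an object:

```cpp
#include <cassert>
#include <type_traits>

namespace demo {  // hypothetical namespace, for illustration only

class foo {
    int bar = 0;
public:
    int get_bar() { return this->bar; }  // needs an object
    static int get_one() { return 1; }   // static: no object, no `this`
};

// Free-function alternative living in the same namespace as the class.
inline int get_one() { return 1; }

}  // namespace demo

// A static member function has an ordinary function-pointer type...
static_assert(std::is_same<decltype(&demo::foo::get_one), int (*)()>::value,
              "static member: no hidden object parameter");
// ...while a non-static one has a pointer-to-member type bound to foo.
static_assert(std::is_same<decltype(&demo::foo::get_bar),
                           int (demo::foo::*)()>::value,
              "non-static member: needs an object");

// Hypothetical helper: neither call below needs a foo instance.
int call_without_object() {
    int total = demo::foo::get_one() + demo::get_one();
    assert(total == 2);
    return total;
}
```

Which of the two to prefer is mostly a design question: the static member keeps access to private parts of the class, the free function does not.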

StoryTeller - Unslander Monica
  • @HolyBlackCat - Next time. Comment. The editing privilege is not for technical data. – StoryTeller - Unslander Monica Jan 15 '17 at 23:58
  • 2
    I will keep that in mind. (Sometimes I received the opposite reaction, so I decided not to bother you to add a minor detail.) – HolyBlackCat Jan 16 '17 at 00:01
  • 4
    @HolyBlackCat - I'm of the opinion that fixing typos and grammar is always a welcome edit. But I'd rather not be held accountable for technical details unless I had the opportunity to consider them. Sure I can rollback your edit, but it's just less forceful to add a comment IMO. IDK, maybe I'm weird. – StoryTeller - Unslander Monica Jan 16 '17 at 00:05
  • 1
    I've just read the edit privilege page and now I also think that I should generally comment first unless I'm editing a simple typo or a new user post. – HolyBlackCat Jan 16 '17 at 00:09
  • Surely further qualifications can be added, after "unless" and "which still". I'm thinking, for example, if one is the compiler vendor and knows what the compiler will do with the code. Or, if the aim is to demonstrate exactly that code. And so on. Let's not stop at a paltry two-segment chain of qualifications! – Cheers and hth. - Alf Jan 16 '17 at 00:13
  • @Alf I'm getting winded by just reciting that sentence in my head. I think I'll stop at two. – StoryTeller - Unslander Monica Jan 16 '17 at 00:17
  • 1
    Marking this answer as correct because I believe it is, but comments by NO_NAME and rici on the OP were most relevant to understanding why this must be correct – rtpax Jan 16 '17 at 00:20
  • 1
    @rtpax - as I understood it, one thrust of your question was whether the use of member functions that don't actually need `this` imposes some kind of performance cost, and how big of a cost. I tried to cover that angle in particular in my answer below. – BeeOnRope Jan 16 '17 at 22:49
6

...class in c++, as I understand it, it has an implicit extra argument that is a pointer to the calling object

It's important to note that C++ started as C with objects.

With that in mind, the this pointer isn't something that simply exists inside a member function. Rather, the member function, once compiled, needs a way to know which object this refers to; hence the notion of an implicit this pointer to the calling object being passed in.

To put it another way, let's take your C++ class and make it a C version:

C++

#include <cstdio>  // std::printf

class foo
{
    private:
        int bar = 0;  // initialized so get_bar() never reads an indeterminate value
    public:
        int get_one()
        {
            return 1;
        }

        int get_bar()
        {
            return this->bar;
        }

        int get_foo(int i)
        {
            return this->bar + i;
        }
};

int main(int argc, char** argv)
{
    foo f;
    std::printf("%d\n", f.get_one());
    std::printf("%d\n", f.get_bar());
    std::printf("%d\n", f.get_foo(10));
    return 0;
}

C

#include <stdio.h>  /* printf */

typedef struct foo
{
    int bar;
} foo;

/* `this` is an ordinary identifier in C, so it can name the parameter. */
int foo_get_one(foo *this)
{
    return 1;
}

int foo_get_bar(foo *this)
{
    return this->bar;
}

int foo_get_foo(int i, foo *this)
{
    return this->bar + i;
}

int main(int argc, char** argv)
{
    foo f = {0};  /* zero-initialize bar before reading it */
    printf("%d\n", foo_get_one(&f));
    printf("%d\n", foo_get_bar(&f));
    printf("%d\n", foo_get_foo(10, &f));
    return 0;
}

When the C++ program is compiled and assembled, the this pointer is "added" to the mangled function in order to "know" what object is calling the member function.

So foo::get_one might be "mangled" to the C equivalent of foo_get_one(foo *this), foo::get_bar could be mangled to foo_get_bar(foo *this) and foo::get_foo(int) could be foo_get_foo(int, foo *this), etc.
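To see the same idea from inside C++ itself, here is a small sketch using pointers to member functions. The helper names `foo_get_bar` and `call_through_pointers`, and the initial value 7, are assumptions made up for this illustration. Even for get_one, which never touches this, the call syntax still demands an object:

```cpp
#include <cassert>

class foo {
    int bar = 7;  // arbitrary value chosen for the illustration
public:
    int get_one() { return 1; }
    int get_bar() { return this->bar; }
};

// Hand-written analogue of the C translation: the object is an explicit
// argument instead of the hidden one.
int foo_get_bar(foo* self) { return self->get_bar(); }

// Hypothetical helper that calls both members through a member pointer.
int call_through_pointers() {
    foo f;
    int (foo::*pm)() = &foo::get_one;  // pointer to member function
    // get_one ignores `this`, yet an object is still required left of .*
    int one = (f.*pm)();
    pm = &foo::get_bar;                // same type, so same calling shape
    int bar = (f.*pm)();
    assert(bar == foo_get_bar(&f));
    return one + bar;
}
```

Both members share the type `int (foo::*)()`, which is one way to see that the object parameter belongs to the function's interface regardless of what the body does with it.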

Would both of the functions (get_one and get_bar) take this as an implicit parameter even though only get_bar uses it? It seems like a bit of a waste to do so.

This depends on the compiler. Even with no optimizations enabled, its heuristics might still eliminate the this pointer from a mangled function that does not need an object (to save stack space), but that is highly dependent on the code, how it is being compiled, and the target system.

More specifically, if the function were one as simple as foo::get_one (merely returning a 1), chances are the compiler might just put the constant 1 in place of the call to object->get_one(), eliminating the need for any references/pointers.

Hope that can help.

Community
txtechhelp
  • I don't think the compiler can ever reasonably remove the implicit `this` from the mangled function, because even if the compiler compiling the _function itself_ realizes that `this` is not used, the _caller_ doesn't know this, and will always generate code that passes `this` and expects the mangled name to include `this`. Of course, if the function is inlined, none of this needs to occur, but then the mangled name isn't used at all (this includes LTCG-type inlining). – BeeOnRope Jan 16 '17 at 22:52
  • In general, for the actual externally visible function, the compiler can't really communicate to the caller that "`this` isn't used". In particular, the linker is connecting call sites to implementations, and the linker can't say "oh, I didn't find a mangled name with `this`, let me try the version without `this`" - it is just going to fail if the expected name isn't there. So the amount of optimization that occur here for separately compiled functions is fairly limited (again, outside of LTCG). – BeeOnRope Jan 16 '17 at 22:54
  • @BeeOnRope .. agreed on the caller not knowing what `this` is if it's not there. I made an edit for clarification, specifically, in a simple case like `get_one` that merely returns a `1`, the compiler could optimize away the function call altogether by merely putting a `1` in place (or if the function were to be in-lined), for example; in that case, there is no `this` pointer as one is not necessary in the assembled output. – txtechhelp Jan 16 '17 at 23:32
  • Correct, but it requires inlining, so it only happens within the same compilation unit. I showed some examples in my answer below on how inlining is the "magic" that lets the compiler ignore `this` here. – BeeOnRope Jan 16 '17 at 23:56
3

Semantically the this pointer is always available in a member function - as another user pointed out. That is, you could later change the function to use it without issue (and, in particular, without the need to recompile calling code in other translation units) or, in the case of a virtual function, an overridden version in a subclass could use this even if the base implementation didn't.
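A minimal sketch of the virtual case (the names `base`, `derived`, `call_value` and `check_dispatch` are hypothetical, chosen only for this illustration): the base implementation ignores the object, the override reads a member, and the caller cannot tell which one will run.

```cpp
#include <cassert>

struct base {
    virtual ~base() = default;
    // The base implementation ignores the object entirely.
    virtual int value() { return 1; }
};

struct derived : base {
    int bar = 42;
    // The override does use `this`, so callers must always supply it.
    int value() override { return this->bar; }
};

// The caller only sees a base&, so it has to pass the object pointer
// unconditionally; the implementation is picked at run time.
int call_value(base& b) { return b.value(); }

// Hypothetical self-check exercising both implementations.
void check_dispatch() {
    base b;
    derived d;
    assert(call_value(b) == 1);
    assert(call_value(d) == 42);
}
```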

So the remaining interesting question is what performance impact this imposes, if any. There may be a cost to the caller and/or the callee and the cost may be different when inlined and not inlined. We examine all the permutations below:

Inlined

In the inlined case, the compiler can see both the call site and the function implementation¹, so it presumably doesn't need to follow any particular calling convention, and the cost of the hidden this pointer should go away. Note also that in this case there is no real distinction between the "caller" code and the "callee" code, since they are combined and optimized together at the call site.

Let's use the following test code:

#include <stdio.h>

class foo
{
private:
    int bar;
public:
    int get_one_member()
    {
      return 1;  // Not using `this`
    }
};

int get_one_global() {
  return 2;
}

int main(int argc, char **) {
  foo f = foo();
  if(argc) {
    puts("a");
    return f.get_one_member();
  } else {
    puts("b");
    return get_one_global();
  }
}

Note that the two puts calls are just there to make the branches a bit more different - otherwise the compilers are smart enough to just use a conditional set/move, and so you can't even really separate the inlined bodies of the two functions.

All of gcc, icc and clang inline the two calls and generate code that is equivalent for both the member and non-member function, without any trace of the this pointer in the member case. Let's look at the clang code since it's the cleanest:

main:
 push   rax
 test   edi,edi
 je     400556 <main+0x16>
 # this is the member case
 mov    edi,0x4005f4
 call   400400 <puts@plt>
 mov    eax,0x1
 pop    rcx
 ret
 # this is the non-member case    
 mov    edi,0x4005f6
 call   400400 <puts@plt>
 mov    eax,0x2
 pop    rcx
 ret    

Both paths generate the exact same series of 4 instructions leading up to the final ret - two instructions for the puts call, a single instruction to mov the return value of 1 or 2 into eax, and a pop rcx to clean up the stack². So the actual call took exactly one instruction in either case, and there was no this pointer manipulation or passing at all.

Out of line

In the out-of-line case, supporting the this pointer will actually have some real-but-generally-small costs, at least on the caller side.

We use a similar test program, but with the member function defined out-of-line and with inlining of these functions disabled³:

class foo
{
private:
    int bar;
public:
    int __attribute__ ((noinline)) get_one_member();
};

int foo::get_one_member() 
{
   return 1;  // Not using `this`
}

int __attribute__ ((noinline)) get_one_global() {
  return 2;
}

int main(int argc, char **) {
  foo f = foo();
  return argc ? f.get_one_member() : get_one_global();
}

This test code is somewhat simpler than the last one because it doesn't need the puts call to distinguish the two branches.

Call Site

Let's look at the assembly that gcc⁴ generates for main (i.e., at the call sites for the functions):

main:
 test   edi,edi
 jne    400409 <main+0x9>
 # the global branch
 jmp    400530 <get_one_global()>
 # the member branch
 lea    rdi,[rsp-0x18]
 jmp    400520 <foo::get_one_member()>
 nop    WORD PTR cs:[rax+rax*1+0x0]
 nop    DWORD PTR [rax]

Here, both function calls are actually realized using jmp - this is a type of tail-call optimization since they are the last functions called in main, so the ret for the called function actually returns to the caller of main - but here the caller of the member function pays an extra price:

lea    rdi,[rsp-0x18]

That's loading the this pointer (the address of f on the stack) into rdi, the register that receives the first argument, which for C++ member functions is this. So there is a (small) extra cost.

Function Body

Now while the call-site pays some cost to pass an (unused) this pointer, in this case at least, the actual function bodies are still equally efficient:

foo::get_one_member():
 mov    eax,0x1
 ret    

get_one_global():
 mov    eax,0x2
 ret    

Both are composed of a single mov and a ret. So the function itself can simply ignore the this value since it isn't used.

This raises the question of whether this is true in general - will the function body of a member function that doesn't use this always be compiled as efficiently as an equivalent non-member function?

The short answer is no - at least for most modern ABIs that pass arguments in registers. The this pointer takes up a parameter register in the calling convention, so you'll hit the maximum number of register-passed arguments one parameter sooner when compiling a member function.

Take for example this function that simply adds its six int parameters together:

int add6(int a, int b, int c, int d, int e, int f) {
  return a + b + c + d + e + f;
}

When compiled as a member function on an x86-64 platform using the SysV ABI, one argument has to be passed on the stack instead of in a register, resulting in code like this:

foo::add6_member(int, int, int, int, int, int):
 add    esi,edx
 mov    eax,DWORD PTR [rsp+0x8]
 add    ecx,esi
 add    ecx,r8d
 add    ecx,r9d
 add    eax,ecx
 ret    

Note the read from the stack, mov eax,DWORD PTR [rsp+0x8], which will generally add a few cycles of latency⁵ and one instruction on gcc⁶ versus the non-member version, which has no memory reads:

add6_nonmember(int, int, int, int, int, int):
 add    edi,esi
 add    edx,edi
 add    ecx,edx
 add    ecx,r8d
 lea    eax,[rcx+r9*1]
 ret    

Now you won't usually have six or more arguments to a function (especially very short, performance sensitive ones) - but this at least shows that even on the callee code-generation side, this hidden this pointer isn't always free.
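For anyone who wants to reproduce the comparison, a sketch of the variants might look like the following. The names `add6_member`, `add6_static` and `add6_nonmember` are mine (the static variant is an addition not shown in the listings above), and `[[gnu::noinline]]` is the GCC/Clang attribute spelled in standard syntax, so this assumes one of those compilers:

```cpp
#include <cassert>

struct foo {
    int bar = 0;

    // Hidden `this` takes the first integer register (rdi on SysV x86-64),
    // so the sixth int no longer fits in a register and goes on the stack.
    [[gnu::noinline]] int add6_member(int a, int b, int c, int d, int e, int f) {
        return a + b + c + d + e + f;
    }

    // No `this`: all six ints fit in registers, as for the free function.
    [[gnu::noinline]] static int add6_static(int a, int b, int c, int d, int e, int f) {
        return a + b + c + d + e + f;
    }
};

[[gnu::noinline]] int add6_nonmember(int a, int b, int c, int d, int e, int f) {
    return a + b + c + d + e + f;
}

// Hypothetical self-check: all three variants compute the same sum;
// only the generated code for passing the arguments differs.
void check_add6() {
    foo obj;
    assert(obj.add6_member(1, 2, 3, 4, 5, 6) == 21);
    assert(foo::add6_static(1, 2, 3, 4, 5, 6) == 21);
    assert(add6_nonmember(1, 2, 3, 4, 5, 6) == 21);
}
```

Compiling this with optimizations on and inspecting the output for the three functions should show whether your compiler and ABI behave like the listings above; I'd expect the static member to match the non-member version, but check rather than take my word for it.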

Note also that while the examples used x86-64 codegen and the SysV ABI, the same basic principles would apply to any ABI that passes some arguments in registers.


1 Note that this optimization only applies easily to effectively non-virtual functions - since only then can the compiler know the actual function implementation.

2 I guess that's what it's for - this undoes the push rax at the top of the method so that rsp has the correct value on return, but I don't know why the push/pop pair needs to be in there in the first place. Other compilers use different strategies, such as add rsp, 8 and sub rsp,8.

3 In practice, you aren't really going to disable inlining like this, but the failure to inline would happen just because the methods are in different compilation units. Because of the way godbolt works, I can't exactly do that, so disabling inlining has the same effect.

4 Oddly, I couldn't get clang to stop inlining either function, either with attribute noinline or with -fno-inline.

5 In fact, often a few cycles more than the usual L1-hit latency of 4 cycles on Intel, due to store-forwarding of the recently written value.

6 In principle, on x86 at least, the one-instruction penalty can be eliminated by using an add with a memory source operand, rather than a mov from memory with a subsequent reg-reg add, and in fact clang and icc do exactly that. I don't think one approach dominates though - the gcc approach with a separate mov is better able to move the load off the critical path - initiating it early and then using it only in the last instruction, while the icc approach adds 1 cycle to the critical path involving the mov and the clang approach seems the worst of all - stringing all the adds together into one long dependency chain on eax which ends with the memory read.

Community
BeeOnRope
-5

If you don't use this, then you can't tell whether it's available. So there is literally no distinction. This is like asking whether a tree falling in an unpopulated forest makes a sound. It's literally a meaningless question.

I can tell you this: if you want to use this in a member function, you can. That option is always available to you.

Lightness Races in Orbit