Hi JLBorges,
Crikey, I hope you don't think that I am trying to argue with you (good grief, that would be crazy :+D ).
TheIdeasMan wrote:
    I could be completely wrong, the compiler & JLBorges are vastly smarter than me :+D
Instead, I try to ask questions in an attempt to further my education (& maybe others too). Also, it is a case of my critical thinking kicking in: what I am hearing doesn't match with my understanding, so I am trying to learn why. Or at least fill in gaps in my knowledge.
So in my last reply, I was trying to reason through to try and make sense of what was happening in a complex example.
What I am trying to get at:
if we have a template function like this:
template <typename T>
T foo(T a) {
    const T b = 5;
    return a + b;
}
Now suppose we have some container which has pointers to int, float and double. We iterate through the container, and call foo for each dereferenced item.
When the compiler encounters an int as a parameter, does it implicitly create a function where all the types are int? Similarly, when it encounters floats or doubles, does it implicitly create functions for those too, so that we now have 3 overloaded functions that differ in their type?
Now if our container has 10,000 items in it, can the compiler call the appropriate function for each one?
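To make the question concrete, here is a minimal sketch of what I think happens, using the foo template from above. The helper name check_foo is my own invention for illustration. As far as I understand it, each call deduces T from its argument, so the result type follows the argument type:

```cpp
#include <cassert>
#include <type_traits>

template <typename T>
T foo(T a) {
    const T b = 5;
    return a + b;
}

// Hypothetical helper (not from the thread): calls foo with three argument types.
void check_foo() {
    int i = 1;
    float f = 1.5f;
    double d = 2.5;

    // T is deduced separately for each call, so each call uses its own
    // specialisation: foo<int>, foo<float> and foo<double>.
    static_assert(std::is_same<decltype(foo(i)), int>::value, "foo<int>");
    static_assert(std::is_same<decltype(foo(f)), float>::value, "foo<float>");
    static_assert(std::is_same<decltype(foo(d)), double>::value, "foo<double>");

    assert(foo(i) == 6);
    assert(foo(f) == 6.5f);
    assert(foo(d) == 7.5);
}
```

Calling check_foo() from main should pass all the checks if my understanding is right. It says nothing about whether any of the calls get inlined, which is the part I am unsure about.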
So I was thinking this function could not be inlined, because there are really 3 of them. That is, we couldn't have this:
for (;;) { // whatever looping / iterating construct
    foo(a); // int
    foo(a); // float
    foo(a); // double
}
But their definitions could be elsewhere, which is what I am thinking the compiler does implicitly.
I can understand how a simple function could be inlined in a loop where the type is always the same. I guess that is what normally happens: there are multiple containers, each holding items of one particular type (different to the other containers), and there is one template function for all of them. But as I understand it, the compiler still implicitly makes a function for each type even if it is inlined. I imagine this is a problem if there were 20 containers of int, and all the functions were inline: there is now 20 times more code. I imagine the compiler can work out a tipping point where the cost of the extra code exceeds the cost of a function call.
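Here is a small sketch of the "one function per type" part of my understanding. The names IntFn, site_a and site_b are hypothetical, made up to stand in for two unrelated places in the code that both use foo with int. It only checks that both places refer to the same foo<int>; whether the optimiser then pastes inlined copies into each call site is a separate decision that this does not show.

```cpp
#include <cassert>

template <typename T>
T foo(T a) {
    const T b = 5;
    return a + b;
}

// Hypothetical alias: pointer to a function taking and returning int.
using IntFn = int (*)(int);

// Two hypothetical call sites that both need foo for int.
IntFn site_a() { return &foo<int>; }
IntFn site_b() { return &foo<int>; }

// Both sites name the same specialisation, so the addresses compare equal.
bool same_instantiation() {
    return site_a() == site_b();
}
```

So 20 containers of int would still share a single foo<int>, and any extra code would come only from inlining at each call site.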
Is that how it works, or do I have that all screwed up?
Ok, now some questions about your last example:
1. Are you saying that the entire bar function is optimised down to:
    movl $25000, %eax
    retq
Or, is that just the asm for the return statement?
I am hoping it is not the latter.
2. Given my understanding above, are there multiple implementations of foo in the asm? One for each type of the first argument that is passed? Maybe it is still more efficient to inline them rather than have 7 function calls?
I could try to check that myself, but the last time I did asm was 20 years ago, and it was the 16 bit DOS API - shall we say that things are a bit different now :+o
I do know how to compile to asm, and I could look for call instructions with foo in them, but frankly it is easier to ask - I am sure you would know straight away. And your examples would have lots of code in them.
Anyway, it is late at my end - I need to be up for work in about 6 hours, so in about 20 hours time I can fool around with some simpler code, and see what I can learn.
Thanks for your help for today (and in advance)