C Standard Function Inlining

Dec 16, 2009 at 5:35pm
Hi! I have a little question related to inlining...

Basic functions like memcpy(), memset(), strstr(), etc. can be very easily implemented in C by the programmer. On one hand, such implementations will be slower than optimized versions written in assembly by experts. On the other hand, if I manage to write a fast implementation (for example, by copying 8 bytes in each iteration of the memcpy() loop), then even if it's less efficient than the standard implementation, it can be faster when inlined than a non-inlined standard implementation called from a DLL.

So my question is: do C/C++ compilers inline such standard functions? If yes, do they inline them in all cases or only when optimizations are enabled?

Another question (I'm curious): how are memset() and memcpy() usually implemented?
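
For illustration, a minimal sketch of the kind of hand-written copy described above, moving 8 bytes per loop iteration. The name my_memcpy is hypothetical, and this is not how any particular C library implements memcpy(); it only shows the idea of doing more work per iteration.

```c
#include <stddef.h>

/* Hypothetical hand-written copy (not a real library implementation).
   Copies 8 bytes per iteration of the main loop, then finishes the
   remaining 0-7 bytes one at a time. Like memcpy(), it assumes the
   two regions do not overlap. */
static inline void *my_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    while (n >= 8) {
        d[0] = s[0]; d[1] = s[1]; d[2] = s[2]; d[3] = s[3];
        d[4] = s[4]; d[5] = s[5]; d[6] = s[6]; d[7] = s[7];
        d += 8;
        s += 8;
        n -= 8;
    }
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

Because the function is static inline and short, an optimizing compiler is free to expand it at the call site, which is the effect the question is about.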
Dec 16, 2009 at 7:34pm
These functions have been inlined by most C compilers for 20+ years now. They are called intrinsic functions when inlined.
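
A small sketch of what that looks like in practice. The helper names load_u32 and store_u32 are made up for the example. When the size passed to memcpy() is a compile-time constant, optimizing builds of common compilers typically replace the call with a plain load or store instead of calling into the library, though nothing in the C standard requires this.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers. The copy size is a compile-time constant, so an
   optimizing compiler will typically emit a single load or store here
   rather than a call to memcpy(). */
static inline uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

static inline void store_u32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```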
Dec 17, 2009 at 1:33pm
They are already implemented by taking advantage of being able to write whole words to memory
in a single clock cycle if the address is aligned properly.
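
A rough sketch of that idea for memset(), assuming a machine word is sizeof(uintptr_t) bytes. The name my_memset is hypothetical and real library versions are usually hand-tuned per platform. The fill proceeds byte by byte up to a word boundary, then a word at a time, then byte by byte for the tail.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fill routine (not a real library implementation). */
static void *my_memset(void *dst, int value, size_t n)
{
    unsigned char *p = (unsigned char *)dst;
    const unsigned char byte = (unsigned char)value;
    uintptr_t word = 0;
    size_t i;

    /* Build a word that holds the fill byte in every position. */
    for (i = 0; i < sizeof word; i++)
        word = (word << 8) | byte;

    /* Head: single bytes until p sits on a word boundary. */
    while (n > 0 && (uintptr_t)p % sizeof word != 0) {
        *p++ = byte;
        n--;
    }

    /* Body: one aligned word per iteration. The fixed-size memcpy()
       keeps the store within the aliasing rules; optimizing compilers
       typically turn it into a single store instruction. */
    while (n >= sizeof word) {
        memcpy(p, &word, sizeof word);
        p += sizeof word;
        n -= sizeof word;
    }

    /* Tail: whatever is left after the last full word. */
    while (n > 0) {
        *p++ = byte;
        n--;
    }
    return dst;
}
```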
Dec 17, 2009 at 6:12pm
Thanks for the answers!
I have another question about inlining: let's say there's a short inline function first(). It calls another short inline function second(), and second() calls a short inline function third(). Suppose that if I manually replaced the calls to second() and third() with their contents, so that first() contained no calls to either of them, first() would be short enough for the compiler to inline. Will the compiler recognize this if I don't do the replacement manually? In other words, will the compiler take into consideration the length of the functions called inside first() and their inline-ness, and decide whether to inline first() using this information?

I'm asking because I have a little dilemma: There is a library for which I'm writing a wrapper. One of the library's functions is wrapped by two functions: one takes the same parameters and passes them to the library function (after checking that they're valid, if debugging mode is on), and the other takes only some of the parameters and passes constant values for the rest of the library function's parameters. The question is whether the second wrapper function should be implemented by calling the first wrapper function or by calling the library function directly. The advantage of the second option is that the wrapper function will be inlined, while with the first option I'm not sure, because it's inline-inside-inline, which could maybe prevent inlining if many inline functions call each other. But the advantage of the first option is much more global: it concerns software structure and correct engineering. Every library function should have only one direct wrapper (an inline function that calls only that specific library function) so that the error-checking code is written once. Otherwise, any change to the first wrapper due to updates would also have to be applied to the second. This is what code re-use and smart software engineering are supposed to solve, here by having the second wrapper call the first.

Thanks for reading...I know long questions don't attract people...
Dec 17, 2009 at 7:19pm
inline is just a hint to the compiler. It may or may not actually inline the function for you. And even if one compiler does, that doesn't mean that another one will do the same.

Typically, debug builds force inlining off.
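
For completeness, the common compilers offer non-standard ways to make the request stronger or to forbid inlining. A sketch, where the macro names FORCE_INLINE and NO_INLINE and the two functions are made up for the example. These are vendor extensions, and even the "force" variants are not an absolute guarantee in every situation.

```c
/* Hypothetical portability macros around vendor extensions.
   None of this is standard C. */
#if defined(_MSC_VER)
#  define FORCE_INLINE static __forceinline
#  define NO_INLINE    __declspec(noinline)
#elif defined(__GNUC__)   /* GCC and Clang */
#  define FORCE_INLINE static inline __attribute__((always_inline))
#  define NO_INLINE    __attribute__((noinline))
#else                     /* unknown compiler: fall back to the plain hint */
#  define FORCE_INLINE static inline
#  define NO_INLINE
#endif

/* Asks the compiler to expand this at every call site. */
FORCE_INLINE int add_one(int x) { return x + 1; }

/* Asks the compiler to keep this as a real function call. */
static NO_INLINE int add_two(int x) { return x + 2; }
```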
Dec 17, 2009 at 8:22pm
Bit like the register keyword ^^
Dec 18, 2009 at 5:31pm
I know inline is just a hint...I'm asking how things actually work in the popular compilers (GCC, VC++ ... ). And I'm interested in how they work in the final, optimizing compilation. If I can trust compilers to inline any function whose length (the number of operations done from the beginning until it returns) is short enough, then I won't be worried about those first()-calls-second()-calls-third() cases. Otherwise I'll try to avoid them when it's not too much extra work.
Dec 18, 2009 at 8:43pm
The point is, you can't trust it to inline anything. It's a hint.
Dec 19, 2009 at 6:42am
I understand, but I'm not asking about C++ in general, but about specific C++ compilers. If anyone knows how inlining works in g++/VC++, I'd like to know too...
Dec 19, 2009 at 6:52am
Well, g++ is open source, so...

There's really no surefire way of knowing whether a function will be inlined or not. It even depends on the context of the call.
Dec 19, 2009 at 1:31pm
If you find yourself targeting a compiler for a feature rather than working around its shortcomings, it's an indication your design is headed down the wrong path.
Dec 19, 2009 at 6:20pm
"working around its shortcomings" - exactly. That's what I'm trying to do. Here's a new version of the original question, which will help clarify what exactly I'm trying to find out:

There are four functions: f1(), f2(), g() and h(). All four functions are very short, but they are not marked as inline in the code. f1() and f2() do exactly the same thing. The difference is that a call to g() is included inside f1()'s definition, and a call to h() is included inside g()'s definition. g()'s and h()'s code is written directly into f2(), so it doesn't have calls to them.
There's a simple program with two versions: one uses f1() somewhere in the code, the other uses f2() instead (that's the only difference between the two versions of the program). Now we compile the first program and the compiler decides to inline f1(), g() and h() because they are very short and worth inlining. So calls to them are replaced with their contents. The question is: given this information, can I trust compilers to inline f2() when I compile the second program?
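
To make the setup concrete, here is one possible shape of the four functions. The bodies are invented for the example, and the static keyword is an assumption added so the compiler can see that no other file needs them. Optimizing compilers generally make the inlining decision on their internal representation of the code rather than on the length of the source text, so f1() and f2() would be expected to be treated much alike, but that is an expectation and not something the language guarantees.

```c
/* Hypothetical bodies: h() and g() are short helpers,
   and f1() reaches h() through g(). */
static int h(int x)  { return x * x; }
static int g(int x)  { return h(x) + 1; }
static int f1(int x) { return g(x) * 2; }

/* f2() does the same work with the bodies of g() and h()
   written in by hand, so it contains no calls. */
static int f2(int x) { return (x * x + 1) * 2; }
```

The only reliable check for a given compiler and set of options is to build both versions and compare the generated code.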
Dec 20, 2009 at 3:56am
No, you can't, as kbw mentioned above.

Compilers are not required to inline anything.
Dec 22, 2009 at 4:47pm
I'm not asking about compilers in general, but specifically about the existing compilers. They are not required to inline anything, but they do inline functions, right? So I'd just like to know how compilers usually decide whether a function is worth inlining. Do they use the length of the C++ code when calculating the length of the byte-code definition of the function? In other words, does the WAY THE FUNCTION IS WRITTEN IN C++ affect the decision, even if all versions of the function end up producing the same byte-code? If anyone knows the answer, I'd like to know it too :)
Dec 22, 2009 at 7:00pm
The function body has to be available to the compiler at the call site. That's about
the only thing I can think of that would/should affect the decision to inline.

Older compilers used to balk at functions that had loops or multiple return points,
but I think compilers these days handle those fine.
Dec 23, 2009 at 4:43pm
So if function f1() only calls f2() (and f2() is a library function), f1() will be inlined? And what if f0() calls f1(), which calls f2(), which is a library function? Will f0() be inlined on recent compilers? (In other words, do they "understand" that not inlining f1() and f0() is a pure waste of running time?)
Dec 23, 2009 at 6:01pm
You should probably try it yourself: compile a simple program into assembly. From the assembly code you can find out how many functions the compiler inlined.
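
A minimal sketch of such a test. The file name inline_test.c and the function names are made up, and the commands in the comment are the usual way to get an assembly listing from GCC and from the Visual C++ command-line compiler.

```c
/* inline_test.c (hypothetical file name)
 *
 * Ask the compiler for assembly instead of an object file:
 *   gcc -O2 -S inline_test.c       writes inline_test.s
 *   cl /O2 /c /FA inline_test.c    writes inline_test.asm
 *
 * On x86, a function call appears as a "call" instruction followed by
 * the function's name. If the code generated for use_twice() contains
 * no call to twice(), the compiler inlined it.
 */
static int twice(int x)
{
    return 2 * x;
}

int use_twice(int x)
{
    return twice(x) + 1;
}
```

Comparing the listing from an optimized build with one from an unoptimized build (for example -O0 with GCC) shows the difference clearly.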
Dec 23, 2009 at 6:18pm
Why do you care so much whether the compiler inlines a function or not? You can't take advantage of the fact in code either way, so what difference does it make?
Dec 26, 2009 at 6:07pm
R0mai: How do I get the assembly code, and how do I recognize function calls? (I have very basic knowledge of assembly...)

helios: Actually I can optimize my code using this information. I'm writing a library, whose low layer is just a wrapper of an existing library (DLL). Most functions in the wrapper layer simply run one of the library functions. They either take the same parameters, or take only some of the library function's parameters and pass constant values to the other parameters.
But some of the library functions are special: they have 2 wrappers. One takes the same parameters the library function takes and passes them to it; the other wrapper takes only some of them and passes constant values for the library function's other parameters. Both wrapper functions have to exist because in debugging mode they don't just run the library function but also check parameter validity and do some other tests for debugging purposes.

In terms of design, having both wrappers call the library function directly is not a good idea because the debugging code for the parameters common to both functions will have to be typed twice (or copied from one to the other...) and if someone changes the first function but forgets to change the second one, there will be problems. A better idea is to have the second wrapper function call the more general, first wrapper function. Then the second wrapper function doesn't need any debugging code because the first wrapper function already has it. Since these are just simple wrappers with very simple definitions that will probably never change, the design issue is not a problem.
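
A sketch of the layered version. Every name here is hypothetical: lib_draw stands in for the real DLL function, and assert() with NDEBUG stands in for whatever debugging switch the wrapper layer actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real library function (hypothetical). */
static int lib_draw(const char *text, int x, int y, int flags)
{
    (void)text;
    return x + y + flags;
}

/* General wrapper: same parameters as the library function.
   assert() expands to nothing when NDEBUG is defined, so in a
   release build this is only the library call. */
static inline int wrap_draw(const char *text, int x, int y, int flags)
{
    assert(text != NULL);
    assert(x >= 0 && y >= 0);
    return lib_draw(text, x, y, flags);
}

/* Convenience wrapper: takes some of the parameters, passes constants
   for the rest, and reuses the checks by going through wrap_draw(). */
static inline int wrap_draw_at_origin(const char *text)
{
    return wrap_draw(text, 0, 0, 0);
}
```

With this layout the parameter checks live in one place, and in a release build each wrapper reduces to a single call that the compiler can collapse if it chooses to.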

Speed is more important. Since the wrapper functions (when debugging is disabled) are very, very short (just a function call...), common compilers like g++ and VC++ will probably inline them. But I posted this topic here, in the forums, because I'm not sure whether they will inline the second wrapper function (which calls the first wrapper function, which calls the library function). If compilers decide whether to inline a function using its length in byte-code as a parameter, then I'll have the second wrapper function call the first one. But if writing the code this way does affect the decision and can make the compiler decide not to inline the second wrapper function (while it would inline the other version, which calls the library function directly), then I'll prefer to have both wrappers call the library function directly.

Once I find out how to see the assembly code I'll know the answer (unless someone posts the answer here before that...).
Dec 26, 2009 at 6:25pm
"Actually I can optimize my code using this information."
No, you can't. The compiler decides whether to inline based on the code, so you can't base your code on the compiler's decision to inline or not.
If you want to optimize, you have to assume the compiler doesn't inline.

I think you're putting too much brainpower into something very unimportant. Are the library functions really that fast that a couple of function calls will produce a measurable slow down?