Question 5 about optimization

Probably my last optimization question, and it is about initialization.

Part I
Consider the two snippets below:

int main()
{
    int nbr;
}


int main()
{
    int nbr = 9;
}


I think the second snippet adds one operation for the compiler to do, which is to initialize the variable nbr. In terms of performance I think the first snippet is better. Am I wrong?

============================

Part II
Initialization is always recommended, but I think sometimes it is not needed. In my code below I think L4 and L5 do not need initialized variables, and I think initializing them will add two unneeded operations like the one mentioned above (if I am correct about it), done by the compiler. Am I wrong here?

1
2
3
4
5
6
7
8
9
int main()
{
    int nbr = 0,dupNbr = 0;
    std::cin >> nbr;
    dupNbr = nbr * 2;
    std::cout << dupNbr << std::endl;
    // as we can see, there is no need to initialize those variables
    return 0;
}
It is slower to initialize.
As with the other questions, visualize the assembly.
Let's assume the compiler actually needs to create the integer in memory, rather than keep it in a register, as we talked about.
In that case, it puts the value on the stack, the usual place for local variables. If the CPU / OS / whatever allows a push with a value directly in the hardware, there is no extra cost. If it has to adjust the stack pointer manually and then move the value into the memory location, it costs an extra step vs just adjusting the stack pointer. I don't know if modern hardware has a push-with-value instruction or not. I've forgotten a great deal of this stuff.
Lx and Ly do not need a initialized variables


Lx, Ly ???

Do you mean nbr and dupNbr?

Yes, nbr doesn't actually need to be initialised as its value is obtained from the cin statement.

You won't define dupNbr on L3. You'd define and initialise it on L5:

#include <iostream>

int main() {
	int nbr;

	std::cin >> nbr;

	const auto dupNbr { nbr * 2 };

	std::cout << dupNbr << '\n';
}


However IMO I still initialise all variables when defined - unless there's a very good reason why not.

Note that variables with static storage duration (including globals) are zero-initialised when defined - even if no specific value is provided.
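A minimal sketch of that difference, in case it helps. The names global_counter, next_id and local_example are made up for this illustration and are not from the code above:

```cpp
// Static storage duration: zero-initialised before main() runs,
// even though no initial value is written here.
int global_counter;

int next_id() {
    static int last_id;   // also static storage duration: starts at 0
    return ++last_id;     // first call returns 1, second call returns 2, ...
}

int local_example() {
    int local;            // automatic storage duration: indeterminate value
    local = 42;           // so it must be assigned before it is read
    return local;
}
```

Reading local before the assignment would be undefined behaviour, while reading global_counter or last_id without ever assigning to them is fine.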
Lx, Ly ???
L4 and L5. I cannot preview my question when I post it for the first time: https://legacy.cplusplus.com/forum/lounge/284826/
In part 1 I think nbr is pointless in both snippets. With optimizations turned on there would be no difference in terms of performance.


Should you initialize variables before using std::cin?

Since C++11, if std::cin fails, but is not already in "fail mode", it will set the variable to 0, so accessing the variable afterwards won't lead to UB.
std::cin.clear(); // makes sure cin is not in "fail mode"
int x;
std::cin >> x;
std::cout << x; // Fine. If the read operation failed this will print 0. 

But if std::cin is already in "fail mode" then it will return immediately without setting the value.
std::cin.clear(); // makes sure cin is not in "fail mode"
int x;
std::cin >> x;
int y;
std::cin >> y;
std::cout << x; // Fine, as above.
std::cout << y; // UB if the first read operation failed 

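The two cases above can be tried out without typing at a console by reading from a std::istringstream instead of std::cin. This is only a sketch: read_two is a hypothetical helper, and both variables start at -1 (a sentinel picked for this example) so that inspecting them is never UB.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: tries to read two ints from `text` and returns
// whatever ended up in the two variables.
std::pair<int, int> read_two(const std::string& text) {
    std::istringstream in(text);
    int x = -1;
    int y = -1;
    in >> x;   // parse failure (C++11 and later): x is set to 0, failbit is set
    in >> y;   // stream already in "fail mode": returns immediately, y untouched
    return {x, y};
}
```

With the input "abc 5" the first read fails and stores 0, and the second read never touches y, so the result is {0, -1}. With "3 4" both reads succeed and the result is {3, 4}.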

Always initializing the variables seems like the safer option.
int age{};
std::cin >> age;
std::cout << "You are " << age << " years old.\n";

On the other hand,
- someone reading the code might wonder why you are initializing the variable to that particular value (using {} instead of = to initialize the variable might make it more obvious the value is unimportant),
- using the value after the read operation has failed is most likely a mistake anyway (you should instead check if it failed and not use the variable in that case), and
- if you run your code through a UB sanitizer it won't tell you something is wrong if you accidentally use the variable (it won't do this in all cases anyway, as we have seen above, but still).
int age;
if (std::cin >> age)
{
   std::cout << "You are " << age << " years old.\n";
}
else
{
   std::cout << "I don't know how old you are because you have entered an invalid age!\n";
}


Personally I do tend to leave out the initialization if I'm using std::cin (or another std::istream) to give it a value immediately on the next line but it's not primarily a performance concern.
Herb Sutter is someone who I listen to when he mentions something C++:

19. Always initialize variables

Summary
Start with a clean slate: Uninitialized variables are a common source of bugs in C and C++ programs. Avoid such bugs by being disciplined about cleaning memory before you use it; initialize variables upon definition.

Discussion
In the low-level efficiency tradition of C and C++ alike, the compiler is often not required to initialize variables unless you do it explicitly (e.g., local variables, forgotten members omitted from constructor initializer lists). Do it explicitly.

There are few reasons to ever leave a variable uninitialized. None is serious enough to justify the hazard of undefined behavior.

https://www.oreilly.com/library/view/c-coding-standards/0321113586/ch20.html

From
"C++ Coding Standards: 101 Rules, Guidelines, and Best Practices"
https://learning.oreilly.com/library/view/c-coding-standards/0321113586/

C++ has default class member initialization (since C++11):
https://www.learncpp.com/cpp-tutorial/default-member-initialization/
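A short sketch of what that looks like. The Player struct and its members are invented for this example:

```cpp
#include <string>
#include <utility>

struct Player {
    std::string name;     // class type: default-constructed to an empty string
    int score = 0;        // default member initializer
    int lives{3};         // the brace form works as well

    Player() = default;
    // This constructor only mentions name; score and lives still get
    // their default member initializers.
    explicit Player(std::string n) : name(std::move(n)) {}
};
```

Whichever constructor is used, no member is left with an indeterminate value, which covers the "forgotten members omitted from constructor initializer lists" case from the quote.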
I largely agree with @George P's post above, we've been trained that way, right?

However, there is a cost to initialization, and sometimes it's significant. So there are always special cases where such initialization might be deferred.
@seeplus wrote:
You won't define dupNbr on L3. You'd define and initialise it on L5:

Nice observation)

@Peter87 wrote
In part 1 I think nbr is pointless in both snippets. With optimizations turned on there would be no difference in terms of performance.

Here, I just wanted to prove that the compiler adds one operation, but I guess it does the opposite if optimizations are turned on. Here do you mean that the .exe won't store any variable nbr in it?

I do not know if you can understand what I am thinking of. I do not remember exactly, but let's say the program text has 3 variables of type int: nbr, nbr2 and nbr3. I think those variables are located in some sort of table inside the .exe. If this is the case, then with optimization turned on, maybe the compiler won't put that variable inside the .exe because it is useless.
ninja01 wrote:
Here do you mean that the .exe won't store any variable nbr in it?

Variables only really exist in your code. After the program has been compiled there are just machine instructions that use main memory and registers in various ways when executed, but at this level there are no variables.

What I mean is that the resulting executable file will contain the same machine instructions (and therefore have the same performance) as you would have got if you had left the variable out.

I do not remember exactly, but let's say the program text has 3 variables of type int: nbr, nbr2 and nbr3. I think those variables are located in some sort of table inside the .exe.

With debug symbols turned on the names and other debug information would get stored inside the executable file so that debuggers can use it.

Global variables and functions might have mangled names stored for linking purposes.

But if we ignore debug symbols there is no "table" to store local variables. It's up to the compiler to decide how to handle them. It might use a register, or it might use the stack, but those things are not really variables. It's just where the compiler decides to store the values (whether they belong to a variable or not). It's possible that the values of two variables get stored in the same register (e.g. if their usage doesn't overlap), or a value might not get stored at all if it's not used.

There is no one-to-one mapping between C++ code and machine instructions (at least not when optimizations are turned on). All that matters is that the observable behaviour is as expected.
Thank you very much @Peter)
I think sometimes it is not needed. In my code below I think L4 and L5 do not need initialized variables, and I think initializing them will add two unneeded operations like the one mentioned above
The biggest optimization is going from not working, to working code. Uninitialized variables are a big source of bugs (aka, not working code). Worse, sometimes the code appears to work because the variable happens to contain a valid value. Then you add some new code which moves things around a little and bam! That uninitialized variable is no longer valid.

So it's usually a good idea to initialize your variables. Let the compiler decide if it's redundant. The compiler is better at this decision than you are.
To go along with what dhayden said, another pitfall is that what an implementation does with a newly created variable can depend on whether the code is compiled in debug or release mode.

IIRC MSVC++ fills local variables with a fixed byte pattern when they are created in debug mode; in release mode that doesn't happen.
> Uninitialized variables are a big source of bugs

Yes; but incorrectly initialised variables are also sources of bugs.

This is what I tend to do:

1. As far as possible, don't define a variable until there is a valid value to initialize it with.

2. If 1. is not possible, if there is a reasonable default initial value (eg. nullptr for a raw pointer), initialise it with that value.

3. If there is no reasonable default initial value, leave it un-initialised. If we have carelessly forgotten to assign a valid value to it before use, letting the compiler or static analyser catch the un-initialised variable error is better than a spuriously initialised value resulting in run-time madness.
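Rules 1 and 2 might look like this in practice. This is only an illustration; doubled_sum and first_negative are made-up functions:

```cpp
#include <vector>

// Rule 1: define a variable only once there is a valid value for it.
int doubled_sum(const std::vector<int>& values) {
    int sum = 0;                      // 0 is the genuinely correct start for a sum
    for (int v : values) {
        sum += v;
    }
    const int doubled = sum * 2;      // defined where its value is known
    return doubled;
}

// Rule 2: no valid value yet, but nullptr is a reasonable default
// for a raw pointer and can be tested by the caller.
const int* first_negative(const std::vector<int>& values) {
    const int* found = nullptr;
    for (const int& v : values) {
        if (v < 0) {
            found = &v;
            break;
        }
    }
    return found;
}
```

In doubled_sum there is never a moment where doubled exists without a meaningful value, and it can be const as a bonus.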
Also note that a class with a default constructor initialises itself when a variable of that class is defined without any explicit initialiser (e.g. std::string), so the value of, say, a std::string is always known and valid. If a class doesn't have a default constructor, you initialise the variable using one of the constructors it does provide. If the class has no constructors at all, its public member variables can be initialised directly, or they take their default member initialisers if any were given. Members that get neither will, of course, be in an unknown state.
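A small sketch of those cases, using an invented Point struct that has no constructors and no default member initialisers:

```cpp
#include <string>

struct Point {
    int x;
    int y;
};

std::string make_default_string() {
    std::string s;      // default constructor runs: a valid, empty string
    return s;
}

Point make_zero_point() {
    Point p{};          // empty braces: x and y are both set to 0
    return p;
}

Point make_point(int x, int y) {
    Point p{x, y};      // public members initialised directly
    return p;
}

// By contrast, a plain `Point p;` inside a function leaves p.x and p.y
// with indeterminate values until they are assigned.
```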