(0 < size_t(0) - 1) is true

I came across this while writing code that resembles this:

std::vector<int> nums;

for (int i = 0; i < nums.size() - 1; ++i) {
   std::cout << nums.at(i);
}


This code gives a runtime error, and it was hard for me to figure out why.

The reason it gives a runtime error can be narrowed down to the fact that the expression (0 < size_t(0) - 1) is true.

Can somebody please explain why (0 < size_t(0) - 1) is true?
size_t is an unsigned integer type. Arithmetic on unsigned integer types (and only on those) is performed modulo 2^n. If size_t is n bits long, then
-1 ≡ 2^n - 1 (mod 2^n)
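
For example, a minimal sketch (assuming a typical implementation where size_t is 64 bits) that shows the wrap-around directly:

#include <cstddef>
#include <iostream>
#include <limits>

int main()
{
    // 0 - 1 on an unsigned type wraps around to the largest representable value
    std::size_t wrapped = std::size_t(0) - 1;

    std::cout << wrapped << '\n';  // e.g. 18446744073709551615 with a 64-bit size_t
    std::cout << (wrapped == std::numeric_limits<std::size_t>::max()) << '\n';  // 1 (true)
    std::cout << (0 < std::size_t(0) - 1) << '\n';  // 1 (true): the surprising comparison
}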
@mbozzi, Thanks for replying.

mod (2^n) or mod (2^0 + 2^1 + .. + 2^n)?
(2^0 + 2^1 + .. + 2^n is MAX_SIZE)

Also, if it is mod MAX_SIZE then something like
(MAX_SIZE - 1) + 1 would be 0 because MAX_SIZE % MAX_SIZE is 0.

So perhaps you meant mod (MAX_SIZE + 1) ?
Then -1 would be MAX_SIZE?

Is size_t not promoted to int when operating with/comparing against an int? Is the int cast to an unsigned int instead?

edit: OK, so the int would be cast to size_t. But why is the signed operand cast to unsigned when an unsigned is combined with a signed one? If unsigned were cast to signed, we wouldn't have had a problem in this snippet. So why is signed cast to unsigned and not the other way around?
Hello Grime,

The ".size()" function returns a "size_t" which is an "unsigned" something. The "something" will depend on the header files that are used to your compiler.

With the given code the vector is empty, with a size of zero. So nums.size() - 1, or 0 - 1, is -1, but unsigned numbers cannot be negative.

On my computer I did this:
size_t ans{};
std::cout << ans - 1;
And when it ran I got this output: 4294967295. I believe what happened is that it changed the negative number into the largest possible positive value that an "unsigned int" can hold.

At least this time "size_t" is a typedef for an "unsigned int", but that is not always the case.
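
If you want to check what size_t actually is on a given implementation, a quick sketch like this should show it:

#include <cstddef>
#include <iostream>
#include <limits>

int main()
{
    // Width and maximum value of size_t on this particular implementation
    std::cout << "sizeof(size_t): " << sizeof(std::size_t) << " bytes\n";
    std::cout << "max size_t:     " << std::numeric_limits<std::size_t>::max() << '\n';
}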

For what you need the code would be better written as:
std::vector<int> nums;

for (size_t i = 0; i < nums.size(); ++i)
{
    std::cout << nums[i];
}

Removing the "- 1" in the for loop will work better, especially if the vector is empty.

The ".at(i)" is extra work that is not needed. The for loop will keep you from going past the end of the vector, so all you need is nums[i];.

Give that a try and see what happens. With an empty vector the program should just bypass the for loop.

Andy
@Handy Andy

Thanks for replying.

My actual for-loop needs to reference nums[i] and nums[i+1], so it was necessary to have the -1. I fixed it by first checking whether the vector is empty.

I get what has gone wrong now. Thanks.

But why, when a signed and unsigned flavour of the same type are in an expression, is the signed one cast to unsigned and not the other way around? Because the code snippet would have worked if unsigned were cast to signed instead, right?
Hello Grime,

Say you had:
int num1{ 1 };
double num2{ 2.5 };

std::cout << num1 + num2 << '\n';

In this case "num1" would be promoted to a double before the addition and you would output a double.

Unfortunately the language does not do a demotion.

Then something like this:
int num1{ 1 }, ans{};
double num2{ 2.5 };

ans = num1 + num2;

Here "num1" is promoted to a "double before the calculation and the 2 numbers are added as "double"s, but since "ans" is an "int" only the whole number is stored and the decimal portion is dropped. This should produce a compiler warning about loss of data.

In C++ you can always go up, but not down.

Andy
Andy explained the problem. To fix it, change your code to:
for (int i = 0; i+1 < nums.size(); ++i) {
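
Roughly what that looks like in the adjacent-element loop Grime described (a sketch; the index is declared size_t here to avoid mixing signed and unsigned, which is a small variation on the line above):

#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> nums{ 1, 2, 3, 4 };  // also safe when nums is empty

    // With i + 1 < nums.size() the left-hand side never wraps,
    // so an empty vector simply skips the loop
    for (std::size_t i = 0; i + 1 < nums.size(); ++i)
    {
        std::cout << nums[i] << " and " << nums[i + 1] << '\n';
    }
}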
mod (2^n) or mod (2^0 + 2^1 + .. + 2^n)?

A 64-bit unsigned int holds integers in [0, 2^64). Arithmetic is done modulo 2^64, not 2^0 + 2^1 + .. + 2^64 = 2^65 - 1

Try it with a four-bit int to help convince yourself.
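
A rough way to play with that on a normal machine is to reduce every result modulo 16 by hand, simulating a four-bit unsigned integer:

#include <iostream>

int main()
{
    // Simulate a 4-bit unsigned integer: results are reduced modulo 2^4 = 16,
    // not modulo 2^0 + 2^1 + 2^2 + 2^3 = 15 (15 is the maximum value, not the modulus)
    const unsigned modulus = 16;

    unsigned a = (0u - 1u) % modulus;   // "0 - 1" in four bits
    unsigned b = (15u + 1u) % modulus;  // "max + 1" in four bits

    std::cout << a << '\n';  // 15 : -1 wraps to the maximum value 2^4 - 1
    std::cout << b << '\n';  // 0  : max + 1 wraps back around to 0
}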

But why, when a signed and unsigned flavour of the same type are in an expression, the signed one is casted to unsigned and not the other way around?

This behavior is inherited from C. I don't know the exact rationale for the choice, but it's probably related to the fact that unsigned integer types historically have more strongly-defined semantics, whereas unsigned->signed conversions resulted in implementation-defined values.
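
The rule is easy to see in a comparison between an int and an unsigned int of the same rank (a small sketch; compilers typically warn about the first comparison):

#include <iostream>

int main()
{
    // Usual arithmetic conversions: when int meets unsigned int (same rank),
    // the signed operand is converted to unsigned, not the other way around
    std::cout << (-1 < 1u) << '\n';                    // 0 (false!): -1 converts to UINT_MAX
    std::cout << (-1 < 1) << '\n';                     // 1 (true)  : ordinary signed comparison
    std::cout << (-1 < static_cast<int>(1u)) << '\n';  // 1 (true)  : explicit cast restores the intent
}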
Thanks for replying @Handy Andy

Thanks @dhayden, that's neat.

@mbozzi, thanks, you're right about modulo 2^64.

This behavior is inherited from C. I don't know the exact rationale for the choice, but it's probably related to the fact that unsigned integer types historically have more strongly-defined semantics, whereas unsigned->signed conversions resulted in implementation-defined values.


Thanks. One question. Does how signed integers are represented, how floating point numbers are represented, and how big the primitive types are depend on the compiler or on the computer running the program?

If it depends on the computer, can you specify how many bits you want to use to represent a floating point number or an integer, or is that fixed by the computer?
Signed integers are (almost always) represented using 2's complement. See https://en.wikipedia.org/wiki/Signed_number_representations

Floating point numbers are represented using the IEEE 754 standard. See https://en.wikipedia.org/wiki/IEEE_754

The size is fixed by the C++ standard/compiler. For MS, see https://docs.microsoft.com/en-us/cpp/cpp/data-type-ranges?view=vs-2019

Other OSes/compilers may be slightly different - that's why you don't rely on int being 32 bits etc. Also, for 32-bit compilers pointers are 32 bits, for 64-bit compilers they are 64 bits, and so on.
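
When the exact width matters, the usual approach is to ask for it explicitly with the <cstdint> types rather than relying on int or long, e.g.:

#include <cstdint>
#include <iostream>

int main()
{
    std::int32_t a = 0;        // exactly 32 bits (where the platform provides such a type)
    std::uint64_t b = 0;       // exactly 64 bits
    std::int_least16_t c = 0;  // at least 16 bits, always available

    std::cout << sizeof a << ' ' << sizeof b << ' ' << sizeof c << '\n';              // 4 8 2 (typically)
    std::cout << sizeof(int) << ' ' << sizeof(long) << ' ' << sizeof(void*) << '\n';  // varies by platform
}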
Signed integers are represented by 2's complement

C++20 guarantees it, which eliminates some implementation-defined behavior.
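
One concrete case it pins down: converting an out-of-range value to a signed integer type used to be implementation-defined, but since C++20 it is guaranteed to wrap modulo 2^n. A small sketch (assuming a 32-bit unsigned int):

#include <iostream>

int main()
{
    unsigned int u = 4294967295u;  // UINT_MAX, assuming a 32-bit unsigned int

    // Implementation-defined before C++20; guaranteed modular (two's complement) since C++20
    int s = static_cast<int>(u);

    std::cout << s << '\n';  // -1
}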
But is it the compiler which specifies how signed integers are represented, or the computer?

If it's the computer, how would a C++20 compiler guarantee that signed integers are represented in 2's complement?

If it's the compiler, can the compiler specify how a particular primitive data type is represented in bits, how big it is, and how operations are performed on it? I used to think that the compiler has no control over how the primitive types work and that it's the computer which is responsible for them.
But is it the compiler which specifies how signed integers are represented, or the computer?

The compiler could theoretically emulate two's complement integers in software, though I assume any decent compiler would yield to the machine's natural representation for integers. (Software emulation of floating-point is somewhat common, in the case of devices without hardware support.)

Anyway, a search indicated that C++ users exclusively write code for two's-complement machines; e.g., LLVM, MSVC, and GCC only support two's complement.

None of the WG14 (the C standard committee) members present at Brno were aware of any extant architecture with a modern C compiler that wasn't two's complement:
p0907r4 wrote:
WG14 met in Brno to discuss [N2218]. The paper was received very positively, especially given that no one in the room knew of an extant architecture that was not two’s complement for which there was a reasonably modern C compiler. The closest anyone came was the Unisys ClearPath compiler documentation which says:

Two’s complement arithmetic is used on many platforms. On ClearPath MCP systems, arithmetic is performed on data in signed-magnitude form. This can cause discrepancies in algorithms that depend on the two’s complement representation.

However, this compiler documentation also says that they only target C90, and was last updated on 2017.

https://wg21.link/p0907
Thanks