Can I use std::int8_t* where I want signed char*?

I like to create shorter names for fixed-width integer types:
using i32 = std::int32_t;
using i64 = std::int64_t;
//...  

But I break the pattern for i8 (and u8):
using i8 = signed char;
Since I want to alias stuff with i8*.

Could I replace the definition of i8 with
using i8 = std::int8_t;
without breaking my code?

Thanks!
Yes, if this assertion holds:
static_assert( std::same_as< signed char, std::int8_t > ) ;
mbozzi wrote:
using i8 = signed char;
Since I want to alias stuff with i8*.

The standard only guarantees you can do that with char, unsigned char and std::byte.

signed char is not on that list.
Note that the signedness of char can be set by the compiler. E.g. with VS, to have char as unsigned, use the /J option.
signed char is not on that list.

Thanks, I could have sworn it was all narrow character types & byte.

It looks like I've escaped problems so far since nobody actually uses signed char for that. Including me, topic post notwithstanding - I either use plain char or u8.

To me, JLBorges's post implies that it's allowed for uint8_t to differ from unsigned char. Do you guys know of any implementations where there's a difference? Or are they always the same in practice?
I remember reading a few days ago about a DSP or some other embedded platform where char, short, int, and std::int16_t were all the same type. char is just the addressable unit of memory (i.e. the byte), which can be larger or smaller than an octet.
There are quite a few DSPs with 16-bit chars (and I think one with 32-bit?), but technically uint8_t should not be defined there, only uint_least8_t.
It seems like GCC allows signed char to alias other types just like char and unsigned char.

I think it would have been a good thing if std::(u)int8_t had been implemented as separate types that were not allowed to alias, because it would have led to more efficient code.

C++20 added a non-aliasing type the size of unsigned char, named char8_t, but it's intended to be used to store UTF-8 string data. If the standard doesn't add new non-aliasing 8-bit types (non-aliasing in practice, not just on paper), then I'm afraid people will be tempted to start using char8_t as a regular integer type.
> it's allowed for uint8_t to differ from unsigned char

An implementation with CHAR_BIT == 8 where char is unsigned (i.e. std::signed_integral<char> is false) may, in theory, define:
using uint8_t = char ; // unsigned integer type with width of exactly 8 bits
Thanks guys, I'm starting to put the pieces together.

It's dubious to use either uint8_t* or int8_t* for pointer aliasing, because some of the implementation's viable options technically won't work:

|                 | char     | unsigned char | signed char  | char8_t      | language extension |
|-----------------+----------+---------------+--------------+--------------+--------------------|
| using uint8_t = | aliasing | aliasing      |              | non-aliasing | maybe aliasing     |
| using int8_t =  | aliasing |               | non-aliasing |              | maybe aliasing     |

The table assumes that char has the right signedness. And if CHAR_BIT > 8 then the system cannot provide uint8_t or int8_t at all, since no type is narrower than char.

using i8 = signed char;
Since I want to alias stuff with i8*.

The standard only guarantees you can do that with char, unsigned char and std::byte.


VS aliases int8_t to signed char. As the signedness of char can be changed by a compiler option, an alias to char would mean that the signedness of the alias would also depend upon that option. So if signed char couldn't be used and i8 was an alias for char, then i8 could be either signed or unsigned...

This is different for int etc.: int always means signed int.

Ok good point. That makes the choice of char even less viable for implementations where those compiler options exist.
GCC and Clang have -funsigned-char and -fsigned-char too.

It follows that int8_t* is quite unlikely to alias without some language extension being involved.
It's also 'interesting' that irrespective of how char is defined by the compiler, std::same_as fails when comparing char with either signed char or unsigned char. It only succeeds when comparing char with char!

Consider for VS:

#include <iostream>
#include <concepts>

int main() {
#ifdef _CHAR_UNSIGNED
	std::cout << "unsigned\n";
	static_assert(std::same_as<char, unsigned char>);  // FAIL
	static_assert(std::same_as<char, signed char>);    // FAIL - as expected
#else
	std::cout << "signed\n";
	static_assert(std::same_as<char, unsigned char>);  // FAIL - as expected
	static_assert(std::same_as<char, signed char>);    // FAIL
#endif
}


All these static_asserts fail!

So really you have 3 types of char - char, signed char and unsigned char!

For int:

static_assert(std::same_as<int, signed>);
static_assert(std::same_as<int, signed int>);


both evaluate true.
> So really you have 3 types of char - char, signed char and unsigned char!

It has been that way for decades.

C++98:
Plain char, signed char, and unsigned char are three distinct types. 3.9.1/1
Yes I know. I've used C++ since before C++98 - but hands up at the back for those that didn't. It's not something I think is intuitive, especially considering int/signed int...

Consider:

#include <iostream>

int main() {
	const char a = 200;

#ifdef _CHAR_UNSIGNED
	std::cout << "unsigned char value: ";
#else
	std::cout << "signed char value: ";
#endif

	std::cout << "as int " << (int)a << " as unsigned " << (unsigned)a << '\n';
}


which gives one of these two outputs, depending on the signedness of char:


unsigned char value: as int 200 as unsigned 200

signed char value: as int -56 as unsigned 4294967240


which helps to explain the casting required for the <cctype> functions' arguments. So if the compiler treats char as unsigned, you don't need all that nasty casting for the args to these functions (although you still do for the return value).