How to convert 3 chars to integer and how to extract 3 components of integer?

I would like to do the following:

I set three color components:

unsigned char R;
unsigned char G;
unsigned char B;
unsigned int color;


I would like to pack the three components into color, so that the first 8 bits will represent R, the second 8 bits will represent G, and B will be the third group of 8 bits. (I don't actually need the last group of 8 bits in the integer.) I know this should be simple to do with bitwise operators, but I was never good with these binary operations. Could you please show me how to get these 3 values into an integer and how to extract them back?
closed account (2UD8vCM9)
I'm sure there is a better way, but this is the only way I know about doing something like this.

Note: On Windows 7 64-bit, Integer = 4 bytes, which is why I have the unsigned char Padding: it fills the last byte of the integer.

#include <cstring>  // memcpy
#include <iostream>
using std::cout;
using std::endl;

int ConvertRGBToInt(unsigned char R, unsigned char G, unsigned char B)
{
	int ReturnInt = 0;
	unsigned char Padding = 0;
	unsigned char buffer[4];
	buffer[0] = R;
	buffer[1] = G;
	buffer[2] = B;
	buffer[3] = Padding;
	memcpy((char*)&ReturnInt, buffer, 4); //Copy 4 bytes from our char buffer to our ReturnInteger
	return ReturnInt;
}

void GetRGBFromInt(int IntegerToConvert, unsigned char&R, unsigned char&G, unsigned char&B)
{
	unsigned char buffer[4];
	memcpy(buffer, (char*)&IntegerToConvert, 4); //copy 4 bytes from our integer to our char buffer
	R = buffer[0];
	G = buffer[1];
	B = buffer[2];
}

int main()
{
	unsigned char R = 200;
	unsigned char G = 0;
	unsigned char B = 0;
	int Color = 0;
	Color = ConvertRGBToInt(R, G, B); //Set the integer color


	GetRGBFromInt(Color, R, G, B); //Get RGB from given color
	cout << "R:" << (int)R << endl;
	cout << "G:" << (int)G << endl;
	cout << "B:" << (int)B << endl;
	return 0;
}


Let me know if you have any questions.
Integer = 4 bytes
Or 2 (old PCs and some controllers) or 8 (some specific hardware). If you need at least some guarantees, use int32_t or int_least32_t.
closed account (2UD8vCM9)
Thanks for adding that MiiNiPaa, I completely disregarded the fact that an integer may be a different size depending on what you're working with.
Doesn't memcpy-ing from/to an array to/from an int, instead of just bitshifting and adding, introduce endianness considerations?
And I actually made a mistake: the size of int on the specific hardware I mentioned is 8 octets. sizeof(int) will give you 1, as that device uses a 64-bit char (so size of char == short == int == long == long long).
Then I mean uint32_t as the integer; the integer should represent the color.
Doesn't memcpy-ing from/to an array to/from an int, instead of just bitshifting and adding, introduce endianness considerations?
Yes. So this example is not really portable. However it works for x86.
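To make the portability point concrete, here is a minimal sketch that compares the two approaches. The function names (pack_by_shift, pack_by_memcpy, check_layouts) are made up for illustration, and the sketch assumes 8-bit bytes and a host that is either little-endian or big-endian.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Packs by value: R always ends up in the least significant 8 bits.
std::uint32_t pack_by_shift(unsigned char r, unsigned char g, unsigned char b)
{
    return std::uint32_t(r) | (std::uint32_t(g) << 8) | (std::uint32_t(b) << 16);
}

// Packs by memory layout: R ends up in the lowest-addressed byte, so the
// numeric value of the result depends on the machine's byte order.
std::uint32_t pack_by_memcpy(unsigned char r, unsigned char g, unsigned char b)
{
    const unsigned char buffer[4] = { r, g, b, 0 };
    std::uint32_t result = 0;
    static_assert(sizeof(result) == sizeof(buffer), "sketch assumes 8-bit bytes");
    std::memcpy(&result, buffer, sizeof(result));
    return result;
}

// Documents the layouts: the shift version is always 0x00BBGGRR, while the
// memcpy version matches it only on a little-endian host.
void check_layouts()
{
    assert(pack_by_shift(0x11, 0x22, 0x33) == 0x00332211u);
    const std::uint32_t m = pack_by_memcpy(0x11, 0x22, 0x33);
    // 0x00332211 on little-endian hosts, 0x11223300 on big-endian hosts.
    assert(m == 0x00332211u || m == 0x11223300u);
}
```

If the integer never leaves the program, either version round-trips; the difference only shows once the numeric value or the byte layout is observed from outside.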

Example of using bitwise operations:
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::uint32_t
#include <iostream>
#include <tuple>

using uchar = unsigned char;

std::uint32_t color_to_int(uchar R, uchar G, uchar B)
{
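    // Widen first so the shifts are done in 32 bits rather than in int
    return std::uint32_t(R) | (std::uint32_t(G) << CHAR_BIT) | (std::uint32_t(B) << CHAR_BIT*2);
    //In this case you can use + instead of |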
}

std::uint32_t color_to_int(std::tuple<uchar, uchar, uchar> t)
{
    return color_to_int(std::get<0>(t), std::get<1>(t), std::get<2>(t));
}

std::tuple<uchar, uchar, uchar> int_to_color(std::uint32_t color)
{
    uchar R = color & uchar(-1);
    color >>= CHAR_BIT;
    uchar G = color & uchar(-1);
    color >>= CHAR_BIT;
    uchar B = color & uchar(-1);
    return std::make_tuple(R, G, B);
}

int main()
{
    uchar R = 127, G = 255, B = 0;
    bool correct = int_to_color(color_to_int(R, G, B)) == std::tie(R, G, B);
    std::cout << correct;
}
like this:
R = (color & 0xff)
G = ((color >> 8) & 0xff)
B = ((color >> 16) & 0xff)


@MiiNiPaa

sizeof is supposed to return number of bytes an object would use in memory:

http://en.cppreference.com/w/cpp/language/sizeof
Doesn't memcpy-ing from/to an array to/from an int, instead of just bitshifting and adding, introduce endianness considerations?

Yes it does. h4ever, when you say that
first 8 bits will represent R
do you mean the byte with the lowest address? The most significant byte in the int? Or does it not matter?

This version is based on Pindrought's but uses a union. It stores the R,G,B components in the most significant to least significant bytes, regardless of byte ordering. To put them in the lowest addressed to highest addressed bytes, change line 21 (return htonl(result.i);) to return result.i; and change line 29 (tmp.i = htonl(IntegerToConvert);) to tmp.i = IntegerToConvert;
#include <iostream>
#include <arpa/inet.h>

using std::cout;
using std::endl;

union RGBInt32 {
    uint32_t i;
    unsigned char bytes[4];  // unsigned, so values above 127 are stored unchanged
};


int
ConvertRGBToInt(unsigned char R, unsigned char G, unsigned char B)
{
    RGBInt32 result;
    result.bytes[0] = R;
    result.bytes[1] = G;
    result.bytes[2] = B;
    result.bytes[3] = 0;
    return htonl(result.i);
}

void
GetRGBFromInt(int IntegerToConvert, unsigned char &R, unsigned char &G,
              unsigned char &B)
{
    RGBInt32 tmp;
    tmp.i = htonl(IntegerToConvert);
    R = tmp.bytes[0];
    G = tmp.bytes[1];
    B = tmp.bytes[2];
}

int
main()
{
    unsigned char R = 200;
    unsigned char G = 100;
    unsigned char B = 50;
    int Color = 0;
    Color = ConvertRGBToInt(R, G, B);   // Set the integer color
    GetRGBFromInt(Color, R, G, B);      // Get RGB from given color
    cout << "R:" << (int) R << endl;
    cout << "G:" << (int) G << endl;
    cout << "B:" << (int) B << endl;
    return 0;
}

@coder777 I know, I corrected myself in second post by saying that it will return size of 1 64bit byte.

@dhayden reading data from union member different from last written yields undefined behavior. It is a common misuse of unions.
@coder777: Thanks this is what I wanted. And how to reverse it to join the R,G,B to integer?

Notice:
Right now I have one more need. I have the string "\000\000\000\020", 4 bytes read from a bmp file. This string represents the offset of the bmp header. I need to convert it to an integer, but this: uint32_t header_offset; header_offset = (int) n; gets 0. How correctly convert it to get position in file?
How correctly convert it to get position in file?

You can extract the bytes one at a time and construct the int as MiiNiPaa has. If the data is aligned right you could also do header_offset = ntohl(*(int32_t*)&str); where str is the address of the 4 bytes.

reading data from union member different from last written yields undefined behavior

That makes sense because it depends on the hardware. But in cases like this, reinterpreting the bits is what's intended. The method you've given is more portable but quite likely results in a lot more code than simply reinterpreting the memory.
Undefined behavior != implementation defined. It is worse. With implementation defined behavior you have guarantees given by architecture and system. With undefined behavior, you have no guarantees. Compiler has all rights to just, say, never write stuff at all: you never read it anyway, and so read would give you some random data which happened in memory.
Never expect that assigning something will actually write stuff. It can be optimised away entirely (if it is not used), it might be stored in registers, or it might just be calculated at compile time and never actually do anything at runtime.
MiiNiPaa wrote:
I corrected myself in second post by saying that it will return size of 1 64bit byte.
No, sizeof(int) will not return 1. It has nothing to do with the size of char. 1 Byte == 8 bit

h4ever wrote:
Thanks this is what I wanted. And how to reverse it to join the R,G,B to integer?
See MiiNiPaa's color_to_int(...)

h4ever wrote:
How correctly convert it to get position in file?
header_offset = d[0];
header_offset <<= 8;
header_offset |= d[1];
header_offset <<= 8;
...
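The snippet above builds the value with d[0] as the most significant byte. A hedged sketch of that idea as complete functions follows, together with the opposite order for comparison. The names read_be32 and read_le32 are invented here, and the bytes are assumed to sit in an unsigned char buffer (a plain char buffer may sign-extend values above 127). Note that the BMP format stores its multi-byte header fields in little-endian order, so it is worth checking which order matches the bytes actually read from the file.

```cpp
#include <cassert>
#include <cstdint>

// d[0] is the most significant byte (the order the snippet above builds).
std::uint32_t read_be32(const unsigned char* d)
{
    assert(d != nullptr);
    return (std::uint32_t(d[0]) << 24) | (std::uint32_t(d[1]) << 16)
         | (std::uint32_t(d[2]) << 8)  |  std::uint32_t(d[3]);
}

// d[0] is the least significant byte.
std::uint32_t read_le32(const unsigned char* d)
{
    assert(d != nullptr);
    return  std::uint32_t(d[0])        | (std::uint32_t(d[1]) << 8)
         | (std::uint32_t(d[2]) << 16) | (std::uint32_t(d[3]) << 24);
}
```

For the four bytes "\000\000\000\020" (020 is octal for 16), read_be32 gives 16 and read_le32 gives 0x10000000. Neither function depends on the byte order or alignment of the machine running it.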
1 Byte == 8 bit
You are mixing up byte and octet. A byte is not always 8 bits.
1.7 The C++ memory model [intro.memory]
1 The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set (2.3) and the eight-bit code units of the Unicode UTF-8 encoding form and is composed of a contiguous sequence of bits, the number of which is implementation defined. The least significant bit is called the low-order bit; the most significant bit is called the high-order bit. The memory available to a C++ program consists of one or more sequences of contiguous bytes. Every byte has a unique address.
The correct term for an 8-bit sequence is octet. Look at any network RFC.

This is done to support architectures with different sized bytes (9bit byte machines still exist) or for those which cannot address bitfields less than specific size (64 bit machine without commands to operate on lesser width values and disallowing unaligned access).

More:
Various implementations of C and C++ reserve 8, 9, 16, 32, or 36 bits for the storage of a byte.[11][12] The actual number of bits in a particular implementation is documented as CHAR_BIT as implemented in the limits.h file.
http://en.wikipedia.org/wiki/Byte#Common_uses
A byte in this context is the same as an unsigned char, and may be larger than 8 bits, although that is uncommon.
http://en.wikipedia.org/wiki/Sizeof
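A small sketch of how a program can check these properties on its own implementation. The helper bit_width_of is a made-up name; CHAR_BIT and sizeof are standard.

```cpp
#include <climits>
#include <cstddef>

// Number of bits occupied by an object of type T on this implementation:
// sizeof counts bytes, and CHAR_BIT says how many bits each byte has.
template <typename T>
constexpr std::size_t bit_width_of()
{
    return sizeof(T) * CHAR_BIT;
}

// Both of these hold on every conforming implementation.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
```

On the 64-bit-char device mentioned earlier, bit_width_of<int>() would be 64 even though sizeof(int) is 1.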
Thank you guys.
Every byte has a unique address.

There are some machines, such as the HP Saturn processor upon which many of their calculators were built, that are nibble-addressable. Each address in memory is the location of a 4-bit nibble rather than an 8 (or 9, or 16, or ....) bit byte.

A C++ compiler for the Saturn would use an 8-bit byte (to hold ASCII characters) and every byte would have a unique address, but the byte addresses would not be consecutive. So you'd probably get things like this:
char ch[2];
int i = (int)&ch[1] - (int)&ch[0]; // i = 2
i = &ch[1] - &ch[0]; // i = 1 because of scaling.
i = sizeof(char);   // probably 1, because sizeof() returns size of bytes? 

I didn't reply to:
dhayden
h4ever, when you say that
first 8 bits will represent R
do you mean the byte with the lowest address? The most significant byte in the int? Or does it not matter?

I meant "first" as read from the left. My mistake I did not specify the direction.

This is what I did not understand:
- What is the lowest address? Do you mean the lowest as read from right? Do you mean 8 bits read from right?
- What is most significant byte? Ok I found it: http://whatis.techtarget.com/definition/most-significant-bit-or-byte
so by MSB you mean number like 11111111 - 8 bits of value 1, right?
I meant "first" as read from the left. My mistake I did not specify the direction.

No, the issue is really endian-ness: http://en.wikipedia.org/wiki/Endianness

Consider a 32 bit integer 0x05060708. 05 is the most significant byte and 08 is the least significant byte. But how will the computer store this in memory? Suppose you store it starting at memory address 100. Two common ways are:
Address  Little endian   Big endian
100            08             05
101            07             06
102            06             07
103            05             08
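The table can be checked on a real machine with a short sketch like the one below. It copies the bytes out with memcpy, which is well defined, rather than reading them through a union. The names bytes_in_memory, ByteOrder, detect_byte_order and check_against_table are invented for this example, and 8-bit bytes are assumed.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the bytes of value into an array, lowest address first.
std::array<unsigned char, 4> bytes_in_memory(std::uint32_t value)
{
    static_assert(sizeof(std::uint32_t) == 4, "sketch assumes 8-bit bytes");
    std::array<unsigned char, 4> out{};
    std::memcpy(out.data(), &value, out.size());
    return out;
}

enum class ByteOrder { Little, Big, Other };

// Compares this machine's layout of 0x05060708 with the two table columns.
ByteOrder detect_byte_order()
{
    const std::array<unsigned char, 4> b = bytes_in_memory(0x05060708u);
    const std::array<unsigned char, 4> little = {{ 0x08, 0x07, 0x06, 0x05 }};
    const std::array<unsigned char, 4> big    = {{ 0x05, 0x06, 0x07, 0x08 }};
    if (b == little) return ByteOrder::Little;
    if (b == big)    return ByteOrder::Big;
    return ByteOrder::Other;
}

// Expects one of the two columns to match; rarer mixed layouts would fail here.
void check_against_table()
{
    assert(detect_byte_order() != ByteOrder::Other);
}
```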

So when you said "first" I didn't know if you meant most significant byte, or first byte in memory, which depends on the endianness of the computer. If the code is part of a larger system that, say, sends the integer to a piece of hardware, then the byte order probably matters. If you're just trying to encode/decode the data within the program then endianness probably doesn't matter.

I hope this explains the issues of endianness and why I was asking.
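If "first as read from the left" is taken to mean the most significant of the three used bytes, the value would look like 0x00RRGGBB. A minimal sketch under that assumption follows; pack_rgb, red_of, green_of and blue_of are hypothetical names. Because it works on the value with shifts and masks only, it gives the same result on little-endian and big-endian machines.

```cpp
#include <cstdint>

// Layout assumed here: bits 16-23 hold R, bits 8-15 hold G, bits 0-7 hold B.
constexpr std::uint32_t pack_rgb(unsigned char r, unsigned char g, unsigned char b)
{
    return ((std::uint32_t(r) & 0xFFu) << 16)
         | ((std::uint32_t(g) & 0xFFu) << 8)
         |  (std::uint32_t(b) & 0xFFu);
}

constexpr unsigned char red_of(std::uint32_t color)
{
    return static_cast<unsigned char>((color >> 16) & 0xFFu);
}

constexpr unsigned char green_of(std::uint32_t color)
{
    return static_cast<unsigned char>((color >> 8) & 0xFFu);
}

constexpr unsigned char blue_of(std::uint32_t color)
{
    return static_cast<unsigned char>(color & 0xFFu);
}

// R = 200 (0xC8), G = 100 (0x64), B = 50 (0x32)
static_assert(pack_rgb(200, 100, 50) == 0x00C86432u, "R is the most significant used byte");
```

Swapping the shift amounts of R and B gives the other layout used earlier in the thread, where R sits in the least significant byte.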