What is happening here??
Is arr now a pointer to x??, and we can then loop over x by the arr pointer??
I mean, why is x now a char array (if that is true)??
The compiler should refuse to make the assignment. You can override this decision with a cast.
x defines a long. The actual number of bytes is implementation dependent, but let's say 4 bytes, as this is normal on 32-bit systems.
arr defines a pointer to a char, which we'll also say is 4 bytes for the same reason.
The assignment copies the address of the start of x into arr. Which byte of x lives at that address is again implementation dependent: it depends on whether the architecture is big or little endian.
The code below is implementation dependent for the reasons above.
#include <iostream>
#include <iomanip>

int main()
{
    long x = 0x112232;
    char* arr = reinterpret_cast<char*>(&x);

    size_t n = sizeof(x) / sizeof(*arr);
    for (size_t i = 0; i < n; ++i)
        std::cout << std::hex << int(arr[i]) << std::endl;

    return 0;
}
char* arr = &x; is an illegal statement, you can't convert the address of a long to a 'pointer to char'.
However, if you really want to do it, shut up the compiler and take responsibility for potentially unsafe code yourself, you can use a cast. In this case, you want to interpret one type as another without changing the value, hence reinterpret_cast.
sizeof(x) is the number of bytes used to hold x, a long for your implementation.
sizeof(*arr) is the number of bytes used to hold a char, which is always 1 by definition.
sizeof(x) / sizeof(*arr) is the number of chars in a long for your implementation.
The size of a long may vary considerably. For example, on a 32-bit system it's typically 4 bytes, but it can be 8 bytes on a 64-bit system. The exact size isn't fixed by the standard; it's just no smaller than an int (and at least 32 bits).
I expect you are seeing that the size of a pointer is 4 bytes, but sizeof(*arr) is the size of a char, because the pointer is dereferenced.
Incrementing a pointer to a char (1 byte) will make the value increase by 1.
Incrementing a pointer to a long (4 bytes) will make the value increase by 4.
You'll need to check out an article on pointer arithmetic.
Back to your question, let's go over my example again.
Line 6 declares a 4 byte variable on the stack which we call x.
Line 7 takes the address of that 4 byte variable, stops treating it as the address of a 4 byte variable, treats it as the address of a 1 byte variable instead, and assigns it to arr.
Line 9 calculates the actual size of the block in a platform independent way. We've said it's 4 bytes here, but it can be a different size, depending on the implementation.
We now have a char array that starts at arr, and is n bytes long.
Lines 10 and 11 traverse the array and print the content of the elements as hex numbers.
Thank you for this wonderful discussion :-), but let us go on a little bit more.
You said:
Line 7 takes the address of that 4 byte variable, stops treating it as the address of a 4 byte variable, treats it as the address of a 1 byte variable instead, and assigns it to arr.
Can you explain this again?? Why should I treat it in this way??
Also, in the previous post, I have asked about :
How come that a pointer to char , which is char* and pointing to someplace in memory , can be treated like this.
Let's say that I am doing the following:
char arr[] = {"Hello"}; // This means an array with 6 cells.. right??
char* arr2= "Hello"; // It is a pointer to a place in memory that holds "Hello"
Can we use arr2 as a normal array and loop over it, like an array that has cells??
Maybe we should step back a bit and talk about type. Ultimately, the computers we use address memory as bytes, which are typically 8 bit units. But to be really useful, we need to group bytes together and treat them in a particular way.
For example, an ASCII char can be represented in 7 bits, but in C we use a whole byte. We can group a pair of bytes together to give us an integer with the range [0, 65,535] or [-32,768, 32,767], or we can group four bytes to give us an integer with a larger range. A string can be represented with an array of chars with either a special terminating character at the end (ASCIIZ) or by holding the length at the beginning (ASCIIC).
Type is the scheme that describes our grouping of memory and how we treat it. Our high level computer languages know about type, C has a few built in types and C++ allows you to create your own types and use them as you would use a built in type.
So going back to your original example, a long is a group of 4 bytes, and we printed the bytes as hex numbers. We did this by pointing to the first char, and printing them all until we did them all. The actual mechanics for doing that in C/C++ are described above.
Next, your second example. You are correct: arr is the name of a six char block, and arr2 is a pointer (probably 4 bytes) that points to a fixed six char block of memory somewhere.
Yes, you can treat these as arrays of chars six elements long, and yes, you can move arr2 to point to something else. You should be aware that if you move arr2, you lose your only handle on that block of memory, but the block will still sit there taking up space.
As I understand it, we point to the 4 byte block of the long int (32 bit OS) with a char*, and then we can jump to the next byte by increasing the char pointer by one.
The question is, if my long X=1234567, how is it divided into 4 bytes? I mean, when I print the value of the first byte or the second, or any one, or insert it into a buffer, how is this X presented? What makes sure that the first 2 digits of X, for example, are in the first byte, and the second 2 digits in the second byte... I do not know if I have succeeded in explaining myself :-)
Is 1234567 a decimal number? If that's the case, then
what makes sure that the first 2 digits of X for example to be in the first byte , and the second 2 digits in the second byte...
Nothing. That's not the case.
When representing binary values, hexadecimal is the preferred numeral system.
decimal 1234567 == 0x0012D687
Each pair of hex digits fits into a byte. To answer your question, if we have long a=1234567, there's no way of knowing what the value of *(unsigned char *)&a will be without knowing the endianness of the target processor. On a big endian CPU (e.g. PowerPC), *(unsigned char *)&a==0, and on a little endian CPU (e.g. x86), *(unsigned char *)&a==0x87. (unsigned char is used so that 0x87 is not read as a negative value where plain char is signed.)
This fact can be [ab]used to find out the endianness at run time: