How is an int stored? Little endian and big endian, and a snippet of a program, please!

I compiled and ran this program with g++ on Linux:


#include <iostream>

int main() {
    int i = 10000;
    char* c = (char*)&i;
    *c = 10;
    std::cout << "i : " << i << std::endl;
}

The output I get here is:
      9994



I don't understand why!
My tutor gave me a hint: GO AND
READ ABOUT LITTLE ENDIAN AND BIG ENDIAN.

I've tried a lot but can't understand it, because I don't know how an integer is
stored in C or C++ programs.


Please explain it, or help me with links/URLs where I can find
good material to understand it.

Endianness depends on the computer, not on C++.
Big endian stores bytes from the highest to the lowest; little endian stores them from the lowest to the highest.
http://en.wikipedia.org/wiki/Endianness#Method_of_mapping_registers_to_memory_locations
*assuming 8 bits per byte (typical in modern PCs)*
* I'm using $ to represent hexadecimal numbers here -- but in C++, hex literals are written with a 0x prefix *
* I won't explain the difference between decimal and hexadecimal number bases here -- if you're unfamiliar, look up "hexadecimal" on Wikipedia *
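
Since endianness is a property of the machine, you can even test for it at runtime. Here's a minimal sketch (it reuses the same pointer trick from your code, which is explained further down):

#include <iostream>

int main() {
    int i = 1;
    // the first byte of 'i' in memory is 1 on little endian, 0 on big endian
    if (*(char*)&i == 1)
        std::cout << "little endian\n";
    else
        std::cout << "big endian\n";
}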

A byte is only 8 bits wide, so a single byte can only represent 2^8 = 256 unique numbers ($00-$FF = 0-255).

Integers wider than a byte use multiple bytes (typically 2, 4, or 8, as powers of two are easiest to work with). Some bytes are designated as "low" and others are "high".

If this is unclear... think of how decimal numbers work. A single digit can only represent the numbers 0-9. To go beyond that, we put another, "higher" digit in front of it. The number fifteen therefore requires 2 digits: "15". The "1" is the high digit and the "5" is the low digit. Fundamentally, it's the same idea, but instead of a "digit" you have a "byte".

Converting from dec->hex, the number 10000 = $2710. This means that you need at least 2 bytes to represent it... $27 is the high byte, and $10 is the low byte.
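
You can pull those two bytes out yourself with shifts and masks -- a quick sketch:

#include <iostream>

int main() {
    int i = 10000;                            // $2710
    std::cout << std::hex;
    std::cout << ((i >> 8) & 0xFF) << '\n';   // high byte: prints 27
    std::cout << (i & 0xFF) << '\n';          // low byte:  prints 10
}

Note these are the high and low bytes of the *value*; how they're ordered in memory is a separate question.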

This is where endianness comes in. Different processors store the sequence of these bytes differently. "Little" endian systems store the lowest bytes first and store sequentially higher bytes in ascending order. So the 2-byte number $2710 would be represented as:

$10 $27 (lowest byte first)

However big endian systems are exactly the opposite -- they list the bytes in descending order starting from the highest byte:

$27 $10 (highest byte first)

If the number $2710 is in a 4-byte container (such as 'int' in your above code example -- assuming int is 4 bytes):

$10 $27 $00 $00  <-- little endian representation
$00 $00 $27 $10  <-- big endian representation
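
You can see which layout your own machine uses by dumping the bytes of 'i' in memory order -- a sketch:

#include <iostream>

int main() {
    int i = 10000;
    unsigned char* p = (unsigned char*)&i;    // view 'i' as raw bytes
    std::cout << std::hex;
    for (unsigned n = 0; n < sizeof(i); ++n)
        std::cout << (int)p[n] << ' ';        // prints "10 27 0 0" on little endian,
    std::cout << '\n';                        // "0 0 27 10" on big endian
}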


What this crap is doing:
  char * c = (char *)&i;
  *c = 10;


It's using very ugly pointer casting to treat the 4-byte integer as an array of individual bytes. The first line makes 'c' point to the first byte of 'i', and the second line changes that first byte.

However... depending on the endianness, the "first" byte is either the highest byte (big endian) or the lowest byte (little endian). So that *c = 10 has the following effect:

$0A $27 $00 $00  <-- little endian  -> $0000270A = 9994
$0A $00 $27 $10  <-- big endian     -> $0A002710 = 167782160
 ^
 |
 changed byte (10 = $0A)
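
By the way, if you want to change the lowest byte no matter what the hardware does, use arithmetic instead of pointer casts -- the value of a number is endianness-independent; only its layout in memory differs. A sketch:

#include <iostream>

int main() {
    int i = 10000;
    // clear the lowest byte, then put 10 in it -- same result on any machine
    i = (i & ~0xFF) | 10;
    std::cout << "i : " << i << std::endl;    // prints 9994 everywhere
}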



EDIT -- bah, I should stop writing lengthy explanations when I can just link to Wikipedia XD