Ok I see. I made a binary file "testfile.bin" and filled it with 4 bytes: "ABCD" according to Little Endian, giving the result "DCBA" in the file, which I wanted.
"ABCD" is 2 bytes. Sounds to me like we need to go over some fundamentals here. Excuse me if you already know all this...
1 bit is a single binary digit. It can be 0 or 1... nothing else.
1 byte is traditionally 8 bits, so it can range from:
00000000 (0 in decimal, 0x00 in hexadecimal)
to:
11111111 (255 in decimal, 0xFF in hexadecimal)
11111111, 255, and 0xFF are all the exact same number with absolutely zero differences as far as the computer is concerned. The only difference is the way they are presented to us human beings. "11111111" is represented in binary form (base 2), 255 is in decimal form (base 10) and 0xFF is in hexadecimal form (base 16).
But again, that's just textual representation. The number itself is not stored any differently in the computer. "1010", "10", and "0x0A" are all the number "ten" -- they're just that number printed in different numerical bases.
Hexadecimal is traditionally used to view binary data because it conveniently represents a single byte with exactly 2 digits. It's also much easier to convert between hex and binary than between decimal and binary.
When you look at a file in a hex editor, many of them will put spaces between each byte so that it's easier to read.
Here's a screenshot of a file in a hex editor I'm currently using. Your hex editor I'm sure will look very similar.
http://i45.tinypic.com/izyec8.png
There are 3 main columns here.
The black column on the far left (the rows of numbers that count up by 0x10, i.e. 16, each row) is the offset, which is a fancy term for "file position". The highlighted "4E" value in the picture is at offset 0 because it's the first byte in the file. The "45" after it is at offset 1, and so on.
The big column in the middle with all the blue and red numbers is the actual data in the file. Each 2-digit pair is a single byte.
The black column on the far right (that starts with "NES") is the ASCII representation of each byte in the file. The 'N' is highlighted because I'm also highlighting the '4E' in the center column... and both of those things represent offset 0 (0x4E is the ASCII code for the character 'N'). If you were to open this file in a text editor like notepad... it would display text similar to what you see in this far right column.
That is the fundamental difference between text editors and hex editors. Text editors look at a file and assume that each byte is a character of text, and display the data as text (that is, if they see a '4E' byte in the file, they will display it as the character 'N'). Whereas hex editors just give you the actual raw data without doing any conversion.
For giggles, you can try opening a plain text file in a hex editor just to get the idea.
When I open the Windows CMD and type the file, it does not type it out in binary or hexadecimal format but it actually types it out as "DCBA". Since it is a binary file, nothing is wrong with it because of this right?
I'm not sure what you're doing with CMD. As far as I know, CMD doesn't display binary files... so I don't know what's going on. Is your hex editor command-line based? If it is, throw it out and get a real one.
I'm using "translhextion" in my example. It's actually pretty crappy and I wouldn't ordinarily recommend it, but I don't know of any better free ones (too lazy to really look). You can get it here:
http://www.romhacking.net/utilities/219/
Look for the "Download file now" link below the screenshot. Don't let the "hacking" in the url scare you -- it's just a game mod site... it's perfectly safe and friendly.
Since "bytes" is a char array, it can only store chars/characters and not numerical values like 0x12, or can it? |
It can. This is where C/C++ are a little confusing.
chars are normal integers... just like ints. The only difference is that a char is 1 byte wide whereas an int is usually 4 bytes wide. You can store numbers in chars and add them together just like they were ints.
Actual character literals are converted to their ASCII codes by the compiler. The only reason they're in the language at all is to be more convenient for the programmer.
We've already established that the ASCII code for 'N' is 0x4E. You can test this out by doing this in a C++ program:
if( 'N' == 0x4E )
{
    cout << "This will print because the above expression is true!";
}
So when you do something like this:
char foo = 'N';
The compiler treats it the same as this:
char foo = 0x4E;
This is why 0 and '0' are different. 0 is actually zero, whereas '0' is the ASCII code for the numeral '0' (which is 0x30).
Hopefully that clears things up a bit more. Keep those questions coming... I feel like I'm committed to teaching you this stuff. :)
EDIT:
It dawns on me that you may have been abbreviating your 4 bytes to ABCD and you didn't mean ABCD literally.
Oh well -- sorry about that =x. Hopefully the post is still informative.