An unsigned byte has 256 possible values (0-255, 8 bits' worth). When printing to the screen, only a subset of those is printable (many of the values near zero are not), and text uses the ones that make sense as characters. In binary data, any byte can hold any value, and trying to print it as text gives nonsense. The only sensible way to print binary data to the screen is one byte at a time, usually in hex (though integer format is OK at times), so you can see the true value rather than gibberish.
Okay, so no extra encoding/encryption is added to the initial TLS message, and the reason the output was strange is that not every value/char a byte can represent is printable?
And also: is everything stored as ones and zeros, and when we want to print to screen/file (e.g. std::cout), is there a translation going on?
I just wrote the code in the editor; the arg should have been unsigned char* buffer rather than char* buffer. You're seeing artefacts of sign extension.
I used printf because the code fits on one line and it should be obvious what it's trying to do; simplicity.
If you're struggling just to see the HTTPS conversation, I would assert that you are not yet skilful enough to write a TLS 1.2 implementation inline in your server.
I've already pointed you to a practical solution, use a 3rd party library that wraps openssl, like Poco.
Okay, so no extra encoding/encryption is added to the initial TLS message, and the reason the output was strange is that not every value/char a byte can represent is printable?
I do not know what TLS does (encrypt or not). But yes, not every byte is printable on all consoles in all modes, and the default (ASCII) absolutely has unprintables in the first 32 values; google "ascii table".
Everything in a computer is in bytes: binary, but in groups of 8 bits per chunk. It just isn't practical to build hardware that deals in 1-bit items, nor very useful, and for a number of reasons 8 bits became the standard. You are printing the binary in hex.
Look... the value of a number is an abstract concept, and its representation is chosen for a purpose. Let's take the number 5:
in binary: 0101
in hex for 1 byte: 05
as a double: 5.0
as a roman numeral: V
and on and on we can go.... they all mean "five"
You are printing hexadecimal, or hex for short.
The computer cannot print binary with a cout / printf flag, but it's easy to translate between hex and binary: each hex digit is a lookup of a 4-bit value, so a little 16-entry table can convert. It is rare to want to see pure binary, even as a coder. Some coders muddle hex and binary at times, talking about a 'binary file', for example, which is really 'a file of raw bytes' and is usually 'printed in hex'.
If you're struggling just to see the HTTPS conversation, I would assert that you are not yet skilful enough to write a TLS 1.2 implementation inline in your server.
As I mentioned earlier, I figured it out. I know the values (numbers) in the message represent e.g. TLS version, size, etc. I was expecting something more direct when I looked at the message the first time (for example "tls version: 1.3", clear readable headers).
If there is something else to think about, just let me know :)
the computer cannot print binary with a cout / printf flag
But before, when I printed out the received data without touching it (no converting), I still got an output (even if it was strange), and I was told this was binary (even though it wasn't ones and zeros). So since you're saying we can't directly print binary with "cout", I assume there was already a translation before cout (from binary to something else).
As I mentioned earlier, I figured it out. I know the values (numbers) in the message represent e.g. TLS version, size, etc.
Did you look at the network trace with a network analyser, like Wireshark? All that stuff's pretty obvious in a tool like that, as it decodes the messages for you.
Yes, everything is binary, but "binary" in colloquial computing terms, when referring to files or data, means that it's in some non-meant-to-be-human-readable format.
For example, the number 432 can be stored as ASCII in a file, where each individual character is stored, '4', '3', '2' (each byte acting as a printable character).
But 432 can also be stored in 'binary', which means storing the actual binary representation of 432 instead of the individual printable characters.
If you run something like:
#include <iostream>
#include <fstream>

int main()
{
    std::ofstream fout("test", std::ios::binary);
    int num = 432;
    fout.write((char*)&num, sizeof(num));
}
and then try to open the 'test' file in Notepad++, you'll see mostly unprintable ASCII characters (4 of them, most likely).
But before, when I printed out the received data without touching it (no converting), I still got an output (even if it was strange), and I was told this was binary.
jargon problem.
Binary is base-2 representation, so the first few numbers in binary are:
0000 = 0
0001 = 1
0010 = 2
0011 = 3
0100 = 4
0101 = 5
you can't say
int x = 5;
cout << bin << x;
and see on the screen: 0101
because it's not useful to most people, and it's easy to write yourself if you need it. This is 'really binary'.
Jargon, though: computer people refer to raw data that we, as humans, see in hexadecimal format as 'binary' because, as I stated, computers work in bytes (8-binary-digit chunks), and printing groups of bytes is more commonly done in base 16 than base 2. For two or three main reasons: base 2 and base 16 are directly related, so it's easy to mentally convert; base 16 is a lot easier to {read, type, etc} than base 2; and back when we printed things a lot, printing costs for binary would have been insane.
So when you printed to the screen, being very precise:
You printed 'binary data' as if it were 'ascii text'. That is, if you needed to print the value 123, you printed the letter at ascii_table[123]. Because ascii has unprintable characters and odd symbols and such, you got nonsense; and because an integer like 12345678 stored in 64 bits (8 bytes) will print as 8 ascii letters that are totally unrelated to each other as text, they only make sense as bits in a larger entity.
try it.
uint64_t foo = 1234567890ULL;
cout << hex << foo << endl;              // the integer, in hex
unsigned char * cp = (unsigned char *) &foo;
for(int i = 0; i < 8; i++)
    cout << cp[i];                       // gibberish ascii
cout << endl;
for(int i = 0; i < 8; i++)
    cout << hex << (int)cp[i] << ' ';    // the bytes of the integer, one by one
and if you want to see it in true binary:
string bin[] = {"0000", "0001", "0010", ... "1111"}; // fill this out
then for a hex value like 0x1A you print bin[1] and bin[10] (hex A is 10), and 00011010 is your binary value.