Hi,
I don't have a clue about programming (I'm a school teacher). A friend bet me that I couldn't solve this algorithm problem with the help of the internet. Let's prove him wrong.
Also, as a bonus, there is the following question (he is Scandinavian):
"Why is 'å' (U+00E5 in Unicode) encoded as C3 A5 in hex in UTF-8?"
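As far as I can tell, the bonus question comes down to bit arithmetic: code points from U+0080 to U+07FF are stored in two bytes of the form 110xxxxx 10xxxxxx, where the x positions hold the 11 bits of the code point. Here is a minimal sketch in C, assuming only that two-byte range; the function name utf8_encode_2byte is made up for illustration.

```c
#include <assert.h>

/* Encode a code point in the range U+0080..U+07FF as two UTF-8 bytes.
   Hypothetical helper for illustration; other ranges are not handled. */
static void utf8_encode_2byte(unsigned cp, unsigned char out[2])
{
    assert(cp >= 0x80 && cp <= 0x7FF);

    /* Lead byte: marker 110 followed by the top 5 bits of the code point. */
    out[0] = (unsigned char)(0xC0 | (cp >> 6));

    /* Continuation byte: marker 10 followed by the low 6 bits. */
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
}
```

For U+00E5 the 11 bits are 00011 100101. The lead byte is 110 00011 = C3 and the continuation byte is 10 100101 = A5, which gives C3 A5.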
A brief examination of the UTF-8 format will show that the continuation bytes of a character (every byte after the first) all have bit 7 set and bit 6 clear, so in binary they look like 10xxxxxx. Therefore, to see whether a byte is the first in a sequence or not, the check is very simple:
if ((byte & 0xC0) == 0x80)
{
    // not the first byte in a sequence
}
else
{
    // first byte
}
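To get from that check to the first byte of the last character, one way is to start at the end of the string and step backwards while the bytes are continuation bytes. The following is only a sketch under the assumption that the buffer holds valid UTF-8; the function name utf8_last_char_start is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the first byte of the last character in a UTF-8
   buffer of `len` bytes. Assumes the buffer is valid UTF-8.
   Returns 0 for an empty buffer. Hypothetical helper name. */
static size_t utf8_last_char_start(const unsigned char *s, size_t len)
{
    assert(s != NULL);

    size_t i = len;
    /* Walk backwards over continuation bytes (10xxxxxx). */
    while (i > 0) {
        --i;
        if ((s[i] & 0xC0) != 0x80)
            break; /* reached a lead byte or a plain ASCII byte */
    }
    return i;
}
```

For the bytes 62 6C C3 A5 (the text "blå"), the loop skips A5, stops at C3, and returns index 2.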
Of course, he probably knows the answer, too. He just wanted to challenge me to get an answer on a topic I know nothing about, because he always tells me that coders on the internet will never help you.
So that's the answer I should give him for 'first byte of the last character from a UTF-8 string'?