I've tried several ways of opening a Unicode text file. If I knew in advance whether a file was ANSI or Unicode, I could pick the right method to read it. But now I have a text file whose encoding I can't tell (ANSI? Unicode? Nobody knows until someone opens it), and I haven't found any way to detect which kind a given text file is. That's a real problem: if detection fails, I can't open the file correctly.
Does anyone know how to do this? Any help would be greatly appreciated. :)
Encoding      BOM (hex)   BOM (dec)
-------------------------------------
UTF-8         EF BB BF    239 187 191
UTF-16 (BE)   FE FF       254 255
UTF-16 (LE)   FF FE       255 254
(see Wikipedia for more)
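But note that not all UTF-8 files have a BOM, and the same can be true of other encodings. Some editors write a BOM when they save a file, but it is optional (for UTF-8 the Unicode standard neither requires nor recommends one), so a missing BOM tells you nothing about the encoding.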
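
Checking for the BOMs in the table above could look roughly like this. It's only a sketch in Python (the question doesn't say which language is in use), and `detect_bom` and `sniff_file` are names I made up for illustration. It returns a Python codec name that strips the BOM on decode, or `None` when no BOM is found.

```python
import codecs

# BOM signatures from the table above, paired with a Python codec
# that consumes the BOM while decoding.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),    # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16"),   # FE FF
    (codecs.BOM_UTF16_LE, "utf-16"),   # FF FE
]

def detect_bom(data: bytes):
    """Return a codec name if data starts with a known BOM, else None.

    Caveat: UTF-32 is not covered here, and a UTF-32 (LE) BOM
    (FF FE 00 00) would be reported as UTF-16.
    """
    for bom, codec_name in BOMS:
        if data.startswith(bom):
            return codec_name
    return None

def sniff_file(path):
    """Read only the first few bytes of a file and check them for a BOM."""
    with open(path, "rb") as f:
        return detect_bom(f.read(4))
```

A `None` result only means "no BOM"; the file could still be UTF-8 or ANSI, which is where the next point comes in.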
Without a BOM, you'd need to use some sort of statistical approach, like these guys: