getline returns odd unicode characters?


Jun 19, 2011 at 1:45am

Hey guys, got a quick question if you don't mind. When I run this to get the string from the file, it returns odd Unicode, though the actual file I am reading from is UTF-8. No matter what I do, it always returns strings I can't decipher, i.e. useless to me. Even when I load this text file with a string of "2.2.17", it still returns garbage, and it's different every time I reload the program.
Maybe I'm using the buffer incorrectly?

I have been lurking on this forum and reading all the tuts for a while now, because I am just now learning C++ (tired of being a scripter),
so I have read various threads on file use and have been studying the stream functions for about a week now.

If anyone can help, I will credit you in source if you wish.

BOOL CEditor::OnInitDialog() {

	CDialogEx::OnInitDialog();

	char buffer[256];
	string TEST;
	ifstream VChk;
	VChk.open ("C:\\Program Files\\path\\to\\test.txt");
	while(!VChk.eof()) {

		getline(VChk, TEST);

		wsprintf(buffer, "Version %s", TEST);
		
		MessageBox(buffer, NULL, MB_OK | MB_ICONEXCLAMATION);
	}
}

Thanks,
Devon

Jun 19, 2011 at 3:05am

Disch (13742)

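You're mixing wide character strings (wsprintf, which becomes the wide wsprintfW in a Unicode build) with narrow character strings (char, string). This doesn't work like you might think. Note also that %s expects a pointer to a character array, so the string object TEST can't be passed directly; you'd need TEST.c_str().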

If you want to convert the UTF-8 data read from the file to UTF-16 (used by Windows), you'll have to either do the conversion yourself or call some other function to do the conversion.

WinAPI can help with this in the form of MultiByteToWideChar:

http://msdn.microsoft.com/en-us/library/dd319072%28v=vs.85%29.aspx

Example usage:

string TEST;  // a narrow string
wchar_t testbuf[256];  // a wide buffer
wchar_t buffer[512];  // another wide buffer, big enough for "Version " plus testbuf

//...

// get the narrow (UTF-8) string
getline(VChk, TEST);

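// convert it to UTF-16
//  yes, it's a confusing function...
//  note: with CP_UTF8 the flags argument must be 0 or MB_ERR_INVALID_CHARS;
//  passing MB_PRECOMPOSED makes the call fail with ERROR_INVALID_FLAGS
MultiByteToWideChar( CP_UTF8, 0, TEST.c_str(), -1, testbuf, 256 );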

// 'testbuf' now has our UTF-16 string
//  note for this next part, since we're calling the wide wsprintfW
//  (plain wsprintf maps to wsprintfA or wsprintfW depending on the build),
//  everything MUST be wide:
//  1)  buffer must be a wchar_t buffer
//  2)  the string literal "Version %s" must be wide (notice the L preceding it)
//  3)  testbuf must also be a wchar_t buffer
wsprintfW(buffer, L"Version %s", testbuf);

// lastly, pass it to MessageBoxW  (note again:  W because we're giving it a wide string)
MessageBoxW(buffer, ... );

Jun 19, 2011 at 6:04am

XDEV (3)

Ooohhh wow, thank you very much!!
Thanks for the link as well!

edit:
OK, so after I edited everything so it would enter debug, it just returns Chinese?

Last edited on Jun 19, 2011 at 6:28am

Jun 19, 2011 at 6:10am

LB (13399)

What about a basic_string<wchar_t>?

Jun 19, 2011 at 7:10am

XDEV (3)

***SOLVED***

I don't know why, but for some reason (probably because I was thinking about it too much :/) I completely spaced on using fgets, and now I even have it checking for current version + older + newer (betas).

I thank you once again for your help :D
