Just use std::wstring
But std::wstring doesn't work with most standard lib functions, or with most 3rd party libs. std::basic_string<unsigned long> is even worse. What good is a string class if you can't use it anywhere? You'll have to be constantly converting to and from various formats for I/O. That's fine if I/O is minimal and text processing is heavy -- but I'd wager that in most applications that's usually not the case. Text input and printing are far more common in my experience.
And wchar_t is not any less ambiguous than any of the other types
True.
Re: the rest:
If you want to draw the distinction between supporting any text and supporting just most text, you can weigh whether to go with 16-bit or 32-bit characters internally. Or you can let sizeof(wchar_t) decide your fate (and your memory consumption).
If random access is really a concern, UTF-32 is always an option. Or if you want to forgo codepoints over U+FFFF, you can go with UCS-2 to cut down on memory usage -- but in either event, wchar_t is a poor solution. Where it's 16 bits (as on Windows) it isn't large enough for UTF-32, and where it's 32 bits (as on most Unix-like systems) it burns a hole in your RAM for UCS-2.
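To put rough numbers on the memory trade-off, here's a minimal sketch that tallies how many bytes the same text takes in each encoding. The names (utf8_units, utf16_units, Footprint, footprint) are made up for illustration, and it assumes the input is already a list of valid code points.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers: code units needed to encode one code point.
std::size_t utf8_units(char32_t cp) {
    assert(cp <= 0x10FFFF);  // must be inside the Unicode range
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

std::size_t utf16_units(char32_t cp) {
    assert(cp <= 0x10FFFF);
    return cp < 0x10000 ? 1 : 2;  // above U+FFFF needs a surrogate pair
}

// Bytes needed to store the same text in each representation.
struct Footprint {
    std::size_t utf8, utf16, utf32, wide;
};

Footprint footprint(const std::vector<char32_t>& text) {
    Footprint f{0, 0, 0, 0};
    for (char32_t cp : text) {
        f.utf8  += utf8_units(cp) * sizeof(char);
        f.utf16 += utf16_units(cp) * sizeof(char16_t);
        f.utf32 += sizeof(char32_t);
        // wchar_t behaves like UTF-32 where it is wide enough for a full
        // code point, and like UTF-16 where it is not.
        std::size_t units = sizeof(wchar_t) >= 4 ? 1 : utf16_units(cp);
        f.wide += units * sizeof(wchar_t);
    }
    return f;
}
```

Feeding it something like { U'h', 0xE9, 0x20AC, 0x1F600 } shows the spread: the UTF-8 count grows with the code point, UTF-32 is a flat cost per character, and the wchar_t figure simply follows whichever of the other two your platform happens to pick.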
Though, honestly, I don't see where you'd need random access in a string class for most things. About the only thing I can think of would be formatting numerals for output. Any other text processing I can think of is (or could be) done sequentially most of the time -- so an iterator would work just fine. I suppose if you're working on a text editor where you want to modify exceedingly large files gracefully, you'd need to develop something custom for that. But you'd need to customize file I/O too: simply reading the entire file into a string/buffer and modifying it with random access isn't very advisable, and random access is also a poor choice if you're going to allow deleting/inserting characters mid-string. If you have more practical examples of why random access is such a crucial selling point of a string class, I'm interested in hearing them. From my standpoint it just doesn't seem important at all. Far less important than being able to read/transfer/output any possible form of text.
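For what it's worth, sequential access over UTF-8 takes very little code. Here's a rough sketch, not a production decoder: next_code_point and decode_utf8 are names I made up, and it assumes the input is well-formed UTF-8 (no validation of continuation bytes, overlong forms, or surrogates).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decodes the code point starting at s[i] and
// advances i past it. Assumes well-formed UTF-8; nothing is validated.
char32_t next_code_point(const std::string& s, std::size_t& i) {
    assert(i < s.size());
    unsigned char lead = static_cast<unsigned char>(s[i++]);
    if (lead < 0x80) return lead;  // single byte (ASCII)

    // Number of continuation bytes, taken from the lead byte's high bits.
    int extra = (lead >= 0xF0) ? 3 : (lead >= 0xE0) ? 2 : 1;

    // Keep only the payload bits of the lead byte (5, 4 or 3 of them).
    char32_t cp = lead & (0x3F >> extra);
    while (extra-- > 0) {
        assert(i < s.size());  // truncated sequence otherwise
        cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
    }
    return cp;
}

// Walks the string front to back, the way an iterator would.
std::vector<char32_t> decode_utf8(const std::string& s) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < s.size()) out.push_back(next_code_point(s, i));
    return out;
}
```

Calling decode_utf8 on the bytes "h\xC3\xA9\xE2\x82\xAC" gives U+0068, U+00E9, U+20AC. Searching, tokenizing, and printing can all be built on a loop like this without ever indexing by character position.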
I think it comes down to "most things" vs. "all things". There won't be a string class that satisfies everyone's needs in every possible situation. But for most things, IMO, you're better off with some kind of UTF string class that doesn't necessarily have random access.
Barring those extraordinary circumstances where you're better off coming up with something yourself -- a UTF string class, to me, seems by far the most beneficial in the most ways most of the time. If random access with limited character support is more important, then use a vector. That's basically all std::string is anyway -- a vector with a few operator overloads.
Given that the standard lib is meant to be generic -- it should apply to most things. std::string does not, and therefore fails hard in my book.