First, this problem is a bit harder than you might think.
But, let's make your program work correctly first:
|
const char *s = str.c_str();
return s;
|
What happens to the pointer returned by str.c_str() after str is no longer in scope (i.e., doesn't exist any more)? Your program exhibits undefined behavior, because it reads through a pointer to storage that has already been destroyed. On my system, your code crashed the first time and printed 424 the second time.
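One way to avoid the dangling pointer is to return the std::string itself by value and only call c_str() while that string is still alive. The following is a minimal sketch, not your actual function: make_greeting and greeting_length are hypothetical names invented for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: returns the text by value, so the caller receives
// its own string that stays valid after this function's locals are destroyed.
std::string make_greeting(const std::string& name) {
    std::string str = "Hello, " + name;
    return str;  // moved or copied out; no pointer into a dead object
}

// Hypothetical caller: the pointer from c_str() is used only while
// `greeting` is still in scope.
std::size_t greeting_length(const std::string& name) {
    std::string greeting = make_greeting(name);
    const char* s = greeting.c_str();  // valid: greeting outlives s
    return std::strlen(s);
}
```

The pointer never escapes the scope of the string that owns the buffer, which is the property the original snippet lacked.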
The punchline is that std::string has a size() member function, which returns the length in constant time; it's both quicker and safer than strlen(). Do not manipulate C-strings or pointers unless you're forced to.
|
#include <iostream>
#include <string>

int main() {
    std::string user_string;
    std::getline(std::cin, user_string);      // read one whole line, spaces included
    std::cout << user_string.size() << "\n";  // length in bytes
}
|
Now to answer the question correctly:
Because of the way string encoding works, the number of bytes the string container holds (std::string::size()) or the number of bytes from a pointed-to address up to a zero byte (strlen()) does not necessarily correspond to the number of characters in the string. A byte has only 256 possible values, so any character encoding that supports more than 256 characters must represent at least some characters with more than one byte. Strictly speaking, then, your program's output will be wrong for such text.
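To make the byte-versus-character gap concrete for UTF-8 specifically, here is a small sketch that reports how many bytes UTF-8 needs for a single Unicode code point. The function name utf8_encoded_length is hypothetical, and the sketch assumes its argument is a valid code point (it does not reject surrogates or values above U+10FFFF).

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes UTF-8 uses to encode one code point.
// Assumes cp is a valid Unicode scalar value; no validation is performed.
std::size_t utf8_encoded_length(char32_t cp) {
    if (cp < 0x80) return 1;     // ASCII range
    if (cp < 0x800) return 2;    // e.g. accented Latin, Arabic, Hebrew
    if (cp < 0x10000) return 3;  // rest of the Basic Multilingual Plane
    return 4;                    // supplementary planes, up to U+10FFFF
}
```

For instance, 'é' (U+00E9) is stored as the two bytes 0xC3 0xA9, so a std::string holding only that character has a size() of 2.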
This article has been circulated quite a lot, but it's worth reading if you're unfamiliar with the idea of character encoding:
http://www.joelonsoftware.com/articles/Unicode.html
If you are using UTF-8, then here's a function which counts the number of characters (strictly speaking, Unicode code points) contained in a string. ASCII is a subset of UTF-8, so if your text is plain English, you'll probably not notice a difference.
Here is an example program. The single test case I've included here uses a random Arabic phrase copied from internet search results:
|
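#include <cstddef>
#include <iostream>
#include <string>

// Counts UTF-8 code points by skipping continuation bytes (bit pattern 10xxxxxx).
template <typename Traits, typename Allocator>
std::size_t u_strlen(std::basic_string<char, Traits, Allocator> const& s) {
    auto cnt = std::size_t{0};
    for (char c : s) cnt += (c & 0xC0) != 0x80;
    return cnt;
}

int main(int, char**) {
    // Assumes this source file is saved as UTF-8.
    std::string s{R"(لوحة المفاتيح العربي)"};
    std::cout << s.size() << "\n";     // bytes
    std::cout << u_strlen(s) << "\n";  // code points
}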
|
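A single test case in one script can hide mistakes, so it may be worth checking a counter like this against sequences of every UTF-8 width. The sketch below uses a hypothetical stand-alone function, count_code_points, which skips continuation bytes (bit pattern 10xxxxxx) and assumes well-formed UTF-8 input. The bytes are written as hex escapes so the result does not depend on how the source file is saved.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in well-formed UTF-8 by counting
// every byte that is NOT a continuation byte (10xxxxxx).
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

With this, "\xD9\x84" (Arabic lam, 2 bytes), "\xE2\x82\xAC" (euro sign, 3 bytes) and "\xF0\x9F\x98\x80" (an emoji, 4 bytes) each count as 1. Note that "e\xCC\x81" (the letter e followed by a combining acute accent) counts as 2 even though it displays as a single character, so code points are still not quite the same thing as what a reader would call characters.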