I'm confused. Some people declare an array as char c[1024];. But I found that the compiler does not stop me, and there is no error, when I declare the array as char c[] = "";.
Are there any potential problems if I do it this way?
This does not declare an empty array. It declares an array that is big enough to hold the empty string, which requires one character because strings need to end with a null character '\0'.
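One way to check this is to ask the compiler how big the array is. This is only a minimal sketch; the function name emptyLiteralArraySize is made up for illustration and is not from the thread.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: reports how many bytes `char c[] = "";` occupies.
std::size_t emptyLiteralArraySize()
{
    char c[] = "";           // the literal "" contains exactly one char: '\0'
    assert(c[0] == '\0');    // the only element is the terminator
    return sizeof c;         // size of the whole array in bytes
}
```

Calling emptyLiteralArraySize() should give 1, which is why anything longer than the empty string will not fit in c.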
Why are cstrings null terminated? What's the point for terminating with null?
consider initializing like this: char c[20] = { '1', '2' };
There's no null terminating character in c[20] but what consequences can I come across?
Are std::strings null terminated?
edit:
Okay I now sort of know "why" null terminating is important in cstrings..
But what actually happens when you don't terminate with a NULL ('\0')? Does the compiler continue reading from adjacent memory addresses until it reaches a NULL? Is that what's happening?
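On the char c[20] = { '1', '2' }; example from the question: when an initializer list is shorter than the array, the remaining elements are zero-initialized, so that particular array does end up terminated. A small sketch to illustrate (the helper name partialInitLength is hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: length of the C string stored by a partial initializer.
std::size_t partialInitLength()
{
    char c[20] = { '1', '2' };   // c[2] through c[19] are zero-initialized
    assert(c[2] == '\0');        // so a terminator is already in place
    return std::strlen(c);       // counts only '1' and '2'
}
```

The problem case is an array whose every element is a non-zero character, or an array with no initializer at all.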
TO OP:
Whenever you write double quotes "", there is always a null character, that's how cstrings work (and for a reason). So when you used "" to initialize, the compiler assumed a null character to terminate.
You could try initializing your array like this: char c[] = {}; and you will notice that you will get an error.
consider
char c[6];
cin >> c;
And you type in 'hello'
c would have { 'h', 'e', 'l', 'l', 'o', '\0'} as elements
Character arrays that are used as C strings must always end with a null character.
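Note that cin >> c only fits here because "hello" is exactly five characters. A hedged sketch of one way to keep a longer word from overflowing the six-byte buffer, using std::setw to cap the extraction (the helper name readWordBounded is an assumption for illustration):

```cpp
#include <cassert>
#include <cstring>
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one word into a 6-byte buffer without overflowing it.
std::string readWordBounded(std::istream& in)
{
    char c[6] = "";                                     // starts out as an empty C string
    in >> std::setw(static_cast<int>(sizeof c)) >> c;   // at most 5 chars plus '\0'
    assert(std::strlen(c) < sizeof c);                  // terminator is inside the array
    return std::string(c);
}
```

For example, with std::istringstream in("helloworld"); the call readWordBounded(in) should return "hello" and leave "world" in the stream.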
But what actually happens when you don't terminate with a NULL ('\0')? Does the compiler continue reading from adjacent memory addresses until it reaches a NULL? Is that what's happening?
No. The compiler only runs when your program is compiling. Some functions work by looking through memory until they find a zero, but that's got nothing to do with the compiler.
What was the right phrasing? Thanks for correcting me Repeater. Am I along the right lines with consequence though? Does it continue reading from the memory?
Depends on the function. Some functions have been programmed to look through memory until they find a zero value. Some functions do not. It is up to you, the programmer, to understand what the functions you call do. If the function you call depends on some characters in memory having a zero at the end, the documentation will tell you.
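To make the difference concrete, here is a sketch of a length function that does not depend on finding a zero: it is told how many bytes it may look at. std::memchr is a real standard function; the wrapper boundedLength is a made-up name for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: like strlen, but never looks past `cap` bytes.
// Returns `cap` if no zero byte is found inside the buffer.
std::size_t boundedLength(const char* buf, std::size_t cap)
{
    assert(buf != nullptr);
    const void* zero = std::memchr(buf, '\0', cap);   // searches at most cap bytes
    if (zero == nullptr) {
        return cap;                                   // no terminator in range
    }
    return static_cast<std::size_t>(static_cast<const char*>(zero) - buf);
}
```

std::strlen, by contrast, takes only a pointer, so it has no choice but to keep reading until it hits a zero.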
Ah huh, so std::strings are not null terminated otherwise.
string hal = "a";
hal = hal.c_str();
hal will now have "a\0" ?
How does std::string handle 'cout<<'? Does it specify where to start reading and where to stop? If so why don't cstrings use that mechanism or why does string use this mechanism?
How does std::string handle 'cout<<'? Does it specify where to start reading and where to stop?
A string object knows how many characters it contains. A string object is not just an array of char. It contains other data too, including a number representing how many characters it has. The operator << can interrogate the string object and ask it how many characters it contains.
If so why don't cstrings use that mechanism
Because a cstring is just an array of char. It doesn't contain extra data. It doesn't contain a number representing how many characters it has.
That said, from C++11 onwards, there is a guarantee that inside the string object, it will be keeping its actual characters in contiguous memory and will put a zero on the end anyway.
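A small sketch of what "knows its own size" means in practice: a std::string can hold a zero in the middle, and size() still counts every character, while a zero-scanning C function stops early. Both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: builds a 3-character string whose middle character is '\0'.
std::string makeStringWithEmbeddedNull()
{
    std::string s("a\0b", 3);   // length passed explicitly, so the '\0' is kept
    assert(s.size() == 3);      // size() comes from the stored count
    return s;
}

// Hypothetical helper: what a zero-scanning C function sees in the same data.
std::size_t lengthSeenByStrlen(const std::string& s)
{
    return std::strlen(s.c_str());   // stops at the first zero byte
}
```

For the string built above, size() is 3 but lengthSeenByStrlen reports 1, and c_str()[3] is still the extra terminating zero.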
> so std::strings are not null terminated otherwise
It is also null terminated if we access the element just after the last character at the back of the string
([pos] or at(pos) where pos == size())
The standard does not require that the sequence of characters must be null-terminated otherwise.
> hal will now have "a\0" ?
The size of the string does not change: hal.size() would still yield 1, hal.back() would yield a reference to the character 'a' etc. The null character is not logically a part of the sequence of characters that make up the string.
However, the underlying storage will now contain "a\0" (there is an extra null character at the end).
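The same point as a sketch, repeating the assignment from the question (the function name roundTripThroughCStr is made up for illustration):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: repeats the assignment from the question.
std::string roundTripThroughCStr()
{
    std::string hal = "a";
    hal = hal.c_str();          // assigns the characters up to the first '\0'
    assert(hal.size() == 1);    // the terminator is not counted
    return hal;
}
```

The returned string has size() == 1 and back() == 'a', while c_str()[1] reads the terminating zero that sits just past the logical end.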
Why are cstrings null terminated? What's the point for terminating with null?
Many, many reasons. The most basic one:
You can make C strings bigger than they need to be, e.g. char str[1000] = "hello";. How does the code know how long this string is? How does it know when to stop printing characters? The hidden terminating character is how. strcpy, strcat, strlen, etc. all rely on that terminating zero. The string class works off size() instead, but it still maintains the zero to be compatible with C and C-like C++ code (and, for all we know, the string class may use the C string library calls under the hood in some cases).
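A sketch of that difference between how much room the array has and how long the string in it is. The struct Sizes and the function measureOversizedBuffer are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical pair of measurements for one buffer.
struct Sizes {
    std::size_t capacity;   // bytes reserved for the array
    std::size_t length;     // characters before the first '\0'
};

Sizes measureOversizedBuffer()
{
    char str[1000] = "hello";
    assert(str[5] == '\0');                         // terminator right after 'o'
    return Sizes{ sizeof str, std::strlen(str) };   // 1000 bytes of room, 5 used
}
```

sizeof reports the 1000 bytes that were reserved; strlen reports 5 because it stops at the zero after the 'o'.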
what happens if you don't terminate?
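It's roughly like this:
char x[10];
cout << x; // x is uninitialized, so there may be no zero anywhere inside it
If none of the elements of x happens to be zero, the output routine reads past the end of the array. That is undefined behavior, and it may show up as an access violation. The C tools keep reading until they find a zero byte in memory, even out of bounds.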
Also, you can have arrays of just characters or just bytes, e.g. the contents of a binary file. That isn't a C string. Usually those are unsigned, but either way, it's important to understand that not all char arrays HAVE to be strings.
The only other approach to this I have seen for raw strings in older languages is the Pascal string, which stores the length at the start of the string (a single length byte in classic Pascal implementations). Modern languages have heavyweight string classes with extra fields for the size and more.
The standard does not require that the sequence of characters must be null-terminated otherwise.
I was under the impression that &str[0] could be used to get a mutable c string but based on what you are saying it is not actually guaranteed to be null-terminated. No wonder they added a non-const version of std::string::data() in C++17.
In other words, yes everything about char c[] = ""; is a "potential hazard". Just do char c = '\0'; if you want to store a null character, because that's essentially all you're doing.
Because you've now entered the realm of undefined behavior. Sure, your program might not crash for such a simple operation, but that is not guaranteed. The program is still illegal.
To avoid overflows, the size of the array pointed by destination shall be long enough to contain the same C string as source (including the terminating null character), and should not overlap in memory with source.
PS: On my machine, it crashes.
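// Example program
#include <iostream>
#include <cstdio>
#include <cstring>
int main()
{
    using namespace std;
    char c[] = "";                      // array of exactly one char: '\0'
    strcpy(c, "This is a sentence.");   // writes 20 bytes into 1: undefined behavior
    printf("%s", c);
}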
Also, the compiler generates a warning.
D:\code\cplusplus248142>make main
g++ main.cpp -o main
main.cpp: In function 'int main()':
main.cpp:11:11: warning: 'void* __builtin_memcpy(void*, const void*, long long unsigned int)' writing 20 bytes into a region of size 1 overflows the destination [-Wstringop-overflow=]
strcpy(c,"This is a sentence.");
~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
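For comparison, a hedged sketch of a copy that cannot run past the destination: it is told the destination size and truncates instead of overflowing. std::snprintf is a real standard function; the wrapper copyBounded is a made-up name, and whether truncation is acceptable depends on your program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: copies `src` into `dst` without writing past `cap` bytes.
// Returns true if the whole string fit, false if it had to be truncated.
bool copyBounded(char* dst, std::size_t cap, const char* src)
{
    assert(dst != nullptr && src != nullptr && cap > 0);
    int needed = std::snprintf(dst, cap, "%s", src);   // always writes a '\0'
    return needed >= 0 && static_cast<std::size_t>(needed) < cap;
}
```

With char c[] = ""; as the destination, copyBounded(c, sizeof c, "This is a sentence.") should return false and leave c as the empty string rather than overflowing. In most C++ code, simply using std::string avoids the question entirely.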