Also, we don't pass arrays. We don't use arrays unless we're forced to. We use vectors, or maybe std::array sometimes. Beginners shouldn't use arrays. Beginners should be using vectors.
Rarely do you ever "always" do something. It depends on the situation.
For example, when passing a null-terminated character array ("c-string"), you can use the null-terminator instead of passing a size.
A normal old school array? int arr[12]; for example? Yes, you need to pass the size as well.
With a std::vector (or other STL containers) there is no need: the size (number of elements) is part of the class. So you just pass the container into a function, by value or by pointer/reference, and the container in the function "knows" its size.
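A minimal sketch of the three cases just described, side by side. The function names (count_chars, sum_array, sum_vector) are made up for illustration and are not from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sentinel: walk until the '\0' terminator, so no size parameter is needed.
std::size_t count_chars(const char* s)
{
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// Raw array: it decays to a pointer when passed, so the element count
// has to travel alongside it.
int sum_array(const int* arr, std::size_t count)
{
    assert(arr != nullptr || count == 0);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) total += arr[i];
    return total;
}

// std::vector: the container carries its own size.
int sum_vector(const std::vector<int>& v)
{
    int total = 0;
    for (int x : v) total += x;
    return total;
}
```

For an int arr[12], the call would look like sum_array(arr, 12); for a vector it is just sum_vector(v).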
@repeater, Ganado, Furry Guy
I ask because I usually work on an STM microcontroller, whose built-ins (afaik) don't include STL data structures (nor templates). If I were using c++ to write a general-purpose program, you bet I'd use vectors!
You pass it if you need it. Do you need it?
Does this mean "as long as the caller of the function knows the array indices boundaries (i.e., the size is fixed by specification or, as Ganado pointed out, a marker/sentinel element is used) the size need not be passed?"
I think you folks have otherwise answered my question: depends on the situation and best avoided, although not required -- use a marker or pass in a variable that indicates access boundaries.
"as long as the caller of the function knows the array indices boundaries (i.e., the size is fixed by specification or, as Ganado pointed out, a marker/sentinel element is used) the size need not be passed?"
Yes, pretty much. The inherent problem in passing a separate size variable is that it creates redundant information that needs to be maintained as a codebase changes and grows, but sometimes it can't be avoided.
a marker, like a '\0' character in a c-string is a worse solution than passing the size.
you now have an impossible value, the array needs to be bigger and the user has to be aware so it doesn't increase the complexity of algorithms (for example, doing a lot of strcat() into the same c-string)
> pass in a variable that indicates access boundaries.
not sure what you mean with this
I agree, I don't particularly like c-strings. I would much rather have string format be [size][data], but just saying that's how a lot of C library functions work.
He means a sentinel value (like the null character).
@ne555
I mean an int, size_t, uint -- whatever type that stores a number that represents the last (and/or first) valid index that the function can access.
the answer is still:
Do you need it in the local function? If yes, can you determine it yourself, or do you need it handed to you?
some problems have a global constant # of things.
some problems have a local constant # of things.
some problems have a sentinel or can otherwise determine the # of things (still may be easier to pass it).
some problems, you don't need it at all. (very rare)
there are other ways to pass the index, eg a pointer to the last element.
some problems, you have to pass it in.
vectors do it for you.
some designs may put the size and the pointer together (assuming that vector is not available on your system here) in a struct or something.
At the end of the day, when dealing with raw memory blocks, you generally DO need to know how big the block is and how much of it you have used. So most of the time, you need this info, however you got it (passed in or not). Even when this is sort of hidden for you (modern c++ vectors, iterators, nifty for loop constructs, etc) the underlying tools know this info. It is unavoidable, but may be behind the scenes.
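Two of the options from the list above, sketched without templates or STL containers since those may not be available on a microcontroller toolchain. The names IntView, sum_view and sum_range are hypothetical. Note that the pointer version uses a one-past-the-end pointer rather than a pointer to the last element itself; either convention works as long as caller and callee agree.

```cpp
#include <cassert>
#include <cstddef>

// Pointer and element count bundled together, so they cannot drift apart.
struct IntView {
    const int*  data;
    std::size_t size;
};

int sum_view(IntView v)
{
    assert(v.data != nullptr || v.size == 0);
    int total = 0;
    for (std::size_t i = 0; i < v.size; ++i) total += v.data[i];
    return total;
}

// Alternative: pass the boundaries as two pointers.
// 'last' points one past the final valid element.
int sum_range(const int* first, const int* last)
{
    int total = 0;
    for (const int* p = first; p != last; ++p) total += *p;
    return total;
}
```

With int a[4], the calls would be sum_view(IntView{a, 4}) and sum_range(a, a + 4).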
a marker, like a '\0' character in a c-string is a worse solution than passing the size. you now have an impossible value, the array needs to be bigger
The total data is actually smaller. Without the 1-byte null terminator, you need to store the size, which requires sizeof(size_t) bytes. That's how Pascal stores strings as I recall: there's a size field at the beginning, followed by the data.
Back when 4k was a large and expensive amount of RAM, every byte mattered. In that situation, a null-terminator was and is a good choice. On most of today's modern computers, RAM is cheap so it's better to store the size directly.
DMR in 'The Development of the C Language' (a design decision that made the language simpler)
C treats strings as arrays of characters conventionally terminated by a marker. Aside from one special rule about initialization by string literals, the semantics of strings are fully subsumed by more general rules governing all arrays, and as a result the language is simpler to describe and to translate than one incorporating the string as a unique data type.
Some costs accrue from its approach: certain string operations are more expensive than in other designs because application code or a library routine must occasionally search for the end of a string, because few built-in operations are available, and because the burden of storage management for strings falls more heavily on the user. Nevertheless, C’s approach to strings works well. On the other hand, C’s treatment of arrays in general (not just strings) has unfortunate implications both for optimization and for future extensions.
Which IT or CS decision has resulted in the most expensive mistake?
...
The best candidate I have been able to come up with is the C/Unix/Posix use of NUL-terminated text strings.
...
We learn from our mistakes, so let me say for the record, before somebody comes up with a catchy but totally misleading Internet headline for this article, that there is absolutely no way Ken, Dennis, and Brian could have foreseen the full consequences of their choice some 30 years ago, and they disclaimed all warranties back then. For all I know, it took at least 15 years before anybody realized why this subtle decision was a bad idea, and few, if any, of my own IT decisions have stood up that long. In other words, Ken, Dennis, and Brian did the right thing.
...
The reality of the situation is that all other languages today directly or indirectly sit on top of the Posix API and the NUL-terminated string of C.
...
Thus, the costs of the Ken, Dennis, and Brian decision will keep accumulating, like the dust that over the centuries has almost buried the monuments of ancient Rome.
interesting takes on it. I don't consider it so big a mistake. In C you can deal with the length problem if it is causing you slowness. Most of the C built-in string tools are looping on while [index] != 0. What has to get the length before it can iterate? I can't think of any. It's the user that might want to know the length, and if the code frequently needs this, they can supply it in several ways (a struct with the length is one simple way, but you can also hijack the first N bytes of the data block as a binary integer, and use a negative array index for the size from your char* when you need it).
for entertainment only... if you need the length many, many times for the same string variables in a program, for whatever reason, and you were without <string> ... mad hax get it done pascal style. Where it stinks is you would have to make your own routines to maintain the value ... memcpy can serve for strcpy, but the others, you need to write most of the modify ones again. As I said though, few routines really need the length as much as people think. Mostly iteration in reverse, or if you sorted the string, or other odd things.
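#include <cstring>
#include <iostream>
using namespace std;

int main()
{
    char* cp = new char[100];             // raw block laid out as [int length][characters...]
    int* ip = (int*) cp;
    ip[0] = (int) strlen("Hello World");  // store the length in the first sizeof(int) bytes
    cp += sizeof(int);                    // cp now points at the character data
    strcpy(cp, "Hello World");
    cout << cp << endl;
    // read the length back from the int sitting just before the character data
    cout << "cp len = " << ((int*)(cp - sizeof(int)))[0] << endl;
    //you can use the above mess in the cout to modify, track, etc
    //the length as you work. its big and ugly but can be made efficient (the offset is a constant)
    cp -= sizeof(int);                    // restore the original pointer before freeing
    delete[] cp;
}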
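For comparison, a rough sketch of the "struct with the length" option mentioned earlier. LenString, ls_init and ls_append are invented names and the 64-byte capacity is arbitrary. The point is only that an append can jump straight to the end using the stored length, where strcat would rescan the destination from the start each time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Fixed-capacity string that tracks its own length (hypothetical type).
struct LenString {
    char        data[64];  // still '\0'-terminated so C functions can read it
    std::size_t len;       // number of characters, not counting the '\0'
};

void ls_init(LenString& s)
{
    s.len = 0;
    s.data[0] = '\0';
}

// Appends text if it fits; returns false and leaves s unchanged otherwise.
bool ls_append(LenString& s, const char* text)
{
    assert(text != nullptr);
    const std::size_t n = std::strlen(text);
    if (s.len + n >= sizeof(s.data)) return false;  // need room for the '\0'
    std::memcpy(s.data + s.len, text, n + 1);       // n + 1 copies the terminator too
    s.len += n;
    return true;
}
```

As noted above, the catch is that every routine that modifies the string has to keep len in sync, so you end up writing your own versions of most of them.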