Are automatic char arrays expensive any more?

In the olden C days, code like this was deemed expensive:
void foo1()
{
  char str[128]; // allocated on the stack each time foo1() is called
}

So a better alternative is this:
void foo2()
{
  static char str[128]; // static storage (data/BSS segment, not the heap), allocated once
}

The C++ version is like this:
class MyClass1
{
  void foo3() { }
  char m_str[128]; // allocated once per instance
};

which works well until I start to multithread and two threads access the same instance (race condition). Of course, I can use a mutex, etc., but then I have to design for deadlocks. All that is too much trouble, and overkill, for a temporary string.

So my question is, if I go back to a foo1() equivalent and do this:
class MyClass2
{
  void foo4() {
    char str[128]; // allocated once per call
  }
};

is it going to be expensive? I am asking, in particular, about gcc. I realize that compiler technology is improving all the time, so I'm just trying to get a handle on how gcc might treat these differences (hoping I don't have to go to a disassembler).

I call fooX() quite often and don't worry about replacing char str with std::string - I still have to deal with the same issue.

If it turns out that it's expensive, I may have to do this:
class MyClass3
{
  void foo5( char* str ) { // allocated by caller, in this case, a thread
  }
};

but I'm trying to avoid this if I can (too many args to fooX already).

Most likely, I will go with MyClass2 or MyClass3, but I'd like to hear some informed opinions on this issue. TIA.
Am I misunderstanding something?
Allocation on the stack doesn't really take time. It's just sub esp, size_of_local_variables, which can only be omitted if the function doesn't have any local variables (or maybe if it doesn't call other functions). Of course, if the char[] is static, you may not need to initialize it every time, though that is not always true.
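To make the point about initialization concrete, here is a minimal sketch. The function names format_id and format_id_zeroed are made up for illustration and are not from the thread. Both reserve the same 128 bytes on the stack. The only extra work in the second one is the zero-fill requested by the = {} initializer, which the compiler may or may not be able to optimize away.

```cpp
#include <cstdio>
#include <string>

// Hypothetical example: the buffer is left uninitialized, so reserving it
// costs nothing beyond the stack pointer adjustment the function makes anyway.
std::string format_id(int id) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "id=%d", id);  // writes only the bytes it needs
    return std::string(buf);
}

// Hypothetical example: same stack space, but "= {}" asks for all 128 bytes
// to be zeroed on every call before snprintf overwrites the first few.
std::string format_id_zeroed(int id) {
    char buf[128] = {};
    std::snprintf(buf, sizeof buf, "id=%d", id);
    return std::string(buf);
}
```

Both return the same string. If the per-call cost matters, the initializer is the part worth measuring, not the array itself.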
Yes, allocating space on the stack is super fast because the stack itself has already been allocated from the system. However when it comes to objects you need to remember that their constructor/destructor will be called each time. This can affect performance inside a loop.
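A rough sketch of the constructor point, using a made-up Tracked type that counts its own constructions (nothing here comes from the original code). A local object declared inside the loop body is constructed and destroyed on every iteration, while one declared before the loop is constructed once.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type: counts how many times it has been constructed.
struct Tracked {
    static int constructions;
    std::string text;
    Tracked() : text("temporary") { ++constructions; }
};
int Tracked::constructions = 0;

// The object lives inside the loop body: one construction per iteration.
std::size_t per_iteration(int n) {
    std::size_t total = 0;
    for (int i = 0; i < n; ++i) {
        Tracked t;
        total += t.text.size();
    }
    return total;
}

// The object is hoisted out of the loop: one construction in total.
std::size_t hoisted(int n) {
    Tracked t;
    std::size_t total = 0;
    for (int i = 0; i < n; ++i) {
        total += t.text.size();
    }
    return total;
}
```

Calling per_iteration(1000) bumps Tracked::constructions by 1000, and hoisted(1000) bumps it by 1. A plain char array has no constructor, so this cost does not apply to the char str[128] case in the question.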
Implement the one that is most correct. Profile if performance is not satisfactory.
Thx for the tips everyone - I will try MyClass2/foo4() out first and if it's noticeably painful, I will try the others and profile. You gotta realize - I'm from the super-old-school early days of C/C++ where even too many parameters or recursion can cause problems, so I was in need of a sanity check!

In my case, more than anything else, the potential problem is too much latency in a real-time app. Your insights are very helpful, though - tyvm.
The reason it was deemed "expensive" is not because of the speed, but because of
the stack space requirement. In the old days of DOS, your stack was fairly small in
size--you couldn't afford to put a bunch of 128 byte arrays on it without blowing the
end off it. Making it static moved it to the data segment which did not have nearly
the same size limit.
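The flip side of moving the buffer into static storage is that every call shares the same bytes. A small sketch, assuming two made-up helpers label_static and label_automatic, shows the difference: the static version hands back the same buffer each time, so a second call overwrites the first result, which is also why it is not safe to call from two threads at once.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: one buffer in static storage, shared by every call.
// The returned pointer is only valid until the next call overwrites it.
const char* label_static(int n) {
    static char buf[128];
    std::snprintf(buf, sizeof buf, "item-%d", n);
    return buf;
}

// Hypothetical helper: a fresh automatic buffer per call, copied out into
// a std::string before the function returns.
std::string label_automatic(int n) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "item-%d", n);
    return std::string(buf);
}
```

After label_static(1) followed by label_static(2), both returned pointers are equal and both read "item-2". The automatic version gives two independent strings.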
A temporary string should never be a class attribute to begin with. Perhaps I am misunderstanding that C++ example. Perhaps he is saying that the class might get created in the stack within functions. The comparison seemed a bit odd to me.
jsmith, yes, I remember reading in the DOS era that most OSes had limits on the size of variables on the stack before they got placed elsewhere. I wonder what those limits are these days. It would be useful to know when choosing the size of those temporary strings...

A temporary string should never be a class attribute to begin with.

Seen from an OOP perspective, I agree. In the OP, I just threw it out as an example of the different possibilities for placing a temporary. Thinking about it, the same question would also be relevant for caches. If millions of objects were created, or if you had to execute a method in a highly repetitive loop (you would hope the compiler would be smart enough to optimize this one), the different implementations would make a difference in speed.

I mean, I left this one out, too - in Bar.cpp
static char str[128];
void foo6()
{
}
void foo7()
{
}

Anyhow, as I move more and more code towards reentrancy, I want to remove as much state away from my classes as I can (functional programming, anyone?). I also want to decouple dependencies between classes as much as I can to make the classes fine-grained and independent.
In the 16-bit segmented memory era, typically your stack was limited to 1 segment, which for
MS-DOS was 64K. This was because otherwise stack operations would be very expensive.
You'd have to watch the stack pointer and if it reached the boundary of one segment, increment
(or decrement) the segment register.

With modern 32-bit OSes, MMUs (virtual address space) and flat memory models, the operating
system needs only a single register to access memory, so the compiler needn't worry about generating code that deals with segment registers and the OS can add pages to your stack at
will without worrying about the pages being in contiguous memory. Now, the OS can arbitrarily
decide to limit your stack to whatever size it wants.
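For the question of what the limit is these days: on POSIX systems you can ask the OS at run time. The sketch below is an assumption-laden example rather than a portable answer. It relies on getrlimit and RLIMIT_STACK from <sys/resource.h>, which do not exist on Windows, and the helper name stack_soft_limit_bytes is made up. The value it reports applies to the main thread; stacks of other threads are sized by the threading library.

```cpp
#include <sys/resource.h>  // POSIX only (assumption: Linux, macOS, BSD, ...)

// Hypothetical helper: returns the soft stack size limit in bytes,
// or -1 if the query fails or the limit is reported as unlimited.
long long stack_soft_limit_bytes() {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_STACK, &lim) != 0) {
        return -1;  // query failed
    }
    if (lim.rlim_cur == RLIM_INFINITY) {
        return -1;  // no fixed limit configured
    }
    return static_cast<long long>(lim.rlim_cur);
}
```

On many Linux setups this comes back as a few megabytes (8 MiB is a common default, but it is configurable, so don't rely on it). Either way, a 128-byte temporary is tiny by comparison.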



Excellent. TYVM, jsmith - that info's exactly what I was looking for.