Why are structures slower than arrays?

I have been reading about the differences between C and C++, and I've noticed that people often say C++ is 'slower' because it uses classes.

This statement seems a little excessive given that C++ can be used like C anyway, but many people seem to agree that classes and structures are slower than arrays. Is this always the case?

If my program does this: myarr[5], is that really faster than mystruct.value_5? I know that the former is quite literally just *(myarr + 5), but what does the computer have to do to produce mystruct.value_5? Is it really more complex than indexing the array?


No.

The "classes are slow" BS stems from the fact that member function and variable access adds another level of indirection.

Consider the below:

class A
{
  int avar;

  void function()
  {
    avar = 5;  // doesn't look like it, but this involves a pointer dereference
  }
};


Here, avar = 5; is actually this->avar = 5;. The added indirection of this does in fact involve some overhead.

However this doesn't make classes any slower than structs. The same thing could be done in C:

typedef struct
{
  int avar;
} A;

void function(A* a)
{
  a->avar = 5;
}


This code and the C++ code above are pretty much identical.


Now where it gets even more complicated is with virtual function calls. Calling a virtual function 'foo':

virtual void foo();

involves a bit more overhead:

- possibly some pointer math to find the vtable (if you have multiple inheritance or somesuch)
- dereferencing the vtable to find the right 'foo' function pointer
- dereferencing the function pointer to actually call the function


So yeah this is a little slower, but it's nothing to write home about.

And again, the same thing could be accomplished in C, and doing it in C would not be any faster because you'd have to do all of the same work.


So C++ isn't slower than C. What's really the case is that slower language constructs are slower than faster ones, and C++ offers more of the slower constructs to the programmer.

And I use the term "slower" here very loosely. The above overhead is minuscule and, 9999 times out of 10000, will not hinder your program's performance in any way, shape, or form, unless you're doing it a few hundred thousand times per second.
I think the effect can only be felt under extreme business requirements, like exchange trading (matching buy and sell orders). Can you imagine processing one million C++ objects per second, compared to a simple C array of one million values of primitive types? That difference will show up on the monitoring tools.

However, in most business applications such extreme and harsh requirements are rare. In fact, many of them even opt for Java as the implementation language.
Ok, that makes sense. I am designing a neural network that frequently (~50,000 times per second) asks for a value in an array that is pointed to by a structure member (the structures themselves are also inside an array), so something like this:

(*array_of_structure_pointers[i]).somearr[j] = value;

The alternative would be to create one massive array where each small array is stored sequentially inside it. That would look something like this:

big_horrible_array[i * (arrays_per_structure * array_length) + array_no * array_length + j] = value;

The extra headache factor aside, is the complexity involved in finding the array inside the structure really greater than the extra multiplication and addition involved in indexing big_horrible_array?

I would test this myself, but it would require rewriting large amounts of code, so I was hoping someone could just explain a) whether there's any theoretical benefit at all in this particular case, and b) if there is, whether it is outweighed by the extra overhead of the multiplications and additions needed to index the big array?
Actually, the second approach may get you an out-of-memory error if your array is VERY big! So I would go for your first approach, but you still need to test it out.

Usually for performance-related testing, I write a simple C++ console program just to test that specific aspect. I try various data structures and/or algorithms, vary the amount of input data, etc. It is a time-consuming affair, and unless it is part of the business requirements, I would gladly skip this "trial-and-error" development :P

And the above doesn't even factor in other aspects of the program!
Well, it would probably be on the order of 10-100 MB in my case (maybe 1-10 million single-precision floats or so), so I don't think memory is an issue. However, after giving it some thought, that "headache" factor is starting to look really significant, so I think I'll just stick with my nice-looking, readable structures for now!

Cheers
C++ isn't slower than C. Object-oriented design, in general, is likely to produce slower programs than their structured equivalents. Why? Because, as Disch said, there are extra levels of indirection. Pieces of functionality tend to get broken down into smaller pieces in an object-oriented design, which results in more function calls. One can mitigate some of that through judicious use of inline functions; however, that isn't the only source of slowness.

Consider, for example, any STL container. Let's take vector. If I have a vector<Foo> where Foo is an aggregate (class or struct containing several data members) and I want to add another Foo to the end of the vector, what do I do? I use vector::push_back. But let's look at what that really does. vector<Foo>::push_back( const Foo& ) takes a Foo object by const reference, then copies that Foo into the memory owned by the vector. So:

myvec.push_back( Foo( 3.14, 42, "Hello World" ) );


first runs a user-defined constructor of Foo, then runs the copy constructor of Foo to copy the temporary to the container, then runs the destructor of Foo to destroy the temporary.

Compare this to the C equivalent, which I bet would be something like:

// assume:
struct Foo { double pi; int meaning_of_life; const char* message_to_the_world; };

Foo myvec[ 10 ]; // an array of Foo's
int nelems = 4;  // number of elements in container

myvec[ nelems ].pi = 3.14;
myvec[ nelems ].meaning_of_life = 42;
myvec[ nelems ].message_to_the_world = "Hello World";
++nelems;


It can't really get any faster than that; there is no "temporary" work done that has to be undone. There is also no encapsulation/data hiding here either.
Topic archived. No new replies allowed.