why do push_back() and direct access wor

Forum

Forum
General C++ Programming
why do push_back() and direct access wor

why do push_back() and direct access work fine in these codes?

Sep 30, 2008 at 3:58am

Hello, could someone explain me why this two codes work? I got surprised when I found out that I can use either push_back() method or direct access mtr[i][j] when mtr was declared as: vector< vector<double> > mtr;

which of the codes would be the most efficient?

code 1:

#include <iostream>
#include <vector>

using namespace std;

int main()
{
  int length = 5;
  int num = 10;
  
  vector< vector<int> > mtr;
  
  mtr.resize(length);
  for(int i = 0; i < length; i++)
  {
    mtr[i].reserve(length);

    for (int j = 0; j < length; j++)
    {
      mtr[i][j] = num;
      cout<<"mtr["<<i<< "]["<< j<< "]= "<< mtr[i][j] <<endl;
    }
  }  
    
  return 0;
}

Edit & run on cpp.sh

code 2:

#include <iostream>
#include <vector>

using namespace std;

int main()
{
  int length = 5;
  int num = 10;
  
  vector< vector<int> > mtr;
  
  mtr.resize(length);
  for(int i = 0; i < length; i++)
  {
    mtr[i].reserve(length);

    for (int j = 0; j < length; j++)
    {
      mtr[i].push_back(num);
      cout<<"mtr["<<i<< "]["<< j<< "]= "<< mtr[i][j] <<endl;
    }
  }  
    
  return 0;
}

Edit & run on cpp.sh

Last edited on Sep 30, 2008 at 4:04am

Sep 30, 2008 at 1:13pm

jsmith (5804)

They are not the same thing.

push_back() adds a new element to the end of the vector.
operator[] accesses an already existing element of the vector.

Also, be careful with your use of reserve() vs. resize() because they
too are not the same thing.

resize(N) causes the vector to contain exactly N elements.
reserve(N) causes the vector allocate enough backing store for N elements but does not actually cause the vector to contain N elements.

Sep 30, 2008 at 4:43pm

aortizb (40)

Yes, I know and that is why I am asking. Why does then mtr[i][j] work? For the reason you say above it should only work push_back() method.

Sep 30, 2008 at 7:02pm

jsmith (5804)

It works because you are accessing memory that has been allocated by the vector.

So vector's guarantee to the user is that the memory allocated is one contiguous block.
vector has 3 data members: a pointer to the start of the memory block, a pointer to the
end of the memory block, and a pointer one past the last used element.

So in example 1, mtr[i] looks like this for example:
start = 0x1000
end = 0x1014
last = 0x1000

When you access mtr[i][j] with j = 1, for example, this writes a value to
memory location (start + sizeof(int)) or 0x1004. That memory is allocated
obviously. But the vector doesn't realize that index 1 is now used so
if you were to call size() at this time, it would still return 0.

Sep 30, 2008 at 10:32pm

aortizb (40)

Tanks. Yes, I read something similar in a book. Also, you will have problems using at(). So, the use of [] seems to be inconsistent here. For me that should be a warning saying: *Don't use it*. I particularly like push_back() method becuase is cheap. In a 1D vector there is no problem, but in 2D or higher order vectors inconsistencies like this arise. A partial solution I found is use resize before the loop and after the loop:

  mtr.resize(length);
  for(int i = 0; i < length; i++)
  {
    mtr[i].resize(length);

    for (int j = 0; j < length; j++)
    {
      mtr[i][j] = num;
      cout<<"mtr["<<i<< "]["<< j<< "]= "<< mtr[i][j] <<endl;
    }
  }

But this certainly will initiate the rows with default number of the types, in that case, integers. That implies an extra work than reserving room for a certain qty. of integers becuase the function will set with 0 all the elements. This entiles an access to the vector. On the contrary, if I use reserve() method that means an empty vector that I can fill after using push_back() method.

Oct 1, 2008 at 12:21pm

jsmith (5804)

I think it is a matter of personal preference. vectors were supposed to replace arrays and therefore have access similar to arrays, hence operator[]. Since no bounds checking is done when you access an array via [], no bounds checking is done on vector either. That is why at() was created -- it is the same thing as operator[], except it performs the bounds checking before it allows the user to access unallocated memory.

My preference is to use reserve() and push_back() instead of resize() and operator[] to fill out an array.

My preference also is to use iterators to walk the vector rather than indices, which means I use neither operator[] nor at() unless I'm being lazy.

Oct 1, 2008 at 5:10pm

exception (323)

I prefer the "+=" syntax with a std::back_inserter.

Since no bounds checking is done when you access an array via [], no bounds checking is done on vector either.

Not quite 100% true. The behavior after dereferencing an invalid pointer (or the past-the-end pointer, which is technically not 'invalid', but may not be dereferenced) is undefined, in C as well as in C++. Thus, operator[] might throw an exception if the operand is out of range (though it is not done, usually). If the operand is not out-of-range, operator[] just makes the guarantee to be done in O(1), so bound checking is possible. at() on the other hand is guaranteed to be range-checked and throw std::out_of_range if the index is invalid. But operator[] may be implemented in terms of at().

Oct 1, 2008 at 7:58pm

aortizb (40)

Thanks for the useful insights. I also prefer using reserve() and push_back() but, how can it be done in a 2d vector? I guess it is not possible without having the problem of size() and at() and possibly iterators.

Any suggestion?

Oct 2, 2008 at 12:23pm

jsmith (5804)

Generally I'd try to do something like this:

vector< vector< int > > twodvec;

for( size_t i = 0; i < NUM_VECS; ++i ) {
    vector< int > v;
    for( size_t j = 0; j < NUM_NUMS; ++j )
         v.push_back( rand() );
    twodvec.push_back( v );
}

If you want to create a 2D vector where all the values are the same, as in your code above, there is a simpler way:

vector< vector< int > > vec( NUM_VECS, vector<int>( NUM_NUMS, VALUE ) );

would create a NUM_VECS x NUM_NUMS matrix where each element has the
value VALUE.

Oct 2, 2008 at 5:17pm

aortizb (40)

Thank you. I have also tried the your two above suggestion. But I am looking for something which is more efficient than that. For a large 2d vector the above code will generate several reallocations which is expensive. That is why I would like to use reserve() method, but it seems that it is not possible to do it. So, I will have to survive with what is available.

Oct 2, 2008 at 5:25pm

exception (323)

push_back() runs in amortized constant time due to exponential growth, no reserve() or other hacks are required. So what is the big deal, anyways? Have you used a profiler and it shows that the memory allocation is a problem? I doubt it, and even if so, it is most certainly not due to push_back()s. Efficiency is achieved by good algorithms, not by implementation details.

Oct 2, 2008 at 6:29pm

jsmith (5804)

If you are concerned about performance, then in my 1st example above you just need
to add 2 calls to reserve: the first one at line 2, to reserve NUM_VECS elements in twodvec, and the second one between lines 4 and 5 to reserve NUM_NUMS elements in v.

My 2nd example above already is as efficient as it's going to get, though, since the vectors will construct themselves with only a single allocation each.

Oct 2, 2008 at 6:42pm

aortizb (40)

It seems reasonable. The only doubt I have then is how reserve() will know in line 2 the size that "v" will have between lines 4 and 5. I would think that reserve() will guess a number, say 10, but after when reserve() is called to allocate "v" the length of it might be a larger number, say 20, so reserve() has to allocate again. So, if this is encountered several times during the loop it might probably not be efficient. Maybe I am wrong?.

Oct 2, 2008 at 8:05pm

jsmith (5804)

It doesn't need to know. A vector has three data members: start, end, and last, each of which are pointers. If sizeof(T*)==4 then twodvec has to allocate enough space to hold (3*4)*NUM_VECS bytes, which is what it will do.

Topic archived. No new replies allowed.