Iterator vs unsigned for iterating over vector?

Hey all. Just a quick question:
What is considered the best way (efficient, safe, etc) to iterator over a vector.
Should you use an unsigned int and compare it to the size, or an iterator, and compare it to the end of the vector?
1
2
3
4
5
6
7
8
for(unsigned i = 0; i < vector.size(); ++i)
{
vector[i].Foo(); //This
}
for(std::vector<T>::iterator i = vector.begin(); i != vector.end(); ++i)
{
(*i).Foo();      //Or this?
}


I know the name iterator should kind of give it away, but it feels like using a simple unsigned should be more efficient speed wise? Or is there no difference?
In short, both methods are acceptable and neither is a one-size-fits-all.

The former is simpler and the latter is more generic (i.e. you can easily swap the vector for another standard library container with the iterator method). It depends on which matters most in your particular case.

And since we're talking about safety, you might as well use a size_t for indexing to be safe in both 32 and 64 bit builds.
Last edited on
To be (perhaps pendatically) correct, the type of i in the first loop should be std::vector<T>::size_type or at least std::size_t

As for speed: it depends on the vector implementation and on the optimizer.

In the following example, vector implementation stores the end pointer as a member of std::vector, while size has to be calculated as end minus begin, and the compiler couldn't determine that it was a loop invariant, so the first loop turned out less efficient:

1
2
3
4
5
6
7
8
9
10
11
12
13
# first loop (with .size())
..B1.3:
        addq      %r12, %rdi
        call     Bar::Foo()
..B1.4:
        movq      8+vector(%rip), %rcx
        movq      vector(%rip), %rdi
        addq      $8, %r12
        incq      %r13
        subq      %rdi, %rcx
        sarq      $3, %rcx
        cmpq      %rcx, %r13 
        jb        ..B1.3        
# second loop (with .end())
..B1.3:
        movq      %r12, %rdi
        call      Bar::Foo()
..B1.4:
        addq      $8, %r12
        cmpq      8+vector(%rip), %r12 
        jne       ..B1.3
In the following example, vector implementation stores the end pointer as a member of std::vector, while size has to be calculated as end minus begin, and the compiler couldn't determine that it was a loop invariant...

I'm curious: which compiler/vector implementation is that?
that was Intel 11.1 on linux with gnu library (I like all those %r's it uses).

Then I allowed Bar::Foo() to get inlined, and intel compiled the two loops to identical machine code, except for the jb (less than) vs jne (not equal), while gcc kept subtracting, for shame.

It really depends on the rest of the program, though.
Last edited on
For added variety, here are a few more platforms:

Sun, with both C++ libraries
1
2
3
4
5
6
7
8
9
10
# first loop, using .size()
.L900000205:
    call    Bar::Foo()
    add     %i3,1,%i3
    ld      [%i2],%i5
    ld      [%i2+4],%i1
    sub     %i1,%i5,%i0
    cmp     %i3,%i0
    bcs,pt  %icc,.L900000205
    add     %i5,%i3,%o0
second loop, using .end()
.L900000205:
    call    Bar::Foo()
    add     %i2,1,%i2
    ld      [%i3],%i4
    cmp     %i2,%i4
    bne,pt  %icc,.L900000205
    or      %g0,%i2,%o0


IBM
1
2
3
4
5
6
7
8
9
10
11
12
13
# first loop, using .size()
__L180:
        stw     r3,68(SP)
        stw     r4,64(SP)
        bl      Bar::Foo()
        ori     r0,r0,0x0000
        addi    r31,r31,1
        lwz     r0,8(r30)
        lwz     r4,4(r30)
        subf    r0,r4,r0
        add     r3,r31,r4
        cmpl    0,0,r31,r0
        bc      BO_IF,CR0_LT,__L180
# second loop, using .end()
__L180:
        bl      Bar::Foo()
        oril    r0,r0,0x0000
        l       r3,64(SP)
        l       r0,8(r31)
        st      r0,68(SP)
        cal     r3,1(r3)
        st      r3,64(SP)
        cmpl    0,r3,r0
        bc      BO_IF_NOT,CR0_EQ,__L180



HP
1
2
3
4
5
6
7
8
9
10
11
12
# first loop, using .size()
..L5:
        add             r38 = r36, r9
        add             r35 = 4, r37
        add             r36 = 1, r36
        br.call.sptk.many rp = Bar::Foo()
        ld4             r9 = [r37]
        ld4             r8 = [r35]
        add            gp = 0, r32
        sub             r8 = r8, r9
        cmp4.ltu.unc    p6, p0 = r36, r8
        br.dptk.many    ..L5
# second loop, using .end()
..L5:
        add             r37 = 0, r36
        add             r36 = 1, r36
        br.call.sptk.many rp = Bar::Foo()
        ld4             r8 = [r35]
        add             gp = 0, r32
        cmp4.ne.unc     p6, p0 = r36, r8
        br.dptk.many    ..L5


Looks like iterators win in general case, as they should.
To be fair, the compiler will never optimize that ::size() call away on a non-const container.
What if you 'cache' the size instead of requesting the size each loop? The function call should be optimized out for const vectors but with non-const vectors it may not have enough information to optimize it, and the function call every loop will degrade performance. (I think?)
Well, the function call itself could actually optimized out, but you can certainly cache it:

1
2
3
4
for (size_t size = v.size(), n = 0; n < size; n++)
  {
  v[ n ].foo();
  }

I would have actually expected the optimized assembly to inline the call...
@ Duoas: It would if it's inlined, wouldn't it? I can't imagine the size() call being anything other than a simple return _size; in the implementation. Seems like a prime candidate for inlining/optimizing.
Agreed.

I just looked at my GCC 4.3.0's STL and it seems that it keeps a couple pointers to the begin and end of the vector elements, so it actually calculates the size with a subtraction. So, even if it is inlined, it seems that caching the result actually would produce more optimized assembly.
Ah it looks like you replied again just before I sent my reply, so we ended up saying the same thing. Whoops.
To be fair, the compiler will never optimize that ::size() call away on a non-const container

They do optimize it away in some of my tests, given inlinable Foo().
Last edited on
I agree with LB on this, cache the size.

The "most efficient" would be to use a range operation... something like.

std::for_each(myvec.begin(), myvec.end(), CallFoo());

Without iterators the range semantics would not be possible and they are, very optimized algorithms, for the most part.

size varies in the different standard containers, some implementations of list take linear time when checking size, that's why Meyers says in his "Effective STL" to prefer using empty in place of checking size vs 0 (obviously empty doesn't apply here just some added info.)
Last edited on
I should have been more specific in my language, sorry.

I meant that the compiler will never optimize away the ::size(); it could inline the call itself. On a const container, however, it could easily assume that the ::size() never changes, and cache its value.
@Duoas: even if vector is non-const, compilers can optimize out size() if the body of Foo() is entirely visible and can be analyzed:

example

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
#include <vector>
#include <string>
#include <cstdio>
struct Bar {
        std::string data; // payload
        void Foo() const { puts("foo");} // doesn't modify vector
};

std::vector<Bar> vector;

int main()
{
    for(std::vector<Bar>::size_type i = 0; i < vector.size(); ++i)
    {
        vector[i].Foo();
    }
}
..B1.3:
        movl      $_2__STRING.0, %edi
        call      puts
..B1.4:
        incq      %r13 
        cmpq      %r12, %r13 
        jb        ..B1.3

True, but I didn't think any compilers were that are that smart.
Topic archived. No new replies allowed.