Vector class

Pages: 1234
closed account (zb0S216C)
cire wrote:
"When new fails, an exception is thrown. The object is never created."

I know that. My point is that by setting "MyVector::_capacity" to some value before allocating invites the possibility that, if "new" fails, "MyVector::_capacity" would be holding some value which implies that the allocation succeed when it might have failed. Yes, "new" throws an exception upon failure, but that doesn't mean the state of "MyVector::_capacity" will change.

cire wrote:
"Erm.. really? You never inspect elements of a constant object?"

Yes, but I specialise my classes to handle constant types.

Wazzak
Last edited on
cire wrote:
"Erm.. really? You never inspect elements of a constant object?"


Framework wrote:
Yes, but I specialise my classes to handle constant types.


1
2
3
4
5
6
7
8
template <typename T>
T sum( const MyVector<T> & v )
{
    T total = 0 ;
    for ( unsigned i=0; i<v.size(); ++i )
        total += v[i] ;
    return total ;
}


This shouldn't work then (assuming size is made const correct?)
Last edited on
I'm pretty late, but:
ResidentBiscuit wrote:
I just got into the habit of assigning 0 after deletion awhile ago.
It's actually "assign nullptr with deletion", like this:
1
2
3
int  *example = new int;
//blah
delete example, example = nullptr;
It has the same effect as if there were a semicolon instead of a comma, but it's clear that it is done together, and it is not susceptible to some code formatter splitting them onto separate lines.
This container is not exception safe, and therefore has potential memory leaks.

Whenever you are doing manual memory allocation, you have to be very careful about when and where exceptions can occur so they don't kick your butt.

Specifically, let's take a look at your reallocate function. I've marked where you're vulnerable to leaks due to exceptions being thrown at a bad time (notice how it's pretty much the entire function):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
template <typename T>
void MyVector<T>::reallocate()
{
    T* temp = new T[_capacity];  // <- you are vunlerable starting here
    for(int i = 0; i < _size; i++)
    {
        temp[i] = data[i];  // <- it's very possible that the = operator will throw an exception
            // especially if it involves more allocation.
    }
    _capacity = _capacity * 2;  // <- as was already mentioned, changing this prior to
       // confirmed success is unwise, as an exception could leave this reporting
       //  incorrect info
    delete[] data;
    data = 0;
    data = new T[_capacity];  // <- why allocate a 2nd buffer here?
    for(int i = 0; i < _size; i++)
    {
        data[i] = temp[i];
    }
    delete[] temp;  // <- no longer vulnerable
    temp = 0;
} 


Also, I'm not sure why you are allocating two buffers here (and mentioned in the comments). It makes much more sense to just allocate temp as the new buffer, delete the old buffer and assign data to the new buffer. Do only one move rather than two.

IE: you're doing this:
1
2
3
4
5
6
1)  allocate temp
2)  copy data to temp
3)  delete data
4)  reallocate data at 2x capacity
5)  copy temp to data
6)  delete temp


I'm suggesting you do this:
1
2
3
4
1)  allocate temp at 2x capacity
2)  copy data to temp
3)  delete data
4)  data = temp;



Also if you are using C++11, moving objects is potentially faster than copying them (and has less chance of throwing an exception). So rather than assigning values, maybe move them?

 
    temp[i] = std::move( data[i] );




EDIT: I just saw you already addressed the copy-twice problem. NM
Last edited on
@Disch, see the updated file on sourceforge, or the updated reallocate function in an earlier post. I'm aware I haven't pushed the update via git yet, so you may be seeing a slightly dated version. I was also not aware of std::move, I'll look into that. I have done no exception handling yet, I was getting a working class up and running first, then deal with exceptions. My experience with exceptions is limited in C++ so I was going to isolate working with those.

1
2
3
4
5
for(int i = 0; i < _size; i++)
    {
        temp[i] = data[i];  // <- it's very possible that the = operator will throw an exception
            // especially if it involves more allocation.
    }

May I ask why this would throw an exception/cause more allocation?

In regards to the overloaded operator[], I pulled that out of C++ Primer. I don't have it on me right now, but it stated if you're going to overload operator[] you should have both versions that I have included (though I'm not too sure why, the explanation was rather small).

@catfish...
Also what's the use for the else branch in the code below?

I wanted to make sure it would still work correctly even if size was some strange value for some reason (ie negative). I do have size getting decremented in points, so the possibility is there.
After reading that SO link, it doesn't seem like there's really an issue with underscore first as long as the first letter is lowercase, but I'll go in change it. Don't want anything silly to happen because of that :P

@Framework (big post),
1,2) Yea that hadn't occurred to me when writing this. It's obvious now so I'll definitely get on changing it.
3) I had a question about this earlier, I wasn't sure if I should validate it there or let the hypothetical user of this class deal with validating the indices.
4) Responded to above.
5) Why should it need to be const? And once the allocation is made exception safe, and _capacity is only changed when it's valid, should I need to do any checks here?
6) Yea yea I guess it's a little too much.

1) I was gonna work out who to inline once I was happy with the implementation. I don't like to try to optimize a rough draft :)
2) This one is the most interesting to me, though I don't fully understand what you're saying. Mind clarifying there?
3) I figured 10 was a good enough default value (what does std::vector use?), if the user needs a larger initial capacity they can use the second constructor. I do plan on adding a reserve function as well. As an aside, doubling the size at each reallocation was just kind of a place holder. I'm curious if there is a better mathematical way to go about it?
4) I'll look this over. But how would pointers be any faster than indexing arrays?
5) Also interesting, I hadn't thought of this. I also wasn't aware passing scalar types by reference was slower? How would this be fixed, though? I'm not sure how I would be able to distinguish between scalar and object types.
6) Won't this just affect compilation time?

@helios,
I was kind of feeling the same way. The added work to make this work with objects with no default constructor seems like a lot when the alternative is just say "objects used with this must have default constructor". Though this is all just for practice to brush up before a big internship interview, so I may still go ahead and add the functionality.

I'd prefer to not use malloc(), though I guess I don't have anything against it. Placement new is completely new to me, I haven't even heard of it until now. So I guess I'll need to look that up a bit.

p.s.,
I appreciate all the input here! This is awesome. Also, you should look at the file on sourceforge, not the commit on git as I haven't figured out how to commit updated files to git yet, so it's still the first version of this. The file is updated though.
closed account (zb0S216C)
ResidentBiscuit wrote:
"5) Why should it need to be const? And once the allocation is made exception safe, and _capacity is only changed when it's valid, should I need to do any checks here?"

It doesn't have to be constant, but if a function doesn't intend on modifying the object's state then why not? Also, even if the function is exception-safe, it still suffers the same problem. The only way you're going to resolve this is by setting "_capacity" after the allocation takes place.

ResidentBiscuit wrote:
"2) This one is the most interesting to me, though I don't fully understand what you're saying. Mind clarifying there?"

The "new" most people use allocates N * sizeof( T ) bytes and then constructs "N" objects within that allocated memory. Once all objects are constructed, a "void *" is returned to the calling function. Since this is a container implementation, all slots (a slot is where an element is placed when it's added to the container) are automatically filled with default-constructed objects.

What the "std::vector" does is allocate a large block of memory, and then leaves it at that. When "std::vector::push_back( )" is called, an object is constructed within the next free slot. Effectively, the "std::vector" doesn't contain any objects until "std::vector::push_back( )" is called.

ResidentBiscuit wrote:
"3) I figured 10 was a good enough default value (what does std::vector use?)"

The "std::vector" allocates memory for a good few thousand objects.

ResidentBiscuit wrote:
"I'm curious if there is a better mathematical way to go about it?"

Just allocate a bit more memory initially and then double it every re-allocation, just like you're doing now.

ResidentBiscuit wrote:
"4) I'll look this over. But how would pointers be any faster than indexing arrays?"

With indices, you're constantly recomputing the position of the next element with each cycle. With pointers, however, you simply increment a pointer (which points to an element) and jobs a good 'en. Many optimisation guides recommend avoiding the very way you were copying each element.

ResidentBiscuit wrote:
"How would this be fixed, though?"

Meta-programming. The solution isn't a quick one:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
struct False { static bool const Value_ = false; };
struct True { static bool const Value_ = true; };

template< typename T > struct Is_Integral : False { };
template< > struct Is_Integral< char > : True { };
template< > struct Is_Integral< short > : True { };
// ...and so on...

template< typename T > struct Is_Pointer : False { };
template< typename T > struct Is_Pointer< T * > : True { };

template< typename T > struct Is_Scalar
{
    static bool const Value_ = ( ( Is_Integral< T >::Value_ ) || ( Is_Pointer< T >::Value_ ) );
};

template< typename TT, typename TF, bool Cond_ >
struct Enable_If_Else
{
    typedef TT Type;
};

template< typename TT, typename TF > 
struct Enable_If_Else< TT, TF, false>
{
    typedef TF Type;
};

template< typename T >
struct Container 
{
    typedef typename Enable_If_Else< T const &, T, Is_Scalar< T >::Value_ >::Type PassType;

    void Function( PassType Value_ );
};

Long winded, no?

Wazzak
Last edited on
May I ask why [assignment operator] would throw an exception/cause more allocation?


Overloaded assignment operators (specifically those of container classes, such as other vectors or strings) will need to allocate buffers when they're assigned. Failed allocation = exception.

But really, any time you call a function (or a potentially overloaded operator), you are open to that function/operator throwing an exception.



Which reminds me. Your vector class also does not copy itself properly.

1
2
3
4
{
  MyVector<int> a;
  MyVector<int> b = a;
} // <- program crashes here 


You need to overload your assignment operator and copy constructor. And if you want to get C++11 fancy, you can overload the move constructor as well.
Last edited on
Framework wrote:
The "std::vector" allocates memory for a good few thousand objects.


Does it?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#include <vector>
#include <iostream>

int main()
{
    std::vector<int> v ;

    auto data_ptr = v.data() ;
    for ( int i = 1; i < 1000 ; ++i )
    {
        v.push_back(1) ;
        if ( v.data() != data_ptr )
        {
            std::cout << "Reallocation after adding " << i << " elements\n" ;
            data_ptr = v.data() ;
        }
    }
}


Results for GCC:
Reallocation after adding 1 elements
Reallocation after adding 2 elements
Reallocation after adding 3 elements
Reallocation after adding 5 elements
Reallocation after adding 9 elements
Reallocation after adding 17 elements
Reallocation after adding 33 elements
Reallocation after adding 65 elements
Reallocation after adding 129 elements
Reallocation after adding 257 elements
Reallocation after adding 513 elements


VC++:
Reallocation after adding 1 elements
Reallocation after adding 2 elements
Reallocation after adding 3 elements
Reallocation after adding 4 elements
Reallocation after adding 5 elements
Reallocation after adding 7 elements
Reallocation after adding 10 elements
Reallocation after adding 14 elements
Reallocation after adding 20 elements
Reallocation after adding 29 elements
Reallocation after adding 43 elements
Reallocation after adding 64 elements
Reallocation after adding 95 elements
Reallocation after adding 142 elements
Reallocation after adding 212 elements
Reallocation after adding 317 elements
Reallocation after adding 475 elements
Reallocation after adding 712 elements



@ ResidentBiscuit: so _size is a signed integer? Why?

1
2
3
4
5
template <typename T>
T MyVector<T>::get(int index)
{
    return data[index];
} 


Again, a signed integer. Anyway, this function is a good place to throw some exceptions.

The library provides you with some ready-made exceptions:
http://cplusplus.com/reference/stdexcept/
> The added work to make this work with objects with no default constructor seems like a lot when the
> alternative is just say "objects used with this must have default constructor"
It is not so complicated. Take a look at `std::allocator'
I found the restriction too restrictive. Using a dummy default constructor is error prone

> some way of passing the constructor parameters to resize() in a generic manner (perhaps by way of boost::bind?)
void resize( size_t n, const T &value );

> If "T" was a scalar type, you'd be losing some performance due to passing-by-reference.
¿couldn't the compiler do it?


About exception safety, take a look at http://www.stroustrup.com/except.pdf
Last edited on
Also! The MyVector::get() function should return a reference, otherwise you can't change the data. Or if you're OK with this, mark the function as const.
Meta-programming. The solution isn't a quick one:

Wow that is horrible looking. Is it even worth it?

So I should initially allocate memory for sizeof(T) * INITIAL_CAPACITY?
Actually, I'm already doing this. I guess I'm lost on this allocating extra space thing. It now seems like to me that I'm already doing just that.

Which reminds me. Your vector class also does not copy itself properly.

Yea I haven't written the copy control yet. That was actually my next goal.

@catfish,
Yea I should probably make those unsigned. But does that really fix this? Will a compiler not cast to signed if a user tries to enter some negative value? About the get() function, I was considering just dropping it. I had it there before I had the operator[] working, and just left it in case people wanted to use it. Though a getter() function traditionally doesn't allow changing, so I can't imagine someone calling get() and expecting to be able to change the value. So I'll either take it out, or make it const.

It is not so complicated. Take a look at `std::allocator'

Got any pointers on doing this myself? I'm trying to build up a nice portfolio of work, and I think it would look cool if I had this done myself. Though I guess it would be equally as beneficial as showing I know how to use the built-in C++ methods.
Wow that is horrible looking. Is it even worth it?

Not if you write C++11 code... I think.
http://cplusplus.com/reference/type_traits/

Will a compiler not cast to signed if a user tries to enter some negative value?

The value contained in an unsigned int will not be interpreted to be a negative number. So a -1 will magically turn into some huge positive value.
The value contained in an unsigned int will not be interpreted to be a negative number. So a -1 will magically turn into some huge positive value.

Oh duh, I should have remembered that.

Seems like type_traits does the same thing that framework had posted up there.
Just for variety, Sun Studio
Reallocation after adding 1 elements
Reallocation after adding 33 elements
Reallocation after adding 65 elements
Reallocation after adding 104 elements
Reallocation after adding 167 elements
Reallocation after adding 269 elements
Reallocation after adding 434 elements
Reallocation after adding 701 elements

(although it's sometimes really annoying that their default lib grabs a 32-element chunk)
> Wow that is horrible looking. Is it even worth it?
I suppose that passing a std::pair<char, char> would be faster by copy.
I suppose that that would be implementation dependent, or system dependent.
I suppose that's better to be handled by the compiler.
I suppose.


> Got any pointers on doing this myself?
¿what do you understand for `take a look'? Open your `memory' header and start reading.
I hope you're not going to keep calling this class "Vector"? You would be making the same mistake as the STL. We have the std::vector, which is an array, and we have std::array, which is a vector.
`std::valarray' is a vector
`std::array' is an array
ne555 wrote:
void resize( size_t n, const T &value );
Ah-ha! But then you're just moving the problem. Instead of requiring a default constructor you'd be requiring a default constructor or a single parameter constructor.
¿? I'm requiring a copy constructor.
If you prefer use the move version.
Pages: 1234