Indexed List - Criticism Required

Oct 20, 2010 at 4:29am
closed account (3hM2Nwbp)
Hello, for the framework I've been making I have need of an index oriented list that must be very fast in both linear and indexed look-ups. I've made this class and would like any suggestions for improvements or pitfalls that I may have missed. Specifically, I have a question on line 61. Thanks in advance for your time.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
#ifndef _INDEXED_LIST_
#define _INDEXED_LIST_

#include <boost/thread/mutex.hpp>
#include <list>

template<class T>
class Indexed_List
{

	private:
		
		/// A mutual exclusion to synchronize access between multiple threads of execution
		mutable boost::mutex mutex_;

		/// A list containing indicies that are available for use
		std::list<unsigned int> inactiveIndicies_;
		
		/// A list of pointers to templated objects that have been assigned an index
		/// This list is used for iteration purposes - while iterating over this list, explicit locking & unlocking is required
		std::list<T*> objects_itr_;
		
		/// An array of pointers to templated objects that have been assigned an index
		/// This array is used for indexed lookup purposes
		T** objects_idx_;

		/// The maximum capacity of this Indexed_List
		unsigned int capacity_;
	
		/// The default maximum capacity of this Indexed_List
		const static unsigned int _DEFAULT_CAPACITY_ = 0xFFFF;

		/// Initialises this Indexed_List
		/**
		 * @throws
		 *		If stack overflow occurs
		 */
		void init(void)
		{
			this->objects_idx_ = new T*[this->capacity_];
			for(auto index = 0; index < this->capacity_; index++)
			{
				this->inactiveIndicies_.push_back(index);
				this->objects_idx_[index] = nullptr;
			}
		}

	public:

		/// Subscript Overload
		/**
		 * @throws
		 *		std::exception if the index is out of the bounds of this Indexed_List
		 */
		T* operator[](unsigned index) const
		{
			if(index >= this->capacity_) //Thanks Disch
			{
				throw std::exception("Index out of bounds");
			}
			boost::mutex::scoped_lock(this->mutex_); /// Question : Do I need to synchronize here?
			return this->objects_idx_[index];
		}

		/// Constructs an Indexed_List with the default maximum capacity of 65535
		/**
		 * @throws
		 *		If stack overflow occurs
		 */
		Indexed_List(void)
			: capacity_(_DEFAULT_CAPACITY_)
		{
			this->init();
		}

		/// Constructs an Indexed_List with the specified maximum capacity
		/**
		 * @param capacity
		 *		The maximum capacity of this Indexed_List
		 *
		 * @throws
		 *		If stack overflow occurs
		 */
		Indexed_List(unsigned int capacity)
			: capacity_(capacity)
		{
			this->init();
		}

		/// Disables access to the backing lists
		/**
		 * @throws
		 *		Nothing
		 */
		void lock(void) //const - Compiler rules would allow, but lock explicitly screams out 'I am changing the state of this object'
		{
			this->mutex_.lock();
		}

		/// Enables access to the backing lists
		/**
		 * @throws
		 *		Nothing
		 */
		void unlock(void) //const - Compiler rules would allow, but unlock explicitly screams out 'I am changing the state of this object'
		{
			this->mutex_.unlock();
		}

		/// 1) Attempts to obtain an index for the specified object
		/// 2) If an index is available, the specified object is stored in this Indexed_List until it is explicitly removed
		/**
		 * @param pointer
		 *		A pointer to an object to aquire an index for
		 *
		 * @return
		 *		True if the index was successfully aquired, false if the capacity of this Indexed_List has been reached
		 *
		 * @throws
		 *		Nothing
		 */
		bool aquireIndex(T* pointer)
		{
			boost::mutex::scoped_lock(this->mutex_);
			if(this->inactiveIndicies_.size() == 0)
			{
				return false;
			}
			unsigned int index = this->inactiveIndicies_.front();
			this->inactiveIndicies_.pop_front();
			this->objects_itr_.push_front(pointer);
			this->objects_idx_[index] = pointer;
			pointer->setIndex(index);
			return true;
		}

		/// Removes the specified object from this Indexed_List and releases its index
		/**
		 * @param pointer
		 *		A pointer to the object to remove from this Indexed_List
		 *
		 * @throws
		 *		std::exception if the index is outside of the bounds of this Indexed_List
		 */
		void releaseIndex(T* pointer)
		{
			unsigned int index = pointer->getIndex();
			if(index >= this->capacity_) //Thanks Disch
			{
				throw std::exception("Index out of bounds");
			}
			boost::mutex::scoped_lock(this->mutex_);
			this->objects_itr_.remove(pointer);
			this->objects_idx_[index] = nullptr;
			this->inactiveIndicies_.push_front(index);
		}

		/// Retrieves the pointer at the specified index
		/**
		 * @param index
		 *		The index that holds the desired pointer
		 *
		 * @return
		 *		The pointer at the specified index
		 *
		 * @throws
		 *		std::exception if the index is outside of the bounds of this Indexed_List
		 */
		T* at(unsigned int index) const
		{
			if(index > this->capacity_)
			{
				throw std::exception("Index out of bounds");
			}
			boost::mutex::scoped_lock(this->mutex_);
			return this->objects_idx_[index];
		}

		/// Retrieves an iterable list of all elements in this Indexed_List
		/// Explicit synchronization, via the lock and unlock methods, is required while iterating over the returned list
		/**
		 * @return
		 *		An iterable list of all elements in this Indexed_List
		 *
		 * @throws
		 *		Nothing
		 */
		const std::list<T*>* all(void) const //Thanks Disch
		{
			return this->objects_itr_;
		}

		/// Destroys an Indexed_List
		/// This does not destroy the contents of the list
		~Indexed_List(void)
		{
			if(this->objects_idx_ != nullptr)
			{
				delete this->objects_idx_;
				this->objects_idx_ = nullptr;
			}
		}
};

#endif //_INDEXED_LIST_ 


Example Usage:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
#include <iostream>
#include "Indexed_List.hpp"

struct Thingy
{
private:
	unsigned int index_;

public:

	void setIndex(unsigned int index)
	{
		this->index_ = index;
	}


	unsigned int getIndex(void) const
	{
		return this->index_;
	}
};


int main()
{
	try
	{
		Indexed_List<Thingy> thingys;
		Thingy thingy;
		thingys.aquireIndex(&thingy);
		std::cout << "Thingy1: " << thingy.getIndex() << " : " << &thingy << " =?= " << thingys[thingy.getIndex()] << std::endl;
		thingys.releaseIndex(&thingy);
	}
	catch(std::exception& exception)
	{
		std::cout << "Exception: " << exception.what() << std::endl;
		return 0xBADF00D;
	}
	std::getchar();
	return 0;
}


Thingy1: 0 : 0046FCEC =?= 0046FCEC


Best regards,
Luc
Last edited on Oct 20, 2010 at 6:03am
Oct 20, 2010 at 5:36am
I didn't look at it in great detail but here are my initial thoughts:

1) you have an off-by-1 error on line 57 (should that be >=?)

2) what benefit does this class have over just using std::list or std::map directly? I can't tell what the point is. It just looks to consume more memory and add more overhead. (EDIT: or std::vector, even?)

3) You're half-assing thread safety. The class gives the impression that it's threadsafe, but it's easy to sidestep. Particularly with the all() function (std::list may use COW, which makes such usage thread unsafe)

4) Although... thread safety doesn't really belong in a container anyway, since the data it contains needs to be threadsafe also (and this is impossible to provide through a container). What good does adding thread safety to the [] operator do if you still need to put the actual data access behind a mutex anyway.

5) all() should probably be returning a const reference to the list and not a copy, although even that would be prone to misuse.

6) read point #2 again ;P
Last edited on Oct 20, 2010 at 5:38am
Oct 20, 2010 at 5:53am
closed account (3hM2Nwbp)
1) Yep :P

2) Surely iterating over a vector (skipping null elements) would be much slower than keeping a compact list? Especially when there are up to 0xFFFFFFFF elements. The std::list is in place for iteration, while having an array to handle index look-ups. If you know of a faster method, that would be great.

3) Even while explicitly calling Indexed_List::lock()? (As documented)

4) It makes more sense to me to lock the container with a built-in locking mechanism rather than to have a mutex in the class that uses the container and locking on that mutex. I could (probably) be wrong?

5) Intended to be const std::list<T*>* all(void) const but would still allow that pointer to persist outside of the expected scope (after calling Indexed_List::unlock()) and kill the synchronization :(

6) Did twice :P

Thanks for the responses,
Luc
Last edited on Oct 20, 2010 at 6:07am
Oct 20, 2010 at 6:07am
2) Use a list then. I don't see why you'd need to have indexed lookup at all. The only reason you'd need to have indexes is for random access... and the whole idea of having random access is destroyed if you have null elements. Why can't you just use iterators instead of indexes?

3) This is what I meant by half-assing. If it's the user's responsibility to make it thread safe, why does the class try do it internally?

4) I just don't see what benefit it has. It's all about delegating responsibility.

If you want to say it's the class's responsibility to make it thread safe, then the class should be 100% thread safe. But this probably isn't the way to go since it's arguably impossible (or at the very least impractical) with a container class because access to the data it's containing cannot [reasonably] be guarded.

If you want to say it's the user's responsibility, then leave it up to the user.

Splitting the job between the user and the container just adds unnecessary complication and is almost never a good idea.
Last edited on Oct 20, 2010 at 6:13am
Oct 20, 2010 at 6:12am
closed account (3hM2Nwbp)
2) Consider trying to look up an object by index using a list. Each element would have to be compared against desired index (up to 0xFFFFFFFF times). The indexed look-up is required because each entity in my implementation has to have a globally unique (and constant) identifier for the scope of their session.

3) I'm not sure what the majority of professional programmers think, but keeping the mutex responsible for protecting the backing lists encapsulated within the class makes the most sense to me. Suppose an implementation consists of 100 indexed lists, that would require 100 external mutexes, correct?

*Edit - I think that part of my ideology behind (3) is my experience with the Synchronized(map/list) wrapper that I had used for a long while in the Java language where the only time that explicit synchronization was necessary was during iteration.

Thanks for your patience with me,
Luc
Last edited on Oct 20, 2010 at 6:19am
Oct 20, 2010 at 6:17am
2) But why would you have an index? Instead of using an index, use an iterator. Then there's no problem.

3) My point is that whoever is using this class still has to guard accesses to the class. Example:

1
2
3
4
5
6
7
8
9
10
// assuming 'mylist' is an Indexed_List
mylist[5]->Foo();  // this is not threadsafe because the call to Foo() is not guarded

// to make it threadsafe, the user has to explicitly lock:
mylist.lock();
mylist[5]->Foo();  // now this is threadsafe
mylist.unlock();

// but since the user had to do the locking themselves... what's the point of the container doing it?
// it's redundant and pointless. 


EDIT:


I really need to sleep, but I'll check in on this topic tomorrow if you reply.
Last edited on Oct 20, 2010 at 6:22am
Oct 20, 2010 at 6:24am
closed account (3hM2Nwbp)
I think that I need to provide more information on my goal with this. This is part of a client-server setup in which the clients only have knowledge of each-other using their server-assigned indices. The server keeps track of which clients to update based upon the specific index assigned to the clients when they connect to it. The index assigned to a specific object must remain unchanged. (Believe me, I've thought about this for a while before going with this approach)

1
2
3
4
5
6
7
8
9
10
11
12
13
Indexed_List<Session> sessions; //defined in a server instance-wide scope
boost::mutex mutex; //defined in a server instance-wide scope

Session* someSession;
void sessionToSession(unsigned int otherSessionIndex)
{
	//otherSessionIndex was read from a client packet
	mutex.lock();
	Session* sessionToInteractWith = sessions[otherSessionIndex];
        //Do something with this session
	//sessionToInteractWith->sendMessageFromAnotherSession(someSession);
	mutex.unlock();
}



I was under the impression that when adding to a list, all of the indices were shifted to make room for the new addition, thus changing the indices associated with each object.

1
2
3
4
5
6
7
std::list<int> myList;
myList.push_front(4);

//myList[0] is 4 here

myList.push_front(5);
//myList[0] is now 5? 


I'll confirm whether this is the case now.

Thanks,
Luc

*Edit
1
2
3
4
5
6
7
8
9
10
11
12
13
boost::mutex myListMutex; //assuming correct scope

// assuming 'mylist' is an Indexed_List
mylist[5]->Foo();  // this is not threadsafe because the call to Foo() is not guarded

// to make it threadsafe, the user has to explicitly lock:
myListMutex.lock();
mylist[5]->Foo();  // now this is threadsafe
myListMutex.unlock();

//Do most users really prefer to define their own mutexes?
//(I don't program professionally, so I haven't any experience with users other than myself)
Last edited on Oct 20, 2010 at 7:02am
Oct 20, 2010 at 3:35pm
This is part of a client-server setup in which the clients only have knowledge of each-other using their server-assigned indices.


Here's what you don't seem to be understanding: You don't need an index for this.

Ask yourself the following questions:

1) Does the index itself have any significance other than being a unique identifier for each entry in the container?

2) Do you need to perform mathematical calculations on the index in order to retrieve other elements in the container (ie: container[index - 10] to get the element 10 places down)?

If the answer to the above questions is "no", then there is no reason to use an index. An iterator will work just as well for your purposes, and can be used just fine with a list.

KISS. You don't need to design a whole new container. Just use std::list.

I was under the impression that when adding to a list, all of the indices were shifted to make room for the new addition


Lists don't really have indexes.

But yes.. when you pop_front in a list, the order in which everything is stored changes... but that doesn't matter because the elements themselves have not moved in memory.

What lists do have is iterators that remain valid even after you add/remove elements.

For example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
std::list<int> mylist;
mylist.push_back(5);

std::list<int>::iterator i = mylist.begin();

std::cout << *i;  // prints '5'

// add/remove a bunch of crap
mylist.push_back(0);
mylist.push_back(0);
mylist.push_front(0);
mylist.push_front(0);
mylist.push_back(0);
mylist.push_back(0);

mylist.pop_front();
mylist.pop_back();

// try again:
std::cout << *i;  // still prints '5'.  i hasn't changed.

// it doesn't matter how many times you add/remove from the list... as long as you don't remove
//  the element i is referring to.  i will never change/move. 



Do most users really prefer to define their own mutexes?


When appropriate, yes. It's an encapsulation thing. It's about organizing/assigning responsibilities between different areas of the program.

The mutex should be owned by whoever is responsible for locking it. If the container owns the mutex, then the container should be the only one that is locking it. The user should know nothing about it.

However since the container cannot do that job adequately, and the user still has to be responsible for the mutex themselves, it doesn't really make any sense for the container to have the mutex.
Oct 20, 2010 at 4:45pm
closed account (EzwRko23)
If you really need O(1) indexed lookups and fast iteration over the whole list, and fast O(1) add/removal of elements without changing the indexes... why not use a hash_map?!


Topic archived. No new replies allowed.