Mar 2, 2010 (last update: Oct 12, 2011)

Linked Lists

Score: 3.3/5 (424 votes)
*****

Linked Lists


The linked list is used in many libraries/applications and for a good reason. The following are advantages it has over other containers, indirectly from the reference page on cplusplus.com:

  • Efficient insertion and erasure of elements anywhere in the container (constant time).
  • Iterating over the elements in forward order (linear time).
  • Efficient moving elements and block of elements within the container or even between different containers (constant time).

The following only applies to a doubly linked list which I will explain later on:
  • Iterating over the elements in forward or reverse order (linear time).

Something the reference doesn't explain is how it's implemented. Why would you *want* to know about that? For me, simply curiosity. For others, perhaps they may create their own type of linked list container. Either way, it's knowledge that *someone* will eventually use hopefully.


Design

The linked list is generally pictured as a list of linear nodes that are tethered together somehow. In C/++, you generally have a structure which contains data and a pointer to the next structure container which contains data and a pointer to the next structure container... and so on. The linked list's main advantage is that it doesn't contain data in a contiguous way and rather, a flexible way. This allows for fast insertion and better overall iteration. The linked list is often even used as the basis for other containers (such as queue or stack containers).

There are several variations of the linked-list. Actually, the term "linked list" doesn't really refer to the implementation (thus, an abstract term), just the nature of how the data of the container is held (linked via references). The most common implementation of the linked list is probably the doubly linked list. The doubly linked list is a list that contains nodes that contains references to the previous *and* next link in the list. This allows for fast node erasure, faster fetching of end nodes, and more flexible iteration.

At the cost of flexibility of the doubly-linked list, it generally costs more memory to use. In large lists, consider the idea of a node being twice the size as normal. This can affect the application quite a bit. In the case that you have no reason to iterate backwards, the doubly-linked list can be considered inefficient, simply by design. std::list is a doubly linked list. As a result, if I ever come about the need for a singly-linked list, I implement my own. A singly-linked list can only iterate forward simply because the node it holds only holds a reference to the next node in the list and not the previous but comes with the merit of one less reference.

Another type of linked-list that's not used quite as often is the circular linked-list. All this is a singly/doubly-linked list where the end node iterates forward to the beginning node (and possibly vice versa). There aren't many uses for this as it generally has problems with iterating through the list but take, for instance, a list that iterates through the nodes until a given signal is received. Also, it's really a technique used with linked lists... not always in implementation although special list implementations can handle circular linked lists better than others: TODO: Add example...

In embedded systems, the use of linked lists can be expensive. The holding of each reference can be so heavy that it's undesirable to use a linked-list with nodes that only hold one location of data. Instead, they generally used what's called unrolled linked lists. In an unrolled linked list, the node holds a reference to the next and previous, like normal, but each node holds arrays of data. When you iterate through each node, you iterate through the data in the array linearly, then move on to the next node. This way we can have three or four nodes of data in one while reducing the amount of nodes altogether (therefor saving memory). However, this generally uses contiguous memory to hold nodes so it becomes difficult to move nodes that are in the array. TODO: Give an example.

Wikipedia has fantastic pictures: http://en.wikipedia.org/wiki/Linked_list#Linear_and_circular_lists

Implementation


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
template <typename T>
struct SNode //Singly-linked list node
{
  SNode () : _next(0) {}
  SNode (T data) : _data(data), _next(0) {}
  SNode (T data, Node<T>* next) : _data(data), _next(next){}
  SNode (Node<T>* next) : _next(next) {}

  T data;
  Node<T>* next; //Only the next reference.
}

template <typename T>
struct Node : public SNode<T>
{
  Node<T>* prev; //This is the only difference in structure!
}


This is probably the most basic form of a list structure. However, this doesn't show a list being useful in anyway. We can simplify the interface greatly by adding complexity to implementation. std::list uses a parent class while controlling nodes internally using abstract methods (iterators) for access to nodes. It's of my belief that the interface that std::list provides is good so I'll make something that resembles such a thing:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
#include <iostream>

template <typename T>
class List;

template <class TNode>
class Iterator
{
	/* Helper class to provide pointer like facilities around a node */
	friend class List<typename TNode::value_type>;
	TNode* pNode; //The node oriented with this instance of iterator.
	
	Iterator(TNode* _pNode) : pNode(_pNode) {}
public:
	void operator++(){ pNode = pNode->_next; }
	void operator++(int){ pNode = pNode->_next; }
	bool operator!=(Iterator<TNode> rval){ return !(pNode == rval.pNode); }
	bool operator==(Iterator<TNode> rval){ return (pNode == rval.pNode); }
	typename TNode::value_type operator*(){	return pNode->_data; }
	Iterator<TNode> operator+(int _i)
	{
		Iterator<TNode> iter = *this;
		for (int i = 0; i < _i; ++i)
		{
			if (iter.pNode) //If there's something to move onto...
				++iter;
			else
				break;
		}
		return iter; //Return regardless of whether its valid...
	}
};

template <typename T>
class Node
{
	friend class List<T>;
	friend class Iterator<Node<T> >;
	Node () : _next(0) {}
	Node (T data) : _data(data), _next(0) {}
	Node (T data, Node<T>* next) : _data(data), _next(next){}
	Node (Node<T>* next) : _next(next) {}

	T _data;
	Node<T>* _next;
public:
	typedef T value_type;
};

template <typename T>
class List
{
	Node<T>* first;
	
public: 
	typedef Iterator<Node<T> > iterator;
	typedef T 		  value_type;
	
	List () : first(0) {}
	~List() 
	{ 
		if (first) 
		{ 
			Node<T> *iter = first;
			while (iter != 0)
			{
				Node<T>* tmp = iter;
				iter = iter->_next;
				delete tmp;			
			}
		}
	}

	void push_back(T data)
	{
		if (first)
		{
			Node<T> *iter = first;
			for (; iter->_next != 0; iter = iter->_next); //Iterate until we reach the end of our linked list.
			iter->_next = new Node<T>(data);
		}
		else
			first = new Node<T>(data);
	}; 
	
	void push_front(T data)
	{
		if (first)
		{
			Node<T> * tmp = new Node<T>(data);
			tmp->_next = first;
			first = tmp;			
		}
		else
			first = new Node<T>(data);
	}
	
	iterator begin(){ return iterator(first); }
	iterator end(){ return iterator(0); }
	
	bool erase(iterator& _iNode) //True for success, vice versa
	{
		/* This is rather inneffecient. Maybe a better way to do this? */
		/* Even then, it's *still* more effecient than a contiguous container */
		if (_iNode.pNode == first)
		{
			first = first->_next;
			delete _iNode.pNode;
			return true;
		}
		else
		{
			for (Node<T>* iter = first; iter->_next; iter = iter->_next)
			{
				if (iter->_next == _iNode.pNode) //Find our node.
				{
					iter->_next = _iNode.pNode->_next;
					delete _iNode.pNode;
					return true;
				}
			}
		}
		
		return false;
	}
};

int main(void)
{
	List<int> list;
	list.push_back(3);
	list.push_back(4);
	list.push_front(2);
	list.push_front(1);
	
	/*Print all elements*/
	for (List<int>::iterator iter = list.begin();
	     iter != list.end(); ++iter)
	{	
		std::cout << (*iter) << std::endl;
	}
	
	/*Delete second element and reprint*/
	List<int>::iterator tmp = list.begin() + 1;
	list.erase(tmp);

	for (List<int>::iterator iter = list.begin();
	     iter != list.end(); ++iter)
	{	
		std::cout << (*iter) << std::endl;
	}
	
	/*Now delete first node and print again*/
	tmp = list.begin();
	list.erase(tmp);
	
	for (List<int>::iterator iter = list.begin();
	     iter != list.end(); ++iter)
	{	
		std::cout << (*iter) << std::endl;
	}
	
	std::cin.ignore();
	//List object takes care of deletion for us.
	return 0;	
}


This is a huge improvement as far as feature goes. We now have (basic) memory management for our nodes, and ease-of-use iterators to iterate through our nodes without the dangers of pointers. What more could you possibly want?

The above is a quick implementation of a singly linked list. If you look at the code, it's relatively straight-forward and self explanatory (through logic). What isn't, has been commented on.

On the above, there are *plenty* of things that we can change and customize for our specific needs. For instance, (*)the iterator of the class can contain both the previous and current node to help make erasure a tad more efficient at the cost of memory and iteration time. For instance, if you have a large list and you keep moving in keep having to reiterate through the list, perhaps this isn't quite as efficient because of the added constant of reassigning multiple node references. If you are continuously erasing and/or swapping elements from the list then this would be highly efficient since you need to change the next reference held by the previous node of what is getting swapped or erased.

There's also a way to reduce the cost of memory with a doubly-linked list called XOR linking which uses XOR encryption on the pointer to shrink the size of memory used. TODO: Provide example.