Linked List and Null

To cut to the chase, I'm studying Chapter 17 of Stroustrup's PPP, and he gives the following code, which confused me quite a bit (I hate his naming convention for members and function parameters in this example):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
  struct Link {
   string value;
   Link* prev;
   Link* succ;
   Link(const string& v, Link* p = nullptr, Link* s = nullptr)
      : value{v}, prev{p}, succ{s} { }
};

Link* insert(Link* p, Link* n) // insert n before p; return n
{
   if (n==nullptr) return p;
   if (p==nullptr) return n;
   n–>succ = p; // p comes after n
   if (p–>prev) p–>prev–>succ = n;
   n–>prev = p–>prev; // p’s predecessor becomes n’s predecessor
   p–>prev = n; // n becomes p’s predecessor
   return n;
}

int main()
{
   Link* norse_gods = new Link{"Thor"};
   norse_gods = insert(norse_gods,new Link{"Odin"});
   norse_gods = insert(norse_gods,new Link{"Freia"});
   return 0;


I modified it a bit to be a bit more understandable, but the insert function? Well...

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
struct Link {
	//NOTE: THE PARAMETER LIST initializes, unless explicitly initialized
	Link(const string& value, Link* previous = nullptr, Link* next = nullptr)
		:name{ value }, prev{ previous }, next{ next } {}
	Link() {}

	string name;
	Link* prev;
	Link* next;
};

Link* insert(Link* existing, Link* current)
{
	if (!existing) return current;
	if (!current) return existing;
	current->prev = existing;
	current->prev->next = current; //existing->next = current
	return current;
}

int main()
{
	Link* norse_gods{ new Link{"Thor"} };
	norse_gods = insert(norse_gods, new Link{ "Odin" });
	norse_gods = insert(norse_gods, new Link{ "Freia" });
	return 0;
}


So I'm looking at my version of insert and his, and I'm trying to figure out why he has the test for p->prev. It just seems (to me, mind you) that my version does basically the same thing, and if it passes the initial if(existing && current) then there PROBABLY wouldn't be any issues?

This is not meant to be an, "AHA! Gotcha, Stroustrup!" moment, and I fully expect someone to say, "Well dummy, because this..." but from my novice viewpoint, it seems his insert function is a bit more verbose than required.

Just trying to figure out if and how badly I've messed this up. Pointers for me are exceptionally difficult to grasp, as I keep forgetting even the basics (not remembering that they point to objects I'm allocating on the free store) so any help is great. I just keep looking at the code, contrasting it with mine, and saying, "Either I'm missing something important or he wants something totally different, but what I've written seems to do what I suspect he's asking for but it's shorter, somehow..."
First of all, note that you've inverted the semantics of insert(). insert() should add the new node before the existing one, not after.

If existing->next is not null and you overwrite that pointer with current, the entire sublist that came after existing becomes leaked.
Before:
[existing] --> [existing->next] --> ...
After:
[existing] --> [current]
(dangling) [existing->next] --> ...

Also note that, since your insert() is inserting after the existing node and you return current, the list from the point of view of main() is malformed because norse_gods is pointing to the tail of the list instead of the head.
First of all, note that you've inverted the semantics of insert(). insert() should add the new node before the existing one, not after.


I'm glad you brought that up, because I DID note that! In fact, I made a note of it in the book. I saw Stroustrup said it, then I saw him use it as he did previously to add to the end of the list. I myself said, "Shouldn't this be called add() ?" which Stroustrup himself says in the paragraph DIRECTLY after the definition, but since that's how he used it, I thought, "Sure, OK then..."

And it's at THIS point that I have to stop you and ask: if I use the function (I defined) purely as an add() function, which is what I originally wanted to do (adding to the end of the list, then tying my links together and leaving current->next = nullptr to mark the end of the list) does what you say about dangling references still apply? Because that really was my original intent, and what I assumed Stroupstrup was going for too, which added (and still adds) to the confusion. Here is his original code that he (rightly) said was too verbose to be useful:

1
2
3
4
5
Link* norse_gods = new Link{"Thor",nullptr,nullptr};
norse_gods = new Link{"Odin",nullptr,norse_gods};
norse_gods–>succ–>prev = norse_gods;
norse_gods = new Link{"Freia",nullptr,norse_gods};
norse_gods–>succ–>prev = norse_gods;


//back to his insert definition, followed by...

"Given that, we could write:"

1
2
3
Link* norse_gods = new Link{"Thor"};
norse_gods = insert(norse_gods,new Link{"Odin"});
norse_gods = insert(norse_gods,new Link{"Freia"});


You see where my confusion comes from? He's using his insert() function in the way I thought an add() function was supposed to be used, so I assumed that's what he wanted, and where I derived my code.

I DO get what you're saying though! If current were to come in the middle of an existing list:

existing->link = (something)
current->prev = existing;
current->prev->next = current;
//UH OH! Now you've lost everything from current, which now is the new end of the list.

When if I were REALLY defining an insert function and wanted to put current after existing and before what was already there, I would (probably) want something like...
[Existing] -> [Something Else]
Want...
[Current] -> [Existing] -> [SE]

1
2
3
4
5
6
//Check Current and Existing for null... probably SE too...
if(existing->prev) current->prev = existing->prev;
current->prev->next = current;
current->next = existing;
existing->prev = current;
return current; //I think? 


I really need to go back and give this section a sixth or so read. It's all so confusing.
if I use the function (I defined) purely as an add() function, which is what I originally wanted to do (adding to the end of the list, then tying my links together and leaving current->next = nullptr to mark the end of the list) does what you say about dangling references still apply?
IMO it's a bad idea. The function should not leak memory just because it was passed the wrong node, specially since it's easy to detect when this is happening.

You see where my confusion comes from? He's using his insert() function in the way I thought an add() function was supposed to be used, so I assumed that's what he wanted, and where I derived my code.
I don't see what you mean. Both snippets are equivalent. They both construct a list from the back to the front.
Yeah, I don't see the leak :/ I'm not saying it isn't there, just that I don't see it.

I'm going to maybe check out yet another source, since in two books I'm clearly not picking this up, and maybe a YT demo on doubly linked lists and pointers will help. It's obvious I'm missing SOMETHING key, but I don't know what, and should probably stop now before I make what little I understand worse.

I appreciate your help man!
What do you mean you don't see the leak? You yourself wrote
I DO get what you're saying though! If current were to come in the middle of an existing list:

existing->link = (something)
current->prev = existing;
current->prev->next = current;
//UH OH! Now you've lost everything from current, which now is the new end of the list.
Oh that? Yeah I don't want to use that as an insert() function. That's terrible. I wanted to use this as an add() (to end of existing list, as the semantics of add() I assume dictate) function:

1
2
3
4
5
6
7
8
Link* add(Link* existing, Link* current)
{
	if (!existing) return current;
	if (!current) return existing;
	current->prev = existing;
	current->prev->next = current; //existing->next = current
	return current;
}


And for an insert() function, I'm still trying to cook one up, but it's looking something like the above:

1
2
3
4
5
6
//Check Current and Existing for null... probably SE too...
if(existing->prev) current->prev = existing->prev;
current->prev->next = current;
current->next = existing;
existing->prev = current;
return current; //I think?  


I think I may have confused you with all the talk of add() and insert(). Yeah I totally see where what I wrote would make a terrible insert() function, but it seems like it could be a decent (or at least start of) and add() function to the end of an existing list. But yeah, using it to insert in the middle of a list or between two nodes would be an awful idea.
I wanted to use this as an add() (to end of existing list, as the semantics of add() I assume dictate) function
Like I said, an add() function that operates on nodes is a bad idea. It doesn't do anything that an insert() function doesn't do, and it leaks memory if the caller passes a node other than the last. It doesn't matter that the function is called add() instead of insert(). You should assume that the caller will attempt to pass the wrong node at some point. If you don't want to make add() behave like insert() in that case, at the very least attempt to detect that situation and handle it gracefully.

Imagine if someone wrote delete_last_file_in_directory() that deletes the last file in a directory when passed the name of that file, and deletes all files in that directory if it's passed anything else.

An add function does make sense if you have a separate List class that manages the whole list.
1
2
3
4
5
class List{
    Link *head, *tail;
public:
    void add(Link *new_node);
};
Well, the good news is I get it now. I drew and re-drew then drew out a diagram of Stroustrup's insert() function, and I get it. The bad news is I'm an idiot, it took far longer than it should have taken (I'll admit I've been trying to wrap my head around this for an hour), and I never realized that the list was adding to the "end", or, "parents of", so to speak. So yeah, I'm pretty embarrassed. So thinking of it genetically...

1
2
3
4
5
6
7
8
9
10
11
12
Link* insert(Link* p, Link* n)
//n is the parent of p and becomes new head of list, apparently?
{
	if (!n) return p;
	if (!p) return n;
	n->succ = p; //n is the parent of p
	//if p was the child of someone...
	if (p->prev) p->prev->succ = n; //then n becomes the child of p's parent
	n->prev = p->prev; //previous line - tying together
	p->prev = n; //and p's parent is now n
	return n; //n is the eldest now
}


I don't know man. I'm just really ashamed it took this long to figure it out. I don't know how or why I got so confused - probably the wording - but it's still no excuse. Thanks a bunch man!
Topic archived. No new replies allowed.