DFS and DLS Algorithm

I am supposed to write C++ code to answer the following questions, but before that I need to be able to answer them on paper in order to write the correct code.

I have a graph as follows:
                                         A
                                    /    |   \
                                   /     |    \
                                  /      |     \ 
                                 /       |      \
                                /        |       \
                               B         C        D
                             / |        /\      / | \
                            /  |       /  \    /  |  \
                           E   F       G  H    I  J   K
                          /|   |\      |  |    |  |\   \
                         / |   | \     |  |    |  | \   \
                        L  M   N  P    Q  R    S  T  U   V
                       /  /\
                      /  /  \
                     W  Y    Z

Here are my questions.

1) If A’ is the initial state and V’ is the goal state, how many states will depth-first search visit in the worst case?

my ans:
 In the worst case, DFS will visit 4 states from A' to V' along the longest path (A' -> D' -> K' -> V')
P.S: I don't get what "worst case" means here.


2) If A’ is the initial state and W’ is the goal state, how many operators will be on the path found by depth-first search?

my ans:
Here's the path from A' to W':
A' -> B' -> E' -> L' -> W'
To count the number of operators, we need to count the edges traversed in the path:
A' to B' (1 operator)
B' to E' (1 operator)
E' to L' (1 operator)
L' to W' (1 operator)

P.S: What is the difference between questions 1 and 2?


3) What is the lowest depth limit that will allow depth-limited search to find a path from any of the 24 states to any of the remaining 23 states?

my ans:
A' -> B' -> E' -> M' -> Y'. This path consists of 4 edges.



4) If B’ is the initial state and V’ is the goal state, what is the lowest cutoff needed by depth-limited search for solving this instance?

my ans:
Starting from level 1 (B'), traversing through level 0 (A') and level 1 (D'), then reaching level 2 (K'), and finally level 3 (V'), the path consists of 2 depth transitions. Therefore, the depth limit needed for depth-limited search (DLS) to find a path from B' to V' is 2.

P.S: What does "lowest cutoff needed" mean here?


5) What is the minimum number of nodes expanded by depth-first search if A’ is the initial state and K’ is the goal state?

my ans:
The shortest path from A' to W' is A' -> B' -> E' -> L' -> W', so the depth-first search (DFS) algorithm will expand the nodes A', B', E', and L' before reaching the goal state W'. Therefore, the minimum number of nodes expanded by DFS from A' to W' is 4: the nodes A', B', E', and L' along that path.


I would really appreciate it if someone could confirm my answers.
1) # of vertices + # of edges is the usual answer here. This should be explained in your studies, similar to how BST search is shown/proven to be lg(N) in most books. You can figure this out too, but it should have been proven/explained somewhere.

2) What are you calling an operator here? It sounds like 'for this classroom' jargon more than C++ or algorithm-class discussion.

3) I think so. I call it 5, you call it 4, but I think it's just a matter of what exactly you want to count. Both could be correct depending on what the teacher wants.

4) ... what does the book or teacher say about lowest cutoff? Is it defined anywhere? I can guess, but I shouldn't have to.

5) I think so. Again, it depends on what they want you to count.

-- this exercise is like studying bubblesort: it's how not to do things. I can't think of any reason to store data this way or to craft such an inefficient mess of a data store.
1)
V is a leaf, so the number of nodes (states) that will be visited in the worst case is all of them.

A depth-first search completely explores each path before exploring the next. So if the search starts by exploring branch B, then C, then D... then I, then J... and so on in order — then V winds up being the very last node visited.

(This is quite predictable, since no DFS I know of chooses paths to search randomly — rather, the path is chosen based entirely on the order of child nodes. So, given your picture above, the order of nodes is A→[B,C,D], B→[E,F], etc., meaning your DFS will indeed visit all the other nodes before V.

The only exception is if the nodes have an ordering on them, for example over a binary lookup tree or something similar, where the paths to search can be pruned based on the node value. That would be a specialized case, though, and not the general answer for any random DFS tree search.)



2)
“Operators”? == edges?

Generally speaking, the number of edges on the path from N₁ to N₂ is: (depth of N₁) + (depth of N₂) − 2 × (edges in common).

Since A is the root node, this question can be restated as: how many edges from A to W?


3)
In order to find any node, its depth must be ≤ the depth limit. So, to find a path between any two nodes, choose maximum( depth of N₁, depth of N₂ ).

The question specifically asks about finding any two nodes in the tree, so the maximum will be simply the depth of the tree.


4)
maximum( depth of B, depth of V )


5)
Remember, DFS takes nodes in order. So, tracing through the search, we go:
  A→B
    B→E
      E→L
        L→W
      E→M
        M→Y
        M→Z
    B→F
  ...
In the end we need to follow 22 edges before finding K.

Keep in mind that the example tree conveniently labels all nodes in search order, just to make your life easier. Should you have a tree that looks like:

      Q
     / \
    K   E
       / \
      A   R

The search order is still left-to-right. So it would follow:
  Q→K  Q→E  E→A  E→R

Always remember that there is a difference between the way we represent things to ourselves and how we actually lay data out in memory.

Hope this helps.
@jonnin
Graphs like this are one of the core data storage designs in CS. They are widely used in all kinds of spaces. Heck, programming languages are even designed around them (like Scheme).

Its ubiquity and utility are why people need to study it. I suspect our OP is studying in a language other than English, hence the odd (to us) choice of vocabulary.
@Duthomhas Thank you for your answer. I think the same, I just wasn't sure about it.

Yes.
 2) “Operators”? == edges?


The depth of B' is 1 and the depth of V' is 3, so based on
4) maximum( depth of B, depth of V )

the answer will be 3. In this case we travel from B' up to A' and, after traversing 3 levels down, we reach V'. But what if B' were connected to D'? Would it then be 2?

Are you sure about
 5) Remember, DFS takes nodes in order

I think the question did not mention anything regarding the order. I thought the minimum happens when D' is expanded first; then D' and K' will be visited, so I thought the answer is 2.
@jonnin Thank you for your answer.

I think this
 1) # of vertex + # of edges is the usual answer here.
is the time complexity of the algorithm, and it doesn't say how many vertices will be visited in the worst case.

It's the edges.
2) what are you calling an operator here?


Which one is correct, 4 or 5?
3) I think so. I call it 5, you call it 4


It's not homework. Actually, I haven't found "lowest cutoff" anywhere. Mostly, my confusion is with the questions and the way they're phrased, not the answers. I also don't know what "lowest cutoff" means.
4) ... what does the book or teacher say about lowest cutoff?


As I mentioned earlier, the questions are not formulated very well.
this exercise is like studying bubblesort: its how not to do things. I can't think of any reason to store data this way or to craft such an inefficient mess of a data store


I've shared the DFS code in case someone might find it useful later on.

#include <iostream>
#include <stack>
#include <unordered_map>
#include <vector>
#include <string>
#include <set>

using namespace std;

unordered_map<char, vector<char>> graph = {
    {'A', {'B', 'C', 'D'}},
    {'B', {'A', 'F', 'E'}},
    {'C', {'A', 'H', 'G'}},
    {'D', {'A', 'K', 'J', 'I'}},
    {'E', {'B', 'M', 'L'}},
    {'F', {'B', 'P', 'N'}},
    {'G', {'C', 'Q'}},
    {'H', {'C', 'R'}},
    {'I', {'D', 'S'}},
    {'J', {'D', 'T', 'U'}},
    {'K', {'D', 'V'}},
    {'L', {'E', 'W'}},
    {'M', {'E', 'Z', 'Y'}},
    {'N', {'F'}},
    {'P', {'F'}},
    {'Q', {'G'}},
    {'R', {'H'}},
    {'S', {'I'}},
    {'T', {'J'}},
    {'U', {'J'}},
    {'V', {'K'}},
    {'W', {'L'}},
    {'Y', {'M'}},
    {'Z', {'M'}}
};

int dfs(char start, char goal) {
    // Each stack entry is a full path; the last element is the frontier node.
    stack<vector<char>> s;
    unordered_map<char, bool> visited;
    set<char> vertices_visited;  // every vertex popped during the search

    s.push({ start });
    visited[start] = true;

    while (!s.empty()) {
        vector<char> path = s.top();
        s.pop();
        char node = path.back();

        vertices_visited.insert(node);

        if (node == goal) {
            cout << "Path found: ";
            for (size_t i = 0; i < path.size(); ++i) {
                if (i > 0) cout << " -> ";
                cout << path[i];
            }
            cout << '\n';

            // The -2 excludes the start and goal, to match the listing below.
            cout << "Number of vertices visited (excluding start and goal): " << vertices_visited.size() - 2 << endl;

            cout << "Vertices visited: ";
            for (char vertex : vertices_visited) {
                if (vertex != start && vertex != goal) {
                    cout << vertex << " ";
                }
            }
            cout << endl;
            return path.size() - 1;  // number of edges (operators) on the path
        }

        for (char adjacent : graph[node]) {
            if (!visited[adjacent]) {
                visited[adjacent] = true;
                vector<char> new_path = path;
                new_path.push_back(adjacent);
                s.push(new_path);
            }
        }
    }

    return -1;
}

int main() {
    char start = 'A';
    char goal = 'W';

    int path_edges = dfs(start, goal);
    if (path_edges != -1) {
        cout << "Number of operators (edges) on the path found by DFS: " << path_edges << endl;
    }
    else {
        cout << "Goal state not reachable from the initial state!" << endl;
    }

    return 0;
}


It answers question 2; with some modification, it'll answer the others too.
Cplusc wrote:
Are you sure about
5) Remember, DFS takes nodes in order

Yes. I have never seen a general-purpose DFS that takes nodes in any other order than the order in which edges are stored in each node.

Doing so would create more overhead than benefit.
@Duthomhas
Thank you for the answer.
Graphs like this are one of the core data storage designs in CS. They are widely used in all kinds of spaces.


I have never seen a completely linear / brute-force graph/tree in use. What is it offering over an unsorted vector? I believe you that it's been used, but is there some justification that offsets the complex design vs. the lackluster performance?
“Lackluster performance”? Compared to what?

Just a week or two ago I implemented a topological BFS over any two corresponding 3D meshes to update shape-based deformations in Blender.

https://blenderartists.org/t/join-as-shapes-by-topology-new-add-on-i-wrote/1512963

Runs wickedly quick, because the meshes themselves are simple, undirected graphs.


So... pick a context. (You cannot store a graph in an unsorted vector.)
I suppose what is bothering me here is the example is a binary tree.

Context... searching every node every time you want one. But if you are just iterating all of them, or you already know (memorized/stored) a path to it, it's not a problem.

I think you probably can force-fit a graph into a vector, but it's terrible. A vector of nodes, where each node holds the indices of what you can reach from there... but even setting it up is trouble, as you have to first insert all known nodes, then go back and add the connections after that. Eww :)
Ah, yes, because a binary tree is a specialization designed to speed up accesses to O(log n).

The trick, of course, is that the tree itself cannot have random structure — elements must be inserted in specific locations to maintain that access time.

If, however, you are modelling something like road traffic, then the graph representations become a whole lot harder to optimize without both adding complexity and reducing correctness for one or more potential uses — and making it more resistant to change when dealing with large systems.

Google maps, for example, has the following difficulty: I plug in an address, and it tells me I’ll be driving for a day — to a city three states over. But the address is for a (same-named) city right next to me, to a location ten minutes away. This seems to me a very obvious failure, and yet it happens. It can only happen when there is an error in the underlying graphs, in the attached data catalogues, or in the software examining them.

In any case, I know you (jonnin) are surely aware of all this... I’m just posting because... IDK, feeling wordy, I guess...
I appreciate all your words :) Thank you!

Graphs are actually one thing I have spent the least amount of time using. I spent 2 years on an idiotic idea with the TSP in college, but since then I have not used a 'real' graph in anything significant (many nodes or a complex problem). I have used a few trees, and most of those ended up being coded into arrays; an example in a rather old post of mine on translating Morse code shows the kinds of things we did long ago. So my graph theory is weak, and it's not something that leaps into my mind as "this problem could use a graph" as a first idea -- my school glossed over them (a couple of weeks on theory and the data structure is all I had), and my early jobs were on small, single-core machines where performance was king -- we treated linked lists, trees, graphs, etc. as last resorts and stuck mostly to lookups and arrays. Just the extra pointer storage would have been off-putting back then.
Heh, glad I’ve been of use to you. :O)

A large part of my degree was understanding graphs, and I kind of enjoyed learning what little managed to sink into my brain.

Very useful in Design & Analysis of Algorithms, too, which was probably the hardest class I took. I remember professor Shende noticing all our brains were fried and coming forward and sitting down against a desk and telling us something like, “Look, I know this stuff is hard. It took me a while to make sense of it too.” But it was totally worth it — my ability to reason about CS concepts has improved significantly since then. And I still consider myself something of a hack beginner. The guys & gals that live and breathe this stuff do truly amazing things.
Topic archived. No new replies allowed.