log N time complexity of a graph

( Graph / Union find)

Hi guys,

I'm studying a course on algorithms,

the proposition is prove that any node x's depth is at most log(base 2) N

let's take the following tree

1
2
3
4
5
6
7
8
9
10
11
12
13
     
       1
      /|\  \   // tree starting with root 8 is merged with 
     2 3 5  \  // tree with root 1 so t1(root 8) = 5 items + t2(root 1) =               
    /  | |   \  // 7 items = 7 + 5 = 12 items
      / / \   \
     4 6  7    8
              /| \
             9 10 11
                    \ 
                    12
   
 


sorry if my diagram isn't the best, 1's children are 2,3,5 and 8 etc etc.

this quote is taken from a mentor on the coursera help section of the course this is how he explains it


For a moment, let us step aside from UnionFind concept. Imagine you had a total of N=16 items. How many times can we halve these items until we reach a single item ?

1 : 16 / 2 = 8
2 : 8 / 2 = 4
3 : 4 / 2 = 2
4 : 2 / 2 = 1

From the abstract above, we see that we can halve these 16 items exactly 4 times, or the lg 16 = 4. This is the very definition of logarithm with base 2.

In video materials, reverse argument is applied. How many times can we double items, starting from 1, until we reach N. As shown, we can only do this lgN times.



Ok so now let us go back to our tree, our tree has 12 items altogether, but when we merge tree 1 with tree 2 we know for an indisputable fact that tree 2 must be the same size or bigger than tree 1, so since tree 1 has 5 items tree 2 must have at least 10 items so N = 10.

ok so now we know N = 10, so according to this



In video materials, reverse argument is applied. How many times can we double items, starting from 1, until we reach N. As shown, we can only do this lgN times.



we can do the following

1 : 1 * 2 = 2
2 : 2 * 2 = 4
3 : 4 * 2 = 8
4 : 8 * 2 = 16

but can you see the problem, 16 is > than N (10), so what do we do? we clearly can't keep doubling until we hit 10 because if we keep doubling we will never hit 10, the closest number we get to 10 is 8.

so my main concern is what happens in this situation?

thanks
Last edited on
Your title mentions graph but you seem to be only dealing with trees (a subset of graphs). Also, you mention log base 2 but the diagram isn't a binary tree so I don't see how base 2 log applies.

but when we merge tree 1 with tree 2 we know for an indisputable fact that tree 2 must be the same size or bigger than tree 1

Why?

the proposition is prove that any node x's depth is at most log(base 2) N

This would be true in a perfectly balanced binary tree.

                6
      3                  9
   2     4            8       11 
1           5      7       10    12

ceil(lg(12)) = 4 # using lg for base 2 log
my bad I should have put tree instead

but when we merge tree 1 with tree 2 we know for an indisputable fact that tree 2 must be the same size or bigger than tree 1

Why?


because we merge the larger tree with the smaller tree, in this case tree 1 has 5 elements and tree 2 has 7 elements, therefore we know that if tree tree 1 has 5 elements when we merge it the size of merged tree is at least twice the size as it was before we merged.

I'm guessing the base 2 comes from when merge trees as we know that the now merged tree must be twice the size in respect to what it was before.

?
Last edited on
So they aren't binary trees but general trees?
And what if tree 1 had 5 nodes tree 2 had a thousand nodes?
Then simply saying the resultant tree must be at least 10 nodes doesn't say much.
Last edited on
yeah just general trees

very true I never even thought of that but still the depth of tree one would still only increase by one

let me copy and paste the whole comment from the mentor maybe that will further elaborate on my question


Propositon:

Depth of node X increases by 1 when tree T1 containing x is merged into another tree T2.
Consider following two trees:

The very idea behind the algorithm is to merge smaller tree into larger tree. Why ? Well, all nodes in tree getting merged will now receive a new root, effectively increasing cost to reach them by 1.

As you can see, before merging, we only needed one hop to reach 5 from 4. However, after merging sub-trees of 1 and 4, depth of nodes 4 and 5 increased by 1, now we need 2 jumps to reach node 5.

Propositon:

The size of the tree containing x at least doubles since | T 2 | ≥ | T 1 |
When two trees are merged together, the size of a tree containing X will at least double with respect to size of the tree before merging. This one should be obvious, we are merging smaller tree into larger which has at least as many items as smaller tree.

Propositon:

Size of tree containing x can double at most lg N times. Why?
You can find answer to the why in video materials but not in slides
For a moment, let us step aside from UnionFind concept. Imagine you had a total of N=16 items. How many times can we halve these items until we reach a single item ?


From the image above, we see that we can halve these 16 items exactly 4 times, or the lg 16 = 4. This is the very definition of logarithm with base 2.

In video materials, reverse argument is applied. How many times can we double items, starting from 1, until we reach N. As shown, we can only do this lgN times.

Putting it together

Initially, each node is a root for itself. When we merge a node with another, we will increase depth of a node by 1 in merged tree. Since we can only merge a single node lgN times, we can also increase size only up to lgN times.

Hope this helps



and maybe the code would help too

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58

class UFF{

  public:
      int size;
      int* id;
      int* sz;


      UFF(int size):size(size){

         id = new int[size];
         sz = new int[size];

         // set each nodes parent to itself
         for(int i = 0; i < size; ++i){
            id[i] = i;
            sz[i] = 1;
         }


      }

      int root(int a){

         int i = id[a];

         while(i != id[i])
            i = id[i];

         return i;
      }

      bool connected(int a, int b){

         return root(a) == root(b);
      }

      void unionNodes(int a,int b){

         int rootA = root(a);
         int rootB = root(b);
         
         if(rootA == rootB)
            return;
         
         if(sz[rootA] < sz[rootB]){
            
            id[rootA] = rootB;
            sz[rootB] += sz[rootA];
         }else{
         
           id[rootB] = rootA;
           sz[rootA] += sz[rootB];
         }
      }
};
Last edited on
the proposition is prove that any node x's depth is at most log(base 2) N

the worst case of a tree is a linked list, and the depth of a linked list is going to be N. So something about the proof stinks with what you stated, right from the get go. Is there some sort of additional criteria you have for your tree that prevents it from degradation into a list??

for a balanced binary tree, the statement is true, but your tree isn't binary nor balanced.

as far as logs go, it works in the math, but you have to add 1 or round up.
I mean, if you have a balanced bin tree, and 10 nodes, the depth of the deepest is lg(10)+1. the lg(10) is 3.3... and the depth is 4 if counting root as depth 1 or 3 if counting root as depth 0.

1
2
3
4
1    
23          
4567       
8 9 10 …  


if you have an unbalanced tree, all bets are off
1
2
3
4
5
6
7 8
9 10 //under 7

for a generic tree, the depth for a balanced tree will be tied to # of children.
try it.. your example, there are 3 children.
1
234
567 89,10 11,12,13
14,15,16, …

15 is at depth of 4. How do you know that?
there are 3 to the zero nodes in the first level.
there are 3 to the 1 nodes in the second level.
there are 3 to the 2 nodes in the third level.

15 is between 9 and 27. so if there are 15 nodes, the depth is more than 2, and less than 3. there are no fractional levels, so its 3, or +1 if calling root 1 instead of 0.
Last edited on
but this tree would unlikely become a linked list every node that is added will just add one to the depth, let's say x is 12 or the very last node

1
2
3
4
5
6
7
8
9
10
11
12

        1
       /|\   \   // tree starting with root 8 is merged with tree starting with root 1
     2 3 5  \                
        |  |   \  
       /  / \   \
      4 6  7   8
                 / | \
                9 10 11
                          \ 
                          12


lets say we have t2 which contains 1,2,3,4,5,6,7 and t1 which contains 8,9,10,11,12, when we merge t2 into t1 x's(12) hops to reach the root will increment by one.

the root node of the smaller tree will always merge to the root node of the bigger tree, I fail to see where in a worse case scenario the worst possible time complexity would be O(N)

the above algorithm would never allow for, unless my code is wrong in may very well be the case

if you have an unbalanced tree, all bets are off
1
2
3
4
5
6
7 8
9 10 //under 7


*edit sorry about the layout for some reason the formatting in this post won't match the formatting in which I want the tree to be layed out, check the first post for diagram as it's the same tree *
Last edited on
the root node of the smaller tree will always merge to the root node of the bigger tree, I fail to see where in a worse case scenario the worst possible time complexity would be O(N)

ah, you want your specific algorithm, not proof of a generic trees depth. Different problem.

if you hang a second tree off the root of the first tree, the max depth is just the max depth of the 2 individual trees + 1. That falls back to how THEY were built, which could degenerate into a list or not. I mean, if tree 1 is 5 deep, and tree 2 is 5 deep, its 6 deep after that kind of merge, because the 5 deep second tree is one deeper now by hanging off the root, right?

you need all the info to prove anything formally. You need how the trees are build originally AND how they are merged. With that, you can figure out the max depth. In some cases, that could be #of nodes, or lg(num nodes), or sum of the 2 depths, or various other answers. (where lg may become a different log base, as well).

if you reinsert into a balanced tree, from one into the other, it will have a max depth base off # of children as I showed in the 3 example (which I was editing as you were typing so its not what you read the first time now). But that isn't what you did, you just did a pointer hangoff merge.
Last edited on
oh no sorry my bad not generic trees, the actual tree algorithm I posted ( Union find I think is the algorithm name )
Last edited on
what you gave yields max depth of the 2 originals + 1.
Depth of node X increases by 1 when tree T1 containing x is merged into another tree T2.
Sheesh. First, there are several things that seem to be implied here:
- The trees are binary
- By "merged", they mean merged in such a way that the result is balanced.
- They mean worst case? Best case?

Actually the proposition isn't even true! Consider:
Tree A          Tree B
8                  4
                  / \
                 2   6
                /\   /\
                1 3 5  7


Merged and balanced:
                  4
                 / \
                2   6
               /\   /\
               1 3 5  7
                       \
                        8


So The depth of X=8 changes from 1 to 4 (or 0 to 3 depending on how you compute depth).
the trees aren't binary in the lecture video he shows a diagram of the trees and parents contain a lot more than just 2 children

I'm still not sure why he uses log to the base of 2, the lecturer says that when the depth of x( node contained in a tree ) increases the size of the new merged tree at least doubles

I'll copy verbatim what's on the lecture slide when he explains it

it says



RUNNING TIME
find : takes time proportional to depth of p and q
union : takes constant time given roots

PROPOSITION - Depth of any node x is at most lg N ( log base 2(N) )

PROOF : when does depth of x increase?
Increase by 1 when tree t1 containing x is merged with another tree t2
- The size of the tree containing x at least doubles since t2 >= t1
- Size of tree containing x can double at most lg N times. why?



going back to the example in the first post lets say we have two trees t1 with root node 8 and t2 with root node 1, since t1 has 5 elements we know that t2 will have at least 10 elements so N = 10, when we do log base 2(10) we get 3, 3 is the most hops it will take to get from x to the root.

I've sketched multiple trees on paper and this proof seems to hold true, but what I don't get is the second line of the proof Size of tree containing x can double at most lg N times. why?
Last edited on
Don't know if I'm missing something but "Size of tree containing x can double at most lg N times. why?" just doesn't make sense to me

lets say we have a tree t2 with 8 children and we have a separate tree with just one node, since t2 is > than t1 we will merge t1 into t2, so N was 1 but we know that t2 must at least contain one or more nodes so the merged tree doubles ( at least ) so N = 2

so log base 2 (N) or log base 2 (2) = 1?? :s

but that statement is false because if t1 can only double once at most the number of nodes in the merged tree should be 2 right?

but instead it doubles at least 4 times

just doesn't make sense to me :/ am I misinterpreting that statement?

he says and I quote " because if you start with 1 and double log N times, you get N and there is only N nodes in the tree"

ok another example so we start with one item we merge two trees, both trees obviously contain one item

1
2
3
   5
   |
   6 


we added 6, the tree must double so N now = 2, log base 2 (2) = 1, so it can double only once? that is true but going back to the former example with t2 starting with 8 children the tree can double way more than once even though log base 2 (2) = 1.
Last edited on
I do understand why log base 2 is used in a balance BST that has all the maximum nodes on each level, lets say we have N = 7 nodes

this is 2^0 + 2^1 + 2^2 = 1 + 2 + 4 = 7 = (2^3)-1

so a find operation on this balanced BST will take log(N)

but as mentioned this isn't a BST, this a general tree in the more specific disjoint-set data structure

we know that when the depth increases by one the tree that contains x( the node) will at least double AKA when a smaller tree is merged with a bigger tree we know that the size of the merged tree will at least double.

but I still fail to understand this statement "Size of tree containing x can double at most lg N times. why?"

because we could have a tree with 1000 elements merged with a smaller tree of just 3 elements, the tree that is going to merge with the larger tree doesn't know about the size of the bigger tree, all it knows for certain is that it is at least the same size of itself so with this we double 3 and get 6. but log(6) = 2, so we are told that the size of the tree x can double at most log base 2 N times? as mentioned log(6) = 2 so 3 * 2 = 6, 6 * 2 = 12.

but the bigger tree has much more elements than 12??

anyone possibly give their theory on why that statement holds true?


thanks
Last edited on
I'm still not sure why he uses log to the base of 2, the lecturer says that when the depth of x( node contained in a tree ) increases the size of the new merged tree at least doubles.

this one is simple :) the smallest base you can use in a proper tree is 2 (1 is a linked list of one child each, and 0 is just 1 data element no children). it could triple, but it at least doubles. Every time you add 1 depth to a tree, its maximum capacity (without increase of depth again) is increased this way. so if you have 5 children and have a root node, 5 to the 0 is 1 node. if you up the depth, its new max capacity is 6: add 5 to the 1 power. increase depth again, and you added 25 (more than double, its base 5 here). Adding depth to unbalanced trees is meaningless for talking about such things, see the linked list example again.

so yes, its true, adding +1 depth at least doubles capacity. the capacity of a binary tree is just powers of 2, 1,2,4,8,16 ... anything else, is more than double, 1,3,9,27, 1,4,16, .. 1,5,25,... etc ... all more than double, and you can formally prove this.

the other one, not sure what you have defined 'size' of a tree to be. but doubling things is huge. lg 1000 is ~1024 is 10 doubles. 3 doubled 10 times is 6,12,24,48,~100,200,400,800,1600, 3200
I am not going to do proofs here. I did my time :) But just running the numbers tells us that much ^^ so now its on you to show that it is generally true. Unless N isnt 1000, not sure what you think N should be in your statement. Be careful with log vs lg. Not at all the same.
Last edited on
Thanks Jonnin, I think I'm following

this one is simple :) the smallest base you can use in a proper tree is 2 (1 is a linked list of one child each, and 0 is just 1 data element no children). it could triple, but it at least doubles. Every time you add 1 depth to a tree, its maximum capacity (without increase of depth again) is increased this way. so if you have 5 children and have a root node, 5 to the 0 is 1 node. if you up the depth, its new max capacity is 6: add 5 to the 1 power. increase depth again, and you added 25 (more than double, its base 5 here). Adding depth to unbalanced trees is meaningless for talking about such things, see the linked list example again.


I'll try repeat what I think I learned, so if you start with one node and it has five children the capacity is 6, if you add another depth to this tree you get 5^0 + 5 ^1 + 5^2 or (5^3)-1, which will give us a capacity of 1 + 5 + 25, it increases five fold. is that the correct grasp on it?

I think I'm understanding a little but still failing to piece it altogether, I'm not sure why the lecturer still uses log base 2, I sketched it out on a piece of paper, I understand that if we had a binary tree and we wanted to get the depth of the tree we would do lg(n+1) this would give us the number of levels of the tree,

now lets say we had a balanced tree that each node had 5 children ( until the external nodes obviously ) we would have log base 5(n + 1) levels but it becomes a little fuzzy to me when we are using a tree each node can have multiple children,

I drew out multiple trees and when I double the number N, lets say we merge a small tree containing 3 into a bigger tree which has 7, all I know is since the size of the tree at least doubles I do log(6) and I get 2 hops which is correct

I think I'm almost there but what am I failing to piece together? I feel like I'm so close.
Last edited on
you have the 5s right.

where is he using log2, what for, and what does log2 even represent (in math terms, but in English, what is it?). (they are the same question....)

if you have 100 items... and a 5 child tree.. and log5 of 100 is 2.8
you know the depth though. 1+5+25 is 31 + another 125 is 156 items
the first 3 levels are totally full, taking 31 items, so its got 3 full levels and a partial level, so 100 is pushing it to 4 levels deep, or counting from 0, its 3 levels deep. I think you have this down as well.

keeping to a 5 child tree, 3 and 7 merged by inserting so the result is balanced...
you have 10 nodes total after merge. follow what numbers mean what to get it.
the 3 tree has a depth of 2. root, and 2 kids, but its *capacity* is 6.
you can add 3 nodes without adding a level, as that would reach 6. if you add 4 or more nodes the depth increases. when the depth increases, it *at least* doubles (the capacity). so we know the new tree will have 10 total items but a capacity of 31 items.

so, hopefully what you are seeing here is that if you add merge a tree to another and the second is >= to the first, then the result will be either the same capacity or at least twice as large. If that is what he is saying, its true.
Take that 10 item tree and add 11, and capacity stays the same, ok. Add 22 to 10, and it increases, and more than doubles again.

If he is saying something other than 'the capacity stays the same or at least doubles when a tree >= the first tree is merged in' then I don't know either. Could be he was careless what he said, or you did not repeat some critical part of the problem or I have misread something. Could be he was talking binary trees and you thought he meant N. You may have noted that binary trees have some extra properties and are a little easier to show some rules for.


Last edited on
Let's start by summarizing the tree structure and the merge algorithm:

These are N-ary trees, meaning that each node can have any number of children.

As shown by the code and the diagram, when you merge trees, the smaller tree becomes a child of the root of the larger tree. I think this threw all of us off because it's a strange way to merge trees.

Propositon 1:

Depth of node X increases by 1 when tree T1 containing x is merged into another tree T2.

Suppose we merge two trees T1 and T2, whose sizes are N1 and N2. Let's say that T1 is the smaller one, so T1 becomes a child of T2's root.

Let's called the merged tree T3. In T3, the depth of every node that was in T1 has increased by 1 because we added T1 as a child of T2's root.

Propositon 2:

The size of the tree containing x at least doubles since | T 2 | ≥ | T 1 |

This is poorly worded. The "tree containing x" is T1 before the merge and T3 after. Why T1 before the merge? Because Proposition 1 tells us that. Also |T1| and |T2| are the sizes of T1 and T2, what I'm calling N1 and N2.

So the proposition is that N3 >= 2*N1. Here's the proof:
N3 = N1 + N2
N1 <= N2          (because we merge the smaller into the larger)
N1+N1 <= N1 + N2  (add N1 to both sides)
2*N1 <= N3   (N3 is at least twice the size of N1)


Propositon 3

Size of tree containing x can double at most lg N times.

Another poorly worded proposition. :(. They're saying that you start with a bunch of trees. One tree contains x. You merge the trees, one after another and the resulting tree has N items. The proposition is that the resulting tree doubled lg(N) times at most.

Suppose you start with tree T1 that contains x. That tree's size is N1. Now merge it a bunch of times with other trees. The resulting tree has N nodes. The proposition says that the N1 doubled at most lg(N) times. Or:
N <= N1*2lg(N)
N <= N1*N
1 <= N1


And of course N1 <= 1 because N1 contains at least 1 item (x).

Honestly this last proof doesn't feel right but I can't see where I botched the math.

Does this answer your questions? If not, which proposition are you having trouble with?
So I think I misinterpreted with N is supposed to be, I was under the impression that size = N, so when a tree is merged with a larger tree N doubles so if N was 2 No is now 4..... that was wrong,

N is the number of elements in the UF data structure not the actual size of trees, many trees can exist but the total number of elements in the universe is N, so I was wondering what does the lecturer mean "Size of tree containing x can double at most lg N times. why?"

By this statement he means that lets say we have a UF object with 10 elements [0-9] N = 10, the size of a tree if it starts off at 1 can only double as much as lg (10) times which is 3.

so we start off with 1 element : size = 1, depth = 0

we merge it with another tree with one : size = 2, depth = 1

we merge it with another bigger tree double it's size : size = 4, depth = 2

now we merge it again with another bigger tree twice it's size : size = 8, depth = 2.

We can't add any more depths to the tree because if we do we will have to double the tree, and this isn't possible as we only have 10 nodes at our disposal.

We know that the max depth of that tree will be at most log(N), it could be one but I emphasise "most", because we could have a tree with root node 0 and it could have 9 children that are all leaf/external nodes and then the depth will only be 1, but we have to take into account the worst possible depth.

@ Jonnin, that makes sense,I think it was my misinterpretation rather than the lecturers.

@ Dhayden I think I'm on the same track, I thought that N was the size of the tree rather than the number of elements in the universe.
I think N is the size the of the tree. the problem is that they're sloppy about which tree they're referring to when they say "N".
Topic archived. No new replies allowed.