Table of Contents

Populate Diff Tree

This algorithm is the powerhouse of project proof, and is one of the most complicated and hairy things I've ever worked on.

The purpose of this algorithm is to add information to a tree about what will change in another tree. This includes information about creation and deletion of nodes, as well as movement.

We are given two trees - an original tree and a modified tree. The original tree will have the new information added. In the image below, the left tree is original, and the right tree is modified.

Note that both trees are linked. This means that we have decided that these two nodes are the same. The algorithm for doing this is pretty simple, but for now, you can imagine that two linked nodes are linked if they contain the same data.

(Note that I haven't drawn the links between the trees, so as to prevent clutter. Instead, nodes with the same letter are considered to be linked.)

Creation and Deletion - Level One

We proceed level by level, then parent by parent, from the top of the tree.

In this very simple example tree (no moving up or down levels!) we can ignore parent and root nodes, to make it a bit easier to tell what's going on. You might notice that there are links between nodes. This is because the children of a particular node are stored as a doubly linked list in memory, for reasons that will become relevant later.

Scanning for misalignments

The first thing we do is scan for misalignments. A misalignment is when the order of matched nodes differs. For instance, suppose that in our original tree we had “A→x→B→y→C”, and in our modified tree we had “A→w→k→B→s→C”. In this instance, the order of matched nodes is “ABC” in both trees, so there is no misalignment. However, if the modified tree had “A→w→k→C→s→B”, in the original we would have “ABC” and in the modified we'd have “ACB”, so there would be a misalignment.

Checking for a misalignment is easy. We start with a matched node in the original tree.

Then, we scan forward to the next matched node.

We repeat the process for the match of the starting node in the modified tree

Finally, we check if these two nodes match. They do, so there is no mismatch here.

We do this for each pair of matched nodes. On this level, there are no mismatched nodes, but there will be in the next level, under node F.

Scanning for, and handling, creation

Next, we check if we need to add new nodes. We start in the modified tree, scanning for any nodes that are unmatched.

B2 matches with B1, so we are fine.

C2 does not have a match!

So, we create a new node, C1. But where to put it? Since this is a doubly linked list, we need a node before and a node after. This is pretty simple. We begin by scanning backwards from C2 to the previous matched node.

We move to the node that matches this node, B1. This will be the previous node, before C1.

To find the node that comes after, we just scan forward to the next node, which is D1.

With that, we have all the information we need to add the new node to the original.

There are no more unmatched nodes in the modified tree.

Scanning for, and handling, deletion

Handling deletions is really simple.

We start at the first node in the original tree, and scan for unmatched nodes.

When we find one, we mark it as deleted.

That's literally it - we don't need to actually delete any nodes, we just need a node telling us that we deleted a node.