This is an old revision of the document!
Proof
Proof is a tool for generating and viewing a semantic diff from two sources.
Semantic vs Pedantic Diffs
Software Engineers spend a large amount of time looking at diffs. We do this when we are…
- Reviewing code
- Getting an update on what has changed in code
- Remembering what we did in code
The first point is the most important. Human error is inevitable but costly, and code reviews reduce the chance of human errors having costly consequences. This is especially true in the medical software world, where a bug can mean the difference between life and death.
There are two ways to look at how a piece of source code has changed. The first way is to compare the text of the previous source with the text of the new source. I call this a pedantic diff. This is the way practically all diff tools work in the modern day. The reason it is so widespread is:
- It is easy. Implementing a pedantic diff is pretty much trivial. There are some ways to improve the presentation quality of the diff using substrings, but this has been explored and described in depth.
- It is language independent. Any piece of text can be diffed.
The biggest problem with this method is that source code is not text. Yes, it is written and displayed as text, but it is governed by a tree structure that represents how the program will function. This tree, known as an Abstract Syntax Tree, is the truest representation of the source code. It represents the semantic meaning of the code, which is why I call a diff between two ASTs a semantic diff. Semantic Diffs have the following advantages:
- Unimportant changes to the source code are ignored, which allows reviewers to focus on the meaning of the code.
- More meaningful information can be represented. Statements can be broken down to reveal the specific parts that changed. Moved code can be tracked through files.
- Insights can be more effectively gathered. New variables can be easily tracked, for instance.