diff

This library provides methods for comparing two sequences.

The sequences could be seq[string] of words, or any other sequence providing the elements support == and hash().

To get a comparison, first call let diff = newDiff(a, b) where a and b are both sequences containing the same type of object. Then call diff.spans() to get a sequence of Spans which if followed would turn sequence a into sequence b. (See the tests/test.nim file for examples.)

Example:

let diff = newDiff(a, b) # a and b are sequences of the same type
for span in diff.spans(skipEqual = true):
  case span.tag
  of tagReplace: # handle replace (== delete + insert)
  of tagDelete: # handle delete
  of tagInsert: # handle insert
  of tagEqual: doAssert(false) # Should never occur

(The algorithm is a slightly simplified version of the one used by the Python difflib module's SequenceMatcher.)

See also diff on github. For other Nim code see FOSS.

Types

Match = tuple[aStart, bStart, length: int]
Tag = enum
  tagEqual = "equal", tagInsert = "insert", tagDelete = "delete", tagReplace = "replace"
Span = tuple[tag: Tag, aStart, aEnd, bStart, bEnd: int]
Diff[T] = object
  a: seq[T]
  b: seq[T]
  b2j: Table[T, seq[int]]

Procs

proc newDiff[T](a, b: seq[T]): Diff[T]

Creates a new Diff and computes the comparison data.

To get all the spans (equals, insertions, deletions, replacements) necessary to convert sequence a into b, use diff.spans().

To get all the matches (i.e., the positions and lengths) where a and b are the same, use diff.matches().

If you need both the matches and the spans, use diff.matches(), and then use spansForMatches().

proc newMatch(aStart, bStart, length: int): Match {...}{.raises: [], tags: [].}
Creates a new match: only public for testing purposes.
proc `==`(a, b: Span): bool {...}{.raises: [], tags: [].}
Compares spans: only public for testing purposes.
proc longestMatch[T](diff: Diff[T]; aStart, aEnd, bStart, bEnd: int): Match

Returns the longest Match between the two given sequences, within the given index ranges.

This is used internally, but may be useful, e.g., when called with say, diff.longest_match(0, len(a), 0, len(b)).

proc matches[T](diff: Diff[T]): seq[Match]

Returns every Match between the two sequences.

The differences are the spans between matches.

To get all the spans (equals, insertions, deletions, replacements) necessary to convert sequence a into b, use diff.spans().

proc newSpan(tag: Tag; aStart, aEnd, bStart, bEnd: int): Span {...}{.raises: [], tags: [].}
Creates a new span: only public for testing purposes.

Iterators

iterator spansForMatches(matches: seq[Match]; skipEqual = false): Span {...}{.raises: [],
    tags: [].}

Yields all the spans (equals, insertions, deletions, replacements) necessary to convert sequence a into b, given the precomputed matches. Drops any tagEqual spans if skipEqual is true.

Use this if you need both matches and spans, to avoid needlessly recomputing the matches, i.e., call diff.matches() to get the matches, and then this function for the spans.

If you don't need the matches, then use diff.spans().

iterator spans[T](diff: Diff[T]; skipEqual = false): Span

Yields all the spans (equals, insertions, deletions, replacements) necessary to convert sequence a into b. If skipEqual is true, spans don't contain tagEqual.

If you need both the matches and the spans, use diff.matches(), and then use spansForMatches().