Text edits example

Text edits are based on the Levenshtein algorithm. It counts the minimum number of single-character edits (insertions, deletions or substitutions) required to transform an initial text into the changed text. For example, if the new text only adds 10 characters to the initial text, the edit distance is 10. For a more precise and complete description see the Wikipedia page: Levenshtein algorithm.
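
As a rough illustration of the idea, the following Python sketch computes a Levenshtein distance with the standard dynamic-programming approach. It is not Wordbee's implementation, and the function name levenshtein is chosen here purely for illustration.

```python
def levenshtein(initial: str, updated: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    # previous[j] holds the distance between the already processed prefix
    # of `initial` and the first j characters of `updated`.
    previous = list(range(len(updated) + 1))
    for i, a in enumerate(initial, start=1):
        current = [i]  # turning i characters into an empty text costs i deletions
        for j, b in enumerate(updated, start=1):
            cost = 0 if a == b else 1
            current.append(min(
                previous[j] + 1,         # delete a
                current[j - 1] + 1,      # insert b
                previous[j - 1] + cost,  # substitute a with b (free if equal)
            ))
        previous = current
    return previous[-1]
```

For instance, levenshtein("abcd", "abcdef") returns 2, because two characters have to be inserted.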

The following example demonstrates how the edit distance properties are calculated in the Text edits report.

| Segment | Initial text | Updated text | Edit distance (ED) | Max length (L) | Normalized ED |
|---|---|---|---|---|---|
| 1 | (empty) | abcd | 4 chars | 4 chars (maximum of 0 and 4) | 1.00 (4 divided by 4) |
| 2 | abcd | abcdef | 2 chars | 6 chars (maximum of 4 and 6) | 0.34 (2 divided by 6, rounded up) |
| 3 | abcd | abcd | 0 chars | 4 chars (maximum of 4 and 4) | 0.00 (0 divided by 4) |
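
The per-segment columns can be reproduced with a short Python sketch. It assumes that L is the longer of the two text lengths and that the normalized value is ED divided by L, rounded up to the 2nd decimal. The helper names (levenshtein, round_up_2, segment_metrics) and the handling of two empty texts are assumptions made for this illustration, not part of the Wordbee product.

```python
import math
from fractions import Fraction


def levenshtein(a: str, b: str) -> int:
    # Standard dynamic-programming Levenshtein distance, computed row by row.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def round_up_2(value: Fraction) -> float:
    # Round up (ceiling) to the 2nd decimal; exact fractions avoid
    # floating-point surprises before the ceiling is taken.
    return math.ceil(value * 100) / 100


def segment_metrics(initial: str, updated: str) -> tuple[int, int, float]:
    ed = levenshtein(initial, updated)
    max_len = max(len(initial), len(updated))
    # Assumption: two empty texts yield a normalized distance of 0.
    norm = round_up_2(Fraction(ed, max_len)) if max_len else 0.0
    return ed, max_len, norm


# The three segments of the example: (initial text, updated text).
rows = [("", "abcd"), ("abcd", "abcdef"), ("abcd", "abcd")]
for number, (initial, updated) in enumerate(rows, start=1):
    print(number, segment_metrics(initial, updated))
```

Each printed tuple holds ED, L and the normalized ED for one segment, in that order.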

Given these figures, the totals across all changes are:

  • editDistanceSumLength = 14: the sum of all L values, 4 + 6 + 4.

  • editDistanceSum = 6: the sum of all ED values, 4 + 2 + 0.

  • editDistance = 0.43: editDistanceSum divided by editDistanceSumLength (6 divided by 14), rounded up to the 2nd decimal.

  • editDistanceSumNorm = 1.34: the sum of the individual ED / L fractions, (4 / 4) + (2 / 6) + (0 / 4), rounded up to the 2nd decimal.

Note that normalized edit distances are always rounded up to the 2nd decimal.
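
The aggregation can be sketched in Python as follows. The input is the list of (ED, L) pairs from the example table. The function names totals and round_up_2 are hypothetical, and the sketch assumes that editDistanceSumNorm is summed from the exact fractions and rounded up only once at the end.

```python
import math
from fractions import Fraction

# (ED, L) pairs for segments 1 to 3, taken from the example table.
segments = [(4, 4), (2, 6), (0, 4)]


def round_up_2(value: Fraction) -> float:
    # Round up (ceiling) to the 2nd decimal using exact arithmetic.
    return math.ceil(value * 100) / 100


def totals(pairs):
    sum_ed = sum(ed for ed, _ in pairs)
    sum_len = sum(length for _, length in pairs)
    # Assumption: segments with L = 0 contribute nothing to the normalized sum.
    sum_norm = sum(
        (Fraction(ed, length) for ed, length in pairs if length),
        Fraction(0),
    )
    return {
        "editDistanceSum": sum_ed,
        "editDistanceSumLength": sum_len,
        "editDistance": round_up_2(Fraction(sum_ed, sum_len)) if sum_len else 0.0,
        "editDistanceSumNorm": round_up_2(sum_norm),
    }


print(totals(segments))
```

Working with Fraction keeps the division exact until the final rounding step, so a value such as 2 / 6 is rounded up from 0.333... rather than from a binary floating-point approximation.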

 

Copyright Wordbee - Buzzin' Outside the Box since 2008