...

texts

Total number of texts edited.

int

edits

Total number of edits.
The value of edits can be greater than the value of texts. For example, a text may have an initial translation (edit 1) followed by a revision (edit 2). The aggregationMode also influences this value.

int

Edit distance

Edit distances are calculated with the Levenshtein algorithm. For an example of how it is calculated, see: Text edits example
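
As an illustration of the Levenshtein calculation mentioned above, here is a minimal sketch in Python. The function name levenshtein is hypothetical, and the sketch assumes the standard unit-cost variant (insert, delete and substitute each cost 1); the system's actual implementation may differ in details such as tokenization or Unicode handling.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance between two strings (illustrative sketch)."""
    # Keep the shorter string in the inner loop so the row buffers stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]
```

With this sketch, changing an empty text to “abcd” gives a distance of 4, and changing “abcd” to “abcdef” gives a distance of 2.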

editDistance

The total normalized edit distance (“ED”) for all events aggregated in this row. The ED is a value between 0 (no edits at all) and 1 (completely reworked text). The formula is:

Code Block
editDistance = editDistanceSum / editDistanceSumLengths

decimal

editDistanceSum

Sum of all (non-normalized) edit distances. For example, if the initial translation is empty and a user changes it to “abcd”, this contributes 4. If the reviser then changes the text to “abcdef”, this adds another 2.

int

editDistanceSumLengths

Sum of text lengths of all edits. For each edit, the larger of the initial and edited text lengths is added. For example, if the text “ab” is changed to “abcd”, 4 is added, which is the maximum of the lengths 2 and 4.

int

editDistanceSumNormalized

For each individual edit, the system calculates the normalized ED: the edit distance of that edit divided by the maximum length of the initial and changed texts. This property is the sum of those values. To obtain the average normalized ED of all edits in this record, use the formula below. See also: Text edits example

Code Block
average edit distance = editDistanceSumNormalized / edits

decimal
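
To make the relationship between the edit distance properties concrete, the following Python sketch recomputes them for a short edit history. The function names and the input shape (a list of before/after text pairs) are assumptions made for illustration only; treating an edit between two empty texts as contributing 0 to the normalized sum is likewise an assumption, since the source does not cover that case.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (illustrative helper)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def aggregate_edits(edits):
    """Aggregate (before, after) text pairs into the properties described above."""
    distance_sum = 0        # editDistanceSum
    lengths_sum = 0         # editDistanceSumLengths
    normalized_sum = 0.0    # editDistanceSumNormalized
    for before, after in edits:
        distance = levenshtein(before, after)
        longest = max(len(before), len(after))
        distance_sum += distance
        lengths_sum += longest
        if longest:  # assumption: an empty-to-empty edit contributes 0
            normalized_sum += distance / longest
    return {
        "edits": len(edits),
        "editDistanceSum": distance_sum,
        "editDistanceSumLengths": lengths_sum,
        "editDistanceSumNormalized": normalized_sum,
        # editDistance = editDistanceSum / editDistanceSumLengths
        "editDistance": distance_sum / lengths_sum if lengths_sum else 0.0,
    }


# Initial translation (edit 1) followed by a revision (edit 2).
row = aggregate_edits([("", "abcd"), ("abcd", "abcdef")])
average_normalized = row["editDistanceSumNormalized"] / row["edits"]
```

For this history, editDistanceSum is 6 and editDistanceSumLengths is 10, so editDistance is 0.6. The average normalized ED is (1 + 2/6) / 2, roughly 0.67, which shows that the two measures weight long and short edits differently.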

Adjusted word counts

words

The number of source words adjusted by the normalized edit distance of the edited text.

The result is stored with two decimals and always rounded up.

The idea is simple: if the source text has 10 words and the translator had to write the translation from scratch, we count 10 words. If, on the other hand, the translator started from a pre-translation and changed 20% of it, we count only 2 words.

The percentage is given by the normalized edit distance. An ED of 0 means nothing was changed and we count nothing. An ED of 1 indicates a translation from scratch, so the full source word count applies. Any value in between reflects the proportion of effort required from the worker.

decimal
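
The adjustment described above can be sketched as follows. The function name adjusted_count is hypothetical, and the sketch assumes the adjustment is a plain multiplication of the count by the normalized ED, rounded up to two decimals; how the system applies this per edit or per aggregated row is not specified here.

```python
from decimal import Decimal, ROUND_CEILING


def adjusted_count(count: int, normalized_ed: Decimal) -> Decimal:
    """Scale a word or character count by the normalized ED, rounding up to two decimals."""
    if not Decimal(0) <= normalized_ed <= Decimal(1):
        raise ValueError("normalized ED must be between 0 and 1")
    raw = Decimal(count) * normalized_ed
    # ROUND_CEILING always rounds towards positive infinity, i.e. "always rounded up".
    return raw.quantize(Decimal("0.01"), rounding=ROUND_CEILING)


from_scratch = adjusted_count(10, Decimal("1"))       # full source count
light_post_edit = adjusted_count(10, Decimal("0.2"))  # 20% changes
untouched = adjusted_count(10, Decimal("0"))          # nothing counted
```

Decimal is used instead of float in this sketch because rounding up can magnify tiny binary representation errors into an extra hundredth.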

chars

Similar to words, but counting source characters instead of source words. See above.

decimal

wordsTarget

Same as words, except that the ED percentage is applied to the number of target words (after editing). This is useful, for example, if work is to be measured in terms of translated words/chars rather than source words/chars.

decimal

charsTarget

Similar to wordsTarget, but counting target characters instead of target words. See above.

decimal

Date range

Date information for all the edits aggregated in this record.

dateMin

UTC date of earliest edit.

datetime

dateMax

UTC date of latest edit.

datetime

Group by

Data is aggregated according to the “groupby” parameter. The following properties identify the group that this record represents.

did

The document id if we aggregate by document ID.

int?

uid

The editing user id if we aggregate by user ID.

int?

src

The segment’s source language if we aggregate by languages.

string?

trg

The target language of edits if we aggregate by languages.

string?

ed

The Last Editor (Enumeration) if we aggregate by it.

int?
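
The nullable types (int?, string?) above suggest that a group property is only filled when the data is grouped by it. The Python sketch below illustrates that reading, together with the dateMin and dateMax aggregation. The function name aggregate, the event dictionaries and their "date" key are hypothetical, and the actual values accepted by the “groupby” parameter are not shown here.

```python
from collections import defaultdict
from datetime import datetime, timezone

GROUP_PROPERTIES = ("did", "uid", "src", "trg", "ed")


def aggregate(events, group_fields):
    """Group edit events by the given fields and build one summary row per group."""
    groups = defaultdict(list)
    for event in events:
        key = tuple(event[field] for field in group_fields)
        groups[key].append(event)

    rows = []
    for key, members in groups.items():
        # Properties that are not part of the grouping stay empty (None).
        row = {name: None for name in GROUP_PROPERTIES}
        row.update(zip(group_fields, key))
        row["edits"] = len(members)
        row["dateMin"] = min(member["date"] for member in members)
        row["dateMax"] = max(member["date"] for member in members)
        rows.append(row)
    return rows


def utc(day: int) -> datetime:
    return datetime(2024, 1, day, tzinfo=timezone.utc)


events = [
    {"did": 1, "uid": 7, "src": "en", "trg": "de", "ed": 0, "date": utc(3)},
    {"did": 2, "uid": 8, "src": "en", "trg": "de", "ed": 0, "date": utc(1)},
    {"did": 2, "uid": 8, "src": "en", "trg": "fr", "ed": 1, "date": utc(2)},
]
rows_by_language = aggregate(events, ("src", "trg"))
```

Grouping these three sample events by language yields two rows, one for en/de with two edits and one for en/fr with one edit; did, uid and ed remain empty in both.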

...