QA Categories

The following QA Rules may be customized in Wordbee Translator to help you monitor the quality of your translations. During translation, the QA Checks will automatically flag the errors in the target segments. It is possible to create QA profiles with different sets of rules or alternative configurations.

 

List of rules and options

QA Profile Rule

Description

QA Profile Rule

Description

Character Classes

This rule checks translations for prohibited characters or classes of characters. For example, you might want to flag any Arabic letters in an English translation. You may select one of the provided predefined categories or fill in your own list of prohibited characters.



Date and Number Check

This rule checks dates and numbers in the translation. It verifies that dates and numbers in the source text are well present in the translation and that (optionally) these are properly localized.

The rule first looks for all dates and numbers including monetary amounts in the source text. It then scans the translation and verifies that all these dates and numbers are included. Secondly, you may optionally have the system verify if dates and numbers are localized according to the formats permitted by a region (language plus regional specifier), applying either source or target language formatting.

In order to check the date formatting, dates need to include day, month and year to be considered as date-strings.

Enclosed Alphanumerics

This rule checks that 'enclosed' alphanumeric characters (such as ③ ⑵ ⒋ ⒝ Ⓔ ⓭ ⓹) match in the source text and the translation. If any 'enclosed' alphanumeric characters are found, the rule expects that the same characters show up in the source text and the translation. The enclosed characters are in the Unicode range of 0x2460 to 0x24FF.

Examples Include:

  •  'Chapter ⓵ and ⓶.' vs. 'Chapitre ⓵ et ⓶.' : OK

  •  'Chapter ⓵ and ⓶.' vs. 'Chapitre ⓵ et ⓾.' : NOT OK

  •  'Chapter ⓵ and ⓶.' vs. 'Chapitre ⒈ et ⒉' : NOT OK - The numeric value matches but not the characters itself.

  •  '⒜ Introduction.' vs. '(a) Introduction.' : NOT OK - The source is a single character, the translation contains letter 'a' with brackets.

Forbidden Terms

This rule checks that no forbidden words or terms appear in a translation. These terms or words are defined within the rule configuration and used during the QA check to flag any occurence in the translation.

For example, you can add words with a negative connotation or terms not to be employed when working for a specific client. There are two lists, one finds correspondances that match exactly and the other disregards whitespace and punctuation differences.

You can enter more than 600 terms into a single profile

Markup

This rule finds missing, superfluous, or badly placed formatting tags (markup) in the translated text. It is very important that any markup present in the source text is also placed in the translation. Otherwise, the translated document may not be formatted correctly, lack contents or, in the worst case, cannot be opened at all.

The QA Markup rule has been enhanced to verify if there are duplicated tags in the target segments. The QA Check will flag the markup errors in the Translation Editor.

Pretranslations - Unedited

This rule finds unedited pretranslations and optionally, disregards those that have explicitely been validated with a green status. In many translation workflows users are asked to explicitely flag pretranslated texts with a green status (to validate a translation).

This rule may be used to find unedited pretranslations that have not been flagged green due to the user forgetting to validate them or the pretranslations not yet being looked at.

To use this rule, first tick all categories of pretranslation you want to check within the options and then make certain to choose the option for excluding translations in green status.

Punctuation - Doubles

This rule looks for punctuation signs occuring multiple times in sequence, such as ',,,' or '.?'. to ensure that punctuation is not doubled. Such sequences may point to potential errors in the translation. Whitespaces between punctuation are disregarded.

Examples:

  •  'Hello world..' : NOT OK - Two fullstops

  •  'Hello world:.' : NOT OK - A sequence of two punctuation signs

  •  'Hello,, world.' : NOT OK - Two commas

  •  'Hello,  , world.' : NOT OK - Two commas, even when separated by a whitespace

You can optionally configure the rule to accept sequences if these exist in the source text.

Examples:

  • 'Hello world :-)' vs 'Hello world :-)' : OK

  • 'Hello world :-)' vs 'Hello world :-(' : NOT OK - Sequence does not match

Punctuation - Ending

This rule checks if the punctuation at the end of a segment is the same in the source text and the translation. For example, a source text ending with a fullstop expects the translation to end with a full stop. The rule checks for basic punctuation signs such as . , ; ! ? : as well as any language specific signs.

Examples:

  •  'Hello world.' vs. 'Hello world.' : OK

  •  'Hello world.' vs. 'Hello world;' : NOT OK - Differing punctuation

  •  'Hello world.' vs. 'Hello world.   ' : OK - Trailing whites paces are disregarded

  •  'Hello world.' vs. 'Hello world..' : NOT OK - The translation has two full stops vs. one in the source

  •  'Hello world...' vs. 'Hello world...' : OK

Punctuation signs must match up exactly. However, white spaces at the end are disregarded. You will need to use a spacing related rule to find whitespace differences.

Regular expressions - Compare source & translation

The system can check whether a word/phrase/code with a defined pattern in the source text has been copied unchanged to the translation. You might, for example, want to make sure that any product part numbers with the pattern ABC-1234 in the source text have not been changed in the translated text. In the field ‘Regular expressions in source,’ type the following regular expression: ([A-Z]{3})-([0-9]{4]).

Regular expressions - Patterns in translated text

Use regular expressions to identify translation problems. If an expression matches the translation then it is considered erroneous. Optionally, you can apply the check to selected segments only: Use regular expressions on the source text to apply a filter.

Define Translation errors: Add up to 50 regular expressions (one per line). The translation is considered erroneous if one of the regular expressions matches.

Filter source texts (optional): Type up to 50 regular expressions. These will be evaluated on the source text. If none matches then the segment is skipped and considered OK.

Sentiment Check

This rule permits to identify if translations convey the expected level of positivity or, at minimum, no negative sentiment. A sentiment score ranges from -10 (very negative) to 10 (very positive).

Options:

Run sentiment analysis: for translations not yet scored / not yet scored or outdated

Sentiment must be equal or above:

  • None

  • Very positive (+6) > ... > Neutral (0) > ... > Negative (-4)

Sentiment polarity must match: so you can set expectations on the translation depending on the sentiment of the source texts.

Spaces - Before Punctuation

In many languages, punctuation signs such as . , : ; ? ! must not be preceeded by a space, a tab, or another whitespace character. This rule flags any blanks, tabs, or line breaks immediately before a dot, question mark, double point, comma or other punctuation sign.

For example, the system will flag "Hello world !" as erroneous because of the space before the exclamation mark. You may use the exceptions box to type any exceptions to this rule. As an example, when translating into French the double-point (:) must be preceeded by a blank and this may be added as an exception.

Spaces - Doubles

This rule looks for spaces, tabs, and other whitespaces occuring multiple times in sequence to verify that whitespaces are not doubled. Multiple spaces may point to potential errors in the translation.

Examples ('_' signifies a space, '(tab) a tab and (lf) a newline):

  •  'Hello(space)(space)world' : NOT OK - Double space

  •  'Hello(space)(tab)world' : NOT OK - Two whitespace in sequence

  •  'Hello(lf)(lf)(lf)world' : NOT OK

  •  'Hello(space)-(space)world' : OK

You can configure the rule to accept whitespace sequences if these also show up in the source text. For example, 3 spaces are then accepted if the original text contains the exact same sequence. 

Examples:

  • 'Hello(space)(tab)(space)world' vs 'Hello(space)(tab)(space)world' : OK - Sequence matches

  • 'Hello(space)(tab)(space)world' vs 'Hello(space)(space)world' : NOT OK - Sequence does not match

Spaces - Leading/Trailing

This rule checks if any leading and trailing whitespaces perfectly match between source text and translation. For example, a source text ending with a tab expects the translation to end with exactly one tab. The rule checks for any whitespace characters including tabulators, newlines and half- and full-width spaces.

Examples ('_' signifies a space, '(tab) a tab and (lf) a newline):

  •  'Hello world.' vs. 'Hello world(space).' : NOT OK - Superfluous space at the end

  •  '(space)(space)Hello world.(space)' vs. '(space)(space)Hello world(space).' : OK

  •  '(space)(space)Hello world.(space)' vs. '(space)Hello world(space).' : NOT OK - One space missing in the beginning

  •  '(tab)Hello world.(space)' vs. '(space)Hello world(space).' : NOT OK - Translation misses the tabulator

Spaces - Segmentation Boundaries

This rule verifies if a space needs to be placed at the end of a segment and specifically looks at segmentation boundaries. Always enable when translating from Chinese or Japanese (no spaces in between sentences) into a non-Asian language (with spaces in between sentences).

When a text is split into segments or sentences either by the system or the user, the spaces, tabs or newlines in between the segments are, by default, hidden in the translation editor. Upon building the translated document, the segments are joined together and the original whitespaces are inserted.

This approach works well unless you translate from languages such as Japanese or Chinese. In this case the translated document may contain sentences that lack spaces in between (because there were no spaces in the original text). The rule is pre-configured with common punctuation signs including closing parantheses.

Spaces - Sequence Consistency

This rule checks whether longer sequences of whitespaces match up in the source text and the translation. The system first finds all sequences of two or more whitespaces in both the source and the translation. It then compares if the number and length of those sequences matches. If not, the rule raises a problem.

Examples:

  • 'Hello___world' does not match 'Hello_world' (3 blanks vs 1 blank)

  • 'Hello___world' matches 'Hello___world' (3 blanks in both source and translation)

  • 'Hello___lucky___world___' does not match 'Hello___lucky___world' (3 sequences in the source but just 2 in the translation)

Spell Check

This rule runs a spell check on all translations and identifies any words not contained in the selected dictionary. The system selects the first dictionary compatible with the target language.

For example, if your target language is French (Canada) it will first look for a French (Canada) dictionary. If none is found it will choose the general French dictionary. Dictionaries can be configured, added or enhanced by the administrator of this platform.

Terminology Check

This rule looks up each term in a project's term bases to verify that terms are translated according to preselected terminology databases of a project. If a term is found, the rule verifies that the translation is correct. It requires that one or more terminology databases of a project are flagged for use by this rule. Terminology databases are assigned within the Resources Page of a project.

The Translation similarity option allows you to match terminology in your database and working document taking into account variations in the target language. You can set up these manually starting from 50% up to 100% (identical)

Translation - Text Length Check

Wordbee’s QA Check allows users to specify the minimum and maximum text length limits translations have to adhere and offers a useful Text Length Check, which provides translators with an easy to use tool, particularly for localization tasks where the text length often has to correspond to the original text or shouldn’t exceed its length more than the specified limit.

This QA rule checks the length of translated texts and warns users if a maximum length is exceeded. The limits can be specified as a fixed number of characters or as a percentage of the original text. For example, you could set the limit to 200 characters or 80% of the original text. Please not that these limits can also be specified on the QA tab under the relevant file format and can also be viewed and edited in the Translation Editor.

Alternatively, these limits can be specified and read from an Excel file, where each row has a column specifying the character limit of the translation. For example, the original text is in Column A and the translation in one language in Column B with the minimum and maximum character length of the translations in Column C and Column D.

This means that users can configure the QA tool to read the limits in Column C and Column D for each row and verify that those limits are adhered to. These limits apply to the text as a whole and do not affect Wordbee’s segmentation rules. Therefore, if a file is already segmented, the system will generate sub-segments based upon the specified limit values.

In this case, the original segments are checked as a whole and then all sub-segments are checked to verify that they adhere to the same limits as the original.

For Example:

  • Segment 2: Max: 200 Characters.

  • Sub-Segment 2-2: Max (Pre-set from Segment 2 as 200 characters).

  • Sub-Segment 2-3: Max (Pre-set from Segment 2 as 200 characters).

This means that when a user runs a QA check the system checks the limit individually per sub segment and then checks the limits on the complete paragraph.

Translations - Consistency

This rule verifies whether all identical source texts are translated the same way within a document. It helps make translations consistent throughout a text. The system starts by finding all identical source texts (repetitions). It then compares the respective translations to see if these are all identical or not.

For example, if the same phrase shows up twice, but is translated in two different ways, the system will consider/flag the second translation as invalid.

Translations - First Letter

This rule checks if the initial letter in a translation is properly capitalized. The rule can be configured to either expect systematic capitalization of the first letter or to check that capitalization matches the source text. Leading whitespace, punctuation, and markup are disregarded.

Example - Systematic capitalization option:

  •  'hello world' vs. 'Hello world' : OK

  •  '...hello world' vs. '...Hello world' : OK

  •  '...hello world' vs. '...hello world' : NOT OK - First letter is lowercase

  •  '110 hello world' vs. '110 hello world' : OK - The first 'letter' is a number

Example - Conditional capitalization option:

  •  'hello world' vs. 'hello world' : OK - Both are lowercase

  •  '...hello world' vs. '...Hello world' : NOT OK - case does not match

  • '...Hello world' vs. '...hello world' : NOT OK - case does not match

Translations - Identical to Source

This rule identifies translations that are identical to the source text. Source text and translation are considered identical if there is not a single difference, including spaces and markup.

Translations - Missing

This rule finds untranslated segments. A segment is considered untranslated if the translation box has no contents at all, including blanks or markup. For example, a translation consisting of a single blank is no longer considered untranslated.

Translations - Same for different source

This rule identifies identical translated segment for different source segments. It works the opposite direction of the Translations - Consistency option. When the same source text is translated differently, this may potentially point to inconsistencies in translation. This rule helps to quickly identify such cases that may be due to unedited fuzzy matches or to human error.
Examples:

  • Segment 1 source segment is ‘Hello world’

  • Segment 2 source segment is ‘Hello globe’
    > If both segments are translated as ‘Bonjour monde’, the rule will flag identical translations for different source text and will also indicate the segment number of the other identical segment(s).

Words - Repetitions

This rule identifies words that occur multiple times in sequence. For example, the rule would flag 'Hello world world'. Words or identifiers that occur two or more times in sequence may point to potential translation errors. Words must be separated by whitespace or markup.

Examples:

  • The system flags 'Hello hello world' but not 'Hello.hello world'

  • The system flags 'Total of 123 123' but not 'Chapter 1.1.1'

  • The system flags 'Hello hello' but not 'Hello</t1>hello'


Additional options

Skip segments with a green status

If you have decided to update segment statuses after your corrections, you may want to skip segments that have already been validated with a green status. 

The option skip if validated with the checkbox "Do not check segments in green status" is available at the end of the control list for each rule, so you are able to do this.

This option can also be used in combination with the Global Preferences for the Translation Editor.

Changes in status or bookmarks to highlight QA issues

As a result, when QA issues are encountered you can optionally modify segment attributes like bookmarks and/or statuses, or leave them unchanged.




Copyright Wordbee - Buzzin' Outside the Box since 2008