...

Sometimes my files contain many tags, and the result of the translation generated by my machine translation provider is not great. What can I do to improve the results?

...

Markup handling and placement is a very complex topic, and results are always best when dealing with small volumes and generic HTML-based markup.

In Wordbee Translator, once a document is added to a project and marked for translation, the text is extracted from the file using the rules defined in the file format configuration. This set of rules relies on RegEx and other segmentation mechanisms. This process parses the file to get all text that requires translation and creates a structure of translation units called segments.
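As a rough illustration of RegEx-driven extraction, the sketch below splits a text into segments and records the markup found in each one. The two patterns and the `extract_segments` helper are hypothetical examples written for this page; they are not the rules or the API of any Wordbee file format configuration.

```python
import re

# Hypothetical markup rule: {placeholders} and HTML-like tags count as markup.
MARKUP_RULE = re.compile(r"\{[^{}]+\}|</?[A-Za-z][^<>]*>")
# Hypothetical segmentation rule: break after sentence-ending punctuation.
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def extract_segments(text):
    """Split text into segments and list the markup found in each segment."""
    segments = []
    for sentence in SENTENCE_BREAK.split(text.strip()):
        if not sentence:
            continue
        segments.append({
            "source": sentence,
            "markup": MARKUP_RULE.findall(sentence),
        })
    return segments


for segment in extract_segments("Hello {name}. Click <b>here</b> now!"):
    print(segment)
```

The more markup a rule captures per segment, the more tags the MT engine has to carry through the translation.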

...

  • Each segment is a machine-readable string that any engine can process. These strings contain markup and the text in the language being processed. The markup defined by extraction rules can be of different types (custom, HTML-compatible), which makes the string unique.

  • Because of the way markup can be defined in the text extraction rules, the system needs to further prepare the string initially extracted in (1) to make it compatible with machine-related processes that can happen outside Wordbee Translator. The way you extract the text and generate the markup will have an impact on any machine-related processes.

Processing of segments (before and after machine translation)

When segments are prepared for machine translation, the following processing is applied to the segments:

  1. The source-language text of the segment and its markup are further prepared to maximize the chances that the MT provider preserves the integrity of the content.

  2. The markup in the string is further converted into generic HTML markup to make it machine compatible.

  3. The newly converted string is sent to the MT provider chosen in the MT profile configuration.

  4. Once the MT provider generates the output, Wordbee Translator verifies whether the markup in the output is valid as per the initial MT request. It checks whether the translation generated by the MT provider meets the following conditions:

    1. All markup was returned.

    2. The markup was correctly placed.
      Wordbee Translator has several mechanisms in place that allow you to "roughly" fix any major markup issues. These aim at producing accurate translations and preventing problems when reconstructing the file with all translations, so that the file is readable in the first place.

  5. Finally, once the machine translation output is available and validated, the system needs to convert the HTML-based markup back to the style initially parsed in Wordbee Translator.
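The steps above can be sketched as a small round trip: custom markup is swapped for numbered generic HTML tags, the MT output is checked, and the original markup is restored. This is a minimal sketch under stated assumptions. The `{...}` placeholder rule, the `<span id="tN"></span>` stand-in, and the function names are all hypothetical, and the order check is only a crude proxy for "correctly placed", since a legitimate translation may reorder tags.

```python
import re

# Hypothetical custom markup rule: placeholders such as {0} or {name}.
CUSTOM_MARKUP = re.compile(r"\{[^{}]+\}")
# Assumed generic HTML stand-in used in the MT request.
HTML_TAG = re.compile(r'<span id="t(\d+)"></span>')


def to_html(segment):
    """Replace each custom tag with a numbered generic HTML tag."""
    originals = []

    def swap(match):
        originals.append(match.group(0))
        return f'<span id="t{len(originals)}"></span>'

    return CUSTOM_MARKUP.sub(swap, segment), originals


def validate(mt_output, originals):
    """Return (all tags came back exactly once, tags kept their order)."""
    ids = [int(i) for i in HTML_TAG.findall(mt_output)]
    all_returned = sorted(ids) == list(range(1, len(originals) + 1))
    same_order = ids == sorted(ids)
    return all_returned, same_order


def from_html(mt_output, originals):
    """Restore the original custom markup; unknown ids are left untouched."""

    def restore(match):
        index = int(match.group(1)) - 1
        if 0 <= index < len(originals):
            return originals[index]
        return match.group(0)

    return HTML_TAG.sub(restore, mt_output)


request, tags = to_html("Hello {name}, you have {count} messages.")
# Stand-in for the MT provider's answer; no real MT call is made here.
answer = 'Bonjour <span id="t1"></span>, vous avez <span id="t2"></span> messages.'
print(validate(answer, tags))
print(from_html(answer, tags))
```

Numbering the tags is what makes the check possible: a missing or duplicated id shows up immediately, even when the surrounding text has changed completely.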

...

If the selected MT profile sends the text to 'Microsoft MT', then once the MT engine provides the translations, the system needs to convert the markup handed over by Microsoft back to what it originally was when the file was marked for online translation. The translation provided by Microsoft needs to be further processed to convert and place the markup accordingly. If things went well, there is little to no difference between these markups.
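To see why a repair step cannot always place markup correctly, consider the deliberately naive sketch below. It is not Wordbee's mechanism, which is not described here. It is a hypothetical fallback that drops duplicated or unknown tags and appends any missing ones to the end of the translation, reusing the assumed `<span id="tN"></span>` stand-in format.

```python
import re

# Assumed generic HTML stand-in for markup in the MT request and response.
TAG = re.compile(r'<span id="t\d+"></span>')


def naive_repair(source_html, mt_html):
    """Keep the first copy of each expected tag and append the missing ones."""
    expected = TAG.findall(source_html)
    seen = set()

    def keep_first_known(match):
        tag = match.group(0)
        if tag in expected and tag not in seen:
            seen.add(tag)
            return tag
        return ""  # drop duplicated or unknown tags

    repaired = TAG.sub(keep_first_known, mt_html)
    missing = [tag for tag in expected if tag not in seen]
    # The correct position of a lost tag is unknown, so it lands at the end.
    return repaired + "".join(missing)


source = 'Hello <span id="t1"></span>, you have <span id="t2"></span> messages.'
broken = 'Bonjour <span id="t1"></span><span id="t1"></span>, vous avez des messages.'
print(naive_repair(source, broken))
```

The repaired string contains every tag exactly once, so the file can be rebuilt, but the second tag now sits after the final period instead of before "messages".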

In a nutshell:
Your RegEx markup goes to Microsoft and back to the Wordbee Translator platform. As a result, the Wordbee platform sometimes has to "fix" the markup. Unfortunately, fixing is never perfect, so you might occasionally see incorrectly placed markup.

...