When you set up a file format configuration for XLIFF files, many options are available to ensure that content is extracted successfully. This page explains each section of the configuration options for XLIFF files.
...
- Viewing File Format Configurations
- Modifying Format Configurations
- Creating Format Configurations
- Test and validate file format configurations
...
General Tab
The General Tab contains options for extracting content, defining the content as HTML, handling whitespaces and symbols, excluding content, and segmenting text. The options are summarized below, section by section.
- Content Section - Extract XLIFF existing translations (if any) and set segment status to 'Translated' in the translated XLIFF file.
- Comments - Extract XLIFF notes on segment level to Wordbee comments and write new comments added during translation work in the translated file.
- HTML Content Section - Inform the system that the content is HTML, set up a configuration for HTML extraction, and split text at HTML break tags.
- Whitespaces and Symbols Section - Do not show leading and trailing whitespaces, do not show preceding and trailing markup, do not translate texts containing neither letters nor digits, and always preserve whitespaces by default.
- Text Segmentation Section - Split segments at XLIFF segmentation boundaries, enable SRX rules for text segmentation, and select to always split text at line breaks.
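To make the Content and Comments options more concrete, here is a minimal sketch of what reading existing translations and segment-level notes from an XLIFF 1.2 file can look like. It is not Wordbee's implementation: the sample file, the `read_units` helper, and the returned field names are assumptions made for illustration. Only the XLIFF 1.2 elements (`trans-unit`, `source`, `target`, `note`) and the `state` attribute come from the XLIFF standard.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace, mapped to an arbitrary prefix for lookups.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample file: unit 1 has a translation and a note, unit 2 has neither.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Hello</source>
        <target state="translated">Bonjour</target>
        <note>Greeting on the home page</note>
      </trans-unit>
      <trans-unit id="2">
        <source>Goodbye</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def read_units(xliff_text):
    """Return one dict per trans-unit with its source, target, state, and notes."""
    root = ET.fromstring(xliff_text)
    units = []
    for tu in root.iterfind(".//x:trans-unit", XLIFF_NS):
        target = tu.find("x:target", XLIFF_NS)
        units.append({
            "id": tu.get("id"),
            "source": tu.findtext("x:source", default="", namespaces=XLIFF_NS),
            # A missing <target> means there is no existing translation.
            "target": target.text if target is not None else None,
            "state": target.get("state") if target is not None else None,
            "notes": [n.text or "" for n in tu.findall("x:note", XLIFF_NS)],
        })
    return units


units = read_units(SAMPLE)
for unit in units:
    print(unit["id"], unit["target"], unit["state"], unit["notes"])
```

Units without a `target` element come back with `None`, which is how this sketch distinguishes "no existing translation" from an empty one.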
Enabling the option "Always split text at line breaks" in the XLIFF configuration leads to one of two scenarios:
First scenario: If there is no HTML content in the XLIFF file:
If the option "Content is HTML" is also enabled, the text is not split at line breaks, because the HTML parser removes the whitespace characters that XLIFF treats as line breaks.
Second scenario: If there is HTML content in the XLIFF file:
If the option "Content is HTML" is also enabled, the text is split, but only at HTML breaking tags (e.g. <p>, <div>). The HTML parser removes whitespace from the segments, so the breaking tags must be present both in the HTML content of the XLIFF file and in the HTML configuration that is attached to the XLIFF configuration. (For more information, click Here.)
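The two scenarios can be illustrated with a small sketch. It does not reproduce Wordbee's parser: `split_plain`, `split_html`, and the `BREAK_TAGS` set are hypothetical names, and the whitespace handling is a simplified assumption of how an HTML parser collapses whitespace. It only shows why a line break survives in plain text, disappears once the content is parsed as HTML, and reappears when breaking tags are present.

```python
import re
from html.parser import HTMLParser

# Assumed set of breaking tags; in practice this would come from the HTML configuration.
BREAK_TAGS = frozenset({"p", "div", "br"})


def split_plain(text):
    """Split non-HTML text at line breaks, dropping empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


class BreakSplitter(HTMLParser):
    """Collects text and starts a new segment at each configured breaking tag."""

    def __init__(self, break_tags):
        super().__init__()
        self.break_tags = break_tags
        self.segments = []
        self._buffer = []

    def flush(self):
        # Collapse every run of whitespace (including line breaks) to one space.
        text = re.sub(r"\s+", " ", "".join(self._buffer)).strip()
        if text:
            self.segments.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.break_tags:
            self.flush()

    def handle_endtag(self, tag):
        if tag in self.break_tags:
            self.flush()

    def handle_data(self, data):
        self._buffer.append(data)


def split_html(text, break_tags=BREAK_TAGS):
    """Split HTML content at breaking tags only; plain line breaks are collapsed."""
    parser = BreakSplitter(break_tags)
    parser.feed(text)
    parser.close()
    parser.flush()
    return parser.segments


plain = "First line\nSecond line"
tagged = "<p>First line</p>\n<p>Second line</p>"

print(split_plain(plain))   # line break kept as a boundary
print(split_html(plain))    # line break collapsed, one segment
print(split_html(tagged))   # breaking tags restore the boundary
```

Passing a different `break_tags` set to `split_html` mimics a breaking tag that is missing from the attached HTML configuration: the tag is then ignored and no split happens at that position.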
Do Not Translate Tab
Exclude Content Sections - Configure which content is translated and which is left untranslated, based on the texts or regular expression patterns that the system looks for.
- Segments
- Words or terms
- Attributes and comments
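As an illustration of pattern-based exclusion, the sketch below marks a segment as not translatable when it matches a regular expression or contains neither letters nor digits. The patterns and the `is_translatable` helper are hypothetical examples, not defaults shipped with the product.

```python
import re

# Hypothetical exclusion patterns; real ones are defined in the configuration.
EXCLUDE_PATTERNS = [
    re.compile(r"^[A-Z]{2,5}-\d+$"),   # product codes such as "SKU-1042"
    re.compile(r"^https?://\S+$"),     # segments that consist of a single URL
]


def has_letters_or_digits(text):
    """True if at least one character is a letter or a digit."""
    return any(ch.isalnum() for ch in text)


def is_translatable(text, patterns=EXCLUDE_PATTERNS):
    """Exclude segments with no letters or digits, or matching any pattern."""
    if not has_letters_or_digits(text):
        return False
    return not any(pattern.search(text) for pattern in patterns)


for segment in ["Hello world", "SKU-1042", "---", "https://example.com/page"]:
    print(repr(segment), is_translatable(segment))
```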
SDL XLIFF Tab
The SDL XLIFF Tab may be used to load advanced properties when XLIFF files have been produced by other CAT tools.
- Extract Origin of Translations - The SDL 'origin' attribute specifies the origin of the translation: 'tm' for translation memory, 'mt' for machine translation, etc. The SDL 'percent' attribute indicates whether a pretranslation is exact, fuzzy or perfect. These attributes are mapped to the respective fields in Wordbee Translator, and the Wordbee word count then takes these values into consideration.
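The sketch below shows where the 'origin' and 'percent' attributes typically sit in an SDL XLIFF file and how they could be read. The sample fragment, the `read_origins` and `match_category` helpers, and the threshold rule are assumptions for illustration. How Wordbee Translator maps these values internally, and how a perfect match is detected, is not described here and is therefore left out.

```python
import xml.etree.ElementTree as ET

SDL_NS = {
    "x": "urn:oasis:names:tc:xliff:document:1.2",
    "sdl": "http://sdl.com/FileTypes/SdlXliff/1.0",
}

# Hypothetical SDL XLIFF fragment: one TM match, one fuzzy TM match, one MT segment.
SDL_SAMPLE = """<xliff version="1.2"
    xmlns="urn:oasis:names:tc:xliff:document:1.2"
    xmlns:sdl="http://sdl.com/FileTypes/SdlXliff/1.0">
  <file original="demo.docx" source-language="en-US" target-language="fr-FR" datatype="x-sdlfilterframework2">
    <body>
      <trans-unit id="u1">
        <source>Hello</source>
        <target>Bonjour</target>
        <sdl:seg-defs>
          <sdl:seg id="1" conf="Translated" origin="tm" percent="100"/>
        </sdl:seg-defs>
      </trans-unit>
      <trans-unit id="u2">
        <source>Good evening</source>
        <target>Bonsoir</target>
        <sdl:seg-defs>
          <sdl:seg id="2" conf="Draft" origin="tm" percent="85"/>
        </sdl:seg-defs>
      </trans-unit>
      <trans-unit id="u3">
        <source>Goodbye</source>
        <target>Au revoir</target>
        <sdl:seg-defs>
          <sdl:seg id="3" conf="Draft" origin="mt"/>
        </sdl:seg-defs>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def read_origins(xliff_text):
    """Map each sdl:seg id to its origin and (optional) match percentage."""
    root = ET.fromstring(xliff_text)
    origins = {}
    for seg in root.iterfind(".//sdl:seg", SDL_NS):
        percent = seg.get("percent")
        origins[seg.get("id")] = {
            "origin": seg.get("origin"),
            "percent": int(percent) if percent is not None else None,
        }
    return origins


def match_category(percent):
    """Assumed rule for illustration: 100 is exact, anything lower is fuzzy."""
    if percent is None:
        return None
    return "exact" if percent >= 100 else "fuzzy"


origins = read_origins(SDL_SAMPLE)
for seg_id, info in origins.items():
    print(seg_id, info["origin"], info["percent"], match_category(info["percent"]))
```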
...
QA Tab
Pass on the restrictions on segment size so that violations are highlighted as issues during quality assurance checks.
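One way a size restriction can be expressed in XLIFF 1.2 is through the `maxwidth` and `size-unit` attributes on a `trans-unit`. The sketch below assumes that this is the kind of restriction meant here, which this page does not confirm, and handles only the `char` unit. The sample and the `oversize_units` helper are hypothetical and only illustrate the type of check a QA step might run.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample: unit 1 exceeds its limit, unit 2 fits, unit 3 has no limit.
SIZED_SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="ui.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1" maxwidth="10" size-unit="char">
        <source>Save</source>
        <target>Enregistrer</target>
      </trans-unit>
      <trans-unit id="2" maxwidth="10" size-unit="char">
        <source>OK</source>
        <target>OK</target>
      </trans-unit>
      <trans-unit id="3">
        <source>Welcome to the application</source>
        <target>Bienvenue dans l'application</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def oversize_units(xliff_text):
    """Return (id, target length, limit) for targets longer than maxwidth in chars."""
    root = ET.fromstring(xliff_text)
    issues = []
    for tu in root.iterfind(".//x:trans-unit", XLIFF_NS):
        limit = tu.get("maxwidth")
        # Only character-based limits are checked in this sketch.
        if limit is None or tu.get("size-unit") != "char":
            continue
        target = tu.findtext("x:target", default="", namespaces=XLIFF_NS)
        if len(target) > int(limit):
            issues.append((tu.get("id"), len(target), int(limit)))
    return issues


issues = oversize_units(SIZED_SAMPLE)
print(issues)
```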
...