When setting up a file format configuration for Microsoft Excel, there are many options to choose from to ensure extraction is successful. This page will explain the most common options for Excel files.
The following file extensions are supported when setting up file format configurations for Microsoft Excel: .xls, .xlt, .xlsx, .xlsm, .xltx, .xltm, .xlsb.
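
As a quick pre-upload check, the extension list above can be expressed as a small helper. This is only an illustrative sketch: the function name `is_supported_excel_file` is hypothetical and is not part of Wordbee Translator.

```python
from pathlib import Path

# Extensions listed on this page as supported for Excel configurations.
SUPPORTED_EXCEL_EXTENSIONS = {
    ".xls", ".xlt", ".xlsx", ".xlsm", ".xltx", ".xltm", ".xlsb",
}


def is_supported_excel_file(filename: str) -> bool:
    """Return True if the file name ends with one of the listed extensions.

    Hypothetical helper for illustration; the comparison ignores case.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXCEL_EXTENSIONS
```
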
To learn more about accessing and working with file format configurations, please see the following pages:
- Accessing File Format Configurations
- Viewing File Format Configurations
- Modifying Format Configurations
- Creating Format Configurations
- Testing File Format Configurations
General Tab
The General Tab contains options for choosing what type of content will be translated and what portions of the Excel file will be extracted. For example, a custom configuration is helpful if you want to extract only certain columns, rows, or sheets from monolingual content, or if the file contains multilingual content.
Configuration Option | Description |
---|---|
Mono or Multilingual Content | The default Excel file format configuration is set up to translate monolingual content. This option may be used to create a custom configuration with specific extraction rules for monolingual content or to translate multilingual files. A custom configuration is necessary to translate multilingual Excel files. |
Columns | Configure specific columns to extract from the Excel file and choose to exclude hidden columns from the translation. |
Rows | Configure which row the translation will start on and choose to exclude hidden rows from the translation. |
Sheets | Configure specific sheets to be translated and choose to exclude hidden sheets from the translation. |
Other Content | Extract or exclude document properties, headers, footers, sheet names, and user comments from the translation. |
Text Segmentation | Enable or disable SRX rules for text segmentation and choose whether to split text at line breaks. By default, the system segments the document by cell, which means that three sentences within one cell are treated as a single segment. If SRX rules are enabled for text segmentation, the text is segmented at each punctuation mark or line break defined by the SRX rules. A default set of SRX rules is defined for Excel files; however, these may be customized if needed. Note that the default or a customized set of SRX rules might not split text at line breaks. In that case, you can enable "Always Split Text at Line Breaks" to ensure that line breaks always start a new segment. |
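
The difference between cell-based segmentation, SRX-based segmentation, and forced splitting at line breaks can be illustrated with a minimal sketch. It assumes a single, heavily simplified sentence-break rule in place of a real SRX rule set; the function `segment_cell` and its parameters are hypothetical and do not reflect how Wordbee Translator is implemented.

```python
import re

# Simplified stand-in for an SRX rule: break after ".", "!" or "?" when it is
# followed by spaces or tabs. Real SRX rule sets are richer (for example, they
# can define exceptions for abbreviations). This rule deliberately does NOT
# break at line breaks, mimicking a rule set with that behavior disabled.
SENTENCE_BREAK = re.compile(r"(?<=[.!?])[ \t]+")


def segment_cell(text, use_srx=False, split_at_line_breaks=False):
    """Split one cell's text into segments (illustrative only)."""
    parts = [text]
    if split_at_line_breaks:
        # Comparable to enabling "Always Split Text at Line Breaks".
        parts = [piece for part in parts for piece in part.split("\n")]
    if use_srx:
        parts = [piece for part in parts for piece in SENTENCE_BREAK.split(part)]
    return [part.strip() for part in parts if part.strip()]
```

With the cell text `"First sentence. Second one!\nThird line."`, the defaults return the whole cell as one segment, `use_srx=True` returns two segments (the line break is kept inside the second one), and enabling both options returns three segments.
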
Do Not Translate
The Do Not Translate Tab provides options for configuring items to exclude from the translation such as certain text colors, words, or segments of text.
Configuration Option | Description |
---|---|
Colors | Choose specific text colors to exclude from the translation, such as red, blue, or green. Please note that the chosen color in the configuration must precisely match the text color in the Excel source file for the text to be excluded from the translation. |
Segments | Enter texts or regular expressions for Wordbee Translator to locate and either exclude from or extract as part of the translation. Any text that does not match the entered texts or patterns is automatically considered translatable. Regular expressions may be used to protect entire segments or just terms within the file. These segments or terms will not be extracted for translation or taken into account during the word count step. In the translation editor, they will appear as tags and can be used to protect parts of texts that should not be translated, but should still appear in the translated document. A good example is entering terms or regular expressions to protect brand names or confidential content like software codes. |
Words or Terms | Enter single words, terms, or segment portions to exclude from the translation. Any text captured by regular expressions is converted to markup to avoid modification. If no description is entered, then the original text will appear when the translator hovers over the markup in the target file. |
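
To make the idea of protecting terms with regular expressions more concrete, here is a minimal sketch of replacing matches with placeholder tags. The tag format (`<1/>`, `<2/>`), the function `protect_terms`, and the sample patterns are assumptions for illustration and are not Wordbee Translator's actual markup or API.

```python
import re


def protect_terms(segment, patterns):
    """Replace every regex match with a numbered placeholder tag.

    Returns the tagged segment and a mapping from each tag to the original
    text, so the protected text can be restored in the translated document.
    """
    if not patterns:
        return segment, {}

    protected = {}
    combined = re.compile("|".join(f"(?:{p})" for p in patterns))

    def to_tag(match):
        tag = f"<{len(protected) + 1}/>"  # hypothetical tag format
        protected[tag] = match.group(0)
        return tag

    return combined.sub(to_tag, segment), protected
```

For example, protecting a brand name and a code pattern such as `[A-Z]{2}-\d{4}` in `"Install Wordbee Translator with code AB-1234."` leaves only `"Install <1/> with code <2/>."` for the translator to edit.
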
Whitespaces & Symbols
The Whitespaces & Symbols Tab provides configuration options for managing whitespaces and characters that are not letters or digits within the translation.
Configuration Option | Description |
---|---|
Do Not Show Leading and Trailing Whitespaces | Elect to show or exclude leading and trailing whitespaces within the translation. |
Convert Sequences of Multiple Whitespaces Into Markup | Elect to convert or not convert multiple whitespaces into markup within the translation. |
Do Not Show Leading and Trailing Characters that are Neither Letters nor Digits | Elect to show or exclude leading and trailing characters that are neither letters nor digits within the translation. |
Convert Words Containing Neither Letters nor Digits Into Markup | Elect to convert or not convert words containing neither letters nor digits into markup within the translation. |
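
The effect of these options can be approximated with a few regular expressions. This is a rough sketch under stated assumptions: the function names and the `<ws/>` and `<sym/>` tags are hypothetical, and the actual behavior of Wordbee Translator may differ in edge cases.

```python
import re

# "[\W_]" matches any character that is neither a letter nor a digit
# (Python's \w also covers the underscore, so it is excluded explicitly).
OUTER_SYMBOLS = re.compile(r"^[\W_]+|[\W_]+$")
WHITESPACE_RUN = re.compile(r"\s{2,}")
SYMBOL_ONLY = re.compile(r"[\W_]+")


def hide_outer_symbols(text):
    """Drop leading and trailing characters that are neither letters nor digits."""
    return OUTER_SYMBOLS.sub("", text)


def collapse_whitespace_runs(text, tag="<ws/>"):
    """Replace each sequence of two or more whitespace characters with one tag."""
    return WHITESPACE_RUN.sub(tag, text)


def tag_symbol_words(text, tag="<sym/>"):
    """Replace space-separated words that contain no letter or digit with a tag."""
    words = text.split(" ")
    return " ".join(tag if SYMBOL_ONLY.fullmatch(w) else w for w in words)
```

For instance, `hide_outer_symbols("  * Total: 42 ---")` yields `"Total: 42"`, and `tag_symbol_words("Price -> 10 EUR")` yields `"Price <sym/> 10 EUR"`.
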
Reduce Markup
The Reduce Markup Tab provides configuration options for substantially reducing markup. Less markup means better translation memory hits and less work for the translator. Be careful with these options, because removing markup may lead to translated documents with minor differences in fonts and text styles.
Configuration Option | Description |
---|---|
Remove Irrelevant Font or Style Changes | Choose to ignore formatting changes applied to whitespaces or to texts that contain neither letters nor digits during translation. |
Visually Reduce Markup in Translation Editor | Merge adjacent markup elements into a single markup to preserve the original styles and fonts of the source file during translation. When enabled, the translator will not see individual markup elements. This option is disabled in the default configuration. |
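
Why ignoring style changes on whitespace reduces markup can be seen in a small sketch. It assumes a hypothetical data model in which a cell's text is a list of `(text, style)` runs and every boundary between runs would require a markup tag; this model and the function name are illustrative assumptions, not Wordbee Translator's internal representation.

```python
import re

# Matches any single letter or digit.
HAS_LETTER_OR_DIGIT = re.compile(r"[^\W_]")


def drop_irrelevant_style_changes(runs):
    """Merge runs whose style change is irrelevant.

    A run that contains neither letters nor digits inherits the style of the
    preceding run, and neighboring runs with the same style are joined.
    """
    merged = []
    for text, style in runs:
        relevant = HAS_LETTER_OR_DIGIT.search(text) is not None
        if merged and (not relevant or style == merged[-1][1]):
            prev_text, prev_style = merged[-1]
            merged[-1] = (prev_text + text, prev_style)
        else:
            merged.append((text, style))
    return merged
```

Given the runs `[("Total", "bold"), (" ", "regular"), ("amount", "bold")]`, the result is a single run `("Total amount", "bold")`, so no markup is needed inside the segment. The trade-off mentioned above is visible here too: the space silently changes from regular to bold.
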
Embedded Files
The Embedded Files Tab provides options for extracting other Office files that have been embedded into Microsoft Excel files. These options are disabled by default, but may be enabled to extract this type of content for translation.
Configuration Option | Description |
---|---|
Extract Embedded Excel Files | If enabled, any embedded Excel file content will be extracted for translation. This option is disabled in the default configuration. |
Extract Embedded PowerPoint Files | If enabled, any embedded PowerPoint file content will be extracted for translation. This option is disabled in the default configuration. |
Extract Embedded Word Files | If enabled, any embedded Word file content will be extracted for translation. This option is disabled in the default configuration. |
Please note that the system will extract all the content from an embedded file and not necessarily just the visible content.
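
Before enabling these options, it can help to check what a workbook actually embeds. The sketch below assumes a ZIP-based workbook (for example .xlsx or .xlsm), where embedded objects are typically stored under the `xl/embeddings/` folder of the package; it does not apply to the older binary .xls format. The helper `list_embedded_files` is hypothetical and independent of Wordbee Translator.

```python
import zipfile

EMBEDDINGS_PREFIX = "xl/embeddings/"


def list_embedded_files(workbook):
    """Return the package paths of objects embedded in a ZIP-based workbook.

    `workbook` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(workbook) as archive:
        return [
            name
            for name in archive.namelist()
            if name.startswith(EMBEDDINGS_PREFIX) and not name.endswith("/")
        ]
```

An empty list suggests there is nothing for the embedded-file options to extract. Entries ending in `.docx`, `.xlsx`, or `.pptx` indicate embedded Office documents, while `.bin` entries are generic embedded objects whose type cannot be told from the name alone.
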