When setting up a file format configuration for a CSV (Comma Separated Values) file, there are many options to choose from to ensure the translation is successful. This page explains the most common options for CSV files.
The following file extensions are supported when setting up file format configurations for CSV files: .csv.
To learn more about accessing and working with file format configurations, please see the following pages:
- Accessing File Format Configurations
- Viewing File Format Configurations
- Modifying Format Configurations
- Creating Format Configurations
- Testing File Format Configurations
General Tab
The General Tab contains options for choosing what type of content will be translated and which portions of the CSV file will be extracted. For example, a custom configuration is helpful if you want to extract only certain columns or rows from monolingual content, or if the file to be translated contains multilingual content.
Configuration Option | Description |
---|---|
Mono or Multilingual Content | The default CSV file format configuration is set up to translate monolingual content. This option may be used to create a custom configuration with specific extraction rules for monolingual content or to translate multilingual files. A custom configuration is necessary to translate multilingual CSV files. |
Encoding | The default CSV file format configuration uses UTF-8 file encoding; however, this portion of the configuration may be altered to use a different type of encoding (ISO, Windows, Macintosh, ASCII, etc.). Additionally, if you do not know the encoding used for the file, an option is provided to check it for proper configuration. |
Columns | These options may be used to configure the column separator (tab, comma, semicolon) and to configure specific columns to extract from the CSV file. |
Rows | This option may be used to configure what row the translation will start on. |
HTML Content | This option should be enabled if the CSV contains HTML content. Unless you have created a custom HTML configuration, Wordbee Translator will use the default configuration to complete the translation. |
Text Segmentation | Enable or disable SRX rules for text segmentation and choose whether to split text at line breaks. By default, the system segments the document by cell. This means that if there are three sentences within a cell, they will be considered one segment. If SRX rules are enabled for text segmentation, the sentences will be segmented at each punctuation mark or line break (as defined by the SRX rules). A default set of SRX rules is defined for CSV files; however, these may be customized if needed. Additionally, the default or a customized set of SRX rules might have splitting text at line breaks disabled. In this instance, you can enable "Always Split Text at Line Breaks" to ensure this portion of segmentation is handled correctly. |
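
To make the column, row, and line-break options above more concrete, the following Python sketch mimics their effect outside of Wordbee Translator. It is an illustration only, under stated assumptions: the function `extract_segments` and its parameters are hypothetical and are not part of any Wordbee API, and the line-break split merely stands in for the "Always Split Text at Line Breaks" option, not for real SRX rules.

```python
import csv
import io


def extract_segments(text, delimiter=",", start_row=1, columns=None,
                     split_at_line_breaks=False):
    """Return translatable segments from CSV text (hypothetical helper).

    delimiter            -- column separator (comma, tab, semicolon)
    start_row            -- 1-based row on which extraction starts
    columns              -- 0-based column indexes to extract (None = all)
    split_at_line_breaks -- split a cell into one segment per line
    """
    segments = []
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for row_number, row in enumerate(reader, start=1):
        if row_number < start_row:
            continue  # rows before the start row are not extracted
        for index, cell in enumerate(row):
            if columns is not None and index not in columns:
                continue  # column not selected for extraction
            if split_at_line_breaks:
                parts = [p for p in cell.splitlines() if p.strip()]
            else:
                parts = [cell] if cell.strip() else []  # one segment per cell
            segments.extend(parts)
    return segments


# Assumed sample: semicolon-separated, header in row 1, text in column 1.
sample = 'id;text\n1;"First sentence. Second sentence.\nThird line."\n'

# Cell-level segmentation: the whole cell is a single segment.
print(extract_segments(sample, delimiter=";", start_row=2, columns=[1]))

# Splitting at line breaks: the same cell yields two segments.
print(extract_segments(sample, delimiter=";", start_row=2, columns=[1],
                       split_at_line_breaks=True))
```

In this sketch, skipping the header row and the `id` column corresponds to the Rows and Columns options, while the final flag shows why one cell can produce either one segment or several.
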
Do Not Translate Tab
The Do Not Translate Tab contains options for configuring what will not be extracted for translation within the source file.
Configuration Option | Description |
---|---|
Segments | Enter specific texts or regular expressions for Wordbee Translator to locate and exclude from the translation. Any text that does not match the entered texts or patterns is automatically considered by the system to be translatable. Regular expressions may be entered in the system to protect entire segments or just terms within the file. These segments or terms will be neither extracted for translation nor taken into account during the word count step. In the translation editor, they will appear as tags and can be used to protect parts of texts that should not be translated, but should still appear in the translated document. A good example is entering terms or regular expressions to protect brand names or confidential content like software codes. |
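
As a rough illustration of how pattern-based protection behaves, the Python sketch below replaces matches with placeholder tags and keeps the original text so it can be restored later. Everything in it is an assumption made for the example: the two patterns, the `<1/>` tag format, and the helper names are hypothetical, and the regular expression syntax accepted by Wordbee Translator may differ from Python's `re` module.

```python
import re

# Hypothetical "do not translate" patterns: a brand name and a product code.
PROTECTED_PATTERNS = [r"\bWordbee\b", r"\b[A-Z]{3}-\d{4}\b"]


def protect_terms(segment, patterns=PROTECTED_PATTERNS):
    """Replace each protected match with a numbered placeholder tag.

    Returns the tagged text and a mapping from tag to original text, so the
    protected terms can be put back into the translated document.
    """
    combined = re.compile("|".join(f"(?:{p})" for p in patterns))
    protected = {}

    def to_tag(match):
        tag = f"<{len(protected) + 1}/>"  # invented tag format
        protected[tag] = match.group(0)
        return tag

    return combined.sub(to_tag, segment), protected


def is_fully_protected(segment, patterns=PROTECTED_PATTERNS):
    """True if one pattern matches the entire segment (nothing to translate)."""
    return any(re.fullmatch(p, segment.strip()) for p in patterns)


tagged, mapping = protect_terms("Install Wordbee with code ABC-1234 today.")
print(tagged)   # Install <1/> with code <2/> today.
print(mapping)  # {'<1/>': 'Wordbee', '<2/>': 'ABC-1234'}
```

The distinction between `protect_terms` and `is_fully_protected` mirrors the two cases described above: protecting individual terms inside a segment versus excluding an entire segment.
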
Whitespaces & Symbols Tab
The Whitespaces & Symbols Tab may be used to hide whitespaces or symbols that exist within the source file.
Configuration Option | Description |
---|---|
Do Not Show Leading and Trailing Whitespaces | If enabled, leading and trailing whitespaces within the source file will not be shown in the translation. |
Do Not Show Texts Containing Neither Letters nor Digits | If enabled, text (symbols) that does not contain any letters or digits will be hidden from view in the translation. |
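
The two checks above can be approximated in a few lines of Python. This is a minimal sketch, not Wordbee's implementation: the helper names are hypothetical, and treating "letters or digits" as Python's `str.isalnum()` is an assumption about how such text might be classified.

```python
def split_whitespace(text):
    """Split text into (leading whitespace, core text, trailing whitespace)."""
    stripped_left = text.lstrip()
    leading = text[:len(text) - len(stripped_left)]
    core = stripped_left.rstrip()
    trailing = stripped_left[len(core):]
    return leading, core, trailing


def has_letters_or_digits(text):
    """True if the text contains at least one letter or digit."""
    return any(ch.isalnum() for ch in text)


# Only the core text would be shown; the whitespace is kept aside.
print(split_whitespace("  Hello world \n"))  # ('  ', 'Hello world', ' \n')

# A symbols-only cell would be hidden; one containing digits would not.
print(has_letters_or_digits("--- *** ---"))  # False
print(has_letters_or_digits("§ 12"))         # True
```

Keeping the leading and trailing whitespace as separate values, instead of discarding it, reflects the idea that hidden content is not shown to the translator but still belongs to the source file.
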
View our CSV file format Questions and Answers section to learn how to perform common file format customisations. These examples cover the questions most frequently answered by our support team.