When setting up a file format configuration for a CSV (Comma Separated Values) file, there are many options to choose from to ensure the translation is successful. This page explains the most common options for CSV files.
The following file extension is supported when setting up file format configurations for CSV files: .csv.
To learn more about accessing and working with file format configurations, please see the following pages:
- Accessing File Format Configurations
- Viewing File Format Configurations
- Modifying Format Configurations
- Creating Format Configurations
- Testing File Format Configurations
General Tab
The General Tab contains options for choosing what type of content will be translated and which portions of the CSV file will be extracted. For example, a custom configuration is helpful if you want to extract only certain columns or rows from monolingual content, or if the file contains multilingual content.
Configuration Option | Description |
---|---|
Mono or Multilingual Content | The default CSV file format configuration is set up to translate monolingual content. This option may be used to create a custom configuration with specific extraction rules for monolingual content or to translate multilingual files. A custom configuration is necessary to translate multilingual CSV files. |
Encoding | |
Columns | These options may be used to configure the column separator (tab, comma, semicolon) and to select specific columns to extract from the CSV file. |
Rows | This option may be used to configure which row the translation starts on. |
HTML Content | |
Text Segmentation | Enable or disable SRX rules for text segmentation and choose whether to split text at line breaks. By default, the system segments the document by cell, so three sentences within one cell are treated as a single segment. If SRX rules are enabled for text segmentation, the text is segmented at each punctuation mark or line break (as defined by the SRX rules). A default set of SRX rules is defined for CSV files; however, these may be customized if needed. Additionally, the default or a customized set of SRX rules might have splitting text at line breaks disabled. In this instance, you can enable "Always Split Text at Line Breaks" to ensure this portion of segmentation is handled correctly. |
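
To make the Columns, Rows, and Text Segmentation options more concrete, here is a minimal Python sketch of the behavior they describe. It is an illustration only, not the system's implementation: the function names `extract_cells` and `segment`, their parameters, and the sample data are hypothetical, and the regular expression is a crude stand-in for real SRX rules.

```python
import csv
import io
import re


def extract_cells(text, delimiter=",", columns=(0,), start_row=1):
    """Return the cells in the chosen columns, beginning at start_row (1-based)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    cells = []
    for row_number, row in enumerate(reader, start=1):
        if row_number < start_row:
            continue  # skip rows above the configured starting row
        for index in columns:
            if index < len(row):
                cells.append(row[index])
    return cells


def segment(cell, use_sentence_rules=False, split_at_line_breaks=False):
    """Split one cell into segments; by default the whole cell is one segment."""
    parts = cell.split("\n") if split_at_line_breaks else [cell]
    if use_sentence_rules:
        # Assumption: a simplified replacement for SRX rules that breaks
        # after '.', '!' or '?' when followed by whitespace.
        parts = [s for part in parts for s in re.split(r"(?<=[.!?])\s+", part)]
    return [p.strip() for p in parts if p.strip()]


# Hypothetical semicolon-separated file with a header row and a multi-line cell.
sample = (
    "id;text\n"
    "1;First sentence. Second sentence. Third sentence.\n"
    '2;"Line one\nLine two"\n'
)

# Extract only the second column, starting at row 2 to skip the header.
cells = extract_cells(sample, delimiter=";", columns=(1,), start_row=2)

for cell in cells:
    print(segment(cell))                             # one segment per cell
    print(segment(cell, use_sentence_rules=True))    # sentence-level segments
    print(segment(cell, split_at_line_breaks=True))  # split at line breaks
```

In this sketch the first cell stays a single segment unless the sentence rules are switched on, and the second cell is only split when line-break splitting is requested, which mirrors the case where the SRX rules do not split at line breaks and "Always Split Text at Line Breaks" is needed.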
Do Not Translate Tab
Configuration Option | Description |
---|---|
Segments | |
Whitespaces & Symbols Tab
Configuration Option | Description |
---|---|
Do Not Show Leading and Trailing Whitespaces | |
Do Not Show Texts Containing Neither Letters nor Digits | |