When setting up a file format configuration for Web Pages, there are many options to choose from to ensure extraction is successful. This page will explain each section of the configuration options for Web Pages.
The following file extensions are supported when setting up file format configurations for Web Pages: .htm, .html, .xhtml, .htmls, .php, .php2, .php3, .php4, .php5, .php6, .phtml, .csm, .jsp, .ahtm, .ahtml.
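These sections have been provided to help you become familiar with available Web Page configuration options in Wordbee Translator:
- General Tab
- Server and Client Side Code Tab
- HTML Tags and Attributes Tab
- CMS Specific Settings Tab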
To learn more about accessing and working with file format configurations, please see the following pages:
- Accessing File Format Configurations
- Viewing File Format Configurations
- Modifying Format Configurations
- Creating Format Configurations
- Testing File Format Configurations
General Tab
The General Tab contains options for configuring the type of encoding, HTML code, HTML attributes, content exclusion, and text segmentation. The options are summarized below, section by section.
Configuration Options | Description |
---|---|
Encoding | The default encoding selection for web pages is UTF-8; however, the encoding option may be used to select a different type of encoding such as Windows, Macintosh, ASCII, etc. An additional option is provided for converting characters that are not compatible with the target encoding into entity references. |
HTML Code | These options inform Wordbee Translator how the HTML code itself will be handled during translation. Within this configuration section, you will be able to configure:
|
HTML Attributes | These options inform Wordbee Translator how to handle specific HTML attributes during translation and include:
|
Exclude Content | This option may be used to configure specific content to be excluded from the extracted text for translation. Within this configuration section you can enter text segments or regular expressions. If a text/pattern matches, then it is possible to mark the segment as not translatable, as translatable or as potentially not translatable. The latter two will be shown to the translator. The system checks one pattern after the other until one matches. Text that matches none of the patterns is considered translatable. |
Text Segmentation | This configuration section may be used to enable options for text segmentation during translation. Here you can:
|
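The first-match rule described for Exclude Content can be illustrated with a short Python sketch. This is not Wordbee Translator's implementation: the `Rule` type, the `classify` function, the two example patterns, and the use of `re.search` for matching are all assumptions made for illustration. It only shows how checking patterns in order, stopping at the first match, and defaulting to "translatable" behaves.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical stand-in for one entry in the Exclude Content list."""
    pattern: str  # regular expression to test against a segment
    status: str   # status assigned when the pattern matches


# Example rules (assumptions, not product defaults). Order matters.
RULES = [
    Rule(r"^\d+$", "not translatable"),
    Rule(r"^[A-Z0-9_]+$", "potentially not translatable"),
]


def classify(segment: str, rules=RULES) -> str:
    """Check one pattern after the other; the first match decides the status."""
    for rule in rules:
        if re.search(rule.pattern, segment):
            return rule.status
    # Text that matches none of the patterns is considered translatable.
    return "translatable"


print(classify("2024"))       # matches both rules; the first one wins
print(classify("SKU_42"))     # only the second rule matches
print(classify("Read more"))  # no rule matches
```

Because "2024" would also match the second pattern, the example shows why the order of entries matters: moving the broader pattern to the top would change the result for purely numeric segments.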
Server and Client Side Code Tab
The Server and Client Side Code Tab contains options for configuring the extraction and exclusion of quoted strings and additional content within the web page to be translated.
Configuration Options | Description |
---|---|
Extract Quoted Strings | Web pages may contain JavaScript or server-side code such as PHP. You can decide whether the system will automatically extract quoted strings in code sections during translation. These configuration options are provided:
|
Exclude Quoted Strings | This configuration section may be used to enter specific text segments or regular expressions to exclude from the translation. For each piece of text (segment), the system looks for the texts or regular expression patterns entered in this configuration section. If a text/pattern matches, then it is possible to mark the segment as either translatable or not translatable. The system checks one pattern after the other until one matches. Text that matches none of the entered patterns is considered translatable. |
Include or Exclude Additional Content | This configuration section may be used to specify regular expressions to extract text inside code (JavaScript, etc.). The regular expressions are not limited to quoted strings but can capture anything. Each regex MUST contain capture groups named "pattern1", "pattern2", etc. Example: @(?<pattern1>.*?)@ will extract any text delimited by "@". |
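To see what the example pattern from Include or Exclude Additional Content captures, the following Python sketch runs it over a small code fragment. It makes several assumptions: the sample string and the `extract_named_groups` helper are hypothetical, and Python's `re` module writes named groups as `(?P<name>...)`, so the group syntax is adapted here while the group name `pattern1` stays the same. The sketch says nothing about how Wordbee Translator evaluates the expression internally.

```python
import re

# The page's example is @(?<pattern1>.*?)@. Python's re module requires the
# (?P<name>...) spelling for named groups, so only the group syntax differs.
PATTERN = re.compile(r"@(?P<pattern1>.*?)@")


def extract_named_groups(code: str) -> list[str]:
    """Collect text captured by groups named pattern1, pattern2, etc."""
    results = []
    for match in PATTERN.finditer(code):
        for name, value in match.groupdict().items():
            if name.startswith("pattern") and value is not None:
                results.append(value)
    return results


# Hypothetical code fragment with two "@"-delimited texts.
sample = "var title = @Welcome back@; var count = 3; var hint = @Click to continue@;"
extracted = extract_named_groups(sample)
print(extracted)  # ['Welcome back', 'Click to continue']
```

The non-greedy `.*?` is what keeps each capture between one pair of delimiters; a greedy `.*` would capture everything from the first "@" to the last one on the line.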
HTML Tags and Attributes Tab
Configuration Options | Description |
---|---|
Translatable Attributes | |
Non-Breaking Tags | |
Whitespace Preserving Tags | |
CMS Specific Settings Tab
Configuration Options | Description |
---|---|
For more information about working with file format configurations and creating custom configurations, please see these pages: