Upload a file and have it word counted and pre-translated. Then get back the translated file and word count details.
This method does not require a project.
Info |
---|
In most cases, the use of MT Hive should be preferred. MT Hive integrates with the online user interface where translated files can be downloaded and sent into post editing. |
URL
Code Block |
---|
(POST) /api/apps/tools/translatefiles/translate |
PARAMETERS
Tip: A helper method exists to get some parameters from a project: Word count profile, Document format profile, Project memory and attached resource IDs.
The method is: apps/tools/translatefiles/presets/project/{pid}
The body must contain a JSON object with these properties:
Original file | ||
fileToken | The token that references the original file to process. Use media/upload to upload your file and to obtain a token. NOTE: This method will invalidate the file token. | string, Mandatory |
src | Source locale. This is the locale of the original file. | string, Mandatory |
trg | Target locale. This is the locale into which to translate. | string, Mandatory |
formatId | Optional ID of a specific document format configuration to use. The configuration defines how to extract text from the original file. The system selects a configuration as follows:
|
To enumerate all configurations, see: Document formats | int?, Optional | ||
formatProfileId | Optional ID of a document format profile from which to select a format configuration. Profiles are configured online at Settings > Document format profiles. Tip: Use helper method: apps/tools/translatefiles/presets/project/{pid} To enumerate all profiles, see: Document format profiles |
| |
Translation and word count | |||
wordcountProfileId | Optional word count profile to use. The profile defines how to leverage memories, how to pre-translate from memories, whether to use MT and more. If not set, then the system selects the alphabetically first word count profile configured online at Settings > Word count profiles. Tip: Use helper method: apps/tools/translatefiles/presets/project/{pid} To enumerate word count profiles, see Word count profiles | int?, Optional | |
resourceIds | Optional list of translation/project memories or term bases to use for leveraging (as well as pre-translation). If not set, then the system will only pre-translate with MT (if a system is enabled in the word count profile). Tip: Use helper method: apps/tools/translatefiles/presets/project/{pid} to get the project memory ID and attached resource IDs for a reference project. | int[]?, Optional | |
disableMT | Optional. Default is false.
| bool?, Optional | |
copySourceToTarget | Optional. Default is true if not specified.
|
| bool?, Optional | |
Results | ||
buildTranslatedFile | Optional. Default is true if not specified. If true then the results will contain a file reference to download the translated file. Use false if you are only interested in the word count information. | bool?, Optional |
collectLeveragedHits | Optional. Default is false if not specified. If true, then the results will contain a file reference to download a JSON containing all the hits that were used for pre-translation of each segment. | bool?, Optional |
callbackurl, callback | Specify a URL which will be called upon success or failure of the operation. This makes polling for operation status unnecessary. See Callbacks (with asynchronous operations).
The URL is called as a POST request with the operation result included in the body; see the "RESULTS" chapter below for the JSON format.
Suggestion: Include your own references in the URL, example: http://callmeback.mycompany.com?operationid=22222&mydata=abcde | string?, Optional |
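The parameters above can be assembled into a request body along these lines. This is a minimal sketch, not an official client: the helper name `build_translate_request` and its `callback_base` / `callback_refs` arguments are hypothetical, and it assumes the callback URL is sent under the property name `callback` (the table lists both `callbackurl` and `callback`).

```python
import json
from urllib.parse import urlencode


def build_translate_request(file_token, src, trg, *,
                            callback_base=None, callback_refs=None, **options):
    """Assemble the JSON body for apps/tools/translatefiles/translate."""
    # The three mandatory properties.
    body = {"fileToken": file_token, "src": src, "trg": trg}
    # Optional properties: leave out anything set to None so that the
    # server-side defaults described in the table apply.
    body.update({key: value for key, value in options.items() if value is not None})
    if callback_base:
        # Put our own references into the callback URL, as the table suggests.
        query = urlencode(callback_refs or {})
        # ASSUMPTION: the property name "callback" is used here.
        body["callback"] = f"{callback_base}?{query}" if query else callback_base
    return body


body = build_translate_request(
    "178eee3996bd4898b31da69a4fe5b206", "en", "fr",
    wordcountProfileId=186,
    formatProfileId=9,
    formatId=None,  # dropped from the body because it is None
    resourceIds=[4630, 4401],
    callback_base="http://callmeback.mycompany.com",
    callback_refs={"operationid": 22222, "mydata": "abcde"},
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the POST body with any HTTP client.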
RESULTS
The operation may take more or less time depending on the amount of data to process. Therefore it is implemented as an asynchronous operation. The API method returns an Asynchronous operation result:
Code Block | ||
---|---|---|
| ||
{ "trm": { "requestid":32230, "status":"Waiting", "statusText":"Waiting..." } } |
You can poll the status or use the callback parameter. While the operation is not yet completed, poll every few seconds with the requestid. When the operation is complete (status = Finished), the results are in the custom property:
Code Block | ||
---|---|---|
| ||
{ "trm": { "requestid": 32230, "isbatch": false, "status": "Finished", "statusText": "Finished!" }, "result": { "items": [] }, "custom": { [ ****** RESULTS ****** ] } } |
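A polling loop for the asynchronous result might look like the sketch below. The status endpoint is not described on this page, so the HTTP call is left to a caller-supplied `fetch_status` function (hypothetical); the function name `wait_until_finished` and its arguments are likewise illustrative. Only the "Waiting" and "Finished" statuses appear here, so the sketch stops on "Finished" or after a maximum number of polls.

```python
import time


def wait_until_finished(fetch_status, request_id,
                        interval_secs=3.0, max_polls=100, sleep=time.sleep):
    """Poll until trm.status is "Finished", then return the custom object."""
    for _ in range(max_polls):
        # fetch_status is supplied by the caller and must return the parsed
        # JSON of the asynchronous operation result for this request ID.
        response = fetch_status(request_id)
        if response["trm"]["status"] == "Finished":
            return response.get("custom", {})
        sleep(interval_secs)  # wait a few seconds before polling again
    raise TimeoutError(f"Operation {request_id} not finished after {max_polls} polls")


# Demo with canned responses instead of real HTTP calls.
canned = iter([
    {"trm": {"requestid": 32230, "status": "Waiting", "statusText": "Waiting..."}},
    {"trm": {"requestid": 32230, "isbatch": False, "status": "Finished",
             "statusText": "Finished!"},
     "result": {"items": []},
     "custom": {"filetokenTranslation": "d11e5b4e5c444109a8f940ec7205b53f"}},
])
custom = wait_until_finished(lambda request_id: next(canned), 32230,
                             sleep=lambda secs: None)
print(custom)
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in production the default `time.sleep` applies.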
The custom property is a JSON object:
filetokenWordcount | The file token to download the word count results as a JSON document. The details are described here: projects/{pid}/wordcounts/{did}/{trg} but contain additional details. | string | ||
filetokenTranslation | The file token to download the translated file. | string | ||
fileTokenLeveragedHits | The file token to download a JSON with all hits used for pre-translation. Returned when the parameter collectLeveragedHits is true. Use media/get/{token} to download the JSON. See an example below this table. | string? | ||
parameters | The parameters used to word count and translate the file. These are those you submitted plus the default values populated. Example:
| object |
Leveraged hits
With the fileTokenLeveragedHits token you can download a JSON with a list of the hits that were used to pre-translate the segments.
Example:
Code Block |
---|
{ "de": [
{ "docNo": "1", "docSid": null, "sid": 27624594, "rid": 7837, "sim": 100, "ed": 1, "st": 0, "src": "en", "trg": "de", "stxt": "Good morning" },
{ "docNo": "1-2", "docSid": null, "sid": 27624595, "rid": 7837, "sim": 99, "ed": 1, "st": 0, "src": "en", "trg": "de", "stxt": "How are you doing" },
...
] } |
The key properties of each hit are:
docNo: The sequential segment number (string) in the document we translate: 1, 2, 3, 3-2, 3-3, etc.
docSid: Always null.
stxt: The hit’s source text
sid, rid: The hit’s numeric segment ID and resource ID
sim: The hit similarity
ed, st, src, trg: The hit’s last editor, status, source and target language
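To illustrate how these properties can be read, the sketch below lists the hits for one target locale and counts how many hits each resource contributed. The function name `summarize_hits` is hypothetical, and the sample data is shaped like the example on this page rather than taken from a real download.

```python
from collections import Counter


def summarize_hits(leveraged_hits, locale):
    """Return (docNo, sim, rid, stxt) rows and a hit count per resource ID."""
    hits = leveraged_hits.get(locale, [])
    rows = [(hit["docNo"], hit["sim"], hit["rid"], hit["stxt"]) for hit in hits]
    per_resource = Counter(hit["rid"] for hit in hits)
    return rows, per_resource


# Sample shaped like the leveraged hits JSON: the key is the target locale.
sample = {
    "de": [
        {"docNo": "1", "docSid": None, "sid": 27624594, "rid": 7837, "sim": 100,
         "ed": 1, "st": 0, "src": "en", "trg": "de", "stxt": "Good morning"},
        {"docNo": "1-2", "docSid": None, "sid": 27624595, "rid": 7837, "sim": 99,
         "ed": 1, "st": 0, "src": "en", "trg": "de", "stxt": "How are you doing"},
    ]
}

rows, per_resource = summarize_hits(sample, "de")
for doc_no, sim, rid, stxt in rows:
    print(f"segment {doc_no}: {sim}% hit from resource {rid}: {stxt}")
```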
EXAMPLES
We upload a Word file, see media/upload. We get back our fileToken, see below.
Code Block |
---|
POST /api/media/upload
BODY: multipart attachment of a Word file |
We submit the request:
Code Block |
---|
POST /api/apps/tools/translatefiles/translate
BODY:
{
"src": "en",
"trg": "fr",
"fileToken": "178eee3996bd4898b31da69a4fe5b206",
"wordcountProfileId" : 186,
"formatProfileId": 9,
"resourceIds": [ 4630, 4401 ]
}
|
This returns the status of the asynchronous operation. We poll for completion. Alternatively, we can include the callback URL parameter in the payload to be notified.
When the operation is finished, we get:
Code Block |
---|
{
"trm": {
"requestid": 0,
"isbatch": false,
"status": "Finished",
"statusText": "Finished!"
},
"result": {
"items": []
},
"custom": {
"filetokenWordcount": "424366598aa1472499a87a3434688ad2",
"filetokenTranslation": "d11e5b4e5c444109a8f940ec7205b53f",
"parameters": {
"fileToken": "178eee3996bd4898b31da69a4fe5b206",
"fileName": "sample.docx",
"src": "en",
"trg": "fr",
"formatProfileId": 9,
"formatId": null,
"wordcountProfileId": 186,
"resourceIds": [],
"disableMT": false,
"copySourceToTarget": true,
"buildTranslatedFile": true
}
}
} |
Download translated file:
Code Block |
---|
GET /api/media/get/d11e5b4e5c444109a8f940ec7205b53f |
Download word count details:
Code Block |
---|
GET /api/media/get/424366598aa1472499a87a3434688ad2 |
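The two download calls above can be derived from the finished result. The sketch below only builds the relative paths from the file tokens in the custom property; the helper name `download_paths` is hypothetical, and sending the GET requests (including authentication) is left to whatever HTTP client is in use.

```python
def download_paths(finished_response):
    """Map each file token found in the custom property to its media/get path."""
    custom = finished_response.get("custom", {})
    token_names = ("filetokenWordcount", "filetokenTranslation",
                   "fileTokenLeveragedHits")
    # Tokens that are missing or null (for example when the translated file
    # or the leveraged hits were not requested) are skipped.
    return {name: f"/api/media/get/{custom[name]}"
            for name in token_names if custom.get(name)}


finished = {
    "trm": {"requestid": 0, "isbatch": False, "status": "Finished",
            "statusText": "Finished!"},
    "result": {"items": []},
    "custom": {
        "filetokenWordcount": "424366598aa1472499a87a3434688ad2",
        "filetokenTranslation": "d11e5b4e5c444109a8f940ec7205b53f",
    },
}

paths = download_paths(finished)
for name, path in paths.items():
    print(f"{name}: GET {path}")
```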
The word count details are a JSON document (for a description of the properties, see projects/{pid}/wordcounts/{did}/{trg}):
Code Block |
---|
{
"segments": 1,
"words": 2,
"chars": 11,
"pages": null,
"minutes": null,
"wordsExcluded": 0,
"charsTranslated": null,
"wordsTranslated": null,
"wdPretransIdentical": 2,
"wdPretransIdenticalCtx": 0,
"wdPretransIdenticalPrevCtx": 0,
"wdPretransIdenticalPrev": 0,
"wdPretransIdenticalMT": 2,
"wdPretransFuzzy": 0,
"wd110": 0,
"wd100": 2,
"wdMatch1": 0,
"wdMatch2": 0,
"wdMatch3": 0,
"wdMatch4": 0,
"wdMatch5": 0,
"tags": 0,
"spaces": 1,
"punctuation": 1,
"nonAsianWords": 0,
"asianCharacters": 0,
"details": {
"counts": [
{
"locale": "fr",
"counts": {
"c": [
{
"o": 0,
"e": 5,
"s": 100,
"r": null,
"cc": 11,
"cw": 2,
"cs": 1
}
],
"dt": "2021-07-24T14:44:45.0783981Z"
}
}
],
"performance": {
"global_secs": 1,
"mt_secs": 0,
"markupfix_secs": 0,
"tm_secs": 0,
"tm_mode_full_secs": 0,
"tm_mode_full_cnt": 50,
"tm_mode_mixed_secs": 0,
"tm_mode_mixed_cnt": 0,
"tm_searches_ident_cnt": 0,
"tm_searches_ident_secs": 0,
"tm_searches_full_cnt": 0,
"tm_searches_full_secs": 0,
"tm_100_percent": 0.0
}
}
} |
TIPS & TRICKS
Machine translation
If you want to machine translate the files, make sure to assign a machine translation system in the word count profile specified by the parameter wordcountProfileId.
For testing purposes you might opt for the “Pseudo Translation” system which translates by converting text into uppercase letters.
If you would like to find out how much content was translated by machine then download the word count results (described further up). Look for the wdPretransIdenticalMT property which shows the total words translated by machine.
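As a rough illustration, the share of machine-translated words can be computed from the downloaded word count JSON. The function name `mt_share` is hypothetical, and the sketch assumes that the top-level `words` property is the total word count against which `wdPretransIdenticalMT` should be compared.

```python
def mt_share(wordcount):
    """Return the fraction of words pre-translated by machine translation."""
    # ASSUMPTION: "words" is the total number of words in the document.
    total_words = wordcount.get("words") or 0
    mt_words = wordcount.get("wdPretransIdenticalMT") or 0
    # Avoid dividing by zero for empty documents.
    return mt_words / total_words if total_words else 0.0


# Excerpt of the word count details from the example on this page.
wordcount = {"segments": 1, "words": 2, "chars": 11, "wdPretransIdenticalMT": 2}
share = mt_share(wordcount)
print(f"{share:.0%} of the words were machine translated")
```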
XLIFF translation
XLIFF files may already contain translations. If you do not want the existing translations to be replaced by machine translations, make sure to tick this option in the XLIFF configuration you are using:
...
You may also want to consider ticking the option below. It extracts the status and origin of the translation:
...