New sample: Internal or external supplier

Creates a new random sample for jobs of a specific supplier. Use cases:

  • Sample segments from jobs of an internal or external supplier.
  • Evaluate the quality of work from the sample.
  • Choose a date range for the sample. This lets you evaluate quality for different periods and detect trends. 

URL

(POST) /resources/segments/sampling/new

PARAMETERS

The parameters are a JSON object included in the request body:

type

Value must be: Supplier

Mandatory, string

cid

The supplier company id. See companies/list to list all companies or find companies by name.

Mandatory, int
uid

The optional supplier user id. Specify it for internal workers, in addition to the company id of your platform company.

Optional, int?
dsid

Optional project memory resource id. Use if you want to get segments for the supplier in a specific project.

To enumerate or find projects use projects/list

Optional, int?

dtfrom

dtto

Selects supplier's jobs completed between dtfrom and before dtto. Both dates are UTC.

Mandatory, datetime
src

The source locale (language code). Only jobs in this source language are considered.

Mandatory, string
trg

The target locale (language code). Only jobs in this target language are considered.

Mandatory, string
tsk

The task code such as "TR", "RV" etc. Only such jobs are considered.

Mandatory, string
size

The expected sample size. Default is 10.

This must be a value between 1 and 50.

Optional, int?
layout

Optionally specify the segments' fields to include in the results. This is done using a layout JSON object.

If not specified, the system will include:

  • Segment level properties such as all IDs, custom fields and labels
  • Source language text, flags, custom fields and labels
  • Source text related comments
  • Target language text, flags, custom fields and labels
  • Target text related comments
  • Target text revisions.
Optional, object?
persist

Optional boolean. Default is false.

Only set to true if required. If true, then the results are temporarily saved and assigned a token (see sampletoken in results).
You need this token when using the QA workflow API methods in order to create a workflow/jobs for the sample.


Optional, bool?
includeresults

Optional boolean. Default is true. If true then the returned JSON includes the result node. Otherwise only the summary statistics are returned.

If you further process results using the sampletoken you may not need the results with this call.

Optional, bool?
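
The constraints in the table above can be checked on the client before sending the request. The following is a minimal sketch, not part of the API: the function name check_sampling_params and the wording of the messages are hypothetical, and only the rules stated above (mandatory parameters, the fixed type value, the size range) are checked.

```python
# Parameters marked "Mandatory" in the table above.
MANDATORY = ("type", "cid", "dtfrom", "dtto", "src", "trg", "tsk")


def check_sampling_params(body: dict) -> list:
    """Return a list of problems found in the request body (empty if none)."""
    problems = [
        "missing mandatory parameter: " + key
        for key in MANDATORY
        if body.get(key) is None
    ]
    # The type parameter accepts a single value for this method.
    if body.get("type") not in (None, "Supplier"):
        problems.append('type must be "Supplier"')
    # size is optional (default 10) but must lie between 1 and 50 when given.
    size = body.get("size")
    if size is not None and not 1 <= size <= 50:
        problems.append("size must be between 1 and 50")
    return problems
```

An empty list means the body passes these client-side checks. The server may still reject the request, for example when no job matches.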

 

You can further fine tune the sample with these additional parameters:


Filter options
editorInitial

Optional filter on the initial translation done. Values are:

  • Any: No filter. Equivalent to omitting the property.
  • MachinePretranslation: The initial translation was a machine translation. Use this to get a sample of post-edits.
  • MemoryPretranslation: The initial translation came from a translation memory or a previous document version. Use this to get a sample of post-edits.
  • NoPretranslation: The initial translation is not a pretranslation (machine, memory or previous document version).
  • Human: The initial translation was imported from XLIFF or other file formats and marked as human translated by the respective file filter.


Optional, string?
editorCurrent

Optional filter on the current translation. Values are:

  • Any: No filter. Equivalent to omitting the property.
  • MachinePretranslation: The current version of the translation is a machine translation and was never post-edited by a human. Use this filter to verify that the machine translation is of sufficient quality and did not require correction.
  • MemoryPretranslation: The current version of the translation is a memory pretranslation and was never post-edited by a human. Use this filter to verify that the leveraged translation is of sufficient quality and did not require correction.
  • NoPretranslation: The current version of the translation is not a pretranslation (and thus either a human translation or an automatic markup fix).
  • Human: The current version of the translation was done by a human.


Optional, string?
dteditfrom

Optional filter on the date of last translation edit. If set, the sample will include translations edited at or after this date only.


Optional, datetime?
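
As an illustration of the filter options, the sketch below adds them to an existing request body. The helper name add_editor_filters is hypothetical; the property names and allowed values are the ones listed above.

```python
# Allowed values for editorInitial and editorCurrent, as listed above.
EDITOR_FILTER_VALUES = {
    "Any",
    "MachinePretranslation",
    "MemoryPretranslation",
    "NoPretranslation",
    "Human",
}


def add_editor_filters(body, initial=None, current=None, edited_from=None):
    """Return a copy of body with the optional filter properties set."""
    out = dict(body)
    for key, value in (("editorInitial", initial), ("editorCurrent", current)):
        if value is None:
            continue  # leaving the property out is equivalent to "Any"
        if value not in EDITOR_FILTER_VALUES:
            raise ValueError("%s: unknown value %r" % (key, value))
        out[key] = value
    if edited_from is not None:
        out["dteditfrom"] = edited_from
    return out
```

For example, add_editor_filters(body, initial="MachinePretranslation", current="Human") requests segments that started as machine translation and were then edited by a human, which gives a sample of post-edits.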




Scoring options

boostWordsMin

boostWordsMax

This option lets you express a preferred word count for the segments to retain. Segments with a word count in the preferred range are then sampled with a higher probability than segments with fewer or more words. Word counts refer to the source text, not the translated text.

  • boostWordsMin: The minimum preferred number of words in the segment.
  • boostWordsMax: The maximum preferred number of words in the segment. Optional.

Explanation:

If min is 10 and max is 15, the system samples more segments whose word count falls within this range than other segments. Mathematically, the probability decreases below min and above max following a Gaussian curve: it drops below 0.2 once the word count lies a certain distance outside the limits (a distance between 3 words and twice the range width).



Optional, int?
int?
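
To make the scoring explanation more concrete, here is a sketch of a Gaussian falloff of this kind. It is not the server's actual formula, which is not documented here. Assumptions: the weight is 1.0 inside the preferred range, both limits are given, and the weight reaches exactly 0.2 at a distance of max(3, 2 × range width) words outside the range. The function name boost_weight is hypothetical.

```python
import math


def boost_weight(words: int, boost_min: int, boost_max: int) -> float:
    """Relative sampling weight of a segment with the given source word count."""
    if boost_min <= words <= boost_max:
        return 1.0  # inside the preferred range: full weight
    width = boost_max - boost_min
    # ASSUMPTION: distance (in words) at which the weight has dropped to 0.2.
    falloff = max(3, 2 * width)
    # Choose sigma so that exp(-falloff^2 / (2 * sigma^2)) equals 0.2.
    sigma = falloff / math.sqrt(2 * math.log(5))
    # Distance to the nearest limit of the preferred range.
    distance = boost_min - words if words < boost_min else words - boost_max
    return math.exp(-(distance ** 2) / (2 * sigma ** 2))
```

With boostWordsMin 10 and boostWordsMax 15, this sketch gives full weight to segments of 10 to 15 words and a steadily lower weight the further the word count moves away from that range.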


RESULTS

NOTE: If parameters are invalid or no job matches the filters, you get an error message. See the JOB SELECTION section for how jobs are selected.

A JSON with these properties:

samples

An array of samples. The present method produces a single sample, so there is always exactly 1 element in the array.

See table below for properties.

object[]
sampletoken

If persist was set to true, then this field is a token. It is required to push the sample into a QA evaluation workflow (see related API methods).

string?


Each samples array element has these properties:

segments

Total segments in sample. Note that this number will be less than the expected sample count if there is no or not enough data or the filter is too restrictive.

int
words

Total source text words in sample.

int
tsk

The task code such as "TR", "RV" etc. of the sample.

string
src

The source language of the sample.

string
trg

The target language of the sample.

string
cid

The supplier company id.

int
uid

The internal supplier person id. Null with external suppliers.

int?
dtfrom, dtto

The date range of the sampled jobs. We recommend that you format dates with a timezone indication or "Z" for UTC. For example:

  • 2018-06-12T18:45:15.0000000-05:00
  • 2009-04-14T16:19:58.0785000Z

You can however also provide simpler dates such as "2018-10-10".

datetime?
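
A small sketch of producing date strings with an explicit timezone indication, using only the Python standard library. The helper name to_api_datetime is hypothetical. It emits ISO 8601 text and writes UTC as "Z"; the number of fractional-second digits differs from the examples above, which is assumed to be acceptable.

```python
from datetime import datetime, timedelta, timezone


def to_api_datetime(dt: datetime) -> str:
    """Format a timezone-aware datetime with an offset, or "Z" for UTC."""
    if dt.tzinfo is None:
        # A naive datetime carries no timezone, so refuse to guess one.
        raise ValueError("naive datetime: attach a timezone first")
    text = dt.isoformat()
    # isoformat() writes UTC as "+00:00"; replace it with the shorter "Z".
    return text[:-6] + "Z" if text.endswith("+00:00") else text


utc_start = to_api_datetime(datetime(2018, 10, 10, tzinfo=timezone.utc))
local_start = to_api_datetime(
    datetime(2018, 6, 12, 18, 45, 15, tzinfo=timezone(timedelta(hours=-5)))
)
```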



result

Contains all the segments in the sample, information on the resources to which the segments belong as well as worker names.

object
result.rows

The list of segments.

Includes main segment properties as well as the data columns specified in the layout parameter.

The format is explained further down in this page.

object[]
result.docs

A dictionary with all documents that appear in the results.

This lets you show document names and other information per segment (see the did property of a segment).

The format is explained further down in this page.

object
result.users

A dictionary with all users/persons that are referenced by the segments included with the results.

A segment references the persons that have last changed a text, a status, a bookmark etc.

The format is explained further down in this page.

object
columns

An array with the columns in the result.rows property. Each array element describes one column; see Spreadsheet Column (Object).

object[]


JOB SELECTION

You receive an error message if your filters did not select any job. This is how jobs are selected:

  • The job must be completed
  • The job completion date must be between dtfrom and dtto
  • The job source / target language must exactly match parameters src and trg
  • The job task type must exactly match parameter tsk
  • The job must be assigned to company cid
  • If you set optional parameter uid then the jobs must further be assigned to this person
  • If you set optional parameter dsid then the jobs must be part of the project or more precisely the project memory resource 
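
The selection rules above can be summarised as a single predicate. This is an illustrative sketch only, not the server implementation. The Job class and its field names are hypothetical, the person filter is read from the uid request parameter, and treating dtfrom as inclusive and dtto as exclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Job:  # hypothetical shape of a job record
    completed: Optional[datetime]  # None while the job is not completed
    src: str
    trg: str
    tsk: str
    cid: int
    uid: Optional[int]
    dsid: int


def is_selected(job: Job, params: dict) -> bool:
    """Apply the job selection rules to one job."""
    if job.completed is None:
        return False  # the job must be completed
    # ASSUMPTION: dtfrom is inclusive, dtto is exclusive.
    if not (params["dtfrom"] <= job.completed < params["dtto"]):
        return False
    # Languages and task type must match exactly.
    if (job.src, job.trg, job.tsk) != (params["src"], params["trg"], params["tsk"]):
        return False
    if job.cid != params["cid"]:
        return False
    # Optional filters only apply when they are set.
    if params.get("uid") is not None and job.uid != params["uid"]:
        return False
    if params.get("dsid") is not None and job.dsid != params["dsid"]:
        return False
    return True
```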


ACCESS RIGHTS

The user must have administrator or manager level credentials.


EXAMPLE

Here we want to randomly sample 2 segments from German to English translation jobs done by the company with ID 111. We could have set the "uid" parameter to further drill down to a specific user.

We prefer source texts of between 15 and 20 words. Segments in that word range are then more likely to appear in the sample.

We do not specify other optional parameters such as the layout. If the latter is not set, the system returns by default the columns for source text, translation, comments and translation revisions.

POST /resources/segments/randomsample/new
BODY:
{
  "type": "Supplier",
  "cid": 111,
  "src": "de",
  "trg": "en",
  "tsk": "TR",
  "dtfrom": "2018-10-10",
  "dtto": "2019-10-10",
  "size": 2,
  "boostWordsMin": 15,
  "boostWordsMax": 20
}
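
A sketch of preparing this request with the Python standard library. Assumptions: the base URL is a placeholder, the default path is the one given in the URL section, and any authentication headers your platform requires are not shown and must be added. The function name build_sampling_request is hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://example.invalid/api"  # placeholder, replace with your API root


def build_sampling_request(body, base_url=BASE_URL,
                           path="/resources/segments/sampling/new"):
    """Build (but do not send) the POST request carrying the JSON parameters."""
    return urllib.request.Request(
        url=base_url + path,
        data=json.dumps(body).encode("utf-8"),  # parameters go in the body
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sampling_request({
    "type": "Supplier",
    "cid": 111,
    "src": "de",
    "trg": "en",
    "tsk": "TR",
    "dtfrom": "2018-10-10",
    "dtto": "2019-10-10",
    "size": 2,
    "boostWordsMin": 15,
    "boostWordsMax": 20,
})
```

Once the required authentication headers are added, the request can be sent with urllib.request.urlopen(request) and the response body parsed with json.loads.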


The method returns the requested sample:

{
  "samples": [
    {
      "segments": 2,
      "words": 35,
      "src": "de",
      "trg": "en",
      "tsk": "TR",
      "cid": 1,
      "uid": null,
      "dtfrom": "2018-10-10T00:00:00",
      "dtto": "2019-10-10T00:00:00",
      "dsid": null,
      "result": {
        "rows": [
          {
            "no": "19",
            "sid": 4786921,
            "did": 7394,
            "dsid": 65,
            "cty": 1,
            "sdid": null,
            "bsid": 19,
            "bssid": 0,
            "edit": true,
            "tags": null,
            "tmx": [],
            "ctx": "h3",
            "ctx_edit": true,
            "chmin": null,
            "chmax": null,
            "ch_edit": true,
            "lbls": [],
            "lbls_edit": true,
            "cfs": [],
            "cfs_edit": true,
            "cols": {
              "_0": {
                "column": 0,
                "txt": {
                  "val": null,
                  "st": 0,
                  "bk": 0,
                  "tsk": null,
                  "loc": "de",
                  "cmc": null,
                  "ed": 0,
                  "usid": null,
                  "usdt": null,
                  "hh": false,
                  "sim": 0,
                  "err": null,
                  "lck": false,
                  "lck_edit": false,
                  "hn": null,
                  "hp": null,
                  "cfs": [],
                  "cfs_edit": false,
                  "lbls": [],
                  "lbls_edit": true,
                  "usfid": null,
                  "usfdt": null
                },
                "txt_edit": false
              },
              "_1": {
                "column": 1,
                "txt": {
                  "val": "How can I work as an external translator for the European Parliament?",
                  "st": 0,
                  "bk": 0,
                  "tsk": null,
                  "loc": "en",
                  "cmc": 0,
                  "ed": 0,
                  "usid": null,
                  "usdt": "2018-07-05T17:23:20.8830832Z",
                  "hh": false,
                  "sim": 0,
                  "err": null,
                  "lck": false,
                  "lck_edit": false,
                  "hn": 458050321,
                  "hp": 898309661,
                  "cfs": [],
                  "cfs_edit": false,
                  "lbls": [],
                  "lbls_edit": true,
                  "usfid": null,
                  "usfdt": null,
                  "tmx": []
                },
                "txt_edit": false
              },
              "_2": {
                "column": 2,
                "revs": [
                  {
                    "ty": "text",
                    "current": true,
                    "val": "How can I work as an external translator for the European Parliament?",
                    "tsk": null,
                    "ed": 0,
                    "dt": "2018-07-05T17:23:20.8830832Z",
                    "uid": null,
                    "loc": "en",
                    "mk": null
                  }
                ],
                "revs_edit": false
              },
              "_3": {
                "column": 3,
                "cms": [],
                "cm_edit": false
              },
              "_4": {
                "column": 4,
                "cms": [],
                "cm_edit": false
              }
            }
          },
          {
            "no": "12",
            "sid": 4786914,
            "did": 7394,
            "dsid": 65,
            "cty": 1,
            "sdid": null,
            "bsid": 12,
            "bssid": 0,
            "edit": true,
            "tags": null,
            "tmx": [],
            "ctx": "p",
            "ctx_edit": true,
            "chmin": null,
            "chmax": null,
            "ch_edit": true,
            "lbls": [],
            "lbls_edit": true,
            "cfs": [],
            "cfs_edit": true,
            "cols": {
              "_0": {
                "column": 0,
                "txt": {
                  "val": null,
                  "st": 0,
                  "bk": 0,
                  "tsk": null,
                  "loc": "de",
                  "cmc": null,
                  "ed": 0,
                  "usid": null,
                  "usdt": null,
                  "hh": false,
                  "sim": 0,
                  "err": null,
                  "lck": false,
                  "lck_edit": false,
                  "hn": null,
                  "hp": null,
                  "cfs": [],
                  "cfs_edit": false,
                  "lbls": [],
                  "lbls_edit": true,
                  "usfid": null,
                  "usfdt": null
                },
                "txt_edit": false
              },
              "_1": {
                "column": 1,
                "txt": {
                  "val": "In addition to conventional reference works, all translators have access to the Internet and to Intranet resources to check terms, expressions and facts.",
                  "st": 0,
                  "bk": 0,
                  "tsk": null,
                  "loc": "en",
                  "cmc": 0,
                  "ed": 0,
                  "usid": null,
                  "usdt": "2018-07-05T17:23:20.8820751Z",
                  "hh": false,
                  "sim": 0,
                  "err": null,
                  "lck": false,
                  "lck_edit": false,
                  "hn": 569877624,
                  "hp": 1481667450,
                  "cfs": [],
                  "cfs_edit": false,
                  "lbls": [],
                  "lbls_edit": true,
                  "usfid": null,
                  "usfdt": null,
                  "tmx": []
                },
                "txt_edit": false
              },
              "_2": {
                "column": 2,
                "revs": [
                  {
                    "ty": "text",
                    "current": true,
                    "val": "In addition to conventional reference works, all translators have access to the Internet and to Intranet resources to check terms, expressions and facts.",
                    "tsk": null,
                    "ed": 0,
                    "dt": "2018-07-05T17:23:20.8820751Z",
                    "uid": null,
                    "loc": "en",
                    "mk": null
                  }
                ],
                "revs_edit": false
              },
              "_3": {
                "column": 3,
                "cms": [],
                "cm_edit": false
              },
              "_4": {
                "column": 4,
                "cms": [],
                "cm_edit": false
              }
            }
          }
        ],
        "docs": {
          "_7394": {
            "did": 7394,
            "dsid": 65,
            "name": "ElasticSearch_V6.htm",
            "pmax": null,
            "pmin": null,
            "ptype": 1,
            "pdomain": "HTML",
            "previewapp": null,
            "previewurl": null,
            "edit": true,
            "ctags": [
              "[b]",
              "[/b]",
              "[i]",
              "[/i]",
              "[u]",
              "[/u]",
              "[s]",
              "[/s]",
              "[sup]",
              "[/sup]",
              "[sub]",
              "[/sub]",
              "[nbsp]/"
            ],
            "sub": []
          }
        },
        "users": {}
      },
      "columns": [
        {
          "index": 0,
          "fkey": "1~de~0",
          "fkeyLayout": "1~de~0",
          "ftype": 1,
          "fqualifier": 0,
          "name": "Allemand",
          "loc": "de",
          "loc_rtl": false,
          "loc_cmplx": false,
          "loc_ea": false
        },
        {
          "index": 1,
          "fkey": "1~en~0",
          "fkeyLayout": "1~en~0",
          "ftype": 1,
          "fqualifier": 0,
          "name": "Anglais",
          "loc": "en",
          "loc_rtl": false,
          "loc_cmplx": false,
          "loc_ea": false
        },
        {
          "index": 2,
          "fkey": "12~en~0",
          "fkeyLayout": "12~en~0",
          "ftype": 12,
          "fqualifier": 0,
          "name": "Revisions - Anglais",
          "loc": "en",
          "loc_rtl": false,
          "loc_cmplx": false,
          "loc_ea": false
        },
        {
          "index": 3,
          "fkey": "9~de~0",
          "fkeyLayout": "9~de~0",
          "ftype": 9,
          "fqualifier": 0,
          "name": "Comments - Allemand",
          "loc": "de",
          "loc_rtl": false,
          "loc_cmplx": false,
          "loc_ea": false
        },
        {
          "index": 4,
          "fkey": "9~en~0",
          "fkeyLayout": "9~en~0",
          "ftype": 9,
          "fqualifier": 0,
          "name": "Comments - Anglais",
          "loc": "en",
          "loc_rtl": false,
          "loc_cmplx": false,
          "loc_ea": false
        }
      ]
    }
  ]
}
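
A sketch of reading the segment texts out of a response shaped like the one above. The function name sample_texts is hypothetical. It assumes, based on this example only, that columns with "ftype": 1 are the text columns and that each row's cols dictionary uses the key "_" followed by the column index.

```python
def sample_texts(response: dict) -> list:
    """Return (sid, source text, target text) tuples for every sampled segment."""
    out = []
    for sample in response.get("samples", []):
        # Map each language to its text column key, for example "de" -> "_0".
        # ASSUMPTION: ftype 1 marks text columns, as seen in the example.
        key_by_loc = {
            col["loc"]: "_%d" % col["index"]
            for col in sample.get("columns", [])
            if col.get("ftype") == 1
        }
        src_key = key_by_loc.get(sample.get("src"))
        trg_key = key_by_loc.get(sample.get("trg"))
        # result is absent when includeresults was set to false.
        result = sample.get("result") or {}
        for row in result.get("rows", []):
            cols = row.get("cols", {})
            src = cols.get(src_key, {}).get("txt", {}).get("val")
            trg = cols.get(trg_key, {}).get("txt", {}).get("val")
            out.append((row.get("sid"), src, trg))
    return out
```

Applied to the example response, this yields one tuple per row, with the English translation as the third element. The source text is None because "val" is null in the example.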


Copyright Wordbee - Buzzin' Outside the Box since 2008