XSLT style sheets (.xsl, .xslt) can be used to convert XML files (and also XLIFF files), which are often complex, into nice-looking web pages.

This transformation gives you better-looking, easier-to-grasp previews of translated documents: instead of showing raw XML to translators, you show a nicely formatted web page.

Example

How it works: Open an XML file in the new translation editor. Then click the "Preview" menu and select "Preview target document":


Regular XML preview | Preview XML with a stylesheet

You get text back as an HTML file, ready to be edited or translated.


Image to text sample

Let us look at a sample image and the text extracted from it. Yes, you can copy and paste the extracted text; it is no longer an image.

...

Uploaded image:

Image Removed

...

Extracted text:

好一朵美丽的茉莉花
好一朵美丽的茉莉花
芬芳美丽满枝桠
又香又白人人夸
让我来将你摘下
送给别人家
茉莉花呀茉莉花

Convert image files to text

Go to a project and open the document library. Upload or drag & drop your image file:

Image Removed

Click images to select one or more images. Then click the OCR link above the files. The OCR tool opens:

Image Removed

Choose one of the OCR systems and hit "Process files". The results are saved next to the images:

Image Removed

To quickly check whether the text was properly recognized, right-click one of the HTML files and select the "Open With" > "Web Browser" option:

Image Removed

Handwritten text (English only)

One of the integrated OCR systems, provided by Microsoft, can extract handwritten English text.

Image Removed

Sample image:

Image Removed

Converted to:

Code Block
chapter
Mr. Sherlock Holmes
In the year 1878 I took my degree of Doctor
of medicine of the university of London
and proceeded to Netley to go through
the course precribed for surgeons . Having
completed my studies there . I was duly attached
to the Fifth Northumberland Fusiliers as
Assistant
Surgeon

Enabling OCR systems

The OCR technology is provided by Google and Microsoft (more systems may be added in the future). You need to obtain credentials from either of them. Both offer free plans that let you convert roughly 5000 images per month. Beyond that you are charged, but the cost is very reasonable.

Go to Settings > Image to Text (OCR) and enable the systems you need:

Image Removed

That page gives all the information needed.

Info

The sign-up process with Google and Microsoft is not the most intuitive. If you have any questions, please contact our support team.

What about PDF files?

If the PDF was created with a word processor and is not a scan, then you can use the PDF converter tool in the project page.

If the PDF contains scanned pages, then you could save the individual pages as image files. This can be done with a screenshot tool.

...


Building style sheets and finding help

A style sheet is an XML file with instructions on how to convert an XML document into another format, usually HTML. For our sample above we created this simple sample1.xsl file:

Code Block (XML)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="p">
  <tr><td><xsl:value-of select="." /></td></tr>
</xsl:template>

<xsl:template match="/">
  <html>
  <body>
  <h3>My sample document</h3>
  <table style="border-collapse: collapse; border: 1px solid black; width: 500px;">
    <tr bgcolor="#9acd32">
      <th>Paragraphs</th>
    </tr>
    <!-- Each p template emits its own complete table row, so no extra tr/td wrapper is needed here -->
    <xsl:apply-templates select="//p"/>
  </table>
  </body>
  </html>
</xsl:template>

</xsl:stylesheet>
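
To make the mapping concrete, here is a minimal sketch of an input file that a style sheet like sample1.xsl could be applied to. The file content and the document and section element names are hypothetical and only for illustration; the style sheet itself only cares about the p elements, wherever they appear.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input file: only the <p> elements matter to sample1.xsl,
     the <document> and <section> wrappers are illustrative assumptions -->
<document>
  <section>
    <p>First paragraph to translate.</p>
    <p>Second paragraph to translate.</p>
  </section>
</document>
```

With this input, the template that matches "/" builds the HTML page and the table header, and the template that matches "p" is applied once per paragraph, so each paragraph ends up in its own table cell. Any text outside p elements would not appear in the preview, because the style sheet only selects //p.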


Style sheets are anything but easy to build; let's not hide the truth. Fortunately, there are good tutorials out there:

You can also find free online tools to test and validate your XSL files:


Configuring your XML filter

Once you have created and tested your style sheets you are ready to assign them to your XML filters.

Go to Settings > XML files and choose your filter. You will see the Web preview tab and a style sheet selector.

Image Added

If this is the first time you use the feature, the selector will be empty: you still need to upload your first style sheets.

Click the small open library link above and create an Xslt folder:

Image Added

Upload and edit all your style sheets from here. Once your style sheet is added, assign it to your XML filter.

Ready to go

When you now mark an XML file for translation, select the XML filter with your style sheet setting.

The system will automatically apply the style sheet when the file is previewed in the new translation editor, exactly as the example at the top of the page demonstrates.

Image Added