You can view earlier editions of the Tool Box Journal going all the way back to 2007
in the archives to which you have access if you support my work on the Journal.


 A computer journal for translation professionals


Issue 20-1-308
(the three hundred eighth edition)  

Contents

1. I told you so!

2. TransTools

3. This is Intent(ed) to Clarify Things

The Last Word on the Tool Box Journal

Characters with Characters

This Tool Box Journal ended up being much longer than planned. And so, rather than a separate introduction, here are my (admittedly very trivial) thoughts on the appearance of alphabets. Particularly of a language that most of you might not be familiar with: Northern Emberá.

Recently I read a book comparing the Latin and Arabic alphabets. Among several differences, one is directional: the Latin alphabet is vertical, while the Arabic alphabet is horizontal, at least according to author and font designer Rana Abou Rjeily. I had never thought about that characteristic before, but I was struck by how correct her assessment felt. Then again, that can so easily be changed with just a few letters in a Latin-character language like Northern Emberá, which uses the "combining long stroke overlay" for the letters b, d, and u (and plenty of tilde'd characters). According to my visual perception, this completely alters the directionality of the written language, as seen here in the first couple of sentences of the Gospel of Luke. (Yes, these characters are a pain to enter, and true, they unfortunately don't display well on many computers.)

[Image: the opening sentences of the Gospel of Luke in Northern Emberá]

You can also view this as part of my collection of characters right here.

ADVERTISEMENT

British scientists say: Tool Box Journal readers are smarter than the rest. We have not checked it, but we have a bonus NY resolution for you: Become more organized and productive this year. Work smart, not hard, with AnyCount, TO3000, and Projetex!

Win a discount from 25% to 90% with the Wheel of Fortune. Try your luck at translation3000.com/aitpn/557-78.html.

The wheel stops on January 20 -- get it or regret it!

 

1. I told you so!

I know we've always tried not to say this to our kids, but after reading the article I wrote in issue 297 of the Tool Box Journal about ModernMT, it's really hard not to at least think it. I described a refreshingly different and capable machine translation system (I'll give a short overview down below) with two major problems: the outrageously high pricing and the completely outdated privacy concept that used any customers' data for general training purposes. (Davide Caroselli, ModernMT's VP of product, whom I talked to for both that article and this, defended their privacy concept by saying: "You're right, we also find it difficult at first to explain the privacy issue to our customers.")

Well, I'm pleased to announce that they have completely overhauled both their pricing and their privacy considerations. There is no longer any cross-training, and any data that you upload to enhance your own machine translation will be strictly used by you only (see here). Kind of what you'd expect from a paid product these days. (By the way, the data is stored in their own data center in Italy, and they'll be adding a larger data center soon in the US to accommodate AirBnB, their flagship client.)

The pricing is in line with other tools. You pay by the number of characters as an LSP or translation buyer (between $8 and $50 per million characters, depending on whether you want to train the engine as you translate or queue documents for a batch translation) and a monthly fee if you are a freelance translator ($25).
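To get a feel for what per-character pricing means in practice, here is a back-of-the-envelope calculation in Python. The project size of 300,000 characters is an invented example; only the two rates ($8 and $50 per million characters) come from the figures above, and which end of the range applies depends on how you use the engine.

```python
def cost_usd(characters, rate_per_million):
    """Cost of translating a number of characters at a per-million-character rate."""
    return characters / 1_000_000 * rate_per_million

project_chars = 300_000  # hypothetical project size, for illustration only

low = cost_usd(project_chars, 8)    # lower end of the quoted range
high = cost_usd(project_chars, 50)  # upper end of the quoted range
print(f"${low:.2f} to ${high:.2f}")
```

For a project of that size, the bill would land somewhere between roughly two and fifteen dollars.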

ModernMT differs from its competitors (at least the ones off the shelf) because you adapt it to your own data and then continue to adapt it without actually spending time to train it. Its base engine is trained on the large data sets collected by Translated, the Italian LSP and tech developer of the massive MyMemory TM (Translated also owns a majority of the shares in ModernMT). And while the baseline engines are retrained once or twice a year, your data effectively sits in the middle and adapts the MT suggestions to your style and terminology. The system uses a technology called "instance-based adaptive NMT," which sends translation requests to a TM layer (which can consist of even a relatively small TM, as long as it's highly tuned). Once similar segments are found in that TM layer, the NMT engine's "hyperparameters" are adapted on-the-fly to generate a more suitable suggestion. (The concept is based on this paper by the Fondazione Bruno Kessler.)
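To make the first half of that process a bit more tangible, here is a minimal Python sketch of the TM lookup step only. This is not ModernMT's code: the tiny sample TM, the function name `find_similar_segments`, and the 0.6 threshold are all invented for illustration, and the similarity measure is simply the standard library's `difflib`. The actual on-the-fly adaptation of the engine is not shown.

```python
from difflib import SequenceMatcher

# Hypothetical mini translation memory: (source, target) pairs.
TM = [
    ("Click the Save button.", "Klicken Sie auf die Schaltfläche Speichern."),
    ("Click the Cancel button.", "Klicken Sie auf die Schaltfläche Abbrechen."),
    ("The weather is nice today.", "Das Wetter ist heute schön."),
]

def find_similar_segments(query, tm, threshold=0.6):
    """Return TM entries whose source resembles the query, best match first."""
    scored = []
    for source, target in tm:
        # Character-based similarity between 0.0 and 1.0 (an assumed measure).
        score = SequenceMatcher(None, query.lower(), source.lower()).ratio()
        if score >= threshold:
            scored.append((score, source, target))
    return sorted(scored, reverse=True)

matches = find_similar_segments("Click the Print button.", TM)
for score, source, target in matches:
    print(f"{score:.2f}  {source}  ->  {target}")
```

In this toy example the two "Click the ... button." segments are retrieved and the unrelated weather sentence is not. In an adaptive system, matches like these are what would steer the engine toward your terminology.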

I asked Davide how many freelance translators are using the tool with the old payment plan, and not surprisingly it's only a small handful. I've talked to one who praised the translation quality as far superior to other engines, a conclusion that Intento has also reached in one of their reports (see here -- note that they have also worked for ModernMT so there might be a conflict of interest). Either way, if you are inclined to work with adaptive MT, this might be worth a try. Presently you can use it directly within SDL Trados Studio with this app, and of course in MateCat (owned by Translated) or via Intento (see elsewhere in this Journal).

 

ADVERTISEMENT

Speed up Your Translation Processes with Across v7

The new version of the Across Language Server and the Across Translator Edition is now available! We have addressed numerous subject areas in order to improve the user-friendliness, to reduce time to market, and to enable new working styles.

Get your Across Translator Edition v7 

 

2. TransTools

When I (once again) decided to write about TransTools, I realized I had (once again) forgotten why I had shied away from it previously. There is so much to say about it! In fact, it's not easy to know where to start.

Maybe we should start with its creator, Stanislav Okhvat, or as he might also be called, Gyro Gearloose (aka عبقرينو, Daniel Düsentrieb, Georg Gearløs, Géo Trouvetou, Lang Ling Lung, Archimede Pitagorico, 자 이로기어루스, Κύρος Γρανάζης, Sriegas Bevarztis, Szaki Dani, Oppfinnar-Jocke, Pelle Peloton, Professor Pardal, Diodak, Petter Smart, or really in this case: Винт Разболтайло), the endearing inventor in Disney's world (though Stanislav is less error-prone than Винт).

Stanislav used to be a technical translator, but a few years ago he switched to focusing on tool development and consulting. He had good reasons for the change. By that point he had already amassed a large number of tools that were developed with translators, editors, and LSPs in mind -- though I would venture to say they have the potential to be very helpful for others as well.

Presently, Stanislav offers a couple of single-purpose products (Term Morphology Editor -- I wrote about that in edition 298 of the Tool Box Journal -- and the Excel File Splitter for splitting Excel files to distribute evenly among translators and later merge them again) and two products with collections of utilities (little programs with specific purposes) that mostly integrate into MS Word and other MS Office programs: TransTools and TransTools+. What might be a little confusing (and will be "fixed" at some point in the future) is that TransTools+ is not necessarily the "pro" edition of TransTools; in fact, they can both be run side by side, complementing each other. The truth (about the "pro" notion) is that the tools in TransTools+ can mostly also be found in TransTools, though they are geared up to the next power level. It's just that not all of TransTools' tools have made that switch. . . . Confusing? I TOLD YOU!!

So, let's simplify a bit.

Let's start with TransTools+, a paid tool ($35, but it has a generous trial period), which has only a handful of utilities:

  • Hide / Unhide Text
  • Multiple Find and Replace
  • Highlighting Tool
  • Document Processing Tool

These all show up in the ribbon bar of MS Word:

[Screenshot: the TransTools+ ribbon in MS Word]

The Hide / Unhide Text utility is a tool that is clearly geared toward users of translation environment tools. Most translation environment tools allow for an option to not translate hidden text in Word documents, so you can prepare your documents to hide exactly what you don't want to translate by applying the hidden feature. With Hide / Unhide Text you can do that by

  • searching for already highlighted text,
  • searching for a certain type of content (textboxes, footnotes, etc.),
  • selecting from a preset list of exclusions (such as all HTML or XML tags, or all time stamps in SRT files), or
  • using extremely sophisticated search features to locate the text (for more on that, see below).

The Highlighting Tool is similar (only, as the name implies, it applies highlighting), but it's useful for a different purpose. For instance, it's helpful for a project manager to batch highlight certain things in a document before sending it back to the translator or editor. Or, vice versa, for a translator or editor to communicate questions to a PM or client. Any of the steps or series of steps can be saved and applied to later files. Plus, with the Document Processing Tool, they can even be applied to any number of Word files (DOCX, DOC, RTF) simultaneously.

This brings us to the Multiple Find and Replace tool (whose search features are also part of the Hide and Highlighting tools). As the name implies, it can run a number of search and replace processes simultaneously and, as in the other tools, save that list of processes for later re-use. This version of the search & replace tool (in contrast to the one in the TransTools collection -- the one without the +) super-helpfully provides access to a huge range of regular expressions beyond the ones offered in Word, alongside descriptors that help you choose which one to use:  

[Screenshot: the list of regular expressions with their descriptors]

This document provides a helpful overview of ways to use the tools, and a preconfigured list of common problems after using OCR on a document also serves as a good guide for how to put together a list:

[Screenshot: the preconfigured replacement list for common OCR problems]

I don't know about you, but I spend a lot of time having to come up with effective ways of searching and replacing data in documents, often in a number of steps and/or over and over with similar processes. This tool is just about as helpful as I can think of when it comes to that.
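For readers who like to see the idea spelled out, here is a rough Python sketch of what a saved list of search-and-replace steps amounts to. It is not how TransTools+ is implemented: the rules in `OCR_FIXES` and the function name `apply_rules` are invented for illustration, and the patterns use Python's `re` syntax rather than Word's.

```python
import re

# Hypothetical rule list, applied top to bottom: (pattern, replacement).
OCR_FIXES = [
    (r"(\w)-\n(\w)", r"\1\2"),  # rejoin words hyphenated across a line break
    (r"\bl(?=\d)", "1"),        # a lowercase "l" right before a digit is probably "1"
    (r"[ \t]{2,}", " "),        # collapse runs of spaces or tabs
    (r" +([,.;:!?])", r"\1"),   # drop spaces before punctuation
]

def apply_rules(text, rules):
    """Run each search-and-replace step in order and return the result."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text

sample = "The trans-\nlation was  delivered in l998 , on time ."
print(apply_rules(sample, OCR_FIXES))
```

The order of the steps matters (the hyphenation fix has to run before the spacing fixes can do their job cleanly), which is exactly why being able to save and reuse a tested list is so valuable.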

Let's talk about TransTools. (continued below)

 

ADVERTISEMENT

Tool Box Journal subscribers save more on memoQ

Don't you have your own memoQ translator pro license yet? memoQ offers 30% off on all new translator pro licenses for Tool Box Journal subscribers!

Read our guide on the purchase process and apply code CAM_TBJ202030_PS in the memoQ webshop. Offer expires at midnight January 23 in any time zone.

 

2. TransTools (continued from above)

Let's talk about TransTools. You can see from its ribbon bar that there are many more tools:  

[Screenshot: the TransTools ribbon in MS Word]

As I mentioned above, the three main tools of TransTools+ (Hide / Unhide Text, Multiple Find and Replace, Highlighting Tool) can be found among the offered utilities but with a slightly different name and in a less powerful state (such as without all those fancy regular expressions beyond the regular Word ones). But there are plenty of other interesting tools that virtually everyone who has ever worked in technical translation will immediately understand. Here are some good examples:

  • Document Cleaner -- to get rid of typical problems after OCR'ing a file or converting a PDF or otherwise having to deal with lots of unnecessary codes (comparable to the well-known CodeZapper tool)
  • Find / Replace Excessive Spaces -- as the name says and including those before and after punctuation marks
  • Unbreaker -- to remove incorrect paragraph marks after copying and pasting text from a PDF
  • Dual Language Assistant -- to convert a document into a table by putting each paragraph of the source language into a cell in the left column
  • Manual Localization -- to automatically convert language-specific decimal separators, including non-breaking spaces
  • Document Format Converter -- to batch-convert any number of Word files of all formats (RTF, DOCX, DOC) to any other Word format or PDF
  • Quotation Magic -- to insert correct language-specific quotes
  • Correctomatic -- to automatically correct certain words (helpful when going between different forms of English)
  • What is this Symbol -- gives you the ANSI code for the character(s) you highlight
  • etc., etc.
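To give a sense of what two of these utilities do under the hood, here is a small Python sketch in the spirit of Unbreaker and What is this Symbol. It is only an approximation under my own assumptions: the function names `unbreak` and `describe` are made up, the rule for spotting a wrong line break is a simple heuristic, and the lookup reports Unicode code points and names, which may not be what the actual tool displays.

```python
import re
import unicodedata

def unbreak(text):
    """Join lines that were split mid-sentence; keep blank-line paragraph breaks."""
    # A single newline that does not follow sentence-ending punctuation
    # is treated as a wrongly inserted break and replaced with a space.
    return re.sub(r"(?<![.!?:\n])\n(?!\n)", " ", text)

def describe(chars):
    """Return (character, code point, Unicode name) for each character."""
    return [(c, f"U+{ord(c):04X}", unicodedata.name(c, "unknown")) for c in chars]

pasted = "This sentence was broken\nin the middle of a line.\n\nNew paragraph."
print(unbreak(pasted))

# A "b" followed by the combining long stroke overlay.
for entry in describe("b\u0336"):
    print(entry)
```

The second part shows why a symbol lookup is handy: what looks like a single letter on screen can actually be two characters, a plain "b" plus a combining mark.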

TransTools optionally also installs tools in PowerPoint and Excel:

[Screenshot: TransTools in PowerPoint, including the Change Language tool]

Most welcome in PowerPoint might be the Change Language tool (see the graphic above), which changes the spellchecking language in PowerPoint. I think we all can appreciate how helpful that is, knowing how tedious it is otherwise.

In Excel, the Glossary Search tool might be the most helpful: it allows you to search multiple glossaries at one time with an independent program that is also installed.

[Screenshot: TransTools in Excel]

By the way, you'll need to pay for some of the tools in TransTools after a 45-day trial period, including Unbreaker, Correctomatic, and Quotation Magic (these three will be ported to the TransTools+ version at the end of the month) as well as the Spellcheck Assistant (a tool to build up and use custom dictionaries) and the Document Format Converter. The Tag Cleaner and Unbreaker tools in Excel are paid as well, and TransTools for Visio and TransTools for AutoCAD, which we have not talked about here, are paid features, too. The TransTools version with all the paid features costs $25.

I can't imagine that you have not seen at least a couple of processes in this wealth of tools that you like, and chances are they might even be among the free offerings. Of course, the drawback of program-specific tools like these is that, well, they're program-specific (i.e., specific to programs like Word, Excel, etc.). What differentiates these is that the utilities they offer really are meant to prepare files that are to be translated or have been translated for use elsewhere (either into a different tool, such as a translation environment, or by a different person with a different function, such as an editor), so their use is potentially much larger than just making life easier in MS Word. And if there is anything to the Винт Разболтайло concept, the usefulness and range of tools will only grow.

 

ADVERTISEMENT

Join us on a special journey and kick-start your year by finding out what SDL Trados can do for you, as we share our resources designed to help you accelerate your career as a translation professional.

Learn more: sdltrados.com/landing/we-are-sdl-trados.html.

 

3. This is Intent(ed) to Clarify Things

I recently did an interview with Konstantin Savenkov of Intento about what his tool does and how or whether it can be useful to you and me:

Jost: Can you give us a short overview of what your company offers?

Konstantin: As of today, Intento helps large enterprises to procure and deploy the best-fit machine translation (MT) across a wide range of enterprise scenarios. Our main product is Enterprise MT Hub, which provides a universal API [=application programming interface, i.e., the interface that lets applications "talk" to each other] to almost every MT on the market, making it easy to integrate a multi-engine MT portfolio with all the software systems enterprises have on board.

Basically, we customize and evaluate 10-15 MT systems on customer data, build a set of custom and stock MT engines and route requests to them to get the best quality in every language pair and scenario. The scenarios include traditional localization, customer support (tickets and chats), website translation, corporate translation portals, and some others. Each of them has a specific purpose for machine translation, and also specific data which may be used to customize and evaluate the MT engine.

We take the data, use our tools to clean it, customize MT engines and select the best one for each customer scenario and language pair, along with the expected ROI. Then in production, our integration platform is used to deliver MT where it's useful, typically a handful of places across the company.

J: Is this something that is relevant for freelance translators? Do you actually have freelance translators among your customers? Or is this mainly for LSPs and translation buyers?

K: One of the side-effects of our technology is that our plugins [see below for a list] work with lots of MT engines. Another thing which we found is useful to freelance translators is our general-purpose MT routing, where we route requests to the engine which is best for this language pair according to our benchmark.

We experiment a lot with different client segments, including freelance translators. Freelance translators are an important part of the ecosystem, so we are looking into how to make our tools affordable for them rather than see them as a revenue stream.
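[The routing Konstantin mentions can be pictured as a simple lookup: for each language pair, pick whichever engine scored best in a benchmark. Here is a toy Python sketch of that idea. It has nothing to do with Intento's actual API or data; the engine names, the scores, and the function name `pick_engine` are all placeholders I made up.

```python
# Hypothetical benchmark scores per language pair (higher is better).
# Engine names and numbers are placeholders, not Intento's results.
BENCHMARK = {
    ("en", "de"): {"engine_a": 0.71, "engine_b": 0.78, "engine_c": 0.69},
    ("en", "ja"): {"engine_a": 0.64, "engine_b": 0.58, "engine_c": 0.67},
}

DEFAULT_ENGINE = "engine_a"  # fallback when a pair has not been benchmarked

def pick_engine(source_lang, target_lang):
    """Return the best-scoring engine for a language pair."""
    scores = BENCHMARK.get((source_lang, target_lang))
    if not scores:
        return DEFAULT_ENGINE
    return max(scores, key=scores.get)

print(pick_engine("en", "de"))
print(pick_engine("en", "ja"))
print(pick_engine("fi", "sv"))
```

The point is simply that "the best engine" is not one engine: it can change from one language pair to the next.]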

J: Can you give us a list of the different tool integrations that are available?

K: The full list of Machine Translation systems we support is here: Alibaba (General and eCommerce engines), Amazon, Baidu, CloudTranslation, DeepL, Google (Basic, Advanced and AutoML), GTCom, IBM, Kakao, Naver, Microsoft (including custom models), ModernMT, PROMT, SAP, several SDL and Systran systems, Tencent, Tilde, Yandex and Youdao.

We have plugins for memoQ, SDL Trados, and Matecat. Some other CAT tools use us on the backend as MT integration tool (e.g., Smartcat). We also have XLIFF connectors to TMS systems (such as XTM and Memsource), a Chrome extension, plugins for Microsoft Office, and connectors to general-purpose enterprise integration platforms, such as BMC or Mulesoft.

J: Can you tell us a little about your thoughts on using generic (stock) engines vs. customized engines?

K: We work a lot with customized NMT engines: Globalese, Google, IBM, Microsoft, ModernMT, Systran, Yandex. In most of the cases, customization with TM and/or glossaries improves the outcome a lot.

It takes resources to prepare the data, cook the engine and maintain it over time, so it should be a careful decision based on the project budget. For small projects, customization hassles may exceed the effort to edit stock MT by much. However, we see quite a bit of raw MT cases that would not be possible at all without the custom NMT.

J: In a Twitter conversation a few weeks ago, you were quoted with an anecdote from a client who is using your product and who found out that unedited, raw machine translation was more successful with their customers because the customers felt that the more polished, post-edited translations were sponsored and therefore not to be trusted. At face value, of course, this sounds kind of shocking, but I went to a recording of the talk where you mentioned this and it really was not as outrageous as it first sounded. You were referring to Chinese-to-English translations of product descriptions where the vast majority were raw machine translation and only a few stood out as post-edited. It kind of makes sense that customers see that as suspicious and don't trust those descriptors as much. The two conclusions that I draw from this are that a) the unedited machine translated segments must have been pretty bad if it was so easy to see the difference, and b) this anecdote certainly does not mean that overall raw machine translation sells better than edited or translator-translated data. Would you agree?

K: This anecdote is about the specific idea that the translation should be evaluated for the specific purpose the MT is used for. End-users are rarely interested in the linguistic quality per se; they are looking for something else: increased conversion, customer satisfaction, turnaround time, reduced headcount. There may be some huge surprises like the one in this anecdote. There may be solutions to the issues, e.g., in another case with bad source content, it turned out to be much better to avoid the translation altogether, replacing it with text generation.

 

ADVERTISEMENT

Memsource Editor for Mobile:
Translation in the Palm of Your Hand

Process urgent client requests, assign translators and resources, monitor project progress and even review and edit jobs wherever you are right from your mobile. Try it now, free!

 

The Last Word on the Tool Box Journal

If you would like to promote this journal by placing a link on your website, I will in turn mention your website in a future edition of the Tool Box Journal. Just paste the code you find here into the HTML code of your webpage, and the little icon shown on that page will be displayed on yours with a link to my website.

If you are subscribed to this journal with more than one email address, it would be great if you could unsubscribe redundant addresses through the links Constant Contact offers below.

Should you be interested in reprinting one of the articles in this journal for promotional purposes, please contact me for information about pricing.

© 2020 International Writers' Group  

 

