The Tool Box Journal

 A computer journal for translation professionals


Issue 14-10-241
(the two hundred forty second edition)  

Contents

1. Mission Possible (Premium Edition)

2. This 'n' That

3. Mother Goose to the Rescue

4. Forced Sounds (Premium Edition)

5. Sticky Stuff

The Last Word on the Tool Box Journal

Translation Is Impossible. Let's Do It!

This was the title of an exhibition at the New Museum in New York. I can't tell you much about the exhibition, but what a clever title! Sums up just about everything I know about translation.

It may seem impossible to select translation technology as well -- especially if you're relatively new to translation -- so at this year's ATA conference I tried to give some suggestions on how you can choose the right technology. I've uploaded the presentation right here:

(Cool title page design, huh? Can you believe that I actually bought this particular PowerPoint template because I liked it so much?)

In a nutshell, the presentation encourages you to look less at the technology itself and more at yourself and what you do. In my opinion, the four most important questions are these:

  1. Do clients/colleagues require me to use a specific, a compatible, or no specific technology?
  2. Is my data too confidential for online processing?
  3. Are my languages adequately supported by the relevant technology?
  4. Do I understand (and/or adequately utilize) the possibilities of technology?

Other questions, such as which kinds of files the technology processes or how technical you have to be to work with it, are really secondary (even though these are often considered the primary decision criteria).

Choosing translation technology? It's possible!

1. Mission Possible (Premium Edition)

I'm a man on a mission these days. Sort of. Here's my mission: As I've said in previous Tool Box Journals, it's in all our interests to find better ways of utilizing machine translation than we have so far with post-editing. And "all" really means all, including translation professionals and translation buyers. There is a lot of potential in harvesting data from machine translation suggestions, but overall I think we're going about it the wrong way. Now, there are some machine-translated projects where post-editing is a good option. Those are projects with a highly trained MT engine in place for a well-suited language combination and text type where the post-editor essentially only has to do touch-up work. Anything other than post-editing would seem silly in that case.

But do you always work with engines and in situations like that? Nope, neither do I. (I've actually never done that!)

What are other ways to access content that comes out of a machine translation? Here's how MT engines work: Internally they come up with many, many propositions as translations for the text to be translated, but typically they expose only one of those options -- that's the one that's supposed to be post-edited in the old, and very often tired, model. We all know that there will be some parts in that suggestion that will be OK, even really good, but we also know that in its entirety the suggestion often needs so much work that it becomes tedious to use it. But here's the key: The machine translation engine holds in its dark recesses pretty much all or most of the translation elements we will end up using; it just doesn't release them to us. You can expose some of those usually hidden "phrase tables" in Google Translate when you look at the different suggestions for each phrase by clicking on parts of a target sentence. You can also see it in this beta version of the Moses-based WIPO MT engine.

(I just watched Obama's speech on immigration, so I'll borrow some of his fervor and authority for this persuasive paragraph.) Why, my fellow translators, should we trust the haphazard decision-making of the MT system on what to present to us when we know that we know better? Why should we abrogate our judgment when we know that the MT has what we're looking for on a phrase or subsegment level but is simply unwilling to share it with us? I believe in a fair and just system where we all have access to all the data. (Enough Obama-speak.)

Some tools already have features that are moving in the right direction:

  • Wordfast Classic (and presumably the long-announced new version of Wordfast Pro) has an AutoSuggest feature that displays both segments and terminology (which actually ends up being subsegments) as you type from the various MT engines you can connect (under Setup> AS).
  • Trados Studio offers two plugins that use AutoSuggest for whole and partial segments, though unfortunately each draws on only one MT engine at a time: the now free (and very helpful) MT AutoSuggest, which suggests from one of the associated MT engines, and Google Translate AutoSuggest, which is also free and allows you to use phrase suggestions from Google Translate without paying the typical Google usage fee.
  • Déjà Vu X3 offers AutoWrite options for all associated MTs for whole and partial segments (under File> Options> General).

I'm really excited about the progress that I've seen in these tools. However, none of these tools actually retrieves several suggestions from one MT engine per sub-segment (instead, they use several engines with one suggestion each). In addition, none of them is dynamic in the sense that the MT is continuously queried and re-queried on the basis of what has already been chosen as the right translation.

See, the power of getting several suggestions from one MT makes so much sense with customized machine translation engines. It might be helpful to get lots of suggestions from a non-specialized MT, but from a specialized MT, it's kind of like TM on steroids. It provides all the different combinations of language that you trained it on. Will it come up with the right solution? Sure it will, if you can dig deep enough and look for fragments rather than the whole segment, and if the digging can happen purposefully through the keystrokes that you enter rather than some awkward search functionality, thus placing the oversight and control squarely in your hands.

Now couple that with an interactive reformulation of the suggestions based on what you previously decided on as your translation? (Please stop drooling on your keyboard!)
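To make the idea of keystroke-driven digging a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `PHRASE_TABLE` data, the scores, and the `suggest` function are made up, and no real MT engine or tool API is being called. It only shows how several fragment candidates per source phrase could be filtered by what the translator has typed so far.

```python
from typing import Dict, List, Tuple

# Hypothetical fragment candidates per source word, with made-up scores.
# A real engine would supply these; nothing here calls an actual MT API.
PhraseTable = Dict[str, List[Tuple[str, float]]]

PHRASE_TABLE: PhraseTable = {
    "Vertrag": [("contract", 0.62), ("agreement", 0.30), ("treaty", 0.08)],
    "beenden": [("terminate", 0.55), ("end", 0.35), ("finish", 0.10)],
}


def suggest(table: PhraseTable, source_words: List[str], typed: str,
            limit: int = 5) -> List[str]:
    """Return target fragments that start with what has been typed so far."""
    prefix = typed.lower()
    if not prefix:
        return []
    hits = [
        (score, target)
        for word in source_words
        for target, score in table.get(word, [])
        if target.lower().startswith(prefix)
    ]
    # Highest-scoring candidates first; ties are broken alphabetically.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [target for _, target in hits[:limit]]


# Typing "t" for a segment containing both source words:
print(suggest(PHRASE_TABLE, ["Vertrag", "beenden"], "t"))  # ['terminate', 'treaty']
```

The sketch stops short of the dynamic part: it does not re-query anything based on fragments that were already accepted. That step would depend on what the engine in question actually exposes.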

. . . you can find some suggestions on how we can realize some of this in the premium edition. If you'd like to read more, an annual subscription to the premium edition costs just $25 at www.internationalwriters.com/toolkit. Or you can purchase the new edition of the Translator's Tool Box ebook and receive an annual subscription for free.

 

ADVERTISEMENT

Across Language Server v6: The Most Important New Features at a Glance:

  • Across Dashboard: Relevant project information at a glance.
  • crossWeb Review Mode: Easy involvement of international subsidiaries in the review process.
  • Across Data Cube: Easy collection of indicators.
  • Project Management Cockpit: Optimized work process for project managers.

Find out more about Across version 6 at www.across.net/en/support/whats-new/.

 

2. This 'n' That

I've already mentioned the MT AutoSuggest and the Google Translate AutoSuggest apps for Trados Studio in this Journal -- but overall it's hard to keep up to date with the many Trados Studio apps that are continuously being released. If you use Trados Studio and want to stay current, you should probably follow Paul Filkin's Twitter account to keep abreast of his and others' blog posts on the subject. And you also might want to have a look at the presentation that Finnish translator-extraordinaire Tuomas Kostiainen gave at this year's ATA conference in which he attempts to structure and list the available apps in an insightful manner (along with giving some good suggestions on what apps are helpful).

 

Last week, Service Pack 2 for Trados Studio 2014 was released. I didn't have much time to look at it in great detail (though the upgrade went very smoothly), but others have. Here are some good blog posts on what to expect:

Some of these bloggers know Trados Studio better than I do, so I'm happy to let them have the floor.

 

Henk of SanTrans contacted me recently about the work that he has done on the humongous termbase exchange download from the IATE termbase. He has developed a program that allowed him to clean the IATE termbase (of tags, synonyms, etc.) and extract the terms by language pair so you can easily load them into tools like Trados Studio, Déjà Vu X2/X3, memoQ, CafeTran (for these tools he has actually very specifically prepared some file formats) or any other tool that supports exchange file formats. Do you need his services if you want to use the data? Absolutely not -- that is, if you have several hours (and the expertise) to process the data like he has. Sounds like a pretty good deal to me for only 10 euros per language pair!

 

ADVERTISEMENT

SDL Trados Studio 2014 Service Pack 2 has arrived

Enjoy a range of new SDL Trados Studio 2014 and SDL MultiTerm 2014 features and enhancements that will further enhance your translation productivity.

Highlights include:  

  • Improved terminology performance, including no more Java
  • Edit source for all file types
  • Support for alphanumeric strings

Learn more about these features in the Freelance edition and Professional edition.

 

3. Mother Goose to the Rescue

Christmas is coming. The goose is getting fat....
Is finding presents for your peeps a task you excel at?
Never fear if it is not. You can calm your consternation.
The best gift for all is Found in Translation.

You can find information on how to bulk order lots of copies for greatly reduced prices right here, and if you'd like me to sign the books before you send them out we can find a way to do that as well (just send me an email so we can talk about it).

"We ordered FOUND IN TRANSLATION for the majority of our staff. So nice that there are authors like you and Nataly Kelly." @EhlionGlobal

 

4. Forced Sounds (Premium Edition)

There are certain sounds and words that almost force themselves on you. Take "MateCat", for instance. Here's what I had to say a year ago:

As any cat owner knows (my very proud self included), mating cats ain't a pretty sight or sound.

You're with me so far?

But then there is this:

MateCat, however, might be much easier on your eyes or ears.

Man, I still can't get it out of my ear, though...

Seriously, though, MateCat has officially been released, and here's some of what I said in my original article:

I talked about MateCat with Alessandro Cattelan and Marco Trombetti from Translated in Italy (the developers behind MyMemory). MateCat was an EU-funded project that was supported by a large team of very impressive caliber, including Philipp Koehn, one of the leading developers of statistical machine translation and the open-source MT engine Moses.

The goal at first was purely academic, as a tool to research how much time post-editing of machine translation takes. While there seem to be some kinds of standards emerging for how to compensate for post-editing machine translation, in many ways it's still a nebulous affair, so tools like this can be helpful.

Because it's EU-funded, the core of the project had to be open-source. This is what you find when you go to this site. Essentially it's the same as its commercial counterpart right here, with the one (important) difference that the open-source version does not provide for any file filters except XLIFF. So if you were to choose to use it, you would first have to convert your Word, InDesign, or XML files to XLIFF with something like the free localization champion Rainbow. On the other hand, right now there is really no reason to stubbornly use the open source vs. the commercial version since at this point both are free.

If you stop by the sites, you'll find a very easy and spartan browser-based interface with virtually no project management facilities except the ability to select your language combination, select a machine translation engine (or MyMemory, which contains both a large TM and MT), and create your personal TM (you don't have to do that, but if you don't all your translations end up in MyMemory, which might really frustrate your client).

Once that is done, you drag your file or files over to the page where they are analyzed. The analysis uses the concept of "equivalent words," something you might know as leveraged or weighted word count, i.e., word counts that take repetitions and matches into consideration.
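For readers who have never seen a weighted word count spelled out, here is a small arithmetic sketch in Python. The match bands and the weights in `WEIGHTS` are purely hypothetical and are not MateCat's actual figures (rate grids differ from vendor to vendor). The point is only to show how word counts get scaled by match type before they are added up.

```python
from typing import Dict, List, Tuple

# Hypothetical payable fraction per match band. These numbers are assumptions
# for illustration only and are not taken from MateCat or any other tool.
WEIGHTS: Dict[str, float] = {
    "new": 1.0,         # no match at all: counted in full
    "fuzzy": 0.6,       # partial TM match
    "exact": 0.3,       # 100% TM match
    "repetition": 0.3,  # segment repeated within the file
}


def equivalent_words(segments: List[Tuple[int, str]],
                     weights: Dict[str, float] = WEIGHTS) -> float:
    """Sum the word counts, each scaled by the weight of its match band."""
    return sum(words * weights[band] for words, band in segments)


# (word count, match band) per group of segments in a made-up analysis
analysis = [(120, "new"), (80, "fuzzy"), (50, "exact"), (40, "repetition")]
total_raw = sum(words for words, _ in analysis)
total_weighted = equivalent_words(analysis)
```

With these made-up numbers, 290 raw words shrink to 120 + 48 + 15 + 12 = 195 equivalent words.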

The table-based translation view is also extremely simple and essentially only offers to select from displayed translation memory matches and MT translations or to copy source to target and jump to the next segment. That's it.

There are several things happening in the background, though. First, you are of course being timed (remember, this was the initial purpose of the tool), and all kinds of statistical data are being collected. I'm not sure who has access to that kind of data, but you will eventually be able to install MateCat on your own computer or server, and then no one should be able to see it (you can view that data now by selecting the Editing Log link at the bottom of the screen). Also, the integrity of the individual segments is verified constantly to make sure that no tag or other element is missing. Lastly, and that's probably the most interesting part, the associated statistical machine translation engine learns immediately from your edits. Now, statistical machine translation by default learns from the data you feed into it, but the updates only happen at certain times when the new data is computed and read into the engine (for instance, try to change Google Translate and assume that "your" correction will show up next time -- it's not gonna happen right away, but you might see it pop up in two weeks).
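As a brief aside on the integrity check mentioned above, here is a rough Python sketch of how a tool might compare the inline tags of a source and a target segment. This is not how MateCat does it. The `TAG_PATTERN` regular expression and the `tag_mismatches` function are hypothetical and cover only simple XLIFF-style placeholders, not a complete XLIFF parse.

```python
import re
from collections import Counter
from typing import List

# Matches simple inline placeholders such as <g id="1">, </g> or <x id="2"/>.
# An illustrative pattern only, not a full XLIFF parser.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*?/?>")


def tag_mismatches(source: str, target: str) -> List[str]:
    """List tags missing from the target or present only in the target."""
    source_tags = Counter(TAG_PATTERN.findall(source))
    target_tags = Counter(TAG_PATTERN.findall(target))
    missing = source_tags - target_tags   # in the source, not in the target
    extra = target_tags - source_tags     # in the target, not in the source
    problems = [f"missing {tag}" for tag in sorted(missing.elements())]
    problems += [f"unexpected {tag}" for tag in sorted(extra.elements())]
    return problems


source = 'Click <g id="1">Save</g> to continue.<x id="2"/>'
target = 'Klicken Sie auf <g id="1">Speichern, um fortzufahren.'
print(tag_mismatches(source, target))  # ['missing </g>', 'missing <x id="2"/>']
```

That covers only the integrity check. The immediate learning of the MT engine is a separate mechanism.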

MateCat is able to learn this quickly through a caching procedure rather than translation memory (you can find a description of it in this very technical article).

To come back to the file formats, the commercial version essentially supports all the file formats you want (all Office and OpenOffice formats, XML and HTML, TTX and even SDLX ITD, all the necessary DTP formats as well as a good number of software development formats) and of course XLIFF.

In many ways, what you find today is very similar to what you found a year ago.

It's more stable and slightly more feature-rich now . . .

. . . you can find the rest of this article in the premium edition. If you'd like to read more, an annual subscription to the premium edition costs just $25 at www.internationalwriters.com/toolkit. Or you can purchase the new edition of the Translator's Tool Box ebook and receive an annual subscription for free.

 

ADVERTISEMENT

memoQ.com is launched!

November 24th, 2014 -- Kilgray is very excited to have launched www.memoQ.com!

With months of work behind it and efforts contributed from literally the whole Kilgray team, a new website is now available at www.memoQ.com.

With a focus on simplicity of navigation and accessibility of information, the new website enables visitors to specifically understand how they will benefit from memoQ. www.memoq.com/why-memoq/benefits 

"This website is clearly a step forward in our commitment to better serve those interested in Kilgray and memoQ solutions. From recorded webinars to case studies, and featured Kilgray customers, www.memoQ.com is yet another demonstrative feat in Kilgray's ongoing progress," says Bryan Montpetit, Vice-President of Sales and Marketing.

Visit our new website -- www.memoQ.com

 

5. Sticky Stuff

I usually really like it when something that I write provokes excitement among readers. So even the semi-upset responses to my article on Clay Tablet were appreciated. To recap, Clay Tablet is the middleware product that allows access to many content management systems and that was recently purchased by Lionbridge. Some readers said that I completely blew this out of proportion and that the acquisition was not nearly as important as I made it sound. I agree that it was not as relevant as when SDL bought Trados or Idiom, both of which had a much bigger impact on our industry as a whole (and probably were much more expensive), but I do think the Clay Tablet acquisition was an important step. As I said, "It'll make it much harder for smaller vendors to get into content management systems for translation purposes," and that's principally important.

The folks at Clay Tablet also felt they were kind of short-changed by my article. They asked me to share with you that they will "remain a neutral provider of connectivity solutions between any content system and any translation provider or technology." I'm passing that along to you.

Clay Tablet's Robinson Kelly had another comment about the responses of other tool providers that I published, and I agree that those comments were indeed revealing:

The notion of "connectors" needs to be fully appreciated by tools vendors. Many of the folks you surveyed replied with a strategy of "enhanced APIs and SDKs" which is great -- but does not solve the actual problem faced by CMS users at the world's enterprises. Only a robust, feature-rich, mature, reliable, fully integrated and scalable integration right into the deepest corners of a content system will meet the true needs of the harried, non-technical, modern marketer. Building, let alone maintaining, such integrations is shockingly difficult.

I know -- and that's exactly why it's unfortunate that the only independent connection technology dedicated to exactly that type of integration is no longer independent. 

 

The Last Word on the Tool Box Journal

If you would like to promote this journal by placing a link on your website, I will in turn mention your website in a future edition of the Tool Box Journal. Just paste the code you find here into the HTML code of your webpage, and the little icon shown on that page will be displayed on your site with a link to my website.

If you are subscribed to this journal with more than one email address, it would be great if you could unsubscribe redundant addresses through the links Constant Contact offers below.

Should you be interested in reprinting one of the articles in this journal for promotional purposes, please contact me for information about pricing.

© 2014 International Writers' Group    

 

