Tool Kit

A computer newsletter for translation professionals

Issue 10-7-170
(the one hundred seventieth edition)

Contents

1. Microsoft's Language Portal

2. The Ever-Hungry Giant

3. TranslatorsTraining!

4. Does Plunet Worx? (Premium Edition)

5. Other Good Ideas

The Last Word on the Tool Kit

Boring Times?

Anything happen these past two weeks? Nah, not really. Well, aside from the fact that Microsoft officially reopened its language portal and supports TBX; TranslatorsTraining, my site for comparing translation tools, is now free; and the World Cup has ended with a worthy winner? Oh, and SDL's acquisition of MT pioneer LanguageWeaver this morning. And you thought that things moved slowly during the summer!

But leaving all those aside for now, here's something truly earth-shattering.

Speech recognition is great. So great that, when I used it this week and dictated "niemals" (never), it wrote "immer" (always). Acoustically, that's a nebulous connection that borders on the impossible. Could it be what's called an Easter egg? (Unfortunately, I'm not making this up.) Here's a video that shows another facet of the beauty of speech recognition: http://www.youtube.com/watch?v=5FFRoYhTJQQ. (Thanks to the lovely Courtney Seals-Ridge for sharing it!) Watch it when you have a bad moment. It'll cheer you up!

Now on to the "serious" things.

1. Microsoft's Language Portal

Most of you are familiar with Microsoft's long and at times circuitous route toward terminology sharing. When it started sharing most, if not all, of its user interface translations in the mid-nineties (the files that were often incorrectly called "The Microsoft Glossaries" were really translation memories), it was widely, and rightly, welcomed as a very visionary step. The files had to be downloaded from an FTP site and came in a rather cumbersome comma-separated (CSV) format, but a good number of tools supported that particular Microsoft CSV format, either as their main purpose or as an added feature.

This was a visionary step: Making these translation memories freely available ensured that all tools running on the Windows platform would use the same terminology in their translated versions, making it so much easier for users to switch between products. (When some tools -- I'm thinking especially of SAP here -- decided definitely not to use the MS terminology for political reasons, it only served as a sort of back-handed confirmation of Microsoft's vision.)

After 12 years of offering these databases to the general public, Microsoft suddenly withdrew them and offered them only to (paid) subscribers of MSDN (and, later, also Microsoft TechNet). The general public was first provided with a multilingual CSV glossary and a few months down the road with the Microsoft Language Portal.

In its first incarnation, the Language Portal offered access to terminology searches, style guides in various languages, language-specific blogs (which were very infrequently updated), and a sort of crowd-sourced site for the terminology of some Microsoft products. At first, many viewed the site as a poor substitute for the free and large databases, but it eventually became the standard for Microsoft terminology. And, since the search queries were done through the URLs, it was even possible to search with third-party tools like IntelliWebSearch (which unfortunately is not possible with the otherwise very helpful TAUS engine).
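For readers who wonder what "search queries done through the URL" means in practice, here is a minimal Python sketch. The address and the parameter names q and lang are hypothetical placeholders (example.com), not the Portal's real ones; the point is only that a tool can fill a search term into a URL template and open the result.

```python
from urllib.parse import quote_plus

# Hypothetical URL template (NOT the real Language Portal address).
# "{term}" and "{lang}" mark where the query values are filled in.
SEARCH_TEMPLATE = "https://example.com/terminology/search?q={term}&lang={lang}"


def build_search_url(term: str, lang: str, template: str = SEARCH_TEMPLATE) -> str:
    """Fill the percent-encoded term and language code into the template."""
    return template.format(term=quote_plus(term), lang=quote_plus(lang))


if __name__ == "__main__":
    print(build_search_url("control panel", "de-DE"))
```

A lookup tool that works this way only needs to know the template; a site that hides its queries behind forms or scripts offers nothing comparable to fill in.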

Then last week, many of you contacted me directly or sent cries for help to Twitter: The Language Portal was gone! Alas, it was true -- but only for a few hours, hours that seemed to last longer because the old URLs continued to be inoperative. For in their place, under a new address, a completely new and handsome Portal had emerged.

I know it's very easy to be critical of Microsoft (and Google and SDL and Apple and . . .), but it's also important to give credit where credit is due. And credit is due right here.

Aside from a new look and an easier path to certain things (such as clear instructions for what to do if you are interested in the whole set of TMs), many of the features have stayed the same (terminology search, style guides, access to blogs). However, some features have been updated and expanded (including the number of languages that are currently covered or the commitment to more proactively publish new blog postings), and some things are completely new, including the ability to download bilingual files (English into other languages or other languages into English) in TBX format.

Is that a big deal? I think it is.

First of all, it's super helpful to once again download data so that it can be integrated into your own terminology resources/translation environment. And second, there is the TBX element.

I realize that some of you might ask what exactly TBX is. TBX, the TermBase eXchange standard, is an XML standard that allows for the interchange of terminology data, including detailed lexical information. The adoption of TBX has been very slow, partly because many felt it was too complicated (for instance, see this article by Maxprograms' Rodolfo Raya, who has developed his own competing and much simpler standard). Still, many tools have now bought into it and are supporting it, including Across, Heartsome, Swordfish, XML-Intl, SDL MultiTerm, Wordfast Pro, Alchemy Publisher, and Star TermStar. If you own one of those tools, the TBX is easy to import (note that in the case of SDL MultiTerm you will first have to use the MultiTerm Convert program) -- especially since this is a TBX file with a simple structure: just source, target, and definition.
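To give an idea of what "just source, target, and definition" looks like inside such a file, here is a rough Python sketch that reads a tiny TBX sample. The sample entry, its element nesting, and the function name read_tbx_entries are my own illustration and assumptions; the downloaded files may nest their elements somewhat differently, so treat this as a starting point rather than a description of Microsoft's exact layout.

```python
import xml.etree.ElementTree as ET

# ElementTree's spelling of the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample entry; the nesting is an assumption, not a copy of a real file.
SAMPLE_TBX = """<martif type="TBX" xml:lang="en-US">
  <text>
    <body>
      <termEntry id="example_1">
        <langSet xml:lang="en-US">
          <descripGrp>
            <descrip type="definition">A made-up definition for illustration.</descrip>
          </descripGrp>
          <ntig><termGrp><term>folder</term></termGrp></ntig>
        </langSet>
        <langSet xml:lang="de-DE">
          <ntig><termGrp><term>Ordner</term></termGrp></ntig>
        </langSet>
      </termEntry>
    </body>
  </text>
</martif>
"""


def read_tbx_entries(tbx_text, source_lang, target_lang):
    """Return (source term, target term, definition) tuples from TBX text."""
    root = ET.fromstring(tbx_text)
    rows = []
    for entry in root.iter("termEntry"):
        # Keep the first term found in each language section.
        terms = {}
        for lang_set in entry.iter("langSet"):
            term = lang_set.find(".//term")
            if term is not None and term.text:
                terms[lang_set.get(XML_LANG)] = term.text.strip()
        # Use the first definition found anywhere in the entry, if there is one.
        descrip = entry.find(".//descrip[@type='definition']")
        definition = descrip.text.strip() if descrip is not None and descrip.text else ""
        # Skip entries that lack either of the two requested languages.
        if source_lang in terms and target_lang in terms:
            rows.append((terms[source_lang], terms[target_lang], definition))
    return rows


if __name__ == "__main__":
    print(read_tbx_entries(SAMPLE_TBX, "en-US", "de-DE"))
```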

But what about the rest of us, those who don't have any of these TBX-supporting tools? XBench to the rescue! XBench is a free downloadable tool (I'm still waiting for the time when the makers of XBench start charging for it) that is good for many different functions, including quality assurance of translation files; lookup in glossaries, terminology databases, and translation memories in many, many different formats; and the import and export of such files. So it's quite easy to import the TBX file and then export it into a TMX (Translation Memory eXchange) or CSV file, which can then be processed by your tool of choice (note that you might lose the Definition field in this process).
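If you would rather script the conversion yourself, one way to keep the definition on the way to TMX is to carry it in a note element on each translation unit. The Python sketch below assumes you already have (source, target, definition) rows in hand; the function name build_tmx and the header values such as "tbx-sketch" are placeholders of mine, and whether a given tool actually reads the note on import is something you would have to test.

```python
import xml.etree.ElementTree as ET

# ElementTree's spelling of the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(rows, source_lang, target_lang):
    """Build a TMX 1.4 document (as a string) from (source, target, definition) rows."""
    tmx = ET.Element("tmx", version="1.4")
    # Header attributes required by TMX; the values here are placeholders.
    ET.SubElement(tmx, "header", {
        "creationtool": "tbx-sketch",
        "creationtoolversion": "0.1",
        "segtype": "phrase",
        "o-tmf": "TBX",
        "adminlang": "en-US",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target, definition in rows:
        tu = ET.SubElement(body, "tu")
        # Notes come before the language variants inside a translation unit.
        if definition:
            ET.SubElement(tu, "note").text = definition
        for lang, text in ((source_lang, source), (target_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


if __name__ == "__main__":
    sample_rows = [("folder", "Ordner", "A made-up definition.")]
    print(build_tmx(sample_rows, "en-US", "de-DE"))
```

For a CSV export, the same rows could simply be written out as three columns, which keeps the definition without any special handling.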

So why is this so big of Microsoft? Well, maybe "big" is the wrong term, but I join many others in being thankful that Microsoft has reached out a hand to support this important standard. What's also been very refreshing in my dealings with this particular team at Microsoft is that bugs I've pointed out to them have typically been fixed within a matter of hours -- not generally something I've been used to seeing from a very large software vendor.

One more thing about the Microsoft glossaries: They have also been integrated into the Evroterm termbase of Slovenia and the mighty EuroTermBank -- so if those are your preferred places to search, you'll get the terminology that way.

Oh, and in case you wonder which languages are supported (either in or out of English), here is a list: Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azeri (Latin), Basque, Bengali (Bangladesh and India), Bosnian (Cyrillic and Latin), Bulgarian, Catalan, Chinese (Simplified and Traditional), Croatian, Czech, Danish, Dutch, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Hausa, Hebrew, Hindi, Hungarian, Icelandic, Igbo, Indonesian, Inuktitut, Irish, isiXhosa, isiZulu, Italian, Japanese, Kannada, Kazakh, Khmer, Kinyarwanda, Kiswahili (Kenya), Konkani, Korean, Kyrgyz, Lao, Latvian, Lithuanian, Luxembourgish, Macedonian (FYROM), Malay (Brunei Darussalam and Malaysia), Malayalam, Maltese, Maori, Mapudungun, Marathi, Nepali, Norwegian (Bokmal and Nynorsk), Oriya, Pashto, Persian, Polish, Portuguese (Brazil and Portugal), Punjabi, Quechua, Romanian, Romansch, Russian, Sanskrit, Serbian (Cyrillic and Latin), Sesotho sa Leboa, Setswana, Sinhala, Slovak, Slovenian, Spanish, Swedish, Tamil, Tatar, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek (Latin), Vietnamese, Welsh, Wolof, and Yoruba.

The number of translated terms varies between 2,000 and 18,000 (you can find more information on this in Jeromobot's Twitter stream).

2. The Ever-Hungry Giant

Too much has already been written today about SDL's acquisition of statistically based machine translation vendor LanguageWeaver (for the best and not particularly unbiased articles, see these three: 1, 2, and 3), so I won't bore you with yet another rendition of oh-how-terrible-this-is-for-the-future-of-LanguageWeaver's-technology-and-how-great-for-the-competition. In general, I agree with all of those takes. SDL does not have a good track record with acquired products (look at Idiom, Passolo, or Transparent Language), but there are some bright spots, of which I happen to think Trados is one (I'm strictly talking about the actual product and not the product philosophy or policy).

So here is what I wonder (and I had some help with these thoughts from someone within SDL): While the former team of rules-based machine translation developer Transparent Language is not very visible to anyone outside of SDL, they are quite active within the organization and are responsible for much of the machine translation integration in the last few incarnations of Trados. With its now two very knowledgeable teams, I think that SDL just might have a chance to be among the pioneers that truly combine rules-based and statistically based machine translation.

Will they do it? Hard to tell. But if they do, they in particular and the language industry in general might really benefit from it.

And here is one other thought in regard to machine translation. Some of you know that the AMTA, the Association for Machine Translation in the Americas, has decided to co-locate its annual gathering with the ATA this year in Denver to make it easier for translators to attend both meetings and to allow both MT developers and translators to learn from each other (more on that in the next newsletter). Some of the organizers are still looking for translators who are interested in talking about their experience with post-editing of machine translation output on one or two of their panels. If there are any takers for that, please contact Laurie Gerber or Andy Bell directly.

3. TranslatorsTraining!

It is with no little satisfaction that I would like to announce that we have decided to go for free -- that is, "free" as in "cost-free" -- for the all-important first series of videos on TranslatorsTraining.com!

For the few who don't know, here's what it is: Rather than just asking translation environment tool vendors a question like "why is your tool so cool" (not a bad question -- it even rhymes!), we sent each of them an identical Word file and gave them very detailed instructions on how to translate the file. They then Flash-filmed the process of their tool performing the translation. After they sent the videos back, we refined and narrated them and have made them public.

Since the same file is translated in all of the videos and the same terms are sent to the terminology database, this is a great way to compare the tools that are out there (presently we have videos by Across, Cafetrans, Déjà Vu, memoQ, Similis, OmegaT, Star Transit XV and NxT, Wordfast Classic and Pro, Lingotek, Trados 2007 and Studio, SDLX, Swordfish, Heartsome, Metatexis, and MultiTrans) and learn how to start using them in the process.

So, to pick up the "cool" again: It's cool. The only cost you might have is a couple of minutes until the video downloads (during which time you can watch some of the many Jeromobot videos on the site) -- you can imagine that there are a lot of people trying to get to it right now -- but then you can also save a lot on some of the tools! A number of the more important vendors (including Trados, Déjà Vu, memoQ, Swordfish, and Heartsome) are offering special deals when you order through our site.

Don't know what to do in the summer heat? Don't want to end up like Jeromobot in this wretched video? Cool off with TranslatorsTraining.com and learn something in the process!

4. Does Plunet Worx? (Premium Edition)

A while back I announced that I would talk to some agencies that use installations of Plunet and Worx, two of the leading business and workflow management tools for the translation industry. I was finally able to finish that this morning, and here are some of the results.

. . . you can find the rest of this article in the premium edition. To subscribe, you can pay $15 for an annual subscription at www.internationalwriters.com/toolkit ($10 if you are an ATA member) or you can buy the highly acclaimed Tool Box computer primer at www.internationalwriters.com/toolbox for $50 ($30 if you are an ATA member) (new and existing owners of the book will automatically receive a year-long subscription).

5. Other Good Ideas

Once you are TranslatorsTraining.com'ed out and still have some time to kill, you might want to look at this 400+ page EU report on the translation industry. Interestingly, it's prepared by the same agency that is also behind Worx, but there is plenty to read about how important we are (at least in Europe).

Also, I ran into this crazy little bit of information about my favorite tool to hate: QuarkXPress. Quark's versions 6.5 and 8 run on Windows 7, but Quark's version 7 does not. Say that again? Yup, it's the strange truth (this comes via Thomas Bosch, by the way). In addition, there is still no way to export all stories in a QuarkXPress file into one file that can be processed by a TEnT (translation environment tool) unless you go with Napsys CopyFlow. Not surprisingly, Napsys has totally refashioned itself as serving the language industry; when I first talked to its staff six or so years ago, they didn't really know what a translator was.

Oh, and did I say I don't like Quark?

Lastly, Jeromobot and I were extremely concerned when we heard that the new beta version of Firefox 4 might not be able to handle our fabulous Jeromobot skin that we recommend you use to embellish your Firefox browser. Phew, it works, as Javier Mallo found out.

Just one more thing: Here is my favorite example of successful interpretation.

The Last Word on the Tool Kit

If you would like to promote this newsletter by placing a link on your website, I will in turn mention your website in a future edition of the Tool Kit. Just paste the code you find here into the HTML of your webpage, and the little icon shown on that page will appear on yours with a link to my website.

This reader just added a link:

www.leximania.gr

© 2010 International Writers' Group