The Tool Box Journal

 A computer journal for translation professionals


Issue 15-1-244
(the two hundred forty fourth edition)  

Contents

1. Of Dragons and Speech Recognition Wizards and Apprentices

2. Minor? Major? New!

3. Data and the Fine Print or: How to Create a Sh*tstorm (Premium Edition)

4. Close to Home

The Last Word on the Tool Box Journal

Feeling Good

Here is a video that may inspire a broad spectrum of responses: happiness, wistfulness, or even sadness, extreme youthfulness or extreme age. It did all of that for me at once. (And no, it's not directly translation-related.) 

100! 

Much more translation-related is this article. And it should make you happy because it's just about the most perfect fodder for a sales pitch.

I enjoyed writing this Tool Box Journal, and I especially appreciate the article by my co-contributor, Dragos Ciobanu. Hope you enjoy it as well.

1. Of Dragons and Speech Recognition Wizards and Apprentices (Guest column by Dragos Ciobanu)

Being the pre-programmed social beings that we are, we all enjoy communicating with other people (in addition to ourselves, our pets, the TV, and many other ... entities). More specifically, unless we have to use non-verbal communication, we prefer to speak, and for good reasons: most of us speak faster than we type, voice pitch and tone mark our discourse effectively, and gestures and facial expressions are often worth more than 1,000 words. However, when it comes to our professional work, most of us translators tend to approach it in silence: we sit down at our workstations -- unless we're a bit hip and we have high desks allowing us to work standing up -- and diligently type our translations. After we finish the first draft we run some QA checks depending on which software we use -- from the basic spell-check to identifying errors regarding numbers, segment length, non-translatables, locale-specific variants, and forbidden terms, to name just a few. We then do a final content check and deliver the job, with only the keyboard keys click-clacking softly and -- if you live around Leeds like I do -- with the wind howling outside for company.

While this was understandable until about ten years ago when, let's be honest, Automatic Speech Recognition (ASR) systems were not particularly accurate, now -- and especially for the translators working into English -- we have run out of excuses for avoiding ASR. There are a fair number of stories circulating in our community, usually covering the extremes: at one end of the spectrum, freelancers who have quintupled their productivity and can now output up to 10,000 words of high-quality translation using ASR without breaking too much of a sweat; at the other, translators who had pretty dire experiences with ASR tools at some stage in the past, and are not keen at all to give them a second chance. The result is that, with the exception of a handful of keen bloggers and public speakers known to all of us, the professionals in our field seem totally committed to their keyboards unless a particular injury or set of circumstances prevents them from typing.

This low uptake was also highlighted by research conducted in 2001 by the Institute of Translation and Interpreting (ITI), and then again in 2011 by the ITI together with the Chartered Institute of Linguists (CIoL). Both surveys indicated that not much over 10% of respondents were using ASR, despite clear economic advantages for those who did. Being a bit keen on technology, having had experience with both ASR and CAT tools, and having got over the occasionally misrepresented molehills which need to be climbed in order to adapt a translator's workflows to make the most of speech recognition technology, I was determined to dig a little deeper.

As a little aside and to give you some context, if you've never visited the University of Leeds Centre for Translation Studies and you quite like translation technologies, just think of us as a big playground: we have ASR tools such as Dragon NaturallySpeaking v.13 Professional, which our MA in Audiovisual Translation Studies students work with in conjunction with a range of subtitling software; we also have lots of CAT and software localization tools for our MA in Applied Translation Studies students: Déjà Vu X3 TeamServer and Déjà Vu X3 Workgroup, SDL Trados Studio 2014 and SDL Passolo 2015 Professional, memoQ 2014 Server and memoQ 2014 Project Manager, and OmegaT; and, last but not least, we have corpus processing tools as well as MT engines available to all our MA students to evaluate, from Systran to the EU's MT@EC environment. We also have chocolate with our students after every CAT team project, so we are not total robots.

Coming back to the story, in September 2014, I posted on several professional fora a link to a questionnaire designed to look closely at what professional experience ASR users had, which tools they used and for how long, which CAT-ASR combinations worked well, which advantages and disadvantages they saw to using ASR, how their workflows had changed, and what advice they had for the non-ASR community. There were a lot of questions, I realize, but some of the results are already being published, complete with pretty graphs and useful background scientific data. Without spoiling the pleasure of browsing through Of Dragons and Speech Recognition Wizards and Apprentices, which has just appeared in the peer-reviewed open-access Revista Tradumàtica, I will just say that -- especially if you are working into English -- you have far more reasons to use ASR than to avoid it.

Contrary to the belief that ASR has never been good enough, professionals with vast experience in our field have been using ASR for decades. Moreover, most of my respondents -- to whom I am very, very grateful, by the way -- reported at least a 30% increase in productivity. After all, you can comfortably and sustainably speak 100-130 words per minute, while typing at that speed, especially for prolonged periods, is simply not possible. Speaking clearly, without hesitations, and in full sentences not only helps ASR systems disambiguate between homophones but is also a good skill to have in your direct client recruitment strategy. Finally, is there anything better than not staring constantly at the computer screen, looking for typos, and forgetting to blink, and instead dictating the target text while looking up and moving around, having some space to think, and thus regaining some of the pleasure of creating functionally equivalent content in the target language rather than just post-editing TM or MT suggestions? What about asking the ASR tool to speak back to us what we have just dictated, while we glance over the source (or 'start') text to spot any factual inaccuracies we may have slipped in?

All in all, just as in the case of MT, we need to have realistic expectations of our speech recognition tools. Unlike MT, though, you won't have to look for a cool million words every time you want to re-train your system and hope errors disappear: ASR -- and I am now talking of Dragon NaturallySpeaking because I know it best -- is very easily trained in those relatively rare cases when it is not 100% accurate. So next time you produce a translation, why not dictate it?

I trust you will tweet me your thoughts at @elearningbakery. And in the meantime, happy talking! 

 


 

2. Minor? Major? New!

A number of reviewers have already mentioned that memoQ 2014 R2 should have been called memoQ 2015 instead: a major new "version" instead of a minor "release" upgrade. In a way I agree, but this whole naming game really just turns out to be semantics when it comes to a tool like memoQ, which gives users with a maintenance contract free updates to any version (and strongly encourages its users to hold such a contract). Plus, its main competitor, SDL Trados, has also had a policy recently of making rather major minor upgrades, so it fits within that pattern as well.

The fact is that there is quite a bit of new stuff in this version of memoQ.

You know, we've been talking a lot about the world of translation as still in its infancy and quite immature, but I think there are some signs of real maturity. One, of course, is that the technology we use is maturing (of course, you expected me to say that in the context of this journal), but another is that our voices are getting stronger. For the last release of Trados Studio, I skipped writing a review and instead listed a number of third-party reviews that gave a good comprehensive overview of the new version's benefits.

For the new memoQ it's quite similar: some of the external reviews are just really great. Gone is the bickering about competing products that all too often marred reviews of translation environment tools or the fan boy's or girl's rosy glasses; instead, those have been replaced with some strong and intelligent views.

Here are some of the reviewers and their blogs:

Still, let me add a couple of comments about this new version of memoQ that might not have been highlighted (and some that underscore what the other reviews include).

Yes, it's a big version, and here are the major new features:

  • The user interface has changed and is now ribbon-based.
  • Approaches to segmentation have been completely overhauled.
  • The editor for translation memories has been redone and improved.
  • It's possible to share translation memories and termbases with other translators in real-time, even if you "only" have the Professional version.

As many of you know, I always liked the ribbon interface -- even in the early far-from-perfect MS Office implementations -- and I liked it even more when it became more interactive and customizable in later MS Office versions. I also thought it was remarkable and far-sighted when Star Transit presented everything in a ribbon interface much earlier than any of its TEnT competitors, and I welcomed Trados's and Déjà Vu's moves as well.

The kind of ribbon that memoQ is now using is quite similar to Office (especially the "File" menu, or in memoQ's case, the "memoQ" menu). It allows access to general options and licensing data, including the crazy left-arrow at the top that's supposed to bring you back to your project, which is located on the right-hand side of your screen. (If you think this is an inconsequential and slightly cranky comment, you're absolutely right -- I feel old and reserve the right to be cranky!)

What is different and really quite clever about how the memoQ developers implemented the ribbon is that it's task-oriented (which is the very concept of the ribbon) as well as process-oriented. You can read about this in virtually all of the reviews mentioned above so I don't need to repeat it here, but I would venture to predict that Kilgray has set the bar for how ribbons should and will be used when dealing with something as process-oriented as translation.

I also really like the newly developed icons (though I wish they hadn't used the silly "smiley" for a number of icons -- yes, cranky again), but I naturally miss the "Do not push this button" button that's been lost in the shuffle. In her blog, Emma said this is a sign that memoQ has grown up. Unfortunately, I think it's a sign of a new generation of developers with a more business-oriented approach. I really hope it will pop up again in some later version. (In case you don't know what I'm talking about, there used to be an entirely useless button that not only invoked a clever quote from the Hitchhiker's Guide to the Galaxy but also a lovely cuckoo sound whose levity and, yes, uselessness often seemed to halve the burden of whatever translation the user was battling.)

Accompanying the change toward ribbons has been a serious cleanup of the memoQ interface, which I had long found way too cluttered. It's nice now that the ribbons expose features that many users might never have known about, plus there's ample explanation space with well-written tool tips.

One downside to memoQ's ribbon is that, like Trados's but unlike Déjà Vu's, it's not customizable. When I talked to Kilgray's Gábor Ugray about it, he promised that the memoQ-specific Quick Access ribbon will be customizable in the future.

Segmentation (i.e., how texts are segmented and abbreviations are correctly identified as non-breakables) has been improved in two ways: First, the previously ridiculously complicated segmentation editor is now actually human-readable (!). More importantly, though, they've come up with a feature that allows the user while preparing for translation to run a check for abbreviations that may not have been identified correctly in the respective source language, add those to the list of abbreviations (so they will be considered in later projects), and then -- and now comes the trick -- resegment the current project or document with that new rule in place.

If abbreviations were not found automatically, you can highlight them, add them to the list of abbreviations, and then also resegment. Very clever. (I'll bet any of you 10 bucks that by the end of this year at least two other tools will have this feature also.)
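For readers who like to see the idea spelled out, here is a minimal Python sketch of abbreviation-aware segmentation followed by resegmentation. It is not memoQ's implementation: the function names, the regular expressions, and the sample abbreviation list are all assumptions made for illustration, and real segmentation rules are considerably richer.

```python
import re

def segment(text, abbreviations):
    """Split after '.', '!' or '?' plus whitespace, unless the token that
    ends at the punctuation mark is a known (non-breaking) abbreviation."""
    segments = []
    start = 0
    for match in re.finditer(r"[.!?]\s+", text):
        candidate = text[start:match.start() + 1]
        tokens = candidate.split()
        if tokens and tokens[-1] in abbreviations:
            continue  # known abbreviation: do not break here
        segments.append(candidate.strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments

def suspect_abbreviations(text, abbreviations):
    """Heuristic check (an assumption, not a rule from any tool): a token
    ending in a period and followed by a lowercase word is probably an
    abbreviation that the list does not know yet."""
    found = re.findall(r"\b(\w+\.)\s+(?=[a-z])", text)
    return sorted(set(found) - set(abbreviations))

# Hypothetical sample text and starting abbreviation list.
sample = "The max. load is 5 kg. See Dr. Jones for details."
known = {"Dr."}

first_pass = segment(sample, known)            # breaks wrongly after "max."
suspects = suspect_abbreviations(sample, known)
known.update(suspects)                         # add them for later projects
second_pass = segment(sample, known)           # resegment with the new rule
```

In this sample the first pass yields three segments because "max." is treated as a sentence end; after the suspect is added to the list, the second pass yields two.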

On to the new TM editor. A little while back I received this communication from John Musters:

"In my dual role of internal localiser/memoQ support specialist (...), I've been spending tons of time + energy helping people with the cumbersome task of editing/reviewing existing memoQ TMs. As you might know, this is close to impossible using memoQ's built in features."

(And then John recommended the "recently-made-freeware Heartsome TMX Editor" -- which I talked about a few months ago.)

So, yes, memoQ's TM editor was, like those of most of its competitors, rather poorly designed, making it welcome news that it's been reworked significantly (see here for a good description). I'm not sure whether it'll be good enough for John -- but it'll suffice for most others.

And on to the last thing: Resources that can be shared in real-time without extra cost. "You can create up to 4 TMs (two pairs, each in a specific language pair and its reverse), and 2 multilingual TBs with up to 5 languages. Each resource can be shared with one other person with write access, and two more with read-only access. The size of TMs and TBs is limited to 50k entries." This is what it says in the memoQ help file, and it's very generous in its own way.
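To make those limits concrete, here is a small Python sketch that checks a planned set of shared resources against the numbers quoted above. The class and function names are hypothetical and nothing here talks to memoQ or the Language Terminal; it only encodes the quoted figures.

```python
from dataclasses import dataclass

# Limits as quoted from the help file; the constant names are my own.
MAX_TMS = 4
MAX_TBS = 2
MAX_TB_LANGUAGES = 5
MAX_ENTRIES = 50_000
MAX_WRITE_GUESTS = 1
MAX_READ_GUESTS = 2

@dataclass
class SharedResource:
    name: str
    kind: str             # "TM" or "TB"
    languages: tuple
    entries: int
    write_guests: int = 0
    read_guests: int = 0

def check_plan(resources):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    tm_count = sum(1 for r in resources if r.kind == "TM")
    tb_count = sum(1 for r in resources if r.kind == "TB")
    if tm_count > MAX_TMS:
        problems.append(f"{tm_count} TMs planned, limit is {MAX_TMS}")
    if tb_count > MAX_TBS:
        problems.append(f"{tb_count} TBs planned, limit is {MAX_TBS}")
    for r in resources:
        if r.kind == "TB" and len(r.languages) > MAX_TB_LANGUAGES:
            problems.append(f"{r.name}: more than {MAX_TB_LANGUAGES} languages")
        if r.entries > MAX_ENTRIES:
            problems.append(f"{r.name}: more than {MAX_ENTRIES} entries")
        if r.write_guests > MAX_WRITE_GUESTS:
            problems.append(f"{r.name}: too many users with write access")
        if r.read_guests > MAX_READ_GUESTS:
            problems.append(f"{r.name}: too many read-only users")
    return problems
```

A plan with one English-German TM of 12,000 entries shared with one writer and two readers would pass; the same TM with 60,000 entries and three readers would be flagged twice.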

What may not have been discussed fully in this context is something you can find in Kilgray's Privacy Policy for the Language Terminal. (To be perfectly clear: Kilgray is not trying to hide this as Gábor specifically mentioned this to me.) There it says this:

"If you subscribe to the translation memory and term base sharing service, Kilgray reserves the right to mine your data to improve their products in different language pairs. Entire segments or confidential information will not be disclosed to third parties, or included in the product. If you do not agree to this, do not use the translation memory or term base sharing service, but rather consider subscribing to a memoQ cloud server with one project manager license, where Kilgray does not analyse your data."

Now, this is not all that different from what Google or Microsoft does with your data when you use their machine translation services, so it's an interesting arena that Kilgray is entering.

A few months ago I pressed SDL about whether there is any way that they would use data processed in the cloud by their machine translation engine. They were outright incensed that I would even think about that option (so much so that they haven't really talked to me since). Since we know that SDL does not mine and analyze data provided by its users, it does make me wonder what Kilgray's move will do to the whole concept of using proprietary data for linguistic mining purposes. Will there be an outcry of some kind? There certainly does not have to be -- after all, Kilgray tells you what to do if you feel this is not a prudent way of dealing with your client's data: don't use that feature.

Still, I'd like to see a discussion about this, and I imagine that Kilgray's competitors would, too.

And in a wondersome manner, this brings us right into the next article.... 

 

3. Data and the Fine Print or: How to Create a Sh*tstorm (Premium Edition)

Most of you know that I like Twitter -- it's a good way to learn what's happening and at the same time have an additional motivation to process and curate information so that you can share noteworthy articles and information yourself. 'Twas in that spirit that I shared an article by Matthew Blake about the dangers of lawyers using Google Translate, specifically regarding quality and confidentiality. Not only that, but I even tagged on a "Good read" to my tweet.

It's true that I hadn't noticed it was "sponsored content" (but truth be told, I have ghost-written a number of articles for sponsored content placement and they were still pretty good, if I do say so myself), but either way I wasn't quite prepared for the storm that broke loose, a very small portion of which you can follow right here. I'm a little tired these days, so I didn't jump with both feet into the assumed controversy right away. But a few days after the original eruption, I actually revisited the contentious topic -- the issue of confidentiality when using services like Google Translate and Microsoft Bing Translator -- and was surprised by what I found.

 

. . . you can find the rest of this important article about data confidentiality in the premium edition. If you'd like to read more, an annual subscription to the premium edition costs just $25 at www.internationalwriters.com/toolkit. Or you can purchase the new edition of the Translator's Tool Box ebook and receive an annual subscription for free.

 

4. Close to Home

Here are a couple of housekeeping items I would like to mention that are near to my heart and my desk.

The obscenely priced Routledge Encyclopedia of Translation Technology should be an interesting read. I can really only judge that from the article on the history of translation technology in the US that I co-authored with Jennifer DeCamp. Again, not something that's in the typical freelance translator's price range, but it should be in the libraries of colleges with a translation program or any slightly larger translation company. (Thank you, Michel Huot, for reminding me to mention the book.)

 

Also, some of you will remember that I did a couple of webinars last January. Actually, "we" did a couple of webinars last January on what's missing in translation technology. Those were exciting events that had a real impact. Or did they? We should find out at the follow-up webinar on May 21. Make sure to mark it in your calendars.

 

A couple of months ago I mentioned in this journal that "I'm a man on a mission" to change the way we approach machine translation. In the latest TAUS Review, they graciously let me restate that mission -- even though they didn't really agree with it, which is all the more reason to appreciate their willingness to print it. I hope I may have a chance to discuss my hopes for MT with TAUS's chairman Jaap van der Meer in the next issue of the TAUS Review.

 

The Last Word on the Tool Box Journal

If you would like to promote this journal by placing a link on your website, I will in turn mention your website in a future edition of the Tool Box Journal. Just paste the code you find here into the HTML code of your webpage, and the little icon shown on that page will appear on yours with a link to my website.

If you are subscribed to this journal with more than one email address, it would be great if you could unsubscribe redundant addresses through the links Constant Contact offers below.

Should you be interested in reprinting one of the articles in this journal for promotional purposes, please contact me for information about pricing.

© 2015 International Writers' Group    

 
 
