Possible (Premium Edition)
I'm a man on a mission these days. Sort of. Here's my mission: As I've said
in previous Tool Box Journals, it's in all our
interests to find better ways of utilizing machine translation than we
have so far with post-editing. And "all" really means all,
including translation professionals and translation buyers. There is a
lot of potential in harvesting data from machine translation
suggestions, but overall I think we're going about it the wrong way.
Now, there are some machine-translated projects where post-editing is a
good option. Those are projects with a highly trained MT engine in
place for a well-suited language combination and text type where the
post-editor essentially only has to do touch-up work. Anything else but
post-editing would seem silly in that case.
But do you always work with engines and in situations like that? Nope,
neither do I. (I've actually never done that!)
So what are other ways to access content that comes out of a machine
translation? Here's how MT engines work: Internally they come up with
many, many propositions as translations for the text to be translated,
but typically they expose only one of those options -- that's the one
that's supposed to be post-edited in the old, and very often tired,
model. We all know that there will be some parts in that suggestion
that will be OK, even really good, but we also know that in its
entirety the suggestion often needs so much work that it becomes
tedious to use it. But here's the key: The machine translation engine
holds in its dark recesses pretty much all or most of the translation
elements we will end up using; it just doesn't release them to us. You
can expose some of those usually hidden "phrase tables" in Google
Translate when you look at the different suggestions for each phrase by
clicking on parts of a target sentence. You can also see it in this
beta version of the Moses-based WIPO MT engine.
(I just watched Obama's speech on immigration, so I'll borrow some of his
fervor and authority for this persuasive paragraph.) Why, my fellow
translators, should we trust the haphazard decision-making of the MT
system on what to present to us when we know that we know better? Why
should we abrogate our judgment when we know that the MT has what we're
looking for on a phrase or subsegment level but is simply unwilling to
share it with us? I believe in a fair and just system where we all have
access to all the data. (Enough Obama-speak.)
Some tools already have features that are moving in the right direction:
- Wordfast Classic (and presumably the long-announced new version of Wordfast
  Pro) has an AutoSuggest feature that displays both segments and
  terminology (which actually ends up being subsegments) as you type from
  the various MT engines you can connect (under Setup> AS).
- Trados Studio offers two plugins that use AutoSuggest for whole and
  partial segments, unfortunately only from one MT at a time: the now
  free (and very helpful) MT AutoSuggest, which suggests from one of the
  associated MT engines, and Google Translate AutoSuggest, which is also
  free and allows you to use phrase suggestions from Google Translate
  without paying the typical Google usage fee.
- Déjà Vu X3 offers AutoWrite options for all associated MTs for whole and
  partial segments (under File> Options> General).
I'm really excited about the progress that I've seen in these tools.
However, none of these tools actually retrieves several suggestions
from one MT engine per sub-segment (instead, they use several engines
with one suggestion each). In addition, none of them is dynamic in the
sense that the MT is continuously queried and re-queried on the basis
of what has already been chosen as the right translation.
The power of getting several suggestions from one MT makes so much
sense with customized machine translation engines. It might be helpful
to get lots of suggestions from a non-specialized MT, but from a
specialized MT, it's kind of like TM on steroids. It provides all the
different combinations of language that you trained it on. Will it come
up with the right solution? Sure it will, if you can dig deep enough
and look for fragments rather than the whole segment, and if the
digging can happen purposefully through the keystrokes that you enter
rather than some awkward search functionality, thus placing the
oversight and control squarely in your hands.
And what if you couple that with an interactive reformulation of the suggestions based
on what you previously decided on as your translation? (Please stop
drooling on your keyboard!)
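The keystroke-driven digging for fragments described above can be sketched in a few lines. This is only an illustration under stated assumptions: the n-best list, its scores, and the function names build_fragment_index and suggest are hypothetical, and no existing tool or MT engine is claimed to work this way.

```python
def build_fragment_index(nbest, max_len=4):
    """Collect every word n-gram (up to max_len words) from a list of
    (translation, score) hypotheses, keeping the best score per fragment.

    `nbest` is an assumed data shape: several alternative translations of
    one source segment, as an MT engine might hold internally.
    """
    index = {}
    for text, score in nbest:
        words = text.split()
        for start in range(len(words)):
            last = min(start + max_len, len(words))
            for end in range(start + 1, last + 1):
                fragment = " ".join(words[start:end])
                index[fragment] = max(score, index.get(fragment, float("-inf")))
    return index


def suggest(index, typed, limit=5):
    """Return the best-scoring fragments that start with what was typed."""
    typed = typed.lower()
    if not typed:
        return []
    matches = [(frag, score) for frag, score in index.items()
               if frag.lower().startswith(typed)]
    # Highest score first; alphabetical order breaks ties.
    matches.sort(key=lambda m: (-m[1], m[0]))
    return [frag for frag, _ in matches[:limit]]


# Hypothetical hypotheses for one source segment.
nbest = [("the contract is terminated", 0.9),
         ("the agreement ends", 0.6)]
index = build_fragment_index(nbest)
print(suggest(index, "agr"))  # ['agreement', 'agreement ends']
```

Re-querying on the basis of what has already been accepted could be approximated by rebuilding the index from only those hypotheses that are still compatible with the confirmed text, but that step is left out of this sketch.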
. . . you can find some suggestions on how we
can realize some of this in the premium edition. If you'd like to read
more, an annual subscription to the premium edition costs just $25
at www.internationalwriters.com/toolkit. Or you can purchase the new edition of the Translator's Tool Box ebook and receive an annual subscription for free.
Across Language Server v6: The Most Important New Features at a Glance:
Dashboard: Relevant project information at a glance.
Review Mode: Easy involvement of international subsidiaries in the
Data Cube: Easy collection of indicators.
Management Cockpit: Optimized work process for project managers.
Find out more about Across version 6 at www.across.net/en/support/whats-new/.
2. This 'n' That
I already mentioned the MT AutoSuggest and the Google
Translate AutoSuggest apps for Trados Studio in this Journal
-- but overall it's hard to keep up to date with the many Trados
Studio apps that are continuously being released. If you use Trados
Studio and want to stay current, you should probably follow Paul
Filkin's Twitter account to keep abreast of Paul's and other
blog posts that will keep you in the loop. And you also might want to
have a look at the presentation that Finnish
translator-extraordinaire Tuomas Kostiainen gave at this year's ATA
conference in which he attempts to structure and list the available
apps in an insightful manner (along with giving some good suggestions
on what apps are helpful).
week, Service Pack 2 for Trados Studio 2014 was released. I
didn't have much time to look at it in great detail (though the upgrade
went very smoothly), but others have. Here are some good blog posts on
what to expect:
All of these authors are more proficient in Trados Studio than I am,
so I'm happy to let them have the floor.
Henk contacted me recently about the work that he has done on the humongous
termbase exchange download from the IATE termbase. He has developed a
program that allowed him to clean the IATE termbase (of tags, synonyms,
etc.) and extract the terms by language pair so you can easily load
them into tools like Trados Studio, Déjà Vu 2/3,
memoQ, CafeTran (for these tools he has
actually very specifically prepared some file formats) or any other
tool that supports exchange file formats. Do you need his services if
you want to use the data? Absolutely not -- that is, if you have
several hours (and the expertise) to process the data like he has.
Sounds like a pretty good deal to me for only 10 euro per language pair!
Studio 2014 Service Pack 2 has arrived
a range of new SDL Trados Studio 2014 and SDL
MultiTerm 2014 features and enhancements that will further
enhance your translation productivity.
terminology performance, including no more Java
- Edit source for all file types
- Support for
more about these features in the Freelance
edition and Professional
3. Mother Goose to
Christmas is coming. The goose is getting fat....
Is finding presents for your peeps a task you excel at?
Never fear if it is not. You can calm your consternation.
The best gift for all is Found in Translation.
You can find information on how to bulk order
lots of copies for greatly reduced prices right here, and if you'd
like me to sign the books before you send them out, we can find a way to
do that as well (just send me an email so we can talk about it).
"We ordered FOUND IN TRANSLATION for the
majority of our staff. So nice that there are authors like you and
Nataly Kelly." @EhlionGlobal
4. Forced Sounds
There are certain sounds and words that almost force themselves on you. Take
"MateCat", for instance. Here's what I had to say a year ago:
As any cat owner knows (my very proud self
included), mating cats ain't a pretty sight or sound.
Are you with me so far?
And then there is this:
however, might be much easier on your eyes or ears.
I still can't get it out of my ear, though...
Now, though, MateCat has officially been released, and here's some
of what I said in my original article:
I talked about MateCat with Alessandro Cattelan and Marco Trombetti from
Translated in Italy (the developers behind MyMemory).
MateCat was an EU-funded project that was supported
by a large team of very impressive
caliber, including Philipp Koehn, one of the leading developers of
statistical machine translation and the open-source MT engine Moses.
The goal at first was purely academic, as a tool
to research how much time post-editing of machine translation takes.
While there seem to be some kinds of standards emerging for how to
compensate for post-editing machine translation, in many ways it's
still a nebulous affair, so tools like this can be helpful.
Because it's EU-funded, the core of the project had to be open-source. This is
what you find when you go to this site. Essentially it's
the same as its commercial counterpart right here,
with the one (important) difference that the open-source version does
not provide for any file filters except XLIFF. So if you were to choose
to use it, you would first have to convert your Word, InDesign,
or XML files to XLIFF with something like the free localization
champion Rainbow. On the other
hand, right now there is really no reason to stubbornly use the open-source
version rather than the commercial one, since at this point both are free. If
you stop by the sites, you'll find a very easy and spartan
browser-based interface with virtually no project management facilities
except the ability to select your language combination, select a
machine translation engine (or MyMemory, which contains both a
large TM and MT), and create your personal TM (you don't have to do
that, but if you don't, all your translations end up in MyMemory,
which might really frustrate your client). Once that is done, you drag
your file or files over to the page where they are analyzed. The
analysis uses the concept of "equivalent words," something you might
know as leveraged or weighted word count, i.e., word counts that take
repetitions and matches into consideration.
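As a rough illustration of how such a weighted count is computed, here is a minimal sketch. The match categories and the percentages are assumptions chosen purely for the example; they are not MateCat's actual rates, which are not given here.

```python
# Hypothetical share of each word that counts toward the total, in percent.
# These values are illustrative assumptions, not MateCat's real weights.
WEIGHTS_PERCENT = {
    "new": 100,        # no match found: every word counts in full
    "fuzzy": 60,       # partial TM match
    "exact": 30,       # 100% TM match
    "repetition": 30,  # segment repeated within the same file
}


def equivalent_words(counts, weights=WEIGHTS_PERCENT):
    """Weight raw word counts per match category and sum them."""
    unknown = set(counts) - set(weights)
    if unknown:
        raise ValueError(f"no weight defined for: {sorted(unknown)}")
    return sum(counts[cat] * weights[cat] for cat in counts) / 100


# A file with 1,800 raw words shrinks to 1,390 equivalent words.
analysis = {"new": 1000, "fuzzy": 500, "exact": 200, "repetition": 100}
print(equivalent_words(analysis))  # 1390.0
```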
The table-based translation view is also
extremely simple and essentially only lets you select from the displayed
translation memory matches and MT translations, or copy source to
target and jump to the next segment. That's it.
There are several things happening in the background, though. First, you are of
course being timed (remember this was the initial purpose of this tool
to start with), and all kinds of statistical data is being collected.
I'm not sure who has access to that kind of data, but you will
eventually be able to install MateCat on your own computer or
server, and then no one should be able to see it (you can view that
data now by selecting the Editing Log link at the bottom of the
screen). Also, the integrity of the individual segments is verified
constantly to make sure that no tag or other element is missing.
Lastly, and that's probably the most interesting part, the associated
statistical machine translation engine learns immediately from your edits.
Now, statistical machine translation by default learns from
the data you feed into it, but the updates only happen at certain times
when the new data is computed and read into the engine (for instance,
try to change Google Translate and assume that "your" correction
will show up next time -- it's not gonna happen right away, but you
might see it pop up in two weeks).
MateCat is able to do this through a caching procedure rather than translation
memory (you can find a description of it in this very technical article). To
come back to the file formats, the commercial version essentially
supports all the file formats you want (all Office and OpenOffice
formats, XML and HTML, TTX and even SDLX ITD, all the
necessary DTP formats as well as a good number of software development
formats) and of course XLIFF. In many ways, what you find today is very
similar to what you found a year ago.
It's more stable and slightly more feature-rich now . . .
. . . you can find the rest of this article in the premium edition. If
you'd like to read more, an annual subscription to the premium edition
costs just $25 at www.internationalwriters.com/toolkit. Or you can purchase the new edition of the Translator's Tool Box ebook and receive an annual subscription for free.
24th, 2014 -- Kilgray is very excited to have launched www.memoQ.com!
With months of work behind it and efforts contributed from literally the
whole Kilgray team, a new website is now available at www.memoQ.com.
With a focus on simplicity of navigation and accessibility of information,
the new website enables visitors to specifically understand how they
will benefit from memoQ. www.memoq.com/why-memoq/benefits
"The website is clearly a step forward in our commitment to better serve
those interested in Kilgray and memoQ solutions. From recorded webinars
to case studies, and featured Kilgray customers, www.memoQ.com is yet another demonstrative feat in
Kilgray's ongoing progress," says Bryan Montpetit, Vice-President of
Sales and Marketing.
Visit our new website -- www.memoQ.com.
5. Sticky Stuff
I usually really like it when something that I write provokes excitement
among readers. So even the semi-upset responses to my article on Clay
Tablet were appreciated. To recap, Clay Tablet is the
middleware product that allows access to many content management
systems and that was recently purchased by Lionbridge. Some readers
said that I completely blew this out of proportion and that the
acquisition was not nearly as important as I made it sound. I agree
that it was not as relevant as when SDL bought Trados or Idiom,
both of which had a much bigger impact on our industry as a whole (and
probably were much more expensive), but I do think the Clay Tablet
acquisition was an important step. As I said, "It'll make it much
harder for smaller vendors to get into content management systems for
translation purposes," and that's principally important.
The folks at Clay Tablet also felt they were kind of short-changed
by my article. They asked me to share with you that they will "remain a
neutral provider of connectivity solutions between any content system
and any translation provider or technology." I'm passing that along to you.
Clay Tablet's Robinson Kelly had
another comment about the responses of other tool providers that I
published, and I agree that those comments were indeed revealing:
The notion of "connectors" needs to be fully
appreciated by tools vendors. Many of the folks you surveyed replied
with a strategy of "enhanced APIs and SDKs" which is great -- but does
not solve the actual problem faced by CMS users at the world's
enterprises. Only a robust, feature-rich, mature, reliable, fully
integrated and scalable integration right into the deepest corners of a
content system will meet the true needs of the harried, non-technical,
modern marketer. Building, let alone maintaining, such integrations is
I know -- and that's exactly why it's unfortunate that the only
independent connection technology dedicated to exactly that type of
integration is no longer independent.
The Last Word on the Tool Box Journal
If you would like to promote this journal by placing a link on your
website, I will in turn mention your website in a future edition of the
Tool Box Journal. Just paste the code you find here into the HTML code
of your webpage, and the little icon that is displayed on that page
with a link to my website will be displayed.
If you are subscribed to this journal with more than one email address, it would be
great if you could unsubscribe redundant addresses through the links
Constant Contact offers below.
Should you be interested in reprinting one of the articles in this journal for promotional purposes, please contact me for
information about pricing.
© 2014 International Writers'