You will remember the conversation I started in the last Tool Box
Journal with Félix do Carmo, a translator and machine
translation researcher, about best practices for using neural machine
translation. (If you don't have that issue handy, you can read it right
here.) This is how our conversation continued and ended:
... What's being done in academia with NMT in a more practical manner
to move beyond "post-editing," as vague as that term might be?
I would say that current research is still very much focused on using
and applying NMT to produce better output to feed to traditional tools.
We should mention four areas of current research that will affect the
way NMT output will be presented to translators: INMT, AMT, APE, and QE.
Interactive Neural Machine Translation (INMT) is dedicated to developing ways to
incrementally feed output to translators from neural networks trained
on parallel corpora. These systems model the translation work as
described above: the translator generates the translation, starts
writing, the NMT system suggests the next fragment, and, all going
well, the translation is created faster than if the translator did not
have this "voice over the shoulder." For these systems to be accepted
and become regular tools translators use, they need to feed suggestions
that are adjusted to each context. Since INMT outputs words that are
constrained on the words already written, there is the expectation that
the suggestions presented by these systems will be better than those
possible with SMT engines. However, this is still an area which raises
more questions than answers. For example, can you constrain the output
not just on the previous target words, but also on a list of validated
terminology, and control how accurate the whole process is?
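To make that last question a little more concrete, here is a minimal Python sketch of the idea. It is only an illustration under loud assumptions: a real INMT system constrains the decoding inside the neural network, whereas this toy version simply filters a list of candidate translations (assumed to come from some engine's n-best list) by what the translator has already typed, and then prefers candidates that contain validated terms. The function name `suggest_continuation`, the sample sentences, and the term list are all hypothetical.

```python
from typing import List, Optional


def suggest_continuation(prefix: str,
                         candidates: List[str],
                         validated_terms: List[str]) -> Optional[str]:
    """Return the remainder of the best candidate that matches the typed prefix.

    Assumption: `candidates` is a list of full-sentence translations that
    some MT engine has already produced for the current source segment.
    """
    # Keep only candidates compatible with what the translator already wrote.
    compatible = [c for c in candidates
                  if c.lower().startswith(prefix.lower())]
    if not compatible:
        return None  # nothing to suggest; the translator keeps typing

    # Prefer the candidate that contains the most validated terms.
    def term_hits(candidate: str) -> int:
        return sum(term.lower() in candidate.lower()
                   for term in validated_terms)

    best = max(compatible, key=term_hits)
    # Only the part after the prefix is offered as the "next fragment".
    return best[len(prefix):]


candidates = [
    "The print driver needs to be installed first.",
    "The printer driver must be installed first.",
    "You must install the printer driver first.",
]
print(suggest_continuation("The pr", candidates, ["printer driver"]))
```

The sketch leaves the hard part of the question open: it says nothing about how accurate the whole process is, only how a prefix and a term list could narrow down what is shown.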
Adaptive Machine Translation (AMT) has been proposed as a term to describe
systems that learn specific traits of each translator's work and adapt
suggestions to those traits. It is not yet clear how this will be done,
which traits these are (some call it "style," which is one of the
vaguest terms one can use), and how effective this actually is.
A complementary area that is being researched is Automatic Post-editing
(APE). The name may sound like another way to replace translators, now
not only in the translation stage but also in the editing and revision
stages. Actually, I would say this is just another way to improve the
output. It has been shown that applying NMT technology to APE improves
the output of MT systems. However, again, despite this improvement in
the output, this does not change the nature of the translating/editing
work that is required, and the fact that this work requires
An area we must
also refer to is Quality Estimation (QE), which tries to give some
indication of the segments that may not require much editing, and those
that may require extensive translating work. QE may also serve to
highlight words that are probably wrong in a translation suggestion.
This is complementary information which may help in the translation
decision process. The use of NMT methods for QE has also enhanced the
capacities of QE methods.
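As a rough illustration of how QE information could be put in front of a translator, the following Python sketch sorts segments into those that probably need light editing and those that probably need real translating work, and flags individual words with low confidence. The scores, the thresholds, and the function names `triage` and `flag_words` are assumptions made for the example; actual QE systems produce such scores with trained models, which are not shown here.

```python
from typing import Dict, List, Tuple


def triage(segment_scores: List[Tuple[str, float]],
           threshold: float = 0.7) -> Dict[str, List[str]]:
    """Split segments by an (assumed) sentence-level QE score in [0, 1]."""
    buckets: Dict[str, List[str]] = {"light_edit": [], "heavy_work": []}
    for segment_id, score in segment_scores:
        key = "light_edit" if score >= threshold else "heavy_work"
        buckets[key].append(segment_id)
    return buckets


def flag_words(words: List[str],
               word_scores: List[float],
               threshold: float = 0.5) -> List[str]:
    """Return the words whose (assumed) word-level QE score is below threshold."""
    return [w for w, s in zip(words, word_scores) if s < threshold]


# Hypothetical scores for three segments of one project.
print(triage([("seg-1", 0.92), ("seg-2", 0.41), ("seg-3", 0.70)]))
# Hypothetical word-level scores for one suggestion.
print(flag_words(["Press", "the", "force", "button"], [0.9, 0.95, 0.2, 0.8]))
```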
All four areas -- INMT, AMT, APE, and QE -- complement each other in
helping the translator: they provide the translator with better
suggestions (as interactive/dynamic pieces for the translator to build
his translation or as better full sentences for him to edit), and they
help filter out bad suggestions, guiding his attention to what may
really require more work.
To describe how to leverage this technology to give the translator more
than just better output for him to edit, the discussions have been
going on around terms like "augmented translation" or
"knowledge-assisted translation," but the discussion started a few
years ago when we started talking about next generation translation
tools. Apart from the integration of some of the concepts above, like
INMT in Lilt, or QE in Memsource, most of these
ideas have yet to make it off the paper and become a reality in the daily
lives of most translators.
There is a tendency in academia and the industry to discuss the names more
than make the revolution. One of the most recent signs of that is the
suggestion to stop talking about NMT (because it is said that it is now
officially the same as MT), and to talk instead about Artificial
Intelligence (AI). But all these new terms simply express the challenge
of combining not just the plethora of sources we mentioned earlier but
also the plethora of technological approaches in the same tools.
I do actually like the suggestion to talk about AI instead of talking
about NMT, and it's also interesting to see that some of the research
areas have already found their way into tools, including the tools that
you mention but also SDL, Intento, and ModernMT. As a last question, I
would like to ask you something practical, though. The typical
translator does not have access to customized MT engines (with the
possible exceptions of the adaptive engines mentioned above, or if the
client gives access to a customized MT). If the translator chooses to
use an MT engine, they will end up using engines like Google,
Microsoft, or DeepL. How can one of these engines -- or indeed several
at the same time -- be used more productively or creatively than having
the translator essentially just responding to the suggestions that
these engines make? How can the translator be in the "driver's seat"
when using these resources?
For me, the next technological step will be personalization. (Actually,
it is not such a ground-breaking proposal; this is another buzzword
that has been hanging around for a while.) As our industry matures, we
should identify the value of each node in the supply chain, and we
should have technology and management of resources adapted to each of
those nodes. Corporations will go on managing big data, but they will
suffer from the anonymity and genericity of that data. LSPs will need
to manage their clients' data judiciously, and freelancers will need
tools that help them manage their own data locally.
To be in the driver's seat, translators will need to have a clear right to
manage the data they produce, and to keep personal TMs of all
translations they do, more than to have access to other translators'
and companies' resources, or to an increasing number of tools and
technologies. Translators need to know their work better, and they will
need tools that record and give them better insight into what they have
been doing in previous projects, whether these are individual projects
or collaborative ones.
In a scenario in which your translation tool receives input from MT engines,
personal, client or collaborative project TMs, terminology databases,
previous answers to queries, online discussions on translation
suggestions, and many other resources, a translator needs different
things (see below).
The main thing about tools that are adapted to specialized translators is
that they should work in the background to feed the best suggestions
possible, but the whole translation decision needs to be made by the
translator.
As for the details of how to use these technologies productively and
creatively, instead of just responding to suggestions, let's think
about a futuristic scenario in which translators work in a mode simply
called "Interactive Translation," a scenario which integrates MT and
TM, different text resources and online features, and supports both
translating and editing work. And it supports both "interactive" and
"pre-translation" translators, those who prefer to type over some text,
and those who prefer to write from scratch.
In Interactive Translation, everything comes down to the challenges of
building a good interaction with the translator, and this means having
an interface that adapts dynamically to his needs. I can describe parts
of how I envisage a tool that adapts to translators in the future.
The interface should be very clean and uncluttered at the beginning,
helping the translator read the text he has to translate, maybe even
presenting him with an automatic summary of the text. It may also show
him other projects in his pool of resources that may be associated with
that text, and terms and segments which may constitute the main issues
he will deal with throughout the translation. Or it may make those
choices for him and not show them at this stage. At this initial stage,
the tool will also have very detailed statistics which estimate effort,
quality of the MT output, and other details which may be useful for
more advanced users, like the possibility to extract rules from style
guides and client instructions and to automate their checking.
The translator may approach the translation in many different ways, from
the first segment to the last, starting with those problematic
instances, or following any other structure he identifies in the data
to translate. In the background, the tool selects the best resources
for each segment, either a TM, an MT engine solution, or a composition
from fuzzy matches, terminology, and any other resources.
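The background selection described above could look something like the following Python sketch. It is a deliberately simple stand-in, not a description of any existing tool: it compares the source segment with the sources stored in a small TM using `difflib.SequenceMatcher` from the standard library, offers the TM target if the match is close enough, and otherwise falls back to an MT suggestion. The function name `best_suggestion`, the 0.85 threshold, and the sample data are assumptions for the example.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional


def best_suggestion(source: str,
                    tm: Dict[str, str],
                    mt_output: str,
                    fuzzy_threshold: float = 0.85) -> Dict[str, Optional[object]]:
    """Pick a TM match if it is similar enough, otherwise the MT output.

    `tm` maps source segments to their stored translations.
    `mt_output` is assumed to come from whatever MT engine is available.
    """
    best_source, best_score = None, 0.0
    for tm_source in tm:
        # ratio() returns a similarity between 0.0 and 1.0.
        score = SequenceMatcher(None, source, tm_source).ratio()
        if score > best_score:
            best_source, best_score = tm_source, score

    if best_source is not None and best_score >= fuzzy_threshold:
        # Keep the origin and score so the translator can "dig deeper".
        return {"origin": "TM", "score": best_score, "text": tm[best_source]}
    return {"origin": "MT", "score": None, "text": mt_output}


tm = {"Press the power button.": "Appuyez sur le bouton d'alimentation."}
print(best_suggestion("Press the power button.", tm, "(MT output here)"))
print(best_suggestion("Close all open windows.", tm, "(MT output here)"))
```

Returning the origin and the score along with the text is a design choice made with the translator in mind: the suggestion can be shown on its own, while the information about where it came from stays one step away.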
When the translator starts translating, he will see the best suggestion the
machine comes up with for each segment. If he sees that this suggestion
is perfect, he will validate it. If he wants to know more about that
suggestion, he will have a simple way to dig deeper and find where it
comes from, how reliable it is, and whether there are alternatives from
other sources which he might prefer. And he can decide to act on these
suggestions one by one or to aggregate them -- for example, dealing
with all full matches from a reliable TM at once. But if he needs to
edit the suggestion, he will have several forms of support described in
a bit more detail below.
The suggestions from the tool are always presented in full, but the
translator manipulates them at his will, moving things around, deleting
words and inserting new ones. When he selects a word to apply any of
these actions, the tool adapts and shows different supports. For
example, when he decides to replace words without moving them, the
system should be ready to present alternatives for that position, which
may simply be a change in the form of that word; when he moves words
around, the system should be able to suggest changes that depend on the
new position of those words. These suggestions are not the same for
each translator or for each project. So, it is fundamental that the
tools learn from the translator's behavior, to predict regular edits,
and to save and reuse them in similar contexts in other projects.
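One very simple way to picture a tool that learns regular edits is a counter of the replacements a translator makes. The Python sketch below records each time a suggested word is replaced by another one and, once the same replacement has been seen often enough, proposes it in advance. The class name `EditMemory` and the `min_count` setting are hypothetical, and the sketch ignores context and word position entirely, which a real tool would have to take into account.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Optional


class EditMemory:
    """Remember which replacements a translator makes repeatedly."""

    def __init__(self, min_count: int = 2) -> None:
        # A replacement is only predicted after it was seen min_count times.
        self.min_count = min_count
        self._edits: DefaultDict[str, Counter] = defaultdict(Counter)

    def record(self, suggested: str, chosen: str) -> None:
        """Store one edit: the tool suggested one word, the translator chose another."""
        if suggested != chosen:
            self._edits[suggested][chosen] += 1

    def predict(self, suggested: str) -> Optional[str]:
        """Return the translator's usual replacement, if there is a regular one."""
        if suggested not in self._edits:
            return None
        chosen, count = self._edits[suggested].most_common(1)[0]
        return chosen if count >= self.min_count else None


memory = EditMemory()
memory.record("utilize", "use")
memory.record("utilize", "use")
memory.record("utilize", "employ")
print(memory.predict("utilize"))
```

Saving such a memory per translator, rather than per project, is what would allow the edits to be reused in similar contexts in other projects.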
There are other activities translators do which may be supported by these new
tools, like web searching, or making annotations and queries. The
knowledge behind decisions supported by these resources is not
integrated into translation tools, and it would be great to have this
closer at hand.
When the translator stops, the tool can show him statistics on how far he is
in terms of the whole project, or other assignments he is currently
engaged in, and how the project is in terms of final checks. Before he
decides to submit, the tool can do a QA check and reuse the records of
the decisions he made to guide him in revising the project. For
example, it may help him prepare a report for the reviser with the most
troublesome passages, or a list of the sources he used for new
I could go on dreaming of the details of such tools, but our dreams as
translators are not the same for everyone. We realized in our
conversation that you dream of tools which are not so focused on
editing as the ones I dream of, but which rely on the translator
generating the translation and the tool playing a not so intervening
role.
The main idea I take from this conversation is how we moved from the impact
of existing technologies to a discussion on how we use them. For me, this
is the right way to discuss technology: not to be afraid of how MT or
any other technology determines our work methods or even the definition
of our tasks, but to focus on the type of research on technology that we need.
There is still a lot of research to be done on how each one of us
writes, edits, searches, trusts his tool to search for him, or prefers
to choose himself, how regular our methods are, how we deal with more
productivity and more tiredness, or how all these factors change
according to project, motivation, or even mood. It was great to see how
you and I share the excitement of thinking in terms of the future, and of
trying to imagine how current and new generations of translators will use
smart tools that adapt to them.