All right, kind of a lame title for this article, especially when it's about
two, maybe even three rather exciting things at memoQ.
(The-company-formerly-known-as-Kilgray is now named after its flagship
product. That might be a good thing, though I do remember seeing the
first installment of their website with that screaming guy right next
to the company name and thinking, whoa, these people are different! And
what exactly is SkyCAT? ;-)
The memoQ of 2018 has two really interesting new features. One
is the mobile app Hey memoQ, still in its pre-beta stage (you
can sign up to be part of the beta group right here). One
interesting aspect of Hey memoQ is that it's a mobile app;
while the world of translation is kind of late in the game to adopt
mobile apps that actually help you be more productive, it's been fun to
see them coming out of the woodwork. The other intriguing aspect is
what it does. It's a voice recognition tool that works in a sizable number of
languages and dialects. (I'm not going to list them here, but you
can see them by following the link above.) The app allows you to
dictate into your phone and have it transcribed on your PC, making it
similar to what Tiago Neto has been working
on by cobbling together a whole bunch of tools and resources, only
now it's a bit more streamlined and tool-specific.
Let's step back a little, though.
As you can see from the link, memoQ is using the Nuance
Recognizer (Nuance is the company behind Dragon, the
premier voice recognition product, though a very limited one as far as the
number of supported languages goes). It accesses this through the Apple Speech
Recognition SDK (SDK = software development kit), so yes, you've
probably drawn the right conclusion, the app is available only for iOS
at this point. Gergely Vándor from memoQ said he assumes it's likely
that there will be an Android version at a later point if this proves
to be a successful first implementation. The idea for the app is about
a year and a half old and comes out of memoQ's "Innovations" department
headed by Gábor Ugray, one of the company's founders.
The system is set up so that the phone app on your iPhone or iPad talks to a
proxy server, which in turn communicates with both memoQ on
your computer and Nuance's speech recognition server (through the
above-mentioned Apple SDK). There is also some data traffic going from
your memoQ installation back to the speech recognition server
via a "hint" feature that sends segment-specific termbase data to
Nuance to increase recognition accuracy (that way "I" does not become
"eye" or "aye" in an English context). According to Gergely, this
"hint" feature is a bit of a "black box" for memoQ, so it may or may
not be useful and there likely will be an option to deactivate it.
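memoQ itself treats the "hint" mechanism as a black box, but the general idea of biasing a recognizer toward context terms can be illustrated with a toy rescoring step. Everything below is a made-up sketch (the scoring, the candidates, and the hint list are all hypothetical), not how Nuance's server actually works:

```python
def rescore(candidates, hints, boost=0.2):
    """Pick the recognition candidate, favoring those that contain hint terms.

    candidates: list of (transcript, acoustic_score) pairs from the recognizer
    hints: segment-specific terms, e.g. pulled from a termbase
    boost: hypothetical score bonus per matched hint word
    """
    hint_set = {h.lower() for h in hints}
    rescored = []
    for text, score in candidates:
        words = text.lower().split()
        bonus = boost * sum(1 for w in words if w in hint_set)
        rescored.append((text, score + bonus))
    # Return the transcript with the highest combined score
    return max(rescored, key=lambda pair: pair[1])[0]
```

With a termbase hint like "termbase", a slightly lower-scored but terminologically correct candidate wins out over the acoustically preferred one, which is the effect the "hint" feature is after.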
The dictation feature is streamlined elsewhere in memoQ with the inclusion
of some voice commands (stuff like "next segment" or "select XYZ,"
etc.), which will also likely be extended in the future (plus, the
upcoming beta phase should give the developers some clues about what
kind of commands are commonly used and which are not).
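How Hey memoQ separates commands from dictation isn't documented, but the underlying pattern, routing recognized phrases to editor actions and treating everything else as text, is easy to sketch. The command list and function names below are purely hypothetical, not memoQ's actual implementation:

```python
# Hypothetical mapping of spoken phrases to editor actions.
# The real Hey memoQ command set is defined by memoQ and not shown here;
# "editor" is stood in for by a plain list recording the actions taken.
COMMANDS = {
    "next segment": lambda editor, arg: editor.append("NEXT"),
    "previous segment": lambda editor, arg: editor.append("PREV"),
    "select": lambda editor, arg: editor.append(f"SELECT:{arg}"),
}

def dispatch(transcript, editor):
    """Treat a transcript as a command if it matches; otherwise as dictation."""
    text = transcript.strip().lower()
    for phrase, action in COMMANDS.items():
        if text == phrase:
            action(editor, None)
            return "command"
        # Commands like "select XYZ" carry an argument after the phrase
        if text.startswith(phrase + " "):
            action(editor, transcript.strip()[len(phrase) + 1:])
            return "command"
    editor.append(f"TYPE:{transcript}")
    return "dictation"
```

A beta phase would then amount to logging which branch each utterance takes and which phrases users actually say, exactly the kind of data the developers are hoping to collect.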
Can I let my enthusiastic self out for a little bit?
I love this tool!
I haven't yet tried it out myself, but here's what I think is so cool
about it: It's often been said by others (and myself) that voice
recognition is kind of the underrepresented productivity booster for
certain kinds of translators and certain kinds of translation. The
strange and somewhat frustrating thing about voice recognition is that
it really does not mesh well with other features provided by
translation environment tools. AutoWrite and AutoSuggest wait for input
that comes from single keystrokes, auto-assembly assumes that it's sometimes
quicker to rearrange than to translate from scratch, fragment-based machine
translation typically uses processes similar to AutoWrite, and so on
and so forth. But I'm excited that a translation environment developer
who is right in the midst of all this should be able to find ways to
mitigate some of those problems. Since it clearly does not make sense
to forego one productivity feature to gain another, there needs to be a
way to combine them. That's what I eventually hope to see from this.
And then there's the fact that dictation is suddenly open to so many more
languages and that it's free (which is an advantage even for those who
dictate in those few languages covered by Dragon).
One issue that probably still needs to be addressed in some way is privacy
with cloud-based voice recognition. Apple states on the website for its
speech recognition SDK: "Do not perform speech recognition on private or
sensitive information. Some speech is simply not appropriate for recognition.
Avoid sending passwords, health or financial data, and other sensitive speech
for recognition." And Nuance -- for a different product -- says:
"By using Dragon Anywhere, you expressly consent and agree that your
speech data, which may contain personal information, shall be stored
and processed in the United States. "Speech data" means the audio
files, associated text, transcriptions and log files provided by you or
generated in connection with Nuance products."
So it looks like voice recognition providers are not quite at the point
machine translation providers arrived at earlier this year (and let's
say this all together: "Thank you, GDPR!"), but that might be only a
matter of time.
One thing that surprised me with Hey memoQ was that memoQ chose to
use the Nuance products. Some of you will have read that earlier this
year, Nuance discontinued its Swype
keyboards for both Android and iOS, which had provided free access
to voice recognition as well. I'm not completely sure that the reason
was the powerful rise of Google's Gboard, but chances are it
was. Gboard provides keyboard access as well as voice access to
hundreds and hundreds of languages. It's not clear whether all the languages
listed (click on "See supported languages" at the bottom) are
voice-supported, but either way the list is much, much longer than the
one from Nuance, and likely with a more secure future. Maybe the next
version of Hey memoQ will (have to) make that switch.
The second (and third) feature that I like in memoQ is the video
preview tool. memoQ is not the first with a tool like that (Star Transit has had this
for quite a while and Wordbee's came out
essentially simultaneously with memoQ's), but it's still important
and kind of a no-brainer with the incredible rise in subtitle
translation. The tool is based on the VLC Media
Player -- which you are likely already familiar with because
it's probably installed on your computer anyway -- and supports essentially any
video format that VLC itself supports. The only caveat is that
you need not just the video but also a separate subtitle file (either
an SRT file or an Excel file that contains the translatable
text as well as the time stamps). The information in that subtitle file
governs which position of the video is shown as you translate and
as you see your translated subtitles in the preview. You can also choose to
play longer passages spanning a number of subtitles to get a
better idea of the context in the video.
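How memoQ reads the subtitle file internally isn't public, but the way time stamps in an SRT file can drive a preview position is simple to sketch. The minimal parser below handles a stripped-down SRT cue list; it's a simplified, hypothetical stand-in for what the preview tool actually does:

```python
import re

# Matches the SRT timing line, e.g. "00:00:01,000 --> 00:00:03,500"
CUE_TIME = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)

def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, subtitle_text) cues."""
    cues = []
    for block in text.strip().split("\n\n"):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # skip malformed blocks (need index, timing, text)
        m = CUE_TIME.match(lines[1])
        if not m:
            continue
        h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, m.groups())
        start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
        end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
        cues.append((start, end, " ".join(lines[2:])))
    return cues

def seek_position(cues, segment_index):
    """A preview can jump to the start time of the cue being translated."""
    return cues[segment_index][0]
```

Given such a cue list, showing the video at the segment you're working on is just a lookup of the cue's start time, which is essentially what "the information in that subtitle file governs the position" amounts to.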
In addition, since memoQ developed this on the basis of the (open-source) VLC Media Player,
they had to open-source the code for the preview tool. By the time you
receive this Tool Box Journal, the code should have been posted
to GitHub and might really be very useful. For instance, it could
easily be used to build a preview/synchronization tool for video games,
for software localization or other translation management systems, and
so on and so forth.
And maybe, just maybe, this is the first step to a library of third-party
apps that memoQ might offer at some point?
Of course, Gergely is right when he said that memoQ "should focus on core
translation technology" and leave the -- albeit important -- rest to