Monday, March 26, 2012

What Machine Translation Can and Can’t Do

This is a guest post by Hassan Sawaf. Hassan Sawaf is the Chief Scientist at SAIC Linguistics Division where he works on the Omnifluent™ linguistics solution. He can be reached at hassan.sawaf@saic.com.

I have encountered quite a few myths about the capabilities of machine translation (MT) technology in the nearly twenty years I have worked in the natural language processing and MT fields. Recently, the technology has developed to the point that most translators will use or encounter it at some point in their careers. With this in mind, I have put together a short list of what MT can and can’t do so that translators can make the best use of their time and resources.

MT Can: Translate speech-to-speech and speech-to-text.

It is often assumed that machine translation covers only text-to-text translation. However, recent advances in MT and automatic speech recognition (ASR) technology have enabled far-reaching applications. Automated foreign language closed captioning is one example; conversing with a hotel owner while traveling is another.

MT Can’t: Convey information instantly from all mediums or types of media.

MT technology is not yet the Star Trek “Universal Translator.” Even in the most advanced systems with fully integrated ASR, speech-to-speech and speech-to-text translation runs in near real time rather than instantly, and it remains a non-trivial process. To minimize potential errors, a spoken phrase can be displayed and edited by a human before the machine synthesizes and voices it. Also, MT can’t reliably convey information from every type of media, such as still or moving pictures.
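For readers who think in code, the flow described above can be sketched roughly as follows. This is a minimal illustration, not how any particular product works: every name here (`speech_to_speech`, `recognize`, `review`, `translate`, `synthesize`) is a hypothetical stand-in, and placing the human review on the recognized phrase before translation is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Turn:
    recognized: str  # raw ASR output in the source language
    reviewed: str    # phrase after the optional human edit
    translated: str  # MT output that is handed to speech synthesis


def speech_to_speech(
    audio: bytes,
    recognize: Callable[[bytes], str],         # hypothetical ASR step
    review: Callable[[str], Optional[str]],    # human edit; None means "accept as is"
    translate: Callable[[str], str],           # hypothetical MT step
    synthesize: Callable[[str], None],         # hypothetical text-to-speech step
) -> Turn:
    """Run one spoken phrase through ASR, human review, MT, and synthesis."""
    recognized = recognize(audio)

    # Assumption: the phrase is shown to a person who may correct it
    # before anything is translated or voiced.
    edited = review(recognized)
    reviewed = recognized if edited is None else edited

    translated = translate(reviewed)
    synthesize(translated)
    return Turn(recognized, reviewed, translated)
```

Each stage adds its own delay, and the review step waits on a person, which is one way to see why the result is near real time rather than instant.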

MT Can: Be used for important business, transactional or personal matters.

Many people are surprised when I tell them MT can be used in a healthcare setting. However, MT technology is already being employed by hospitals, via tablet devices, to better communicate with their patients. This is possible because advanced MT systems can be tailored to specific domains or industries, so that highly technical terms, like “hilus of the lung,” are as easily recognizable as common words like “home.”

MT Can’t: Be used in situations when style, creativity or instant clarity is required.

MT is not ideal for content that requires stylistic or creative input, such as marketing or legal documents. While the technology can be used in a healthcare setting to distribute vaccines, it would not be helpful in a situation requiring instant clarity, or in advertisements, where specific words and their connotations are chosen carefully and have significant cultural meaning.

MT Can: Be customized for highly accurate translation.

MT technology, on average, can achieve 80 percent accuracy compared to a professional human translator. In some cases, this level of accuracy is enough to convey what the communication requires. True hybrid machine translation (HMT) technologies, which integrate statistical and rule-based translation, can achieve close to 95 percent accuracy once they have been tailored for a specific industry or domain. In my experience, the accuracy of MT is the subject of the biggest misconception among translators, who may not have experienced the more sophisticated, customized technology.

MT Can’t: Reach 100 percent accuracy.

Even with the next generation of technology, MT is extremely unlikely ever to achieve 100 percent accuracy. There are many reasons for this, but perhaps the best explanation is that human communication is extremely complex. Even if I am in the same room, viewing the same presentation as my colleague, who is speaking my native tongue, I might only understand 95 percent of his intended meaning.


MT Can: Be used by translators to be more efficient.

I have heard many analogies, but I like to compare MT to a car. You can get to where you want to go without it, but by using the technology you are going to get there faster, more comfortably, or both. There are many kinds of vehicles, and it’s important to select the type based on need; however, the translator is still in the driver’s seat. In the past, only the largest language service providers could afford MT, but now many vendors have Software as a Service (SaaS) offerings, making the technology affordable for freelance translators.

MT Can’t: Replace human translators.

The amount of data produced each year is increasing exponentially, and the increased rate of globalization means more and more of it needs to be translated. MT is a tool that will make human translators more efficient, so they can focus their time on content that needs the creative, stylistic human touch.

4 comments:

Carolyn Y. said...

Wonderful list! I especially like that he admits this: "Can’t: Be used in situations when style, creativity or instant clarity is required." I could argue that this rule applies to all translations all the time, but that might be too extreme (gist translations of technical/scientific documents?). I'm happy the expert knows his tool's limits!

Protrans, Inc said...

Great list! It's definitely useful. Thank you

Sajan said...

Great list!

I especially like the last two (Can be used by translators to be more efficient and can't replace human translators).

We're finding great success with using MT as just another pre-processing step prior to human translation (or post editing). Just like leveraging TM is a technology to assist the translator, so is MT. It's a simple 3 step process: leverage TM, process through MT, human post edit.

Nice job!

JANA said...

Admittedly, machine translation is easy, fast and cheap. But it’s not always reliable. In the field of technical translation, many people tend to turn to machine translation. It does work in some cases, but the fluency and accuracy cannot be guaranteed. Hence, human translation still has its advantages.