Google admits its translation engines are not perfect and not yet ready for “sensitive debates”


Google vice-president Vint Cerf said the use of statistical translation methods, where translations are made on the basis of probabilities and don’t rely on parsing, had vastly improved online translation.

But he warned about their reliability and said there were problems with interpreting the meaning of the same phrase in British and American English, let alone phrases in different languages.

Translation and Litigation


When in litigation, knowing the exact contents of e-mails, faxes, letters, and other documents is crucial, but tough when they're in a language other than your own.
Take David Kessler. The Drinker Biddle & Reath partner remembers working on one matter involving a multinational company. Although most of the discovery was in English, the legal team found that a few of the company's employees e-mailed each other in an Eastern European language. Thinking that it was odd, they decided to use machine translation to get a sense of what the messages said. They turned out to be linchpins in the case -- and Kessler learned something important about translation technology.
"Machine translations are not very good at idioms, not very good in context, but they can be useful in terms of getting a sense of the document to let you decide if you want to spend more money," Kessler says.
In an increasingly global economy, a single matter can involve a variety of languages. Unfortunately, translating the documents can be costly. Many corporations have found that translation technology and e-discovery tools supporting multiple languages are important tools in constraining budgets -- and winning cases. But there are also drawbacks. Used incorrectly, the software can fail to save time, increase some translation costs, and even overlook documents in an e-discovery keyword search.

Translating the Internet


Since its earliest days, the Internet filled us with the hope of uniting all of humanity. With information traveling at the speed of light, we thought, geographic location wouldn’t matter and anyone who shared our interests would be within reach.
But there’s an age-old problem working against our utopian dreams of the web uniting the world: the language barrier. After all, it doesn’t matter what you have access to if you can’t read it.
In the first couple of decades of the Internet, we had a simple, if unsustainable, solution. Most people used English, even if it wasn’t their native language.
Ethan Zuckerman, the founder of the multi-lingual blog network Global Voices, observed this phenomenon as recently as 2004. He was at dinner with a couple dozen bloggers in Amman, Jordan, who were chatting away in Arabic.
“But almost all of them were blogging in English at that point,” Zuckerman explains. “Out of that group of people that I had dinner with, a lot of those people blog in Arabic now. And I’ve gone back and talked to some of them… and one said to me, ‘When we were trying this in 2004 there were very few Arabic speakers online, and we just couldn’t write for that audience. But now our friends, our peers, our neighbors are all online. That’s who we want to reach.’”
The numbers support this anecdote. According to Internet World Stats, the number of Arabic-speaking Internet users has increased by more than 2,000 percent over the past decade. Chinese will soon replace English as the most-used language on the web. And dozens of other languages are experiencing huge growth. On the one hand this is great: the more people who come online, the better. But as they join the web using different languages, how do we stop the Internet from fracturing along language lines?
Many think a big part of the solution will be machine translation. Translation software has been around for decades with a mediocre track record, but Google’s translation service, Google Translate, is producing impressive results and improving quickly.
“What we do is use hundreds of billions of words that Google infrastructure has access to,” says Michael Galvez, Project Manager at Google Translate. Google’s computers scour the web, suck in all that text, analyze it and learn how people actually write. Google combines that information with high-quality translation transcripts to make a pretty amazing machine translator. Check out this article from a Spanish newspaper translated into English. Not bad, eh?
But some language combinations work much better than others, and even when the translation’s good, it’s never perfect.
“Google Translate is good at helping you get what is called gisting, or essentially the essence of what the other person is communicating,” says Google’s Michael Galvez.
I’m skeptical that “gisting” will be enough. Much of what we read on the web is written beautifully or full of nuance, and software will never be able to translate that. So some translation projects, like a new website called Meedan.net, are still using good ol’ humans.
“The idea is a Wikipedia-style approach to translation,” says Meedan founder Ed Bice. Meedan uses a mix of human and machine translation to present articles, blog posts, and comments about the Middle East in hopes of bridging the gap between the Arabic and English-speaking worlds.
The comments following an article like this one show how the presentation of the translated text will also be an important issue to tackle. Google Translate essentially wipes out the foreign language, showing you web pages only in your language. Meedan instead has the English and Arabic side-by-side. This layout is a valuable addition to the translations themselves when it allows you to see comments bouncing back and forth between languages.
Internet thinkers say both machine translation and human translation projects will continue to improve rapidly over the next decade. Few are eager to predict when, if ever, a Star Trek-style universal translator will emerge. But as more and more of the web moves away from English, I have a feeling we’ll be using more and more of these services. After all, 73 percent of the Internet right now is not in English.