By DAVID BELLOS
EVERYBODY has his own tale of terrible translation to tell — an incomprehensible restaurant menu in Croatia, a comically illiterate warning sign on a French beach. “Human-engineered” translation is just as inadequate in more important domains. In our courts and hospitals, in the military and security services, underpaid and overworked translators make muddles out of millions of vital interactions. Machine translation can certainly help in these cases. Its legendary bloopers are often no worse than the errors made by hard-pressed humans.
Machine translation has proved helpful in more urgent situations as well. When Haiti was devastated by an earthquake in January, aid teams poured into the shattered island, speaking dozens of languages, but not Haitian Creole. How could a trapped survivor with a cellphone get usable information to rescuers? If he had to wait for a Chinese, Turkish or English interpreter to turn up he might be dead before being understood. Carnegie Mellon University instantly released its Haitian Creole spoken and text data, and a network of volunteer developers produced a rough-and-ready machine translation system for Haitian Creole in little more than a long weekend. It didn’t produce prose of great beauty. But it worked.
The advantages and disadvantages of machine translation have been the subject of increasing debate among human translators lately because of the strides made in the last year by the newest major entrant in the field, Google Translate. But this debate actually began with the birth of machine translation itself.
The need for crude machine translation goes back to the start of the cold war. The United States decided it had to scan every scrap of Russian coming out of the Soviet Union, and there just weren’t enough translators to keep up (just as there aren’t enough now to translate all the languages that the United States wants to monitor). The cold war coincided with the invention of computers, and “cracking Russian” was one of the first tasks these machines were set.
The father of machine translation, Warren Weaver, chose to regard Russian as a “code” obscuring the real meaning of the text. His team and its successors here and in Europe proceeded in a commonsensical way: a natural language, they reckoned, is made of a lexicon (a set of words) and a grammar (a set of rules). If you could get the lexicons of two languages inside the machine (fairly easy) and also give it the whole set of rules by which humans construct meaningful combinations of words in the two languages (a more dubious proposition), then the machine would be able to translate from one “code” into another.
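To make the lexicon-plus-grammar idea concrete, here is a toy sketch in Python. It is not a reconstruction of any historical system: the four-word French-to-English lexicon, the part-of-speech tags and the single reordering rule are all invented for illustration, and the function name `translate` is hypothetical.

```python
# Toy lexicon (hypothetical): source word -> (target word, part of speech).
LEXICON = {
    "le": ("the", "DET"),
    "chat": ("cat", "NOUN"),
    "noir": ("black", "ADJ"),
    "dort": ("sleeps", "VERB"),
}


def translate(sentence):
    """Translate word by word, then apply one grammar rule (illustrative only)."""
    # Step 1: lexicon lookup. Unknown words pass through unchanged, tagged UNK.
    tagged = [LEXICON.get(word, (word, "UNK")) for word in sentence.lower().split()]

    # Step 2: a single grammar rule. French often puts the adjective after
    # the noun; English puts it before, so NOUN ADJ is swapped to ADJ NOUN.
    output = []
    i = 0
    while i < len(tagged):
        is_noun_adj = (
            i + 1 < len(tagged)
            and tagged[i][1] == "NOUN"
            and tagged[i + 1][1] == "ADJ"
        )
        if is_noun_adj:
            output.extend([tagged[i + 1][0], tagged[i][0]])
            i += 2
        else:
            output.append(tagged[i][0])
            i += 1
    return " ".join(output)


print(translate("le chat noir dort"))
```

Filling the dictionary is the "fairly easy" half. The rule in step 2 is the dubious half: it covers one word-order difference, and a real language pair would need a very large number of such rules, many with exceptions.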