Carnegie Mellon University (CMU) researcher Jeff Allen and his Haitian colleagues packed into a seminar room crowded with students from the Universite des Caraibes in March 1998. The goal, they said to those assembled, was to produce recordings of Haitian Creole that would serve as the basis for a speech-to-speech machine translation (MT) system. They explained the reason, the need, and the ways of compiling text and speech data. One week later, the core system was built, and the team was creating working demos.
“You can imagine the incredible excitement in their eyes and the emotion in their voices. I told them that no one in the world could ever again say that Haitian Creole was a substandard, broken patois,” Allen recalls. “Because you can’t create a speech-to-speech MT system to convert a ‘non-language’ to a language. You can only match two entities at the same level of the hierarchy: a real language to a real language.”
Despite a working prototype, however, within three years interest in funding research and products in Haitian Creole ground to a halt. The twin towers of the World Trade Center lay in rubble, bearing the brunt of the terrorist attacks of September 11, 2001. And as the world reeled from the shock, analysts and leaders of the West called for greater attention to the Arabic-speaking world. Interest in Arabic reached new heights, and the data of Allen’s project was shelved. “So many people said back then that there were no business cases and need for language processing systems and software for Creole languages,” says Allen.
Flash forward to January 12, 2010. A 7.0 magnitude earthquake hit Léogâne, Haiti, just 16 miles outside of Port-au-Prince, the nation’s capital. Another 52 aftershocks followed, and survivors fled into the countryside or settled in makeshift camps along the city’s main roads. The government, itself in shambles, reported over half a million people dead and injured. In addition, some 30,000 commercial and governmental buildings – including the Presidential Palace – were severely damaged or destroyed.
International humanitarian and medical aid organizations mobilized to join domestic emergency personnel in the delivery of critical services. Among them was Translators Without Borders (TWB), a Paris-based nonprofit that provides translation support to groups that include Doctors Without Borders, Médecins du Monde, Ashoka and Handicap International. As the world scrambled to take stock of the catastrophe, TWB volunteers translated some of the first news items from the field.
“In the early days, reports were sparse, the teams on the ground in Port-au-Prince and other towns like Jacmel, Léogâne and Petit-Goave being far too preoccupied with saving lives. The best they could do was make hurried reports of the day’s statistics: number of amputations, number still untreated, supplies not yet arrived,” reports TWB cofounder Lori Thicke.
Compelled by on-the-ground reports, in a flurry of activity generated almost ex nihilo, localization industry leaders took on different aspects of the need for translation and interpretation in Haiti. Among those to act was Douglas Green, vice president of business development at Translation Source, who created Interpreters and Translators for Haiti (IT4H) on both Twitter and Facebook. “The response to the social media effort has been amazing,” says Green. “Our goal with social media was to help disseminate information from organizations needing assistance, and literally hundreds of translators and interpreters from across the globe have offered their services.”
[Photo: A premed student from Idaho listens to a Haitian patient via an interpreter from California.]
Translators and interpreters started volunteering in record numbers. In ten days, over 1,000 volunteers had come forward, many via social media. According to Thicke, this was more than twice the number of translators recruited by TWB in over a decade. “Professional translators wrote us from all over the world – Canada, the United States, France, Algeria, Spain, the Dominican Republic – speaking English, Arabic, Haitian Creole and many more languages,” says Thicke.
The crisis generated unprecedented coordination among industry players, as some organizations became centralized resource hubs for those organizations on the frontline of providing disaster relief, and others became aggregators and qualifiers of linguistic resources. The International Medical Interpreters Association (IMIA) and TWB began to serve this role for interpreters and translators respectively. Language service providers (LSPs), on the other hand, came forward with different pro bono offers. Lingotek offered free unlimited usage of their collaborative translation software environment for Haiti work, while companies such as Caps, Language Line Services, One Hour Translation, and Pacific Interpreters offered direct translation and interpretation services to specific affected groups and aid personnel. So many responded to the social media calls of IT4H and other groups that the challenge quickly escalated beyond TWB’s capacity. To solve this problem, ProZ.com worked over one long weekend to program a screening platform for TWB. To do the evaluations, companies such as MediLingua, SDL, Rubric and Argos donated reviewers. Within days, 200 translators were tested and reviewed within the platform.
Data in a crisis
In spite of these contributions, language – specifically the challenge of a nation split between the official and documented French and the largely under-documented but commonly used Haitian Creole – still proved to be a challenge for aid workers. They needed medical forms to collect data and help injured patients, as well as training instructions for Haitian volunteers and signs to direct survivors to aid. There was also demand for tools to bridge the language divide between multinational aid teams and those they serve. Though French is used for educational materials and in the medical fields, the majority of communication is in Creole, for which language resources are lacking.
This disparity added another layer to the horror for Allen. “I sat there watching the TV and was nearly in tears,” says Allen. “Ten years of my life of studies, of professional work and of free time had been spent working with people of several different French-based Creoles – all of that in vain if thousands of people would suffer and die today because an MT system is sitting in a box. Something had to be done.”
The world’s response to Haiti may have demonstrated that language is one of the last barriers between different nationalities, but MT may be able to help even with this. As Sebastian Stüker of the Karlsruhe Institute of Technology notes, “being able to address all languages in the world with natural language processing systems can make valuable contributions to overcoming the language barrier in a crisis situation. Continuing research in that direction is therefore not just of academic interest but holds great practical value and can eventually save the lives of many.”
Though many linguists and minority language advocates have worked to document marginalized languages, distributing that documentation and information is also crucial, and if this can happen before the need is dire, all the better. “So much of the world’s knowledge about these languages is locked up in universities, pay-for-access journals and complicated licensing and co-licensing schemes. It can’t be used by users, nor by developers. It shouldn’t take a disaster to guilt-trip us into sharing,” says Francis M. Tyers, minority language advocate.
As it turned out, Allen was to help in continuing the Haitian Creole project begun over a decade before. As the initiatives by LSPs became more visible and the call for support voiced by frontline groups grew, the larger movers of MT stepped forward. Based on his past research, Allen and others of the Language Technologies Institute of CMU’s School of Computer Science released the Haitian Creole data to the public. It included critical triage and treatment phrases translated free of charge into Haitian Creole by Eriksen Translations in New York. Two prototype MT systems by Microsoft and Google were put up online within a matter of days based on this pool of data, and more systems are currently being developed in parallel.
The visibility of this crisis has applied pressure on the major MT movers to provide an answer. Both Google Translate and Bing Translator now offer English <> Haitian Creole. What remains, however, is the need for the data. In the case of Haitian Creole, the data was present to be drawn upon, although organizing it was a major challenge and much was lost in the earthquake. The data may still be incomplete, but it is a great leap ahead of what currently exists for many minority languages.
Humanitarian aid in the industry
Prior to the crisis, in September 2009, industry leaders joined with freelance translators and non-governmental organization (NGO) representatives for the Action Week for Global Information Sharing (AGIS). The conference, held in Limerick, Ireland, was organized by TWB and the Rosetta Foundation. The reason: the demand for translation service support among international humanitarian and aid groups far outpaces supply. However, as was discussed at AGIS, given the right platform, many individuals could likely volunteer an hour here and there to translate or program or edit and thus help solve this problem, without needing to mobilize to a different location or compromise the time devoted to their day jobs.
Currently, with little to no outside fund-raising, the industry has managed to produce impressive relief aid, encompassing public relations, finding and vetting volunteers, translating, project management and tools solutions that may have long-term value to humanitarian groups. All told, industry efforts for the Haiti crisis alone could spiral into the millions, dollar-value-wise, given just how much needs to be translated and interpreted.
Haiti, of course, isn’t the only population in need. Recently, translations have crossed TWB’s desks for Yemen, Somalia, Gaza and the Congo. “Our hope is that the translators who have come forward because they were moved by Haiti’s plight will stay with us once the world’s attention has moved on, and will share in the work of NGOs, wherever in the world the aid is being delivered, and that they will continue to give their help one word at a time,” says Thicke.
Two of the questions up for debate in all of this were whether to centralize the industry’s efforts and, if so, how. One idea put forth at AGIS was that it might be beneficial to provide NGOs a centralized place to find volunteer interpretation or translation, particularly in a time of disaster when time is of the essence. With this in mind, the Rosetta Foundation proposed in September to work on such a platform.
There is, however, the question of scale. TWB currently processes about one million-plus words per year for a core of around 12 NGOs. “With our 12, we are basically covering one arrondissement of Paris, the 11th, so we haven’t even got the whole of one city covered,” says Thicke. Multiply this need exponentially, and it is easy to guess why it might be a challenge to put the world’s humanitarian translations under one umbrella.
The need is there, though, and these translations help in many ways. First of all, they free up funds that the NGOs can potentially devote to work in the field. TWB came into being in 1993 when namesake Doctors Without Borders sent a translation request to Thicke’s company. The company offered to do the translation free of charge, if the NGO would then put the saved money to good use. Since then, TWB has provided countless free translations, and has over the years saved the NGOs they work with more than US$2 million. Quite often there simply isn’t a budget for the translations, so if it weren’t for the volunteers, crucial documents would never see the light of day – documents used for training volunteers, raising funds and making the world aware when disaster strikes.
Though it goes against the urge for speed, when turnaround is critical and translations may not be edited, vetting volunteer translators is all the more essential. This is one reason why industry organizations such as TWB exist in the first place. As Thicke points out, “there are an awful lot of good-hearted people who are not experienced translators, and they can give the NGOs a whole lot of extra work by handing in substandard translations that most NGOs are ill-equipped to correct. That’s why we work so hard to vet our volunteers, to make sure the NGOs can put their faith in them. When a translation is medical, for example, I can’t over-emphasize how important it is to be 100% accurate: lives could depend on it.”
The lesson to take away from all this: the outpouring of interest and activism among both translators and LSPs has shown that what the AGIS conference discussed is feasible. “Before the crisis, we were providing humanitarian associations with around one million words per year, free of charge. This year we hope to double that figure,” says Thicke.