Enterprise Innovators: Innovating in local languages for Africa

Denis Gikunda is the senior program manager for African localization for the internet search giant Google. Localization lies on the critical path for Google’s strategy for Africa: getting more users online by creating a vibrant internet ecosystem that is relevant, useful and, most of all, accessible to Africans. Gikunda manages the African languages program from Google’s offices in a high-rise in Nairobi, Kenya.

Thicke: Can you tell us what brought you to Google Africa?

Gikunda: After spending nearly eight years in Canada, studying at McGill University and working for Electronic Arts, I chose to move back to my home country, Kenya, to take up a localization role at Google. As a software engineering and entrepreneurship major in the early years of the new millennium, I had a natural inclination toward technology companies such as Google that had survived the dot-com bubble and were at the forefront of web innovation. Electronic Arts Montreal, where my career started, was where I was introduced to the world of localization that would eventually lead me to Google. After shipping a few successful titles, I was excited to hear that Google was making its forays into Sub-Saharan Africa, kicking off its office in Kenya and looking for someone to develop and oversee the localization of its consumer products. So, in many ways, it was the perfect fit: Google, Kenya and a big, hairy new challenge.

Thicke: As senior program manager in charge of localization for Africa, what is your mandate?

Gikunda: At Google I work alongside an extremely talented group of localization specialists whose mission, in line with our broader goal of organizing the world’s information, is to bring the magic of Google to users across the world and to bring the diversity of these users to Google. So I oversee the internationalization, launch and maintenance of Google’s products in priority and high-potential African locales. In Africa, as with other emerging markets, the local teams are all-hands-on-deck to bring more users online by developing an accessible, relevant and self-sustaining internet.

Thicke: You’ve said that you want to better engage users in Sub-Saharan Africa and increase local online content. How are you doing this? How do you plan to enable the next 100 million Africans to tap into the power of the internet?

Gikunda: Because there are so many barriers to internet access in Africa, we’ve decided to focus on the three top barriers: access, relevance and sustainability. These are real and long-term challenges that require a lot of effort and prioritization. In my short time working on the “relevance” barrier, we’ve had a great run in the past two years, launching 36 African language interfaces of Google Search and two on Google Translate, to help make the internet more relevant to African users.

We’ve seen many new internet users from Senegal to Botswana to Malawi to Somalia, to whom it makes a huge difference when internet products and services are made available in their own local languages. We’ve seen local translators translate over four million words using the Google Translator Toolkit. We’ve also begun working on localizing other popular web, mobile and social services such as Google+, Gmail, YouTube and Android for local language users. We sense this work will spark and enable regular users, app and website developers, and content producers across the continent.

Thicke: I’ve read that Google wants to make the internet a part of everyday life in Africa by eliminating entry barriers of price and language. First of all, regarding language — English is widely spoken. So why is it important to have local content in local languages?

Gikunda: First, despite the predominant use of global languages such as English, French and Portuguese, the majority of people in Sub-Saharan Africa still prefer to access and communicate important and relevant information in their local languages. Second, the numerous languages in Sub-Saharan Africa form a “long tail,” where much impact can be achieved in aggregate, relative to the head languages. African language interfaces, in aggregate, contribute a significant proportion of total web search traffic. Finally, the need for information crosses all barriers.

Thicke: How do you plan to eliminate the language barrier?

Gikunda: Not alone! With the help of machine translation (MT), skilled human translators, language enthusiasts, ongoing research, lobbying and increased awareness on the part of consumers, I believe the language barrier can be significantly challenged.

As I see it, the language barrier is an extremely nuanced one that prevents regular users from fully understanding and exploiting information. I’ve been greatly influenced by Don Osborn’s work on African languages in a digital age, which highlights several interrelated factors: political, linguistic, educational, technological, economic and socio-cultural. In my position at Google, I can play a small role in influencing the uptake of localized services, language standards and language technology.

Thicke: There are over 100 African languages with one million or more speakers. What are the main African languages you are concentrating on?

Gikunda: For our core web services, we are focused on widely spoken lingua francas in regions that we think have the highest potential for internet adoption — Swahili, Amharic, Afrikaans and isiZulu. For the community translation effort, we have launched over 36 languages, covering about 70% of the native language speakers in the top 100 languages.

Thicke: Is there a critical mass of translators available for all of these languages?

Gikunda: No. The pool of translators, even for large languages such as Swahili and Amharic, is very small by global standards. Translation in the region is not seen as a viable job opportunity, and language, translation and interpretation programs at the university level are diminished due to low demand from the private sector. Supply of translation services needs a boost from governments and the private sector. From this boost, language standards, language promotion bodies, increased translation leverage, lower costs, more demand and more jobs will arise.

Thicke: What are your challenges getting those languages online?

Gikunda: There are many, but the main ones are quality and cost. There are few skilled translators who are tech-savvy and also have marketing or editorial savvy. Translation costs for African languages remain four to ten times greater than those of European languages. There is also linguistic fragmentation and a lack of standards. Many taxonomical differences exist between closely related languages, but there are few orthographies that simplify writing. This is a result of missing or ineffective language standards bodies that promote the use and protection of the languages. Also, in many African countries, paying people at scale is difficult and, again, costly.

Thicke: In what ways are you addressing those challenges?

Gikunda: Building term databases and translation memories (TMs) and increasing our ability to leverage past translations help address the interrelated quality and cost issues. So does developing and encouraging the use of platforms such as the Google Translator Toolkit, which provide free TMs and collaborative tools. We’ve also invested in community translation, building capacity among language and computer science students.

Thicke: With respect to technology, MT could help get more information into local languages, but can you build engines when for many African languages, there is not enough written material?

Gikunda: Google’s statistical approach to MT has been driven by its desire to scale to as many languages, regions and users as possible. This approach requires huge amounts of online parallel text. That being said, vast amounts of parallel text live offline. Google can help bring this text online through its digitization efforts in partnership with organizations that own the rights to the content. In some cases, depending on the size and quality of the corpus, Google will also acquire a license to use the content.

Thicke: You are known for your community approach to translating online content. Is this a good solution? What have been your successes and lessons learned?

Gikunda: The community translation program has been quite a success. It scaled quickly and continues to generate meaningful results. By working with volunteer students, faculty and staff who are already passionate about language and the web, we increased not just local language content and services online, but also the capacity of each participant, several of whom have turned into professional translators.

We learned that when you put the user communities’ needs first, everyone wins. Google is able to iterate and improve its services through this user feedback loop. Users get a better experience. There is more information online, and the content production ecosystem is empowered and can begin to flourish. To make the community approach sustainable, you need to think carefully about the right incentives and provide adequate training, tools and an environment for ongoing collaboration.

Thicke: Do you think that Translators without Borders’ approach to building local translation capacity through training and mentoring could be a scalable solution to putting a library in the pockets of millions of Africans?

Gikunda: There is no doubt that capacity building is required. In addition to providing training, mentoring, the right incentives, open tools, technology and a collaborative environment for translators, Translators without Borders will also need to focus on the unique aspects of the African localization space: to build mutually beneficial partnerships with people on the ground and to work on the policy environment around language access.

The end goal should not be the information itself, but the utility of the information — economic empowerment, jobs, creation of more knowledge and promotion of culture. Setting the right goal will influence the means and tactics used to achieve the goal.

Thicke: Sub-Saharan Africa is the world’s most expensive place for internet connectivity. Is this a threat to expanding Google’s reach on the continent?

Gikunda: As mentioned earlier, access is one of the three barriers to internet adoption that Google is focusing on in emerging markets. The high cost of international bandwidth, the cost of delivering “last mile” connectivity, and the cost of devices and data plans all typically hinder access. Our existing partnerships with mobile network operators, internet service providers (ISPs) and internet exchange points (IXPs) aim to bring this cost down over the long term. For example, the installation of a global cache system at IXPs helps increase local peering of internet traffic, meaning ISPs save on international traffic routing costs and the user experiences fast load times. 

Another example is Google’s partnership with universities across Sub-Saharan Africa, where the cost of internet bandwidth is prohibitively high. Google will offer access to a fixed amount of internet bandwidth at no charge for up to three years. A lot more work needs to be done to knock down this barrier.

Thicke: There are already 84 million internet-enabled mobile phones in Africa, and 69% are predicted to have internet access by 2014. Do you think most people will access the internet via mobile phones, via mobile devices like tablets, or via computers?

Gikunda: Mobile, even today, cannot be ignored as an internet access medium. As the cost of data plans and devices comes down, and device manufacturers continue to innovate on devices tailored for the market, mobile internet usage will continue to grow quickly. It will become a question of what people will use each device for. What you can do with the internet on a PC will not be the same as what you can (or want to) do on a phone. My belief is that most users, as with other markets, will have multiple internet access points and will use them in different contexts and functions. Developers and publishers alike will need to have this in mind and ensure their content can be served up on different platforms, as appropriate.

Thicke: Internet users are expanding more rapidly in Africa than anywhere else on the planet. What kind of content do you think people will be looking for? What Google products generate the most interest?

Gikunda: For a while, news, music, entertainment and job-related information have been the most commonly sought content online. Demand for social and business-related information — listings, classifieds, local products and services, deals, media from social circles — has been on the rise in many countries in Sub-Saharan Africa. This is fueling the surge in platforms and local solutions developed here, such as Google Trader and Getting Kenyan Businesses Online. Increasingly in each of these verticals, there is also a trend toward locally developed content over international information. Google Insights for Search provides a normalized comparative view of search interest over time and geography.

Thicke: How important is translation to increasing access to information versus creating local content from scratch?

Gikunda: I don’t see it as an either/or scenario. Both translation and local content authoring are important tactics, suited to different types of content and different sub-objectives of providing access. Computer-aided translation is increasing the pace at which relevant high-quality information can be developed. It offers a reference point (source material) and provides fodder to improve machine learning systems, which in turn help improve computer-aided translation. Some forms of content require a different approach because the meaning of source content is far more culturally nuanced. Humor, marketing material and opinions, for example, are typically more effective when written from scratch rather than translated.

Thicke: We tend not to hear about local innovations because of the language barrier. How can you help bring global knowledge to the local level as well as bring local knowledge to the global level?

Gikunda: A friend recently shared with me a curated list of 300 cultural Meru sayings, via a social network. Meru is my native language, the one in which I communicate with my grandfather. Embarrassingly, I could only grasp the meaning of, say, five or six sayings in this list. The irony is not lost on me that instead of hearing these firsthand from my grandfather, I got wind of them on a social network and will be taking them back to him for a rich discussion.

As this example illustrates, knowledge discovery and distribution have changed dramatically since the advent of the web, and online translation tools will fuel these changes. The Social Translate extension on Chrome will automatically translate event streams and friends’ comments on social network sites. During the Arab Spring, for instance, this allowed many to follow Twitter posts written in Arabic and stay tuned to what was going on. On YouTube, automatic captioning in English and real-time translation to over 50 languages, including Swahili and Afrikaans, are already available. Though it’s not perfect, it is far better than nothing at all. Through our ongoing work on African language services, we hope translation will enhance the discovery and dissemination of local knowledge to the world.

Thicke: How does richer online content in indigenous languages translate into economic benefits for Africans?

Gikunda: First, through opportunities for employment. Indigenous language speakers, with capacity training, incentives and tools, become the center of a burgeoning industry. They become the translators, interpreters and teachers. Freelancers turn into small and medium-sized enterprises, which in turn hire more people. Marketplaces for these services also emerge, and a healthy ecosystem becomes an industry.

Second, through the export of culture. Language learning, for example, is a massive industry. About 300 million people in China are either learning or have learned English, a number roughly equal to the entire population of the United States. For many, this allows for travel, tourism and consumption of cultural exports. The market for those learning an African language will open up similar opportunities.

Third, access to local language information flattens the world. Richer online content in indigenous languages allows for the development of language services, such as MT, speech recognition and speech synthesis. These technologies, when combined, provide for a near-magical experience of speech-to-speech translation, which can enhance trade and relations among people who do not speak each other’s languages.