We are an EPFL-based social start-up with a "shovel-ready" project: an Emergency Linguistics app for Coronavirus information distribution and emergency response, in hundreds of unserved languages spoken by billions of people around the globe. We need people on the data side (Mongo and Neo4j), the web side (Angular and Node.js), and the mobile side (Ionic).
The project will build the language tools needed to get that information to people who cannot read it today. The best news is that we are already set up in the iTunes and Play stores, so we can go live and go big by next week if we are successful. There are also plenty of follow-ons for students looking for semester projects!
Emergency information must be in people's own languages in order (1) for individuals to understand essential facts and actions, and (2) for communications with medical professionals and public health agencies.
For years, we have been submitting proposals to implement a multilingual emergency preparedness system. For years, donors have rejected our proposals as esoteric, low priority, and outside their scope. Now Coronavirus is about to ravage Africa, South and Central Asia, and Latin America after wreaking havoc in countries with much better communications infrastructure in East Asia, Europe and North America. People are receiving misinformation in their own languages on social media, while good information is relegated to "official" languages most people do not understand. For example, one West African country rushed to issue a COVID-19 app in French, known by about 7% of the population, but made no follow-up to deliver the same information in any of the major languages spoken by its less-educated citizens.
Kamusi is uniquely poised to produce data and provide crucial tools for hundreds of languages. We can do this almost immediately, as soon as we can complete certain backend programming tasks. We already have in place:
- A system that links together 1.5 million words among 44 languages at a semantic (meaning-based) level. This is fully functional on the Web (kamusi.org) and on free mobile apps for iOS (kamu.si/ios-here) and Android (kamu.si/android-here). Between 1 and 2 million more core vocabulary words across another 25 languages are queued for inclusion, pending the manpower.
- A system to gather accurate human translations for any of the world's 7000+ languages. Activating advanced input for a given language requires configuring various linguistic parameters in consultation with specialists.
- A network of experts, translators, and citizen linguists standing by for hundreds of languages. We have special reach in Africa, where we have ties to the official language bodies of all 55 member nations of the African Union.
Reliable emergency translation is categorically not a task that can be accomplished by Google Translate or other machine translation services. The reasons why are documented in detail at http://teachyoubackwards.com. As just one example from the current crisis, Google translates the French "sous cloche" into English as "under bell" rather than the actual concept of "lockdown", an error that is then transmitted to all of its other languages. We are not in a situation where we can mess around with the guesswork at the heart of automatic translation. Please discard any notion that effective language tools are now or will become available in any useful way from any source other than Kamusi, for any of the roughly 350 languages spoken by more than a million people each.
We seek to establish a repository for the Corona crisis that contains thousands of words, phrases, and complete texts. Terms will relate to general facts people need to know about the virus and the pandemic, actions they need to take on a personal or group basis, and information that needs to flow between patients and medical staff (symptoms, complications, treatments, etc). We will derive term lists through analysis of relevant documents from the WHO, CDC, and other primary sources. We will also prepare the full texts of important informational bulletins for parallel translation.
These items will be translated by humans into as many languages as we can muster the resources for. The principles governing how we enforce consistent meanings across languages are demonstrated at http://kamu.si/head2head. For example, many languages have a specific word for "to wash hands" that differs from other kinds of washing (clothing, dishes, bathing), such as "kunawa" in Swahili. We are able to capture and transmit the phraseology needed to convey the exact meanings necessary for communicating health messages during this crisis.
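For coders new to the idea, meaning-based linking can be sketched as follows. This is a minimal illustration, not Kamusi's actual schema: the `Term` shape, the concept identifiers, the `translateByConcept` function, and every sample entry other than "kunawa" are assumptions made for the example. The point is that terms are joined through a shared concept rather than through word-to-word pairs, so a general verb for washing never stands in for the hand-washing sense.

```typescript
// Hypothetical data model for illustration only.
interface Term {
  conceptId: string; // language-neutral identifier for one exact meaning
  lang: string;      // language code, e.g. "sw" for Swahili
  text: string;      // the term as written in that language
}

// Sample entries (assumed for the example, apart from "kunawa").
const sampleTerms: Term[] = [
  { conceptId: "wash-hands", lang: "en", text: "wash hands" },
  { conceptId: "wash-hands", lang: "sw", text: "kunawa" },
  { conceptId: "wash-clothes", lang: "en", text: "wash clothes" },
  { conceptId: "wash-clothes", lang: "sw", text: "kufua" },
];

// Look up the concept(s) behind a source term, then return the terms
// that express those same concepts in the target language.
function translateByConcept(
  terms: Term[],
  text: string,
  fromLang: string,
  toLang: string
): string[] {
  const conceptIds = terms
    .filter((t) => t.lang === fromLang && t.text === text)
    .map((t) => t.conceptId);
  return terms
    .filter((t) => t.lang === toLang && conceptIds.indexOf(t.conceptId) !== -1)
    .map((t) => t.text);
}
```

Calling `translateByConcept(sampleTerms, "wash hands", "en", "sw")` returns only the hand-washing term, because the lookup passes through the concept and not through the English word "wash".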
The tools for collecting data from our language partners have some bugs and missing features, including security features, that require work to make them bulletproof. In essence, we need to do some quick coding to transform our system from academic functionality to unbreakable industrial quality. Our tools include games that can be played by language communities to arrive at validated consensus translations, and input systems for confirmed experts to provide data directly.
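One way to picture a consensus check is sketched below. It is an assumption about how such a rule could work, not a description of the existing games: the `Vote` shape, the `consensusTranslation` function, and both thresholds are hypothetical. A candidate is accepted only when enough distinct players have voted and a large enough share of them agree.

```typescript
// Hypothetical vote record for illustration only.
interface Vote {
  voter: string;     // player identifier
  candidate: string; // the translation this player supports
}

// Return the winning candidate, or null if there is no consensus yet.
// minVoters and minShare are assumed tuning parameters.
function consensusTranslation(
  votes: Vote[],
  minVoters: number,
  minShare: number
): string | null {
  // Keep one vote per player; a later vote replaces an earlier one.
  const latest: { [voter: string]: string } = {};
  for (const v of votes) {
    latest[v.voter] = v.candidate;
  }

  // Tally the deduplicated votes per candidate.
  const counts: { [candidate: string]: number } = {};
  let total = 0;
  for (const voter of Object.keys(latest)) {
    const candidate = latest[voter];
    counts[candidate] = (counts[candidate] || 0) + 1;
    total += 1;
  }
  if (total < minVoters) {
    return null; // not enough distinct players yet
  }

  // Find the most supported candidate and check its share of the votes.
  let best: string | null = null;
  let bestCount = 0;
  for (const candidate of Object.keys(counts)) {
    if (counts[candidate] > bestCount) {
      best = candidate;
      bestCount = counts[candidate];
    }
  }
  return bestCount / total >= minShare ? best : null;
}
```

Deduplicating by player before counting is the design choice that matters here: it keeps a single enthusiastic participant from pushing a translation through alone, which is part of what hardening the tools would involve.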
Kamusi has cultivated a network of language specialists and enthusiasts around the world, for hundreds of languages. We can deploy this network to gather emergency translations in a matter of days. For example, we were able to complete one recent project within a week that required native speakers in Kinyarwanda (Rwanda), Odia (India), Tatar (Russia), Turkmen (Turkmenistan), and Uyghur (China and refugees). We have many collaborators who are native speakers of small languages, such as the Kinyindu language of central Congo (DRC) with 10,000 speakers in an area previously struck by Ebola, and Chatino (Mexico) with 45,000 speakers, who are eager to do their part to help protect their communities.
Within Africa, Kamusi is closely tied to ACALAN, the African Academy of Languages, which is the coordinating organ for language policy across the continent. Through ACALAN, we can mobilize official language boards for dozens of national and cross-border languages. Additionally, our network includes people who speak many languages that do not have official services. For example, we have WhatsApp user groups for two languages of Benin with university students and young professionals who are waiting impatiently for the framework to aid less-educated speakers of their mother tongues. We can do similar work for hundreds of languages spoken by people who cannot understand the elite languages in which health information is now distributed. Each passing day is a lost opportunity to reduce the transmission of COVID-19 and point the afflicted to the services they need.
•First weeks (consecutive actions)
••Complete backend programming tasks - starting with #CodeVsCOVID19 hackathon
••Identify Corona crisis vocabulary items and longer translation messages
••Activate participants for at least 200 languages
•••Work with participants for language-specific configuration (e.g. setting up fields for gender, when relevant)
•••These gateway languages must be given special professional attention: Arabic, Chinese, French, Hindi, Portuguese, Russian, Spanish, Swahili
••Work with participants to complete the data for their languages
••Localize user interface for mobile tool to each language
••Import aligned general vocabularies for the 25 immediately-available languages - this can also be done during the Hackathon
••Disseminate the mobile tool to the public through traditional and social media campaigns
Until now, we have been shouting into the wind about the need for emergency linguistic preparedness, and our capacity to ready languages around the world for crisis communications. Funder disinterest meant that we could not be ready before the coronavirus pandemic struck. Given that the crisis is likely to continue indefinitely, though, it is hopefully not too late to develop the necessary tools to get people the information they need in the languages they speak. Moreover, this tool will provide critical support as societies struggle through the aftermath of the virus’s destruction, as well as preparation for future inevitable emergencies such as earthquakes and hurricanes, where timely and precise information in victims’ languages is of paramount importance. In the face of an emergency that will soon rip through communities worldwide regardless of whether they speak a privileged language, we seek coders for the immediate development and distribution of the tools to deliver useful and accurate crisis health communications to millions of people who today have no access to the knowledge that can literally save their lives.
We welcome Hackathon participants to join us!
angular.js, ionic, mean, mongodb, neo4j, node.js, wordnet