Technology has made the world unbelievably interconnected. These days, we have the technological capability to communicate with anyone in the world, yet we are still limited by something as fundamental as language. More personally, my mom runs a pharmacy in LA where most of her patients are Latino, but she can't speak Spanish. The most difficult part of her day is when a patient can only speak Spanish. Sometimes other customers help her translate, but when the conversation is over the phone, this gets much more difficult. Another inspiration is the growth of telemedicine. With COVID-19, in-person doctor visits have become riskier, so Foner would let doctors treat patients across languages. Whether it's to connect people across the world or to make communication easier for people living in a diverse city such as LA, Foner tears down the language barrier.
What it does
Foner is a real-time conversational translator. It works on any device that can make phone calls. I could be having a full conversation in Chinese while the person I'm talking to has that same conversation in Spanish. Since Foner works over the phone, users don't need an internet connection or any extra devices, just a phone. A user calls the Twilio phone number, inputs the number they want to call and the language, and can then speak and listen in their native language.
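As an illustration of the setup step, here is a minimal sketch of how the keypad input could be validated. It assumes the caller types the destination number and then presses one digit to pick a language, and that Twilio delivers those key presses in the `Digits` webhook parameter. The menu layout, the US-only number handling, and the names `LANGUAGE_MENU`, `parseDestination`, and `parseLanguage` are hypothetical and not taken from Foner's code.

```javascript
// Hypothetical keypad menu: one digit selects a language.
// `code` is what a translation API would use; `locale` is what
// Twilio's <Say> and <Gather> would use for speech.
const LANGUAGE_MENU = new Map([
  ["1", { code: "en", locale: "en-US" }],
  ["2", { code: "es", locale: "es-MX" }],
  ["3", { code: "zh-CN", locale: "zh-CN" }],
]);

// Turn keyed-in digits into an E.164 number.
// Assumption: only US numbers (10 digits, or 11 digits starting with 1).
function parseDestination(digits) {
  const cleaned = String(digits).replace(/\D/g, "");
  if (cleaned.length === 10) return `+1${cleaned}`;
  if (cleaned.length === 11 && cleaned.startsWith("1")) return `+${cleaned}`;
  return null; // invalid input: the caller should be asked again
}

// Look up the language for a menu digit, or null if the digit is unknown.
function parseLanguage(digit) {
  return LANGUAGE_MENU.get(String(digit)) || null;
}
```

Returning `null` for bad input lets the webhook handler decide whether to repeat the prompt instead of placing a call to a malformed number.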
How I built it
We used Twilio to receive, create, and mediate phone calls. We used Google Cloud Platform for translation and deployment. And we used Node.js for application logic. The data flow for our application looks like this: 1) a caller calls a Twilio phone number and says something, 2) our server receives this speech as text, 3) our server sends this text to Google's Translate API, 4) our server receives the translated text, 5) our server sends this text back to Twilio, and 6) Twilio verbalizes this text to the callee.
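Steps 2 through 6 of that flow can be sketched roughly as follows. This is a sketch under stated assumptions, not Foner's actual code: the TwiML is built as a plain string rather than with Twilio's helper library, the translator is passed in as a function so that a real Google Translate client (or a stub) can be plugged in, and the trailing `<Gather>` that listens for a reply is an assumption about how the next turn would start. The names `escapeXml`, `buildSayTwiml`, `handleSpeechTurn`, and the shape of the `turn` object are hypothetical.

```javascript
// Escape text so it is safe inside a TwiML (XML) document.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Steps 5-6: TwiML that speaks the translated text to the listener,
// then (assumption) gathers the listener's spoken reply.
function buildSayTwiml(translatedText, listenerLocale, replyUrl) {
  const locale = escapeXml(listenerLocale);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    `  <Say language="${locale}">${escapeXml(translatedText)}</Say>`,
    `  <Gather input="speech" language="${locale}" action="${escapeXml(replyUrl)}" method="POST"/>`,
    "</Response>",
  ].join("\n");
}

// Steps 2-5: take the recognized speech, translate it, return TwiML.
// `translate(text, from, to)` is an injected async function (hypothetical).
async function handleSpeechTurn(speechResult, turn, translate) {
  const translated = await translate(speechResult, turn.sourceLang, turn.targetLang);
  return buildSayTwiml(translated, turn.targetLocale, turn.replyUrl);
}

// Example with a stub translator standing in for the real API.
const stubTranslate = async (text) => text.toUpperCase();
const exampleTurn = {
  sourceLang: "en",
  targetLang: "es",
  targetLocale: "es-MX",
  replyUrl: "/reply",
};
handleSpeechTurn("Hello & welcome", exampleTurn, stubTranslate).then(console.log);
```

In a real webhook, `speechResult` would come from the `SpeechResult` parameter that Twilio posts after a speech `<Gather>`, and the returned string would be sent back with an XML content type.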
Challenges I ran into
The primary challenge was building the real-time conversational translation. We first built a proof of concept using text messages, which worked well very quickly. Porting this text messaging service to phone calls proved to be quite difficult, as the real-time nature of phone calls made it hard to manage when users speak and when they listen to the other user. We built Foner to mediate phone calls so that conversations run smoothly.
Another challenge was using and understanding the technologies that we used. This was our first time using Twilio Voice and the Google Translate API. Google Translate and Twilio SMS were very easy to understand and use once we went through the setup and documentation. Twilio Voice and in-depth use of TwiML were a challenge for us. Understanding the lifecycle of a Twilio call and how we needed to set up our endpoints to handle dynamic calls was confusing, but we mastered it in the end. The extensive Twilio documentation as well as some reverse engineering allowed us to accomplish this.
Accomplishments that I'm proud of
I'm proud of building a phone service like this over just a weekend. Foner can genuinely make a big impact on people's lives by making communication possible between languages. I'm proud that we went from ideation and conceptualization to an actual polished product that met and went beyond our expectations. I'm also proud that we were able to do so much with technologies we were not familiar with and learn so much along the way.
What I learned
I (Alex) learned the intricacies of the Twilio Voice service and am now a self-proclaimed master of TwiML. Specifically, I learned how to create, receive, and mediate dynamic phone calls using Twilio Voice. Additionally, I learned how to use the Google Translate API to convert between any pair of languages. Trevor learned a lot about Twilio and GCP, since it was his first time using both of these services, and became much more familiar with Node.js. This was Greg's first hackathon, so he learned a ton, including Twilio, GCP, Node.js, and simply how to develop together as a team.
What's next for Foner
We plan to further refine the current feature set of Foner to a point where it can be readily usable by the public. There's also potential to build out a distributed phone system. Since a specific phone number is busy while it's on a call, in order to scale Foner, we could set up something like a compute cluster, but for phones. This would include a load balancer phone number that would redirect the user to any currently available number. This would allow resources to be shared by all of our users rather than tying an entire phone number to a single person. There's so much more that can be done, and we're excited for what's in store in the future.
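To make the load balancer idea concrete, here is a minimal sketch of the bookkeeping it would need: a pool that hands out a free number per call and takes it back when the call ends. Nothing like this exists in Foner yet. The class name `NumberPool`, its methods, and the placeholder numbers are hypothetical, and a real version would also have to redirect the call and handle dropped calls.

```javascript
// Tracks which phone numbers are free and which are serving a call.
class NumberPool {
  constructor(numbers) {
    this.free = [...numbers]; // numbers not currently on a call
    this.busy = new Map(); // callId -> number serving that call
  }

  // Reserve a number for a call; returns null when every number is busy.
  acquire(callId) {
    if (this.busy.has(callId)) return this.busy.get(callId);
    const number = this.free.shift();
    if (number === undefined) return null;
    this.busy.set(callId, number);
    return number;
  }

  // Return a call's number to the pool; false if the call was unknown.
  release(callId) {
    const number = this.busy.get(callId);
    if (number === undefined) return false;
    this.busy.delete(callId);
    this.free.push(number);
    return true;
  }
}

// Placeholder numbers, for illustration only.
const pool = new NumberPool(["+15550000001", "+15550000002"]);
const first = pool.acquire("call-a");
const second = pool.acquire("call-b");
const third = pool.acquire("call-c"); // null: both numbers are busy
pool.release("call-a");
const retry = pool.acquire("call-c"); // reuses the number call-a released
```

Keeping the pool keyed by call ID means a repeated request for the same call gets the same number back instead of consuming a second one.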
Built with
express.js, google-cloud, node.js, twilio