Getting tested for COVID-19 can be challenging because testing kits are in limited supply and healthcare systems face an overwhelming patient load. Many individuals are only mildly symptomatic or asymptomatic, and they are often overlooked or deprioritized for testing.

What it does

FindTheCluster is a self-reporting symptom survey that calculates the probability of a COVID-19 infection based on the participant’s input. It works by training a model on the data collected from other participants' responses and on data from confirmed COVID-19 cases to estimate the likelihood that the reported symptoms are associated with COVID-19.
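
As a rough illustration of how such a likelihood could be computed, the sketch below uses a naive Bayes estimate. This is an assumption made for illustration, not necessarily the model FindTheCluster uses. The function names, the toy reports, and the `prior` value are all hypothetical, and the general survey population stands in for non-cases.

```python
from collections import Counter


def symptom_rates(reports, symptoms):
    """Fraction of reports that mention each symptom, with add-one smoothing."""
    counts = Counter(s for report in reports for s in set(report))
    n = len(reports)
    return {s: (counts[s] + 1) / (n + 2) for s in symptoms}


def infection_probability(reported, confirmed_rates, survey_rates, prior):
    """Naive Bayes posterior that a set of reported symptoms belongs to a case.

    Assumes symptoms are independent given the outcome, which is a
    simplification, and treats survey respondents as the comparison group.
    """
    p_case = prior
    p_other = 1.0 - prior
    for symptom, rate in confirmed_rates.items():
        if symptom in reported:
            p_case *= rate
            p_other *= survey_rates[symptom]
        else:
            p_case *= 1.0 - rate
            p_other *= 1.0 - survey_rates[symptom]
    return p_case / (p_case + p_other)


# Made-up toy reports, for illustration only.
SYMPTOMS = ["fever", "cough", "fatigue"]
confirmed = [{"fever", "cough"}, {"fever"}, {"cough", "fatigue"}]
survey = [{"cough"}, set(), {"fatigue"}, set()]

confirmed_rates = symptom_rates(confirmed, SYMPTOMS)
survey_rates = symptom_rates(survey, SYMPTOMS)

with_symptoms = infection_probability({"fever", "cough"}, confirmed_rates, survey_rates, prior=0.05)
without_symptoms = infection_probability(set(), confirmed_rates, survey_rates, prior=0.05)
```

With these toy inputs, a participant reporting fever and cough gets a higher estimate than one reporting no symptoms. The absolute numbers depend entirely on the assumed prior and data.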


  • Instant assessment - Immediately know the probability that you have COVID-19 based on self-reported symptoms
  • Geolocated symptoms - Plot all reported symptoms on a map and identify clusters. Have you ever wondered whether other occupants of the building you live in are also experiencing a persistent cough?
  • Privacy secured - No personally identifiable information will be stored.
  • Third-party integration - Integrate with existing health systems through our API. If you own a product, we encourage you to integrate (and anonymize) your data.
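
The geolocated clustering idea can be sketched with a simple grid: each report is snapped to a coarse cell, and cells with enough reports of a symptom are flagged. Grid binning is a stand-in chosen for brevity, not a description of the project's actual clustering method. The names `grid_cell` and `find_clusters`, the cell size, and the threshold are hypothetical. Only coarse cells and counts are kept, which fits the goal of not storing personally identifiable information.

```python
from collections import Counter
from math import floor


def grid_cell(lat, lon, cell_deg=0.01):
    """Snap a coordinate to a coarse grid cell (0.01 degrees is roughly 1 km of latitude)."""
    return (floor(lat / cell_deg), floor(lon / cell_deg))


def find_clusters(reports, symptom, min_reports=3, cell_deg=0.01):
    """Return {cell: count} for cells with at least min_reports reports of symptom.

    Each report is a (lat, lon, symptoms) tuple with no identity attached.
    """
    counts = Counter(
        grid_cell(lat, lon, cell_deg)
        for lat, lon, symptoms in reports
        if symptom in symptoms
    )
    return {cell: n for cell, n in counts.items() if n >= min_reports}


# Made-up coordinates, for illustration only.
reports = [
    (14.5991, 120.9841, {"cough"}),
    (14.5995, 120.9845, {"cough", "fever"}),
    (14.5998, 120.9848, {"cough"}),
    (10.3157, 123.8854, {"cough"}),
    (14.5993, 120.9843, {"fatigue"}),
]

cough_clusters = find_clusters(reports, "cough")
```

Here the three nearby cough reports fall in one cell and are flagged, while the distant report and the fatigue report are not.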

How we collect data

  • Survey - Self-reported symptoms
  • Scrapers - We gather data on confirmed COVID-19 cases from CSSEGIS, ECDC, GISAID, and Kaggle.
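
As one possible shape for a scraper's parsing step, the sketch below reads a CSV laid out like the JHU CSSE (CSSEGIS) confirmed-cases time series: four leading columns (`Province/State`, `Country/Region`, `Lat`, `Long`) followed by one cumulative-count column per date. The function name `latest_confirmed`, the output fields, and the sample rows are hypothetical. Downloading the file and handling rows with missing coordinates are left out.

```python
import csv
import io


def latest_confirmed(csv_text):
    """Return one record per row holding the most recent cumulative count."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    last_date = header[-1]  # date columns follow the four leading columns
    records = []
    for row in reader:
        if not row:
            continue
        records.append({
            "region": row[0] or row[1],  # fall back to the country name
            "country": row[1],
            "lat": float(row[2]),
            "lon": float(row[3]),
            "date": last_date,
            "confirmed": int(row[-1]),
        })
    return records


# Made-up sample rows, for illustration only.
SAMPLE = """Province/State,Country/Region,Lat,Long,3/27/20,3/28/20
,Exampleland,10.0,20.0,5,8
North,Otherland,-1.5,30.25,0,2
"""

records = latest_confirmed(SAMPLE)
```

Records in this shape carry a location and a count, so they can be combined with the geolocated survey reports.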


  • Provide data to the government so it can run more targeted testing on identified clusters.
  • Direct more information and resources to the identified clusters.
  • Raise awareness among occupants of the identified clusters.
  • Predict the probable locations of the next outbreaks.


We are currently organized into 3 teams, each with a corresponding team page:

Slack room:

We found that there are solutions similar to our survey app:

What we are calling for is to unify this data in a single location, standardize it, and apply machine learning to it. We will open the API and data to institutions that need them.


  • Lack of a dataset containing cases who reported symptoms and then tested negative. We mainly used unsupervised learning to estimate whether a case is likely to test positive, given its reported symptoms and its proximity to one of the identified clusters.
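
One way to combine symptoms and proximity without negative labels is a heuristic score: symptom overlap with a cluster, discounted by distance to it. The sketch below is only an illustration under that assumption. The scoring formula, the `scale_km` parameter, and all names are hypothetical and are not taken from the project's code.

```python
from math import asin, cos, exp, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def jaccard(a, b):
    """Overlap between two symptom sets, from 0 (disjoint) to 1 (identical)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster_score(case, clusters, scale_km=1.0):
    """Best symptom overlap across clusters, discounted by distance to each cluster."""
    best = 0.0
    for cluster in clusters:
        distance = haversine_km(case["lat"], case["lon"], cluster["lat"], cluster["lon"])
        score = jaccard(case["symptoms"], cluster["symptoms"]) * exp(-distance / scale_km)
        best = max(best, score)
    return best


# Made-up cluster and cases, for illustration only.
clusters = [{"lat": 14.5995, "lon": 120.9842, "symptoms": {"fever", "cough"}}]
near_case = {"lat": 14.6000, "lon": 120.9850, "symptoms": {"fever", "cough"}}
far_case = {"lat": 14.7000, "lon": 121.1000, "symptoms": {"fever", "cough"}}

near_score = cluster_score(near_case, clusters)
far_score = cluster_score(far_case, clusters)
```

A case with the same symptoms scores higher the closer it is to a cluster. The score is a ranking heuristic rather than a calibrated probability.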

Try it out



next, node.js, python, react, typescript
