The shortage of medical resources has become one of the most pronounced problems during the COVID-19 pandemic. Under limited supply, patient-level risk assessment is a crucial first step in guiding resource allocation. As statisticians, we believe our expertise in survival analysis can play an important role.
What it does
We integrate approximately 15 publicly available data sources, with more than 10,000 samples in total, and perform survival analysis on patients from different regions of the world to produce risk scores such as survival probability and hazard rate, stratified by gender and age.
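For readers less familiar with the two risk scores named above, the standard textbook definitions are given here. The notation is an assumption introduced for illustration and does not come from the project: $T$ is a patient's time to the event (e.g. death), measured from a chosen origin such as the date of confirmation, and $t$ is a time point of interest.

$$
S(t) = P(T > t), \qquad
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = -\frac{d}{dt} \log S(t)
$$

The survival probability $S(t)$ is the chance that a patient has not experienced the event by time $t$; the hazard rate $h(t)$ is the instantaneous event rate among patients still at risk at time $t$. Stratifying by gender and age amounts to estimating these quantities separately within each group.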
How we built it
We use R to clean and de-duplicate the data and to perform survival analysis (e.g. Kaplan-Meier curves, Cox models), and we use R Shiny to build the website that presents our results.
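To make the Kaplan-Meier step concrete, here is a minimal sketch of the estimator with stratification by a grouping variable. It is written in plain Python for illustration only and is not the project's R code. The function names, the record fields (`sex`, `days`, `died`), and the toy records are all hypothetical and are not taken from the project's data.

```python
from collections import defaultdict


def kaplan_meier(durations, events):
    """Kaplan-Meier estimate as a list of (time, survival_probability).

    durations: follow-up time per patient (e.g. days since confirmation)
    events:    1 if death was observed at that time, 0 if censored
    """
    # Count deaths and total exits (deaths + censorings) at each time.
    deaths = defaultdict(int)
    exits = defaultdict(int)
    for t, e in zip(durations, events):
        exits[t] += 1
        if e:
            deaths[t] += 1

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths[t]
        if d > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone who died or was censored at t leaves the risk set.
        at_risk -= exits[t]
    return curve


def stratified_km(records, key):
    """Fit one Kaplan-Meier curve per value of records[i][key]."""
    groups = defaultdict(lambda: ([], []))
    for r in records:
        durations, events = groups[r[key]]
        durations.append(r["days"])  # hypothetical field name
        events.append(r["died"])     # hypothetical field name
    return {g: kaplan_meier(d, e) for g, (d, e) in groups.items()}


if __name__ == "__main__":
    # Made-up toy records, for illustration only (not real patient data).
    toy = [
        {"sex": "F", "days": 5, "died": 1},
        {"sex": "F", "days": 9, "died": 0},
        {"sex": "M", "days": 3, "died": 1},
        {"sex": "M", "days": 7, "died": 1},
    ]
    for group, curve in stratified_km(toy, "sex").items():
        print(group, curve)
```

Censored records (patients still alive at last follow-up) shrink the risk set without lowering the curve, which is what distinguishes this estimator from a naive death proportion. A Cox model would go further by relating the hazard to covariates such as age; that step is not sketched here.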
Challenges we ran into
Unlike aggregated data (e.g. number of confirmed cases, number of deaths), patient-level data is extremely hard to find; Substantial missing attributes; Analysis under limited sample size and biased sampling.
Accomplishments that we're proud of
We have analyzed 12 countries/regions and obtained relatively informative results.
What we learned
Building websites with R Shiny; Searching for publicly available data on social media (e.g. Twitter); Advanced survival analysis.
What's next for COVID19 Survival Analysis
Add more covariates to our model (e.g. the delay from symptom onset to confirmation, chronic diseases); Calibrate our analysis to address the issue of biased sampling; Collaborate with decision makers to work out a fair and efficient protocol for resource allocation.