COVID-19 has undeniably overtaken the world in the last year, but infectious disease specialists knew this would be the case. Back in the early months of 2020, a team of data scientists and disease specialists began compiling a list of every case worldwide.

Last year, that list was a regularly updated Google Sheet of approximately 80,000 active COVID-19 cases from around the world. As of February, that data has been transformed into a fully functional, free online database that consolidates information on global disease outbreaks under the name ‘’.

The database is a condensed list of all coronavirus cases on record since the initial outbreak in January 2020. ’s team hopes that, with the information completely open to the public, anyone can use the data to develop new disease response information and initiatives. The website features an interactive map and a data dictionary, as well as a comprehensive list of cases, their outcomes and anonymized patient data.

Sam Scarpino, an assistant professor at Northeastern specializing in marine and environmental science, is one of the co-founders of this data initiative. Scarpino, whose doctorate involved infectious disease modeling and public health decision-making, also helped plan Northeastern’s reopening last fall.

“At the end of last January, there was a group of researchers who were just manually entering COVID-19 case records as they got reported. So, there would be a news alert that somebody had tested positive in Japan, and we would capture that information on a Google spreadsheet. By about this time last year, we were running up against the limit of the size of a Google spreadsheet, which, in our case, is about 80,000 [datapoints],” Scarpino said.

