Currently submitted to: JMIRx | Med

Date Submitted: May 23, 2020
Open Peer Review Period: May 23, 2020 - May 8, 2021
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Why we are losing the war against COVID-19 on the data front and how to reverse the situation

  • David Prieto-Merino
  • Rui Bebiano Da Providencia E Costa
  • Jorge Bacallao Gallestey
  • Reecha Sofat
  • Sheng-Chia Chung
  • Henry Potts


With over five million COVID-19 positive cases declared, more than 30,000 deaths and more than two million patients recovered, we would expect that the highly digitalised health systems of high-income countries would have collected, processed and analysed large quantities of clinical data from COVID-19 patients. Those analyses should have served to answer important clinical questions such as: What are the risk factors for becoming infected? What are good clinical variables to predict prognosis? What kind of patients are more likely to survive mechanical ventilation? Are there clinical sub-phenotypes of the disease? All these, and many more, are crucial questions for improving our clinical strategies against the epidemic and saving as many lives as possible until we find a vaccine and effective treatments. One might assume that in the era of Big Data and Machine Learning there would be an army of scientists crunching petabytes of clinical data to answer these questions. However, nothing could be further from the truth. Our health systems have proven completely unprepared to generate, in a timely manner, a flow of clinical data that could feed these analyses. Despite gigabytes of data being generated every day, the vast majority is locked in secure hospital data servers and is not being made available for analysis. Routinely collected clinical data is, by and large, regarded as a tool to inform about individual patients, and not as a key resource for answering clinical questions through statistical analysis. The initiatives to extract COVID-19 clinical data are often promoted by private groups of individuals and not by the health systems. They are uncoordinated and inefficient. The consequence is that we have more clinical data than in any other epidemic in history, but we are failing to analyse it quickly enough to make a difference.
In this paper we expose this situation and suggest concrete ideas that health systems could implement to dynamically analyse their routine clinical data, effectively becoming “learning health systems” and reversing the current situation.


Please cite as:

Prieto-Merino D, Bebiano Da Providencia E Costa R, Bacallao Gallestey J, Sofat R, Chung S, Potts H

Why we are losing the war against COVID-19 on the data front and how to reverse the situation

JMIR Preprints. 23/05/2020:20617

DOI: 10.2196/preprints.20617


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.